no code implementations • 22 May 2025 • Zeping Yu, Sophia Ananiadou
Although multimodal large language models (MLLMs) have achieved impressive performance, the multimodal instruction tuning stage often causes catastrophic forgetting of the base LLM's language ability, even in strong models like Llama3.
no code implementations • 15 Feb 2025 • Zeping Yu, Yonatan Belinkov, Sophia Ananiadou
We investigate how large language models perform latent multi-hop reasoning in prompts like "Wolfgang Amadeus Mozart's mother's spouse is".
no code implementations • 24 Jan 2025 • Zeping Yu, Sophia Ananiadou
Existing methods to mitigate bias lack a comprehensive understanding of its mechanisms or compromise the model's core capabilities.
1 code implementation • 17 Nov 2024 • Zeping Yu, Sophia Ananiadou
Understanding the mechanisms behind Large Language Models (LLMs) is crucial for designing improved models and strategies.
2 code implementations • 21 Sep 2024 • Zeping Yu, Sophia Ananiadou
We find arithmetic ability resides within a limited number of attention heads, with each head specializing in distinct operations.
2 code implementations • 5 Feb 2024 • Zeping Yu, Sophia Ananiadou
We investigate the mechanism of in-context learning (ICL) on sentence classification tasks with semantically-unrelated labels ("foo"/"bar").
3 code implementations • 19 Dec 2023 • Zeping Yu, Sophia Ananiadou
Additionally, since most static methods typically only identify "value neurons" directly contributing to the final prediction, we propose a method for identifying "query neurons" which activate these "value neurons".
no code implementations • 1 Nov 2023 • Zhiwei Liu, Tianlin Zhang, Kailai Yang, Paul Thompson, Zeping Yu, Sophia Ananiadou
The emotions and sentiments of netizens, as expressed in social media posts and news, constitute important factors that can help to distinguish fake news from genuine news and to understand the spread of rumors.
1 code implementation • NeurIPS 2020 • Zeping Yu, Wenxin Zheng, Jiaqi Wang, Qiyi Tang, Sen Nie, Shi Wu
We adopt Deep Pyramid Convolutional Neural Network (DPCNN) for source code feature extraction and Graph Neural Network (GNN) for binary code feature extraction.
2 code implementations • IJCAI 2019 • Zeping Yu, Jianxun Lian, Ahmad Mahmoody, Gongshen Liu, Xing Xie
User modeling is an essential task for online recommender systems.
Ranked #2 on Recommendation Systems on Amazon Product Data
3 code implementations • COLING 2018 • Zeping Yu, Gongshen Liu
In this paper, we introduce sliced recurrent neural networks (SRNNs), which could be parallelized by slicing the sequences into many subsequences.
Ranked #6 on Sentiment Analysis on Amazon Review Full
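The slicing idea in the SRNN entry above can be illustrated with a minimal sketch. It only shows how a sequence might be cut into equal-length subsequences that could then be processed independently; it is not the authors' implementation, and the helper names `slice_sequence` and `slice_recursively`, as well as the equal-length requirement, are assumptions made for illustration.

```python
from typing import List, Sequence


def slice_sequence(seq: Sequence, n: int) -> List[Sequence]:
    """Split seq into n contiguous subsequences of equal length (hypothetical helper)."""
    if n <= 0 or len(seq) % n != 0:
        raise ValueError("sequence length must be a multiple of a positive n")
    size = len(seq) // n
    return [seq[i * size:(i + 1) * size] for i in range(n)]


def slice_recursively(seq: Sequence, n: int, k: int):
    """Apply the slicing k times, giving a nested list with n**k leaf subsequences."""
    if k == 0:
        return seq
    return [slice_recursively(part, n, k - 1) for part in slice_sequence(seq, n)]


tokens = list(range(8))
print(slice_sequence(tokens, 2))        # [[0, 1, 2, 3], [4, 5, 6, 7]]
print(slice_recursively(tokens, 2, 2))  # [[[0, 1], [2, 3]], [[4, 5], [6, 7]]]
```

Each leaf subsequence is independent of the others, so a recurrent unit could be run over all of them at the same time before their outputs are combined at the next level.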