Search Results for author: Yiqiao Zhong

Found 13 papers, 6 papers with code

Unifying Attention Heads and Task Vectors via Hidden State Geometry in In-Context Learning

no code implementations · 24 May 2025 · Haolin Yang, Hakaze Cho, Yiqiao Zhong, Naoya Inoue

The unusual properties of in-context learning (ICL) have prompted investigations into the internal mechanisms of large language models.

In-Context Learning

A statistical theory of overfitting for imbalanced classification

1 code implementation · 17 Feb 2025 · Jingyang Lyu, Kangjie Zhou, Yiqiao Zhong

Classification with imbalanced data is a common challenge in data analysis, where certain classes (minority classes) account for a small fraction of the training data compared with other classes (majority classes).

Classification, Imbalanced Classification

Assessing and improving reliability of neighbor embedding methods: a map-continuity perspective

2 code implementations · 22 Oct 2024 · Zhexuan Liu, Rong Ma, Yiqiao Zhong

We find that the manifold learning interpretations from many prior works are inaccurate and that the misuse stems from a lack of data-independent notions of embedding maps, which project high-dimensional data into a lower-dimensional space.

Diagnostic

Out-of-distribution generalization via composition: a lens through induction heads in Transformers

1 code implementation · 18 Aug 2024 · Jiajun Song, Zhuoyan Xu, Yiqiao Zhong

We empirically examined the training dynamics of Transformers on a synthetic example and conducted extensive experiments on a variety of pretrained LLMs, focusing on a type of component known as induction heads.

In-Context Learning, Out-of-Distribution Generalization

Model Editing as a Robust and Denoised variant of DPO: A Case Study on Toxicity

2 code implementations · 22 May 2024 · Rheeya Uppaal, Apratim Dey, Yiting He, Yiqiao Zhong, Junjie Hu

Furthermore, these tuning-based methods require large-scale preference data for training and are susceptible to noisy preference data.

Language Modelling, Model Editing

How does Multi-Task Training Affect Transformer In-Context Capabilities? Investigations with Function Classes

1 code implementation · 4 Apr 2024 · Harmon Bhasin, Timothy Ossowski, Yiqiao Zhong, Junjie Hu

Large language models (LLMs) have recently shown an extraordinary ability to perform unseen tasks based on few-shot examples provided as text, also known as in-context learning (ICL).

In-Context Learning, Multi-Task Learning

Uncovering hidden geometry in Transformers via disentangling position and context

1 code implementation · 7 Oct 2023 · Jiajun Song, Yiqiao Zhong

Given embedding vector $\boldsymbol{h}_{c, t} \in \mathbb{R}^d$ at sequence position $t \le T$ in a sequence (or context) $c \le C$, extracting the mean effects yields the decomposition \[ \boldsymbol{h}_{c, t} = \boldsymbol{\mu} + \mathbf{pos}_t + \mathbf{ctx}_c + \mathbf{resid}_{c, t} \] where $\boldsymbol{\mu}$ is the global mean vector, $\mathbf{pos}_t$ and $\mathbf{ctx}_c$ are the mean vectors across contexts and across positions respectively, and $\mathbf{resid}_{c, t}$ is the residual vector.

Dictionary Learning, POS, +1
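
The decomposition quoted in the abstract above can be illustrated with a minimal NumPy sketch. It assumes the embeddings are stored in a hypothetical array `h` of shape `(C, T, d)`, and it assumes that the positional and contextual means are centered by subtracting the global mean, so that the four terms sum back to the original embedding; the function name `decompose` and the random test data are illustrative only and are not taken from the paper's code.

```python
import numpy as np

def decompose(h):
    """Split embeddings h of shape (C, T, d) into mean effects.

    Assumption: pos_t and ctx_c are centered means (global mean removed),
    which makes h = mu + pos_t + ctx_c + resid hold exactly.
    """
    mu = h.mean(axis=(0, 1))          # global mean vector, shape (d,)
    pos = h.mean(axis=0) - mu         # mean across contexts, shape (T, d)
    ctx = h.mean(axis=1) - mu         # mean across positions, shape (C, d)
    # Residual is whatever the three mean effects do not explain.
    resid = h - mu - pos[None, :, :] - ctx[:, None, :]
    return mu, pos, ctx, resid

# Illustrative random data: C=4 contexts, T=6 positions, d=3 dimensions.
rng = np.random.default_rng(0)
h = rng.normal(size=(4, 6, 3))
mu, pos, ctx, resid = decompose(h)

# Reassemble the embeddings from the four components.
recon = mu + pos[None, :, :] + ctx[:, None, :] + resid
```

Under these assumptions the residual averages to zero both across contexts and across positions, which is what makes the four terms a clean split of the embedding.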

Unraveling Projection Heads in Contrastive Learning: Insights from Expansion and Shrinkage

no code implementations · 6 Jun 2023 · Yu Gui, Cong Ma, Yiqiao Zhong

Firstly, through empirical and theoretical analysis, we identify two crucial effects -- expansion and shrinkage -- induced by the contrastive loss on the projectors.

Contrastive Learning

Tractability from overparametrization: The example of the negative perceptron

no code implementations · 28 Oct 2021 · Andrea Montanari, Yiqiao Zhong, Kangjie Zhou

In the negative perceptron problem we are given $n$ data points $({\boldsymbol x}_i, y_i)$, where ${\boldsymbol x}_i$ is a $d$-dimensional vector and $y_i\in\{+1,-1\}$ is a binary label.

A Selective Overview of Deep Learning

no code implementations · 10 Apr 2019 · Jianqing Fan, Cong Ma, Yiqiao Zhong

Deep learning has arguably achieved tremendous success in recent years.

Deep Learning

Robust high dimensional factor models with applications to statistical machine learning

no code implementations · 12 Aug 2018 · Jianqing Fan, Kaizheng Wang, Yiqiao Zhong, Ziwei Zhu

Factor models are a class of powerful statistical models that have been widely used to deal with dependent measurements that arise frequently from various applications from genomics and neuroscience to economics and finance.

BIG-bench Machine Learning, Model Selection, +1

Differentially Private Data Releasing for Smooth Queries with Synthetic Database Output

no code implementations · 6 Jan 2014 · Chi Jin, Ziteng Wang, Junliang Huang, Yiqiao Zhong, Li-Wei Wang

We develop an $\epsilon$-differentially private mechanism for the class of $K$-smooth queries.
