no code implementations • 24 May 2025 • Haolin Yang, Hakaze Cho, Yiqiao Zhong, Naoya Inoue
The unusual properties of in-context learning (ICL) have prompted investigations into the internal mechanisms of large language models.
1 code implementation • 17 Feb 2025 • Jingyang Lyu, Kangjie Zhou, Yiqiao Zhong
Classification with imbalanced data is a common challenge in data analysis, where certain classes (minority classes) account for a small fraction of the training data compared with other classes (majority classes).
2 code implementations • 22 Oct 2024 • Zhexuan Liu, Rong Ma, Yiqiao Zhong
We find that the manifold learning interpretations from many prior works are inaccurate and that the misuse stems from a lack of data-independent notions of embedding maps, which project high-dimensional data into a lower-dimensional space.
1 code implementation • 18 Aug 2024 • Jiajun Song, Zhuoyan Xu, Yiqiao Zhong
We empirically examined the training dynamics of Transformers on a synthetic example and conducted extensive experiments on a variety of pretrained LLMs, focusing on a type of component known as induction heads.
2 code implementations • 22 May 2024 • Rheeya Uppaal, Apratim Dey, Yiting He, Yiqiao Zhong, Junjie Hu
Furthermore, these tuning-based methods require large-scale preference data for training and are susceptible to noisy preference data.
1 code implementation • 4 Apr 2024 • Harmon Bhasin, Timothy Ossowski, Yiqiao Zhong, Junjie Hu
Large language models (LLMs) have recently shown an extraordinary ability to perform unseen tasks based on few-shot examples provided as text, also known as in-context learning (ICL).
1 code implementation • 7 Oct 2023 • Jiajun Song, Yiqiao Zhong
Given embedding vector $\boldsymbol{h}_{c, t} \in \mathbb{R}^d$ at sequence position $t \le T$ in a sequence (or context) $c \le C$, extracting the mean effects yields the decomposition \[ \boldsymbol{h}_{c, t} = \boldsymbol{\mu} + \mathbf{pos}_t + \mathbf{ctx}_c + \mathbf{resid}_{c, t} \] where $\boldsymbol{\mu}$ is the global mean vector, $\mathbf{pos}_t$ and $\mathbf{ctx}_c$ are the mean vectors across contexts and across positions respectively, and $\mathbf{resid}_{c, t}$ is the residual vector.
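The decomposition above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' code: the array layout `(C, T, d)`, the function name `decompose_embeddings`, and the random input are all assumptions, and the sketch assumes that $\mathbf{pos}_t$ and $\mathbf{ctx}_c$ are centered by subtracting the global mean, so that the four terms sum back to $\boldsymbol{h}_{c, t}$.

```python
import numpy as np


def decompose_embeddings(h):
    """Split embeddings h of shape (C, T, d) into mean effects and a residual.

    Hypothetical helper: assumes pos_t and ctx_c are centered by the global mean.
    """
    mu = h.mean(axis=(0, 1))        # global mean vector, shape (d,)
    pos = h.mean(axis=0) - mu       # mean across contexts, per position: (T, d)
    ctx = h.mean(axis=1) - mu       # mean across positions, per context: (C, d)
    # Residual is whatever the three mean effects do not explain: (C, T, d)
    resid = h - mu - pos[None, :, :] - ctx[:, None, :]
    return mu, pos, ctx, resid


# Illustrative random embeddings: C = 4 contexts, T = 6 positions, d = 3 dims.
rng = np.random.default_rng(0)
h = rng.normal(size=(4, 6, 3))
mu, pos, ctx, resid = decompose_embeddings(h)

# The four components reconstruct the original embeddings.
reconstructed = mu + pos[None, :, :] + ctx[:, None, :] + resid
print(np.allclose(reconstructed, h))
```

Under this centering convention, the residual averages to zero both across contexts and across positions, which is what makes the four terms separately identifiable.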
no code implementations • 6 Jun 2023 • Yu Gui, Cong Ma, Yiqiao Zhong
Firstly, through empirical and theoretical analysis, we identify two crucial effects -- expansion and shrinkage -- induced by the contrastive loss on the projectors.
no code implementations • 28 Oct 2021 • Andrea Montanari, Yiqiao Zhong, Kangjie Zhou
In the negative perceptron problem we are given $n$ data points $({\boldsymbol x}_i, y_i)$, where ${\boldsymbol x}_i$ is a $d$-dimensional vector and $y_i\in\{+1,-1\}$ is a binary label.
no code implementations • 25 Jul 2020 • Andrea Montanari, Yiqiao Zhong
We assume that both the sample size $n$ and the dimension $d$ are large, and they are polynomially related.
no code implementations • 10 Apr 2019 • Jianqing Fan, Cong Ma, Yiqiao Zhong
Deep learning has arguably achieved tremendous success in recent years.
no code implementations • 12 Aug 2018 • Jianqing Fan, Kaizheng Wang, Yiqiao Zhong, Ziwei Zhu
Factor models are a class of powerful statistical models that have been widely used to deal with dependent measurements, which arise frequently in applications ranging from genomics and neuroscience to economics and finance.
no code implementations • 6 Jan 2014 • Chi Jin, Ziteng Wang, Junliang Huang, Yiqiao Zhong, Li-Wei Wang
We develop an $\epsilon$-differentially private mechanism for the class of $K$-smooth queries.