1 code implementation • 13 Aug 2024 • Bo-Wen Zhang, Liangdong Wang, Ye Yuan, Jijie Li, Shuhao Gu, Mengdi Zhao, Xinya Wu, Guang Liu, ChengWei Wu, Hanyu Zhao, Li Du, Yiming Ju, Quanyue Ma, Yulong Ao, Yingli Zhao, Songhe Zhu, Zhou Cao, Dong Liang, Yonghua Lin, Ming Zhang, Shunfei Wang, Yanxin Zhou, Min Ye, Xuekai Chen, Xinyang Yu, Xiangjun Huang, Jian Yang
In this paper, we present AquilaMoE, a cutting-edge bilingual 8*16B Mixture of Experts (MoE) language model that has 8 experts with 16 billion parameters each and is developed using an innovative training methodology called EfficientScale.
1 code implementation • 15 Jun 2021 • Xiangyu Chen, Min Ye
In the same paper, a list decoding procedure was also introduced for two widely used classes of cyclic codes -- BCH codes and punctured Reed-Muller (RM) codes.
1 code implementation • 12 May 2021 • Xiangyu Chen, Min Ye
Finally, we propose a list decoding procedure that can significantly reduce the decoding error probability for BCH codes and punctured RM codes.
no code implementations • 13 Apr 2020 • Min Ye
We show that when $m\ge m^\ast$, one can recover the clusters from $m$ samples in $O(n)$ time as the number of vertices $n$ goes to infinity.
no code implementations • 16 Oct 2018 • Min Ye, Alexander Barg
In this paper, we sharpen this result by showing asymptotic optimality of the proposed scheme under the $\ell_p^p$ loss for all $1\le p\le 2.$ More precisely, we show that for any $p\in[1, 2]$ and any $k$ and $\epsilon,$ the ratio between the worst-case $\ell_p^p$ estimation loss of our scheme and the optimal value approaches $1$ as the number of samples tends to infinity.
no code implementations • ICML 2018 • Min Ye, Emmanuel Abbe
This paper develops coding techniques to reduce the running time of distributed learning tasks.
no code implementations • 31 Jul 2017 • Min Ye, Alexander Barg
In other words, for a large number of samples the worst-case estimation loss of our scheme was shown to differ from the optimal value by at most a constant factor.
no code implementations • 2 Feb 2017 • Min Ye, Alexander Barg
For a given $\epsilon,$ we consider the problem of constructing optimal privatization schemes with $\epsilon$-privacy level, i. e., schemes that minimize the expected estimation loss for the worst-case distribution.