1 code implementation • 25 Jun 2024 • Wenyu Du, Shuang Cheng, Tongxu Luo, Zihan Qiu, Zeyu Huang, Ka Chun Cheung, Reynold Cheng, Jie Fu
To address this limitation, we introduce $\textbf{MIGU}$ ($\textbf{M}$agn$\textbf{I}$tude-based $\textbf{G}$radient $\textbf{U}$pdating for continual learning), a rehearsal-free and task-label-free method that only updates the model parameters with large magnitudes of output in LMs' linear layers.
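The selection rule described above can be illustrated with a minimal sketch. It assumes that "large magnitudes of output" means the output units of a linear layer with the largest mean absolute activation over a batch, and that only the weight rows feeding those units receive a gradient step. The function names, the `n_keep` parameter, and the plain SGD update are hypothetical choices for illustration, not the authors' implementation.

```python
def magnitude_mask(outputs, n_keep):
    """Return a 0/1 mask over output units, keeping the n_keep units
    with the largest mean absolute output (assumed selection rule)."""
    n_units = len(outputs[0])
    # Mean absolute activation of each output unit over the batch.
    magnitude = [
        sum(abs(row[j]) for row in outputs) / len(outputs)
        for j in range(n_units)
    ]
    ranked = sorted(range(n_units), key=lambda j: magnitude[j], reverse=True)
    keep = set(ranked[:n_keep])
    return [1.0 if j in keep else 0.0 for j in range(n_units)]


def masked_sgd_step(weight, grad, mask, lr):
    """Apply an SGD step only to weight rows whose output unit is kept."""
    return [
        [w - lr * mask[i] * g for w, g in zip(w_row, g_row)]
        for i, (w_row, g_row) in enumerate(zip(weight, grad))
    ]


# Toy linear layer with 3 output units and 2 inputs (hypothetical values).
outputs = [[0.1, -2.0, 0.5], [0.2, 1.5, -0.4]]  # batch of layer outputs
weight = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
grad = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]

mask = magnitude_mask(outputs, n_keep=1)
updated = masked_sgd_step(weight, grad, mask, lr=0.5)
```

In this toy case only the second output unit has a large mean magnitude, so only its weight row changes; the other rows are left untouched, which is the intuition behind updating without rehearsal data or task labels.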
no code implementations • 24 May 2024 • Wenyu Du, Tongxu Luo, Zihan Qiu, Zeyu Huang, Yikang Shen, Reynold Cheng, Yike Guo, Jie Fu
For example, compared to a conventionally trained 7B model using 300B tokens, our $G_{\text{stack}}$ model converges to the same loss with 194B tokens, resulting in a 54.6\% speedup.
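As a quick check of the quoted figure, assuming the speedup is measured as the ratio of the two token budgets minus one:

$$\frac{300\text{B}}{194\text{B}} - 1 \approx 1.546 - 1 = 0.546 = 54.6\%$$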
1 code implementation • 26 Feb 2024 • Ka Man Lo, Yiming Liang, Wenyu Du, Yuantao Fan, Zili Wang, Wenhao Huang, Lei Ma, Jie Fu
Additionally, the V-MoE-Base model trained with m2mKD achieves 3.5% higher accuracy than end-to-end training on ImageNet-1k.
1 code implementation • 27 Jul 2023 • Yuqiao Wen, Zichao Li, Wenyu Du, Lili Mou
Experiments across four datasets show that our methods outperform existing KD approaches, and that our symmetric distilling losses can better force the student to learn from the teacher distribution.
1 code implementation • 18 Jan 2023 • Jinyang Li, Binyuan Hui, Reynold Cheng, Bowen Qin, Chenhao Ma, Nan Huo, Fei Huang, Wenyu Du, Luo Si, Yongbin Li
Recently, the pre-trained text-to-text transformer model, namely T5, though not specialized for text-to-SQL parsing, has achieved state-of-the-art performance on standard benchmarks targeting domain generalization.
Ranked #4 on Semantic Parsing on spider
no code implementations • 29 Nov 2022 • Zheng Cao, Raymond Guo, Wenyu Du, Jiayi Gao, Kirill V. Golubnichiy
This paper introduced key aspects of applying Machine Learning (ML) models, improved trading strategies, and the Quasi-Reversibility Method (QRM) to optimize stock option forecasting and trading results.
no code implementations • 25 Aug 2022 • Zheng Cao, Wenyu Du, Kirill V. Golubnichiy
Following results from the paper Quasi-Reversibility Method and Neural Network Machine Learning to Solution of Black-Scholes Equations (which appeared in the AMS Contemporary Mathematics journal), we create and evaluate new empirical mathematical models for the Black-Scholes equation to analyze data for 92,846 companies.
1 code implementation • ACL 2021 • Qiankun Fu, Linfeng Song, Wenyu Du, Yue Zhang
Although parsing to Abstract Meaning Representation (AMR) has become very popular and AMR has been shown to be effective on many sentence-level downstream tasks, little work has studied how to generate AMRs that can represent multi-sentence information.
1 code implementation • EMNLP 2021 • Jacob Louis Hoover, Alessandro Sordoni, Wenyu Du, Timothy J. O'Donnell
Are pairs of words that tend to occur together also likely to stand in a linguistic dependency?
1 code implementation • ACL 2020 • Wenyu Du, Zhouhan Lin, Yikang Shen, Timothy J. O'Donnell, Yoshua Bengio, Yue Zhang
It is commonly believed that knowledge of syntactic structure should improve language modeling.
1 code implementation • 7 Mar 2018 • Wenyu Du, Shuai Yu, Min Yang, Qiang Qu, Jia Zhu
Finally, we concatenate the projective vectors from bipartite subnetworks with the ones learned from homogeneous subnetworks to form the final representation of the heterogeneous network.