One of the most impressive results in recent NLP is the ability of pre-trained language models to solve new tasks in a zero-shot setting.
Furthermore, our unified framework enables the transfer of design elements across different approaches; as a result, we can instantiate new parameter-efficient fine-tuning methods that tune fewer parameters than previous methods while being more effective, achieving results comparable to fine-tuning all parameters on all four tasks.
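As a rough illustration of the parameter-efficient idea, here is a minimal bottleneck-adapter module in PyTorch, one common instance of this family of methods; the layer names, dimensions, and the choice of adapter design are illustrative assumptions, not the specific method derived in the paper.

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter: a small residual module attached to a frozen
    pre-trained layer so that only the adapter's parameters are trained."""
    def __init__(self, d_model: int, bottleneck: int = 16):
        super().__init__()
        self.down = nn.Linear(d_model, bottleneck)  # project down
        self.up = nn.Linear(bottleneck, d_model)    # project back up
        self.act = nn.ReLU()

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # The residual connection preserves the frozen model's output
        # when the adapter contributes little.
        return h + self.up(self.act(self.down(h)))

# Freeze the backbone; train only the adapter's (few) parameters.
backbone = nn.Linear(512, 512)          # stand-in for a pre-trained layer
for p in backbone.parameters():
    p.requires_grad = False
adapter = Adapter(d_model=512)
x = torch.randn(8, 512)
out = adapter(backbone(x))              # only `adapter` receives gradients
```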
Multilingual neural machine translation (MNMT) learns to translate multiple language pairs with a single model, potentially improving both the accuracy and the memory-efficiency of deployed models.
Group distributionally robust optimization (DRO) provides an effective tool to alleviate covariate shift by minimizing the worst-case training loss over a set of pre-defined groups.
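For concreteness, below is a minimal sketch of one standard group DRO update, the online exponentiated-gradient scheme in the style of Sagawa et al. (2020); the function name, step size, and toy data are illustrative, and the paper's exact procedure may differ.

```python
import torch

def group_dro_loss(losses, group_ids, group_weights, num_groups, step_size=0.01):
    """One group DRO step: up-weight the groups with the highest current
    loss, then use the weighted average of group losses as the objective."""
    group_losses = torch.zeros(num_groups)
    for g in range(num_groups):
        mask = group_ids == g
        if mask.any():
            group_losses[g] = losses[mask].mean()
    # Exponentiated-gradient ascent on the group weights (stays on the simplex);
    # detach so the weight update itself is not backpropagated through.
    group_weights = group_weights * torch.exp(step_size * group_losses.detach())
    group_weights = group_weights / group_weights.sum()
    return (group_weights * group_losses).sum(), group_weights

weights = torch.ones(3) / 3                       # uniform initial group weights
per_example = torch.tensor([0.2, 1.5, 0.9, 0.4])  # toy per-example losses
groups = torch.tensor([0, 1, 2, 0])               # group id of each example
loss, weights = group_dro_loss(per_example, groups, weights, num_groups=3)
```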
Specifically, with the first attention function, Luna packs the input sequence into a sequence of fixed length.
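A minimal sketch of such a packing step follows, assuming single-head dot-product attention with no learned projections; `pack_attention` and the dimensions are illustrative simplifications, not Luna's full parameterization.

```python
import torch
import torch.nn.functional as F

def pack_attention(x, p):
    """Packing attention in the Luna style: a fixed-length set of learned
    query vectors `p` attends over the variable-length input `x`, producing
    a packed sequence whose length no longer depends on len(x)."""
    d = x.size(-1)
    scores = p @ x.transpose(-2, -1) / d ** 0.5   # (l, n): queries x tokens
    weights = F.softmax(scores, dim=-1)
    return weights @ x                            # (l, d): packed sequence

x = torch.randn(128, 64)       # input sequence: 128 tokens, dim 64
p = torch.randn(16, 64)        # 16 learned "packing" queries
packed = pack_attention(x, p)  # shape (16, 64): fixed length 16
```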
We further establish connections between this principle and the theory of Bayesian optimal classification, and empirically verify that higher entropy in the outputs of a deep neural network indeed corresponds to better classification accuracy.
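Assuming "entropy of the outputs" refers to the Shannon entropy of the softmax predictive distribution, this short sketch computes the quantity in question; the batch shapes are placeholders.

```python
import torch
import torch.nn.functional as F

def output_entropy(logits: torch.Tensor) -> torch.Tensor:
    """Shannon entropy H(p) = -sum_i p_i log p_i of the softmax outputs,
    averaged over the batch."""
    log_p = F.log_softmax(logits, dim=-1)
    p = log_p.exp()
    return -(p * log_p).sum(dim=-1).mean()

logits = torch.randn(32, 10)   # a batch of 10-class predictions
print(output_entropy(logits))
```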
Neural sequence models can generate highly fluent sentences, but recent studies have shown that they are also prone to hallucinate additional content not supported by the input.
We find that knowledge distillation can reduce the complexity of data sets and help NAT to model the variations in the output data.
Most sequence-to-sequence (seq2seq) models are autoregressive; they generate each token by conditioning on previously generated tokens.
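A minimal greedy decoding loop makes this autoregressive factorization concrete; the `model` below is a stand-in that returns per-position logits, so the names and shapes are illustrative only.

```python
import torch

def greedy_decode(model, bos_id, eos_id, max_len=50):
    """Autoregressive decoding: each step feeds all previously generated
    tokens back into the model and appends the argmax token."""
    tokens = [bos_id]
    for _ in range(max_len):
        inp = torch.tensor([tokens])           # (1, t): the prefix so far
        logits = model(inp)                    # (1, t, vocab)
        next_id = int(logits[0, -1].argmax())  # condition on the full prefix
        tokens.append(next_id)
        if next_id == eos_id:
            break
    return tokens

vocab = 100
dummy = lambda inp: torch.randn(inp.size(0), inp.size(1), vocab)  # stand-in model
print(greedy_decode(dummy, bos_id=1, eos_id=2))
```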
Despite impressive empirical successes of neural machine translation (NMT) on standard benchmarks, limited parallel data impedes the application of NMT models to many language pairs.
Recent approaches to cross-lingual word embedding have generally been based on linear transformations between the sets of embedding vectors in the two languages.
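One standard instance of such a linear transformation is the orthogonal Procrustes solution fit on a seed dictionary of aligned word pairs; the sketch below uses random vectors purely as placeholders for real embeddings.

```python
import numpy as np

def procrustes(src, tgt):
    """Orthogonal map W minimizing ||src @ W - tgt||_F over aligned pairs:
    W = U V^T, where U S V^T is the SVD of src^T tgt."""
    u, _, vt = np.linalg.svd(src.T @ tgt)
    return u @ vt

rng = np.random.default_rng(0)
src = rng.normal(size=(1000, 300))  # source-language embeddings (seed dictionary)
tgt = rng.normal(size=(1000, 300))  # target-language embeddings (aligned rows)
W = procrustes(src, tgt)
mapped = src @ W                    # source vectors mapped into the target space
```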
This paper describes the ARIEL-CMU submissions to the Low Resource Human Language Technologies (LoReHLT) 2018 evaluations for the tasks Machine Translation (MT), Entity Discovery and Linking (EDL), and detection of Situation Frames in Text and Speech (SF Text and Speech).
Variational Autoencoder (VAE), a simple and effective deep generative model, has led to a number of impressive empirical successes and spawned many advanced variants and theoretical investigations.
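A minimal Gaussian VAE in PyTorch shows the two ingredients behind that simplicity, the reparameterization trick and the negative ELBO objective; the single-layer architecture and dimensions are deliberately tiny illustrations.

```python
import torch
import torch.nn as nn

class VAE(nn.Module):
    """Minimal Gaussian VAE: the encoder outputs (mu, logvar), the
    reparameterization z = mu + sigma * eps keeps sampling differentiable,
    and the loss is the negative evidence lower bound (ELBO)."""
    def __init__(self, x_dim=784, z_dim=20):
        super().__init__()
        self.enc = nn.Linear(x_dim, 2 * z_dim)
        self.dec = nn.Linear(z_dim, x_dim)

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + (0.5 * logvar).exp() * torch.randn_like(mu)
        recon = self.dec(z)
        # Negative ELBO = reconstruction error + KL(q(z|x) || N(0, I))
        rec = ((recon - x) ** 2).sum(dim=-1)
        kl = 0.5 * (mu ** 2 + logvar.exp() - logvar - 1).sum(dim=-1)
        return (rec + kl).mean()

loss = VAE()(torch.randn(16, 784))  # one forward pass on a toy batch
```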
Most work in Natural Language Processing (NLP) has focused on resource-rich languages, making generalization to new, less-resourced languages challenging.
Semantic parsing is the task of transducing natural language (NL) utterances into formal meaning representations (MRs), commonly represented as tree structures.
Labeled sequence transduction is the task of transforming one sequence into another sequence that satisfies desiderata specified by a set of labels.
In this paper, we incorporate document category information into the learning of word representations and train the proposed models in a document-wise manner.
In this work, we combine the strengths of both CNN and LSTM architectures and propose a novel, unified model called C-LSTM for sentence representation and text classification.
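A minimal PyTorch sketch of the C-LSTM idea follows, with a 1-D convolution extracting local n-gram features that an LSTM then consumes to build the sentence representation; the filter size, dimensions, and single convolution layer are illustrative simplifications of the actual model.

```python
import torch
import torch.nn as nn

class CLSTM(nn.Module):
    """C-LSTM-style model: Conv1d over word embeddings yields an n-gram
    feature sequence; an LSTM over that sequence yields the sentence
    representation used for classification."""
    def __init__(self, vocab=10000, emb=128, filters=100, hidden=100, classes=6):
        super().__init__()
        self.emb = nn.Embedding(vocab, emb)
        self.conv = nn.Conv1d(emb, filters, kernel_size=3)
        self.lstm = nn.LSTM(filters, hidden, batch_first=True)
        self.out = nn.Linear(hidden, classes)

    def forward(self, tokens):                        # tokens: (batch, seq)
        e = self.emb(tokens).transpose(1, 2)          # (batch, emb, seq)
        f = torch.relu(self.conv(e)).transpose(1, 2)  # n-gram feature sequence
        _, (h, _) = self.lstm(f)                      # final hidden state
        return self.out(h[-1])                        # class logits

logits = CLSTM()(torch.randint(0, 10000, (4, 30)))   # toy batch of 4 sentences
```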