no code implementations • ACL 2022 • Jiatao Gu, Xu Tan
Non-autoregressive sequence generation (NAR) attempts to generate the entire or partial output sequences in parallel to speed up the generation process and avoid potential issues (e.g., label bias, exposure bias) in autoregressive generation.
1 code implementation • ICML 2020 • Jungo Kasai, James Cross, Marjan Ghazvininejad, Jiatao Gu
State-of-the-art neural machine translation models generate a translation from left to right and every step is conditioned on the previously generated tokens.
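To make the left-to-right conditioning concrete, here is a minimal sketch of greedy autoregressive decoding, where each step re-runs the model on everything generated so far; `model.next_token_logits` is a hypothetical interface used only for illustration, not the paper's code.

```python
# Minimal sketch of autoregressive greedy decoding (illustrative only;
# `model.next_token_logits` is a hypothetical interface, not the paper's code).
def greedy_decode(model, src_tokens, bos_id, eos_id, max_len=128):
    out = [bos_id]
    for _ in range(max_len):
        # each step is conditioned on all previously generated tokens
        logits = model.next_token_logits(src_tokens, out)
        next_id = max(range(len(logits)), key=lambda i: logits[i])
        out.append(next_id)
        if next_id == eos_id:
            break
    return out[1:]
```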
1 code implementation • EMNLP (ACL) 2021 • Changhan Wang, Wei-Ning Hsu, Yossi Adi, Adam Polyak, Ann Lee, Peng-Jen Chen, Jiatao Gu, Juan Pino
This paper presents fairseq S^2, a fairseq extension for speech synthesis.
no code implementations • WMT (EMNLP) 2020 • Peng-Jen Chen, Ann Lee, Changhan Wang, Naman Goyal, Angela Fan, Mary Williamson, Jiatao Gu
We approach the low resource problem using two main strategies, leveraging all available data and adapting the system to the target news domain.
no code implementations • 5 Jun 2023 • Yizhe Zhang, Jiatao Gu, Zhuofeng Wu, Shuangfei Zhai, Josh Susskind, Navdeep Jaitly
Autoregressive models for text sometimes generate repetitive and low-quality output because errors accumulate during the steps of generation.
no code implementations • 13 Apr 2023 • Jiatao Gu, Qingzhe Gao, Shuangfei Zhai, Baoquan Chen, Lingjie Liu, Josh Susskind
To address these challenges, we present Control3Diff, a 3D diffusion model that combines the strengths of diffusion models and 3D GANs for versatile, controllable 3D-aware image synthesis for single-view datasets.
no code implementations • 13 Apr 2023 • Hansheng Chen, Jiatao Gu, Anpei Chen, Wei Tian, Zhuowen Tu, Lingjie Liu, Hao Su
3D-aware image synthesis encompasses a variety of tasks, such as scene generation and novel view synthesis from images.
no code implementations • 11 Mar 2023 • Shuangfei Zhai, Tatiana Likhomanenko, Etai Littwin, Dan Busbridge, Jason Ramapuram, Yizhe Zhang, Jiatao Gu, Josh Susskind
We show that $\sigma$Reparam provides stability and robustness with respect to the choice of hyperparameters, going so far as enabling training (a) a Vision Transformer to competitive performance without warmup, weight decay, layer normalization or adaptive optimizers; (b) deep architectures in machine translation and (c) speech recognition to competitive performance without warmup and adaptive optimizers.
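As a rough illustration of the reparameterization, the sketch below estimates a weight matrix's spectral norm with power iteration and rescales the weight by a learned scalar $\gamma$ over that norm. This is a NumPy sketch of the general idea under stated assumptions, not the authors' implementation; `gamma` is assumed to be a trainable scalar per layer.

```python
import numpy as np

def spectral_norm(W, n_iters=8, eps=1e-12):
    """Power-iteration estimate of the largest singular value of W."""
    u = np.random.randn(W.shape[0])
    for _ in range(n_iters):
        v = W.T @ u
        v /= np.linalg.norm(v) + eps
        u = W @ v
        u /= np.linalg.norm(u) + eps
    return float(u @ W @ v)

def sigma_reparam(W, gamma):
    """Use W_hat = (gamma / ||W||_2) * W in place of W inside the layer."""
    return (gamma / spectral_norm(W)) * W
```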
no code implementations • 7 Mar 2023 • Chen Huang, Hanlin Goh, Jiatao Gu, Josh Susskind
We do so by Masked Augmentation Subspace Training (or MAST) to encode in the single feature space the priors from different data augmentations in a factorized way.
no code implementations • 1 Mar 2023 • Peiye Zhuang, Samira Abnar, Jiatao Gu, Alex Schwing, Joshua M. Susskind, Miguel Ángel Bautista
Diffusion probabilistic models have quickly become a major approach for generative modeling of images, 3D geometry, video and other domains.
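For background, the Gaussian forward (noising) process that this family of models builds on can be written as

$q(x_t \mid x_0) = \mathcal{N}\big(x_t;\ \sqrt{\bar{\alpha}_t}\,x_0,\ (1-\bar{\alpha}_t)\,\mathbf{I}\big), \qquad \bar{\alpha}_t = \prod_{s=1}^{t}(1-\beta_s),$

where $\beta_s$ is the noise schedule; the generative model is trained to invert this process.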
no code implementations • 20 Feb 2023 • Jiatao Gu, Alex Trevithick, Kai-En Lin, Josh Susskind, Christian Theobalt, Lingjie Liu, Ravi Ramamoorthi
Novel view synthesis from a single image requires inferring occluded regions of objects and scenes whilst simultaneously maintaining semantic and physical consistency with the input.
no code implementations • 10 Oct 2022 • Jiatao Gu, Shuangfei Zhai, Yizhe Zhang, Miguel Angel Bautista, Josh Susskind
In this work, we propose f-DM, a generalized family of DMs which allows progressive signal transformation.
no code implementations • 10 Jul 2022 • Peng Wang, Yuan Liu, Guying Lin, Jiatao Gu, Lingjie Liu, Taku Komura, Wenping Wang
ProLiF encodes a 4D light field, which allows rendering a large batch of rays in one training step for image- or patch-level losses.
no code implementations • EACL 2021 • Xiang Kong, Adithya Renduchintala, James Cross, Yuqing Tang, Jiatao Gu, Xian Li
Recent work in multilingual translation advances translation quality surpassing bilingual baselines using deep transformer models with increased capacity.
no code implementations • Findings (ACL) 2022 • Khalil Mrini, Shaoliang Nie, Jiatao Gu, Sinong Wang, Maziar Sanjabi, Hamed Firooz
Without the use of a knowledge base or candidate sets, our model sets a new state of the art in two benchmark datasets of entity linking: COMETA in the biomedical domain, and AIDA-CoNLL in the news domain.
no code implementations • ACL 2022 • Yun Tang, Hongyu Gong, Ning Dong, Changhan Wang, Wei-Ning Hsu, Jiatao Gu, Alexei Baevski, Xian Li, Abdelrahman Mohamed, Michael Auli, Juan Pino
Two pre-training configurations for speech translation and recognition, respectively, are presented to alleviate subtask interference.
no code implementations • NAACL 2022 • Zhuofeng Wu, Sinong Wang, Jiatao Gu, Rui Hou, Yuxiao Dong, V. G. Vinod Vydiswaran, Hao Ma
Prompt tuning is a new, efficient NLP transfer learning paradigm that adds a task-specific prompt in each input instance during the model training stage.
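A minimal sketch of the general prompt-tuning idea (not this paper's specific method): a small set of trainable prompt vectors is prepended to the embedded input, and only those vectors are updated while the backbone stays frozen. All names below are illustrative.

```python
import numpy as np

def prepend_prompt(token_embeddings, prompt_embeddings):
    """token_embeddings: (seq_len, d); prompt_embeddings: (n_prompt, d), trainable.
    Returns the (n_prompt + seq_len, d) sequence fed to a frozen model."""
    return np.concatenate([prompt_embeddings, token_embeddings], axis=0)

# Example: 20 trainable prompt vectors for a model with hidden size 768
prompt = np.random.randn(20, 768) * 0.02
x = np.random.randn(12, 768)               # embeddings of a 12-token input
model_input = prepend_prompt(x, prompt)    # shape (32, 768)
```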
no code implementations • 6 Apr 2022 • Sravya Popuri, Peng-Jen Chen, Changhan Wang, Juan Pino, Yossi Adi, Jiatao Gu, Wei-Ning Hsu, Ann Lee
Direct speech-to-speech translation (S2ST) models suffer from data scarcity issues as there exists little parallel S2ST data, compared to the amount of data available for conventional cascaded systems that consist of automatic speech recognition (ASR), machine translation (MT), and text-to-speech (TTS) synthesis.
Automatic Speech Recognition (ASR) +6
7 code implementations • Preprint 2022 • Alexei Baevski, Wei-Ning Hsu, Qiantong Xu, Arun Babu, Jiatao Gu, Michael Auli
While the general idea of self-supervised learning is identical across modalities, the actual algorithms and objectives differ widely because they were developed with a single modality in mind.
Ranked #1 on Paraphrase Identification on Quora Question Pairs (Accuracy metric)
no code implementations • NAACL 2022 • Ann Lee, Hongyu Gong, Paul-Ambroise Duquenne, Holger Schwenk, Peng-Jen Chen, Changhan Wang, Sravya Popuri, Yossi Adi, Juan Pino, Jiatao Gu, Wei-Ning Hsu
To our knowledge, we are the first to establish a textless S2ST technique that can be trained with real-world data and works for multiple language pairs.
1 code implementation • ICLR 2022 • Jiatao Gu, Lingjie Liu, Peng Wang, Christian Theobalt
We perform volume rendering only to produce a low-resolution feature map and progressively apply upsampling in 2D to address the first issue.
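For context, the per-ray accumulation referred to here is the standard NeRF-style volume rendering quadrature (background notation, not specific to this paper's feature-space variant):

$\hat{C}(\mathbf{r}) = \sum_{i=1}^{N} T_i \big(1 - e^{-\sigma_i \delta_i}\big)\,\mathbf{c}_i, \qquad T_i = \exp\!\Big(-\sum_{j<i} \sigma_j \delta_j\Big),$

where $\sigma_i$ and $\mathbf{c}_i$ are the density and feature/color at the $i$-th sample and $\delta_i$ is the distance between samples.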
1 code implementation • 14 Sep 2021 • Changhan Wang, Wei-Ning Hsu, Yossi Adi, Adam Polyak, Ann Lee, Peng-Jen Chen, Jiatao Gu, Juan Pino
This paper presents fairseq S^2, a fairseq extension for speech synthesis.
no code implementations • ACL 2022 • Ann Lee, Peng-Jen Chen, Changhan Wang, Jiatao Gu, Sravya Popuri, Xutai Ma, Adam Polyak, Yossi Adi, Qing He, Yun Tang, Juan Pino, Wei-Ning Hsu
When target text transcripts are available, we design a joint speech and text training framework that enables the model to generate dual modality output (speech and text) simultaneously in the same inference pass.
3 code implementations • NeurIPS 2021 • Lior Yariv, Jiatao Gu, Yoni Kasten, Yaron Lipman
Accurate sampling is important to provide a precise coupling of geometry and radiance; and (iii) it allows efficient unsupervised disentanglement of shape and appearance in volume rendering.
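As a hedged reconstruction of the modeling choice behind this coupling (see the paper for the exact form), the density is derived from the signed distance $d_\Omega(\mathbf{x})$ to the surface via a Laplace CDF:

$\sigma(\mathbf{x}) = \alpha\,\Psi_\beta\!\big(-d_\Omega(\mathbf{x})\big),$

where $\Psi_\beta$ is the cumulative distribution function of a zero-mean Laplace distribution with scale $\beta$, so the density transitions sharply near the zero level set of the SDF.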
no code implementations • 3 Jun 2021 • Lingjie Liu, Marc Habermann, Viktor Rudnev, Kripasindhu Sarkar, Jiatao Gu, Christian Theobalt
To address this problem, we utilize a coarse body model as the proxy to unwarp the surrounding 3D space into a canonical pose.
1 code implementation • ACL 2021 • Hang Le, Juan Pino, Changhan Wang, Jiatao Gu, Didier Schwab, Laurent Besacier
Adapter modules were recently introduced as an efficient alternative to fine-tuning in NLP.
Ranked #1 on Speech-to-Text Translation on MuST-C EN->ES
Automatic Speech Recognition (ASR) +4
no code implementations • 31 Dec 2020 • Zhuofeng Wu, Sinong Wang, Jiatao Gu, Madian Khabsa, Fei Sun, Hao Ma
Pre-trained language models have proven their unique powers in capturing implicit language features.
Ranked #5 on Question Answering on Quora Question Pairs
1 code implementation • Findings (ACL) 2021 • Jiatao Gu, Xiang Kong
Fully non-autoregressive neural machine translation (NAT) is proposed to predict all tokens simultaneously in a single forward pass of the network, which significantly reduces inference latency at the expense of a quality drop compared to the Transformer baseline.
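A minimal sketch of the contrast with autoregressive decoding: a fully non-autoregressive model scores every target position at once and picks all tokens in one argmax. `model.all_position_logits` is a hypothetical interface, shown only to illustrate the single-pass idea.

```python
# Illustrative sketch of fully non-autoregressive decoding (hypothetical interface).
def nat_decode(model, src_tokens, tgt_len):
    # one forward pass produces logits for every target position in parallel
    logits = model.all_position_logits(src_tokens, tgt_len)  # (tgt_len, vocab)
    return [max(range(len(row)), key=lambda i: row[i]) for row in logits]
```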
no code implementations • 16 Nov 2020 • Peng-Jen Chen, Ann Lee, Changhan Wang, Naman Goyal, Angela Fan, Mary Williamson, Jiatao Gu
We approach the low resource problem using two main strategies, leveraging all available data and adapting the system to the target news domain.
2 code implementations • Findings (ACL) 2021 • Chunting Zhou, Graham Neubig, Jiatao Gu, Mona Diab, Paco Guzman, Luke Zettlemoyer, Marjan Ghazvininejad
Neural sequence models can generate highly fluent sentences, but recent studies have shown that they are also prone to hallucinating additional content not supported by the input.
1 code implementation • COLING 2020 • Hang Le, Juan Pino, Changhan Wang, Jiatao Gu, Didier Schwab, Laurent Besacier
We propose two variants of these architectures corresponding to two different levels of dependencies between the decoders, called the parallel and cross dual-decoder Transformers, respectively.
Ranked #1 on Speech-to-Text Translation on MuST-C EN->FR
Automatic Speech Recognition (ASR) +3
5 code implementations • 2 Aug 2020 • Yuqing Tang, Chau Tran, Xi-An Li, Peng-Jen Chen, Naman Goyal, Vishrav Chaudhary, Jiatao Gu, Angela Fan
Recent work demonstrates the potential of multilingual pretraining to create one model that can be used for various tasks in different languages.
no code implementations • EMNLP 2020 • Xutai Ma, Mohammad Javad Dousti, Changhan Wang, Jiatao Gu, Juan Pino
We also adapt latency metrics from text simultaneous translation to the speech task.
1 code implementation • NeurIPS 2020 • Lingjie Liu, Jiatao Gu, Kyaw Zaw Lin, Tat-Seng Chua, Christian Theobalt
We also demonstrate several challenging tasks, including multi-scene learning, free-viewpoint rendering of a moving human, and large-scale scene rendering.
1 code implementation • ECCV 2020 • Saining Xie, Jiatao Gu, Demi Guo, Charles R. Qi, Leonidas J. Guibas, Or Litany
To this end, we select a suite of diverse datasets and tasks to measure the effect of unsupervised pre-training on a large source set of 3D scenes.
no code implementations • ACL 2020 • Arya D. McCarthy, Xi-An Li, Jiatao Gu, Ning Dong
This paper proposes a simple and effective approach to address the problem of posterior collapse in conditional variational autoencoders (CVAEs).
no code implementations • WS 2020 • Ebrahim Ansari, Amittai Axelrod, Nguyen Bach, Ondřej Bojar, Roldano Cattoni, Fahim Dalvi, Nadir Durrani, Marcello Federico, Christian Federmann, Jiatao Gu, Fei Huang, Kevin Knight, Xutai Ma, Ajay Nagesh, Matteo Negri, Jan Niehues, Juan Pino, Elizabeth Salesky, Xing Shi, Sebastian Stüker, Marco Turchi, Alexander Waibel, Changhan Wang
The evaluation campaign of the International Conference on Spoken Language Translation (IWSLT 2020) featured six challenge tracks this year: (i) Simultaneous speech translation, (ii) Video speech translation, (iii) Offline speech translation, (iv) Conversational speech translation, (v) Open domain translation, and (vi) Non-native speech translation.
no code implementations • 22 Jun 2020 • Anne Wu, Changhan Wang, Juan Pino, Jiatao Gu
End-to-end speech-to-text translation can provide a simpler and smaller system but is facing the challenge of data scarcity.
1 code implementation • NeurIPS 2020 • Chau Tran, Yuqing Tang, Xi-An Li, Jiatao Gu
Recent studies have demonstrated the cross-lingual alignment ability of multilingual pretrained language models.
no code implementations • 9 Jun 2020 • Changhan Wang, Juan Pino, Jiatao Gu
Even with pseudo-labels from low-resource MT (200K examples), ST-enhanced transfer brings up to 8.9% WER reduction to direct transfer.
Automatic Speech Recognition (ASR) +7
1 code implementation • LREC 2020 • Changhan Wang, Juan Pino, Anne Wu, Jiatao Gu
Spoken language translation has recently witnessed a resurgence in popularity, thanks to the development of end-to-end models and the creation of new corpora, such as Augmented LibriSpeech and MuST-C.
5 code implementations • 22 Jan 2020 • Yinhan Liu, Jiatao Gu, Naman Goyal, Xi-An Li, Sergey Edunov, Marjan Ghazvininejad, Mike Lewis, Luke Zettlemoyer
This paper demonstrates that multilingual denoising pre-training produces significant performance gains across a wide variety of machine translation (MT) tasks.
1 code implementation • 15 Jan 2020 • Jungo Kasai, James Cross, Marjan Ghazvininejad, Jiatao Gu
State-of-the-art neural machine translation models generate a translation from left to right and every step is conditioned on the previously generated tokens.
no code implementations • ICLR 2020 • Chunting Zhou, Graham Neubig, Jiatao Gu
We find that knowledge distillation can reduce the complexity of data sets and help NAT to model the variations in the output data.
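A hedged sketch of sequence-level knowledge distillation as commonly used for NAT training: the human references are replaced by an autoregressive teacher's own translations, which are less multimodal and easier for a parallel decoder to fit. Function names here are placeholders, not the authors' code.

```python
# Sequence-level knowledge distillation sketch (placeholder function names).
def build_distilled_corpus(teacher, source_sentences):
    distilled = []
    for src in source_sentences:
        hyp = teacher.translate(src, beam=5)   # teacher's beam-search output
        distilled.append((src, hyp))           # replaces the human reference
    return distilled

# The non-autoregressive student is then trained on `distilled`
# instead of the original source-reference pairs.
```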
no code implementations • ICLR 2020 • Maha Elbayad, Jiatao Gu, Edouard Grave, Michael Auli
State of the art sequence-to-sequence models for large scale tasks perform a fixed number of computations for each input sequence regardless of whether it is easy or hard to process.
1 code implementation • ICLR 2020 • Junxian He, Jiatao Gu, Jiajun Shen, Marc'Aurelio Ranzato
In this work, we first empirically show that self-training is able to decently improve the supervised baseline on neural sequence generation tasks.
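A hedged outline of the classic self-training loop the paper revisits (placeholder method names; the paper additionally studies injecting noise into the inputs, omitted here):

```python
# Classic self-training loop for sequence generation (illustrative placeholders).
def self_train(model, labeled, unlabeled, rounds=3):
    model.fit(labeled)                                        # supervised baseline
    for _ in range(rounds):
        pseudo = [(x, model.generate(x)) for x in unlabeled]  # pseudo-label
        model.fit(labeled + pseudo)    # retrain on real + pseudo-parallel data
    return model
```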
no code implementations • EACL 2021 • Jiajun Shen, Peng-Jen Chen, Matt Le, Junxian He, Jiatao Gu, Myle Ott, Michael Auli, Marc'Aurelio Ranzato
While we live in an increasingly interconnected world, different places still exhibit strikingly different cultures and many events we experience in our everyday life pertain only to the specific place we live in.
3 code implementations • ICLR 2020 • Xutai Ma, Juan Pino, James Cross, Liezl Puzon, Jiatao Gu
Simultaneous machine translation models start generating a target sequence before they have encoded or read the source sequence.
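As a simple illustration of generating before the full source has been read, the sketch below uses a fixed wait-k style read/write policy; this is a simpler baseline than the learned policy studied in the paper, and `model.next_token_logits` is a hypothetical interface.

```python
# Wait-k style simultaneous decoding sketch (simpler than a learned policy).
def wait_k_decode(model, source_stream, k, eos_id, max_len=128):
    read, out = [], []
    for _ in range(max_len):
        while len(read) < k + len(out) and source_stream:
            read.append(source_stream.pop(0))        # READ next source token
        logits = model.next_token_logits(read, out)  # WRITE one target token
        tok = max(range(len(logits)), key=lambda i: logits[i])
        out.append(tok)
        if tok == eos_id:
            break
    return out
```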
no code implementations • 19 Sep 2019 • Arya D. McCarthy, Xi-An Li, Jiatao Gu, Ning Dong
Posterior collapse plagues VAEs for text, especially for conditional text generation with strong autoregressive decoders.
no code implementations • EMNLP (IWSLT) 2019 • Juan Pino, Liezl Puzon, Jiatao Gu, Xutai Ma, Arya D. McCarthy, Deepak Gopinath
In this work, we evaluate several data augmentation and pretraining approaches for AST, by comparing all on the same datasets.
Automatic Speech Recognition (ASR) +4
1 code implementation • IJCNLP 2019 • Changhan Wang, Anirudh Jain, Danlu Chen, Jiatao Gu
Automatic evaluation of text generation tasks (e.g., machine translation, text summarization, image captioning and video description) usually relies heavily on task-specific metrics, such as BLEU and ROUGE.
1 code implementation • 7 Sep 2019 • Changhan Wang, Kyunghyun Cho, Jiatao Gu
Representing text at the level of bytes and using the 256 byte set as vocabulary is a potential solution to this issue.
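Byte-level tokenization is simple enough to state in a few lines: every string maps to its UTF-8 bytes, so the vocabulary is the fixed 256-symbol byte set and the round trip is lossless.

```python
text = "naïve 翻译"
byte_ids = list(text.encode("utf-8"))    # each id is in range(256)
decoded = bytes(byte_ids).decode("utf-8")
assert decoded == text                   # lossless round trip, vocabulary size 256
```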
no code implementations • ACL 2019 • Jiatao Gu, Yong Wang, Kyunghyun Cho, Victor O. K. Li
Zero-shot translation, translating between language pairs on which a Neural Machine Translation (NMT) system has never been trained, is an emergent property when training the system in multilingual settings.
2 code implementations • NeurIPS 2019 • Jiatao Gu, Changhan Wang, Jake Zhao
We further confirm the flexibility of our model by showing a Levenshtein Transformer trained by machine translation can straightforwardly be used for automatic post-editing.
Ranked #5 on Machine Translation on WMT2016 Romanian-English
no code implementations • TACL 2019 • Jiatao Gu, Qi Liu, Kyunghyun Cho
Conventional neural autoregressive decoding commonly assumes a fixed left-to-right generation order, which may be sub-optimal.
no code implementations • EMNLP 2018 • Jiatao Gu, Yong Wang, Yun Chen, Kyunghyun Cho, Victor O. K. Li
We frame low-resource translation as a meta-learning problem, and we learn to adapt to low-resource languages based on multilingual high-resource language tasks.
no code implementations • 8 Jul 2018 • Yong Wang, Xiao-Ming Wu, Qimai Li, Jiatao Gu, Wangmeng Xiang, Lei Zhang, Victor O. K. Li
The key issue of few-shot learning is learning to generalize.
no code implementations • NAACL 2018 • Jiatao Gu, Hany Hassan, Jacob Devlin, Victor O. K. Li
Our proposed approach utilizes a transfer-learning approach to share lexical and sentence level representations across multiple source languages into one target language.
2 code implementations • ICLR 2018 • Jiatao Gu, James Bradbury, Caiming Xiong, Victor O. K. Li, Richard Socher
Existing approaches to neural machine translation condition each output word on previously generated outputs.
Ranked #3 on Machine Translation on IWSLT2015 English-German
no code implementations • 22 Jun 2017 • Jiatao Gu, Daniel Jiwoong Im, Victor O. K. Li
Previous neural machine translation models used some heuristic search algorithms (e.g., beam search) in order to avoid solving the maximum a posteriori problem over translation sentences at test time.
no code implementations • 20 May 2017 • Jiatao Gu, Yong Wang, Kyunghyun Cho, Victor O. K. Li
In this paper, we extend an attention-based neural machine translation (NMT) model by allowing it to access an entire training set of parallel sentence pairs even after training.
1 code implementation • EMNLP 2017 • Jiatao Gu, Kyunghyun Cho, Victor O. K. Li
Instead of trying to build a new decoding algorithm for any specific decoding objective, we propose the idea of trainable decoding algorithm in which we train a decoding algorithm to find a translation that maximizes an arbitrary decoding objective.
1 code implementation • EACL 2017 • Jiatao Gu, Graham Neubig, Kyunghyun Cho, Victor O. K. Li
Translating in real-time, a.k.a. simultaneous translation, means producing target words before the full source sentence has been read.
7 code implementations • ACL 2016 • Jiatao Gu, Zhengdong Lu, Hang Li, Victor O. K. Li
CopyNet can nicely integrate the regular way of word generation in the decoder with the new copying mechanism which can choose sub-sequences in the input sequence and put them at proper places in the output sequence.
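As a rough sketch of how a copying mechanism mixes generation with copying, the example below uses a pointer-generator-style mixture for illustration (CopyNet's exact joint scoring differs): probability mass is split between a vocabulary distribution and attention weights routed back to the source tokens. All names are illustrative.

```python
import numpy as np

def copy_mixture(p_vocab, attention, src_ids, p_gen):
    """p_vocab: (V,) generation distribution; attention: (src_len,) weights over
    source positions; src_ids: source token ids; p_gen: scalar gate in [0, 1]."""
    p = p_gen * p_vocab.copy()
    for a, tok in zip(attention, src_ids):
        p[tok] += (1.0 - p_gen) * a      # copy probability mass onto source tokens
    return p

# toy example: vocabulary of 6 types, source sentence with token ids [4, 2, 4]
p = copy_mixture(np.full(6, 1 / 6), np.array([0.5, 0.2, 0.3]), [4, 2, 4], p_gen=0.7)
assert abs(p.sum() - 1.0) < 1e-9         # still a valid distribution
```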
no code implementations • IJCNLP 2015 • Jiatao Gu, Victor O. K. Li
Replicated Softmax model, a well-known undirected topic model, is powerful in extracting semantic representations of documents.