no code implementations • IWSLT (ACL) 2022 • Antonios Anastasopoulos, Loïc Barrault, Luisa Bentivogli, Marcely Zanon Boito, Ondřej Bojar, Roldano Cattoni, Anna Currey, Georgiana Dinu, Kevin Duh, Maha Elbayad, Clara Emmanuel, Yannick Estève, Marcello Federico, Christian Federmann, Souhir Gahbiche, Hongyu Gong, Roman Grundkiewicz, Barry Haddow, Benjamin Hsu, Dávid Javorský, Vĕra Kloudová, Surafel Lakew, Xutai Ma, Prashant Mathur, Paul McNamee, Kenton Murray, Maria Nǎdejde, Satoshi Nakamura, Matteo Negri, Jan Niehues, Xing Niu, John Ortega, Juan Pino, Elizabeth Salesky, Jiatong Shi, Matthias Sperber, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Yogesh Virkar, Alexander Waibel, Changhan Wang, Shinji Watanabe
The evaluation campaign of the 19th International Conference on Spoken Language Translation featured eight shared tasks: (i) Simultaneous speech translation, (ii) Offline speech translation, (iii) Speech to speech translation, (iv) Low-resource speech translation, (v) Multilingual speech translation, (vi) Dialect speech translation, (vii) Formality control for speech translation, (viii) Isometric speech translation.
no code implementations • INLG (ACL) 2020 • Hongyu Gong, Linfeng Song, Suma Bhat
Text style transfer aims to rewrite an input sentence in a different text style while preserving its content.
no code implementations • ACL (MWE) 2021 • Jianing Zhou, Hongyu Gong, Suma Bhat
Idiomatic expressions (IE) play an important role in natural language, and have long been a “pain in the neck” for NLP systems.
no code implementations • 24 May 2022 • Paul-Ambroise Duquenne, Hongyu Gong, Benoît Sagot, Holger Schwenk
We present a new approach to perform zero-shot cross-modal transfer between speech and text for translation tasks.
no code implementations • ACL 2022 • Yun Tang, Hongyu Gong, Ning Dong, Changhan Wang, Wei-Ning Hsu, Jiatao Gu, Alexei Baevski, Xian Li, Abdelrahman Mohamed, Michael Auli, Juan Pino
Two pre-training configurations for speech translation and recognition, respectively, are presented to alleviate subtask interference.
no code implementations • 14 Mar 2022 • Ping Yu, Mikel Artetxe, Myle Ott, Sam Shleifer, Hongyu Gong, Ves Stoyanov, Xian Li
All-MLP architectures have attracted increasing interest as an alternative to attention-based models.
Ranked #1 on Zero-Shot Learning on COPA
no code implementations • 16 Dec 2021 • Jianing Zhou, Ziheng Zeng, Hongyu Gong, Suma Bhat
In this paper, we study the task of idiomatic sentence paraphrasing (ISP), which aims to paraphrase a sentence with an IE by replacing the IE with its literal paraphrase.
no code implementations • 15 Dec 2021 • Ann Lee, Hongyu Gong, Paul-Ambroise Duquenne, Holger Schwenk, Peng-Jen Chen, Changhan Wang, Sravya Popuri, Yossi Adi, Juan Pino, Jiatao Gu, Wei-Ning Hsu
To our knowledge, we are the first to establish a textless S2ST technique that can be trained with real-world data and works for multiple language pairs.
1 code implementation • NeurIPS 2021 • Paul-Ambroise Duquenne, Hongyu Gong, Holger Schwenk
Using a similarity metric in that multimodal embedding space, we perform mining of audio in German, French, Spanish and English from Librivox against billions of sentences from Common Crawl.
no code implementations • 15 Oct 2021 • Xutai Ma, Hongyu Gong, Danni Liu, Ann Lee, Yun Tang, Peng-Jen Chen, Wei-Ning Hsu, Philipp Koehn, Juan Pino
We present a direct simultaneous speech-to-speech translation (Simul-S2ST) model; furthermore, the generation of translation is independent of intermediate text representations.
no code implementations • 15 Oct 2021 • Danni Liu, Changhan Wang, Hongyu Gong, Xutai Ma, Yun Tang, Juan Pino
Speech-to-speech translation (S2ST) converts input speech to speech in another language.
no code implementations • ICLR 2022 • Xuan-Phi Nguyen, Hongyu Gong, Yun Tang, Changhan Wang, Philipp Koehn, Shafiq Joty
Modern unsupervised machine translation systems mostly train their models by generating synthetic parallel training data from large unlabeled monolingual corpora of different languages through various means, such as iterative back-translation.
no code implementations • ACL (IWSLT) 2021 • Yun Tang, Hongyu Gong, Xian Li, Changhan Wang, Juan Pino, Holger Schwenk, Naman Goyal
In this paper, we describe our end-to-end multilingual speech translation system submitted to the IWSLT 2021 evaluation campaign on the Multilingual Speech Translation shared task.
no code implementations • NeurIPS 2021 • Hongyu Gong, Yun Tang, Juan Pino, Xian Li
We further propose attention sharing strategies to facilitate parameter sharing and specialization in multilingual and multi-domain sequence modeling.
no code implementations • 7 Jun 2021 • Hongyu Gong, Vishrav Chaudhary, Yuqing Tang, Francisco Guzmán
Cross-lingual document representations enable language understanding in multilingual contexts and allow transfer learning from high-resource to low-resource languages at the document level.
1 code implementation • 24 May 2021 • Hongyu Gong, Alberto Valido, Katherine M. Ingram, Giulia Fanti, Suma Bhat, Dorothy L. Espelage
Abusive language is a massive problem in online social platforms.
no code implementations • NeurIPS 2021 • Xian Li, Hongyu Gong
We show that the common training method of upsampling low-resource languages cannot robustly optimize the population loss, risking either underfitting high-resource languages or overfitting low-resource ones.
no code implementations • 15 Apr 2021 • Hongyu Gong, Xian Li, Dmitriy Genzel
Based on these insights, we propose an adaptive and sparse architecture for multilingual modeling, and train the model to learn shared and language-specific parameters to improve the positive transfer and mitigate the interference.
no code implementations • 13 Apr 2021 • Jianing Zhou, Hongyu Gong, Srihari Nanniyur, Suma Bhat
We study a new application for text generation -- idiomatic sentence generation -- which aims to transfer literal phrases in sentences into their idiomatic counterparts.
1 code implementation • 31 Mar 2021 • Wanzheng Zhu, Hongyu Gong, Rohan Bansal, Zachary Weinberg, Nicolas Christin, Giulia Fanti, Suma Bhat
It is usually apparent to a human moderator that a word is being used euphemistically, but they may not know what the secret meaning is, and therefore whether the message violates policy.
1 code implementation • CONLL 2020 • Hongyu Gong, Suma Bhat, Pramod Viswanath
The meaning of a word is closely linked to sociocultural factors that can change over time and location, resulting in corresponding meaning changes.
no code implementations • WS 2020 • Hongyu Gong, Kshitij Gupta, Akriti Jain, Suma Bhat
Metaphors are a rhetorical use of words based on conceptual mapping, as opposed to their literal use.
1 code implementation • ACL 2020 • Hongyu Gong, Yelong Shen, Dian Yu, Jianshu Chen, Dong Yu
In this paper, we study machine reading comprehension (MRC) on long texts, where a model takes as inputs a lengthy document and a question and then extracts a text span from the document as an answer.
1 code implementation • 10 Oct 2019 • Wanzheng Zhu, Hongyu Gong, Jiaming Shen, Chao Zhang, Jingbo Shang, Suma Bhat, Jiawei Han
In this paper, we study the task of multi-faceted set expansion, which aims to capture all semantic facets in the seed set and return multiple sets of entities, one for each semantic facet.
no code implementations • IJCNLP 2019 • Omer Anjum, Hongyu Gong, Suma Bhat, Wen-mei Hwu, JinJun Xiong
Finding the right reviewers to assess the quality of conference submissions is a time consuming process for conference organizers.
no code implementations • WS 2019 • Tarek Sakakini, Hongyu Gong, Jong Yoon Lee, Robert Schloss, JinJun Xiong, Suma Bhat
One of the challenges of building natural language processing (NLP) applications for education is finding a large domain-specific corpus for the subject of interest (e.g., history or science).
6 code implementations • EACL 2021 • Holger Schwenk, Vishrav Chaudhary, Shuo Sun, Hongyu Gong, Francisco Guzmán
We present an approach based on multilingual sentence embeddings to automatically extract parallel sentences from the content of Wikipedia articles in 85 languages, including several dialects or low-resource languages.
no code implementations • NAACL 2019 • Hongyu Gong, Suma Bhat, Lingfei Wu, JinJun Xiong, Wen-mei Hwu
Our generator employs an attention-based encoder-decoder to transfer a sentence from the source style to the target style.
1 code implementation • ACL 2018 • Hongyu Gong, Tarek Sakakini, Suma Bhat, JinJun Xiong
This is because of the lexical, contextual, and abstraction gaps between a long document of rich details and its concise summary of abstract information.
no code implementations • 23 Jan 2019 • Hongyu Gong, Yuchen Li, Suma Bhat, Pramod Viswanath
Misspelled words of the malicious kind alter specific keywords and are intended to thwart existing automated applications for cyber-environment control, such as detection of harassing content on the Internet and of email spam.
1 code implementation • EMNLP 2018 • Hongyu Gong, Jiaqi Mu, Suma Bhat, Pramod Viswanath
Prepositions are highly polysemous, and their variegated senses encode significant semantic information.
no code implementations • NAACL 2018 • Hongyu Gong, Suma Bhat, Pramod Viswanath
Prepositions are among the most frequent words in English and play complex roles in the syntax and semantics of sentences.
no code implementations • 5 Feb 2017 • Hongyu Gong, Jiaqi Mu, Suma Bhat, Pramod Viswanath
Prepositions are highly polysemous, and their variegated senses encode significant semantic information.
1 code implementation • 29 Nov 2016 • Hongyu Gong, Suma Bhat, Pramod Viswanath
This paper proposes a simple test for compositionality (i.e., literal usage) of a word or phrase in a context-specific way.