no code implementations • WMT (EMNLP) 2020 • Tanfang Chen, Weiwei Wang, Wenyang Wei, Xing Shi, Xiangang Li, Jieping Ye, Kevin Knight
This paper describes DiDi AI Labs' submission to the WMT2020 news translation shared task.
no code implementations • WMT (EMNLP) 2021 • Farhad Akhbardeh, Arkady Arkhangorodsky, Magdalena Biesialska, Ondřej Bojar, Rajen Chatterjee, Vishrav Chaudhary, Marta R. Costa-Jussa, Cristina España-Bonet, Angela Fan, Christian Federmann, Markus Freitag, Yvette Graham, Roman Grundkiewicz, Barry Haddow, Leonie Harter, Kenneth Heafield, Christopher Homan, Matthias Huck, Kwabena Amponsah-Kaakyire, Jungo Kasai, Daniel Khashabi, Kevin Knight, Tom Kocmi, Philipp Koehn, Nicholas Lourie, Christof Monz, Makoto Morishita, Masaaki Nagata, Ajay Nagesh, Toshiaki Nakazawa, Matteo Negri, Santanu Pal, Allahsera Auguste Tapo, Marco Turchi, Valentin Vydrin, Marcos Zampieri
This paper presents the results of the news translation task, the multilingual low-resource translation for Indo-European languages, the triangular translation task, and the automatic post-editing task organised as part of the Conference on Machine Translation (WMT) 2021. In the news task, participants were asked to build machine translation systems for any of 10 language pairs, to be evaluated on test sets consisting mainly of news stories.
no code implementations • EMNLP (ACL) 2021 • Arkady Arkhangorodsky, Christopher Chu, Scot Fang, Yiqi Huang, Denglin Jiang, Ajay Nagesh, Boliang Zhang, Kevin Knight
We use the re-translation strategy to translate the streamed speech, resulting in caption flicker.
Automatic Speech Recognition (ASR)
no code implementations • 20 Sep 2021 • Arkady Arkhangorodsky, Scot Fang, Victoria Knight, Ajay Nagesh, Maria Ryskina, Kevin Knight
Task-oriented dialog systems are often trained on human/human dialogs, such as those collected from Wizard-of-Oz interfaces.
3 code implementations • EMNLP (BlackboxNLP) 2021 • Maria Ryskina, Kevin Knight
Embedding words in high-dimensional vector spaces has proven valuable in many natural language applications.
1 code implementation • 8 Feb 2021 • Boliang Zhang, Ying Lyu, Ning Ding, Tianhao Shen, Zhaoyang Jia, Kun Han, Kevin Knight
This paper describes our submission for the End-to-end Multi-domain Task Completion Dialog shared task at the 9th Dialog System Technology Challenge (DSTC-9).
no code implementations • 24 Dec 2020 • Xing Shi, Yijun Xiao, Kevin Knight
Using different EoS types in target sentences of different lengths exposes and eliminates this implicit smoothing.
1 code implementation • 9 Nov 2020 • Xiaodan Hu, Pengfei Yu, Kevin Knight, Heng Ji, Bo Li, Honghui Shi
Experiments show that our approach can accurately illustrate 78% of the textual attributes, which also help MUSE capture the subject in a more creative and expressive way.
no code implementations • 16 Oct 2020 • Tanfang Chen, Weiwei Wang, Wenyang Wei, Xing Shi, Xiangang Li, Jieping Ye, Kevin Knight
This paper describes DiDi AI Labs' submission to the WMT2020 news translation shared task.
1 code implementation • INLG (ACL) 2020 • Qingyun Wang, Qi Zeng, Lifu Huang, Kevin Knight, Heng Ji, Nazneen Fatema Rajani
To assist the human review process, we build a novel ReviewRobot to automatically assign a review score and write comments for multiple categories such as novelty and meaningful comparison.
1 code implementation • 9 Oct 2020 • Arkady Arkhangorodsky, Amittai Axelrod, Christopher Chu, Scot Fang, Yiqi Huang, Ajay Nagesh, Xing Shi, Boliang Zhang, Kevin Knight
We create a new task-oriented dialog platform (MEEP) where agents are given considerable freedom in terms of utterances and API calls, but are constrained to work within a push-button environment.
no code implementations • EMNLP 2020 • Christopher Chu, Scot Fang, Kevin Knight
We demonstrate a program that learns to pronounce Chinese text in Mandarin, without a pronunciation dictionary.
no code implementations • EMNLP 2020 • Christopher Chu, Raphael Valenti, Kevin Knight
We solve difficult word-based substitution codes by constructing a decoding lattice and searching that lattice with a neural language model.
no code implementations • WS 2020 • Ebrahim Ansari, Amittai Axelrod, Nguyen Bach, Ondřej Bojar, Roldano Cattoni, Fahim Dalvi, Nadir Durrani, Marcello Federico, Christian Federmann, Jiatao Gu, Fei Huang, Kevin Knight, Xutai Ma, Ajay Nagesh, Matteo Negri, Jan Niehues, Juan Pino, Elizabeth Salesky, Xing Shi, Sebastian Stüker, Marco Turchi, Alexander Waibel, Changhan Wang
The evaluation campaign of the International Conference on Spoken Language Translation (IWSLT 2020) featured six challenge tracks this year: (i) Simultaneous speech translation, (ii) Video speech translation, (iii) Offline speech translation, (iv) Conversational speech translation, (v) Open domain translation, and (vi) Non-native speech translation.
no code implementations • ACL 2020 • Boliang Zhang, Ajay Nagesh, Kevin Knight
Web-crawled data provides a good source of parallel corpora for training machine translation models.
Ranked #1 on Machine Translation on WMT2019 English-Japanese
no code implementations • ACL 2019 • Nima Pourdamghani, Nada Aldarrab, Marjan Ghazvininejad, Kevin Knight, Jonathan May
Given a rough, word-by-word gloss of a source language sentence, target language natives can uncover the latent, fully-fluent rendering of the translation.
2 code implementations • ACL 2019 • Qingyun Wang, Lifu Huang, Zhiying Jiang, Kevin Knight, Heng Ji, Mohit Bansal, Yi Luan
We present a PaperRobot who performs as an automatic research assistant by (1) conducting deep understanding of a large collection of human-written papers in a target domain and constructing comprehensive background knowledge graphs (KGs); (2) creating new ideas by predicting links from the background KGs, by combining graph attention and contextual text attention; (3) incrementally writing some key elements of a new paper based on memory-attention networks: from the input title along with predicted related entities to generate a paper abstract, from the abstract to generate conclusion and future work, and finally from future work to generate a title for a follow-on paper.
2 code implementations • 14 Nov 2018 • Lili Yao, Nanyun Peng, Ralph Weischedel, Kevin Knight, Dongyan Zhao, Rui Yan
Automatic storytelling is challenging since it requires generating long, coherent natural language to describe a sensible sequence of events.
1 code implementation • 9 Oct 2018 • Xusen Yin, Nada Aldarrab, Beáta Megyesi, Kevin Knight
European libraries and archives are filled with enciphered manuscripts from the early modern period.
1 code implementation • WS 2018 • Qingyun Wang, Xiaoman Pan, Lifu Huang, Boliang Zhang, Zhiying Jiang, Heng Ji, Kevin Knight
We aim to automatically generate natural language descriptions about an input structured knowledge base (KB).
no code implementations • 16 Aug 2018 • Nelson F. Liu, Jonathan May, Michael Pust, Kevin Knight
Most statistical machine translation systems cannot translate words that are unseen in the training data.
no code implementations • COLING 2018 • Tim O'Gorman, Michael Regan, Kira Griffitt, Ulf Hermjakob, Kevin Knight, Martha Palmer
There are few corpora that endeavor to represent the semantic content of entire documents.
no code implementations • COLING 2018 • Heng Ji, Kevin Knight
People often create obfuscated language for online communication to avoid Internet censorship, share sensitive information, express strong sentiment or emotion, plan for secret actions, trade illegal products, or simply hold interesting conversations.
no code implementations • ACL 2018 • Ulf Hermjakob, Jonathan May, Kevin Knight
We present uroman, a tool for converting text in myriads of languages and scripts such as Chinese, Arabic and Cyrillic into a common Latin-script representation.
no code implementations • ACL 2018 • Ulf Hermjakob, Jonathan May, Michael Pust, Kevin Knight
In a corruption of John Searle's famous AI thought experiment, the Chinese Room (Searle, 1980), we twist its original intent by enabling humans to translate text, e.g. from Uyghur to English, even if they don't have any prior knowledge of the source language.
no code implementations • 2 Jun 2018 • Xing Shi, Shizhen Xu, Kevin Knight
We present a GPU-based Locality Sensitive Hashing (LSH) algorithm to speed up beam search for sequence models.
no code implementations • NAACL 2018 • Nima Pourdamghani, Marjan Ghazvininejad, Kevin Knight
We present a method for improving word alignments using word similarities.
no code implementations • NAACL 2018 • Marjan Ghazvininejad, Yejin Choi, Kevin Knight
We present the first neural poetry translation system.
no code implementations • NAACL 2018 • Boliang Zhang, Ying Lin, Xiaoman Pan, Di Lu, Jonathan May, Kevin Knight, Heng Ji
We demonstrate ELISA-EDL, a state-of-the-art re-trainable system to extract entity mentions from low-resource languages, link them to external English knowledge bases, and visualize locations related to disaster topics on a world heatmap.
no code implementations • WS 2018 • Nanyun Peng, Marjan Ghazvininejad, Jonathan May, Kevin Knight
We present a general framework for analyzing existing story corpora to generate controllable and creative new stories.
no code implementations • ACL 2018 • Hannah Rashkin, Antoine Bosselut, Maarten Sap, Kevin Knight, Yejin Choi
Understanding a narrative requires reading between the lines and reasoning about the unspoken but obvious implications about events and people's mental states - a capability that is trivial for humans but remarkably hard for machines.
Ranked #2 on Emotion Classification on ROCStories
2 code implementations • ACL 2018 • Qingyun Wang, Zhi-Hao Zhou, Lifu Huang, Spencer Whitehead, Boliang Zhang, Heng Ji, Kevin Knight
We present a paper abstract writing system based on an attentive neural sequence-to-sequence model that can take a title as input and automatically generate an abstract.
Ranked #1 on Paper generation on ACL Title and Abstract Dataset
no code implementations • EMNLP 2018 • Lifu Huang, Kyunghyun Cho, Boliang Zhang, Heng Ji, Kevin Knight
We construct a multilingual common semantic space based on distributional semantics, where words from multiple languages are projected into a shared space to enable knowledge and resource transfer across languages.
1 code implementation • 7 Feb 2018 • Prashanth Gurunath Shivakumar, Haoqi Li, Kevin Knight, Panayiotis Georgiou
In this work we model ASR as a phrase-based noisy transformation channel and propose an error correction system that can learn from the aggregate errors of all the independent modules constituting the ASR and attempt to invert those.
Automatic Speech Recognition (ASR)
no code implementations • NAACL 2018 • Yining Chen, Sorcha Gilroy, Andreas Maletti, Jonathan May, Kevin Knight
We investigate the computational complexity of various problems for simple recurrent neural networks (RNNs) as formal models for recognizing weighted languages.
no code implementations • IJCNLP 2017 • Boliang Zhang, Di Lu, Xiaoman Pan, Ying Lin, Halidanmu Abudukelimu, Heng Ji, Kevin Knight
Current supervised name tagging approaches are inadequate for most low-resource languages due to the lack of annotated data and actionable linguistic knowledge.
no code implementations • EMNLP 2017 • Nima Pourdamghani, Kevin Knight
We present a method for translating texts between close language pairs.
no code implementations • WS 2017 • Sudha Rao, Daniel Marcu, Kevin Knight, Hal Daumé III
We propose a novel, Abstract Meaning Representation (AMR) based approach to identifying molecular events/interactions in biomedical text.
no code implementations • ACL 2017 • Xing Shi, Kevin Knight
Compared with Locality Sensitive Hashing (LSH), decoding with word alignments is GPU-friendly, orthogonal to existing speedup methods and more robust across language pairs.
no code implementations • ACL 2017 • Xiaoman Pan, Boliang Zhang, Jonathan May, Joel Nothman, Kevin Knight, Heng Ji
The ambitious goal of this work is to develop a cross-lingual name tagging and linking framework for 282 languages that exist in Wikipedia.
2 code implementations • WS 2016 • Ke Tran, Yonatan Bisk, Ashish Vaswani, Daniel Marcu, Kevin Knight
In this work, we present the first results for neuralizing an Unsupervised Hidden Markov Model.
no code implementations • LREC 2016 • Eunsol Choi, Matic Horvat, Jonathan May, Kevin Knight, Daniel Marcu
Understanding the experimental results of a scientific paper is crucial to understanding its contribution and to comparing it with related work.
1 code implementation • EMNLP 2016 • Barret Zoph, Deniz Yuret, Jonathan May, Kevin Knight
Ensembling and unknown word replacement add another 2 BLEU, which brings the NMT performance on low-resource machine translation close to a strong syntax-based machine translation (SBMT) system, exceeding its performance on one language pair.
1 code implementation • NAACL 2016 • Barret Zoph, Kevin Knight
We build a multi-source machine translation model and train it to maximize the probability of a target English string given French and German sources.
no code implementations • 24 Apr 2015 • Michael Pust, Ulf Hermjakob, Kevin Knight, Daniel Marcu, Jonathan May
To make this work, we transform the AMR structure into a form suitable for the mechanics of SBMT and useful for modeling.
no code implementations • LREC 2014 • Fabienne Braune, Daniel Bauer, Kevin Knight
We investigate formalisms for capturing the relation between semantic graphs and English strings.