Search Results for author: Colin Cherry

Found 44 papers, 4 papers with code

Inverted Projection for Robust Speech Translation

no code implementations ACL (IWSLT) 2021 Dirk Padfield, Colin Cherry

Traditional translation systems trained on written documents perform well for text-based translation but not as well for speech-based applications.

Translation

Bilingual Methods for Adaptive Training Data Selection for Machine Translation

no code implementations AMTA 2016 Boxing Chen, Roland Kuhn, George Foster, Colin Cherry, Fei Huang

In this paper, we propose a new data selection method which uses semi-supervised convolutional neural networks based on bitokens (Bi-SSCNNs) for training machine translation systems from a large bilingual corpus.

Machine Translation Translation

Can Multilinguality benefit Non-autoregressive Machine Translation?

no code implementations 16 Dec 2021 Sweta Agrawal, Julia Kreutzer, Colin Cherry

Non-autoregressive (NAR) machine translation has recently achieved significant improvements, and now outperforms autoregressive (AR) models on some benchmarks, providing an efficient alternative to AR inference.

Machine Translation Translation

Assessing Reference-Free Peer Evaluation for Machine Translation

no code implementations NAACL 2021 Sweta Agrawal, George Foster, Markus Freitag, Colin Cherry

Reference-free evaluation has the potential to make machine translation evaluation substantially more scalable, allowing us to pivot easily to new languages or domains.

Machine Translation Translation

Simultaneous Translation

no code implementations EMNLP 2020 Liang Huang, Colin Cherry, Mingbo Ma, Naveen Arivazhagan, Zhongjun He

Simultaneous translation, which performs translation concurrently with the source speech, is widely useful in many scenarios such as international conferences, negotiations, press releases, legal proceedings, and medicine.

Machine Translation Speech Recognition +2

Sentence Boundary Augmentation For Neural Machine Translation Robustness

no code implementations 21 Oct 2020 Daniel Li, Te I, Naveen Arivazhagan, Colin Cherry, Dirk Padfield

Specifically, in the context of long-form speech translation systems, where the input transcripts come from Automatic Speech Recognition (ASR), the NMT models have to handle errors including phoneme substitutions, grammatical structure, and sentence boundaries, all of which pose challenges to NMT robustness.

Data Augmentation Machine Translation +2

Human-Paraphrased References Improve Neural Machine Translation

1 code implementation WMT (EMNLP) 2020 Markus Freitag, George Foster, David Grangier, Colin Cherry

When used in place of original references, the paraphrased versions produce metric scores that correlate better with human judgment.

Machine Translation Translation

Inference Strategies for Machine Translation with Conditional Masking

no code implementations EMNLP 2020 Julia Kreutzer, George Foster, Colin Cherry

Conditional masked language model (CMLM) training has proven successful for non-autoregressive and semi-autoregressive sequence generation tasks, such as machine translation.

Language Modelling Machine Translation +1

Re-translation versus Streaming for Simultaneous Translation

no code implementations WS 2020 Naveen Arivazhagan, Colin Cherry, Wolfgang Macherey, George Foster

There has been great progress in improving streaming machine translation, a simultaneous paradigm where the system appends to a growing hypothesis as more source content becomes available.

Data Augmentation Machine Translation +1

Re-Translation Strategies For Long Form, Simultaneous, Spoken Language Translation

1 code implementation 6 Dec 2019 Naveen Arivazhagan, Colin Cherry, Te I, Wolfgang Macherey, Pallavi Baljekar, George Foster

As this scenario allows for revisions to our incremental translations, we adopt a re-translation approach to simultaneous translation, where the source is repeatedly translated from scratch as it grows.

Machine Translation Speech Recognition +1

Monotonic Infinite Lookback Attention for Simultaneous Machine Translation

no code implementations ACL 2019 Naveen Arivazhagan, Colin Cherry, Wolfgang Macherey, Chung-Cheng Chiu, Semih Yavuz, Ruoming Pang, Wei Li, Colin Raffel

Simultaneous machine translation begins to translate each source sentence before the source speaker is finished speaking, with applications to live and streaming scenarios.

Machine Translation Translation

Thinking Slow about Latency Evaluation for Simultaneous Machine Translation

no code implementations 31 May 2019 Colin Cherry, George Foster

Simultaneous machine translation attempts to translate a source sentence before it is finished being spoken, with applications to translation of spoken language for live streaming and conversation.

Machine Translation Translation

Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling

3 code implementations 21 Feb 2019 Jonathan Shen, Patrick Nguyen, Yonghui Wu, Zhifeng Chen, Mia X. Chen, Ye Jia, Anjuli Kannan, Tara Sainath, Yuan Cao, Chung-Cheng Chiu, Yanzhang He, Jan Chorowski, Smit Hinsu, Stella Laurenzo, James Qin, Orhan Firat, Wolfgang Macherey, Suyog Gupta, Ankur Bapna, Shuyuan Zhang, Ruoming Pang, Ron J. Weiss, Rohit Prabhavalkar, Qiao Liang, Benoit Jacob, Bowen Liang, HyoukJoong Lee, Ciprian Chelba, Sébastien Jean, Bo Li, Melvin Johnson, Rohan Anil, Rajat Tibrewal, Xiaobing Liu, Akiko Eriguchi, Navdeep Jaitly, Naveen Ari, Colin Cherry, Parisa Haghani, Otavio Good, Youlong Cheng, Raziel Alvarez, Isaac Caswell, Wei-Ning Hsu, Zongheng Yang, Kuan-Chieh Wang, Ekaterina Gonina, Katrin Tomanek, Ben Vanik, Zelin Wu, Llion Jones, Mike Schuster, Yanping Huang, Dehao Chen, Kazuki Irie, George Foster, John Richardson, Klaus Macherey, Antoine Bruguier, Heiga Zen, Colin Raffel, Shankar Kumar, Kanishka Rao, David Rybach, Matthew Murray, Vijayaditya Peddinti, Maxim Krikun, Michiel A. U. Bacchiani, Thomas B. Jablin, Rob Suderman, Ian Williams, Benjamin Lee, Deepti Bhatia, Justin Carlson, Semih Yavuz, Yu Zhang, Ian McGraw, Max Galkin, Qi Ge, Golan Pundak, Chad Whipkey, Todd Wang, Uri Alon, Dmitry Lepikhin, Ye Tian, Sara Sabour, William Chan, Shubham Toshniwal, Baohua Liao, Michael Nirschl, Pat Rondon

Lingvo is a TensorFlow framework offering a complete solution for collaborative deep learning research, with a particular focus towards sequence-to-sequence models.

Sequence-To-Sequence Speech Recognition

Shaping the Narrative Arc: An Information-Theoretic Approach to Collaborative Dialogue

no code implementations 31 Jan 2019 Kory W. Mathewson, Pablo Samuel Castro, Colin Cherry, George Foster, Marc G. Bellemare

We consider the problem of designing an artificial agent capable of interacting with humans in collaborative dialogue to produce creative, engaging narratives.

Efficient Sequence Labeling with Actor-Critic Training

1 code implementation 30 Sep 2018 Saeed Najafi, Colin Cherry, Grzegorz Kondrak

We set out to establish RNNs as an attractive alternative to CRFs for sequence labeling.

Decision Making NER +1

Revisiting Character-Based Neural Machine Translation with Capacity and Compression

no code implementations EMNLP 2018 Colin Cherry, George Foster, Ankur Bapna, Orhan Firat, Wolfgang Macherey

Translating characters instead of words or word-fragments has the potential to simplify the processing pipeline for neural machine translation (NMT), and improve results by eliminating hyper-parameters and manual feature engineering.

Feature Engineering Machine Translation +1

Cost Weighting for Neural Machine Translation Domain Adaptation

no code implementations WS 2017 Boxing Chen, Colin Cherry, George Foster, Samuel Larkin

We compare cost weighting to two traditional domain adaptation techniques developed for statistical machine translation: data selection and sub-corpus weighting.

Domain Adaptation Machine Translation +1

End-to-End Multi-View Networks for Text Classification

no code implementations 19 Apr 2017 Hongyu Guo, Colin Cherry, Jiang Su

For a bag-of-words representation, each view focuses on a different subset of the text's words.

General Classification Text Classification

A Dataset for Detecting Stance in Tweets

no code implementations LREC 2016 Saif Mohammad, Svetlana Kiritchenko, Parinaz Sobhani, Xiaodan Zhu, Colin Cherry

Apart from stance, the tweets are also annotated for whether the target of interest is the target of opinion in the tweet.
