Search Results for author: Ciprian Chelba

Found 19 papers, 2 papers with code

Towards Computationally Verifiable Semantic Grounding for Language Models

no code implementations16 Nov 2022 Chris Alberti, Kuzman Ganchev, Michael Collins, Sebastian Gehrmann, Ciprian Chelba

Compared to a baseline that generates text using greedy search, we demonstrate two techniques that improve the fluency and semantic accuracy of the generated text: the first samples multiple candidate text sequences, from which the semantic parser chooses.

Language Modelling
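The sample-and-rerank idea can be sketched as follows. The toy `generate` and `parse` functions below are illustrative stand-ins for the paper's language model and semantic parser, not the authors' components; a real system would sample from an LM rather than cycle through canned templates.

```python
def generate(meaning, num_samples):
    """Toy 'sampler' for a (key, value) meaning. A real system would draw
    samples from a language model; here we just cycle through templates,
    the first of which is deliberately unparseable."""
    key, value = meaning
    templates = [f"{key} maybe {value}?", f"{key} is {value}", f"{key} equals {value}"]
    return [templates[i % len(templates)] for i in range(num_samples)]

def parse(text):
    """Toy semantic parser: inverts the two well-formed templates."""
    for sep in (" is ", " equals "):
        if sep in text:
            key, value = text.split(sep, 1)
            return (key, value)
    return None

def grounded_generate(meaning, num_samples=8):
    """Return the first candidate whose parse round-trips to the input
    meaning, i.e. a computationally verified verbalization."""
    for candidate in generate(meaning, num_samples):
        if parse(candidate) == meaning:
            return candidate
    return None  # no verifiable candidate found
```

The key property is that the returned text is verified by round-tripping through the parser, rather than trusted on fluency alone.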

Data Troubles in Sentence Level Confidence Estimation for Machine Translation

no code implementations26 Oct 2020 Ciprian Chelba, Junpei Zhou, Yuezhang Li, Hideto Kazawa, Jeff Klingner, Mengmeng Niu

For an English-Spanish translation model operating at $SACC = 0.89$ according to a non-expert annotator pool, we can derive a confidence estimate that labels 0.5-0.6 of the $good$ translations in an "in-domain" test set with 0.95 precision.

Machine Translation Sentence +1
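One way such an operating point can be found on held-out data is to sweep confidence thresholds and keep the lowest one whose accepted set still meets the target precision. The sketch below uses synthetic scores and labels; it is not the paper's confidence estimator.

```python
def threshold_at_precision(scores, is_good, target_precision):
    """Pick the lowest confidence threshold whose accepted ('good')
    predictions still reach the target precision.
    scores: one confidence score per sentence; is_good: gold binary labels.
    Returns (threshold, recall_of_good) or None if unreachable."""
    pairs = sorted(zip(scores, is_good), reverse=True)
    total_good = sum(is_good)
    best = None
    tp = fp = 0
    for score, good in pairs:  # lower the threshold one sentence at a time
        tp += good
        fp += 1 - good
        precision = tp / (tp + fp)
        if precision >= target_precision:
            best = (score, tp / total_good)
    return best
```

The returned recall corresponds to the fraction of good translations labeled at the chosen precision, analogous to the 0.5-0.6 figure quoted in the abstract.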

Multi-Stage Influence Function

no code implementations NeurIPS 2020 Hongge Chen, Si Si, Yang Li, Ciprian Chelba, Sanjiv Kumar, Duane Boning, Cho-Jui Hsieh

With this score, we can identify the pretraining examples in the pretraining task that contribute most to a prediction in the finetuning task.

Transfer Learning
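The paper's multi-stage score tracks influence from the pretraining task through to the finetuned model. The sketch below shows only the classic single-stage, first-order influence idea on a logistic-loss toy, with the Hessian crudely replaced by the identity; it is a simplification for illustration, not the authors' method.

```python
import numpy as np

def logistic_grad(w, x, y):
    """Gradient of the logistic loss -log sigmoid(y * w.x) w.r.t. w, y in {-1, +1}."""
    s = 1.0 / (1.0 + np.exp(-y * (w @ x)))
    return -(1.0 - s) * y * x

def influence(w, x_train, y_train, x_test, y_test):
    """First-order influence of upweighting a training point on a test loss:
    -grad_test . H^{-1} grad_train, with H approximated by the identity.
    Negative values mean the training point helps the test prediction."""
    return -logistic_grad(w, x_test, y_test) @ logistic_grad(w, x_train, y_train)
```

Ranking pretraining examples by this score (most negative first) is the single-stage analogue of the paper's procedure for finding the examples that contribute most to a finetuning prediction.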

Practical Perspectives on Quality Estimation for Machine Translation

no code implementations2 May 2020 Junpei Zhou, Ciprian Chelba, Yuezhang Li

Sentence level quality estimation (QE) for machine translation (MT) attempts to predict the translation edit rate (TER) cost of post-editing work required to correct MT output.

Binary Classification General Classification +4
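TER counts the edits (insertions, deletions, substitutions, and block shifts) needed to turn MT output into a reference, normalized by reference length. The sketch below omits the shift edit for brevity, which reduces it to plain word-level edit distance:

```python
def edit_distance(hyp, ref):
    """Word-level Levenshtein distance (insertions, deletions, substitutions)."""
    m, n = len(hyp), len(ref)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if hyp[i - 1] == ref[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,      # deletion
                          d[i][j - 1] + 1,      # insertion
                          d[i - 1][j - 1] + cost)  # substitution / match
    return d[m][n]

def ter_no_shifts(hyp, ref):
    """TER without the block-shift edit: edits normalized by reference length."""
    ref_tokens = ref.split()
    return edit_distance(hyp.split(), ref_tokens) / len(ref_tokens)
```

A QE model then tries to predict this cost for each sentence without access to the reference.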

Faster Transformer Decoding: N-gram Masked Self-Attention

no code implementations14 Jan 2020 Ciprian Chelba, Mia Chen, Ankur Bapna, Noam Shazeer

Motivated by the fact that most of the information relevant to the prediction of target tokens is drawn from the source sentence $S=s_1, \ldots, s_S$, we propose truncating the target-side window used for computing self-attention by making an $N$-gram assumption.
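A minimal numpy illustration of the proposed mask: decoder position $i$ attends only to the previous $n$ positions instead of the full prefix. This is a sketch of the masking idea only, not the paper's Transformer implementation.

```python
import numpy as np

def ngram_causal_mask(seq_len, n):
    """Boolean mask: position i may attend only to positions j with
    i - n < j <= i (causal attention truncated to an n-token window)."""
    i = np.arange(seq_len)[:, None]
    j = np.arange(seq_len)[None, :]
    return (j <= i) & (j > i - n)

def masked_self_attention(q, k, v, n):
    """Scaled dot-product self-attention under the n-gram causal mask."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = np.where(ngram_causal_mask(len(q), n), scores, -np.inf)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

Because each position now reads a constant-size window rather than the whole generated prefix, per-step decoding cost no longer grows with target length, which is the source of the speedup.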

Tagged Back-Translation

no code implementations WS 2019 Isaac Caswell, Ciprian Chelba, David Grangier

Recent work in Neural Machine Translation (NMT) has shown significant quality gains from noised-beam decoding during back-translation, a method to generate synthetic parallel data.

Machine Translation NMT +1
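Per the title, the paper's alternative to noising is to mark back-translated data with a reserved tag token so the model can distinguish synthetic data from genuine bitext. A minimal sketch; the literal tag string `<BT>` is an illustrative assumption:

```python
def tag_back_translated(synthetic_pairs, tag="<BT>"):
    """Prepend a reserved tag token to the source side of back-translated
    sentence pairs; genuine parallel data is left untagged."""
    return [(f"{tag} {src}", tgt) for src, tgt in synthetic_pairs]
```

The tagged and untagged corpora are then simply concatenated for training, letting the model learn how much to trust each data source.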

Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling

2 code implementations21 Feb 2019 Jonathan Shen, Patrick Nguyen, Yonghui Wu, Zhifeng Chen, Mia X. Chen, Ye Jia, Anjuli Kannan, Tara Sainath, Yuan Cao, Chung-Cheng Chiu, Yanzhang He, Jan Chorowski, Smit Hinsu, Stella Laurenzo, James Qin, Orhan Firat, Wolfgang Macherey, Suyog Gupta, Ankur Bapna, Shuyuan Zhang, Ruoming Pang, Ron J. Weiss, Rohit Prabhavalkar, Qiao Liang, Benoit Jacob, Bowen Liang, HyoukJoong Lee, Ciprian Chelba, Sébastien Jean, Bo Li, Melvin Johnson, Rohan Anil, Rajat Tibrewal, Xiaobing Liu, Akiko Eriguchi, Navdeep Jaitly, Naveen Ari, Colin Cherry, Parisa Haghani, Otavio Good, Youlong Cheng, Raziel Alvarez, Isaac Caswell, Wei-Ning Hsu, Zongheng Yang, Kuan-Chieh Wang, Ekaterina Gonina, Katrin Tomanek, Ben Vanik, Zelin Wu, Llion Jones, Mike Schuster, Yanping Huang, Dehao Chen, Kazuki Irie, George Foster, John Richardson, Klaus Macherey, Antoine Bruguier, Heiga Zen, Colin Raffel, Shankar Kumar, Kanishka Rao, David Rybach, Matthew Murray, Vijayaditya Peddinti, Maxim Krikun, Michiel A. U. Bacchiani, Thomas B. Jablin, Rob Suderman, Ian Williams, Benjamin Lee, Deepti Bhatia, Justin Carlson, Semih Yavuz, Yu Zhang, Ian McGraw, Max Galkin, Qi Ge, Golan Pundak, Chad Whipkey, Todd Wang, Uri Alon, Dmitry Lepikhin, Ye Tian, Sara Sabour, William Chan, Shubham Toshniwal, Baohua Liao, Michael Nirschl, Pat Rondon

Lingvo is a TensorFlow framework offering a complete solution for collaborative deep learning research, with a particular focus on sequence-to-sequence models.

Sequence-To-Sequence Speech Recognition

Denoising Neural Machine Translation Training with Trusted Data and Online Data Selection

no code implementations WS 2018 Wei Wang, Taro Watanabe, Macduff Hughes, Tetsuji Nakagawa, Ciprian Chelba

Measuring the domain relevance of data and identifying or selecting well-matched in-domain data for machine translation (MT) is a well-studied topic, but denoising is not.

Denoising Machine Translation +2

GroupReduce: Block-Wise Low-Rank Approximation for Neural Language Model Shrinking

no code implementations NeurIPS 2018 Patrick H. Chen, Si Si, Yang Li, Ciprian Chelba, Cho-Jui Hsieh

Model compression is essential for serving large deep neural nets on devices with limited resources or applications that require real-time responses.

Language Modelling Model Compression +1
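The block-wise low-rank idea can be sketched with plain SVD: partition the rows of a large embedding (or softmax) matrix into blocks, e.g. by word frequency, and approximate each block separately. GroupReduce additionally learns the grouping and uses weighted low-rank approximation; the version below is a simplified illustration only.

```python
import numpy as np

def blockwise_low_rank(emb, blocks, rank):
    """Approximate each row-block of an embedding matrix with its own
    truncated SVD. In practice frequent-word blocks would get a higher
    rank than rare-word blocks to spend parameters where they matter."""
    approx = np.zeros_like(emb)
    for rows in blocks:
        u, s, vt = np.linalg.svd(emb[rows], full_matrices=False)
        approx[rows] = (u[:, :rank] * s[:rank]) @ vt[:rank]
    return approx
```

Storing the per-block factors instead of the full matrix is what shrinks the model for memory-limited or latency-sensitive serving.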

N-gram Language Modeling using Recurrent Neural Network Estimation

no code implementations31 Mar 2017 Ciprian Chelba, Mohammad Norouzi, Samy Bengio

Experiments on a small corpus (UPenn Treebank, one million words of training data and 10k vocabulary) have found the LSTM cell with dropout to be the best model for encoding the $n$-gram state when compared with feed-forward and vanilla RNN models.

Language Modelling Sentence
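The $n$-gram assumption means the model conditions only on a truncated state of the $n-1$ most recent tokens rather than the full history. A minimal helper showing that truncation, padding short histories with a hypothetical `<s>` marker; the paper's actual contribution, estimating such states with LSTM cells, is not reproduced here.

```python
def ngram_state(history, n, bos="<s>"):
    """Truncate an arbitrary token history to the n-1 most recent tokens,
    left-padding with a beginning-of-sentence marker; this tuple is the
    only context an n-gram model (or its RNN estimator) sees."""
    if n < 2:
        return ()  # a unigram model conditions on nothing
    return tuple(([bos] * (n - 1) + list(history))[-(n - 1):])
```

Because many histories collapse to the same truncated state, the estimator's predictions can be precomputed and shared across them.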

Sparse Non-negative Matrix Language Modeling

no code implementations TACL 2016 Joris Pelemans, Noam Shazeer, Ciprian Chelba

We evaluate SNM language models on two corpora: the One Billion Word Benchmark and a subset of the LDC English Gigaword corpus.

Automatic Speech Recognition (ASR) Language Modelling +1

Multinomial Loss on Held-out Data for the Sparse Non-negative Matrix Language Model

no code implementations5 Nov 2015 Ciprian Chelba, Fernando Pereira

In experiments on the One Billion Word Benchmark, we slightly improve on our previous results, which used a different loss function and leave-one-out training on a subset of the main training set.

Language Modelling

Skip-gram Language Modeling Using Sparse Non-negative Matrix Probability Estimation

no code implementations3 Dec 2014 Noam Shazeer, Joris Pelemans, Ciprian Chelba

We present a novel family of language model (LM) estimation techniques named Sparse Non-negative Matrix (SNM) estimation.

Language Modelling
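In SNM estimation, the probability of a word given its history is built from non-negative entries of a feature-by-word matrix, summed over the features (e.g. n-grams or skip-grams) active in the history and normalized over the vocabulary. A toy sketch of that scoring rule, omitting the adjustment-function learning that makes the method competitive:

```python
import numpy as np

def snm_prob(M, active_features):
    """SNM-style estimate: sum the non-negative entries M[f, w] over the
    features f active in the history, then normalize over the vocabulary
    axis to obtain a distribution over next words."""
    scores = M[active_features].sum(axis=0)  # shape: (vocab_size,)
    return scores / scores.sum()
```

Non-negativity is what lets each feature's column be interpreted as additive evidence for the next word, and it keeps the summed scores directly normalizable.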
