1 code implementation • 16 Dec 2022 • Mingda Chen, Paul-Ambroise Duquenne, Pierre Andrews, Justine Kao, Alexandre Mourachko, Holger Schwenk, Marta R. Costa-jussà
In this paper, we propose a text-free evaluation metric for end-to-end S2ST, named BLASER, to avoid the dependency on ASR systems.
Automatic Speech Recognition (ASR) +2
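BLASER scores a translation directly from speech embeddings rather than from ASR transcripts. As a rough illustration of that idea only, the sketch below combines cosine similarities between source, translated, and reference speech embeddings; the stand-in embeddings and the simple averaging of the two similarities are assumptions for illustration, not the published BLASER formula.

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def blaser_like_score(src_emb: np.ndarray, hyp_emb: np.ndarray, ref_emb: np.ndarray) -> float:
    # Compare the translated speech against both the source and the reference,
    # entirely in embedding space, so no ASR transcript is needed.
    return 0.5 * (cosine(src_emb, hyp_emb) + cosine(ref_emb, hyp_emb))

# Stand-in vectors; in practice these would come from a multilingual speech encoder.
rng = np.random.default_rng(0)
src, hyp, ref = rng.normal(size=(3, 1024))
print(blaser_like_score(src, hyp, ref))
```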
1 code implementation • 21 Jul 2022 • Mingda Chen
In this thesis, we describe three lines of work that seek to improve the training and evaluation of neural models using naturally-occurring supervision.
no code implementations • NAACL 2022 • Mingda Chen, Jingfei Du, Ramakanth Pasunuru, Todor Mihaylov, Srini Iyer, Veselin Stoyanov, Zornitsa Kozareva
Self-supervised pretraining has made few-shot learning possible for many NLP tasks.
1 code implementation • 18 Sep 2021 • Mingda Chen, Kevin Gimpel
We introduce TVStoryGen, a story generation dataset that requires generating detailed TV show episode recaps from a brief summary and a set of documents describing the characters involved.
Ranked #1 on Story Generation on Fandom dev
1 code implementation • ACL 2022 • Mingda Chen, Zewei Chu, Sam Wiseman, Kevin Gimpel
Since characters are fundamental to TV series, we also propose two entity-centric evaluation metrics.
1 code implementation • Findings (ACL) 2021 • Mingda Chen, Sam Wiseman, Kevin Gimpel
Datasets for data-to-text generation typically focus either on multi-domain, single-sentence generation or on single-domain, long-form generation.
1 code implementation • 12 Oct 2020 • Mingda Chen, Sam Wiseman, Kevin Gimpel
Experimental results show that our models achieve competitive performance on controlled paraphrase generation and strong performance on controlled machine translation.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Mingda Chen, Zewei Chu, Karl Stratos, Kevin Gimpel
Accurate lexical entailment (LE) and natural language inference (NLI) often require large quantities of costly annotations.
no code implementations • 13 Jul 2020 • Huimei Han, Wenchao Zhai, Zhefu Wu, Ying Li, Jun Zhao, Mingda Chen
Simulation results show that, compared to the existing random access scheme for crowded asynchronous massive MIMO systems, the proposed scheme improves uplink throughput while accurately estimating the effective timing offsets.
no code implementations • WS 2020 • Mingda Chen, Kevin Gimpel
Probabilistic word embeddings have shown effectiveness in capturing notions of generality and entailment, but very little work has investigated the same notions at the sentence level.
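A common way to give embeddings a notion of generality is to represent each item as a diagonal Gaussian and score entailment with an asymmetric divergence. The sketch below illustrates that general approach with KL divergence on made-up toy parameters; it is not the specific models evaluated in the paper.

```python
import numpy as np

def kl_diag_gaussians(mu0, logvar0, mu1, logvar1):
    """KL(N0 || N1) for diagonal Gaussians; a small value suggests N0 is 'contained in' N1."""
    var0, var1 = np.exp(logvar0), np.exp(logvar1)
    return 0.5 * np.sum(logvar1 - logvar0 + (var0 + (mu0 - mu1) ** 2) / var1 - 1.0)

# Toy example: a specific sentence (narrow Gaussian) vs. a general one (wide Gaussian).
mu_specific, logvar_specific = np.zeros(4), np.full(4, -2.0)
mu_general,  logvar_general  = np.zeros(4), np.full(4, 0.0)
print(kl_diag_gaussians(mu_specific, logvar_specific, mu_general, logvar_general))  # small
print(kl_diag_gaussians(mu_general, logvar_general, mu_specific, logvar_specific))  # larger
```

The asymmetry of the divergence is what lets the representation encode direction: the specific-to-general score is lower than the reverse.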
1 code implementation • 21 Nov 2019 • Zewei Chu, Mingda Chen, Jing Chen, Miaosen Wang, Kevin Gimpel, Manaal Faruqui, Xiance Si
We present a large-scale dataset for the task of rewriting an ill-formed natural language question to a well-formed one.
43 code implementations • ICLR 2020 • Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, Radu Soricut
Increasing model size when pretraining natural language representations often results in improved performance on downstream tasks.
Ranked #1 on Natural Language Inference on QNLI
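This is the ALBERT paper, which keeps the benefits of larger pretrained models while reducing parameters through factorized embedding parameterization and cross-layer parameter sharing. As a small usage sketch (assuming the Hugging Face transformers package and the albert-base-v2 checkpoint, which are not part of this page), a pretrained ALBERT encoder can be loaded as follows.

```python
import torch
from transformers import AlbertTokenizer, AlbertModel

tokenizer = AlbertTokenizer.from_pretrained("albert-base-v2")
model = AlbertModel.from_pretrained("albert-base-v2")

inputs = tokenizer("A Lite BERT shares parameters across layers.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, sequence_length, 768)
```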
2 code implementations • IJCNLP 2019 • Mingda Chen, Zewei Chu, Kevin Gimpel
Prior work on pretrained sentence embeddings, and the benchmarks used to evaluate them, focuses on the capabilities of stand-alone sentences.
2 code implementations • IJCNLP 2019 • Mingda Chen, Zewei Chu, Yang Chen, Karl Stratos, Kevin Gimpel
Rich entity representations are useful for a wide class of problems involving entities.
1 code implementation • NAACL 2018 • Mingda Chen, Kevin Gimpel
Word embedding parameters often dominate overall model sizes in neural methods for natural language processing.
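To see why the embedding table tends to dominate, consider a small tagger with a 100k-word vocabulary. The sketch below (PyTorch; the layer sizes are arbitrary illustrative choices, not taken from the paper) counts parameters in the embedding layer versus the rest of the network.

```python
import torch.nn as nn

vocab_size, emb_dim, hidden = 100_000, 300, 256
model = nn.ModuleDict({
    "embedding": nn.Embedding(vocab_size, emb_dim),
    "encoder": nn.LSTM(emb_dim, hidden, batch_first=True),
    "classifier": nn.Linear(hidden, 10),
})

emb_params = sum(p.numel() for p in model["embedding"].parameters())
other_params = sum(p.numel() for name, module in model.items() if name != "embedding"
                   for p in module.parameters())
print(f"embedding: {emb_params:,}  rest: {other_params:,}")
# The 100k x 300 embedding table alone holds ~30M parameters,
# far more than the LSTM encoder and classifier combined.
```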
1 code implementation • EMNLP 2018 • Mingda Chen, Qingming Tang, Karen Livescu, Kevin Gimpel
Our model family consists of a latent-variable generative model and a discriminative labeler.
Ranked #70 on Named Entity Recognition (NER) on CoNLL 2003 (English)
no code implementations • ACL 2019 • Mingda Chen, Qingming Tang, Sam Wiseman, Kevin Gimpel
Prior work on controllable text generation usually assumes that the controlled attribute can take on one of a small set of values known a priori.
no code implementations • ICLR 2019 • Qingming Tang, Mingda Chen, Weiran Wang, Karen Livescu
Existing variational recurrent models typically use stochastic recurrent connections to model the dependence among neighboring latent variables, while generation assumes independence of generated data per time step given the latent sequence.
1 code implementation • NAACL 2019 • Mingda Chen, Qingming Tang, Sam Wiseman, Kevin Gimpel
We propose a generative model for a sentence that uses two latent variables, with one intended to represent the syntax of the sentence and the other to represent its semantics.
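The core idea is an encoder that emits two separate latent vectors, one meant to carry syntax and one semantics, which the decoder then conditions on jointly. The sketch below illustrates that two-latent setup with a toy reparameterized encoder; the layer choices and sizes are assumptions for illustration, not the architecture or training objectives from the paper.

```python
import torch
import torch.nn as nn

class TwoLatentEncoder(nn.Module):
    """Toy encoder producing separate syntax and semantics latents (illustrative only)."""
    def __init__(self, vocab_size=5000, emb_dim=128, z_dim=64):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, emb_dim, batch_first=True)
        # Separate heads so the two latents can specialize during training.
        self.syntax_head = nn.Linear(emb_dim, 2 * z_dim)     # mean and log-variance
        self.semantics_head = nn.Linear(emb_dim, 2 * z_dim)

    def forward(self, tokens):
        _, h = self.rnn(self.emb(tokens))        # h: (1, batch, emb_dim)
        h = h.squeeze(0)

        def sample(stats):
            mu, logvar = stats.chunk(2, dim=-1)
            return mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization

        z_syntax = sample(self.syntax_head(h))
        z_semantics = sample(self.semantics_head(h))
        # A decoder would condition on the concatenation of the two latents.
        return torch.cat([z_syntax, z_semantics], dim=-1)

enc = TwoLatentEncoder()
z = enc(torch.randint(0, 5000, (2, 7)))   # batch of 2 token sequences of length 7
print(z.shape)                            # torch.Size([2, 128])
```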