Search Results for author: Marine Carpuat

Found 77 papers, 25 papers with code

Machine Translation Believability

1 code implementation EACL (HCINLP) 2021 Marianna Martindale, Kevin Duh, Marine Carpuat

Successful Machine Translation (MT) deployment requires understanding not only the intrinsic qualities of MT output, such as fluency and adequacy, but also user perceptions.

Machine Translation · Translation

The UMD Machine Translation Systems at IWSLT 2016: English-to-French Translation of Speech Transcripts

no code implementations IWSLT 2016 Xing Niu, Marine Carpuat

We describe the University of Maryland machine translation system submitted to the IWSLT 2016 Microsoft Speech Language Translation (MSLT) English-French task.

Machine Translation · Translation

Models and Tasks for Human-Centered Machine Translation

no code implementations MMTLRL (RANLP) 2021 Marine Carpuat

In this talk, I will describe current research directions in my group that aim to make machine translation (MT) more human-centered.

Machine Translation · Sentence +2

How often are errors in natural language reasoning due to paraphrastic variability?

no code implementations 17 Apr 2024 Neha Srikanth, Marine Carpuat, Rachel Rudinger

We propose a metric for evaluating the paraphrastic consistency of natural language reasoning models based on the probability of a model achieving the same correctness on two paraphrases of the same problem.
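The metric described above can be made concrete with a small sketch. This is an illustrative reduction of the general idea only, assuming the metric boils down to the fraction of paraphrase pairs on which a model is equally correct; the function name and data below are hypothetical, not taken from the paper.

```python
def paraphrastic_consistency(correctness_pairs):
    """Estimate the probability that a model achieves the same correctness
    on two paraphrases of the same problem.

    correctness_pairs: list of (bool, bool) tuples recording whether the
    model answered each of the two paraphrases of a problem correctly.
    """
    if not correctness_pairs:
        raise ValueError("need at least one paraphrase pair")
    # A pair "agrees" when the model is right on both or wrong on both.
    agree = sum(1 for a, b in correctness_pairs if a == b)
    return agree / len(correctness_pairs)

# Toy example: the model agrees on 3 of 4 paraphrase pairs.
pairs = [(True, True), (True, False), (False, False), (True, True)]
print(paraphrastic_consistency(pairs))  # → 0.75
```

A perfectly paraphrase-robust model would score 1.0 under this definition, regardless of its overall accuracy.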

Natural Language Inference

Guiding Large Language Models to Post-Edit Machine Translation with Error Annotations

2 code implementations 11 Apr 2024 Dayeon Ki, Marine Carpuat

Machine Translation (MT) remains one of the last NLP tasks where large language models (LLMs) have not yet replaced dedicated supervised systems.

Machine Translation · Translation

XLAVS-R: Cross-Lingual Audio-Visual Speech Representation Learning for Noise-Robust Speech Perception

no code implementations 21 Mar 2024 Hyojung Han, Mohamed Anwar, Juan Pino, Wei-Ning Hsu, Marine Carpuat, Bowen Shi, Changhan Wang

It is designed to maximize the benefits of limited multilingual AV pre-training data, by building on top of audio-only multilingual pre-training and simplifying existing pre-training schemes.

Audio-Visual Speech Recognition · Representation Learning +4

Towards Conceptualization of "Fair Explanation": Disparate Impacts of anti-Asian Hate Speech Explanations on Content Moderators

1 code implementation 23 Oct 2023 Tin Nguyen, Jiannan Xu, Aayushi Roy, Hal Daumé III, Marine Carpuat

We apply this method in the context of content moderation of potential hate speech, and its differential impact on Asian vs. non-Asian proxy moderators, across explanation approaches (saliency map and counterfactual explanation).

Counterfactual · Counterfactual Explanation +1

Controlling Pre-trained Language Models for Grade-Specific Text Simplification

no code implementations 24 May 2023 Sweta Agrawal, Marine Carpuat

Based on these insights, we introduce a simple method that predicts the edit operations required for simplifying a text for a specific grade level on an instance-per-instance basis.

Text Simplification

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

6 code implementations9 Nov 2022 BigScience Workshop, :, Teven Le Scao, Angela Fan, Christopher Akiki, Ellie Pavlick, Suzana Ilić, Daniel Hesslow, Roman Castagné, Alexandra Sasha Luccioni, François Yvon, Matthias Gallé, Jonathan Tow, Alexander M. Rush, Stella Biderman, Albert Webson, Pawan Sasanka Ammanamanchi, Thomas Wang, Benoît Sagot, Niklas Muennighoff, Albert Villanova del Moral, Olatunji Ruwase, Rachel Bawden, Stas Bekman, Angelina McMillan-Major, Iz Beltagy, Huu Nguyen, Lucile Saulnier, Samson Tan, Pedro Ortiz Suarez, Victor Sanh, Hugo Laurençon, Yacine Jernite, Julien Launay, Margaret Mitchell, Colin Raffel, Aaron Gokaslan, Adi Simhi, Aitor Soroa, Alham Fikri Aji, Amit Alfassy, Anna Rogers, Ariel Kreisberg Nitzav, Canwen Xu, Chenghao Mou, Chris Emezue, Christopher Klamm, Colin Leong, Daniel van Strien, David Ifeoluwa Adelani, Dragomir Radev, Eduardo González Ponferrada, Efrat Levkovizh, Ethan Kim, Eyal Bar Natan, Francesco De Toni, Gérard Dupont, Germán Kruszewski, Giada Pistilli, Hady Elsahar, Hamza Benyamina, Hieu Tran, Ian Yu, Idris Abdulmumin, Isaac Johnson, Itziar Gonzalez-Dios, Javier de la Rosa, Jenny Chim, Jesse Dodge, Jian Zhu, Jonathan Chang, Jörg Frohberg, Joseph Tobing, Joydeep Bhattacharjee, Khalid Almubarak, Kimbo Chen, Kyle Lo, Leandro von Werra, Leon Weber, Long Phan, Loubna Ben allal, Ludovic Tanguy, Manan Dey, Manuel Romero Muñoz, Maraim Masoud, María Grandury, Mario Šaško, Max Huang, Maximin Coavoux, Mayank Singh, Mike Tian-Jian Jiang, Minh Chien Vu, Mohammad A. 
Jauhar, Mustafa Ghaleb, Nishant Subramani, Nora Kassner, Nurulaqilla Khamis, Olivier Nguyen, Omar Espejel, Ona de Gibert, Paulo Villegas, Peter Henderson, Pierre Colombo, Priscilla Amuok, Quentin Lhoest, Rheza Harliman, Rishi Bommasani, Roberto Luis López, Rui Ribeiro, Salomey Osei, Sampo Pyysalo, Sebastian Nagel, Shamik Bose, Shamsuddeen Hassan Muhammad, Shanya Sharma, Shayne Longpre, Somaieh Nikpoor, Stanislav Silberberg, Suhas Pai, Sydney Zink, Tiago Timponi Torrent, Timo Schick, Tristan Thrush, Valentin Danchev, Vassilina Nikoulina, Veronika Laippala, Violette Lepercq, Vrinda Prabhu, Zaid Alyafeai, Zeerak Talat, Arun Raja, Benjamin Heinzerling, Chenglei Si, Davut Emre Taşar, Elizabeth Salesky, Sabrina J. Mielke, Wilson Y. Lee, Abheesht Sharma, Andrea Santilli, Antoine Chaffin, Arnaud Stiegler, Debajyoti Datta, Eliza Szczechla, Gunjan Chhablani, Han Wang, Harshit Pandey, Hendrik Strobelt, Jason Alan Fries, Jos Rozen, Leo Gao, Lintang Sutawika, M Saiful Bari, Maged S. Al-shaibani, Matteo Manica, Nihal Nayak, Ryan Teehan, Samuel Albanie, Sheng Shen, Srulik Ben-David, Stephen H. 
Bach, Taewoon Kim, Tali Bers, Thibault Fevry, Trishala Neeraj, Urmish Thakker, Vikas Raunak, Xiangru Tang, Zheng-Xin Yong, Zhiqing Sun, Shaked Brody, Yallow Uri, Hadar Tojarieh, Adam Roberts, Hyung Won Chung, Jaesung Tae, Jason Phang, Ofir Press, Conglong Li, Deepak Narayanan, Hatim Bourfoune, Jared Casper, Jeff Rasley, Max Ryabinin, Mayank Mishra, Minjia Zhang, Mohammad Shoeybi, Myriam Peyrounette, Nicolas Patry, Nouamane Tazi, Omar Sanseviero, Patrick von Platen, Pierre Cornette, Pierre François Lavallée, Rémi Lacroix, Samyam Rajbhandari, Sanchit Gandhi, Shaden Smith, Stéphane Requena, Suraj Patil, Tim Dettmers, Ahmed Baruwa, Amanpreet Singh, Anastasia Cheveleva, Anne-Laure Ligozat, Arjun Subramonian, Aurélie Névéol, Charles Lovering, Dan Garrette, Deepak Tunuguntla, Ehud Reiter, Ekaterina Taktasheva, Ekaterina Voloshina, Eli Bogdanov, Genta Indra Winata, Hailey Schoelkopf, Jan-Christoph Kalo, Jekaterina Novikova, Jessica Zosa Forde, Jordan Clive, Jungo Kasai, Ken Kawamura, Liam Hazan, Marine Carpuat, Miruna Clinciu, Najoung Kim, Newton Cheng, Oleg Serikov, Omer Antverg, Oskar van der Wal, Rui Zhang, Ruochen Zhang, Sebastian Gehrmann, Shachar Mirkin, Shani Pais, Tatiana Shavrina, Thomas Scialom, Tian Yun, Tomasz Limisiewicz, Verena Rieser, Vitaly Protasov, Vladislav Mikhailov, Yada Pruksachatkun, Yonatan Belinkov, Zachary Bamberger, Zdeněk Kasner, Alice Rueda, Amanda Pestana, Amir Feizpour, Ammar Khan, Amy Faranak, Ana Santos, Anthony Hevia, Antigona Unldreaj, Arash Aghagol, Arezoo Abdollahi, Aycha Tammour, Azadeh HajiHosseini, Bahareh Behroozi, Benjamin Ajibade, Bharat Saxena, Carlos Muñoz Ferrandis, Daniel McDuff, Danish Contractor, David Lansky, Davis David, Douwe Kiela, Duong A. 
Nguyen, Edward Tan, Emi Baylor, Ezinwanne Ozoani, Fatima Mirza, Frankline Ononiwu, Habib Rezanejad, Hessie Jones, Indrani Bhattacharya, Irene Solaiman, Irina Sedenko, Isar Nejadgholi, Jesse Passmore, Josh Seltzer, Julio Bonis Sanz, Livia Dutra, Mairon Samagaio, Maraim Elbadri, Margot Mieskes, Marissa Gerchick, Martha Akinlolu, Michael McKenna, Mike Qiu, Muhammed Ghauri, Mykola Burynok, Nafis Abrar, Nazneen Rajani, Nour Elkott, Nour Fahmy, Olanrewaju Samuel, Ran An, Rasmus Kromann, Ryan Hao, Samira Alizadeh, Sarmad Shubber, Silas Wang, Sourav Roy, Sylvain Viguier, Thanh Le, Tobi Oyebade, Trieu Le, Yoyo Yang, Zach Nguyen, Abhinav Ramesh Kashyap, Alfredo Palasciano, Alison Callahan, Anima Shukla, Antonio Miranda-Escalada, Ayush Singh, Benjamin Beilharz, Bo wang, Caio Brito, Chenxi Zhou, Chirag Jain, Chuxin Xu, Clémentine Fourrier, Daniel León Periñán, Daniel Molano, Dian Yu, Enrique Manjavacas, Fabio Barth, Florian Fuhrimann, Gabriel Altay, Giyaseddin Bayrak, Gully Burns, Helena U. Vrabec, Imane Bello, Ishani Dash, Jihyun Kang, John Giorgi, Jonas Golde, Jose David Posada, Karthik Rangasai Sivaraman, Lokesh Bulchandani, Lu Liu, Luisa Shinzato, Madeleine Hahn de Bykhovetz, Maiko Takeuchi, Marc Pàmies, Maria A Castillo, Marianna Nezhurina, Mario Sänger, Matthias Samwald, Michael Cullan, Michael Weinberg, Michiel De Wolf, Mina Mihaljcic, Minna Liu, Moritz Freidank, Myungsun Kang, Natasha Seelam, Nathan Dahlberg, Nicholas Michio Broad, Nikolaus Muellner, Pascale Fung, Patrick Haller, Ramya Chandrasekhar, Renata Eisenberg, Robert Martin, Rodrigo Canalli, Rosaline Su, Ruisi Su, Samuel Cahyawijaya, Samuele Garda, Shlok S Deshmukh, Shubhanshu Mishra, Sid Kiblawi, Simon Ott, Sinee Sang-aroonsiri, Srishti Kumar, Stefan Schweter, Sushil Bharati, Tanmay Laud, Théo Gigant, Tomoya Kainuma, Wojciech Kusa, Yanis Labrak, Yash Shailesh Bajaj, Yash Venkatraman, Yifan Xu, Yingxin Xu, Yu Xu, Zhe Tan, Zhongli Xie, Zifan Ye, Mathilde Bras, Younes Belkada, Thomas Wolf

Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions.

Language Modelling · Multilingual NLP

Facilitating Global Team Meetings Between Language-Based Subgroups: When and How Can Machine Translation Help?

no code implementations 7 Sep 2022 Yongle Zhang, Dennis Asamoah Owusu, Marine Carpuat, Ge Gao

We manipulated the exchange of subgroup conversation logs prior to team meetings: with MT-mediated exchanges versus without.

Machine Translation

Controlling Translation Formality Using Pre-trained Multilingual Language Models

no code implementations IWSLT (ACL) 2022 Elijah Rippeth, Sweta Agrawal, Marine Carpuat

This paper describes the University of Maryland's submission to the Special Task on Formality Control for Spoken Language Translation at IWSLT 2022, which evaluates translation from English into 6 languages with diverse grammatical formality markers.

Language Modelling · Translation

An Imitation Learning Curriculum for Text Editing with Non-Autoregressive Models

no code implementations ACL 2022 Sweta Agrawal, Marine Carpuat

We propose a framework for training non-autoregressive sequence-to-sequence models for editing tasks, where the original input sequence is iteratively edited to produce the output.

Abstractive Text Summarization · Imitation Learning +3

Can Synthetic Translations Improve Bitext Quality?

no code implementations ACL 2022 Eleftheria Briakou, Marine Carpuat

Synthetic translations have been used for a wide range of NLP tasks primarily as a means of data augmentation.

Data Augmentation · NMT

Rule-based Morphological Inflection Improves Neural Terminology Translation

1 code implementation EMNLP 2021 Weijia Xu, Marine Carpuat

Current approaches to incorporating terminology constraints in machine translation (MT) typically assume that the constraint terms are provided in their correct morphological forms.

Domain Adaptation · LEMMA +4

A Review of Human Evaluation for Style Transfer

1 code implementation ACL (GEM) 2021 Eleftheria Briakou, Sweta Agrawal, Ke Zhang, Joel Tetreault, Marine Carpuat

However, in style transfer papers, we find that protocols for human evaluations are often underspecified and not standardized, which hampers the reproducibility of research in this field and progress toward better human and automatic evaluation methods.

Style Transfer

Beyond Noise: Mitigating the Impact of Fine-grained Semantic Divergences on Neural Machine Translation

2 code implementations ACL 2021 Eleftheria Briakou, Marine Carpuat

While it has been shown that Neural Machine Translation (NMT) is highly sensitive to noisy parallel training samples, prior work treats all types of mismatches between source and target as noise.

Machine Translation · NMT +1

How Does Distilled Data Complexity Impact the Quality and Confidence of Non-Autoregressive Machine Translation?

no code implementations Findings (ACL) 2021 Weijia Xu, Shuming Ma, Dongdong Zhang, Marine Carpuat

While non-autoregressive (NAR) models are showing great promise for machine translation, their use is limited by their dependence on knowledge distillation from autoregressive models.

Knowledge Distillation · Machine Translation +1

EDITOR: an Edit-Based Transformer with Repositioning for Neural Machine Translation with Soft Lexical Constraints

1 code implementation 13 Nov 2020 Weijia Xu, Marine Carpuat

We introduce an Edit-Based Transformer with Repositioning (EDITOR), which makes sequence generation flexible by seamlessly allowing users to specify preferences in output lexical choice.

Imitation Learning · Machine Translation +1

Incorporating Terminology Constraints in Automatic Post-Editing

1 code implementation WMT (EMNLP) 2020 David Wan, Chris Kedzie, Faisal Ladhak, Marine Carpuat, Kathleen McKeown

In this paper, we present both autoregressive and non-autoregressive models for lexically constrained APE, demonstrating that our approach enables preservation of 95% of the terminologies and also improves translation quality on English-German benchmarks.

Automatic Post-Editing · Data Augmentation +1

Detecting Fine-Grained Cross-Lingual Semantic Divergences without Supervision by Learning to Rank

1 code implementation EMNLP 2020 Eleftheria Briakou, Marine Carpuat

Detecting fine-grained differences in content conveyed in different languages matters for cross-lingual NLP and multilingual corpora analysis, but it is a challenging machine learning problem since annotation is expensive and hard to scale.

Learning-To-Rank · Sentence

Dual Reconstruction: a Unifying Objective for Semi-Supervised Neural Machine Translation

no code implementations Findings (EMNLP) 2020 Weijia Xu, Xing Niu, Marine Carpuat

While Iterative Back-Translation and Dual Learning effectively incorporate monolingual training data in neural machine translation, they use different objectives and heuristic gradient approximation strategies, and have not been extensively compared.

Machine Translation · Translation

Generating Diverse Translations via Weighted Fine-tuning and Hypotheses Filtering for the Duolingo STAPLE Task

no code implementations WS 2020 Sweta Agrawal, Marine Carpuat

This paper describes the University of Maryland's submission to the Duolingo Shared Task on Simultaneous Translation And Paraphrase for Language Education (STAPLE).

Machine Translation · Translation

Evaluating a Bi-LSTM Model for Metaphor Detection in TOEFL Essays

no code implementations WS 2020 Kevin Kuo, Marine Carpuat

However, the Bi-LSTM models lag behind the best performing systems in the shared task.

Multitask Models for Controlling the Complexity of Neural Machine Translation

no code implementations WS 2020 Sweta Agrawal, Marine Carpuat

We introduce a machine translation task where the output is aimed at audiences of different levels of target language proficiency.

Machine Translation · Translation

Flexible Non-Autoregressive Neural Machine Translation via Repositioning Edit Operations

no code implementations WS 2020 Weijia Xu, Marine Carpuat

We introduce an iterative text refinement model to reduce the decoding space of non-autoregressive models by disentangling the token prediction and relative position prediction.

Machine Translation · Position +1

Controlling Neural Machine Translation Formality with Synthetic Supervision

1 code implementation 20 Nov 2019 Xing Niu, Marine Carpuat

This work aims to produce translations that convey source language content at a formality level that is appropriate for a particular audience.

Machine Translation · Sentence +1

Controlling Text Complexity in Neural Machine Translation

1 code implementation IJCNLP 2019 Sweta Agrawal, Marine Carpuat

This work introduces a machine translation task where the output is aimed at audiences of different levels of target language proficiency.

Machine Translation · Translation

Weakly Supervised Cross-lingual Semantic Relation Classification via Knowledge Distillation

no code implementations IJCNLP 2019 Yogarshi Vyas, Marine Carpuat

Our classifier relies on a novel attention-based distillation approach to account for translation ambiguity when transferring knowledge from English to cross-lingual settings.

Classification · Cross-Lingual Transfer +5

Differentiable Sampling with Flexible Reference Word Order for Neural Machine Translation

1 code implementation NAACL 2019 Weijia Xu, Xing Niu, Marine Carpuat

Despite some empirical success at correcting exposure bias in machine translation, scheduled sampling algorithms suffer from a major drawback: they incorrectly assume that words in the reference translations and in sampled sequences are aligned at each time step.

Machine Translation · Translation

Bi-Directional Differentiable Input Reconstruction for Low-Resource Neural Machine Translation

1 code implementation NAACL 2019 Xing Niu, Weijia Xu, Marine Carpuat

We aim to better exploit the limited amounts of parallel text available in low-resource settings by introducing a differentiable reconstruction loss for neural machine translation (NMT).

Low-Resource Neural Machine Translation · NMT +1

The University of Maryland's Chinese-English Neural Machine Translation Systems at WMT18

no code implementations WS 2018 Weijia Xu, Marine Carpuat

This paper describes the University of Maryland's submission to the WMT 2018 Chinese↔English news translation tasks.

Machine Translation · Translation

UMD at SemEval-2018 Task 10: Can Word Embeddings Capture Discriminative Attributes?

no code implementations SEMEVAL 2018 Alexander Zhang, Marine Carpuat

We describe the University of Maryland's submission to SemEval-2018 Task 10, "Capturing Discriminative Attributes": given word triples (w1, w2, d), the goal is to determine whether d is a discriminating attribute belonging to w1 but not w2.

Attribute · Binary Classification +2

Bi-Directional Neural Machine Translation with Synthetic Parallel Data

no code implementations WS 2018 Xing Niu, Michael Denkowski, Marine Carpuat

Despite impressive progress in high-resource settings, Neural Machine Translation (NMT) still struggles in low-resource and out-of-domain scenarios, often failing to match the quality of phrase-based translation.

Machine Translation · NMT +1

Robust Cross-lingual Hypernymy Detection using Dependency Context

1 code implementation NAACL 2018 Shyam Upadhyay, Yogarshi Vyas, Marine Carpuat, Dan Roth

We propose BISPARSE-DEP, a family of unsupervised approaches for cross-lingual hypernymy detection, which learns sparse, bilingual word embeddings based on dependency contexts.

Natural Language Inference · Word Embeddings

Identifying Semantic Divergences in Parallel Text without Annotations

1 code implementation NAACL 2018 Yogarshi Vyas, Xing Niu, Marine Carpuat

Recognizing that even correct translations are not always semantically equivalent, we automatically detect meaning divergences in parallel sentence pairs with a deep neural model of bilingual semantic similarity which can be trained for any parallel corpus without any manual annotation.
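As a rough illustration of the general idea (not the paper's trained model), candidate divergences can be flagged by thresholding a cross-lingual similarity score between sentence representations. The embedding vectors and threshold below are toy placeholders, standing in for whatever bilingual encoder and decision rule a real system would use.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def flag_divergent(src_vecs, tgt_vecs, threshold=0.5):
    """Return indices of sentence pairs whose cross-lingual similarity
    falls below the threshold, i.e. candidate meaning divergences."""
    return [i for i, (u, v) in enumerate(zip(src_vecs, tgt_vecs))
            if cosine(u, v) < threshold]

# Toy 2-d "embeddings": the second pair points in different
# directions, so it is flagged as a candidate divergence.
src = [[1.0, 0.0], [1.0, 0.0]]
tgt = [[0.9, 0.1], [0.0, 1.0]]
print(flag_divergent(src, tgt))  # → [1]
```

In practice the interesting part is learning representations in which parallel-but-divergent pairs actually score low; the thresholding step itself stays this simple.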

Machine Translation · Semantic Similarity +3

Fluency Over Adequacy: A Pilot Study in Measuring User Trust in Imperfect MT

no code implementations WS 2018 Marianna J. Martindale, Marine Carpuat

Although measuring intrinsic quality has been a key factor in the advancement of Machine Translation (MT), successfully deploying MT requires considering not just intrinsic quality but also the user experience, including aspects such as trust.

Machine Translation · Translation

A Study of Style in Machine Translation: Controlling the Formality of Machine Translation Output

no code implementations EMNLP 2017 Xing Niu, Marianna Martindale, Marine Carpuat

Stylistic variations of language, such as formality, carry speakers' intention beyond literal meaning and should be conveyed adequately in translation.

Domain Adaptation · Machine Translation +1

Detecting Cross-Lingual Semantic Divergence for Neural Machine Translation

no code implementations WS 2017 Marine Carpuat, Yogarshi Vyas, Xing Niu

Parallel corpora are often not as parallel as one might assume: non-literal translations and noisy translations abound, even in curated corpora routinely used for training and evaluation.

Domain Adaptation · Machine Translation +3
