no code implementations • WMT (EMNLP) 2021 • Saptarashmi Bandyopadhyay, Tasnim Kabir, Zizhen Lian, Marine Carpuat
This paper describes the system submitted to the Large-Scale Multilingual Shared Task (Small Task #2) at WMT 2021.
no code implementations • WMT (EMNLP) 2020 • Calvin Bao, Yow-Ting Shiue, Chujun Song, Jie Li, Marine Carpuat
This paper describes the University of Maryland’s submissions to the WMT20 Shared Task on Chat Translation.
no code implementations • COLING 2022 • Elsbeth Turcan, David Wan, Faisal Ladhak, Petra Galuscakova, Sukanta Sen, Svetlana Tchistiakova, Weijia Xu, Marine Carpuat, Kenneth Heafield, Douglas Oard, Kathleen McKeown
Query-focused summaries of foreign-language, retrieved documents can help a user understand whether a document is actually relevant to the query term.
no code implementations • MMTLRL (RANLP) 2021 • Marine Carpuat
In this talk, I will describe current research directions in my group that aim to make machine translation (MT) more human-centered.
1 code implementation • EACL (HCINLP) 2021 • Marianna Martindale, Kevin Duh, Marine Carpuat
Successful Machine Translation (MT) deployment requires understanding not only the intrinsic qualities of MT output, such as fluency and adequacy, but also user perceptions.
no code implementations • IWSLT 2016 • Xing Niu, Marine Carpuat
We describe the University of Maryland machine translation system submitted to the IWSLT 2016 Microsoft Speech Language Translation (MSLT) English-French task.
1 code implementation • 23 Feb 2025 • Dayeon Ki, Marine Carpuat
We show that text simplification is the most effective MT-agnostic rewrite strategy and that it can be improved further when using quality estimation to assess translatability.
no code implementations • 7 Nov 2024 • Ibrahim Said Ahmad, Antonios Anastasopoulos, Ondřej Bojar, Claudia Borg, Marine Carpuat, Roldano Cattoni, Mauro Cettolo, William Chen, Qianqian Dong, Marcello Federico, Barry Haddow, Dávid Javorský, Mateusz Krubiński, Tsz Kin Lam, Xutai Ma, Prashant Mathur, Evgeny Matusov, Chandresh Maurya, John McCrae, Kenton Murray, Satoshi Nakamura, Matteo Negri, Jan Niehues, Xing Niu, Atul Kr. Ojha, John Ortega, Sara Papi, Peter Polák, Adam Pospíšil, Pavel Pecina, Elizabeth Salesky, Nivedita Sethiya, Balaram Sarkar, Jiatong Shi, Claytone Sikasote, Matthias Sperber, Sebastian Stüker, Katsuhito Sudoh, Brian Thompson, Marco Turchi, Alex Waibel, Shinji Watanabe, Patrick Wilken, Petr Zemánek, Rodolfo Zevallos
This paper reports on the shared tasks organized by the 21st IWSLT Conference.
1 code implementation • 28 Oct 2024 • Hyojung Han, Kevin Duh, Marine Carpuat
Recent advances in automatic quality estimation for machine translation have exclusively focused on written language, leaving the speech modality underexplored.
1 code implementation • 12 Oct 2024 • Hyojung Han, Akiko Eriguchi, Haoran Xu, Hieu Hoang, Marine Carpuat, Huda Khayrallah
We propose VocADT, a novel method for vocabulary adaptation using adapter modules that are trained to learn the optimal linear combination of existing embeddings while keeping the model's weights fixed.
1 code implementation • 6 Oct 2024 • Shramay Palta, Nishant Balepur, Peter Rankel, Sarah Wiegreffe, Marine Carpuat, Rachel Rudinger
Questions involving commonsense reasoning about everyday situations often admit many possible or plausible answers.
3 code implementations • 6 Jun 2024 • Sander Schulhoff, Michael Ilie, Nishant Balepur, Konstantine Kahadze, Amanda Liu, Chenglei Si, Yinheng Li, Aayush Gupta, Hyojung Han, Sevien Schulhoff, Pranav Sandeep Dulepet, Saurav Vidyadhara, Dayeon Ki, Sweta Agrawal, Chau Pham, Gerson Kroiz, Feileen Li, Hudson Tao, Ashay Srivastava, Hevander Da Costa, Saloni Gupta, Megan L. Rogers, Inna Goncearenco, Giuseppe Sarli, Igor Galynker, Denis Peskoff, Marine Carpuat, Jules White, Shyamal Anadkat, Alexander Hoyle, Philip Resnik
Generative Artificial Intelligence (GenAI) systems are increasingly being deployed across diverse industries and research domains.
no code implementations • 30 May 2024 • Aquia Richburg, Marine Carpuat
To address these questions, we conduct an extensive empirical evaluation of the translation quality of the TOWER family of language models (Alves et al., 2024) on 132 translation tasks from the multi-parallel FLORES-200 data.
1 code implementation • 16 May 2024 • Calvin Bao, Marine Carpuat
Authorship obfuscation techniques hold the promise of helping people protect their privacy in online communications by automatically rewriting text to hide the identity of the original author.
no code implementations • 17 Apr 2024 • Neha Srikanth, Marine Carpuat, Rachel Rudinger
We propose a metric for evaluating the paraphrastic consistency of natural language reasoning models based on the probability of a model achieving the same correctness on two paraphrases of the same problem.
1 code implementation • 11 Apr 2024 • Dayeon Ki, Marine Carpuat
Machine Translation (MT) remains one of the last NLP tasks where large language models (LLMs) have not yet replaced dedicated supervised systems.
no code implementations • 21 Mar 2024 • Hyojung Han, Mohamed Anwar, Juan Pino, Wei-Ning Hsu, Marine Carpuat, Bowen Shi, Changhan Wang
It is designed to maximize the benefits of limited multilingual AV pre-training data, by building on top of audio-only multilingual pre-training and simplifying existing pre-training schemes.
1 code implementation • 15 Dec 2023 • Sweta Agrawal, Marine Carpuat
With this framework, we conduct a thorough human evaluation of texts by humans and by nine automatic systems.
1 code implementation • 4 Dec 2023 • Eleftheria Briakou, Navita Goyal, Marine Carpuat
Explainable NLP techniques primarily explain by answering "Which tokens in the input are responsible for this prediction?"
1 code implementation • 3 Dec 2023 • Hyojung Han, Jordan Lee Boyd-Graber, Marine Carpuat
Translations help people understand content written in another language.
no code implementations • 27 Nov 2023 • Elijah Rippeth, Marine Carpuat, Kevin Duh, Matt Post
Lexical ambiguity is a challenging and pervasive problem in machine translation (MT).
no code implementations • 16 Nov 2023 • Jiayi Wang, David Ifeoluwa Adelani, Sweta Agrawal, Marek Masiak, Ricardo Rei, Eleftheria Briakou, Marine Carpuat, Xuanli He, Sofia Bourhim, Andiswa Bukula, Muhidin Mohamed, Temitayo Olatoye, Tosin Adewumi, Hamam Mokayed, Christine Mwase, Wangui Kimotho, Foutse Yuehgoh, Anuoluwapo Aremu, Jessica Ojo, Shamsuddeen Hassan Muhammad, Salomey Osei, Abdul-Hakeem Omotayo, Chiamaka Chukwuneke, Perez Ogayo, Oumaima Hourrane, Salma El Anigri, Lolwethu Ndolela, Thabiso Mangwana, Shafie Abdi Mohamed, Ayinde Hassan, Oluwabusayo Olufunke Awoyomi, Lama Alkhaled, Sana Al-Azzawi, Naome A. Etori, Millicent Ochieng, Clemencia Siro, Samuel Njoroge, Eric Muchiri, Wangari Kimotho, Lyse Naomi Wamba Momo, Daud Abolade, Simbiat Ajao, Iyanuoluwa Shode, Ricky Macharm, Ruqayya Nasir Iro, Saheed S. Abdullahi, Stephen E. Moore, Bernard Opoku, Zainab Akinjobi, Abeeb Afolabi, Nnaemeka Obiefuna, Onyekachi Raphael Ogbu, Sam Brian, Verrah Akinyi Otiende, Chinedu Emmanuel Mbonu, Sakayo Toadoum Sari, Yao Lu, Pontus Stenetorp
Despite the recent progress on scaling multilingual machine translation (MT) to several under-resourced African languages, accurately measuring this progress remains challenging, since evaluation is often performed on n-gram matching metrics such as BLEU, which typically show a weaker correlation with human judgments.
1 code implementation • 25 Oct 2023 • Nikita Mehandru, Sweta Agrawal, Yimin Xiao, Elaine C Khoong, Ge Gao, Marine Carpuat, Niloufar Salehi
A major challenge in the practical use of Machine Translation (MT) is that users lack guidance to make informed decisions about when to rely on outputs.
1 code implementation • 23 Oct 2023 • Tin Nguyen, Jiannan Xu, Aayushi Roy, Hal Daumé III, Marine Carpuat
We apply this method in the context of content moderation of potential hate speech, and its differential impact on Asian vs. non-Asian proxy moderators, across explanation approaches (saliency map and counterfactual explanation).
no code implementations • 24 May 2023 • Sweta Agrawal, Marine Carpuat
Based on these insights, we introduce a simple method that predicts the edit operations required for simplifying a text for a specific grade level on an instance-per-instance basis.
no code implementations • 23 May 2023 • Navita Goyal, Eleftheria Briakou, Amanda Liu, Connor Baumler, Claire Bonial, Jeffrey Micher, Clare R. Voss, Marine Carpuat, Hal Daumé III
In this work, we study how users interact with QA systems in the absence of sufficient information to assess their predictions.
1 code implementation • 18 Jan 2023 • Weijia Xu, Sweta Agrawal, Eleftheria Briakou, Marianna J. Martindale, Marine Carpuat
Neural sequence generation models are known to "hallucinate", by producing outputs that are unrelated to the source text.
7 code implementations • 9 Nov 2022 • BigScience Workshop, Teven Le Scao, Angela Fan, Christopher Akiki, Ellie Pavlick, Suzana Ilić, Daniel Hesslow, Roman Castagné, Alexandra Sasha Luccioni, François Yvon, Matthias Gallé, Jonathan Tow, Alexander M. Rush, Stella Biderman, Albert Webson, Pawan Sasanka Ammanamanchi, Thomas Wang, Benoît Sagot, Niklas Muennighoff, Albert Villanova del Moral, Olatunji Ruwase, Rachel Bawden, Stas Bekman, Angelina McMillan-Major, Iz Beltagy, Huu Nguyen, Lucile Saulnier, Samson Tan, Pedro Ortiz Suarez, Victor Sanh, Hugo Laurençon, Yacine Jernite, Julien Launay, Margaret Mitchell, Colin Raffel, Aaron Gokaslan, Adi Simhi, Aitor Soroa, Alham Fikri Aji, Amit Alfassy, Anna Rogers, Ariel Kreisberg Nitzav, Canwen Xu, Chenghao Mou, Chris Emezue, Christopher Klamm, Colin Leong, Daniel van Strien, David Ifeoluwa Adelani, Dragomir Radev, Eduardo González Ponferrada, Efrat Levkovizh, Ethan Kim, Eyal Bar Natan, Francesco De Toni, Gérard Dupont, Germán Kruszewski, Giada Pistilli, Hady Elsahar, Hamza Benyamina, Hieu Tran, Ian Yu, Idris Abdulmumin, Isaac Johnson, Itziar Gonzalez-Dios, Javier de la Rosa, Jenny Chim, Jesse Dodge, Jian Zhu, Jonathan Chang, Jörg Frohberg, Joseph Tobing, Joydeep Bhattacharjee, Khalid Almubarak, Kimbo Chen, Kyle Lo, Leandro von Werra, Leon Weber, Long Phan, Loubna Ben allal, Ludovic Tanguy, Manan Dey, Manuel Romero Muñoz, Maraim Masoud, María Grandury, Mario Šaško, Max Huang, Maximin Coavoux, Mayank Singh, Mike Tian-Jian Jiang, Minh Chien Vu, Mohammad A. Jauhar, Mustafa Ghaleb, Nishant Subramani, Nora Kassner, Nurulaqilla Khamis, Olivier Nguyen, Omar Espejel, Ona de Gibert, Paulo Villegas, Peter Henderson, Pierre Colombo, Priscilla Amuok, Quentin Lhoest, Rheza Harliman, Rishi Bommasani, Roberto Luis López, Rui Ribeiro, Salomey Osei, Sampo Pyysalo, Sebastian Nagel, Shamik Bose, Shamsuddeen Hassan Muhammad, Shanya Sharma, Shayne Longpre, Somaieh Nikpoor, Stanislav Silberberg, Suhas Pai, Sydney Zink, Tiago Timponi Torrent, Timo Schick, Tristan Thrush, Valentin Danchev, Vassilina Nikoulina, Veronika Laippala, Violette Lepercq, Vrinda Prabhu, Zaid Alyafeai, Zeerak Talat, Arun Raja, Benjamin Heinzerling, Chenglei Si, Davut Emre Taşar, Elizabeth Salesky, Sabrina J. Mielke, Wilson Y. Lee, Abheesht Sharma, Andrea Santilli, Antoine Chaffin, Arnaud Stiegler, Debajyoti Datta, Eliza Szczechla, Gunjan Chhablani, Han Wang, Harshit Pandey, Hendrik Strobelt, Jason Alan Fries, Jos Rozen, Leo Gao, Lintang Sutawika, M Saiful Bari, Maged S. Al-shaibani, Matteo Manica, Nihal Nayak, Ryan Teehan, Samuel Albanie, Sheng Shen, Srulik Ben-David, Stephen H. Bach, Taewoon Kim, Tali Bers, Thibault Fevry, Trishala Neeraj, Urmish Thakker, Vikas Raunak, Xiangru Tang, Zheng-Xin Yong, Zhiqing Sun, Shaked Brody, Yallow Uri, Hadar Tojarieh, Adam Roberts, Hyung Won Chung, Jaesung Tae, Jason Phang, Ofir Press, Conglong Li, Deepak Narayanan, Hatim Bourfoune, Jared Casper, Jeff Rasley, Max Ryabinin, Mayank Mishra, Minjia Zhang, Mohammad Shoeybi, Myriam Peyrounette, Nicolas Patry, Nouamane Tazi, Omar Sanseviero, Patrick von Platen, Pierre Cornette, Pierre François Lavallée, Rémi Lacroix, Samyam Rajbhandari, Sanchit Gandhi, Shaden Smith, Stéphane Requena, Suraj Patil, Tim Dettmers, Ahmed Baruwa, Amanpreet Singh, Anastasia Cheveleva, Anne-Laure Ligozat, Arjun Subramonian, Aurélie Névéol, Charles Lovering, Dan Garrette, Deepak Tunuguntla, Ehud Reiter, Ekaterina Taktasheva, Ekaterina Voloshina, Eli Bogdanov, Genta Indra Winata, Hailey Schoelkopf, Jan-Christoph Kalo, Jekaterina Novikova, Jessica Zosa Forde, Jordan Clive, Jungo Kasai, Ken Kawamura, Liam Hazan, Marine Carpuat, Miruna Clinciu, Najoung Kim, Newton Cheng, Oleg Serikov, Omer Antverg, Oskar van der Wal, Rui Zhang, Ruochen Zhang, Sebastian Gehrmann, Shachar Mirkin, Shani Pais, Tatiana Shavrina, Thomas Scialom, Tian Yun, Tomasz Limisiewicz, Verena Rieser, Vitaly Protasov, Vladislav Mikhailov, Yada Pruksachatkun, Yonatan Belinkov, Zachary Bamberger, Zdeněk Kasner, Alice Rueda, Amanda Pestana, Amir Feizpour, Ammar Khan, Amy Faranak, Ana Santos, Anthony Hevia, Antigona Unldreaj, Arash Aghagol, Arezoo Abdollahi, Aycha Tammour, Azadeh HajiHosseini, Bahareh Behroozi, Benjamin Ajibade, Bharat Saxena, Carlos Muñoz Ferrandis, Daniel McDuff, Danish Contractor, David Lansky, Davis David, Douwe Kiela, Duong A. Nguyen, Edward Tan, Emi Baylor, Ezinwanne Ozoani, Fatima Mirza, Frankline Ononiwu, Habib Rezanejad, Hessie Jones, Indrani Bhattacharya, Irene Solaiman, Irina Sedenko, Isar Nejadgholi, Jesse Passmore, Josh Seltzer, Julio Bonis Sanz, Livia Dutra, Mairon Samagaio, Maraim Elbadri, Margot Mieskes, Marissa Gerchick, Martha Akinlolu, Michael McKenna, Mike Qiu, Muhammed Ghauri, Mykola Burynok, Nafis Abrar, Nazneen Rajani, Nour Elkott, Nour Fahmy, Olanrewaju Samuel, Ran An, Rasmus Kromann, Ryan Hao, Samira Alizadeh, Sarmad Shubber, Silas Wang, Sourav Roy, Sylvain Viguier, Thanh Le, Tobi Oyebade, Trieu Le, Yoyo Yang, Zach Nguyen, Abhinav Ramesh Kashyap, Alfredo Palasciano, Alison Callahan, Anima Shukla, Antonio Miranda-Escalada, Ayush Singh, Benjamin Beilharz, Bo Wang, Caio Brito, Chenxi Zhou, Chirag Jain, Chuxin Xu, Clémentine Fourrier, Daniel León Periñán, Daniel Molano, Dian Yu, Enrique Manjavacas, Fabio Barth, Florian Fuhrimann, Gabriel Altay, Giyaseddin Bayrak, Gully Burns, Helena U. Vrabec, Imane Bello, Ishani Dash, Jihyun Kang, John Giorgi, Jonas Golde, Jose David Posada, Karthik Rangasai Sivaraman, Lokesh Bulchandani, Lu Liu, Luisa Shinzato, Madeleine Hahn de Bykhovetz, Maiko Takeuchi, Marc Pàmies, Maria A Castillo, Marianna Nezhurina, Mario Sänger, Matthias Samwald, Michael Cullan, Michael Weinberg, Michiel De Wolf, Mina Mihaljcic, Minna Liu, Moritz Freidank, Myungsun Kang, Natasha Seelam, Nathan Dahlberg, Nicholas Michio Broad, Nikolaus Muellner, Pascale Fung, Patrick Haller, Ramya Chandrasekhar, Renata Eisenberg, Robert Martin, Rodrigo Canalli, Rosaline Su, Ruisi Su, Samuel Cahyawijaya, Samuele Garda, Shlok S Deshmukh, Shubhanshu Mishra, Sid Kiblawi, Simon Ott, Sinee Sang-aroonsiri, Srishti Kumar, Stefan Schweter, Sushil Bharati, Tanmay Laud, Théo Gigant, Tomoya Kainuma, Wojciech Kusa, Yanis Labrak, Yash Shailesh Bajaj, Yash Venkatraman, Yifan Xu, Yingxin Xu, Yu Xu, Zhe Tan, Zhongli Xie, Zifan Ye, Mathilde Bras, Younes Belkada, Thomas Wolf
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions.
no code implementations • 7 Sep 2022 • Yongle Zhang, Dennis Asamoah Owusu, Marine Carpuat, Ge Gao
We manipulated the exchange of subgroup conversation logs prior to team meetings: with MT-mediated exchanges versus without.
no code implementations • IWSLT (ACL) 2022 • Elijah Rippeth, Sweta Agrawal, Marine Carpuat
This paper describes the University of Maryland's submission to the Special Task on Formality Control for Spoken Language Translation at IWSLT, which evaluates translation from English into 6 languages with diverse grammatical formality markers.
no code implementations • ACL 2022 • Sweta Agrawal, Marine Carpuat
We propose a framework for training non-autoregressive sequence-to-sequence models for editing tasks, where the original input sequence is iteratively edited to produce the output.
no code implementations • ACL 2022 • Eleftheria Briakou, Marine Carpuat
Synthetic translations have been used for a wide range of NLP tasks primarily as a means of data augmentation.
2 code implementations • EMNLP 2021 • Eleftheria Briakou, Sweta Agrawal, Joel Tetreault, Marine Carpuat
While the field of style transfer (ST) has been growing rapidly, it has been hampered by a lack of standardized practices for automatic evaluation.
1 code implementation • EMNLP 2021 • Weijia Xu, Marine Carpuat
Current approaches to incorporating terminology constraints in machine translation (MT) typically assume that the constraint terms are provided in their correct morphological forms.
1 code implementation • ACL (GEM) 2021 • Eleftheria Briakou, Sweta Agrawal, Ke Zhang, Joel Tetreault, Marine Carpuat
However, in style transfer papers, we find that protocols for human evaluations are often underspecified and not standardized, which hampers the reproducibility of research in this field and progress toward better human and automatic evaluation methods.
2 code implementations • ACL 2021 • Eleftheria Briakou, Marine Carpuat
While it has been shown that Neural Machine Translation (NMT) is highly sensitive to noisy parallel training samples, prior work treats all types of mismatches between source and target as noise.
no code implementations • Findings (ACL) 2021 • Weijia Xu, Shuming Ma, Dongdong Zhang, Marine Carpuat
While non-autoregressive (NAR) models are showing great promise for machine translation, their use is limited by their dependence on knowledge distillation from autoregressive models.
1 code implementation • 13 Nov 2020 • Weijia Xu, Marine Carpuat
We introduce an Edit-Based Transformer with Repositioning (EDITOR), which makes sequence generation flexible by seamlessly allowing users to specify preferences in output lexical choice.
1 code implementation • WMT (EMNLP) 2020 • David Wan, Chris Kedzie, Faisal Ladhak, Marine Carpuat, Kathleen McKeown
In this paper, we present both autoregressive and non-autoregressive models for lexically constrained APE, demonstrating that our approach enables preservation of 95% of the terminologies and also improves translation quality on English-German benchmarks.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Weijia Xu, Xing Niu, Marine Carpuat
While Iterative Back-Translation and Dual Learning effectively incorporate monolingual training data in neural machine translation, they use different objectives and heuristic gradient approximation strategies, and have not been extensively compared.
1 code implementation • EMNLP 2020 • Eleftheria Briakou, Marine Carpuat
Detecting fine-grained differences in content conveyed in different languages matters for cross-lingual NLP and multilingual corpora analysis, but it is a challenging machine learning problem since annotation is expensive and hard to scale.
no code implementations • WS 2020 • Sweta Agrawal, Marine Carpuat
We introduce a machine translation task where the output is aimed at audiences of different levels of target language proficiency.
no code implementations • WS 2020 • Aquia Richburg, Ramy Eskander, Smaranda Muresan, Marine Carpuat
Byte-Pair Encoding (BPE) (Sennrich et al., 2016) has become a standard pre-processing step when building neural machine translation systems.
no code implementations • WS 2020 • Weijia Xu, Marine Carpuat
We introduce an iterative text refinement model to reduce the decoding space of non-autoregressive models by disentangling the token prediction and relative position prediction.
no code implementations • WS 2020 • Kevin Kuo, Marine Carpuat
However, the Bi-LSTM models lag behind the best performing systems in the shared task.
no code implementations • WS 2020 • Sweta Agrawal, Marine Carpuat
This paper describes the University of Maryland's submission to the Duolingo Shared Task on Simultaneous Translation And Paraphrase for Language Education (STAPLE).
1 code implementation • 20 Nov 2019 • Xing Niu, Marine Carpuat
This work aims to produce translations that convey source language content at a formality level that is appropriate for a particular audience.
1 code implementation • IJCNLP 2019 • Sweta Agrawal, Marine Carpuat
This work introduces a machine translation task where the output is aimed at audiences of different levels of target language proficiency.
no code implementations • IJCNLP 2019 • Yogarshi Vyas, Marine Carpuat
Our classifier relies on a novel attention-based distillation approach to account for translation ambiguity when transferring knowledge from English to cross-lingual settings.
no code implementations • WS 2019 • Eleftheria Briakou, Marine Carpuat
This paper describes the University of Maryland's submission to the WMT 2019 Kazakh-English news translation task.
no code implementations • NAACL 2019 • Xuan Zhang, Pamela Shapiro, Gaurav Kumar, Paul McNamee, Marine Carpuat, Kevin Duh
We introduce a curriculum learning approach to adapt generic neural machine translation models to a specific domain.
1 code implementation • NAACL 2019 • Weijia Xu, Xing Niu, Marine Carpuat
Despite some empirical success at correcting exposure bias in machine translation, scheduled sampling algorithms suffer from a major drawback: they incorrectly assume that words in the reference translations and in sampled sequences are aligned at each time step.
1 code implementation • 2 Nov 2018 • Xuan Zhang, Gaurav Kumar, Huda Khayrallah, Kenton Murray, Jeremy Gwinnup, Marianna J. Martindale, Paul McNamee, Kevin Duh, Marine Carpuat
Machine translation systems based on deep neural networks are expensive to train.
1 code implementation • NAACL 2019 • Xing Niu, Weijia Xu, Marine Carpuat
We aim to better exploit the limited amounts of parallel text available in low-resource settings by introducing a differentiable reconstruction loss for neural machine translation (NMT).
no code implementations • WS 2018 • Weijia Xu, Marine Carpuat
This paper describes the University of Maryland's submission to the WMT 2018 Chinese↔English news translation tasks.
2 code implementations • COLING 2018 • Xing Niu, Sudha Rao, Marine Carpuat
Generating natural language requires conveying content in an appropriate style.
no code implementations • SEMEVAL 2018 • Alexander Zhang, Marine Carpuat
We describe the University of Maryland's submission to SemEval-2018 Task 10, "Capturing Discriminative Attributes": given word triples (w1, w2, d), the goal is to determine whether d is a discriminating attribute belonging to w1 but not w2.
no code implementations • WS 2018 • Xing Niu, Michael Denkowski, Marine Carpuat
Despite impressive progress in high-resource settings, Neural Machine Translation (NMT) still struggles in low-resource and out-of-domain scenarios, often failing to match the quality of phrase-based translation.
1 code implementation • NAACL 2018 • Shyam Upadhyay, Yogarshi Vyas, Marine Carpuat, Dan Roth
We propose BISPARSE-DEP, a family of unsupervised approaches for cross-lingual hypernymy detection, which learns sparse, bilingual word embeddings based on dependency contexts.
1 code implementation • NAACL 2018 • Yogarshi Vyas, Xing Niu, Marine Carpuat
Recognizing that even correct translations are not always semantically equivalent, we automatically detect meaning divergences in parallel sentence pairs with a deep neural model of bilingual semantic similarity which can be trained for any parallel corpus without any manual annotation.
no code implementations • WS 2018 • Marianna J. Martindale, Marine Carpuat
Although measuring intrinsic quality has been a key factor in the advancement of Machine Translation (MT), successfully deploying MT requires considering not just intrinsic quality but also the user experience, including aspects such as trust.
no code implementations • EMNLP 2017 • Xing Niu, Marianna Martindale, Marine Carpuat
Stylistic variations of language, such as formality, carry speakers' intention beyond literal meaning and should be conveyed adequately in translation.
no code implementations • WS 2017 • Xing Niu, Marine Carpuat
Detecting and analyzing stylistic variation in language is relevant to diverse Natural Language Processing applications.
no code implementations • WS 2017 • Marine Carpuat, Yogarshi Vyas, Xing Niu
Parallel corpora are often not as parallel as one might assume: non-literal translations and noisy translations abound, even in curated corpora routinely used for training and evaluation.
no code implementations • SEMEVAL 2017 • Yogarshi Vyas, Marine Carpuat
We introduce WHiC, a challenging testbed for detecting hypernymy, an asymmetric relation between words.
no code implementations • TACL 2013 • Ann Irvine, John Morgan, Marine Carpuat, Hal Daumé III, Dragos Munteanu
We develop two techniques for analyzing the effect of porting a machine translation system to a new domain.