Search Results for author: Kelly Marchisio

Found 13 papers, 4 papers with code

Embedding-Enhanced GIZA++: Improving Low-Resource Word Alignment Using Embeddings

no code implementations • AMTA 2022 • Kelly Marchisio, Conghao Xiong, Philipp Koehn

A popular natural language processing task decades ago, word alignment has been dominated until recently by GIZA++, a statistical method based on the 30-year-old IBM models.

Machine Translation Translation +1

An Alignment-Based Approach to Semi-Supervised Bilingual Lexicon Induction with Small Parallel Corpora

1 code implementation • MTSummit 2021 • Kelly Marchisio, Philipp Koehn, Conghao Xiong

Aimed at generating a seed lexicon for use in downstream natural language tasks, unsupervised methods for bilingual lexicon induction have received much attention in the academic literature recently.

Bilingual Lexicon Induction Translation

Improving Language Plasticity via Pretraining with Active Forgetting

no code implementations • NeurIPS 2023 • Yihong Chen, Kelly Marchisio, Roberta Raileanu, David Ifeoluwa Adelani, Pontus Stenetorp, Sebastian Riedel, Mikel Artetxe

Pretrained language models (PLMs) are today the primary model for natural language processing.

Meta-Learning

Learning a Formality-Aware Japanese Sentence Representation

no code implementations • 17 Jan 2023 • Henry Li Xinyuan, Ray Lee, Jerry Chen, Kelly Marchisio

On the other hand, downstream tasks such as translation would benefit from working with a sentence representation that preserves formality in addition to semantics, so as to generate sentences with the appropriate level of social formality -- the difference between speaking to a friend versus speaking with a supervisor.

Sentence

Mini-Model Adaptation: Efficiently Extending Pretrained Models to New Languages via Aligned Shallow Training

no code implementations • 20 Dec 2022 • Kelly Marchisio, Patrick Lewis, Yihong Chen, Mikel Artetxe

Prior work shows that it is possible to expand pretrained Masked Language Models (MLMs) to new languages by learning a new set of embeddings, while keeping the transformer body frozen.

Cross-Lingual Transfer

Bilingual Lexicon Induction for Low-Resource Languages using Graph Matching via Optimal Transport

no code implementations • 25 Oct 2022 • Kelly Marchisio, Ali Saad-Eldin, Kevin Duh, Carey Priebe, Philipp Koehn

Bilingual lexicons form a critical component of various natural language processing applications, including unsupervised and semisupervised machine translation and crosslingual information retrieval.

Bilingual Lexicon Induction Graph Matching +3

IsoVec: Controlling the Relative Isomorphism of Word Embedding Spaces

1 code implementation • 11 Oct 2022 • Kelly Marchisio, Neha Verma, Kevin Duh, Philipp Koehn

The ability to extract high-quality translation dictionaries from monolingual word embedding spaces depends critically on the geometric similarity of the spaces -- their degree of "isomorphism."

Bilingual Lexicon Induction Translation

An Analysis of Euclidean vs. Graph-Based Framing for Bilingual Lexicon Induction from Word Embedding Spaces

1 code implementation • Findings (EMNLP) 2021 • Kelly Marchisio, Youngser Park, Ali Saad-Eldin, Anton Alyakin, Kevin Duh, Carey Priebe, Philipp Koehn

Alternatively, word embeddings may be understood as nodes in a weighted graph.

Bilingual Lexicon Induction Graph Matching +1

On Systematic Style Differences between Unsupervised and Supervised MT and an Application for High-Resource Machine Translation

no code implementations • NAACL 2022 • Kelly Marchisio, Markus Freitag, David Grangier

Modern unsupervised machine translation (MT) systems reach reasonable translation quality under clean and controlled data conditions.

Translation Unsupervised Machine Translation

Embedding-Enhanced Giza++: Improving Alignment in Low- and High- Resource Scenarios Using Embedding Space Geometry

1 code implementation • 18 Apr 2021 • Kelly Marchisio, Conghao Xiong, Philipp Koehn

In the lowest-resource setting, we outperform GIZA++ by 8.5, 10.9, and 12 AER for Ro-En, De-En, and En-Fr, respectively.

Machine Translation Translation +1

When Does Unsupervised Machine Translation Work?

no code implementations • WMT (EMNLP) 2020 • Kelly Marchisio, Kevin Duh, Philipp Koehn

We additionally find that unsupervised MT performance declines when source and target languages use different scripts, and observe very poor performance on authentic low-resource language pairs.

Translation Unsupervised Machine Translation

Johns Hopkins University Submission for WMT News Translation Task

no code implementations • WS 2019 • Kelly Marchisio, Yash Kumar Lal, Philipp Koehn

We describe the work of Johns Hopkins University for the shared task of news translation organized by the Fourth Conference on Machine Translation (2019).

Machine Translation Translation

Controlling the Reading Level of Machine Translation Output

no code implementations • WS 2019 • Kelly Marchisio, Jialiang Guo, Cheng-I Lai, Philipp Koehn

Machine Translation Translation
