Document Translation

11 papers with code • 3 benchmarks • 3 datasets

Benchmarks

These leaderboards track progress in Document Translation.

Dataset	Best Model
CodeXGLUE - Microsoft Docs	Pretrained Transformer
WMT 2020	DOCmT5
IWSLT2015	DOCmT5

Most implemented papers

CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation

microsoft/CodeXGLUE • 9 Feb 2021

Benchmark datasets have a significant impact on accelerating research in programming language tasks.

Pre-training via Paraphrasing

lucidrains/marge-pytorch • NeurIPS 2020

The objective noisily captures aspects of paraphrase, translation, multi-document summarization, and information retrieval, allowing for strong zero-shot performance on several tasks.

Contextual Neural Model for Translating Bilingual Multi-Speaker Conversations

sameenmaruf/Bi-MSMT • WS 2018

In this work, we propose the task of translating Bilingual Multi-Speaker Conversations, and explore neural architectures which exploit both source and target-side conversation histories for this task.

CLIReval: Evaluating Machine Translation as a Cross-Lingual Information Retrieval Task

ssun32/CLIReval • ACL 2020

We present CLIReval, an easy-to-use toolkit for evaluating machine translation (MT) with the proxy task of cross-lingual information retrieval (CLIR).

Rethinking Document-level Neural Machine Translation

sunzewei2715/Doc2Doc_NMT • Findings (ACL) 2022

This paper does not aim at introducing a novel model for document-level neural machine translation.

UDAAN: Machine Learning based Post-Editing tool for Document Translation

IITB-OpenOCRCorrect/iitb-openocr-digit-tool • 3 Mar 2022

UDAAN has an end-to-end Machine Translation (MT) plus post-editing pipeline wherein users can upload a document to obtain raw MT output.

Neural Approaches to Multilingual Information Retrieval

hltcoe/colbert-x • 3 Sep 2022

Providing access to information across languages has been a goal of Information Retrieval (IR) for decades.

Modeling Context With Linear Attention for Scalable Document-Level Translation

zhaofengwu/rfa-doc-mt • 16 Oct 2022

Document-level machine translation leverages inter-sentence dependencies to produce more coherent and consistent translations.

TransDocs: Optical Character Recognition with word to word translation

abhishekbamotra/transdocs • 15 Apr 2023

While OCR has been used in various applications, its output is not always accurate, leading to misfit words.

NusaWrites: Constructing High-Quality Corpora for Underrepresented and Extremely Low-Resource Languages

indonlp/nusa-writes • 19 Sep 2023

We compare the effectiveness of online scraping, human translation, and paragraph writing by native speakers in constructing datasets.
