Search Results for author: Shaomu Tan

Found 7 papers, 4 papers with code

Neuron Specialization: Leveraging intrinsic task modularity for multilingual machine translation

no code implementations • 17 Apr 2024 • Shaomu Tan, Di Wu, Christof Monz

Training a unified multilingual model promotes knowledge transfer but inevitably introduces negative interference.

Cross-Lingual Transfer • Machine Translation • +2

How Far Can 100 Samples Go? Unlocking Overall Zero-Shot Multilingual Translation via Tiny Multi-Parallel Data

1 code implementation • 22 Jan 2024 • Di Wu, Shaomu Tan, Yan Meng, David Stap, Christof Monz

Zero-shot translation aims to translate between language pairs not seen during training in Multilingual Machine Translation (MMT) and is largely considered an open problem.

Machine Translation • Translation

Towards a Better Understanding of Variations in Zero-Shot Neural Machine Translation Performance

1 code implementation • 16 Oct 2023 • Shaomu Tan, Christof Monz

Our findings highlight that target-side translation quality is the most influential factor, with vocabulary overlap consistently impacting zero-shot (ZS) performance.

Machine Translation • NMT • +1

UvA-MT's Participation in the WMT23 General Translation Shared Task

no code implementations • 15 Oct 2023 • Di Wu, Shaomu Tan, David Stap, Ali Araabi, Christof Monz

This paper describes UvA-MT's submission to the WMT 2023 shared task on general machine translation.

Machine Translation • Translation

Document AI: A Comparative Study of Transformer-Based, Graph-Based Models, and Convolutional Neural Networks For Document Layout Analysis

1 code implementation • 29 Aug 2023 • Sotirios Kastanas, Shaomu Tan, Yi He

In this study, we aim to fill these gaps by conducting a comparative evaluation of state-of-the-art models in document layout analysis and investigating the potential of cross-lingual layout analysis by utilizing machine translation techniques.

Document AI • Document Layout Analysis • +2

Make Pre-trained Model Reversible: From Parameter to Memory Efficient Fine-Tuning

1 code implementation • NeurIPS 2023 • Baohao Liao, Shaomu Tan, Christof Monz

One effective way to reduce activation memory is to apply a reversible model, so that intermediate activations need not be cached and can instead be recomputed.

Image Classification • Question Answering

Towards leveraging latent knowledge and Dialogue context for real-world conversational question answering

no code implementations • 17 Dec 2022 • Shaomu Tan, Denis Paperno

In many real-world scenarios, the absence of an external knowledge source such as Wikipedia forces question answering systems to rely on latent internal knowledge in limited dialogue data.

Conversational Question Answering • Retrieval
