Cross-Lingual Document Classification

12 papers with code • 10 benchmarks • 2 datasets

Cross-lingual document classification is the task of using labeled data and models from a resource-rich language (e.g., English) to solve classification tasks in another, typically low-resource, language.
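The common setup can be sketched in a few lines: a classifier is fit on source-language documents only, and target-language documents are classified through a shared cross-lingual embedding space. This is a minimal toy sketch (nearest-centroid classifier, synthetic 4-d "embeddings"), not any specific paper's method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy shared cross-lingual embedding space (hypothetical 4-d vectors):
# labeled English documents, two classes separated in the first two dims.
en_docs = rng.normal(size=(20, 4)) + np.repeat([[2, 0, 0, 0], [0, 2, 0, 0]], 10, axis=0)
en_labels = np.array([0] * 10 + [1] * 10)

# Fit a per-class centroid classifier on the English side only.
centroids = np.stack([en_docs[en_labels == c].mean(axis=0) for c in (0, 1)])

def classify(doc_vec):
    """Assign the label of the nearest English-trained class centroid."""
    return int(np.argmin(np.linalg.norm(centroids - doc_vec, axis=1)))

# A target-language document embedded in the same shared space is
# classified without any target-language labels.
target_doc = rng.normal(size=4) + np.array([2, 0, 0, 0])
print(classify(target_doc))
```

The point is that no target-language labels are touched; all supervision comes from the source language, and transfer happens entirely through the shared representation.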

Latest papers with no code

Margin-aware Unsupervised Domain Adaptation for Cross-lingual Text Labeling

no code yet • Findings of the Association for Computational Linguistics 2020

Unsupervised domain adaptation addresses the problem of leveraging labeled data in a source domain to learn a well-performing model in a target domain where labels are unavailable.

Wasserstein distances for evaluating cross-lingual embeddings

no code yet • 24 Oct 2019

Word embeddings are high dimensional vector representations of words that capture their semantic similarity in the vector space.
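One practical way to compare two sets of embeddings as distributions, as this line of work proposes, is a Wasserstein-type distance. The sketch below uses the sliced Wasserstein distance (averaging 1-D Wasserstein distances over random projections) as a cheap stand-in for the full optimal-transport computation; the data is synthetic and the function name is ours.

```python
import numpy as np
from scipy.stats import wasserstein_distance

def sliced_wasserstein(X, Y, n_projections=100, seed=0):
    """Approximate the Wasserstein distance between two point clouds of
    embeddings by averaging 1-D Wasserstein distances along random
    unit projection directions (the sliced Wasserstein distance)."""
    rng = np.random.default_rng(seed)
    dim = X.shape[1]
    total = 0.0
    for _ in range(n_projections):
        v = rng.normal(size=dim)
        v /= np.linalg.norm(v)
        total += wasserstein_distance(X @ v, Y @ v)
    return total / n_projections

rng = np.random.default_rng(1)
emb_lang_a = rng.normal(size=(50, 8))        # stand-in for one language's embeddings
emb_lang_b = rng.normal(size=(50, 8)) + 0.5  # shifted "other language" cloud

# A distribution is at distance zero from itself; the shifted cloud is not.
print(sliced_wasserstein(emb_lang_a, emb_lang_a))
print(sliced_wasserstein(emb_lang_a, emb_lang_b) > 0)
```

Exact high-dimensional Wasserstein distances would instead require solving an optimal-transport problem (e.g. with a dedicated OT library); the sliced variant trades exactness for simplicity.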

Exploiting Cross-Lingual Subword Similarities in Low-Resource Document Classification

no code yet • 22 Dec 2018

Text classification must sometimes be applied in a low-resource language with no labeled training data.

Variational learning across domains with triplet information

no code yet • 22 Oct 2018

This work investigates deep generative models that allow training data from one domain to be used to build a model for another domain.

NMT-based Cross-lingual Document Embeddings

no code yet • 29 Jul 2018

This paper adds a distance constraint to NV's training objective so that the two embeddings of a parallel document are required to be as close as possible.
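A distance constraint of this kind is typically realized as an extra penalty term added to the loss. The sketch below shows the general shape only; the weight, the placeholder task loss, and the toy embeddings are our assumptions, not values from the paper.

```python
import numpy as np

def distance_penalty(src_emb, tgt_emb):
    """Squared Euclidean distance between the two embeddings of a
    parallel document pair, used as an extra training penalty."""
    return float(np.sum((src_emb - tgt_emb) ** 2))

# Hypothetical embeddings of the same document in two languages.
e_en = np.array([0.2, -0.1, 0.7])
e_de = np.array([0.25, -0.05, 0.65])

lam = 0.1        # weight of the constraint term (assumed)
task_loss = 1.3  # placeholder for the model's original objective
total_loss = task_loss + lam * distance_penalty(e_en, e_de)
print(total_loss)
```

Minimizing the combined objective pulls the two language-specific embeddings of each parallel document together while still optimizing the original task.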

A Multi-task Approach to Learning Multilingual Representations

no code yet • ACL 2018

We present a novel multi-task modeling approach to learning multilingual distributed representations of text.

Multilingual Seq2seq Training with Similarity Loss for Cross-Lingual Document Classification

no code yet • WS 2018

In this paper we continue experiments where neural machine translation training is used to produce joint cross-lingual fixed-dimensional sentence embeddings.
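A similarity loss over parallel sentence embeddings is commonly written as mean (1 − cosine similarity) over a batch, which is zero when each source/target pair points in the same direction. This is a generic sketch of that loss shape, not the paper's exact formulation.

```python
import numpy as np

def similarity_loss(src_batch, tgt_batch):
    """Mean (1 - cosine similarity) over a batch of parallel sentence
    embeddings: 0 when each pair points in the same direction."""
    src = src_batch / np.linalg.norm(src_batch, axis=1, keepdims=True)
    tgt = tgt_batch / np.linalg.norm(tgt_batch, axis=1, keepdims=True)
    return float(np.mean(1.0 - np.sum(src * tgt, axis=1)))

# Identical parallel embeddings give zero loss; mismatched ones do not.
a = np.array([[1.0, 0.0], [0.0, 1.0]])
b = np.array([[0.0, 1.0], [1.0, 0.0]])
print(similarity_loss(a, a))                          # → 0.0
print(similarity_loss(a, b) > similarity_loss(a, a))  # → True
```

Added on top of the NMT objective, such a term encourages the encoder to map translations of the same sentence to nearby points in the joint embedding space.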
