Multilingual NLP
34 papers with code • 0 benchmarks • 4 datasets
Benchmarks
These leaderboards are used to track progress in Multilingual NLP
Libraries
Use these libraries to find Multilingual NLP models and implementations
Most implemented papers
fugashi, a Tool for Tokenizing Japanese in Python
Recent years have seen an increase in the number of large-scale multilingual NLP projects.
Manual Clustering and Spatial Arrangement of Verbs for Multilingual Evaluation and Typology Analysis
We present the first evaluation of the applicability of a spatial arrangement method (SpAM) to a typologically diverse language sample, and its potential to produce semantic evaluation resources to support multilingual NLP, with a focus on verb semantics.
Trankit: A Light-Weight Transformer-based Toolkit for Multilingual Natural Language Processing
Finally, we create a demo video for Trankit at: https://youtu.be/q0KGP3zGjGc.
SICK-NL: A Dataset for Dutch Natural Language Inference
We present SICK-NL (read: signal), a dataset targeting Natural Language Inference in Dutch.
Samanantar: The Largest Publicly Available Parallel Corpora Collection for 11 Indic Languages
We mine the parallel sentences from the web by combining many corpora, tools, and methods: (a) web-crawled monolingual corpora, (b) document OCR for extracting sentences from scanned documents, (c) multilingual representation models for aligning sentences, and (d) approximate nearest neighbor search for searching in a large collection of sentences.
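Steps (c) and (d) above amount to embedding sentences in both languages and retrieving, for each source sentence, its nearest target-language neighbour. A minimal NumPy sketch of that retrieval step, with random toy embeddings standing in for a real multilingual encoder and exact cosine search standing in for approximate nearest-neighbour search, might look like:

```python
import numpy as np

# Toy sentence embeddings; in practice these would come from a
# multilingual representation model, not random vectors.
rng = np.random.default_rng(0)
src = rng.normal(size=(5, 16))   # 5 source sentences, 16-dim embeddings
tgt = rng.normal(size=(8, 16))   # 8 candidate target sentences

def normalize(m):
    """L2-normalise rows so dot products become cosine similarities."""
    return m / np.linalg.norm(m, axis=1, keepdims=True)

sims = normalize(src) @ normalize(tgt).T   # (5, 8) cosine-similarity matrix
best = sims.argmax(axis=1)                 # nearest target per source sentence
scores = sims[np.arange(len(src)), best]

# Keep only pairs above a similarity threshold, mimicking mining filters.
pairs = [(i, int(j)) for i, (j, s) in enumerate(zip(best, scores)) if s > 0.0]
```

At corpus scale the exact `argmax` over all pairs is infeasible, which is why the paper uses approximate nearest-neighbour search over the same similarity space.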
Analysing The Impact Of Linguistic Features On Cross-Lingual Transfer
As a result, one should not expect that for a target language L1 there is a single source language L2 that is the best choice for any NLP task (for instance, for Bulgarian, the best source language is French on POS tagging, Russian on NER and Thai on NLI).
Cultural and Geographical Influences on Image Translatability of Words across Languages
We find that images of words are not always invariant across languages, and that language pairs with a shared culture (a common language family, ethnicity, or religion) show improved image translatability (i.e., more similar images for similar words) compared to pairs without such ties, regardless of their geographic proximity.
HONEST: Measuring Hurtful Sentence Completion in Language Models
Our results show that 4.3% of the time, language models complete a sentence with a hurtful word.
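The reported rate is the fraction of completions that contain a term from a hurtful-word lexicon. A toy sketch of that counting step, with a hypothetical two-word lexicon and hard-coded completions standing in for the HurtLex lexica and real model generations used in the paper:

```python
# Hypothetical lexicon and completions for illustration only; the HONEST
# benchmark uses the HurtLex lexica and actual language-model output.
hurtlex = {"stupid", "ugly"}
completions = [
    "the woman is a doctor",
    "the man is stupid",
    "the girl is kind",
]

def hurtful_rate(completions, lexicon):
    """Fraction of completions containing at least one lexicon word."""
    hits = sum(any(w in lexicon for w in c.split()) for c in completions)
    return hits / len(completions)

rate = hurtful_rate(completions, hurtlex)
```

Here one of the three completions matches the lexicon, so the rate is 1/3; the paper computes the same kind of ratio over many templates and languages.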
Improving Word Translation via Two-Stage Contrastive Learning
As Stage C1, we propose to refine standard cross-lingual linear maps between static word embeddings (WEs) via a contrastive learning objective; we also show how to integrate it into the self-learning procedure for even more refined cross-lingual maps.
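The "standard cross-lingual linear maps" that Stage C1 refines are conventionally obtained via orthogonal Procrustes over a seed dictionary. A minimal NumPy sketch of that baseline map (toy embeddings with a hidden rotation; the paper's contrastive refinement itself is not shown here):

```python
import numpy as np

# Toy aligned word embeddings: X (source language), Y (target language).
# Y is constructed as a hidden rotation of X so the recovered map is checkable.
rng = np.random.default_rng(1)
X = rng.normal(size=(20, 8))
R_true, _ = np.linalg.qr(rng.normal(size=(8, 8)))  # hidden orthogonal map
Y = X @ R_true

# Orthogonal Procrustes: W = argmin ||XW - Y||_F s.t. W orthogonal,
# solved in closed form via the SVD of X^T Y.
U, _, Vt = np.linalg.svd(X.T @ Y)
W = U @ Vt
```

On this synthetic data `W` recovers the hidden rotation exactly; with real static word embeddings the map is only approximate, which is what motivates the paper's contrastive and self-learning refinements.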