Multilingual NLP

34 papers with code • 0 benchmarks • 4 datasets

Most implemented papers

fugashi, a Tool for Tokenizing Japanese in Python

polm/fugashi EMNLP (NLPOSS) 2020

Recent years have seen an increase in the number of large-scale multilingual NLP projects.

Manual Clustering and Spatial Arrangement of Verbs for Multilingual Evaluation and Typology Analysis

om304/multi-spa-verb COLING 2020

We present the first evaluation of the applicability of a spatial arrangement method (SpAM) to a typologically diverse language sample, and its potential to produce semantic evaluation resources to support multilingual NLP, with a focus on verb semantics.

SICK-NL: A Dataset for Dutch Natural Language Inference

gijswijnholds/sick_nl 14 Jan 2021

We present SICK-NL (read: signal), a dataset targeting Natural Language Inference in Dutch.

SICK-NL: A Dataset for Dutch Natural Language Inference

gijswijnholds/sick_nl EACL 2021

We present SICK-NL (read: signal), a dataset targeting Natural Language Inference in Dutch.

Samanantar: The Largest Publicly Available Parallel Corpora Collection for 11 Indic Languages

AI4Bharat/indicTrans 12 Apr 2021

We mine the parallel sentences from the web by combining many corpora, tools, and methods: (a) web-crawled monolingual corpora, (b) document OCR for extracting sentences from scanned documents, (c) multilingual representation models for aligning sentences, and (d) approximate nearest neighbor search for searching in a large collection of sentences.
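
Steps (c) and (d) above can be illustrated with a minimal sketch. It assumes each sentence has already been mapped to a vector by some multilingual representation model; the toy 3-dimensional vectors, the `mine_pairs` helper, and the 0.8 threshold are all hypothetical and not taken from the paper. For brevity it uses exact brute-force cosine search as a stand-in for the approximate nearest neighbor search a large collection would need.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mine_pairs(src_vecs, tgt_vecs, threshold=0.8):
    """Pair each source sentence with its most similar target sentence.

    Returns (src_index, tgt_index, score) tuples, keeping only matches
    whose cosine similarity reaches the (hypothetical) threshold.
    """
    pairs = []
    for i, u in enumerate(src_vecs):
        # Exact search over all targets; an ANN index would replace this loop.
        best_j, best_score = max(
            ((j, cosine(u, v)) for j, v in enumerate(tgt_vecs)),
            key=lambda item: item[1],
        )
        if best_score >= threshold:
            pairs.append((i, best_j, best_score))
    return pairs


# Made-up embeddings standing in for encoder output.
src = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.5, 0.5, 0.7]]
tgt = [[0.0, 0.9, 0.1], [0.9, 0.1, 0.0], [-1.0, 0.0, 0.0]]

pairs = mine_pairs(src, tgt)
for i, j, score in pairs:
    print(f"source {i} <-> target {j} (cosine {score:.3f})")
```

With these toy vectors, the third source sentence has no target above the threshold and is dropped, which is how a similarity cutoff filters out sentences without a translation in the other corpus.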

Analysing The Impact Of Linguistic Features On Cross-Lingual Transfer

blazejdolicki/multilingual-analysis 12 May 2021

As a result, one should not expect that for a target language $L_1$ there is a single language $L_2$ that is the best choice for any NLP task (for instance, for Bulgarian, the best source language is French on POS tagging, Russian on NER and Thai on NLI).

Cultural and Geographical Influences on Image Translatability of Words across Languages

nikzadkhani/MMID-CNN-Analysis NAACL 2021

We find that images of words are not always invariant across languages, and that language pairs with shared culture, meaning having either a common language family, ethnicity or religion, have improved image translatability (i.e., have more similar images for similar words) compared to its converse, regardless of their geographic proximity.

HONEST: Measuring Hurtful Sentence Completion in Language Models

milanlproc/honest NAACL 2021

Our results show that 4.3% of the time, language models complete a sentence with a hurtful word.
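
A percentage like the one above can be read as the share of completions that contain a hurtful word. The sketch below shows one way such a rate could be computed, assuming a simple lexicon lookup over whitespace-separated tokens; the `hurtful_rate` helper, the placeholder lexicon, and the example completions are hypothetical and do not reproduce the paper's templates, lexicon, or scoring.

```python
def hurtful_rate(completions, lexicon):
    """Fraction of completions containing at least one lexicon word.

    Matching is case-insensitive and ignores surrounding punctuation.
    This is an illustrative stand-in, not the paper's metric.
    """
    if not completions:
        return 0.0
    lexicon = {word.lower() for word in lexicon}
    hits = 0
    for text in completions:
        tokens = (tok.strip(".,!?;:\"'").lower() for tok in text.split())
        if any(tok in lexicon for tok in tokens):
            hits += 1
    return hits / len(completions)


# Placeholder lexicon entry and made-up completions for illustration only.
lexicon = ["badword"]
completions = [
    "the woman is a badword",
    "the man is a teacher",
    "she works as a doctor",
    "he is known as a BADWORD.",
]

rate = hurtful_rate(completions, lexicon)
print(f"{rate:.1%} of completions contain a lexicon word")
```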

Improving Word Translation via Two-Stage Contrastive Learning

cambridgeltl/contrastivebli ACL ARR November 2021

As Stage C1, we propose to refine standard cross-lingual linear maps between static word embeddings (WEs) via a contrastive learning objective; we also show how to integrate it into the self-learning procedure for even more refined cross-lingual maps.