Native Language Identification

5 papers with code • 1 benchmark • 2 datasets

Native Language Identification (NLI) is the task of determining an author's native language (L1) based only on their writings in a second language (L2).
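Framed this way, NLI is a text classification problem: the input is L2 text and the label is the author's L1. The following is a minimal sketch of that framing, not the method of any paper listed here. It assumes character trigram counts as features and a multinomial naive Bayes classifier with add-one smoothing. The names `char_ngrams` and `NgramNaiveBayes`, the labels `L1_A` and `L1_B`, and the training sentences are all invented for illustration and do not come from any corpus.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=3):
    """Return the overlapping character n-grams of a lowercased text."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]


class NgramNaiveBayes:
    """Multinomial naive Bayes over character n-gram counts (add-one smoothing)."""

    def fit(self, texts, labels):
        self.counts = defaultdict(Counter)  # label -> n-gram counts
        self.docs = Counter(labels)         # label -> number of training texts
        for text, label in zip(texts, labels):
            self.counts[label].update(char_ngrams(text))
        self.vocab = {g for c in self.counts.values() for g in c}
        return self

    def predict(self, text):
        total_docs = sum(self.docs.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.counts.items():
            total = sum(counts.values())
            # Log prior plus smoothed log likelihood of each n-gram.
            score = math.log(self.docs[label] / total_docs)
            for gram in char_ngrams(text):
                score += math.log((counts[gram] + 1) / (total + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy L2 English sentences with hypothetical L1 labels (assumption).
train_texts = [
    "i have the book yesterday bought",
    "he has the car yesterday sold",
    "i have twenty years and live in madrid",
    "she has thirty years and works here",
]
train_labels = ["L1_A", "L1_A", "L1_B", "L1_B"]

model = NgramNaiveBayes().fit(train_texts, train_labels)
print(model.predict("we have the house yesterday bought"))
```

A real system would train on a learner corpus with many texts per L1 and would usually combine several feature types; the toy data here only shows the shape of the pipeline.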

Benchmarks

These leaderboards are used to track progress in Native Language Identification.

Dataset	Best Model
italki NLI	Tubasfs

Most implemented papers

A study of N-gram and Embedding Representations for Native Language Identification

nishkalavallabhi/NLIST2017 • WS 2017

We report on our experiments with N-gram and embedding based feature representations for Native Language Identification (NLI) as a part of the NLI Shared Task 2017 (team name: NLI-ISU).

On the Development of a Large Scale Corpus for Native Language Identification

ghomasHudson/italkiCorpus • TLT17 2018

The corpus can be used to train machine learning systems that identify the native language of authors of English text.

Towards Ethical Content-Based Detection of Online Influence Campaigns

ecrows/l2-reddit-experiment • 29 Aug 2019

The detection of clandestine efforts to influence users in online communities is a challenging problem with significant active development.

Topics to Avoid: Demoting Latent Confounds in Text Classification

Sachin19/adversarial-classify • IJCNLP 2019

Despite impressive performance on many text classification tasks, deep neural networks tend to learn frequent superficial patterns that are specific to the training data and do not always generalize well.

Native Language Identification with Big Bird Embeddings

sergeykramp/mthesis-bigbird-embeddings • 13 Sep 2023

Native Language Identification (NLI) intends to classify an author's native language based on their writing in another language.
