Search Results for author: Keita Kurita

Found 4 papers, 3 papers with code

Weight Poisoning Attacks on Pretrained Models

no code implementations • ACL 2020 • Keita Kurita, Paul Michel, Graham Neubig

Recently, NLP has seen a surge in the usage of large pre-trained models.

Sentiment Analysis, Sentiment Classification, +1


Weight Poisoning Attacks on Pre-trained Models

2 code implementations • 14 Apr 2020 • Keita Kurita, Paul Michel, Graham Neubig

We show that by applying a regularization method, which we call RIPPLe, and an initialization procedure, which we call Embedding Surgery, such attacks are possible even with limited knowledge of the dataset and fine-tuning procedure.

Sentiment Analysis, Sentiment Classification, +1


Towards Robust Toxic Content Classification

1 code implementation • 14 Dec 2019 • Keita Kurita, Anna Belova, Antonios Anastasopoulos

We propose a method of generating realistic model-agnostic attacks using a lexicon of toxic tokens, which attempts to mislead toxicity classifiers by diluting the toxicity signal either by obfuscating toxic tokens through character-level perturbations, or by injecting non-toxic distractor tokens.

Classification, Denoising, +1
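
The two dilution strategies named in the abstract above (character-level obfuscation of lexicon tokens and injection of non-toxic distractor tokens) can be illustrated with a minimal sketch. This is not the authors' implementation: the `LOOKALIKES` table, the `obfuscate` and `attack` functions, whitespace tokenization, and appending distractors at the end of the text are all assumptions made for illustration.

```python
import random

# Hypothetical look-alike substitutions (an assumption, not taken from the paper).
LOOKALIKES = {"a": "@", "e": "3", "i": "1", "o": "0", "s": "$"}


def obfuscate(token, rng):
    """Perturb a token: substitute one look-alike character if possible,
    otherwise swap two adjacent inner characters; very short tokens are returned as-is."""
    positions = [i for i, ch in enumerate(token) if ch.lower() in LOOKALIKES]
    if positions:
        i = rng.choice(positions)
        return token[:i] + LOOKALIKES[token[i].lower()] + token[i + 1:]
    if len(token) >= 4:
        i = rng.randrange(1, len(token) - 2)
        return token[:i] + token[i + 1] + token[i] + token[i + 2:]
    return token


def attack(text, lexicon, distractors, n_distractors=2, seed=0):
    """Obfuscate every token found in the lexicon, then append distractor tokens."""
    rng = random.Random(seed)
    # Whitespace tokenization is a simplifying assumption.
    tokens = [obfuscate(t, rng) if t.lower() in lexicon else t for t in text.split()]
    # Dilute the remaining signal with tokens drawn from a benign list.
    tokens += [rng.choice(distractors) for _ in range(n_distractors)]
    return " ".join(tokens)
```

A call such as `attack("you are a badword", {"badword"}, ["lovely", "kind"])` leaves the first three tokens untouched, perturbs one character of the lexicon token, and appends two distractors. A real attack would presumably choose perturbations and insertion positions more carefully so that the text stays readable to humans.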


Measuring Bias in Contextualized Word Representations

1 code implementation • WS 2019 • Keita Kurita, Nidhi Vyas, Ayush Pareek, Alan W. Black, Yulia Tsvetkov

Contextual word embeddings such as BERT have achieved state-of-the-art performance in numerous NLP tasks.

Word Embeddings

