Search Results for author: Avijit Thawani

Found 7 papers, 3 papers with code

Numeracy enhances the Literacy of Language Models

no code implementations • EMNLP 2021 • Avijit Thawani, Jay Pujara, Filip Ilievski

This paper studies the effect of using six different number encoders on the task of masked word prediction (MWP), as a proxy for evaluating literacy.

Sentence

BPE beyond Word Boundary: How NOT to use Multi Word Expressions in Neural Machine Translation

1 code implementation • insights (ACL) 2022 • Dipesh Kumar, Avijit Thawani

BPE tokenization merges characters into longer tokens by finding frequently occurring contiguous patterns within the word boundary.

Machine Translation, NMT, +1

Learn Your Tokens: Word-Pooled Tokenization for Language Modeling

1 code implementation • 17 Oct 2023 • Avijit Thawani, Saurabh Ghanekar, Xiaoyuan Zhu, Jay Pujara

Language models typically tokenize text into subwords, using a deterministic, hand-engineered heuristic of combining characters into longer surface-level strings such as 'ing' or whole words.

Language Modelling

Estimating Numbers without Regression

no code implementations • 9 Oct 2023 • Avijit Thawani, Jay Pujara, Ashwin Kalyan

Despite recent successes in language models, their ability to represent numbers is insufficient.

Language Modelling, Regression

Representing Numbers in NLP: a Survey and a Vision

no code implementations • NAACL 2021 • Avijit Thawani, Jay Pujara, Pedro A. Szekely, Filip Ilievski

NLP systems rarely give special consideration to numbers found in text.

SWOW-8500: Word Association task for Intrinsic Evaluation of Word Embeddings

1 code implementation • WS 2019 • Avijit Thawani, Biplav Srivastava, Anil Singh

Downstream evaluation of pretrained word embeddings is expensive, more so for tasks where current state of the art models are very large architectures.

General Classification, Natural Language Inference, +4

IJCNLP-2017 Task 3: Review Opinion Diversification (RevOpiD-2017)

no code implementations • IJCNLP 2017 • Anil Kumar Singh, Avijit Thawani, Mayank Panchal, Anubhav Gupta, Julian McAuley

Unlike Entity Disambiguation in web search results, Opinion Disambiguation is a relatively unexplored topic.

Document Ranking, Document Summarization, +3
