Search Results for author: Suma Bhat

Found 40 papers, 16 papers with code

All-but-the-Top: Simple and Effective Postprocessing for Word Representations

4 code implementations ICLR 2018 Jiaqi Mu, Suma Bhat, Pramod Viswanath

The postprocessing is empirically validated on a variety of lexical-level intrinsic tasks (word similarity, concept categorization, word analogy) and sentence-level tasks (semantic textual similarity and text classification) on multiple datasets and with a variety of representation methods and hyperparameter choices in multiple languages; in each case, the processed representations are consistently better than the original ones.

General Classification Sentence +4

Document Similarity for Texts of Varying Lengths via Hidden Topics

1 code implementation ACL 2018 Hongyu Gong, Tarek Sakakini, Suma Bhat, JinJun Xiong

This is because of the lexical, contextual and the abstraction gaps between a long document of rich details and its concise summary of abstract information.

Text Matching

Self-Supervised Euphemism Detection and Identification for Content Moderation

1 code implementation 31 Mar 2021 Wanzheng Zhu, Hongyu Gong, Rohan Bansal, Zachary Weinberg, Nicolas Christin, Giulia Fanti, Suma Bhat

It is usually apparent to a human moderator that a word is being used euphemistically, but they may not know what the secret meaning is, and therefore whether the message violates policy.

Sentence Word Embeddings

Euphemistic Phrase Detection by Masked Language Model

1 code implementation Findings (EMNLP) 2021 Wanzheng Zhu, Suma Bhat

It is a well-known approach for fringe groups and organizations to use euphemisms -- ordinary-sounding and innocent-looking words with a secret meaning -- to conceal what they are discussing.

Language Modelling TAR

Generate, Prune, Select: A Pipeline for Counterspeech Generation against Online Hate Speech

1 code implementation Findings (ACL) 2021 Wanzheng Zhu, Suma Bhat

Countermeasures to effectively fight the ever-increasing hate speech online without blocking freedom of speech are of great social interest.

Blocking Retrieval +1

Idiomatic Expression Identification using Semantic Compatibility

1 code implementation 19 Oct 2021 Ziheng Zeng, Suma Bhat

Idiomatic expressions are an integral part of natural language and are constantly being added to a language.

Sentence

Geometry of Compositionality

1 code implementation 29 Nov 2016 Hongyu Gong, Suma Bhat, Pramod Viswanath

This paper proposes a simple test for compositionality (i.e., literal usage) of a word or phrase in a context-specific way.

Word Embeddings

FUSE: Multi-Faceted Set Expansion by Coherent Clustering of Skip-grams

1 code implementation 10 Oct 2019 Wanzheng Zhu, Hongyu Gong, Jiaming Shen, Chao Zhang, Jingbo Shang, Suma Bhat, Jiawei Han

In this paper, we study the task of multi-faceted set expansion, which aims to capture all semantic facets in the seed set and return multiple sets of entities, one for each semantic facet.

Clustering Language Modelling

Enriching Word Embeddings with Temporal and Spatial Information

1 code implementation CONLL 2020 Hongyu Gong, Suma Bhat, Pramod Viswanath

The meaning of a word is closely linked to sociocultural factors that can change over time and location, resulting in corresponding meaning changes.

Word Embeddings

CRISP: Curriculum based Sequential Neural Decoders for Polar Code Family

1 code implementation 1 Oct 2022 S Ashwin Hebbar, Viraj Nadkarni, Ashok Vardhan Makkuva, Suma Bhat, Sewoong Oh, Pramod Viswanath

We design a principled curriculum, guided by information-theoretic insights, to train CRISP and show that it outperforms the successive-cancellation (SC) decoder and attains near-optimal reliability performance on the Polar(32, 16) and Polar(64, 22) codes.

Preposition Sense Disambiguation and Representation

1 code implementation EMNLP 2018 Hongyu Gong, Jiaqi Mu, Suma Bhat, Pramod Viswanath

Prepositions are highly polysemous, and their variegated senses encode significant semantic information.

IEKG: A Commonsense Knowledge Graph for Idiomatic Expressions

1 code implementation 11 Dec 2023 Ziheng Zeng, Kellen Tan Cheng, Srihari Venkat Nanniyur, Jianing Zhou, Suma Bhat

Unlike prior works that enable IE comprehension through fine-tuning PTLMs with sentences containing IEs, in this work, we construct IEKG, a commonsense knowledge graph for figurative interpretations of IEs.

Natural Language Understanding

Embedding Syntax and Semantics of Prepositions via Tensor Decomposition

no code implementations NAACL 2018 Hongyu Gong, Suma Bhat, Pramod Viswanath

Prepositions are among the most frequent words in English and play complex roles in the syntax and semantics of sentences.

Tensor Decomposition

MORSE: Semantic-ally Drive-n MORpheme SEgment-er

no code implementations ACL 2017 Tarek Sakakini, Suma Bhat, Pramod Viswanath

We present in this paper a novel framework for morpheme segmentation which uses the morpho-syntactic regularities preserved by word representations, in addition to orthographic features, to segment words into morphemes.

Benchmarking

Fixing the Infix: Unsupervised Discovery of Root-and-Pattern Morphology

no code implementations 7 Feb 2017 Tarek Sakakini, Suma Bhat, Pramod Viswanath

We present an unsupervised and language-agnostic method for learning root-and-pattern morphology in Semitic languages.

Prepositions in Context

no code implementations 5 Feb 2017 Hongyu Gong, Jiaqi Mu, Suma Bhat, Pramod Viswanath

Prepositions are highly polysemous, and their variegated senses encode significant semantic information.

Clustering

Geometry of Polysemy

no code implementations 24 Oct 2016 Jiaqi Mu, Suma Bhat, Pramod Viswanath

Vector representations of words have heralded a transformational approach to classical problems in NLP; the most popular example is word2vec.

Clustering Sentence

Context-Sensitive Malicious Spelling Error Correction

no code implementations 23 Jan 2019 Hongyu Gong, Yuchen Li, Suma Bhat, Pramod Viswanath

Misspelled words of the malicious kind work by changing specific keywords and are intended to thwart existing automated applications for cyber-environment control such as harassing content detection on the Internet and email spam detection.

Spam detection Spelling Correction +1

Equipping Educational Applications with Domain Knowledge

no code implementations WS 2019 Tarek Sakakini, Hongyu Gong, Jong Yoon Lee, Robert Schloss, JinJun Xiong, Suma Bhat

One of the challenges of building natural language processing (NLP) applications for education is finding a large domain-specific corpus for the subject of interest (e.g., history or science).

Distractor Generation Language Modelling +1

PaRe: A Paper-Reviewer Matching Approach Using a Common Topic Space

no code implementations IJCNLP 2019 Omer Anjum, Hongyu Gong, Suma Bhat, Wen-mei Hwu, JinJun Xiong

Finding the right reviewers to assess the quality of conference submissions is a time-consuming process for conference organizers.

Topic Models

From Solving a Problem Boldly to Cutting the Gordian Knot: Idiomatic Text Generation

no code implementations 13 Apr 2021 Jianing Zhou, Hongyu Gong, Srihari Nanniyur, Suma Bhat

We study a new application for text generation -- idiomatic sentence generation -- which aims to transfer literal phrases in sentences into their idiomatic counterparts.

Sentence Text Generation

PIE: A Parallel Idiomatic Expression Corpus for Idiomatic Sentence Generation and Paraphrasing

no code implementations ACL (MWE) 2021 Jianing Zhou, Hongyu Gong, Suma Bhat

Idiomatic expressions (IE) play an important role in natural language, and have long been a “pain in the neck” for NLP systems.

Sentence Text Generation

Paraphrase Generation: A Survey of the State of the Art

no code implementations EMNLP 2021 Jianing Zhou, Suma Bhat

This paper focuses on paraphrase generation, which is a widely studied natural language generation task in NLP.

Paraphrase Generation

Rich Syntactic and Semantic Information Helps Unsupervised Text Style Transfer

no code implementations INLG (ACL) 2020 Hongyu Gong, Linfeng Song, Suma Bhat

Text style transfer aims to change an input sentence to an output sentence by changing its text style while preserving the content.

Sentence Style Transfer +2

Idiomatic Expression Paraphrasing without Strong Supervision

no code implementations 16 Dec 2021 Jianing Zhou, Ziheng Zeng, Hongyu Gong, Suma Bhat

In this paper, we study the task of idiomatic sentence paraphrasing (ISP), which aims to paraphrase a sentence with an IE by replacing the IE with its literal paraphrase.

Machine Translation Sentence +1

Getting BART to Ride the Idiomatic Train: Learning to Represent Idiomatic Expressions

1 code implementation 8 Jul 2022 Ziheng Zeng, Suma Bhat

Idiomatic expressions (IEs), characterized by their non-compositionality, are an important part of natural language.

Clustering

Unified Representation for Non-compositional and Compositional Expressions

1 code implementation 29 Oct 2023 Ziheng Zeng, Suma Bhat

Accurate processing of non-compositional language relies on generating good representations for such expressions.

Language Modelling
