Search Results for author: Haw-Shiuan Chang

Found 20 papers, 9 papers with code

Softmax Bottleneck Makes Language Models Unable to Represent Multi-mode Word Distributions

no code implementations • ACL 2022 • Haw-Shiuan Chang, Andrew McCallum

The softmax layer produces the distribution based on the dot products of a single hidden state and the embeddings of words in the vocabulary.

Word Embeddings
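
In the standard output layer the abstract refers to, the next-word distribution for a context c with hidden state h_c and word embeddings e_w is (textbook formulation, not quoted from the paper):

```latex
p(w \mid c) = \frac{\exp\!\left(h_c^{\top} e_w\right)}{\sum_{w' \in V} \exp\!\left(h_c^{\top} e_{w'}\right)}
```

Because a single vector h_c ranks every word by one dot product, it has difficulty assigning high probability to several semantically distant words at once, which is the multi-mode limitation named in the title.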

Unsupervised Partial Sentence Matching for Cited Text Identification

no code implementations • sdp (COLING) 2022 • Kathryn Ricci, Haw-Shiuan Chang, Purujit Goyal, Andrew McCallum

Given a citation in the body of a research paper, cited text identification aims to find the sentences in the cited paper that are most relevant to the citing sentence.

Sentence • Sentence Embeddings

REAL Sampling: Boosting Factuality and Diversity of Open-Ended Generation via Asymptotic Entropy

no code implementations • 11 Jun 2024 • Haw-Shiuan Chang, Nanyun Peng, Mohit Bansal, Anil Ramakrishna, Tagyoung Chung

If an LLM's entropy is higher than the asymptotic entropy (i.e., the LLM is more uncertain than it should be), the THF model predicts a high hallucination hazard, which leads to a lower p threshold in REAL sampling.

Diversity • Hallucination
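
A minimal sketch of the adaptive nucleus idea summarized above; the function shape, the exponential hazard-to-threshold mapping, and the precomputed asymptotic-entropy input are illustrative assumptions, not the paper's THF model or exact schedule:

```python
import torch

def real_sampling_filter(logits, asymptotic_entropy, base_p=0.95):
    """Sketch: shrink the nucleus (top-p) threshold when the LM's entropy
    exceeds the predicted asymptotic entropy (assumed decay schedule)."""
    probs = torch.softmax(logits, dim=-1)
    entropy = -(probs * torch.log(probs + 1e-12)).sum()
    # Hallucination hazard: how much more uncertain the LM is than it "should" be.
    hazard = torch.clamp(entropy - asymptotic_entropy, min=0.0)
    p = base_p * torch.exp(-hazard)  # higher hazard -> lower p threshold
    # Standard nucleus filtering with the adapted p.
    sorted_probs, sorted_idx = probs.sort(descending=True)
    cumulative = sorted_probs.cumsum(dim=-1)
    keep = cumulative - sorted_probs < p  # keep tokens until mass p is covered
    filtered = torch.zeros_like(probs)
    filtered[sorted_idx[keep]] = probs[sorted_idx[keep]]
    return filtered / filtered.sum()
```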

To Copy, or not to Copy; That is a Critical Issue of the Output Softmax Layer in Neural Sequential Recommenders

1 code implementation • 21 Oct 2023 • Haw-Shiuan Chang, Nikhil Agarwal, Andrew McCallum

Specifically, the similarity structure of the global item embeddings in the softmax layer sometimes forces the single hidden-state embedding to be close to new items when copying is a better choice, while at other times inappropriately forcing the hidden state to be close to items from the input.

Sequential Recommendation
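
To make the trade-off concrete, here is a toy contrast between the global softmax head and a copy head over the input items; the gating and shapes are illustrative assumptions, not the paper's proposed architecture:

```python
import torch

def next_item_scores(hidden, item_emb, input_item_ids, copy_gate):
    """Illustrative sketch: score the next item either from the global
    softmax over shared item embeddings or from a copy distribution
    restricted to items already seen in the input sequence."""
    softmax_scores = torch.softmax(item_emb @ hidden, dim=-1)  # dot products with one hidden state
    copy_scores = torch.zeros_like(softmax_scores)
    copy_scores[input_item_ids] = 1.0 / len(input_item_ids)    # uniform copy over input items
    # The abstract's point: with a single hidden state, the geometry of item_emb
    # can push softmax_scores toward the wrong side of this trade-off; a learned
    # gate interpolating the two heads is one way to decouple the decision.
    return copy_gate * copy_scores + (1 - copy_gate) * softmax_scores
```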

Encoding Multi-Domain Scientific Papers by Ensembling Multiple CLS Tokens

1 code implementation • 8 Sep 2023 • Ronald Seoh, Haw-Shiuan Chang, Andrew McCallum

Many useful tasks on scientific documents, such as topic classification and citation prediction, involve corpora that span multiple scientific domains.

Citation Prediction • Topic Classification

Revisiting the Architectures like Pointer Networks to Efficiently Improve the Next Word Distribution, Summarization Factuality, and Beyond

1 code implementation • 20 May 2023 • Haw-Shiuan Chang, Zonghai Yao, Alolika Gon, Hong Yu, Andrew McCallum

Is the output softmax layer, which is adopted by most language models (LMs), always the best way to compute the next word probability?
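
For context, the pointer-network family named in the title mixes the vocabulary softmax with a copy distribution over context tokens. The sketch below is the generic pointer-generator recipe, shown only to illustrate the design space the paper revisits, not its specific model:

```python
import torch

def pointer_generator_dist(vocab_logits, attn_weights, src_token_ids, p_gen):
    """Generic pointer-generator mixture: blend the softmax over the
    vocabulary with attention mass copied from the source tokens.
    src_token_ids must be a LongTensor of vocabulary indices."""
    vocab_dist = p_gen * torch.softmax(vocab_logits, dim=-1)
    # Scatter attention mass over source tokens back into vocabulary space.
    copy_dist = torch.zeros_like(vocab_dist)
    copy_dist.scatter_add_(0, src_token_ids, (1 - p_gen) * attn_weights)
    return vocab_dist + copy_dist
```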

Multi-CLS BERT: An Efficient Alternative to Traditional Ensembling

1 code implementation • 10 Oct 2022 • Haw-Shiuan Chang, Ruei-Yao Sun, Kathryn Ricci, Andrew McCallum

Ensembling BERT models often significantly improves accuracy, but at the cost of significantly more computation and a larger memory footprint.
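
A rough sketch of the multi-CLS alternative named in the title: one encoder pass with K CLS-like summary vectors whose predictions are aggregated, so the ensemble costs roughly one forward pass. The learned CLS embeddings, per-CLS heads, and averaging below are assumptions for illustration, not the paper's exact design:

```python
import torch
import torch.nn as nn

class MultiCLSHead(nn.Module):
    """Sketch: K [CLS]-like summary vectors from one encoder, one classifier
    head per CLS token, with predictions averaged (assumed aggregation)."""
    def __init__(self, hidden_size, num_labels, num_cls=5):
        super().__init__()
        self.cls_embeds = nn.Parameter(torch.randn(num_cls, hidden_size) * 0.02)
        self.heads = nn.ModuleList(nn.Linear(hidden_size, num_labels)
                                   for _ in range(num_cls))

    def forward(self, encoder_fn, token_embeds):
        # Prepend K learned CLS embeddings to the token embeddings, encode once.
        batch = token_embeds.size(0)
        cls = self.cls_embeds.unsqueeze(0).expand(batch, -1, -1)
        hidden = encoder_fn(torch.cat([cls, token_embeds], dim=1))
        logits = torch.stack([head(hidden[:, k]) for k, head in enumerate(self.heads)])
        return logits.mean(dim=0)  # cheap "ensemble": average the K heads
```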

Augmenting Scientific Creativity with Retrieval across Knowledge Domains

1 code implementation • 2 Jun 2022 • Hyeonsu B. Kang, Sheshera Mysore, Kevin Huang, Haw-Shiuan Chang, Thorben Prein, Andrew McCallum, Aniket Kittur, Elsa Olivetti

Exposure to ideas in domains outside a scientist's own may benefit her in reformulating existing research problems in novel ways and discovering new application domains for existing solution ideas.

Retrieval

Open Aspect Target Sentiment Classification with Natural Language Prompts

1 code implementation • EMNLP 2021 • Ronald Seoh, Ian Birle, Mrinal Tak, Haw-Shiuan Chang, Brian Pinette, Alfred Hough

For many business applications, we often seek to analyze sentiments associated with any arbitrary aspects of commercial products, despite having a very limited amount of labels or even without any labels at all.

Classification • Sentiment Analysis +1
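
One way to picture the prompt-based setup: score an arbitrary aspect with a masked LM by comparing sentiment-word probabilities, with no labeled data. The template and verbalizer words below are assumptions for illustration, not the paper's prompts:

```python
from transformers import pipeline

# Zero-shot-style sketch with a masked LM; template and verbalizer are assumed.
fill = pipeline("fill-mask", model="bert-base-uncased")

def aspect_sentiment(review, aspect):
    prompt = f"{review} Overall, the {aspect} was [MASK]."
    scores = {tok["token_str"]: tok["score"]
              for tok in fill(prompt, targets=["good", "bad"])}
    return "positive" if scores.get("good", 0) >= scores.get("bad", 0) else "negative"

print(aspect_sentiment("The pasta arrived cold but the staff were lovely.", "service"))
```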

Extending Multi-Sense Word Embedding to Phrases and Sentences for Unsupervised Semantic Applications

no code implementations • 29 Mar 2021 • Haw-Shiuan Chang, Amol Agrawal, Andrew McCallum

Most unsupervised NLP models represent each word with a single point or single region in semantic space, while the existing multi-sense word embeddings cannot represent longer word sequences like phrases or sentences.

Extractive Summarization • Sentence +2

Changing the Mind of Transformers for Topically-Controllable Language Generation

1 code implementation • EACL 2021 • Haw-Shiuan Chang, Jiaming Yuan, Mohit Iyyer, Andrew McCallum

Our framework consists of two components: (1) a method that produces a set of candidate topics by predicting the centers of word clusters in the possible continuations, and (2) a text generation model whose output adheres to the chosen topics.

Clustering • Text Generation
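
A sketch of component (1) as described: embed the likely continuation words and cluster them, offering the cluster centers as candidate topics. The use of KMeans over a generic word-embedding matrix is an assumption, not the paper's trained topic predictor:

```python
import numpy as np
from sklearn.cluster import KMeans

def candidate_topics(next_word_probs, word_embeddings, k=10, top_n=500):
    """Sketch of step (1): embed the most likely continuations and cluster
    them; each cluster center acts as one candidate 'topic' a user can pick."""
    top_ids = np.argsort(next_word_probs)[::-1][:top_n]  # most likely next words
    centers = KMeans(n_clusters=k, n_init=10).fit(word_embeddings[top_ids]).cluster_centers_
    return centers  # component (2) would condition generation on the chosen centers
```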

Multi-facet Universal Schema

no code implementations • EACL 2021 • Rohan Paul, Haw-Shiuan Chang, Andrew McCallum

To address the violation of the USchema assumption, we propose multi-facet universal schema that uses a neural model to represent each sentence pattern as multiple facet embeddings and encourage one of these facet embeddings to be close to that of another sentence pattern if they co-occur with the same entity pair.

Relation • Relation Extraction +1
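
A minimal rendering of the objective described above: each sentence pattern carries several facet embeddings, and two patterns that co-occur with the same entity pair are pulled together only through their best-matching facet pair. The max-over-pairs similarity and logistic loss are assumed forms for illustration, not the paper's exact training objective:

```python
import torch

def multifacet_similarity(facets_a, facets_b):
    """facets_*: (K, d) facet embeddings of two sentence patterns.
    Only the best-matching facet pair has to agree, so individual facets
    can specialize to different relation senses (max is an assumed choice)."""
    sims = facets_a @ facets_b.T  # (K, K) all facet-pair dot products
    return sims.max()             # the closest pair carries the signal

def co_occurrence_loss(facets_a, facets_b):
    # Patterns sharing an entity pair should have at least one close facet pair.
    return -torch.nn.functional.logsigmoid(multifacet_similarity(facets_a, facets_b))
```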

Efficient Graph-based Word Sense Induction by Distributional Inclusion Vector Embeddings

no code implementations • WS 2018 • Haw-Shiuan Chang, Amol Agrawal, Ananya Ganesh, Anirudha Desai, Vinayak Mathur, Alfred Hough, Andrew McCallum

Word sense induction (WSI), which addresses polysemy by unsupervised discovery of multiple word senses, resolves ambiguities for downstream NLP tasks and also makes word representations more interpretable.

Word Sense Induction

Automatically Extracting Action Graphs from Materials Science Synthesis Procedures

no code implementations • 18 Nov 2017 • Sheshera Mysore, Edward Kim, Emma Strubell, Ao Liu, Haw-Shiuan Chang, Srikrishna Kompella, Kevin Huang, Andrew McCallum, Elsa Olivetti

In this work, we present a system for automatically extracting structured representations of synthesis procedures from the texts of materials science journal articles that describe explicit, experimental syntheses of inorganic compounds.

Distributional Inclusion Vector Embedding for Unsupervised Hypernymy Detection

no code implementations • NAACL 2018 • Haw-Shiuan Chang, ZiYun Wang, Luke Vilnis, Andrew McCallum

Modeling hypernymy, such as poodle is-a dog, is an important generalization aid to many NLP tasks, such as entailment, coreference, relation extraction, and question answering.

Hypernym Discovery • Question Answering +1
