Search Results for author: Jacob Eisenstein

Found 90 papers, 28 papers with code

On Writing a Textbook on Natural Language Processing

no code implementations NAACL (TeachingNLP) 2021 Jacob Eisenstein

There are thousands of papers about natural language processing and computational linguistics, but very few textbooks.

Reuse Your Rewards: Reward Model Transfer for Zero-Shot Cross-Lingual Alignment

no code implementations 18 Apr 2024 Zhaofeng Wu, Ananth Balashankar, Yoon Kim, Jacob Eisenstein, Ahmad Beirami

In this work, we evaluate a simple approach for zero-shot cross-lingual alignment, where a reward model is trained on preference data in one source language and directly applied to other target languages.

Transforming and Combining Rewards for Aligning Large Language Models

no code implementations 1 Feb 2024 Zihao Wang, Chirag Nagpal, Jonathan Berant, Jacob Eisenstein, Alex D'Amour, Sanmi Koyejo, Victor Veitch

A common approach for aligning language models to human preferences is to first learn a reward model from preference data, and then use this reward model to update the language model.

Language Modelling

Theoretical guarantees on the best-of-n alignment policy

no code implementations 3 Jan 2024 Ahmad Beirami, Alekh Agarwal, Jonathan Berant, Alexander D'Amour, Jacob Eisenstein, Chirag Nagpal, Ananda Theertha Suresh

A commonly used analytical expression in the literature claims that the KL divergence between the best-of-$n$ policy and the base policy is equal to $\log(n) - (n-1)/n$. We disprove the validity of this claim, and show that it is an upper bound on the actual KL divergence.
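For a discrete base distribution with distinct per-item rewards, the best-of-$n$ policy and its KL divergence can be computed exactly, which makes the inequality easy to check numerically. A minimal sketch under that discreteness assumption (the closed forms below are standard order-statistics facts, not taken from the paper, whose analysis covers the general case):

```python
import math
import random

def best_of_n_kl(p, n):
    """Exact KL(pi_bon || p) for a discrete base distribution p whose items
    are listed in increasing order of reward (rewards assumed distinct).
    Best-of-n keeps the highest-reward draw among n i.i.d. samples, so
    P(item i) = F(i)^n - F(i-1)^n, where F is the CDF in reward order."""
    kl, cum_prev = 0.0, 0.0
    for pi in p:
        cum = cum_prev + pi
        q = cum ** n - cum_prev ** n   # best-of-n probability of this item
        if q > 0:
            kl += q * math.log(q / pi)
        cum_prev = cum
    return kl

def claimed_bound(n):
    """The expression log(n) - (n-1)/n from the literature."""
    return math.log(n) - (n - 1) / n

# On randomly generated base distributions the bound always holds, and is
# generally not tight in the discrete case.
random.seed(0)
for _ in range(5):
    w = [random.random() for _ in range(random.randint(2, 10))]
    p = [x / sum(w) for x in w]
    for n in (2, 4, 16):
        assert best_of_n_kl(p, n) <= claimed_bound(n) + 1e-12
print("bound holds on all sampled cases")
```

For instance, a uniform base distribution over two items with $n=2$ gives a KL of about 0.131, strictly below the claimed value of $\log(2) - 1/2 \approx 0.193$.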

Helping or Herding? Reward Model Ensembles Mitigate but do not Eliminate Reward Hacking

no code implementations 14 Dec 2023 Jacob Eisenstein, Chirag Nagpal, Alekh Agarwal, Ahmad Beirami, Alex D'Amour, DJ Dvijotham, Adam Fisch, Katherine Heller, Stephen Pfohl, Deepak Ramachandran, Peter Shaw, Jonathan Berant

However, even pretrain reward ensembles do not eliminate reward hacking: we show several qualitative reward hacking phenomena that are not mitigated by ensembling because all reward models in the ensemble exhibit similar error patterns.

Language Modelling

Selectively Answering Ambiguous Questions

no code implementations 24 May 2023 Jeremy R. Cole, Michael J. Q. Zhang, Daniel Gillick, Julian Martin Eisenschlos, Bhuwan Dhingra, Jacob Eisenstein

We investigate question answering from this perspective, focusing on answering a subset of questions with a high degree of accuracy, from a set of questions in which many are inherently ambiguous.

Question Answering

MD3: The Multi-Dialect Dataset of Dialogues

no code implementations 19 May 2023 Jacob Eisenstein, Vinodkumar Prabhakaran, Clara Rivera, Dorottya Demszky, Devyani Sharma

We introduce a new dataset of conversational speech representing English from India, Nigeria, and the United States.

Dialect-robust Evaluation of Generated Text

no code implementations 2 Nov 2022 Jiao Sun, Thibault Sellam, Elizabeth Clark, Tu Vu, Timothy Dozat, Dan Garrette, Aditya Siddhant, Jacob Eisenstein, Sebastian Gehrmann

Evaluation metrics that are not robust to dialect variation make it impossible to tell how well systems perform for many groups of users, and can even penalize systems for producing text in lower-resource dialects.

nlg evaluation

Predicting Long-Term Citations from Short-Term Linguistic Influence

1 code implementation 24 Oct 2022 Sandeep Soni, David Bamman, Jacob Eisenstein

A standard measure of the influence of a research paper is the number of times it is cited.

Informativeness and Invariance: Two Perspectives on Spurious Correlations in Natural Language

no code implementations NAACL 2022 Jacob Eisenstein

Spurious correlations are a threat to the trustworthiness of natural language processing systems, motivating research into methods for identifying and eliminating them.

Informativeness

Learning to Look Inside: Augmenting Token-Based Encoders with Character-Level Information

no code implementations 1 Aug 2021 Yuval Pinter, Amanda Stent, Mark Dredze, Jacob Eisenstein

Commonly-used transformer language models depend on a tokenization schema which sets an unchangeable subword vocabulary prior to pre-training, destined to be applied to all downstream tasks regardless of domain shift, novel word formations, or other sources of vocabulary mismatch.

Revisiting the Primacy of English in Zero-shot Cross-lingual Transfer

no code implementations 30 Jun 2021 Iulia Turc, Kenton Lee, Jacob Eisenstein, Ming-Wei Chang, Kristina Toutanova

Zero-shot cross-lingual transfer is emerging as a practical solution: pre-trained models later fine-tuned on one transfer language exhibit surprising performance when tested on many target languages.

Question Answering Zero-Shot Cross-Lingual Transfer

Time-Aware Language Models as Temporal Knowledge Bases

no code implementations 29 Jun 2021 Bhuwan Dhingra, Jeremy R. Cole, Julian Martin Eisenschlos, Daniel Gillick, Jacob Eisenstein, William W. Cohen

We introduce a diagnostic dataset aimed at probing LMs for factual knowledge that changes over time and highlight problems with LMs at either end of the spectrum -- those trained on specific slices of temporal data, as well as those trained on a wide range of temporal data.

Memorization

Counterfactual Invariance to Spurious Correlations: Why and How to Pass Stress Tests

no code implementations NeurIPS 2021 Victor Veitch, Alexander D'Amour, Steve Yadlowsky, Jacob Eisenstein

We introduce counterfactual invariance as a formalization of the requirement that changing irrelevant parts of the input shouldn't change model predictions.

Causal Inference counterfactual +2

Counterfactual Invariance to Spurious Correlations in Text Classification

no code implementations NeurIPS 2021 Victor Veitch, Alexander D'Amour, Steve Yadlowsky, Jacob Eisenstein

We introduce counterfactual invariance as a formalization of the requirement that changing irrelevant parts of the input shouldn't change model predictions.

Causal Inference counterfactual +2

Abolitionist Networks: Modeling Language Change in Nineteenth-Century Activist Newspapers

1 code implementation 12 Mar 2021 Sandeep Soni, Lauren Klein, Jacob Eisenstein

This paper supplements recent qualitative work on the role of women in abolition's vanguard, as well as the role of the Black press, with a quantitative text modeling approach.

Diachronic Word Embeddings Word Embeddings

Tuiteamos or pongamos un tuit? Investigating the Social Constraints of Loanword Integration in Spanish Social Media

no code implementations SCiL 2021 Ian Stewart, Diyi Yang, Jacob Eisenstein

In social media, we find that speaker background and expectations of formality explain loanword and native word integration, such that authors who use more Spanish and who write to a wider audience tend to use integrated verb forms more often.

Learning to Recognize Dialect Features

no code implementations NAACL 2021 Dorottya Demszky, Devyani Sharma, Jonathan H. Clark, Vinodkumar Prabhakaran, Jacob Eisenstein

Evaluation on a test set of 22 dialect features of Indian English demonstrates that these models learn to recognize many features with high accuracy, and that a few minimal pairs can be as effective for training as thousands of labeled examples.

Will it Unblend?

1 code implementation SCiL 2021 Yuval Pinter, Cassandra L. Jacobs, Jacob Eisenstein

Natural language processing systems often struggle with out-of-vocabulary (OOV) terms, which do not appear in training data.

Sparse, Dense, and Attentional Representations for Text Retrieval

1 code implementation 1 May 2020 Yi Luan, Jacob Eisenstein, Kristina Toutanova, Michael Collins

Dual encoders perform retrieval by encoding documents and queries into dense low-dimensional vectors, scoring each document by its inner product with the query.

Open-Domain Question Answering Retrieval +1
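The scoring scheme described above is easy to sketch. In the snippet below, a toy hash-based encoder stands in for the learned neural encoders (the hashing trick is purely illustrative and not from the paper); what matches the dual-encoder setup is encoding the corpus offline and scoring every document with a single inner product against the query vector:

```python
import zlib
import numpy as np

def toy_encode(text, dim=256):
    """Stand-in encoder: each token is hashed to a fixed random vector and
    the vectors are averaged, then L2-normalized. A real dual encoder
    learns this mapping with a neural network (e.g. BERT)."""
    vecs = [np.random.default_rng(zlib.crc32(tok.encode())).standard_normal(dim)
            for tok in text.lower().split()]
    v = np.mean(vecs, axis=0)
    return v / np.linalg.norm(v)

docs = [
    "the capital of france is paris",
    "dense retrieval with dual encoders",
    "sparse bag of words representations",
]
doc_matrix = np.stack([toy_encode(d) for d in docs])  # precomputed offline

query = "dual encoder retrieval"
scores = doc_matrix @ toy_encode(query)  # one inner product per document
best = int(np.argmax(scores))
```

Because document vectors are fixed ahead of time, retrieval at query time reduces to a single matrix-vector product (or an approximate nearest-neighbor lookup at scale).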

Characterizing Collective Attention via Descriptor Context: A Case Study of Public Discussions of Crisis Events

1 code implementation 19 Sep 2019 Ian Stewart, Diyi Yang, Jacob Eisenstein

But according to rationalist models of natural language communication, the collective salience of each entity will be expressed not only in how often it is mentioned, but in the form that those mentions take.

Follow the Leader: Documents on the Leading Edge of Semantic Change Get More Citations

1 code implementation 9 Sep 2019 Sandeep Soni, Kristina Lerman, Jacob Eisenstein

However, simply knowing that a word has changed in meaning is insufficient to identify the instances of word usage that convey the historical or the newer meaning.

Diachronic Word Embeddings Word Embeddings

How we do things with words: Analyzing text as social and cultural data

no code implementations 2 Jul 2019 Dong Nguyen, Maria Liakata, Simon DeDeo, Jacob Eisenstein, David Mimno, Rebekah Tromble, Jane Winters

Second, we hope to provide a set of best practices for working with thick social and cultural concepts.

Measuring and Modeling Language Change

no code implementations NAACL 2019 Jacob Eisenstein

Such questions are fundamental to the social sciences and humanities, and scholars in these disciplines are increasingly turning to computational techniques for answers.

Causal Inference Two-sample testing +1

Unsupervised Domain Adaptation of Contextualized Embeddings for Sequence Labeling

1 code implementation IJCNLP 2019 Xiaochuang Han, Jacob Eisenstein

To address this scenario, we propose domain-adaptive fine-tuning, in which the contextualized embeddings are adapted by masked language modeling on text from the target domain.

Language Modelling Masked Language Modeling +3
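The masked language modeling step used for domain-adaptive fine-tuning follows the standard BERT recipe, which can be sketched as follows. This is a simplified illustration: the 15% mask rate and 80/10/10 split are the usual BERT defaults rather than details stated in the snippet above, and the toy vocabulary is a placeholder:

```python
import random

MASK_TOKEN = "[MASK]"
VOCAB = ["the", "cat", "sat", "on", "mat", "dog", "ran"]  # toy placeholder

def mask_for_mlm(tokens, mask_prob=0.15, seed=0):
    """BERT-style masking: each position is selected with probability
    mask_prob; a selected token becomes a prediction target and is
    replaced by [MASK] 80% of the time, by a random vocabulary word
    10% of the time, and left unchanged 10% of the time."""
    rng = random.Random(seed)
    inputs, targets = list(tokens), [None] * len(tokens)
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            targets[i] = tok          # model must recover the original token
            roll = rng.random()
            if roll < 0.8:
                inputs[i] = MASK_TOKEN
            elif roll < 0.9:
                inputs[i] = rng.choice(VOCAB)
    return inputs, targets

corpus = ("unlabeled target domain text goes here " * 20).split()
inputs, targets = mask_for_mlm(corpus)
```

Fine-tuning the pretrained encoder on masked target-domain text in this way adapts the contextualized embeddings without requiring any labels from the target domain.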

Character Eyes: Seeing Language through Character-Level Taggers

1 code implementation WS 2019 Yuval Pinter, Marc Marone, Jacob Eisenstein

Character-level models have been used extensively in recent years in NLP tasks as both supplements and replacements for closed-vocabulary token-level word representations.

POS

Interactional Stancetaking in Online Forums

no code implementations CL 2018 Scott F. Kiesling, Umashanthi Pavalanathan, Jim Fitzpatrick, Xiaochuang Han, Jacob Eisenstein

Theories of interactional stancetaking have been put forward as holistic accounts, but until now, these theories have been applied only through detailed qualitative analysis of (portions of) a few individual conversations.

Predicting Semantic Relations using Global Graph Properties

1 code implementation EMNLP 2018 Yuval Pinter, Jacob Eisenstein

Semantic graphs, such as WordNet, are resources which curate natural language on two distinguishable layers.

Link Prediction

Sí o no, què penses? Catalonian Independence and Linguistic Identity on Social Media

no code implementations NAACL 2018 Ian Stewart, Yuval Pinter, Jacob Eisenstein

We also find that Catalan is used more often in referendum-related discourse than in other contexts, contrary to prior findings on language variation.

Stylistic Variation in Social Media Part-of-Speech Tagging

no code implementations WS 2018 Murali Raghu Babu Balusu, Taha Merghani, Jacob Eisenstein

While prior work found that similar approaches yield performance improvements in sentiment analysis and entity linking, we were unable to obtain performance improvements in part-of-speech tagging, despite strong evidence for the link between part-of-speech error rates and social network structure.

Entity Linking Part-Of-Speech Tagging +1

Sí o no, què penses? Catalonian Independence and Linguistic Identity on Social Media

1 code implementation 13 Apr 2018 Ian Stewart, Yuval Pinter, Jacob Eisenstein

We also find that Catalan is used more often in referendum-related discourse than in other contexts, contrary to prior findings on language variation.

Detecting Social Influence in Event Cascades by Comparing Discriminative Rankers

1 code implementation 16 Feb 2018 Sandeep Soni, Shawn Ling Ramirez, Jacob Eisenstein

However, detecting social influence from observational data is challenging due to confounds like homophily and practical issues like missing data.

Explainable Prediction of Medical Codes from Clinical Text

3 code implementations NAACL 2018 James Mullenbach, Sarah Wiegreffe, Jon Duke, Jimeng Sun, Jacob Eisenstein

Our method aggregates information across the document using a convolutional neural network, and uses an attention mechanism to select the most relevant segments for each of the thousands of possible codes.

Medical Code Prediction
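The per-label attention described above can be sketched in a few lines. In this minimal numpy illustration, random matrices stand in for the CNN's per-position features and the learned per-code attention vectors (all names and dimensions are illustrative, not the paper's):

```python
import numpy as np

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
T, d, L = 20, 8, 5                 # document positions, hidden size, codes
H = rng.standard_normal((T, d))    # stand-in for the CNN's per-position features
U = rng.standard_normal((L, d))    # one learned attention vector per code

A = softmax(H @ U.T, axis=0)       # (T, L): each code's attention over positions
V = A.T @ H                        # (L, d): one code-specific document vector
logits = (V * U).sum(axis=1)       # score each code against its own vector
probs = 1 / (1 + np.exp(-logits))  # independent sigmoid per code (multi-label)
```

Because each code attends separately over the document, the attention weights in `A` also indicate which text segments justified each predicted code, which is what makes the predictions explainable.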

#anorexia, #anarexia, #anarexyia: Characterizing Online Community Practices with Orthographic Variation

no code implementations 4 Dec 2017 Ian Stewart, Stevie Chancellor, Munmun De Choudhury, Jacob Eisenstein

We also demonstrate the utility of orthographic variation as a new lens to study sociolinguistic change in online communities, particularly when the change results from an exogenous force such as a content ban.

Making "fetch" happen: The influence of social and linguistic context on nonstandard word growth and decline

1 code implementation 1 Sep 2017 Ian Stewart, Jacob Eisenstein

In an online community, new words come and go: today's "haha" may be replaced by tomorrow's "lol."

Mimicking Word Embeddings using Subword RNNs

2 code implementations EMNLP 2017 Yuval Pinter, Robert Guthrie, Jacob Eisenstein

In this paper, we present MIMICK, an approach to generating OOV word embeddings compositionally, by learning a function from spellings to distributional embeddings.

Word Embeddings

A Multidimensional Lexicon for Interpersonal Stancetaking

no code implementations ACL 2017 Umashanthi Pavalanathan, Jim Fitzpatrick, Scott Kiesling, Jacob Eisenstein

The sociolinguistic construct of stancetaking describes the activities through which discourse participants create and signal relationships to their interlocutors, to the topic of discussion, and to the talk itself.

Word Embeddings

Unsupervised Learning for Lexicon-Based Classification

1 code implementation 21 Nov 2016 Jacob Eisenstein

In lexicon-based classification, documents are assigned labels by comparing the number of words that appear from two opposed lexicons, such as positive and negative sentiment.

Classification General Classification
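The counting scheme described above is simple enough to sketch directly. A minimal illustration with tiny hand-picked sentiment lexicons (the lexicon entries here are placeholders; real lexicons contain thousands of words, and the paper's contribution is learning to weight them without labeled data):

```python
POSITIVE = {"good", "great", "excellent", "happy", "wonderful"}
NEGATIVE = {"bad", "awful", "terrible", "sad", "poor"}

def lexicon_classify(doc):
    """Assign a label by comparing counts of words from two opposed
    lexicons; documents with equal counts are left neutral."""
    tokens = doc.lower().split()
    pos = sum(tok in POSITIVE for tok in tokens)
    neg = sum(tok in NEGATIVE for tok in tokens)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"

print(lexicon_classify("a great movie with an excellent cast"))  # positive
```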

The Social Dynamics of Language Change in Online Networks

no code implementations 7 Sep 2016 Rahul Goel, Sandeep Soni, Naman Goyal, John Paparrizos, Hanna Wallach, Fernando Diaz, Jacob Eisenstein

Language change is a complex social phenomenon, revealing pathways of communication and sociocultural influence.

Shallow Discourse Parsing Using Distributed Argument Representations and Bayesian Optimization

no code implementations 14 Jun 2016 Akanksha, Jacob Eisenstein

This paper describes the Georgia Tech team's approach to the CoNLL-2016 supplementary evaluation on discourse relation sense classification.

Bayesian Optimization Discourse Parsing +2

Part-of-Speech Tagging for Historical English

no code implementations NAACL 2016 Yi Yang, Jacob Eisenstein

We evaluate several domain adaptation methods on the task of tagging Early Modern English and Modern British English texts in the Penn Corpora of Historical English.

Part-Of-Speech Tagging Unsupervised Domain Adaptation +1

A Latent Variable Recurrent Neural Network for Discourse Relation Language Models

1 code implementation 7 Mar 2016 Yangfeng Ji, Gholamreza Haffari, Jacob Eisenstein

This paper presents a novel latent variable recurrent neural network architecture for jointly modeling sequences of words and (possibly latent) discourse relations between adjacent sentences.

Classification Dialog Act Classification +4

A Kernel Independence Test for Geographical Language Variation

1 code implementation CL 2017 Dong Nguyen, Jacob Eisenstein

Quantifying the degree of spatial dependence for linguistic variables is a key task for analyzing dialectal variation.

Nonparametric Bayesian Storyline Detection from Microtexts

no code implementations WS 2016 Vinodh Krishnan, Jacob Eisenstein

News events and social media are composed of evolving storylines, which capture public attention for a limited period of time.

Clustering Retrieval

Overcoming Language Variation in Sentiment Analysis with Social Attention

1 code implementation TACL 2017 Yi Yang, Jacob Eisenstein

Variation in language is ubiquitous, particularly in newer forms of writing such as social media.

Sentiment Analysis

Document Context Language Models

1 code implementation 12 Nov 2015 Yangfeng Ji, Trevor Cohn, Lingpeng Kong, Chris Dyer, Jacob Eisenstein

Text documents are structured on multiple levels of detail: individual words are related by syntax, but larger units of text are related by discourse structure.

Sentence

Emoticons vs. Emojis on Twitter: A Causal Inference Approach

no code implementations 28 Oct 2015 Umashanthi Pavalanathan, Jacob Eisenstein

Online writing lacks the non-verbal cues present in face-to-face communication, which provide additional contextual information about the utterance, such as the speaker's intention or affective state.

Causal Inference

Confounds and Consequences in Geotagged Twitter Data

no code implementations EMNLP 2015 Umashanthi Pavalanathan, Jacob Eisenstein

Twitter is often used in quantitative studies that identify geographically-preferred topics, writing styles, and entities.

One Vector is Not Enough: Entity-Augmented Distributed Semantics for Discourse Relations

no code implementations TACL 2015 Yangfeng Ji, Jacob Eisenstein

A more subtle challenge is that it is not enough to represent the meaning of each argument of a discourse relation, because the relation may depend on links between lower-level components, such as entity mentions.

Question Answering Relation +1

Entity-Augmented Distributional Semantics for Discourse Relations

no code implementations 17 Dec 2014 Yangfeng Ji, Jacob Eisenstein

A more subtle challenge is that it is not enough to represent the meaning of each sentence of a discourse relation, because the relation may depend on links between lower-level elements, such as entity mentions.

Relation Sentence

Unsupervised Domain Adaptation with Feature Embeddings

1 code implementation 14 Dec 2014 Yi Yang, Jacob Eisenstein

Representation learning is the dominant technique for unsupervised domain adaptation, but existing approaches often require the specification of "pivot features" that generalize across domains, which are selected by task-specific heuristics.

Representation Learning Unsupervised Domain Adaptation

Learning Document-Level Semantic Properties from Free-Text Annotations

no code implementations 15 Jan 2014 S. R. K. Branavan, Harr Chen, Jacob Eisenstein, Regina Barzilay

The paraphrase structure is linked with a latent topic model of the review texts, enabling the system to predict the properties of unannotated documents and to effectively aggregate the semantic properties of multiple reviews.

Clustering

Diffusion of Lexical Change in Social Media

no code implementations 18 Oct 2012 Jacob Eisenstein, Brendan O'Connor, Noah A. Smith, Eric P. Xing

Computer-mediated communication is driving fundamental changes in the nature of written language.

Gender identity and lexical variation in social media

1 code implementation 16 Oct 2012 David Bamman, Jacob Eisenstein, Tyler Schnoebelen

Examining individuals whose language does not match the classifier's model for their gender, we find that they have social networks that include significantly fewer same-gender social connections and that, in general, social network homophily is correlated with the use of same-gender language markers.

Clustering
