Search Results for author: Andrew McCallum

Found 158 papers, 67 papers with code

Event and Entity Coreference using Trees to Encode Uncertainty in Joint Decisions

no code implementations CRAC (ACL) 2021 Nishant Yadav, Nicholas Monath, Rico Angell, Andrew McCallum

Coreference decisions among event mentions and among co-occurring entity mentions are highly interdependent, thus motivating joint inference.


Entity Linking via Explicit Mention-Mention Coreference Modeling

1 code implementation NAACL 2022 Dhruv Agarwal, Rico Angell, Nicholas Monath, Andrew McCallum

Learning representations of entity mentions is a core component of modern entity linking systems for both candidate generation and making linking predictions.

Entity Linking Re-Ranking

Softmax Bottleneck Makes Language Models Unable to Represent Multi-mode Word Distributions

no code implementations ACL 2022 Haw-Shiuan Chang, Andrew McCallum

The softmax layer produces the distribution based on the dot products of a single hidden state and the embeddings of words in the vocabulary.

Word Embeddings
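
The softmax layer described in this snippet can be illustrated with a minimal sketch. It assumes a toy vocabulary of three words with 2-dimensional embeddings; the function name `next_word_distribution` and all values are hypothetical and are not taken from the paper.

```python
import math


def next_word_distribution(hidden, embeddings):
    """Softmax over dot products of one hidden state with each word embedding."""
    # One logit per vocabulary word: the dot product with the hidden state.
    logits = [sum(h * e for h, e in zip(hidden, emb)) for emb in embeddings]
    # Subtract the largest logit before exponentiating, for numerical stability.
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [x / total for x in exps]


# Hypothetical toy vocabulary: three words, 2-dimensional embeddings.
embeddings = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
hidden = [2.0, 0.0]
probs = next_word_distribution(hidden, embeddings)
```

Because a single hidden state produces every logit, words whose embeddings point in opposite directions (the first and third here) cannot both receive high probability, which is the kind of limitation the title refers to.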

Box-To-Box Transformations for Modeling Joint Hierarchies

no code implementations ACL (RepL4NLP) 2021 Shib Sankar Dasgupta, Xiang Lorraine Li, Michael Boratko, Dongxu Zhang, Andrew McCallum

In Patel et al. (2020), the authors demonstrate that only the transitive reduction is required and further extend box embeddings to capture joint hierarchies by augmenting the graph with new nodes.

MS-Mentions: Consistently Annotating Entity Mentions in Materials Science Procedural Text

no code implementations EMNLP 2021 Tim O’Gorman, Zach Jensen, Sheshera Mysore, Kevin Huang, Rubayyat Mahbub, Elsa Olivetti, Andrew McCallum

Material science synthesis procedures are a promising domain for scientific NLP, as proper modeling of these recipes could provide insight into new ways of creating materials.

Named Entity Recognition +1

Enhanced Distant Supervision with State-Change Information for Relation Extraction

1 code implementation LREC 2022 Jui Shah, Dongxu Zhang, Sam Brody, Andrew McCallum

In this work, we introduce a method for enhancing distant supervision with state-change information for relation extraction.

Relation Relation Extraction

Unsupervised Parsing with S-DIORA: Single Tree Encoding for Deep Inside-Outside Recursive Autoencoders

no code implementations EMNLP 2020 Andrew Drozdov, Subendhu Rongali, Yi-Pei Chen, Tim O'Gorman, Mohit Iyyer, Andrew McCallum

The deep inside-outside recursive autoencoder (DIORA; Drozdov et al. 2019) is a self-supervised neural model that learns to induce syntactic tree structures for input sentences *without access to labeled training data*.

Constituency Grammar Induction Sentence

Unsupervised Partial Sentence Matching for Cited Text Identification

no code implementations sdp (COLING) 2022 Kathryn Ricci, Haw-Shiuan Chang, Purujit Goyal, Andrew McCallum

Given a citation in the body of a research paper, cited text identification aims to find the sentences in the cited paper that are most relevant to the citing sentence.

Sentence Sentence Embeddings

Adaptive Retrieval and Scalable Indexing for k-NN Search with Cross-Encoders

no code implementations 6 May 2024 Nishant Yadav, Nicholas Monath, Manzil Zaheer, Rob Fergus, Andrew McCallum

Our method produces a high-quality approximation while requiring only a fraction of CE calls as compared to CUR-based methods, and allows for leveraging DE to initialize the embedding space while avoiding compute- and resource-intensive finetuning of DE via distillation.


Incremental Extractive Opinion Summarization Using Cover Trees

1 code implementation 16 Jan 2024 Somnath Basu Roy Chowdhury, Nicholas Monath, Avinava Dubey, Manzil Zaheer, Andrew McCallum, Amr Ahmed, Snigdha Chaturvedi

In this work, we study the task of extractive opinion summarization in an incremental setting, where the underlying review set evolves over time.

Extractive Summarization Opinion Summarization

Fast, Scalable, Warm-Start Semidefinite Programming with Spectral Bundling and Sketching

1 code implementation 19 Dec 2023 Rico Angell, Andrew McCallum

We present Unified Spectral Bundling with Sketching (USBS), a provably correct, fast and scalable algorithm for solving massive SDPs that can leverage a warm-start initialization to further accelerate convergence.

Multistage Collaborative Knowledge Distillation from a Large Language Model for Semi-Supervised Sequence Generation

no code implementations 15 Nov 2023 Jiachen Zhao, Wenlong Zhao, Andrew Drozdov, Benjamin Rozonoyer, Md Arafat Sultan, Jay-Yoon Lee, Mohit Iyyer, Andrew McCallum

In this paper, we present the discovery that a student model distilled from a few-shot prompted LLM can commonly generalize better than its teacher to unseen examples on such tasks.

Constituency Parsing Knowledge Distillation +3

PaRaDe: Passage Ranking using Demonstrations with Large Language Models

no code implementations 22 Oct 2023 Andrew Drozdov, Honglei Zhuang, Zhuyun Dai, Zhen Qin, Razieh Rahimi, Xuanhui Wang, Dana Alon, Mohit Iyyer, Andrew McCallum, Donald Metzler, Kai Hui

Recent studies show that large language models (LLMs) can be instructed to effectively perform zero-shot passage re-ranking, in which the results of a first stage retrieval method, such as BM25, are rated and reordered to improve relevance.

Passage Ranking Passage Re-Ranking +6

To Copy, or not to Copy; That is a Critical Issue of the Output Softmax Layer in Neural Sequential Recommenders

1 code implementation 21 Oct 2023 Haw-Shiuan Chang, Nikhil Agarwal, Andrew McCallum

Specifically, the similarity structure of the global item embeddings in the softmax layer sometimes forces the single hidden state embedding to be close to new items when copying is a better choice, while sometimes forcing the hidden state to be close to the items from the input inappropriately.

Sequential Recommendation

Encoding Multi-Domain Scientific Papers by Ensembling Multiple CLS Tokens

1 code implementation 8 Sep 2023 Ronald Seoh, Haw-Shiuan Chang, Andrew McCallum

Many useful tasks on scientific documents, such as topic classification and citation prediction, involve corpora that span multiple scientific domains.

Citation Prediction Topic Classification

Answering Compositional Queries with Set-Theoretic Embeddings

no code implementations 7 Jun 2023 Shib Dasgupta, Andrew McCallum, Steffen Rendle, Li Zhang

The need to compactly and robustly represent item-attribute relations arises in many important tasks, such as faceted browsing and recommendation systems.

Attribute Recommendation Systems +1

Large Language Model Augmented Narrative Driven Recommendations

1 code implementation 4 Jun 2023 Sheshera Mysore, Andrew McCallum, Hamed Zamani

Narrative-driven recommendation (NDR) presents an information access problem where users solicit recommendations with verbose descriptions of their preferences and context, for example, travelers soliciting recommendations for points of interest while describing their likes/dislikes and travel circumstances.

Data Augmentation Language Modelling +3

Machine Reading Comprehension using Case-based Reasoning

no code implementations 24 May 2023 Dung Thai, Dhruv Agarwal, Mudit Chaudhary, Wenlong Zhao, Rajarshi Das, Manzil Zaheer, Jay-Yoon Lee, Hannaneh Hajishirzi, Andrew McCallum

Given a test question, CBR-MRC first retrieves a set of similar cases from a nonparametric memory and then predicts an answer by selecting the span in the test context that is most similar to the contextualized representations of answers in the retrieved cases.

Attribute Machine Reading Comprehension

Revisiting the Architectures like Pointer Networks to Efficiently Improve the Next Word Distribution, Summarization Factuality, and Beyond

1 code implementation 20 May 2023 Haw-Shiuan Chang, Zonghai Yao, Alolika Gon, Hong Yu, Andrew McCallum

Is the output softmax layer, which is adopted by most language models (LMs), always the best way to compute the next word probability?

Efficient k-NN Search with Cross-Encoders using Adaptive Multi-Round CUR Decomposition

1 code implementation 4 May 2023 Nishant Yadav, Nicholas Monath, Manzil Zaheer, Andrew McCallum

While ANNCUR's one-time selection of anchors tends to approximate the cross-encoder distances on average, doing so forfeits the capacity to accurately estimate distances to items near the query, leading to regret in the crucial end-task: recall of top-k items.


Editable User Profiles for Controllable Text Recommendation

1 code implementation 9 Apr 2023 Sheshera Mysore, Mahmood Jasim, Andrew McCallum, Hamed Zamani

Finally, we implement LACE in an interactive controllable recommender system and conduct a user study to demonstrate that users are able to improve the quality of recommendations they receive through interactions with an editable user profile.

Recommendation Systems Retrieval

Improving Dual-Encoder Training through Dynamic Indexes for Negative Mining

no code implementations 27 Mar 2023 Nicholas Monath, Manzil Zaheer, Kelsey Allen, Andrew McCallum

First, we introduce an algorithm that uses a tree structure to approximate the softmax with provable bounds and that dynamically maintains the tree.


Low-Resource Compositional Semantic Parsing with Concept Pretraining

no code implementations 24 Jan 2023 Subendhu Rongali, Mukund Sridhar, Haidar Khan, Konstantine Arkoudas, Wael Hamza, Andrew McCallum

In this work, we present an architecture to perform such domain adaptation automatically, with only a small amount of metadata about the new domain and without any new training data (zero-shot) or with very few examples (few-shot).

Decoder Domain Adaptation +1

You can't pick your neighbors, or can you? When and how to rely on retrieval in the $k$NN-LM

1 code implementation 28 Oct 2022 Andrew Drozdov, Shufan Wang, Razieh Rahimi, Andrew McCallum, Hamed Zamani, Mohit Iyyer

Retrieval-enhanced language models (LMs), which condition their predictions on text retrieved from large external datastores, have recently shown significant perplexity improvements compared to standard LMs.

Language Modelling Retrieval +2

Efficient Nearest Neighbor Search for Cross-Encoder Models using Matrix Factorization

1 code implementation 23 Oct 2022 Nishant Yadav, Nicholas Monath, Rico Angell, Manzil Zaheer, Andrew McCallum

When the similarity is measured by dot-product between dual-encoder vectors or $\ell_2$-distance, there already exist many scalable and efficient search methods.


Multi-CLS BERT: An Efficient Alternative to Traditional Ensembling

1 code implementation 10 Oct 2022 Haw-Shiuan Chang, Ruei-Yao Sun, Kathryn Ricci, Andrew McCallum

Ensembling BERT models often significantly improves accuracy, but at the cost of significantly more computation and memory footprint.

Augmenting Scientific Creativity with Retrieval across Knowledge Domains

1 code implementation 2 Jun 2022 Hyeonsu B. Kang, Sheshera Mysore, Kevin Huang, Haw-Shiuan Chang, Thorben Prein, Andrew McCallum, Aniket Kittur, Elsa Olivetti

Exposure to ideas in domains outside a scientist's own may benefit her in reformulating existing research problems in novel ways and discovering new application domains for existing solution ideas.


Inducing and Using Alignments for Transition-based AMR Parsing

1 code implementation NAACL 2022 Andrew Drozdov, Jiawei Zhou, Radu Florian, Andrew McCallum, Tahira Naseem, Yoon Kim, Ramon Fernandez Astudillo

These alignments are learned separately from parser training and require a complex pipeline of rule-based components, pre-processing, and post-processing to satisfy domain-specific constraints.

AMR Parsing

Knowledge Base Question Answering by Case-based Reasoning over Subgraphs

1 code implementation 22 Feb 2022 Rajarshi Das, Ameya Godbole, Ankita Naik, Elliot Tower, Robin Jia, Manzil Zaheer, Hannaneh Hajishirzi, Andrew McCallum

Question answering (QA) over knowledge bases (KBs) is challenging because of the diverse, essentially unbounded, types of reasoning patterns needed.

Knowledge Base Question Answering

Sublinear Time Approximation of Text Similarity Matrices

1 code implementation 17 Dec 2021 Archan Ray, Nicholas Monath, Andrew McCallum, Cameron Musco

Approximation methods reduce this quadratic complexity, often by using a small subset of exactly computed similarities to approximate the remainder of the complete pairwise similarity matrix.

Document Classification Sentence +2

Capacity and Bias of Learned Geometric Embeddings for Directed Graphs

1 code implementation NeurIPS 2021 Michael Boratko, Dongxu Zhang, Nicholas Monath, Luke Vilnis, Kenneth Clarkson, Andrew McCallum

While vectors in Euclidean space can theoretically represent any graph, much recent work shows that alternatives such as complex, hyperbolic, order, or box embeddings have geometric properties better suited to modeling real-world graphs.

Knowledge Base Completion Multi-Label Classification

Diverse Distributions of Self-Supervised Tasks for Meta-Learning in NLP

no code implementations EMNLP 2021 Trapit Bansal, Karthick Gunasekaran, Tong Wang, Tsendsuren Munkhdalai, Andrew McCallum

Meta-learning considers the problem of learning an efficient learning process that can leverage its past experience to accurately solve new tasks.

Few-Shot Learning

Structured Energy Network as a dynamic loss function. Case study. A case study with multi-label Classification

no code implementations 29 Sep 2021 Jay-Yoon Lee, Dhruvesh Patel, Purujit Goyal, Andrew McCallum

The best version of SEAL, which uses the NCE ranking method, achieves gains of close to +2.85 and +2.23 F1 points on average over cross-entropy and INFNET, respectively, on the feature-based datasets, excluding one outlier that has an excessive gain of +50.0 F1 points.

Multi-Label Classification Structured Prediction

Entity Linking and Discovery via Arborescence-based Supervised Clustering

1 code implementation 2 Sep 2021 Dhruv Agarwal, Rico Angell, Nicholas Monath, Andrew McCallum

Previous work has shown promising results in performing entity linking by measuring not only the affinities between mentions and entities but also those amongst mentions.

Clustering Entity Linking

Benchmarking Scalable Methods for Streaming Cross Document Entity Coreference

1 code implementation ACL 2021 Robert L Logan IV, Andrew McCallum, Sameer Singh, Dan Bikel

We investigate: how to best encode mentions, which clustering algorithms are most effective for grouping mentions, how models transfer to different domains, and how bounding the number of mentions tracked during inference impacts performance.

Benchmarking Clustering +2

Probabilistic Box Embeddings for Uncertain Knowledge Graph Reasoning

1 code implementation NAACL 2021 Xuelu Chen, Michael Boratko, Muhao Chen, Shib Sankar Dasgupta, Xiang Lorraine Li, Andrew McCallum

Knowledge bases often consist of facts which are harvested from a variety of sources, many of which are noisy and some of which conflict, resulting in a level of uncertainty for each triple.

Knowledge Graph Embedding

Changing the Mind of Transformers for Topically-Controllable Language Generation

1 code implementation EACL 2021 Haw-Shiuan Chang, Jiaming Yuan, Mohit Iyyer, Andrew McCallum

Our framework consists of two components: (1) a method that produces a set of candidate topics by predicting the centers of word clusters in the possible continuations, and (2) a text generation model whose output adheres to the chosen topics.

Clustering Text Generation

Multi-facet Universal Schema

no code implementations EACL 2021 Rohan Paul, Haw-Shiuan Chang, Andrew McCallum

To address the violation of the USchema assumption, we propose multi-facet universal schema that uses a neural model to represent each sentence pattern as multiple facet embeddings and encourage one of these facet embeddings to be close to that of another sentence pattern if they co-occur with the same entity pair.

Relation Relation Extraction +1

Extending Multi-Sense Word Embedding to Phrases and Sentences for Unsupervised Semantic Applications

no code implementations 29 Mar 2021 Haw-Shiuan Chang, Amol Agrawal, Andrew McCallum

Most unsupervised NLP models represent each word with a single point or single region in semantic space, while the existing multi-sense word embeddings cannot represent longer word sequences like phrases or sentences.

Extractive Summarization Sentence +2

CSFCube -- A Test Collection of Computer Science Research Articles for Faceted Query by Example

1 code implementation 24 Mar 2021 Sheshera Mysore, Tim O'Gorman, Andrew McCallum, Hamed Zamani

Query by Example is a well-known information retrieval task in which a document is chosen by the user as the search query and the goal is to retrieve relevant documents from a large collection.

Information Retrieval Retrieval

Low Resource Recognition and Linking of Biomedical Concepts from a Large Ontology

1 code implementation 26 Jan 2021 Sunil Mohan, Rico Angell, Nick Monath, Andrew McCallum

Tools to explore scientific literature are essential for scientists, especially in biomedicine, where about a million new papers are published every year.


Modeling Fine-Grained Entity Types with Box Embeddings

1 code implementation ACL 2021 Yasumasa Onoe, Michael Boratko, Andrew McCallum, Greg Durrett

Neural entity typing models typically represent fine-grained entity types as vectors in a high-dimensional space, but such spaces are not well-suited to modeling these types' complex interdependencies.

Entity Typing

Box-To-Box Transformation for Modeling Joint Hierarchies

no code implementations 1 Jan 2021 Shib Sankar Dasgupta, Xiang Li, Michael Boratko, Dongxu Zhang, Andrew McCallum

In Patel et al. (2020), the authors demonstrate that only the transitive reduction is required, and further extend box embeddings to capture joint hierarchies by augmenting the graph with new nodes.

Knowledge Graphs

An Instance Level Approach for Shallow Semantic Parsing in Scientific Procedural Text

no code implementations Findings of the Association for Computational Linguistics 2020 Daivik Swarup, Ahsaas Bajaj, Sheshera Mysore, Tim O'Gorman, Rajarshi Das, Andrew McCallum

Fortunately, such specific domains often use rather formulaic writing, such that the different ways of expressing relations in a small number of grammatically similar labeled sentences may provide high coverage of semantic structures in the corpus, through an appropriately rich similarity metric.

Semantic Parsing Sentence

Clustering-based Inference for Biomedical Entity Linking

no code implementations NAACL 2021 Rico Angell, Nicholas Monath, Sunil Mohan, Nishant Yadav, Andrew McCallum

In this paper, we introduce a model in which linking decisions can be made not merely by linking to a knowledge base entity but also by grouping multiple mentions together via clustering and jointly making linking predictions.

Clustering Entity Linking

Improving Local Identifiability in Probabilistic Box Embeddings

1 code implementation NeurIPS 2020 Shib Sankar Dasgupta, Michael Boratko, Dongxu Zhang, Luke Vilnis, Xiang Lorraine Li, Andrew McCallum

Geometric embeddings have recently received attention for their natural ability to represent transitive asymmetric relations via containment.

Unsupervised Pre-training for Biomedical Question Answering

no code implementations 27 Sep 2020 Vaishnavi Kommaraju, Karthick Gunasekaran, Kun Li, Trapit Bansal, Andrew McCallum, Ivana Williams, Ana-Maria Istrate

We explore the suitability of unsupervised representation learning methods on biomedical text -- BioBERT, SciBERT, and BioSentVec -- for biomedical question answering.

Question Answering Representation Learning +1

Energy-Based Reranking: Improving Neural Machine Translation Using Energy-Based Models

1 code implementation ACL 2021 Sumanta Bhattacharyya, Amirmohammad Rooshenas, Subhajit Naskar, Simeng Sun, Mohit Iyyer, Andrew McCallum

To benefit from this observation, we train an energy-based model to mimic the behavior of the task measure (i.e., the energy-based model assigns lower energy to samples with higher BLEU score), which results in a re-ranking algorithm based on the samples drawn from NMT: energy-based re-ranking (EBR).

Computational Efficiency Machine Translation +4

A Simple Approach to Case-Based Reasoning in Knowledge Bases

1 code implementation AKBC 2020 Rajarshi Das, Ameya Godbole, Shehzaad Dhuliawala, Manzil Zaheer, Andrew McCallum

We present a surprisingly simple yet accurate approach to reasoning in knowledge graphs (KGs) that requires no training, and is reminiscent of case-based reasoning in classical artificial intelligence (AI).

Knowledge Graphs Meta-Learning +1

Using BibTeX to Automatically Generate Labeled Data for Citation Field Extraction

1 code implementation AKBC 2020 Dung Thai, Zhiyang Xu, Nicholas Monath, Boris Veytsman, Andrew McCallum

In this paper, we describe a technique for using BibTeX to automatically generate a large-scale (41M labeled strings) labeled dataset that is four orders of magnitude larger than the current largest CFE dataset, namely the UMass Citation Field Extraction dataset [Anzaroot and McCallum, 2013].


Data Structures & Algorithms for Exact Inference in Hierarchical Clustering

1 code implementation 26 Feb 2020 Craig S. Greenberg, Sebastian Macaluso, Nicholas Monath, Ji-Ah Lee, Patrick Flaherty, Kyle Cranmer, Andrew McGregor, Andrew McCallum

In contrast to existing methods, we present novel dynamic-programming algorithms for exact inference in hierarchical clustering based on a novel trellis data structure, and we prove that we can exactly compute the partition function, maximum likelihood hierarchy, and marginal probabilities of sub-hierarchies and clusters.

Clustering Small Data Image Classification

Representing Joint Hierarchies with Box Embeddings

1 code implementation AKBC 2020 Dhruvesh Patel, Shib Sankar Dasgupta, Michael Boratko, Xiang Li, Luke Vilnis, Andrew McCallum

Box Embeddings [Vilnis et al., 2018, Li et al., 2019] represent concepts with hyperrectangles in $n$-dimensional space and are shown to be capable of modeling tree-like structures efficiently by training on a large subset of the transitive closure of the WordNet hypernym graph.
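
The hyperrectangle representation mentioned in this snippet can be sketched as follows. This is a minimal illustration, assuming hard-edged boxes and a volume-ratio conditional probability; the class `Box`, the helper names, and the example concepts are hypothetical, and the papers' trained, smoothed models are not reproduced here.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Box:
    """An axis-aligned hyperrectangle given by its lower and upper corners."""
    lower: Tuple[float, ...]
    upper: Tuple[float, ...]

    def volume(self) -> float:
        vol = 1.0
        for lo, hi in zip(self.lower, self.upper):
            # A negative side length means the box is empty in that dimension.
            vol *= max(hi - lo, 0.0)
        return vol


def intersection(a: Box, b: Box) -> Box:
    """The overlap of two boxes (possibly empty, with zero volume)."""
    lower = tuple(max(x, y) for x, y in zip(a.lower, b.lower))
    upper = tuple(min(x, y) for x, y in zip(a.upper, b.upper))
    return Box(lower, upper)


def conditional(a: Box, b: Box) -> float:
    """Volume-ratio score for 'a given b': vol(a ∩ b) / vol(b)."""
    return intersection(a, b).volume() / b.volume()


# Hypothetical 2-dimensional concepts: the 'dog' box lies inside the 'animal' box.
animal = Box((0.0, 0.0), (4.0, 4.0))
dog = Box((1.0, 1.0), (2.0, 3.0))
dog_implies_animal = conditional(animal, dog)
animal_implies_dog = conditional(dog, animal)
```

Containment makes the score asymmetric: a box fully inside another scores 1.0 in one direction and a smaller value in the other, which is how such models can encode hypernym-style relations.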

Predicting Institution Hierarchies with Set-based Models

no code implementations AKBC 2020 Derek Tam, Nicholas Monath, Ari Kobren, Andrew McCallum

The hierarchical structure of research organizations plays a pivotal role in science of science research as well as in tools that track the research achievements and output.

Scalable Hierarchical Clustering with Tree Grafting

1 code implementation 31 Dec 2019 Nicholas Monath, Ari Kobren, Akshay Krishnamurthy, Michael Glass, Andrew McCallum

We introduce Grinch, a new algorithm for large-scale, non-greedy hierarchical clustering with general linkage functions that compute arbitrary similarity between two point sets.


Learning to Few-Shot Learn Across Diverse Natural Language Classification Tasks

2 code implementations COLING 2020 Trapit Bansal, Rishikesh Jha, Andrew McCallum

LEOPARD is trained with the state-of-the-art transformer architecture and shows better generalization to tasks not seen at all during training, with as few as 4 examples per label.

Entity Typing Few-Shot Learning +6

Chains-of-Reasoning at TextGraphs 2019 Shared Task: Reasoning over Chains of Facts for Explainable Multi-hop Inference

no code implementations WS 2019 Rajarshi Das, Ameya Godbole, Manzil Zaheer, Shehzaad Dhuliawala, Andrew McCallum

This paper describes our submission to the shared task on "Multi-hop Inference Explanation Regeneration" in TextGraphs workshop at EMNLP 2019 (Jansen and Ustalov, 2019).

Roll Call Vote Prediction with Knowledge Augmented Models

no code implementations CONLL 2019 Pallavi Patil, Kriti Myer, Ronak Zala, Arpit Singh, Sheshera Mysore, Andrew McCallum, Adrian Benton, Amanda Stent

The sources of knowledge we use are news text and Freebase, a manually curated knowledge base.

Multi-step Entity-centric Information Retrieval for Multi-Hop Question Answering

no code implementations WS 2019 Ameya Godbole, Dilip Kavarthapu, Rajarshi Das, Zhiyu Gong, Abhishek Singhal, Hamed Zamani, Mo Yu, Tian Gao, Xiaoxiao Guo, Manzil Zaheer, Andrew McCallum

Multi-hop question answering (QA) requires an information retrieval (IR) system that can find multiple supporting evidence needed to answer the question, making the retrieval process very challenging.

Information Retrieval Multi-hop Question Answering +2

A2N: Attending to Neighbors for Knowledge Graph Inference

no code implementations ACL 2019 Trapit Bansal, Da-Cheng Juan, Sujith Ravi, Andrew McCallum

State-of-the-art models for knowledge graph completion aim at learning a fixed embedding representation of entities in a multi-relational graph which can generalize to infer unseen entity relationships at test time.

Knowledge Graph Completion Link Prediction

Supervised Hierarchical Clustering with Exponential Linkage

1 code implementation 19 Jun 2019 Nishant Yadav, Ari Kobren, Nicholas Monath, Andrew McCallum

Thus we introduce an approach to supervised hierarchical clustering that smoothly interpolates between single, average, and complete linkage, and we give a training procedure that simultaneously learns a linkage function and a dissimilarity function.
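
One way to interpolate between the three classical linkages is sketched below. It assumes the linkage is a softmax-weighted average of pairwise dissimilarities controlled by a single parameter `alpha`; the function name and the fixed `alpha` values are illustrative assumptions, and the training procedure that learns the linkage and dissimilarity functions is not shown.

```python
import math


def exponential_linkage(dissimilarities, alpha):
    """Average of pairwise dissimilarities, weighted by exp(alpha * d).

    alpha = 0 gives average linkage; large negative alpha approaches single
    linkage (the minimum); large positive alpha approaches complete linkage
    (the maximum).
    """
    # Shift the exponents by their maximum for numerical stability.
    peak = max(alpha * d for d in dissimilarities)
    weights = [math.exp(alpha * d - peak) for d in dissimilarities]
    weighted_sum = sum(w * d for w, d in zip(weights, dissimilarities))
    return weighted_sum / sum(weights)


# Hypothetical pairwise dissimilarities between the points of two clusters.
pairwise = [1.0, 2.0, 6.0]
average_like = exponential_linkage(pairwise, 0.0)
single_like = exponential_linkage(pairwise, -50.0)
complete_like = exponential_linkage(pairwise, 50.0)
```

Because the result is differentiable in `alpha`, the parameter can in principle be fitted by gradient-based training rather than chosen by hand.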


Energy and Policy Considerations for Deep Learning in NLP

3 code implementations ACL 2019 Emma Strubell, Ananya Ganesh, Andrew McCallum

Recent progress in hardware and methodology for training neural networks has ushered in a new generation of large networks trained on abundant data.

Unsupervised Latent Tree Induction with Deep Inside-Outside Recursive Auto-Encoders

1 code implementation NAACL 2019 Andrew Drozdov, Patrick Verga, Mohit Yadav, Mohit Iyyer, Andrew McCallum

We introduce the deep inside-outside recursive autoencoder (DIORA), a fully-unsupervised method for discovering syntax that simultaneously learns representations for constituents within the induced tree.

Constituency Grammar Induction Sentence

Paper Matching with Local Fairness Constraints

3 code implementations 28 May 2019 Ari Kobren, Barna Saha, Andrew McCallum

Automatically matching reviewers to papers is a crucial step of the peer review process for venues receiving thousands of submissions.

Data Structures and Algorithms Digital Libraries

Smoothing the Geometry of Probabilistic Box Embeddings

no code implementations ICLR 2019 Xiang Li, Luke Vilnis, Dongxu Zhang, Michael Boratko, Andrew McCallum

However, the hard edges of the boxes present difficulties for standard gradient based optimization; that work employed a special surrogate function for the disjoint case, but we find this method to be fragile.

Inductive Bias

OpenKI: Integrating Open Information Extraction and Knowledge Bases with Relation Inference

1 code implementation NAACL 2019 Dongxu Zhang, Subhabrata Mukherjee, Colin Lockard, Xin Luna Dong, Andrew McCallum

In this paper, we consider advancing web-scale knowledge extraction and alignment by integrating OpenIE extractions in the form of (subject, predicate, object) triples with Knowledge Bases (KB).

Open Information Extraction Relation

Unsupervised Latent Tree Induction with Deep Inside-Outside Recursive Autoencoders

3 code implementations 3 Apr 2019 Andrew Drozdov, Pat Verga, Mohit Yadav, Mohit Iyyer, Andrew McCallum

We introduce deep inside-outside recursive autoencoders (DIORA), a fully-unsupervised method for discovering syntax that simultaneously learns representations for constituents within the induced tree.

Constituency Parsing Sentence

Search-Guided, Lightly-supervised Training of Structured Prediction Energy Networks

no code implementations 22 Dec 2018 Amirmohammad Rooshenas, Dongxu Zhang, Gopal Sharma, Andrew McCallum

In this paper, we instead use efficient truncated randomized search in this reward function to train structured prediction energy networks (SPENs), which provide efficient test-time inference using gradient-based search on a smooth, learned representation of the score landscape, and have previously yielded state-of-the-art results in structured prediction.

Structured Prediction

Compact Representation of Uncertainty in Clustering

no code implementations NeurIPS 2018 Craig Greenberg, Nicholas Monath, Ari Kobren, Patrick Flaherty, Andrew McGregor, Andrew McCallum

For many classic structured prediction problems, probability distributions over the dependent variables can be efficiently computed using widely-known algorithms and data structures (such as forward-backward, and its corresponding trellis for exact probability distributions in Markov models).

Clustering Small Data Image Classification +1

Integrating User Feedback under Identity Uncertainty in Knowledge Base Construction

no code implementations AKBC 2019 Ari Kobren, Nicholas Monath, Andrew McCallum

Users have tremendous potential to aid in the construction and maintenance of knowledge bases (KBs) through the contribution of feedback that identifies incorrect and missing entity attributes and relations.

Entity Resolution

Syntax Helps ELMo Understand Semantics: Is Syntax Still Relevant in a Deep Neural Architecture for SRL?

no code implementations WS 2018 Emma Strubell, Andrew McCallum

Do unsupervised methods for learning rich, contextualized token representations obviate the need for explicit modeling of linguistic structure in neural network models for semantic role labeling (SRL)?

Semantic Role Labeling Word Embeddings

Building Dynamic Knowledge Graphs from Text using Machine Reading Comprehension

no code implementations ICLR 2019 Rajarshi Das, Tsendsuren Munkhdalai, Xingdi Yuan, Adam Trischler, Andrew McCallum

We harness and extend a recently proposed machine reading comprehension (MRC) model to query for entity states, since these states are generally communicated in spans of text and MRC models perform well in extracting entity-centric spans.

Knowledge Graphs Machine Reading Comprehension +2

Embedded-State Latent Conditional Random Fields for Sequence Labeling

no code implementations CONLL 2018 Dung Thai, Sree Harsha Ramesh, Shikhar Murty, Luke Vilnis, Andrew McCallum

Complex textual information extraction tasks are often posed as sequence labeling or shallow parsing, where fields are extracted using local labels made consistent through probabilistic inference in a graphical model with constrained transitions.

Hierarchical Losses and New Resources for Fine-grained Entity Typing and Linking

2 code implementations ACL 2018 Shikhar Murty*, Patrick Verga*, Luke Vilnis, Irena Radovanovic, Andrew McCallum

Extraction from raw text to a knowledge base of entities and fine-grained types is often cast as prediction into a flat set of entity and type labels, neglecting the rich hierarchies over types and entities contained in curated ontologies.

2k Entity Linking +1

Probabilistic Embedding of Knowledge Graphs with Box Lattice Measures

no code implementations ACL 2018 Luke Vilnis, Xiang Li, Shikhar Murty, Andrew McCallum

Embedding methods which enforce a partial order or lattice structure over the concept space, such as Order Embeddings (OE) (Vendrov et al., 2016), are a natural way to model transitive relational data (e.g. entailment graphs).

Inductive Bias Knowledge Graphs +1

Linguistically-Informed Self-Attention for Semantic Role Labeling

1 code implementation EMNLP 2018 Emma Strubell, Patrick Verga, Daniel Andor, David Weiss, Andrew McCallum

Unlike previous models which require significant pre-processing to prepare linguistic features, LISA can incorporate syntax using merely raw tokens as input, encoding the sequence only once to simultaneously perform parsing, predicate detection and role labeling for all predicates.

Dependency Parsing Multi-Task Learning +4

Efficient Graph-based Word Sense Induction by Distributional Inclusion Vector Embeddings

no code implementations WS 2018 Haw-Shiuan Chang, Amol Agrawal, Ananya Ganesh, Anirudha Desai, Vinayak Mathur, Alfred Hough, Andrew McCallum

Word sense induction (WSI), which addresses polysemy by unsupervised discovery of multiple word senses, resolves ambiguities for downstream NLP tasks and also makes word representations more interpretable.

Word Sense Induction

Simultaneously Self-Attending to All Mentions for Full-Abstract Biological Relation Extraction

1 code implementation NAACL 2018 Patrick Verga, Emma Strubell, Andrew McCallum

Most work in relation extraction forms a prediction by looking at a short span of text within a single sentence containing a single entity pair mention.

Relation Relation Extraction +1

Automatically Extracting Action Graphs from Materials Science Synthesis Procedures

no code implementations 18 Nov 2017 Sheshera Mysore, Edward Kim, Emma Strubell, Ao Liu, Haw-Shiuan Chang, Srikrishna Kompella, Kevin Huang, Andrew McCallum, Elsa Olivetti

In this work, we present a system for automatically extracting structured representations of synthesis procedures from the texts of materials science journal articles that describe explicit, experimental syntheses of inorganic compounds.

Finer Grained Entity Typing with TypeNet

no code implementations 15 Nov 2017 Shikhar Murty, Patrick Verga, Luke Vilnis, Andrew McCallum

We consider the challenging problem of entity typing over an extremely fine-grained set of types, wherein a single mention or entity can have many simultaneous and often hierarchically-structured types.

Entity Typing

Go for a Walk and Arrive at the Answer: Reasoning Over Paths in Knowledge Bases using Reinforcement Learning

7 code implementations ICLR 2018 Rajarshi Das, Shehzaad Dhuliawala, Manzil Zaheer, Luke Vilnis, Ishan Durugkar, Akshay Krishnamurthy, Alex Smola, Andrew McCallum

Knowledge bases (KB), both automatically and manually constructed, are often incomplete: many valid facts can be inferred from the KB by synthesizing existing information.

Navigate Relation +1

Distributional Inclusion Vector Embedding for Unsupervised Hypernymy Detection

no code implementations NAACL 2018 Haw-Shiuan Chang, ZiYun Wang, Luke Vilnis, Andrew McCallum

Modeling hypernymy, such as poodle is-a dog, is an important generalization aid to many NLP tasks, such as entailment, coreference, relation extraction, and question answering.

Hypernym Discovery Question Answering +1

Low-Rank Hidden State Embeddings for Viterbi Sequence Labeling

no code implementations 2 Aug 2017 Dung Thai, Shikhar Murty, Trapit Bansal, Luke Vilnis, David Belanger, Andrew McCallum

In textual information extraction and other sequence labeling tasks it is now common to use recurrent neural networks (such as LSTM) to form rich embedded representations of long-term input co-occurrence patterns.

named-entity-recognition Named Entity Recognition +1

Improved Representation Learning for Predicting Commonsense Ontologies

no code implementations 1 Aug 2017 Xiang Li, Luke Vilnis, Andrew McCallum

Recent work in learning ontologies (hierarchical and partially-ordered structures) has leveraged the intrinsic geometry of spaces of learned representations to make predictions that automatically obey complex structural constraints.

Representation Learning

Dependency Parsing with Dilated Iterated Graph CNNs

no code implementations WS 2017 Emma Strubell, Andrew McCallum

Dependency parses are an effective way to inject linguistic knowledge into many downstream tasks, and many practitioners wish to efficiently parse sentences at scale.

Dependency Parsing Sentence

SemEval 2017 Task 10: ScienceIE - Extracting Keyphrases and Relations from Scientific Publications

1 code implementation SEMEVAL 2017 Isabelle Augenstein, Mrinal Das, Sebastian Riedel, Lakshmi Vikraman, Andrew McCallum

We describe the SemEval task of extracting keyphrases and relations between them from scientific documents, which is crucial for understanding which publications describe which processes, tasks and materials.

Knowledge Base Population

An Online Hierarchical Algorithm for Extreme Clustering

2 code implementations 6 Apr 2017 Ari Kobren, Nicholas Monath, Akshay Krishnamurthy, Andrew McCallum

Many modern clustering methods scale well to a large number of data items, N, but not to a large number of clusters, K. This paper introduces PERCH, a new non-greedy algorithm for online hierarchical clustering that scales to both massive N and K, a problem setting we term extreme clustering.


End-to-End Learning for Structured Prediction Energy Networks

no code implementations ICML 2017 David Belanger, Bishan Yang, Andrew McCallum

Structured Prediction Energy Networks (SPENs) are a simple, yet expressive family of structured prediction models (Belanger and McCallum, 2016).

Image Denoising Semantic Role Labeling +1

Learning a Natural Language Interface with Neural Programmer

2 code implementations 28 Nov 2016 Arvind Neelakantan, Quoc V. Le, Martin Abadi, Andrew McCallum, Dario Amodei

The main experimental result in this paper is that a single Neural Programmer model achieves 34.2% accuracy using only 10,000 examples with weak supervision.

Natural Language Queries Program induction +1

Chains of Reasoning over Entities, Relations, and Text using Recurrent Neural Networks

2 code implementations EACL 2017 Rajarshi Das, Arvind Neelakantan, David Belanger, Andrew McCallum

Our goal is to combine the rich multistep inference of symbolic logical reasoning with the generalization capabilities of neural networks.

Logical Reasoning

Generalizing to Unseen Entities and Entity Pairs with Row-less Universal Schema

1 code implementation EACL 2017 Patrick Verga, Arvind Neelakantan, Andrew McCallum

In experiments predicting both relations and entity types, we demonstrate that despite having an order of magnitude fewer parameters than traditional universal schema, we can match the accuracy of the traditional model, and more importantly, we can now make predictions about unseen rows with nearly the same accuracy as rows available at training time.

Matrix Completion

Row-less Universal Schema

1 code implementation WS 2016 Patrick Verga, Andrew McCallum

In experimental results on the FB15k-237 benchmark we demonstrate that we can match the performance of a comparable model with explicit entity pair representations using a model of attention over relation types.


Multilingual Relation Extraction using Compositional Universal Schema

1 code implementation NAACL 2016 Patrick Verga, David Belanger, Emma Strubell, Benjamin Roth, Andrew McCallum

In response, this paper introduces significant further improvements to the coverage and flexibility of universal schema relation extraction: predictions for entities unseen in training and multilingual transfer learning to domains with no annotation.

Relation Relation Extraction +4

Structured Prediction Energy Networks

no code implementations 19 Nov 2015 David Belanger, Andrew McCallum

This deep architecture captures dependencies between labels that would lead to intractable graphical models, and performs structure learning by automatically learning discriminative features of the structured output.

General Classification Multi-Label Classification +1

Learning Dynamic Feature Selection for Fast Sequential Prediction

no code implementations IJCNLP 2015 Emma Strubell, Luke Vilnis, Kate Silverstein, Andrew McCallum

We present paired learning and inference algorithms for significantly reducing computation and increasing speed of the vector dot products in the classifiers that are at the heart of many NLP components.

Benchmarking feature selection +7

Compositional Vector Space Models for Knowledge Base Completion

no code implementations IJCNLP 2015 Arvind Neelakantan, Benjamin Roth, Andrew McCallum

Knowledge base (KB) completion adds new facts to a KB by making inferences from existing facts, for example by inferring with high likelihood nationality(X, Y) from bornIn(X, Y).

Knowledge Base Completion Relation +1

Efficient Non-parametric Estimation of Multiple Embeddings per Word in Vector Space

no code implementations EMNLP 2014 Arvind Neelakantan, Jeevan Shankar, Alexandre Passos, Andrew McCallum

There is rising interest in vector-space word embeddings and their use in NLP, especially given recent methods for their fast estimation at very large scale.

Vocal Bursts Type Prediction Word Embeddings +1

Bethe Projections for Non-Local Inference

no code implementations 4 Mar 2015 Luke Vilnis, David Belanger, Daniel Sheldon, Andrew McCallum

Many inference problems in structured prediction are naturally solved by augmenting a tractable dependency structure with complex, non-local auxiliary objectives.

Handwriting Recognition Structured Prediction +1

Word Representations via Gaussian Embedding

1 code implementation 20 Dec 2014 Luke Vilnis, Andrew McCallum

Current work in lexical distributed representations maps each word to a point vector in low-dimensional space.

Training for Fast Sequential Prediction Using Dynamic Feature Selection

no code implementations 30 Oct 2014 Emma Strubell, Luke Vilnis, Andrew McCallum

We present paired learning and inference algorithms for significantly reducing computation and increasing speed of the vector dot products in the classifiers that are at the heart of many NLP components.

feature selection Part-Of-Speech Tagging

Lexicon Infused Phrase Embeddings for Named Entity Resolution

no code implementations WS 2014 Alexandre Passos, Vineet Kumar, Andrew McCallum

Most state-of-the-art approaches for named-entity recognition (NER) use semi-supervised information in the form of word clusters and lexicons.

Entity Resolution Learning Word Embeddings +3

Anytime Belief Propagation Using Sparse Domains

no code implementations 14 Nov 2013 Sameer Singh, Sebastian Riedel, Andrew McCallum

Belief Propagation has been widely used for marginal inference; however, it is slow on problems with large-domain variables and high-order factors.


Query-Aware MCMC

no code implementations NeurIPS 2011 Michael L. Wick, Andrew McCallum

Traditional approaches to probabilistic inference such as loopy belief propagation and Gibbs sampling typically compute marginals for all the unobserved variables in a graphical model.

Computational Efficiency

An Introduction to Conditional Random Fields

no code implementations 17 Nov 2010 Charles Sutton, Andrew McCallum

This tutorial describes conditional random fields, a popular probabilistic method for structured prediction.

General Classification Structured Prediction

FACTORIE: Probabilistic Programming via Imperatively Defined Factor Graphs

no code implementations NeurIPS 2009 Andrew McCallum, Karl Schultz, Sameer Singh

Discriminatively trained undirected graphical models have had wide empirical success, and there has been increasing interest in toolkits that ease their application to complex relational data.

Probabilistic Programming

Rethinking LDA: Why Priors Matter

no code implementations NeurIPS 2009 Hanna M. Wallach, David M. Mimno, Andrew McCallum

Implementations of topic models typically use symmetric Dirichlet priors with fixed concentration parameters, with the implicit assumption that such "smoothing parameters" have little practical effect.

Hyperparameter Optimization Topic Models
