no code implementations • EMNLP 2020 • Andrew Drozdov, Subendhu Rongali, Yi-Pei Chen, Tim O'Gorman, Mohit Iyyer, Andrew McCallum
The deep inside-outside recursive autoencoder (DIORA; Drozdov et al. 2019) is a self-supervised neural model that learns to induce syntactic tree structures for input sentences *without access to labeled training data*.
no code implementations • ACL 2022 • Haw-Shiuan Chang, Andrew McCallum
The softmax layer produces the distribution based on the dot products of a single hidden state and the embeddings of words in the vocabulary.
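As a concrete reference for the output layer this snippet describes, here is a minimal numpy sketch of a softmax over the dot products of one hidden state with every vocabulary embedding; the variable names and toy sizes are illustrative, not from the paper.

```python
import numpy as np

def softmax_over_vocab(hidden, emb):
    """Standard output layer: one hidden state dotted with every word embedding.

    hidden: (d,) final hidden state; emb: (V, d) output word embeddings.
    Returns a (V,) probability distribution over the vocabulary.
    """
    logits = emb @ hidden        # one dot product per vocabulary word
    logits -= logits.max()       # stabilize before exponentiating
    probs = np.exp(logits)
    return probs / probs.sum()

# toy example: vocabulary of 5 words, hidden size 3
rng = np.random.default_rng(0)
E = rng.normal(size=(5, 3))
h = rng.normal(size=3)
print(softmax_over_vocab(h, E))  # sums to 1
```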
no code implementations • ACL (RepL4NLP) 2021 • Raghuveer Thirukovalluru, Mukund Sridhar, Dung Thai, Shruti Chanumolu, Nicholas Monath, Sankaranarayanan Ananthakrishnan, Andrew McCallum
Specifically, neural semantic parsers (NSPs) effectively translate natural questions to logical forms, which execute on the KB and yield the desired answers.
no code implementations • ACL (RepL4NLP) 2021 • Dung Thai, Raghuveer Thirukovalluru, Trapit Bansal, Andrew McCallum
In this work, we aim at directly learning text representations which leverage structured knowledge about entities mentioned in the text.
no code implementations • ACL (RepL4NLP) 2021 • Shib Sankar Dasgupta, Xiang Lorraine Li, Michael Boratko, Dongxu Zhang, Andrew McCallum
In Patel et al. (2020), the authors demonstrate that only the transitive reduction is required and further extend box embeddings to capture joint hierarchies by augmenting the graph with new nodes.
no code implementations • sdp (COLING) 2022 • Kathryn Ricci, Haw-Shiuan Chang, Purujit Goyal, Andrew McCallum
Given a citation in the body of a research paper, cited text identification aims to find the sentences in the cited paper that are most relevant to the citing sentence.
no code implementations • EMNLP 2021 • Tim O’Gorman, Zach Jensen, Sheshera Mysore, Kevin Huang, Rubayyat Mahbub, Elsa Olivetti, Andrew McCallum
Materials science synthesis procedures are a promising domain for scientific NLP, as proper modeling of these recipes could provide insight into new ways of creating materials.
1 code implementation • LREC 2022 • Jui Shah, Dongxu Zhang, Sam Brody, Andrew McCallum
In this work, we introduce a method for enhancing distant supervision with state-change information for relation extraction.
1 code implementation • NAACL 2022 • Dhruv Agarwal, Rico Angell, Nicholas Monath, Andrew McCallum
Learning representations of entity mentions is a core component of modern entity linking systems for both candidate generation and making linking predictions.
Ranked #1 on Entity Linking on ZESHEL
no code implementations • CRAC (ACL) 2021 • Nishant Yadav, Nicholas Monath, Rico Angell, Andrew McCallum
Coreference decisions among event mentions and among co-occurring entity mentions are highly interdependent, thus motivating joint inference.
1 code implementation • ACL 2022 • EunJeong Hwang, Jay-Yoon Lee, Tianyi Yang, Dhruvesh Patel, Dongxu Zhang, Andrew McCallum
To understand a story with multiple events, it is important to capture the proper relations across these events.
no code implementations • 27 Mar 2023 • Nicholas Monath, Manzil Zaheer, Kelsey Allen, Andrew McCallum
First, we introduce an algorithm that uses a tree structure to approximate the softmax with provable bounds and that dynamically maintains the tree.
no code implementations • 24 Jan 2023 • Subendhu Rongali, Mukund Sridhar, Haidar Khan, Konstantine Arkoudas, Wael Hamza, Andrew McCallum
In this work, we present an architecture to perform such domain adaptation automatically, with only a small amount of metadata about the new domain and without any new training data (zero-shot) or with very few examples (few-shot).
1 code implementation • 28 Oct 2022 • Andrew Drozdov, Shufan Wang, Razieh Rahimi, Andrew McCallum, Hamed Zamani, Mohit Iyyer
Retrieval-enhanced language models (LMs), which condition their predictions on text retrieved from large external datastores, have recently shown significant perplexity improvements compared to standard LMs.
1 code implementation • 23 Oct 2022 • Nishant Yadav, Nicholas Monath, Rico Angell, Manzil Zaheer, Andrew McCallum
When the similarity is measured by dot-product between dual-encoder vectors or $\ell_2$-distance, there already exist many scalable and efficient search methods.
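For context, here is a brute-force sketch of the two search problems the abstract mentions (maximum inner product and $\ell_2$ nearest neighbor over dual-encoder vectors). The function and variable names are made up for illustration; the scalable methods the paper refers to would use approximate indexes rather than exhaustive scoring.

```python
import numpy as np

def top_k(query, items, k=3, metric="dot"):
    """Rank encoded items against a dual-encoder query vector.

    query: (d,) encoded query; items: (n, d) encoded corpus.
    metric: "dot" (maximum inner product) or "l2" (nearest neighbor).
    """
    if metric == "dot":
        scores = items @ query                           # higher is better
    else:
        scores = -np.linalg.norm(items - query, axis=1)  # negate: higher is better
    return np.argsort(-scores)[:k]

rng = np.random.default_rng(1)
corpus = rng.normal(size=(100, 16))
q = rng.normal(size=16)
print(top_k(q, corpus, metric="dot"), top_k(q, corpus, metric="l2"))
```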
no code implementations • 10 Oct 2022 • Haw-Shiuan Chang, Ruei-Yao Sun, Kathryn Ricci, Andrew McCallum
Ensembling BERT models often significantly improves accuracy, but at the cost of significantly more computation and a larger memory footprint.
1 code implementation • 7 Oct 2022 • Kumar Shridhar, Nicholas Monath, Raghuveer Thirukovalluru, Alessandro Stolfo, Manzil Zaheer, Andrew McCallum, Mrinmaya Sachan
OntoNotes has served as the most important benchmark for coreference resolution.
1 code implementation • 2 Jun 2022 • Hyeonsu B. Kang, Sheshera Mysore, Kevin Huang, Haw-Shiuan Chang, Thorben Prein, Andrew McCallum, Aniket Kittur, Elsa Olivetti
Exposure to ideas in domains outside a scientist's own may benefit her in reformulating existing research problems in novel ways and discovering new application domains for existing solution ideas.
1 code implementation • NAACL 2022 • Andrew Drozdov, Jiawei Zhou, Radu Florian, Andrew McCallum, Tahira Naseem, Yoon Kim, Ramon Fernandez Astudillo
These alignments are learned separately from parser training and require a complex pipeline of rule-based components, pre-processing, and post-processing to satisfy domain-specific constraints.
no code implementations • 18 Apr 2022 • Dung Thai, Srinivas Ravishankar, Ibrahim Abdelaziz, Mudit Chaudhary, Nandana Mihindukulasooriya, Tahira Naseem, Rajarshi Das, Pavan Kapanipathi, Achille Fokoue, Andrew McCallum
Yet, in many question answering applications coupled with knowledge bases, the sparse nature of KBs is often overlooked.
1 code implementation • LREC 2022 • Dongxu Zhang, Sunil Mohan, Michaela Torkar, Andrew McCallum
We introduce ChemDisGene, a new dataset for training and evaluating multi-class multi-label document-level biomedical relation extraction models.
1 code implementation • 22 Feb 2022 • Rajarshi Das, Ameya Godbole, Ankita Naik, Elliot Tower, Robin Jia, Manzil Zaheer, Hannaneh Hajishirzi, Andrew McCallum
Question answering (QA) over knowledge bases (KBs) is challenging because of the diverse, essentially unbounded, types of reasoning patterns needed.
1 code implementation • 17 Dec 2021 • Archan Ray, Nicholas Monath, Andrew McCallum, Cameron Musco
Approximation methods reduce this quadratic complexity, often by using a small subset of exactly computed similarities to approximate the remainder of the complete pairwise similarity matrix.
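The abstract does not pin down one approximation; a classic Nyström reconstruction is one standard way to rebuild a full similarity matrix from a small set of exactly computed entries, sketched below under that assumption.

```python
import numpy as np

def nystrom_approx(X, m, kernel, rng):
    """Approximate the full n x n similarity matrix from m exact columns.

    Picks m landmark points, computes their similarities exactly, and
    reconstructs the rest as K ~= C @ pinv(W) @ C.T (classic Nystrom).
    """
    n = X.shape[0]
    idx = rng.choice(n, size=m, replace=False)
    C = kernel(X, X[idx])          # n x m exactly computed similarities
    W = C[idx]                     # m x m block among the landmarks
    return C @ np.linalg.pinv(W) @ C.T

# RBF similarities on toy data
rbf = lambda A, B: np.exp(-np.linalg.norm(A[:, None] - B[None], axis=-1) ** 2)
rng = np.random.default_rng(2)
X = rng.normal(size=(200, 5))
K_hat = nystrom_approx(X, m=20, kernel=rbf, rng=rng)
K = rbf(X, X)
print(np.linalg.norm(K - K_hat) / np.linalg.norm(K))  # relative error
```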
1 code implementation • NeurIPS 2021 • Michael Boratko, Dongxu Zhang, Nicholas Monath, Luke Vilnis, Kenneth Clarkson, Andrew McCallum
While vectors in Euclidean space can theoretically represent any graph, much recent work shows that alternatives such as complex, hyperbolic, order, or box embeddings have geometric properties better suited to modeling real-world graphs.
no code implementations • EMNLP 2021 • Trapit Bansal, Karthick Gunasekaran, Tong Wang, Tsendsuren Munkhdalai, Andrew McCallum
Meta-learning considers the problem of learning an efficient learning process that can leverage its past experience to accurately solve new tasks.
no code implementations • NAACL 2022 • Neha Kennard, Tim O'Gorman, Rajarshi Das, Akshay Sharma, Chhandak Bagchi, Matthew Clinton, Pranay Kumar Yelugam, Hamed Zamani, Andrew McCallum
At the foundation of scientific evaluation is the labor-intensive process of peer review.
no code implementations • 29 Sep 2021 • Jay-Yoon Lee, Dhruvesh Patel, Purujit Goyal, Andrew McCallum
The best version of SEAL that uses the NCE ranking method achieves average F1 gains of close to +2.85 and +2.23 points over cross-entropy and INFNET, respectively, on the feature-based datasets, excluding one outlier that has an excessive gain of +50.0 F1 points.
1 code implementation • EMNLP 2021 • Zhiyang Xu, Andrew Drozdov, Jay Yoon Lee, Tim O'Gorman, Subendhu Rongali, Dylan Finkbeiner, Shilpa Suresh, Mohit Iyyer, Andrew McCallum
For over thirty years, researchers have developed and analyzed methods for latent tree induction as an approach for unsupervised syntactic parsing.
1 code implementation • EMNLP (ACL) 2021 • Tejas Chheda, Purujit Goyal, Trang Tran, Dhruvesh Patel, Michael Boratko, Shib Sankar Dasgupta, Andrew McCallum
A major factor contributing to the success of modern representation learning is the ease of performing various vector operations.
1 code implementation • 2 Sep 2021 • Dhruv Agarwal, Rico Angell, Nicholas Monath, Andrew McCallum
Previous work has shown promising results in performing entity linking by measuring not only the affinities between mentions and entities but also those amongst mentions.
1 code implementation • ACL 2021 • Robert L Logan IV, Andrew McCallum, Sameer Singh, Dan Bikel
We investigate: how to best encode mentions, which clustering algorithms are most effective for grouping mentions, how models transfer to different domains, and how bounding the number of mentions tracked during inference impacts performance.
1 code implementation • ACL 2022 • Shib Sankar Dasgupta, Michael Boratko, Siddhartha Mishra, Shriya Atmakuri, Dhruvesh Patel, Xiang Lorraine Li, Andrew McCallum
In this work, we provide a fuzzy-set interpretation of box embeddings, and learn box representations of words using a set-theoretic training objective.
no code implementations • ACL 2021 • Nicholas FitzGerald, Jan A. Botha, Daniel Gillick, Daniel M. Bikel, Tom Kwiatkowski, Andrew McCallum
We present an instance-based nearest neighbor approach to entity linking.
no code implementations • EMNLP 2021 • Rajarshi Das, Manzil Zaheer, Dung Thai, Ameya Godbole, Ethan Perez, Jay-Yoon Lee, Lizhen Tan, Lazaros Polymenakos, Andrew McCallum
It is often challenging to solve a complex problem from scratch, but much easier if we can access other similar problems with their solutions -- a paradigm known as case-based reasoning (CBR).
Knowledge Base Question Answering • Natural Language Queries
no code implementations • 14 Apr 2021 • Craig S. Greenberg, Sebastian Macaluso, Nicholas Monath, Avinava Dubey, Patrick Flaherty, Manzil Zaheer, Amr Ahmed, Kyle Cranmer, Andrew McCallum
In those cases, hierarchical clustering can be seen as a combinatorial optimization problem.
1 code implementation • NAACL 2021 • Xuelu Chen, Michael Boratko, Muhao Chen, Shib Sankar Dasgupta, Xiang Lorraine Li, Andrew McCallum
Knowledge bases often consist of facts which are harvested from a variety of sources, many of which are noisy and some of which conflict, resulting in a level of uncertainty for each triple.
1 code implementation • EACL 2021 • Haw-Shiuan Chang, Jiaming Yuan, Mohit Iyyer, Andrew McCallum
Our framework consists of two components: (1) a method that produces a set of candidate topics by predicting the centers of word clusters in the possible continuations, and (2) a text generation model whose output adheres to the chosen topics.
no code implementations • 29 Mar 2021 • Haw-Shiuan Chang, Amol Agrawal, Andrew McCallum
Most unsupervised NLP models represent each word with a single point or single region in semantic space, while the existing multi-sense word embeddings cannot represent longer word sequences like phrases or sentences.
no code implementations • EACL 2021 • Rohan Paul, Haw-Shiuan Chang, Andrew McCallum
To address the violation of the USchema assumption, we propose multi-facet universal schema that uses a neural model to represent each sentence pattern as multiple facet embeddings and encourage one of these facet embeddings to be close to that of another sentence pattern if they co-occur with the same entity pair.
1 code implementation • 24 Mar 2021 • Sheshera Mysore, Tim O'Gorman, Andrew McCallum, Hamed Zamani
Query by Example is a well-known information retrieval task in which a document is chosen by the user as the search query and the goal is to retrieve relevant documents from a large collection.
no code implementations • ACL 2021 • Ahsaas Bajaj, Pavitra Dangati, Kalpesh Krishna, Pradhiksha Ashok Kumar, Rheeya Uppaal, Bradford Windsor, Eliot Brenner, Dominic Dotterrer, Rajarshi Das, Andrew McCallum
Abstractive summarization is the task of compressing a long document into a coherent short document while retaining salient information.
1 code implementation • 26 Jan 2021 • Sunil Mohan, Rico Angell, Nick Monath, Andrew McCallum
Tools to explore scientific literature are essential for scientists, especially in biomedicine, where about a million new papers are published every year.
1 code implementation • ACL 2021 • Yasumasa Onoe, Michael Boratko, Andrew McCallum, Greg Durrett
Neural entity typing models typically represent fine-grained entity types as vectors in a high-dimensional space, but such spaces are not well-suited to modeling these types' complex interdependencies.
Ranked #8 on Entity Typing on Open Entity
no code implementations • 1 Jan 2021 • Shib Sankar Dasgupta, Xiang Li, Michael Boratko, Dongxu Zhang, Andrew McCallum
In Patel et al. (2020), the authors demonstrate that only the transitive reduction is required, and further extend box embeddings to capture joint hierarchies by augmenting the graph with new nodes.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Daivik Swarup, Ahsaas Bajaj, Sheshera Mysore, Tim O'Gorman, Rajarshi Das, Andrew McCallum
Fortunately, such specific domains often use rather formulaic writing, such that the different ways of expressing relations in a small number of grammatically similar labeled sentences may provide high coverage of semantic structures in the corpus, through an appropriately rich similarity metric.
2 code implementations • 22 Oct 2020 • Nicholas Monath, Avinava Dubey, Guru Guruganesh, Manzil Zaheer, Amr Ahmed, Andrew McCallum, Gokhan Mergen, Marc Najork, Mert Terzihan, Bryon Tjanaka, YuAn Wang, Yuchen Wu
The applicability of agglomerative clustering, for inferring both hierarchical and flat clustering, is limited by its scalability.
no code implementations • NAACL 2021 • Rico Angell, Nicholas Monath, Sunil Mohan, Nishant Yadav, Andrew McCallum
In this paper, we introduce a model in which linking decisions can be made not merely by linking to a knowledge base entity but also by grouping multiple mentions together via clustering and jointly making linking predictions.
1 code implementation • NeurIPS 2020 • Shib Sankar Dasgupta, Michael Boratko, Dongxu Zhang, Luke Vilnis, Xiang Lorraine Li, Andrew McCallum
Geometric embeddings have recently received attention for their natural ability to represent transitive asymmetric relations via containment.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Rajarshi Das, Ameya Godbole, Nicholas Monath, Manzil Zaheer, Andrew McCallum
A case-based reasoning (CBR) system solves a new problem by retrieving 'cases' that are similar to the given problem.
Ranked #1 on Link Prediction on NELL-995
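To make the retrieve-and-reuse idea concrete, here is a toy sketch of case-based reasoning over a knowledge graph: relation paths that answered the query relation for similar 'case' entities are replayed from the query entity. The toy KG, the hard-coded candidate path, and all names are illustrative; the actual system retrieves cases and mines paths automatically.

```python
from collections import Counter

# Toy KG, entirely made up for illustration: (head, relation) -> set of tails
KG = {
    ("melinda", "spouse"): {"bill"},
    ("bill", "works_at"): {"microsoft"},
    ("priscilla", "spouse"): {"mark"},
    ("mark", "works_at"): {"facebook"},
    ("priscilla", "employer"): {"facebook"},
}

def follow(entity, path):
    """Apply a relation path such as ('spouse', 'works_at') starting at entity."""
    frontier = {entity}
    for rel in path:
        frontier = set().union(*(KG.get((e, rel), set()) for e in frontier))
    return frontier

def cbr_answer(query_entity, query_relation, cases, candidate_paths):
    """Replay, from the query entity, paths that answered the relation for cases."""
    votes = Counter()
    for case in cases:
        known_answers = KG.get((case, query_relation), set())
        for path in candidate_paths:
            if follow(case, path) & known_answers:  # the path 'worked' for this case
                votes.update(follow(query_entity, path))
    return votes.most_common()

# Who is melinda's employer? Reuse the path pattern observed for the similar case.
print(cbr_answer("melinda", "employer",
                 cases=["priscilla"],
                 candidate_paths=[("spouse", "works_at")]))  # -> [('microsoft', 1)]
```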
no code implementations • 27 Sep 2020 • Vaishnavi Kommaraju, Karthick Gunasekaran, Kun Li, Trapit Bansal, Andrew McCallum, Ivana Williams, Ana-Maria Istrate
We explore the suitability of unsupervised representation learning methods on biomedical text -- BioBERT, SciBERT, and BioSentVec -- for biomedical question answering.
1 code implementation • ACL 2021 • Sumanta Bhattacharyya, Amirmohammad Rooshenas, Subhajit Naskar, Simeng Sun, Mohit Iyyer, Andrew McCallum
To benefit from this observation, we train an energy-based model to mimic the behavior of the task measure (i.e., the energy-based model assigns lower energy to samples with higher BLEU score), which results in a re-ranking algorithm based on the samples drawn from NMT: energy-based re-ranking (EBR).
1 code implementation • EMNLP 2020 • Trapit Bansal, Rishikesh Jha, Tsendsuren Munkhdalai, Andrew McCallum
We meta-train a transformer model on this distribution of tasks using a recent meta-learning framework.
1 code implementation • AKBC 2020 • Rajarshi Das, Ameya Godbole, Shehzaad Dhuliawala, Manzil Zaheer, Andrew McCallum
We present a surprisingly simple yet accurate approach to reasoning in knowledge graphs (KGs) that requires \emph{no training}, and is reminiscent of case-based reasoning in classical artificial intelligence (AI).
no code implementations • 24 Jun 2020 • Xin Luna Dong, Xiang He, Andrey Kan, Xi-An Li, Yan Liang, Jun Ma, Yifan Ethan Xu, Chenwei Zhang, Tong Zhao, Gabriel Blanco Saldana, Saurabh Deshpande, Alexandre Michetti Manduca, Jay Ren, Surender Pal Singh, Fan Xiao, Haw-Shiuan Chang, Giannis Karamanolakis, Yuning Mao, Yaqing Wang, Christos Faloutsos, Andrew McCallum, Jiawei Han
Can one build a knowledge graph (KG) for all products in the world?
1 code implementation • AKBC 2020 • Dung Thai, Zhiyang Xu, Nicholas Monath, Boris Veytsman, Andrew McCallum
In this paper, we describe a technique that uses BibTeX to automatically generate a large-scale labeled dataset (41M labeled strings) that is four orders of magnitude larger than the current largest CFE dataset, namely the UMass Citation Field Extraction dataset [Anzaroot and McCallum, 2013].
1 code implementation • EMNLP 2020 • Michael Boratko, Xiang Lorraine Li, Rajarshi Das, Tim O'Gorman, Dan Le, Andrew McCallum
Given questions regarding some prototypical situation, such as "Name something that people usually do before they leave the house for work?"
1 code implementation • 26 Feb 2020 • Craig S. Greenberg, Sebastian Macaluso, Nicholas Monath, Ji-Ah Lee, Patrick Flaherty, Kyle Cranmer, Andrew McGregor, Andrew McCallum
In contrast to existing methods, we present dynamic-programming algorithms for \emph{exact} inference in hierarchical clustering based on a novel trellis data structure, and we prove that we can exactly compute the partition function, maximum likelihood hierarchy, and marginal probabilities of sub-hierarchies and clusters.
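A naive memoized version of the recurrence behind such exact inference is sketched below, under the assumption that a hierarchy's weight is the product of its clusters' potentials; fixing the smallest element on one side of each split counts every unordered split exactly once. This is only a sketch of the idea, with a made-up toy potential; the paper's trellis shares computation far more aggressively.

```python
from functools import lru_cache
from itertools import combinations
import math

def partition_function(points, potential):
    """Sum over all binary hierarchies of the product of their cluster potentials."""
    @lru_cache(maxsize=None)
    def Z(S):
        if len(S) == 1:
            return potential(S)
        first = min(S)
        rest = sorted(S - {first})
        total = 0.0
        for r in range(len(rest)):               # r < len(rest) keeps the other side nonempty
            for combo in combinations(rest, r):
                A = frozenset((first, *combo))   # 'first' always on this side: no double count
                total += Z(A) * Z(S - A)
        return potential(S) * total
    return Z(frozenset(points))

phi = lambda S: math.exp(-0.1 * len(S))          # toy cluster potential
print(partition_function((0, 1, 2, 3), phi))     # exact sum over all 15 binary trees
```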
1 code implementation • AKBC 2020 • Dhruvesh Patel, Shib Sankar Dasgupta, Michael Boratko, Xiang Li, Luke Vilnis, Andrew McCallum
Box Embeddings [Vilnis et al., 2018, Li et al., 2019] represent concepts with hyperrectangles in $n$-dimensional space and are shown to be capable of modeling tree-like structures efficiently by training on a large subset of the transitive closure of the WordNet hypernym graph.
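A minimal sketch of hard box embeddings as described, with containment scored as the fraction of one box's volume that lies inside another; the class, numbers, and example concepts are illustrative only.

```python
import numpy as np

class Box:
    """A concept as an axis-aligned hyperrectangle [lo, hi] in R^n."""
    def __init__(self, lo, hi):
        self.lo, self.hi = np.asarray(lo, float), np.asarray(hi, float)

    def volume(self):
        return np.prod(np.maximum(self.hi - self.lo, 0.0))  # 0 if boxes are disjoint

    def intersect(self, other):
        return Box(np.maximum(self.lo, other.lo), np.minimum(self.hi, other.hi))

def prob_subsumes(hyper, hypo):
    """P(hypernym | hyponym): fraction of the hyponym box inside the hypernym box."""
    v = hypo.volume()
    return hyper.intersect(hypo).volume() / v if v > 0 else 0.0

animal = Box([0.0, 0.0], [1.0, 1.0])
dog    = Box([0.1, 0.2], [0.4, 0.5])   # fully contained in `animal`
print(prob_subsumes(animal, dog))       # 1.0: dog is-a animal
print(prob_subsumes(dog, animal))       # small: animal is rarely a dog
```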
no code implementations • AKBC 2020 • Derek Tam, Nicholas Monath, Ari Kobren, Andrew McCallum
The hierarchical structure of research organizations plays a pivotal role in science of science research as well as in tools that track the research achievements and output.
1 code implementation • 31 Dec 2019 • Nicholas Monath, Ari Kobren, Akshay Krishnamurthy, Michael Glass, Andrew McCallum
We introduce Grinch, a new algorithm for large-scale, non-greedy hierarchical clustering with general linkage functions that compute arbitrary similarity between two point sets.
no code implementations • 2 Dec 2019 • Trapit Bansal, Pat Verga, Neha Choudhary, Andrew McCallum
Understanding the meaning of text often involves reasoning about entities and their relationships.
no code implementations • 17 Nov 2019 • Haw-Shiuan Chang, Shankar Vembu, Sunil Mohan, Rheeya Uppaal, Andrew McCallum
Existing deep active learning algorithms achieve impressive sampling efficiency on natural language processing tasks.
2 code implementations • COLING 2020 • Trapit Bansal, Rishikesh Jha, Andrew McCallum
LEOPARD is trained with the state-of-the-art transformer architecture and shows better generalization to tasks not seen at all during training, with as few as 4 examples per label.
no code implementations • WS 2019 • Rajarshi Das, Ameya Godbole, Manzil Zaheer, Shehzaad Dhuliawala, Andrew McCallum
This paper describes our submission to the shared task on "Multi-hop Inference Explanation Regeneration" in the TextGraphs workshop at EMNLP 2019 (Jansen and Ustalov, 2019).
no code implementations • IJCNLP 2019 • Andrew Drozdov, Patrick Verga, Yi-Pei Chen, Mohit Iyyer, Andrew McCallum
Understanding text often requires identifying meaningful constituent spans such as noun phrases and verb phrases.
no code implementations • CONLL 2019 • Pallavi Patil, Kriti Myer, Ronak Zala, Arpit Singh, Sheshera Mysore, Andrew McCallum, Adrian Benton, Amanda Stent
The sources of knowledge we use are news text and Freebase, a manually curated knowledge base.
no code implementations • WS 2019 • Ameya Godbole, Dilip Kavarthapu, Rajarshi Das, Zhiyu Gong, Abhishek Singhal, Hamed Zamani, Mo Yu, Tian Gao, Xiaoxiao Guo, Manzil Zaheer, Andrew McCallum
Multi-hop question answering (QA) requires an information retrieval (IR) system that can find \emph{multiple} supporting evidence needed to answer the question, making the retrieval process very challenging.
1 code implementation • ACL 2019 • Derek Tam, Nicholas Monath, Ari Kobren, Aaron Traylor, Rajarshi Das, Andrew McCallum
We evaluate STANCE's ability to detect whether two strings can refer to the same entity--a task we term alias detection.
no code implementations • ACL 2019 • Trapit Bansal, Da-Cheng Juan, Sujith Ravi, Andrew McCallum
State-of-the-art models for knowledge graph completion aim at learning a fixed embedding representation of entities in a multi-relational graph which can generalize to infer unseen entity relationships at test time.
1 code implementation • 19 Jun 2019 • Nishant Yadav, Ari Kobren, Nicholas Monath, Andrew McCallum
Thus we introduce an approach to supervised hierarchical clustering that smoothly interpolates between single, average, and complete linkage, and we give a training procedure that simultaneously learns a linkage function and a dissimilarity function.
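One standard way to interpolate smoothly between single, average, and complete linkage is a power mean over cross-cluster dissimilarities; the paper learns its own parameterization, so treat this purely as a sketch of the interpolation idea.

```python
import numpy as np

def interpolated_linkage(dists, p):
    """Generalized-mean linkage over pairwise dissimilarities between two clusters.

    p -> -inf recovers single linkage (min), p = 1 average linkage,
    p -> +inf complete linkage (max); intermediate p interpolates smoothly.
    Assumes positive dissimilarities.
    """
    d = np.asarray(dists, float).ravel()
    if np.isneginf(p):
        return d.min()
    if np.isposinf(p):
        return d.max()
    return np.mean(d ** p) ** (1.0 / p)

pairwise = [0.2, 0.5, 0.9, 1.4]   # dissimilarities between points across two clusters
for p in (-np.inf, 1, 4, np.inf):
    print(p, round(interpolated_linkage(pairwise, p), 3))
```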
3 code implementations • ACL 2019 • Emma Strubell, Ananya Ganesh, Andrew McCallum
Recent progress in hardware and methodology for training neural networks has ushered in a new generation of large networks trained on abundant data.
1 code implementation • NAACL 2019 • Andrew Drozdov, Patrick Verga, Mohit Yadav, Mohit Iyyer, Andrew McCallum
We introduce the deep inside-outside recursive autoencoder (DIORA), a fully-unsupervised method for discovering syntax that simultaneously learns representations for constituents within the induced tree.
Ranked #5 on Constituency Grammar Induction on PTB
3 code implementations • 28 May 2019 • Ari Kobren, Barna Saha, Andrew McCallum
Automatically matching reviewers to papers is a crucial step of the peer review process for venues receiving thousands of submissions.
Data Structures and Algorithms • Digital Libraries
no code implementations • WS 2019 • Sheshera Mysore, Zach Jensen, Edward Kim, Kevin Huang, Haw-Shiuan Chang, Emma Strubell, Jeffrey Flanigan, Andrew McCallum, Elsa Olivetti
Materials science literature contains millions of materials synthesis procedures described in unstructured natural language text.
1 code implementation • ICLR 2019 • Rajarshi Das, Shehzaad Dhuliawala, Manzil Zaheer, Andrew McCallum
This paper introduces a new framework for open-domain question answering in which the retriever and the reader iteratively interact with each other.
no code implementations • ICLR 2019 • Xiang Li, Luke Vilnis, Dongxu Zhang, Michael Boratko, Andrew McCallum
However, the hard edges of the boxes present difficulties for standard gradient based optimization; that work employed a special surrogate function for the disjoint case, but we find this method to be fragile.
1 code implementation • NAACL 2019 • Dongxu Zhang, Subhabrata Mukherjee, Colin Lockard, Xin Luna Dong, Andrew McCallum
In this paper, we consider advancing web-scale knowledge extraction and alignment by integrating OpenIE extractions in the form of (subject, predicate, object) triples with Knowledge Bases (KB).
3 code implementations • 3 Apr 2019 • Andrew Drozdov, Pat Verga, Mohit Yadav, Mohit Iyyer, Andrew McCallum
We introduce deep inside-outside recursive autoencoders (DIORA), a fully-unsupervised method for discovering syntax that simultaneously learns representations for constituents within the induced tree.
1 code implementation • 31 Dec 2018 • Edward Kim, Zach Jensen, Alexander van Grootel, Kevin Huang, Matthew Staib, Sheshera Mysore, Haw-Shiuan Chang, Emma Strubell, Andrew McCallum, Stefanie Jegelka, Elsa Olivetti
Leveraging new data sources is a key step in accelerating the pace of materials design and discovery.
no code implementations • 22 Dec 2018 • Amirmohammad Rooshenas, Dongxu Zhang, Gopal Sharma, Andrew McCallum
In this paper, we instead use efficient truncated randomized search in this reward function to train structured prediction energy networks (SPENs), which provide efficient test-time inference using gradient-based search on a smooth, learned representation of the score landscape, and have previously yielded state-of-the-art results in structured prediction.
no code implementations • NeurIPS 2018 • Craig Greenberg, Nicholas Monath, Ari Kobren, Patrick Flaherty, Andrew McGregor, Andrew McCallum
For many classic structured prediction problems, probability distributions over the dependent variables can be efficiently computed using widely-known algorithms and data structures (such as forward-backward, and its corresponding trellis for exact probability distributions in Markov models).
no code implementations • AKBC 2019 • Ari Kobren, Nicholas Monath, Andrew McCallum
Users have tremendous potential to aid in the construction and maintenance of knowledge bases (KBs) through the contribution of feedback that identifies incorrect and missing entity attributes and relations.
no code implementations • WS 2018 • Emma Strubell, Andrew McCallum
Do unsupervised methods for learning rich, contextualized token representations obviate the need for explicit modeling of linguistic structure in neural network models for semantic role labeling (SRL)?
no code implementations • EMNLP 2018 • Michael Boratko, Harshit Padigela, Divyendra Mikkilineni, Pritish Yuvraj, Rajarshi Das, Andrew McCallum, Maria Chang, Achille Fokoue, Pavan Kapanipathi, Nicholas Mattei, Ryan Musa, Kartik Talamadupula, Michael Witbrock
Recent work introduces the AI2 Reasoning Challenge (ARC) and the associated ARC dataset that partitions open domain, complex science questions into an Easy Set and a Challenge Set.
no code implementations • ICLR 2019 • Rajarshi Das, Tsendsuren Munkhdalai, Xingdi Yuan, Adam Trischler, Andrew McCallum
We harness and extend a recently proposed machine reading comprehension (MRC) model to query for entity states, since these states are generally communicated in spans of text and MRC models perform well in extracting entity-centric spans.
Ranked #3 on Procedural Text Understanding on ProPara
no code implementations • EMNLP 2018 • Nathan Greenberg, Trapit Bansal, Patrick Verga, Andrew McCallum
This paper presents a method for training a single CRF extractor from multiple datasets with disjoint or partially overlapping sets of entity types.
no code implementations • CONLL 2018 • Dung Thai, Sree Harsha Ramesh, Shikhar Murty, Luke Vilnis, Andrew McCallum
Complex textual information extraction tasks are often posed as sequence labeling or \emph{shallow parsing}, where fields are extracted using local labels made consistent through probabilistic inference in a graphical model with constrained transitions.
2 code implementations • ACL 2018 • Shikhar Murty*, Patrick Verga*, Luke Vilnis, Irena Radovanovic, Andrew McCallum
Extraction from raw text to a knowledge base of entities and fine-grained types is often cast as prediction into a flat set of entity and type labels, neglecting the rich hierarchies over types and entities contained in curated ontologies.
no code implementations • NAACL 2018 • Amirmohammad Rooshenas, Aishwarya Kamath, Andrew McCallum
This paper introduces rank-based training of structured prediction energy networks (SPENs).
no code implementations • WS 2018 • Michael Boratko, Harshit Padigela, Divyendra Mikkilineni, Pritish Yuvraj, Rajarshi Das, Andrew McCallum, Maria Chang, Achille Fokoue-Nkoutche, Pavan Kapanipathi, Nicholas Mattei, Ryan Musa, Kartik Talamadupula, Michael Witbrock
We propose a comprehensive set of definitions of knowledge and reasoning types necessary for answering the questions in the ARC dataset.
no code implementations • ACL 2018 • Luke Vilnis, Xiang Li, Shikhar Murty, Andrew McCallum
Embedding methods which enforce a partial order or lattice structure over the concept space, such as Order Embeddings (OE) (Vendrov et al., 2016), are a natural way to model transitive relational data (e. g. entailment graphs).
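The Order Embeddings penalty from Vendrov et al. (2016) that this work builds on has a simple closed form; a small sketch follows, with made-up example vectors.

```python
import numpy as np

def order_violation(specific, general):
    """Order-embedding penalty of Vendrov et al. (2016).

    Under the reversed product order, 'specific entails general' holds when
    every coordinate of `general` is <= the matching coordinate of `specific`;
    the penalty measures how badly that constraint is violated.
    """
    specific, general = np.asarray(specific), np.asarray(general)
    return float(np.sum(np.maximum(0.0, general - specific) ** 2))

poodle, dog = np.array([2.0, 3.0]), np.array([1.0, 2.5])
print(order_violation(poodle, dog))   # 0.0: 'poodle is-a dog' satisfied
print(order_violation(dog, poodle))   # > 0: reverse direction violated
```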
1 code implementation • EMNLP 2018 • Emma Strubell, Patrick Verga, Daniel Andor, David Weiss, Andrew McCallum
Unlike previous models which require significant pre-processing to prepare linguistic features, LISA can incorporate syntax using merely raw tokens as input, encoding the sequence only once to simultaneously perform parsing, predicate detection and role labeling for all predicates.
no code implementations • WS 2018 • Haw-Shiuan Chang, Amol Agrawal, Ananya Ganesh, Anirudha Desai, Vinayak Mathur, Alfred Hough, Andrew McCallum
Word sense induction (WSI), which addresses polysemy by unsupervised discovery of multiple word senses, resolves ambiguities for downstream NLP tasks and also makes word representations more interpretable.
1 code implementation • NAACL 2018 • Patrick Verga, Emma Strubell, Andrew McCallum
Most work in relation extraction forms a prediction by looking at a short span of text within a single sentence containing a single entity pair mention.
no code implementations • 18 Nov 2017 • Sheshera Mysore, Edward Kim, Emma Strubell, Ao Liu, Haw-Shiuan Chang, Srikrishna Kompella, Kevin Huang, Andrew McCallum, Elsa Olivetti
In this work, we present a system for automatically extracting structured representations of synthesis procedures from the texts of materials science journal articles that describe explicit, experimental syntheses of inorganic compounds.
no code implementations • 15 Nov 2017 • Shikhar Murty, Patrick Verga, Luke Vilnis, Andrew McCallum
We consider the challenging problem of entity typing over an extremely fine grained set of types, wherein a single mention or entity can have many simultaneous and often hierarchically-structured types.
7 code implementations • ICLR 2018 • Rajarshi Das, Shehzaad Dhuliawala, Manzil Zaheer, Luke Vilnis, Ishan Durugkar, Akshay Krishnamurthy, Alex Smola, Andrew McCallum
Knowledge bases (KB), both automatically and manually constructed, are often incomplete: many valid facts can be inferred from the KB by synthesizing existing information.
no code implementations • 23 Oct 2017 • Patrick Verga, Emma Strubell, Ofer Shai, Andrew McCallum
We propose a model to consider all mention and entity pairs simultaneously in order to make a prediction.
no code implementations • NAACL 2018 • Haw-Shiuan Chang, ZiYun Wang, Luke Vilnis, Andrew McCallum
Modeling hypernymy, such as poodle is-a dog, is an important generalization aid to many NLP tasks, such as entailment, coreference, relation extraction, and question answering.
no code implementations • 2 Aug 2017 • Dung Thai, Shikhar Murty, Trapit Bansal, Luke Vilnis, David Belanger, Andrew McCallum
In textual information extraction and other sequence labeling tasks it is now common to use recurrent neural networks (such as LSTM) to form rich embedded representations of long-term input co-occurrence patterns.
no code implementations • 1 Aug 2017 • Xiang Li, Luke Vilnis, Andrew McCallum
Recent work in learning ontologies (hierarchical and partially-ordered structures) has leveraged the intrinsic geometry of spaces of learned representations to make predictions that automatically obey complex structural constraints.
no code implementations • 22 Jun 2017 • Trapit Bansal, Arvind Neelakantan, Andrew McCallum
We introduce RelNet: a new model for relational reasoning.
no code implementations • WS 2017 • Emma Strubell, Andrew McCallum
Dependency parses are an effective way to inject linguistic knowledge into many downstream tasks, and many practitioners wish to efficiently parse sentences at scale.
no code implementations • ACL 2017 • Rajarshi Das, Manzil Zaheer, Siva Reddy, Andrew McCallum
Existing question answering methods infer answers either from a knowledge base or from raw text.
1 code implementation • NeurIPS 2017 • Haw-Shiuan Chang, Erik Learned-Miller, Andrew McCallum
Self-paced learning and hard example mining re-weight training instances to improve learning accuracy.
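For reference, here are minimal versions of the two textbook reweighting rules this first sentence contrasts; the paper's own scheme differs from both, so this is only a sketch of the baselines being discussed.

```python
import numpy as np

def instance_weights(losses, lam, mode):
    """Two opposite instance-reweighting rules.

    'self_paced': keep only easy examples (loss below threshold lam).
    'hard_mining': emphasize hard examples in proportion to their loss.
    """
    losses = np.asarray(losses, float)
    if mode == "self_paced":
        return (losses < lam).astype(float)
    if mode == "hard_mining":
        return losses / losses.sum()
    raise ValueError(mode)

losses = np.array([0.1, 0.4, 2.0, 3.5])
print(instance_weights(losses, lam=1.0, mode="self_paced"))   # [1. 1. 0. 0.]
print(instance_weights(losses, lam=1.0, mode="hard_mining"))  # hard examples up-weighted
```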
1 code implementation • SEMEVAL 2017 • Isabelle Augenstein, Mrinal Das, Sebastian Riedel, Lakshmi Vikraman, Andrew McCallum
We describe the SemEval task of extracting keyphrases and relations between them from scientific documents, which is crucial for understanding which publications describe which processes, tasks and materials.
2 code implementations • 6 Apr 2017 • Ari Kobren, Nicholas Monath, Akshay Krishnamurthy, Andrew McCallum
Many modern clustering methods scale well to a large number of data items, N, but not to a large number of clusters, K. This paper introduces PERCH, a new non-greedy algorithm for online hierarchical clustering that scales to both massive N and K--a problem setting we term extreme clustering.
no code implementations • ICML 2017 • David Belanger, Bishan Yang, Andrew McCallum
Structured Prediction Energy Networks (SPENs) are a simple, yet expressive family of structured prediction models (Belanger and McCallum, 2016).
4 code implementations • EMNLP 2017 • Emma Strubell, Patrick Verga, David Belanger, Andrew McCallum
Today when many practitioners run basic NLP on the entire web and large-volume traffic, faster methods are paramount to saving time and energy costs.
Ranked #23 on Named Entity Recognition (NER) on Ontonotes v5 (English)
2 code implementations • 28 Nov 2016 • Arvind Neelakantan, Quoc V. Le, Martin Abadi, Andrew McCallum, Dario Amodei
The main experimental result in this paper is that a single Neural Programmer model achieves 34.2% accuracy using only 10,000 examples with weak supervision.
no code implementations • 7 Sep 2016 • Trapit Bansal, David Belanger, Andrew McCallum
In a variety of application domains the content to be recommended to users is associated with text.
2 code implementations • EACL 2017 • Rajarshi Das, Arvind Neelakantan, David Belanger, Andrew McCallum
Our goal is to combine the rich multistep inference of symbolic logical reasoning with the generalization capabilities of neural networks.
1 code implementation • EACL 2017 • Patrick Verga, Arvind Neelakantan, Andrew McCallum
In experiments predicting both relations and entity types, we demonstrate that despite having an order of magnitude fewer parameters than traditional universal schema, we can match the accuracy of the traditional model, and more importantly, we can now make predictions about unseen rows with nearly the same accuracy as rows available at training time.
1 code implementation • WS 2016 • Patrick Verga, Andrew McCallum
In experimental results on the FB15k-237 benchmark we demonstrate that we can match the performance of a comparable model with explicit entity pair representations using a model of attention over relation types.
1 code implementation • NAACL 2016 • Patrick Verga, David Belanger, Emma Strubell, Benjamin Roth, Andrew McCallum
In response, this paper introduces significant further improvements to the coverage and flexibility of universal schema relation extraction: predictions for entities unseen in training and multilingual transfer learning to domains with no annotation.
no code implementations • 19 Nov 2015 • David Belanger, Andrew McCallum
This deep architecture captures dependencies between labels that would lead to intractable graphical models, and performs structure learning by automatically learning discriminative features of the structured output.
no code implementations • IJCNLP 2015 • Emma Strubell, Luke Vilnis, Kate Silverstein, Andrew McCallum
We present paired learning and inference algorithms for significantly reducing computation and increasing speed of the vector dot products in the classifiers that are at the heart of many NLP components.
no code implementations • IJCNLP 2015 • Arvind Neelakantan, Benjamin Roth, Andrew McCallum
Knowledge base (KB) completion adds new facts to a KB by making inferences from existing facts, for example by inferring with high likelihood nationality(X, Y) from bornIn(X, Y).
no code implementations • EMNLP 2014 • Arvind Neelakantan, Jeevan Shankar, Alexandre Passos, Andrew McCallum
There is rising interest in vector-space word embeddings and their use in NLP, especially given recent methods for their fast estimation at very large scale.
no code implementations • 4 Mar 2015 • Luke Vilnis, David Belanger, Daniel Sheldon, Andrew McCallum
Many inference problems in structured prediction are naturally solved by augmenting a tractable dependency structure with complex, non-local auxiliary objectives.
1 code implementation • 20 Dec 2014 • Luke Vilnis, Andrew McCallum
Current work in lexical distributed representations maps each word to a point vector in low-dimensional space.
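Replacing point vectors with densities gives asymmetric similarities; below is a sketch of the closed-form KL divergence between diagonal Gaussians, one such measure usable with Gaussian word embeddings. All numbers are illustrative.

```python
import numpy as np

def kl_diag_gaussians(mu1, var1, mu2, var2):
    """KL( N(mu1, diag(var1)) || N(mu2, diag(var2)) ), in closed form.

    Asymmetric, so it can score directional (entailment-like) relations
    between words represented as densities rather than point vectors.
    """
    mu1, var1, mu2, var2 = map(np.asarray, (mu1, var1, mu2, var2))
    return 0.5 * np.sum(np.log(var2 / var1) + (var1 + (mu1 - mu2) ** 2) / var2 - 1.0)

# a tight 'specific' density vs. a broad 'general' density (illustrative numbers)
print(kl_diag_gaussians([0.0, 0.0], [0.2, 0.2], [0.1, 0.0], [1.0, 1.0]))
print(kl_diag_gaussians([0.1, 0.0], [1.0, 1.0], [0.0, 0.0], [0.2, 0.2]))  # differs: asymmetric
```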
no code implementations • 30 Oct 2014 • Emma Strubell, Luke Vilnis, Andrew McCallum
We present paired learning and inference algorithms for significantly reducing computation and increasing speed of the vector dot products in the classifiers that are at the heart of many NLP components.
no code implementations • WS 2014 • Alexandre Passos, Vineet Kumar, Andrew McCallum
Most state-of-the-art approaches for named-entity recognition (NER) use semi-supervised information in the form of word clusters and lexicons.
no code implementations • ACL 2014 • Sam Anzaroot, Alexandre Passos, David Belanger, Andrew McCallum
Accurately segmenting a citation string into fields for authors, titles, etc.
no code implementations • 14 Nov 2013 • Sameer Singh, Sebastian Riedel, Andrew McCallum
Belief Propagation has been widely used for marginal inference, however it is slow on problems with large-domain variables and high-order factors.
no code implementations • NeurIPS 2012 • David Belanger, Alexandre Passos, Sebastian Riedel, Andrew McCallum
Linear chains and trees are basic building blocks in many applications of graphical models.
no code implementations • NeurIPS 2011 • Michael L. Wick, Andrew McCallum
Traditional approaches to probabilistic inference such as loopy belief propagation and Gibbs sampling typically compute marginals for all the unobserved variables in a graphical model.
no code implementations • 17 Nov 2010 • Charles Sutton, Andrew McCallum
This tutorial describes conditional random fields, a popular probabilistic method for structured prediction.
no code implementations • NeurIPS 2009 • Khashayar Rohanimanesh, Sameer Singh, Andrew McCallum, Michael J. Black
Large, relational factor graphs with structure defined by first-order logic or other languages give rise to notoriously difficult inference problems.
no code implementations • NeurIPS 2009 • Hanna M. Wallach, David M. Mimno, Andrew McCallum
Implementations of topic models typically use symmetric Dirichlet priors with fixed concentration parameters, with the implicit assumption that such "smoothing parameters" have little practical effect.
no code implementations • NeurIPS 2009 • Andrew McCallum, Karl Schultz, Sameer Singh
Discriminatively trained undirected graphical models have had wide empirical success, and there has been increasing interest in toolkits that ease their application to complex relational data.