Search Results for author: Nicholas Monath

Found 31 papers, 18 papers with code

An Online Hierarchical Algorithm for Extreme Clustering

2 code implementations 6 Apr 2017 Ari Kobren, Nicholas Monath, Akshay Krishnamurthy, Andrew McCallum

Many modern clustering methods scale well to a large number of data items, N, but not to a large number of clusters, K. This paper introduces PERCH, a new non-greedy algorithm for online hierarchical clustering that scales to both massive N and K--a problem setting we term extreme clustering.

Clustering

Integrating User Feedback under Identity Uncertainty in Knowledge Base Construction

no code implementations AKBC 2019 Ari Kobren, Nicholas Monath, Andrew McCallum

Users have tremendous potential to aid in the construction and maintenance of knowledge bases (KBs) through the contribution of feedback that identifies incorrect and missing entity attributes and relations.

Entity Resolution

Compact Representation of Uncertainty in Clustering

no code implementations NeurIPS 2018 Craig Greenberg, Nicholas Monath, Ari Kobren, Patrick Flaherty, Andrew McGregor, Andrew McCallum

For many classic structured prediction problems, probability distributions over the dependent variables can be efficiently computed using widely-known algorithms and data structures (such as forward-backward, and its corresponding trellis for exact probability distributions in Markov models).

Clustering Small Data Image Classification +1

Supervised Hierarchical Clustering with Exponential Linkage

1 code implementation 19 Jun 2019 Nishant Yadav, Ari Kobren, Nicholas Monath, Andrew McCallum

Thus we introduce an approach to supervised hierarchical clustering that smoothly interpolates between single, average, and complete linkage, and we give a training procedure that simultaneously learns a linkage function and a dissimilarity function.

Clustering

Scalable Hierarchical Clustering with Tree Grafting

1 code implementation 31 Dec 2019 Nicholas Monath, Ari Kobren, Akshay Krishnamurthy, Michael Glass, Andrew McCallum

We introduce Grinch, a new algorithm for large-scale, non-greedy hierarchical clustering with general linkage functions that compute arbitrary similarity between two point sets.

Clustering

Predicting Institution Hierarchies with Set-based Models

no code implementations AKBC 2020 Derek Tam, Nicholas Monath, Ari Kobren, Andrew McCallum

The hierarchical structure of research organizations plays a pivotal role in science of science research as well as in tools that track the research achievements and output.

Data Structures & Algorithms for Exact Inference in Hierarchical Clustering

1 code implementation 26 Feb 2020 Craig S. Greenberg, Sebastian Macaluso, Nicholas Monath, Ji-Ah Lee, Patrick Flaherty, Kyle Cranmer, Andrew McGregor, Andrew McCallum

In contrast to existing methods, we present novel dynamic-programming algorithms for exact inference in hierarchical clustering based on a novel trellis data structure, and we prove that we can exactly compute the partition function, maximum likelihood hierarchy, and marginal probabilities of sub-hierarchies and clusters.

Clustering Small Data Image Classification

Using BibTeX to Automatically Generate Labeled Data for Citation Field Extraction

1 code implementation AKBC 2020 Dung Thai, Zhiyang Xu, Nicholas Monath, Boris Veytsman, Andrew McCallum

In this paper, we describe a technique for using BibTeX to automatically generate a large-scale labeled dataset (41M labeled strings) that is four orders of magnitude larger than the current largest CFE dataset, namely the UMass Citation Field Extraction dataset [Anzaroot and McCallum, 2013].

Management

Clustering-based Inference for Biomedical Entity Linking

no code implementations NAACL 2021 Rico Angell, Nicholas Monath, Sunil Mohan, Nishant Yadav, Andrew McCallum

In this paper, we introduce a model in which linking decisions can be made not merely by linking to a knowledge base entity but also by grouping multiple mentions together via clustering and jointly making linking predictions.

Clustering Entity Linking

Entity Linking and Discovery via Arborescence-based Supervised Clustering

1 code implementation 2 Sep 2021 Dhruv Agarwal, Rico Angell, Nicholas Monath, Andrew McCallum

Previous work has shown promising results in performing entity linking by measuring not only the affinities between mentions and entities but also those amongst mentions.

Clustering Entity Linking

Capacity and Bias of Learned Geometric Embeddings for Directed Graphs

1 code implementation NeurIPS 2021 Michael Boratko, Dongxu Zhang, Nicholas Monath, Luke Vilnis, Kenneth Clarkson, Andrew McCallum

While vectors in Euclidean space can theoretically represent any graph, much recent work shows that alternatives such as complex, hyperbolic, order, or box embeddings have geometric properties better suited to modeling real-world graphs.

Knowledge Base Completion Multi-Label Classification

Sublinear Time Approximation of Text Similarity Matrices

1 code implementation 17 Dec 2021 Archan Ray, Nicholas Monath, Andrew McCallum, Cameron Musco

Approximation methods reduce this quadratic complexity, often by using a small subset of exactly computed similarities to approximate the remainder of the complete pairwise similarity matrix.

Document Classification Sentence +2

Unsupervised Opinion Summarization Using Approximate Geodesics

no code implementations 15 Sep 2022 Somnath Basu Roy Chowdhury, Nicholas Monath, Avinava Dubey, Amr Ahmed, Snigdha Chaturvedi

We then use these representations to quantify the relevance of review sentences using a novel approximate geodesic distance based scoring mechanism.

Dictionary Learning Opinion Summarization +2

Efficient Nearest Neighbor Search for Cross-Encoder Models using Matrix Factorization

1 code implementation 23 Oct 2022 Nishant Yadav, Nicholas Monath, Rico Angell, Manzil Zaheer, Andrew McCallum

When the similarity is measured by dot-product between dual-encoder vectors or $\ell_2$-distance, there already exist many scalable and efficient search methods.

Retrieval

Autoregressive Structured Prediction with Language Models

1 code implementation 26 Oct 2022 Tianyu Liu, Yuchen Jiang, Nicholas Monath, Ryan Cotterell, Mrinmaya Sachan

Recent years have seen a paradigm shift in NLP towards using pretrained language models (PLM) for a wide range of tasks.

Ranked #1 on Relation Extraction on CoNLL04 (RE+ Micro F1 metric)

Named Entity Recognition (NER) +2

Improving Dual-Encoder Training through Dynamic Indexes for Negative Mining

no code implementations 27 Mar 2023 Nicholas Monath, Manzil Zaheer, Kelsey Allen, Andrew McCallum

First, we introduce an algorithm that uses a tree structure to approximate the softmax with provable bounds and that dynamically maintains the tree.

Retrieval

Efficient k-NN Search with Cross-Encoders using Adaptive Multi-Round CUR Decomposition

1 code implementation 4 May 2023 Nishant Yadav, Nicholas Monath, Manzil Zaheer, Andrew McCallum

While ANNCUR's one-time selection of anchors tends to approximate the cross-encoder distances on average, doing so forfeits the capacity to accurately estimate distances to items near the query, leading to regret in the crucial end-task: recall of top-k items.

Retrieval

Enhancing Group Fairness in Online Settings Using Oblique Decision Forests

1 code implementation 17 Oct 2023 Somnath Basu Roy Chowdhury, Nicholas Monath, Ahmad Beirami, Rahul Kidambi, Avinava Dubey, Amr Ahmed, Snigdha Chaturvedi

In the online setting, where the algorithm has access to a single instance at a time, estimating the group fairness objective requires additional storage and significantly more computation (e.g., forward/backward passes) than the task-specific objective at every time step.

Fairness

Incremental Extractive Opinion Summarization Using Cover Trees

no code implementations 16 Jan 2024 Somnath Basu Roy Chowdhury, Nicholas Monath, Avinava Dubey, Manzil Zaheer, Andrew McCallum, Amr Ahmed, Snigdha Chaturvedi

In this work, we study the task of extractive opinion summarization in an incremental setting, where the underlying review set evolves over time.

Extractive Summarization Opinion Summarization

Event and Entity Coreference using Trees to Encode Uncertainty in Joint Decisions

no code implementations CRAC (ACL) 2021 Nishant Yadav, Nicholas Monath, Rico Angell, Andrew McCallum

Coreference decisions among event mentions and among co-occurring entity mentions are highly interdependent, thus motivating joint inference.

Clustering

Entity Linking via Explicit Mention-Mention Coreference Modeling

1 code implementation NAACL 2022 Dhruv Agarwal, Rico Angell, Nicholas Monath, Andrew McCallum

Learning representations of entity mentions is a core component of modern entity linking systems for both candidate generation and making linking predictions.

Entity Linking Re-Ranking
