We present ACCoRD, an end-to-end system tackling the novel task of generating sets of descriptions of scientific concepts.
Creating labeled natural language training data is expensive and requires significant human effort.
An important goal in the field of human-AI interaction is to help users more appropriately trust AI systems' decisions.
To improve the discovery experience, we introduce multiple new methods for augmenting recommendations with textual relevance messages that highlight knowledge-graph connections between recommended papers and a user's publication and interaction history.
Based on our findings, we present PINOCCHIO, a new decoding method that improves the consistency of a transformer-based abstractive summarizer by constraining beam search to avoid hallucinations.
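The core idea of constraining beam search can be sketched generically: at each decoding step, candidate continuations that fail a consistency check are pruned from the beam. This is a minimal illustration under assumed interfaces, not PINOCCHIO's actual implementation; `step_fn` and `is_consistent` are hypothetical stand-ins for the summarizer's next-token scorer and hallucination detector.

```python
import heapq
import math

def constrained_beam_search(step_fn, is_consistent, beam_width=3, max_len=5):
    """Beam search that drops candidate continuations failing a consistency check.

    step_fn(prefix) -> list of (token, log_prob) continuations.
    is_consistent(prefix) -> bool; expansions failing it are pruned,
    loosely mimicking a hallucination constraint on the beam.
    """
    beams = [(0.0, [])]  # (cumulative log-prob, token sequence)
    for _ in range(max_len):
        candidates = []
        for score, seq in beams:
            for tok, lp in step_fn(seq):
                new_seq = seq + [tok]
                if is_consistent(new_seq):  # constraint: skip inconsistent expansions
                    candidates.append((score + lp, new_seq))
        if not candidates:
            break
        beams = heapq.nlargest(beam_width, candidates, key=lambda x: x[0])
    return max(beams, key=lambda x: x[0])[1]

# Toy example: two-token vocabulary where "bad" is treated as a hallucination.
def toy_step(prefix):
    return [("good", math.log(0.4)), ("bad", math.log(0.6))]

best = constrained_beam_search(toy_step, lambda seq: "bad" not in seq, max_len=3)
print(best)  # ['good', 'good', 'good'] — the higher-scoring "bad" token is never kept
```

Note that the constraint wins out over raw likelihood here: unconstrained beam search would prefer "bad" at every step.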
However, we note that the two explanation approaches may be better compared in the context of a higher-stakes or more opaque domain.
To address this problem, we present a novel task of extraction and search of scientific challenges and directions, to facilitate rapid knowledge discovery.
Experiments are conducted on a newly curated evaluation suite, S2-VLUE, that unifies existing automatically-labeled datasets and includes a new dataset of manual annotations covering diverse papers from 19 scientific disciplines.
Leaderboards have eased model development for many NLP datasets by standardizing their evaluation and delegating it to an independent external repository.
While counterfactual examples are useful for analysis and training of NLP models, current generation methods either rely on manual labor to create very few counterfactuals, or only instantiate limited types of perturbations such as paraphrases or word substitutions.
Based on this analysis, we develop a new definition detection system, HEDDEx, that utilizes syntactic features, transformer encoders, and heuristic filters, and evaluate it on a standard sentence-level benchmark.
We introduce ScholarPhi, an augmented reading interface with four novel features: (1) tooltips that surface position-sensitive definitions from elsewhere in a paper, (2) a filter over the paper that "declutters" it to reveal how the term or symbol is used across the paper, (3) automatic equation diagrams that expose multiple definitions in parallel, and (4) an automatically generated glossary of important terms and symbols.
However, prior studies observed improvements from explanations only when the AI alone outperformed both the human and the best team.
From a corpus of computer science papers on arXiv, we find that our method achieves a Precision@1000 of 99%, compared to 86% for prior work, and a substantially better precision-yield trade-off across the top 15,000 extractions.
The COVID-19 pandemic has sparked unprecedented mobilization of scientists, generating a deluge of papers that makes it hard for researchers to keep track and explore new directions.
We report the results of running the pipeline on 16.3 million pages from the Chronicling America corpus and describe the resulting Newspaper Navigator dataset, the largest dataset of extracted visual content from historic newspapers ever produced.
To optimize the team performance for this setting we maximize the team's expected utility, expressed in terms of the quality of the final decision, cost of verifying, and individual accuracies of people and machines.
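A minimal numeric sketch of that trade-off, with illustrative values rather than the paper's exact model: compare the expected utility of accepting the machine's decision outright against paying a fixed cost to have a person verify it.

```python
def expected_utility(ai_accuracy, human_accuracy, verify_cost,
                     reward_correct=1.0, penalty_wrong=0.0, verify=False):
    """Expected utility of the final decision for a simple human-AI team.

    If verify=False the team takes the AI's answer as-is; if verify=True a
    human checks it at a fixed cost, and we assume the human's accuracy then
    governs the final decision. Simplifying assumptions for illustration only.
    """
    acc = human_accuracy if verify else ai_accuracy
    eu = acc * reward_correct + (1 - acc) * penalty_wrong
    return eu - (verify_cost if verify else 0.0)

# Verification pays off only when the accuracy gain exceeds its cost.
no_check = expected_utility(ai_accuracy=0.80, human_accuracy=0.95, verify_cost=0.10)
check = expected_utility(ai_accuracy=0.80, human_accuracy=0.95, verify_cost=0.10, verify=True)
print(no_check, check)  # 0.8 0.85 — verifying is worth the 0.10 cost here
```

Raising `verify_cost` above the 0.15 accuracy gain flips the decision, which is the kind of threshold an expected-utility optimizer would exploit.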
Lucy Lu Wang, Kyle Lo, Yoganand Chandrasekhar, Russell Reas, Jiangjiang Yang, Doug Burdick, Darrin Eide, Kathryn Funk, Yannis Katsis, Rodney Kinney, Yunyao Li, Ziyang Liu, William Merrill, Paul Mooney, Dewey Murdick, Devvret Rishi, Jerry Sheehan, Zhihong Shen, Brandon Stilson, Alex Wade, Kuansan Wang, Nancy Xin Ru Wang, Chris Wilhelm, Boya Xie, Douglas Raymond, Daniel S. Weld, Oren Etzioni, Sebastian Kohlmeier
The COVID-19 Open Research Dataset (CORD-19) is a growing resource of scientific papers on COVID-19 and related historical coronavirus research.
We propose SPECTER, a new method to generate document-level embedding of scientific documents based on pretraining a Transformer language model on a powerful signal of document-level relatedness: the citation graph.
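One common way to turn citation-graph relatedness into a training signal is a triplet margin loss: pull a paper's embedding toward a cited (positive) paper and push it away from an uncited (negative) one. The sketch below illustrates that objective on toy vectors; it is a simplified stand-in, not SPECTER's exact training procedure.

```python
import numpy as np

def triplet_margin_loss(query, positive, negative, margin=1.0):
    """Citation-informed triplet loss sketch: penalize embeddings where a
    cited paper is not at least `margin` closer to the query than an
    unrelated paper (Euclidean distance). Illustrative only."""
    d_pos = np.linalg.norm(query - positive)
    d_neg = np.linalg.norm(query - negative)
    return max(d_pos - d_neg + margin, 0.0)

q = np.array([1.0, 0.0])   # query paper embedding
p = np.array([0.9, 0.1])   # cited paper: embedded nearby
n = np.array([-1.0, 0.0])  # unrelated paper: embedded far away
print(triplet_margin_loss(q, p, n))  # 0.0 — this triplet already satisfies the margin
```

Minimizing this loss over many citation-derived triplets is what shapes the embedding space around document-level relatedness.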
We show our method improves accuracy compared to a rigorous baseline on the image classification domains.
As a step toward better document-level understanding, we explore classification of a sequence of sentences into their corresponding categories, a task that requires understanding sentences in context of the document.
We apply BERT to coreference resolution, achieving strong improvements on the OntoNotes (+3.9 F1) and GAP (+11.5 F1) benchmarks.
Ranked #4 on Coreference Resolution on OntoNotes
We present SpanBERT, a pre-training method that is designed to better represent and predict spans of text.
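Span-style masking can be sketched as follows: repeatedly sample a span length from a capped geometric distribution and mask that many contiguous tokens until a budget (roughly 15% of the sequence) is covered. This is a sketch in the spirit of span masking, not SpanBERT's exact recipe, and the parameter values are illustrative assumptions.

```python
import random

def mask_contiguous_spans(tokens, mask_token="[MASK]", mask_ratio=0.15,
                          p=0.2, max_span=10, seed=0):
    """Replace contiguous spans with a mask token until ~mask_ratio of the
    tokens are covered; span lengths follow a geometric(p) distribution
    capped at max_span. Illustrative sketch, not an exact reproduction."""
    rng = random.Random(seed)
    out = list(tokens)
    budget = max(1, int(len(out) * mask_ratio))
    masked = set()
    while len(masked) < budget:
        # Sample a span length ~ geometric(p), capped at max_span.
        length = 1
        while rng.random() > p and length < max_span:
            length += 1
        start = rng.randrange(len(out))
        for i in range(start, min(start + length, len(out))):
            masked.add(i)
    for i in masked:
        out[i] = mask_token
    return out

text = "we present a pre training method for spans of text".split()
print(mask_contiguous_spans(text))  # same tokens, with one or more contiguous spans masked
```

Masking whole spans, rather than independent tokens, forces the model to predict multi-token units from span boundaries, which is the intuition behind span-level pre-training.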
Ranked #1 on Open-Domain Question Answering on SearchQA (F1 metric)
Reasoning about implied relationships (e.g., paraphrastic, common sense, encyclopedic) between pairs of words is crucial for many cross-sentence inference problems.
In this paper, we investigate a new problem of systematically mining question-code pairs from Stack Overflow (in contrast to heuristically collecting them).
Since Artificial Intelligence (AI) software uses techniques like deep lookahead search and stochastic optimization of huge neural networks to fit mammoth datasets, it often results in complex behavior that is difficult for people to understand.
We present TriviaQA, a challenging reading comprehension dataset containing over 650K question-answer-evidence triples.
We present POAPS, a novel planning system for defining Partially Observable Markov Decision Processes (POMDPs) that abstracts away from POMDP details for the benefit of non-expert practitioners.
Recent research on entity linking (EL) has introduced a plethora of promising techniques, ranging from deep neural networks to joint inference.