no code implementations • 23 Sep 2024 • Marissa Radensky, Simra Shahid, Raymond Fok, Pao Siangliulue, Tom Hope, Daniel S. Weld
The scientific ideation process often involves blending salient aspects of existing papers to create new ideas.
1 code implementation • 18 Dec 2023 • Madeleine Grunde-McLaughlin, Michelle S. Lam, Ranjay Krishna, Daniel S. Weld, Jeffrey Heer
The design space covers a designer's objectives and the tactics used to build workflows.
no code implementations • 12 May 2023 • Raymond Fok, Daniel S. Weld
To synthesize these findings, we propose a simple theory that elucidates the frequent failure of AI explanations to engender appropriate reliance and complementary decision-making performance.
no code implementations • 25 Mar 2023 • Kyle Lo, Joseph Chee Chang, Andrew Head, Jonathan Bragg, Amy X. Zhang, Cassidy Trier, Chloe Anastasiades, Tal August, Russell Authur, Danielle Bragg, Erin Bransom, Isabel Cachola, Stefan Candra, Yoganand Chandrasekhar, Yen-Sung Chen, Evie Yu-Yen Cheng, Yvonne Chou, Doug Downey, Rob Evans, Raymond Fok, Fangzhou Hu, Regan Huff, Dongyeop Kang, Tae Soo Kim, Rodney Kinney, Aniket Kittur, Hyeonsu Kang, Egor Klevak, Bailey Kuehl, Michael Langan, Matt Latzke, Jaron Lochner, Kelsey MacMillan, Eric Marsh, Tyler Murray, Aakanksha Naik, Ngoc-Uyen Nguyen, Srishti Palani, Soya Park, Caroline Paulic, Napol Rachatasumrit, Smita Rao, Paul Sayre, Zejiang Shen, Pao Siangliulue, Luca Soldaini, Huy Tran, Madeleine van Zuylen, Lucy Lu Wang, Christopher Wilhelm, Caroline Wu, Jiangjiang Yang, Angele Zamarron, Marti A. Hearst, Daniel S. Weld
Scholarly publications are key to the transfer of knowledge from scholars to others.
no code implementations • 11 Mar 2023 • Joyce Zhou, Elena Glassman, Daniel S. Weld
Scientists and science journalists, among others, often need to make sense of a large number of papers and how they compare with each other in scope, focus, findings, or any other important factors.
1 code implementation • 14 Feb 2023 • Tongshuang Wu, Hua Shen, Daniel S. Weld, Jeffrey Heer, Marco Tulio Ribeiro
ScatterShot iteratively slices unlabeled data into task-specific patterns, samples informative inputs from underexplored or not-yet-saturated slices in an active learning manner, and helps users label more efficiently with the help of an LLM and the current example set.
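The slice-then-sample loop described above can be sketched as follows. This is a minimal illustration only: the slicing rule (here, a trivial "contains a digit" bucket) and all names are hypothetical stand-ins for ScatterShot's task-specific patterns, not the system's actual implementation.

```python
import random
from collections import Counter, defaultdict

def slice_by_pattern(example):
    # Hypothetical slicing rule: bucket inputs by a coarse surface pattern.
    # A real system would use task-specific templates instead.
    return "has-digit" if any(c.isdigit() for c in example) else "no-digit"

def sample_underexplored(unlabeled, labeled, k=2, seed=0):
    """Pick k unlabeled examples, preferring slices with few labels so far."""
    rng = random.Random(seed)
    labeled_counts = Counter(slice_by_pattern(x) for x in labeled)
    slices = defaultdict(list)
    for x in unlabeled:
        slices[slice_by_pattern(x)].append(x)
    # Rank slices so the least-labeled (underexplored) ones come first.
    ranked = sorted(slices, key=lambda s: labeled_counts[s])
    picked = []
    for s in ranked:
        while slices[s] and len(picked) < k:
            picked.append(slices[s].pop(rng.randrange(len(slices[s]))))
    return picked

unlabeled = ["order 3 pizzas", "hello there", "ship 12 units", "good morning"]
labeled = ["fine thanks", "see you soon"]  # both fall in the no-digit slice
print(sample_underexplored(unlabeled, labeled))
```

Because the labeled set covers only the no-digit slice, the sampler surfaces the two digit-bearing inputs first, which is the active-learning intuition the entry describes.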
1 code implementation • 24 Jan 2023 • Rodney Kinney, Chloe Anastasiades, Russell Authur, Iz Beltagy, Jonathan Bragg, Alexandra Buraczynski, Isabel Cachola, Stefan Candra, Yoganand Chandrasekhar, Arman Cohan, Miles Crawford, Doug Downey, Jason Dunkelberger, Oren Etzioni, Rob Evans, Sergey Feldman, Joseph Gorney, David Graham, Fangzhou Hu, Regan Huff, Daniel King, Sebastian Kohlmeier, Bailey Kuehl, Michael Langan, Daniel Lin, Haokun Liu, Kyle Lo, Jaron Lochner, Kelsey MacMillan, Tyler Murray, Chris Newell, Smita Rao, Shaurya Rohatgi, Paul Sayre, Zejiang Shen, Amanpreet Singh, Luca Soldaini, Shivashankar Subramanian, Amber Tanaka, Alex D. Wade, Linda Wagner, Lucy Lu Wang, Chris Wilhelm, Caroline Wu, Jiangjiang Yang, Angele Zamarron, Madeleine van Zuylen, Daniel S. Weld
The volume of scientific output is creating an urgent need for automated tools to help scientists keep up with developments in their field.
no code implementations • 16 Aug 2022 • Harmanpreet Kaur, Doug Downey, Amanpreet Singh, Evie Yu-Yen Cheng, Daniel S. Weld, Jonathan Bragg
We implement our technique in a novel system, FeedLens, which is built over Semantic Scholar, a production system for navigating the scientific literature knowledge graph (KG).
1 code implementation • 14 May 2022 • Sonia K. Murthy, Kyle Lo, Daniel King, Chandra Bhagavatula, Bailey Kuehl, Sophie Johnson, Jonathan Borchardt, Daniel S. Weld, Tom Hope, Doug Downey
We present ACCoRD, an end-to-end system tackling the novel task of generating sets of descriptions of scientific concepts.
no code implementations • 9 May 2022 • Mandar Joshi, Terra Blevins, Mike Lewis, Daniel S. Weld, Luke Zettlemoyer
Creating labeled natural language training data is expensive and requires significant human effort.
no code implementations • 4 May 2022 • Tom Hope, Doug Downey, Oren Etzioni, Daniel S. Weld, Eric Horvitz
We stand at the foot of a significant inflection in the trajectory of scientific discovery.
no code implementations • 27 Apr 2022 • Marissa Radensky, Dustin Burson, Rajya Bhaiya, Daniel S. Weld
An important goal in the field of human-AI interaction is to help users more appropriately trust AI systems' decisions.
no code implementations • 21 Apr 2022 • Hyeonsu B. Kang, Rafal Kocielnik, Andrew Head, Jiangjiang Yang, Matt Latzke, Aniket Kittur, Daniel S. Weld, Doug Downey, Jonathan Bragg
To improve the discovery experience, we introduce multiple new methods for augmenting recommendations with textual relevance messages that highlight knowledge-graph connections between recommended papers and a user's publication and interaction history.
1 code implementation • 16 Mar 2022 • Daniel King, Zejiang Shen, Nishant Subramani, Daniel S. Weld, Iz Beltagy, Doug Downey
Based on our findings, we present PINOCCHIO, a new decoding method that improves the consistency of a transformer-based abstractive summarizer by constraining beam search to avoid hallucinations.
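The beam-pruning idea behind constrained decoding can be illustrated with a toy step. PINOCCHIO's actual consistency checks are richer than this; the sketch below only shows the mechanism of dropping beam expansions whose tokens are unsupported by the source, with made-up tokens and scores.

```python
def constrained_step(beams, vocab_scores, source_tokens, beam_width=2):
    """One toy step of consistency-constrained beam search: candidate
    tokens that never appear in the source document are pruned before
    the beam is re-ranked."""
    expansions = []
    for prefix, score in beams:
        for token, logp in vocab_scores.items():
            if token not in source_tokens:
                continue  # prune: token unsupported by the source
            expansions.append((prefix + [token], score + logp))
    # Keep the beam_width highest-scoring surviving hypotheses.
    expansions.sort(key=lambda e: e[1], reverse=True)
    return expansions[:beam_width]

source = {"the", "model", "improves", "accuracy"}
scores = {"model": -0.1, "invents": -0.2, "accuracy": -0.3}
beams = [([], 0.0)]
print(constrained_step(beams, scores, source))
```

Even though "invents" scores higher than "accuracy", it is filtered out because it has no support in the source, so the decoder cannot hallucinate it.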
no code implementations • 27 Sep 2021 • Marissa Radensky, Doug Downey, Kyle Lo, Zoran Popović, Daniel S. Weld
However, we note that the two explanation approaches may be better compared in the context of a higher-stakes or more opaque domain.
1 code implementation • NeurIPS Workshop AI4Science 2021 • Dan Lahav, Jon Saad-Falcon, Bailey Kuehl, Sophie Johnson, Sravanthi Parasa, Noam Shomron, Duen Horng Chau, Diyi Yang, Eric Horvitz, Daniel S. Weld, Tom Hope
To address this problem, we present a novel task of extraction and search of scientific challenges and directions, to facilitate rapid knowledge discovery.
1 code implementation • 1 Jun 2021 • Zejiang Shen, Kyle Lo, Lucy Lu Wang, Bailey Kuehl, Daniel S. Weld, Doug Downey
Experiments are conducted on a newly curated evaluation suite, S2-VLUE, that unifies existing automatically-labeled datasets and includes a new dataset of manual annotations covering diverse papers from 19 scientific disciplines.
2 code implementations • 17 Jan 2021 • Daniel Khashabi, Gabriel Stanovsky, Jonathan Bragg, Nicholas Lourie, Jungo Kasai, Yejin Choi, Noah A. Smith, Daniel S. Weld
While often assumed to be a gold standard, effective human evaluation of text generation remains an important, open area for research.
1 code implementation • ACL 2021 • Tongshuang Wu, Marco Tulio Ribeiro, Jeffrey Heer, Daniel S. Weld
While counterfactual examples are useful for analysis and training of NLP models, current generation methods either rely on manual labor to create very few counterfactuals, or only instantiate limited types of perturbations such as paraphrases or word substitutions.
1 code implementation • EMNLP (sdp) 2020 • Dongyeop Kang, Andrew Head, Risham Sidhu, Kyle Lo, Daniel S. Weld, Marti A. Hearst
Based on this analysis, we develop a new definition detection system, HEDDEx, that utilizes syntactic features, transformer encoders, and heuristic filters, and evaluate it on a standard sentence-level benchmark.
1 code implementation • 29 Sep 2020 • Andrew Head, Kyle Lo, Dongyeop Kang, Raymond Fok, Sam Skjonsberg, Daniel S. Weld, Marti A. Hearst
We introduce ScholarPhi, an augmented reading interface with four novel features: (1) tooltips that surface position-sensitive definitions from elsewhere in a paper, (2) a filter over the paper that "declutters" it to reveal how the term or symbol is used across the paper, (3) automatic equation diagrams that expose multiple definitions in parallel, and (4) an automatically generated glossary of important terms and symbols.
no code implementations • 26 Jun 2020 • Gagan Bansal, Tongshuang Wu, Joyce Zhou, Raymond Fok, Besmira Nushi, Ece Kamar, Marco Tulio Ribeiro, Daniel S. Weld
However, prior studies observed improvements from explanations only when the AI, alone, outperformed both the human and the best team.
1 code implementation • 11 Jun 2020 • Daniel King, Doug Downey, Daniel S. Weld
From a corpus of computer science papers on arXiv, we find that our method achieves a Precision@1000 of 99%, compared to 86% for prior work, and a substantially better precision-yield trade-off across the top 15,000 extractions.
no code implementations • EMNLP 2020 • Tom Hope, Jason Portenoy, Kishore Vasan, Jonathan Borchardt, Eric Horvitz, Daniel S. Weld, Marti A. Hearst, Jevin West
The COVID-19 pandemic has sparked unprecedented mobilization of scientists, generating a deluge of papers that makes it hard for researchers to keep track and explore new directions.
2 code implementations • 4 May 2020 • Benjamin Charles Germain Lee, Jaime Mears, Eileen Jakeway, Meghan Ferriter, Chris Adams, Nathan Yarasavage, Deborah Thomas, Kate Zwaard, Daniel S. Weld
We report the results of running the pipeline on 16.3 million pages from the Chronicling America corpus and describe the resulting Newspaper Navigator dataset, the largest dataset of extracted visual content from historic newspapers ever produced.
4 code implementations • Findings of the Association for Computational Linguistics 2020 • Isabel Cachola, Kyle Lo, Arman Cohan, Daniel S. Weld
We introduce TLDR generation, a new form of extreme summarization, for scientific papers.
no code implementations • 27 Apr 2020 • Gagan Bansal, Besmira Nushi, Ece Kamar, Eric Horvitz, Daniel S. Weld
To optimize team performance in this setting, we maximize the team's expected utility, expressed in terms of the quality of the final decision, the cost of verifying, and the individual accuracies of people and machines.
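The trade-off in that objective can be made concrete with a small worked example. The utility model below is a simplified, hypothetical instance (the paper's formulation is more general): with some probability the human verifies and their answer stands, at a cost; otherwise the AI's answer stands.

```python
def expected_utility(acc_ai, acc_human, reward, verify_cost, p_verify):
    """Expected utility of a human-AI team under a simple model:
    with probability p_verify the human checks the AI and the human's
    answer stands (paying verify_cost); otherwise the AI's answer stands."""
    accept = (1 - p_verify) * acc_ai * reward
    verify = p_verify * (acc_human * reward - verify_cost)
    return accept + verify

# With cheap verification and a more accurate human, always verifying
# yields higher expected utility than always trusting the AI.
u_trust = expected_utility(0.8, 0.95, reward=1.0, verify_cost=0.05, p_verify=0.0)
u_check = expected_utility(0.8, 0.95, reward=1.0, verify_cost=0.05, p_verify=1.0)
print(u_trust, u_check)
```

Raising `verify_cost` above the human's accuracy advantage (here, 0.15) flips the comparison, which is exactly the kind of trade-off the expected-utility objective captures.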
4 code implementations • ACL 2020 • Lucy Lu Wang, Kyle Lo, Yoganand Chandrasekhar, Russell Reas, Jiangjiang Yang, Doug Burdick, Darrin Eide, Kathryn Funk, Yannis Katsis, Rodney Kinney, Yunyao Li, Ziyang Liu, William Merrill, Paul Mooney, Dewey Murdick, Devvret Rishi, Jerry Sheehan, Zhihong Shen, Brandon Stilson, Alex Wade, Kuansan Wang, Nancy Xin Ru Wang, Chris Wilhelm, Boya Xie, Douglas Raymond, Daniel S. Weld, Oren Etzioni, Sebastian Kohlmeier
The COVID-19 Open Research Dataset (CORD-19) is a growing resource of scientific papers on COVID-19 and related historical coronavirus research.
5 code implementations • ACL 2020 • Arman Cohan, Sergey Feldman, Iz Beltagy, Doug Downey, Daniel S. Weld
We propose SPECTER, a new method to generate document-level embedding of scientific documents based on pretraining a Transformer language model on a powerful signal of document-level relatedness: the citation graph.
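The citation-graph training signal reduces to a triplet objective: a paper's embedding is pulled toward papers it cites and pushed away from uncited ones. The sketch below shows that loss on toy 2-D vectors; it is a minimal illustration, not SPECTER's Transformer-based implementation, and the margin and vectors are made up.

```python
def triplet_loss(anchor, pos, neg, margin=1.0):
    """Citation-based triplet objective (sketch): the anchor paper should be
    closer to a cited paper (pos) than to an uncited one (neg), by a margin."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return max(0.0, sqdist(anchor, pos) - sqdist(anchor, neg) + margin)

query = [0.0, 0.0]
cited = [0.1, 0.0]    # a paper the query cites
uncited = [2.0, 2.0]  # a paper with no citation link
print(triplet_loss(query, cited, uncited))
```

A triplet that already satisfies the margin incurs zero loss; swapping the positive and negative produces a large loss, which is the gradient signal that shapes the embedding space.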
Ranked #1 on Document Classification on SciDocs (MAG)
1 code implementation • 9 Mar 2020 • Benjamin Charles Germain Lee, Doug Downey, Kyle Lo, Daniel S. Weld
We show our method improves accuracy compared to a rigorous baseline on the image classification domains.
1 code implementation • IJCNLP 2019 • Arman Cohan, Iz Beltagy, Daniel King, Bhavana Dalvi, Daniel S. Weld
As a step toward better document-level understanding, we explore classification of a sequence of sentences into their corresponding categories, a task that requires understanding sentences in the context of the document.
2 code implementations • IJCNLP 2019 • Mandar Joshi, Omer Levy, Daniel S. Weld, Luke Zettlemoyer
We apply BERT to coreference resolution, achieving strong improvements on the OntoNotes (+3.9 F1) and GAP (+11.5 F1) benchmarks.
Ranked #11 on Coreference Resolution on CoNLL 2012 (using extra training data)
6 code implementations • TACL 2020 • Mandar Joshi, Danqi Chen, Yinhan Liu, Daniel S. Weld, Luke Zettlemoyer, Omer Levy
We present SpanBERT, a pre-training method that is designed to better represent and predict spans of text.
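SpanBERT's key pre-training change is masking contiguous spans rather than independent tokens, with span lengths drawn from a geometric distribution. The sketch below illustrates that sampling procedure; the parameter values (`p`, `max_len`, `mask_rate`) are illustrative assumptions, not the paper's exact hyperparameters.

```python
import random

def geometric_length(rng, p=0.2, max_len=10):
    # Sample a span length ~ Geometric(p), clipped at max_len.
    length = 1
    while rng.random() > p and length < max_len:
        length += 1
    return length

def span_mask(n_tokens, mask_rate=0.15, seed=0):
    """SpanBERT-style masking sketch: mask contiguous spans until roughly
    mask_rate of the positions are covered."""
    rng = random.Random(seed)
    budget = max(1, int(n_tokens * mask_rate))
    masked = set()
    while len(masked) < budget:
        length = geometric_length(rng)
        start = rng.randrange(n_tokens)
        masked.update(range(start, min(start + length, n_tokens)))
    return sorted(masked)

print(span_mask(100))  # masked positions cluster into contiguous runs
```

Masking runs of tokens forces the model to predict whole spans from their boundaries, which is what makes the representations useful for span-selection tasks like question answering and coreference.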
Ranked #1 on Question Answering on NewsQA (F1 metric)
3 code implementations • NAACL 2019 • Mandar Joshi, Eunsol Choi, Omer Levy, Daniel S. Weld, Luke Zettlemoyer
Reasoning about implied relationships (e.g., paraphrastic, common sense, encyclopedic) between pairs of words is crucial for many cross-sentence inference problems.
no code implementations • NAACL 2018 • James Ferguson, Colin Lockard, Daniel S. Weld, Hannaneh Hajishirzi
Supervised event extraction systems are limited in their accuracy due to the lack of available training data.
1 code implementation • 26 Mar 2018 • Ziyu Yao, Daniel S. Weld, Wei-Peng Chen, Huan Sun
In this paper, we investigate a new problem of systematically mining question-code pairs from Stack Overflow (in contrast to heuristically collecting them).
no code implementations • 9 Mar 2018 • Daniel S. Weld, Gagan Bansal
Since Artificial Intelligence (AI) software uses techniques like deep lookahead search and stochastic optimization of huge neural networks to fit mammoth datasets, it often exhibits complex behavior that is difficult for people to understand.
3 code implementations • ACL 2017 • Mandar Joshi, Eunsol Choi, Daniel S. Weld, Luke Zettlemoyer
We present TriviaQA, a challenging reading comprehension dataset containing over 650K question-answer-evidence triples.
no code implementations • 31 Aug 2016 • Christopher H. Lin, Mausam, Daniel S. Weld
We present POAPS, a novel planning system for defining Partially Observable Markov Decision Processes (POMDPs) that abstracts away from POMDP details for the benefit of non-expert practitioners.
no code implementations • 21 Jun 2015 • Raphael Hoffmann, Luke Zettlemoyer, Daniel S. Weld
Information Extraction (IE) aims to automatically generate a large knowledge base from natural language text, but progress remains slow.
no code implementations • TACL 2015 • Xiao Ling, Sameer Singh, Daniel S. Weld
Recent research on entity linking (EL) has introduced a plethora of promising techniques, ranging from deep neural networks to joint inference.