Search Results for author: Lucy Lu Wang

Found 27 papers, 19 papers with code

S2ORC: The Semantic Scholar Open Research Corpus

2 code implementations • ACL 2020 • Kyle Lo, Lucy Lu Wang, Mark Neumann, Rodney Kinney, Dan S. Weld

We introduce S2ORC, a large corpus of 81. 1M English-language academic papers spanning many academic disciplines.

736

Paper
Code

Fact or Fiction: Verifying Scientific Claims

2 code implementations • EMNLP 2020 • David Wadden, Shanchuan Lin, Kyle Lo, Lucy Lu Wang, Madeleine van Zuylen, Arman Cohan, Hannaneh Hajishirzi

We introduce scientific claim verification, a new task to select abstracts from the research literature containing evidence that SUPPORTS or REFUTES a given scientific claim, and to identify rationales justifying each decision.

Claim Verification Domain Adaptation +1

211

Paper
Code

VILA: Improving Structured Content Extraction from Scientific PDFs Using Visual Layout Groups

1 code implementation • 1 Jun 2021 • Zejiang Shen, Kyle Lo, Lucy Lu Wang, Bailey Kuehl, Daniel S. Weld, Doug Downey

Experiments are conducted on a newly curated evaluation suite, S2-VLUE, that unifies existing automatically-labeled datasets and includes a new dataset of manual annotations covering diverse papers from 19 scientific disciplines.

Language Modelling Text Classification +2

153

Paper
Code

CORD-19: The COVID-19 Open Research Dataset

4 code implementations • ACL 2020 • Lucy Lu Wang, Kyle Lo, Yoganand Chandrasekhar, Russell Reas, Jiangjiang Yang, Doug Burdick, Darrin Eide, Kathryn Funk, Yannis Katsis, Rodney Kinney, Yunyao Li, Ziyang Liu, William Merrill, Paul Mooney, Dewey Murdick, Devvret Rishi, Jerry Sheehan, Zhihong Shen, Brandon Stilson, Alex Wade, Kuansan Wang, Nancy Xin Ru Wang, Chris Wilhelm, Boya Xie, Douglas Raymond, Daniel S. Weld, Oren Etzioni, Sebastian Kohlmeier

The COVID-19 Open Research Dataset (CORD-19) is a growing resource of scientific papers on COVID-19 and related historical coronavirus research.

Information Retrieval Management +1

147

Paper
Code

MedICaT: A Dataset of Medical Images, Captions, and Textual References

1 code implementation • Findings of the Association for Computational Linguistics 2020 • Sanjay Subramanian, Lucy Lu Wang, Sachin Mehta, Ben Bogin, Madeleine van Zuylen, Sravanthi Parasa, Sameer Singh, Matt Gardner, Hannaneh Hajishirzi

To address challenges in figure retrieval and figure-to-text alignment, we introduce MedICaT, a dataset of medical images in context.

document understanding Image-text matching +2

Paper
Code

Ontology Alignment in the Biomedical Domain Using Entity Definitions and Context

1 code implementation • WS 2018 • Lucy Lu Wang, Chandra Bhagavatula, Mark Neumann, Kyle Lo, Chris Wilhelm, Waleed Ammar

Ontology alignment is the task of identifying semantically equivalent entities from two given ontologies.

Paper
Code

MS2: Multi-Document Summarization of Medical Studies

2 code implementations • 13 Apr 2021 • Jay DeYoung, Iz Beltagy, Madeleine van Zuylen, Bailey Kuehl, Lucy Lu Wang

In support of this goal, we release MS^2 (Multi-Document Summarization of Medical Studies), a dataset of over 470k documents and 20k summaries derived from the scientific literature.

Document Summarization Multi-Document Summarization

Paper
Code

MultiVerS: Improving scientific claim verification with weak supervision and full-document context

3 code implementations • Findings (NAACL) 2022 • David Wadden, Kyle Lo, Lucy Lu Wang, Arman Cohan, Iz Beltagy, Hannaneh Hajishirzi

Our approach outperforms two competitive baselines on three scientific claim verification datasets, with particularly strong performance in zero / few-shot domain adaptation experiments.

Claim Verification Domain Adaptation +2

Paper
Code

Generating Scientific Claims for Zero-Shot Scientific Fact Checking

1 code implementation • ACL 2022 • Dustin Wright, David Wadden, Kyle Lo, Bailey Kuehl, Arman Cohan, Isabelle Augenstein, Lucy Lu Wang

To address this challenge, we propose scientific claim generation, the task of generating one or more atomic and verifiable claims from scientific sentences, and demonstrate its usefulness in zero-shot fact checking for biomedical claims.

Fact Checking Negation

Paper
Code

Overview of MSLR2022: A Shared Task on Multi-document Summarization for Literature Reviews

1 code implementation • sdp (COLING) 2022 • Lucy Lu Wang, Jay DeYoung, Byron Wallace

We provide an overview of the MSLR2022 shared task on multi-document summarization for literature reviews.

Document Summarization Multi-Document Summarization

Paper
Code

Literature-Augmented Clinical Outcome Prediction

1 code implementation • Findings (NAACL) 2022 • Aakanksha Naik, Sravanthi Parasa, Sergey Feldman, Lucy Lu Wang, Tom Hope

We present BEEP (Biomedical Evidence-Enhanced Predictions), a novel approach for clinical outcome prediction that retrieves patient-specific medical literature and incorporates it into predictive models.

Decision Making

Paper
Code

SciFact-Open: Towards open-domain scientific claim verification

1 code implementation • 25 Oct 2022 • David Wadden, Kyle Lo, Bailey Kuehl, Arman Cohan, Iz Beltagy, Lucy Lu Wang, Hannaneh Hajishirzi

While research on scientific claim verification has led to the development of powerful systems that appear to approach human performance, these approaches have yet to be tested in a realistic setting against large corpora of scientific literature.

Claim Verification Information Retrieval +1

Paper
Code

The Semantic Scholar Open Data Platform

1 code implementation • 24 Jan 2023 • Rodney Kinney, Chloe Anastasiades, Russell Authur, Iz Beltagy, Jonathan Bragg, Alexandra Buraczynski, Isabel Cachola, Stefan Candra, Yoganand Chandrasekhar, Arman Cohan, Miles Crawford, Doug Downey, Jason Dunkelberger, Oren Etzioni, Rob Evans, Sergey Feldman, Joseph Gorney, David Graham, Fangzhou Hu, Regan Huff, Daniel King, Sebastian Kohlmeier, Bailey Kuehl, Michael Langan, Daniel Lin, Haokun Liu, Kyle Lo, Jaron Lochner, Kelsey MacMillan, Tyler Murray, Chris Newell, Smita Rao, Shaurya Rohatgi, Paul Sayre, Zejiang Shen, Amanpreet Singh, Luca Soldaini, Shivashankar Subramanian, Amber Tanaka, Alex D. Wade, Linda Wagner, Lucy Lu Wang, Chris Wilhelm, Caroline Wu, Jiangjiang Yang, Angele Zamarron, Madeleine van Zuylen, Daniel S. Weld

The volume of scientific output is creating an urgent need for automated tools to help scientists keep up with developments in their field.

graph construction

Paper
Code

Paper Plain: Making Medical Research Papers Approachable to Healthcare Consumers with Natural Language Processing

1 code implementation • 28 Feb 2022 • Tal August, Lucy Lu Wang, Jonathan Bragg, Marti A. Hearst, Andrew Head, Kyle Lo

When seeking information not covered in patient-friendly documents, like medical pamphlets, healthcare consumers may turn to the research literature.

Paper
Code

SUPP.AI: Finding Evidence for Supplement-Drug Interactions

1 code implementation • ACL 2020 • Lucy Lu Wang, Oyvind Tafjord, Arman Cohan, Sarthak Jain, Sam Skjonsberg, Carissa Schoenick, Nick Botner, Waleed Ammar

We fine-tune the contextualized word representations of the RoBERTa language model using labeled DDI data, and apply the fine-tuned model to identify supplement interactions.

General Classification Language Modelling

Paper
Code

Automated Metrics for Medical Multi-Document Summarization Disagree with Human Evaluations

1 code implementation • 23 May 2023 • Lucy Lu Wang, Yulia Otmakhova, Jay DeYoung, Thinh Hung Truong, Bailey E. Kuehl, Erin Bransom, Byron C. Wallace

We analyze how automated summarization evaluation metrics correlate with lexical features of generated summaries, to other automated metrics including several we propose in this work, and to aspects of human-assessed summary quality.

Document Summarization Multi-Document Summarization

Paper
Code

NLP for Maternal Healthcare: Perspectives and Guiding Principles in the Age of LLMs

1 code implementation • 19 Dec 2023 • Maria Antoniak, Aakanksha Naik, Carla S. Alvarado, Lucy Lu Wang, Irene Y. Chen

Ethical frameworks for the use of natural language processing (NLP) are urgently needed to shape how large language models (LLMs) and similar tools are used for healthcare applications.

Chatbot

Paper
Code

Construction of the Literature Graph in Semantic Scholar

no code implementations • NAACL 2018 • Waleed Ammar, Dirk Groeneveld, Chandra Bhagavatula, Iz Beltagy, Miles Crawford, Doug Downey, Jason Dunkelberger, Ahmed Elgohary, Sergey Feldman, Vu Ha, Rodney Kinney, Sebastian Kohlmeier, Kyle Lo, Tyler Murray, Hsu-Han Ooi, Matthew Peters, Joanna Power, Sam Skjonsberg, Lucy Lu Wang, Chris Wilhelm, Zheng Yuan, Madeleine van Zuylen, Oren Etzioni

We describe a deployed scalable system for organizing published scientific literature into a heterogeneous graph to facilitate algorithmic manipulation and discovery.

Entity Extraction using GAN graph construction

Paper
Add Code

TREC-COVID: Constructing a Pandemic Information Retrieval Test Collection

no code implementations • 9 May 2020 • Ellen Voorhees, Tasmeer Alam, Steven Bedrick, Dina Demner-Fushman, William R Hersh, Kyle Lo, Kirk Roberts, Ian Soboroff, Lucy Lu Wang

TREC-COVID is a community evaluation designed to build a test collection that captures the information needs of biomedical researchers using the scientific literature during a pandemic.

Information Retrieval Retrieval

Paper
Add Code

Searching for Scientific Evidence in a Pandemic: An Overview of TREC-COVID

no code implementations • 19 Apr 2021 • Kirk Roberts, Tasmeer Alam, Steven Bedrick, Dina Demner-Fushman, Kyle Lo, Ian Soboroff, Ellen Voorhees, Lucy Lu Wang, William R Hersh

We present an overview of the TREC-COVID Challenge, an information retrieval (IR) shared task to evaluate search on scientific literature related to COVID-19.

Information Retrieval Retrieval

Paper
Add Code

Overview of the Third Workshop on Scholarly Document Processing

no code implementations • sdp (COLING) 2022 • Arman Cohan, Guy Feigenblat, Dayne Freitag, Tirthankar Ghosal, Drahomira Herrmannova, Petr Knoth, Kyle Lo, Philipp Mayr, Michal Shmueli-Scheuer, Anita de Waard, Lucy Lu Wang

With the ever-increasing pace of research and high volume of scholarly communication, scholars face a daunting task.

Document Summarization Graph Generation +4

Paper
Add Code

Open Domain Multi-document Summarization: A Comprehensive Study of Model Brittleness under Retrieval

no code implementations • 20 Dec 2022 • John Giorgi, Luca Soldaini, Bo wang, Gary Bader, Kyle Lo, Lucy Lu Wang, Arman Cohan

Via extensive automatic and human evaluation, we determine: (1) state-of-the-art summarizers suffer large reductions in performance when applied to open-domain MDS, (2) additional training in the open-domain setting can reduce this sensitivity to imperfect retrieval, and (3) summarizers are insensitive to the retrieval of duplicate documents and the order of retrieved documents, but highly sensitive to other errors, like the retrieval of irrelevant documents.

Ranked #1 on Multi-Document Summarization on MS^2

Document Summarization Multi-Document Summarization +1

Paper
Add Code

The Semantic Reader Project: Augmenting Scholarly Documents through AI-Powered Interactive Reading Interfaces

no code implementations • 25 Mar 2023 • Kyle Lo, Joseph Chee Chang, Andrew Head, Jonathan Bragg, Amy X. Zhang, Cassidy Trier, Chloe Anastasiades, Tal August, Russell Authur, Danielle Bragg, Erin Bransom, Isabel Cachola, Stefan Candra, Yoganand Chandrasekhar, Yen-Sung Chen, Evie Yu-Yen Cheng, Yvonne Chou, Doug Downey, Rob Evans, Raymond Fok, Fangzhou Hu, Regan Huff, Dongyeop Kang, Tae Soo Kim, Rodney Kinney, Aniket Kittur, Hyeonsu Kang, Egor Klevak, Bailey Kuehl, Michael Langan, Matt Latzke, Jaron Lochner, Kelsey MacMillan, Eric Marsh, Tyler Murray, Aakanksha Naik, Ngoc-Uyen Nguyen, Srishti Palani, Soya Park, Caroline Paulic, Napol Rachatasumrit, Smita Rao, Paul Sayre, Zejiang Shen, Pao Siangliulue, Luca Soldaini, Huy Tran, Madeleine van Zuylen, Lucy Lu Wang, Christopher Wilhelm, Caroline Wu, Jiangjiang Yang, Angele Zamarron, Marti A. Hearst, Daniel S. Weld

Scholarly publications are key to the transfer of knowledge from scholars to others.

Paper
Add Code

APPLS: Evaluating Evaluation Metrics for Plain Language Summarization

1 code implementation • 23 May 2023 • Yue Guo, Tal August, Gondy Leroy, Trevor Cohen, Lucy Lu Wang

In response, we introduce POMME, a new metric designed to assess text simplification in PLS; the metric is calculated as the normalized perplexity difference between an in-domain and out-of-domain language model.

Informativeness Language Modelling +2

Paper
Code

The Rise of Open Science: Tracking the Evolution and Perceived Value of Data and Methods Link-Sharing Practices

1 code implementation • 4 Oct 2023 • Hancheng Cao, Jesse Dodge, Kyle Lo, Daniel A. McFarland, Lucy Lu Wang

In recent years, funding agencies and journals increasingly advocate for open science practices (e. g. data and method sharing) to improve the transparency, access, and reproducibility of science.

Math text-classification +1

Paper
Code

Personalized Jargon Identification for Enhanced Interdisciplinary Communication

no code implementations • 16 Nov 2023 • Yue Guo, Joseph Chee Chang, Maria Antoniak, Erin Bransom, Trevor Cohen, Lucy Lu Wang, Tal August

We collect a dataset of over 10K term familiarity annotations from 11 computer science researchers for terms drawn from 100 paper abstracts.

Paper
Add Code

From Paper to Card: Transforming Design Implications with Generative AI

no code implementations • 12 Mar 2024 • Donghoon Shin, Lucy Lu Wang, Gary Hsieh

Communicating design implications is common within the HCI community when publishing academic papers, yet these papers are rarely read and used by designers.

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.