Search Results for author: Zonghai Yao

Found 33 papers, 19 papers with code

The impact of preprint servers in the formation of novel ideas

1 code implementation EMNLP (sdp) 2020 Swarup Satish, Zonghai Yao, Andrew Drozdov, Boris Veytsman

We study whether novel ideas in biomedical literature appear first in preprints or traditional journals.

RARE: Retrieval-Augmented Reasoning Enhancement for Large Language Models

no code implementations 3 Dec 2024 Hieu Tran, Zonghai Yao, Junda Wang, Yifan Zhang, Zhichao Yang, Hong Yu

This work introduces RARE (Retrieval-Augmented Reasoning Enhancement), a versatile extension to the mutual reasoning framework (rStar), aimed at enhancing reasoning accuracy and factual integrity across large language models (LLMs) for complex, knowledge-intensive tasks such as commonsense and medical reasoning.

Information Retrieval, Retrieval

RiTeK: A Dataset for Large Language Models Complex Reasoning over Textual Knowledge Graphs

no code implementations 17 Oct 2024 Jiatan Huang, Mingchen Li, Zonghai Yao, Zhichao Yang, Yongkang Xiao, Feiyun Ouyang, Xiaohan Li, Shuo Han, Hong Yu

To tackle these challenges, we first develop a Dataset for LLMs Complex Reasoning over Textual Knowledge Graphs (RiTeK) with a broad topological structure coverage. We synthesize realistic user queries that integrate diverse topological structures, relational information, and complex textual descriptions.

Knowledge Graphs, Retrieval

Blocks Architecture (BloArk): Efficient, Cost-Effective, and Incremental Dataset Architecture for Wikipedia Revision History

no code implementations 6 Oct 2024 Lingxi Li, Zonghai Yao, Sunjae Kwon, Hong Yu

Wikipedia (Wiki) is one of the most widely used and publicly available resources for natural language processing (NLP) applications.

CS4: Measuring the Creativity of Large Language Models Automatically by Controlling the Number of Story-Writing Constraints

1 code implementation 5 Oct 2024 Anirudh Atmakuru, Jatin Nainani, Rohith Siddhartha Reddy Bheemreddy, Anirudh Lakkaraju, Zonghai Yao, Hamed Zamani, Haw-Shiuan Chang

Additionally, our experiments on OLMo suggest that Learning from Human Feedback (LHF) can help LLMs select better stories from their training data but has limited influence in boosting LLMs' ability to produce creative stories that are unseen in the training corpora.

Instruction Following, Specificity

MedQA-CS: Benchmarking Large Language Models Clinical Skills Using an AI-SCE Framework

1 code implementation 2 Oct 2024 Zonghai Yao, Zihao Zhang, Chaolong Tang, Xingyu Bian, Youxia Zhao, Zhichao Yang, Junda Wang, Huixue Zhou, Won Seok Jang, Feiyun Ouyang, Hong Yu

Combined with existing benchmarks, MedQA-CS enables a more comprehensive evaluation of LLMs' clinical capabilities for both open- and closed-source LLMs.

Benchmarking, Instruction Following +2

Large Language Model-based Role-Playing for Personalized Medical Jargon Extraction

no code implementations 10 Aug 2024 Jung Hoon Lim, Sunjae Kwon, Zonghai Yao, John P. Lalor, Hong Yu

Previous studies reveal that Electronic Health Records (EHRs), which have been widely adopted in the U.S. to allow patients to access their personal medical information, have low readability for patients due to the prevalence of medical jargon.

In-Context Learning, Language Modeling +3

ReadCtrl: Personalizing text generation with readability-controlled instruction learning

no code implementations 13 Jun 2024 Hieu Tran, Zonghai Yao, Lingxi Li, Hong Yu

In an era of large language models (LLMs), readability-controlled text generation based on LLMs has become increasingly important.

Text Generation

JMLR: Joint Medical LLM and Retrieval Training for Enhancing Reasoning and Professional Question Answering Capability

1 code implementation 27 Feb 2024 Junda Wang, Zhichao Yang, Zonghai Yao, Hong Yu

Through this work, we provide a new and efficient knowledge enhancement method for healthcare, demonstrating the potential of integrating retrieval and LLM training for medical question-answering systems.

Information Retrieval, Question Answering +2

LocalTweets to LocalHealth: A Mental Health Surveillance Framework Based on Twitter Data

no code implementations 21 Feb 2024 Vijeta Deshpande, Minhwa Lee, Zonghai Yao, Zihao Zhang, Jason Brian Gibbons, Hong Yu

Prior research on Twitter (now X) data has provided positive evidence of its utility in developing supplementary health surveillance systems.

SYNFAC-EDIT: Synthetic Imitation Edit Feedback for Factual Alignment in Clinical Summarization

1 code implementation 21 Feb 2024 Prakamya Mishra, Zonghai Yao, Parth Vashisht, Feiyun Ouyang, Beining Wang, Vidhi Dhaval Mody, Hong Yu

Large Language Models (LLMs) such as GPT & Llama have demonstrated significant achievements in summarization tasks but struggle with factual inaccuracies, a critical issue in clinical NLP applications where errors could lead to serious consequences.

EHR Interaction Between Patients and AI: NoteAid EHR Interaction

no code implementations 29 Dec 2023 Xiaocheng Zhang, Zonghai Yao, Hong Yu

Through a comprehensive evaluation of the entire dataset using LLM assessment and a rigorous manual evaluation of 64 instances, we showcase the potential of LLMs in patient education.

README: Bridging Medical Jargon and Lay Understanding for Patient Education through Data-Centric NLP

1 code implementation 24 Dec 2023 Zonghai Yao, Nandyala Siddharth Kantu, Guanghao Wei, Hieu Tran, Zhangqi Duan, Sunjae Kwon, Zhichao Yang, README annotation team, Hong Yu

The advancement in healthcare has shifted focus toward patient-centric approaches, particularly in self-care and patient education, facilitated by access to Electronic Health Records (EHR).

Do Physicians Know How to Prompt? The Need for Automatic Prompt Optimization Help in Clinical Note Generation

1 code implementation 16 Nov 2023 Zonghai Yao, Ahmed Jaafar, Beining Wang, Zhichao Yang, Hong Yu

We recommend a two-phase optimization process, leveraging APO-GPT4 for consistency and expert input for personalization.

Prompt Engineering

Large Language Models are In-context Teachers for Knowledge Reasoning

no code implementations 12 Nov 2023 Jiachen Zhao, Zonghai Yao, Zhichao Yang, Hong Yu

Furthermore, we reveal that for ICT, rationales from teacher LLMs or human experts that more closely resemble the student LLM's own self-explanations make better in-context demonstrations.

In-Context Learning, Information Retrieval +4

EHRTutor: Enhancing Patient Understanding of Discharge Instructions

no code implementations 30 Oct 2023 Zihao Zhang, Zonghai Yao, Huixue Zhou, Feiyun Ouyang, Hong Yu

This paper presents EHRTutor, an innovative multi-component framework leveraging the Large Language Model (LLM) for patient education through conversational question-answering.

Conversational Question Answering, Language Modeling +2

Synthetic Imitation Edit Feedback for Factual Alignment in Clinical Summarization

1 code implementation 30 Oct 2023 Prakamya Mishra, Zonghai Yao, Shuwei Chen, Beining Wang, Rohan Mittal, Hong Yu

In this work, we propose a new pipeline using ChatGPT instead of human experts to generate high-quality feedback data for improving factual consistency in the clinical note summarization task.

Hallucination

NoteChat: A Dataset of Synthetic Doctor-Patient Conversations Conditioned on Clinical Notes

1 code implementation 24 Oct 2023 Junda Wang, Zonghai Yao, Zhichao Yang, Huixue Zhou, Rumeng Li, Xun Wang, Yucheng Xu, Hong Yu

We introduce NoteChat, a novel cooperative multi-agent framework leveraging Large Language Models (LLMs) to generate patient-physician dialogues.

Dialogue Generation

Improving Summarization with Human Edits

2 code implementations 9 Oct 2023 Zonghai Yao, Benjamin J Schloss, Sai P. Selvaraj

Existing works use human feedback to train large language models (LLMs) in general domain abstractive summarization and have obtained summary quality exceeding traditional likelihood training.

Abstractive Text Summarization

PaniniQA: Enhancing Patient Education Through Interactive Question Answering

1 code implementation 7 Aug 2023 Pengshan Cai, Zonghai Yao, Fei Liu, Dakuo Wang, Meghan Reilly, Huixue Zhou, Lingxi Li, Yi Cao, Alok Kapoor, Adarsha Bajracharya, Dan Berlowitz, Hong Yu

Patient portals allow discharged patients to access their personalized discharge instructions in electronic health records (EHRs).

Question Answering

UMASS_BioNLP at MEDIQA-Chat 2023: Can LLMs generate high-quality synthetic note-oriented doctor-patient conversations?

1 code implementation 29 Jun 2023 Junda Wang, Zonghai Yao, Avijit Mitra, Samuel Osebe, Zhichao Yang, Hong Yu

This paper presents the UMASS_BioNLP team's participation in the MEDIQA-Chat 2023 shared task (Task-A and Task-C). We focus especially on Task-C and propose a novel LLM cooperation system, a doctor-patient loop, to generate high-quality synthetic conversation datasets.

Revisiting the Architectures like Pointer Networks to Efficiently Improve the Next Word Distribution, Summarization Factuality, and Beyond

1 code implementation 20 May 2023 Haw-Shiuan Chang, Zonghai Yao, Alolika Gon, Hong Yu, Andrew McCallum

Is the output softmax layer, which is adopted by most language models (LMs), always the best way to compute the next word probability?

Automated Identification of Eviction Status from Electronic Health Record Notes

1 code implementation 6 Dec 2022 Zonghai Yao, Jack Tsai, Weisong Liu, David A. Levy, Emily Druhl, Joel I Reisman, Hong Yu

Materials and Methods: We first defined eviction status (eviction presence and eviction period) and then annotated eviction status in 5000 EHR notes from the Veterans Health Administration (VHA).

Multi-label Few-shot ICD Coding as Autoregressive Generation with Prompt

1 code implementation 24 Nov 2022 Zhichao Yang, Sunjae Kwon, Zonghai Yao, Hong Yu

This task is challenging due to the high-dimensional multi-label assignment space (155,000+ ICD code candidates) and the long-tail challenge: many ICD codes are assigned infrequently, yet these infrequent codes are clinically important (see the sketch after this entry).

Multi-Label Classification
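A minimal, illustrative sketch of the framing named in the title, multi-label ICD coding cast as autoregressive generation with a prompt: a language model is prompted with a clinical note, generates a code list, and the output is parsed into a label set. The checkpoint (gpt2), prompt template, toy note, and parsing regex below are assumptions for illustration, not the authors' actual model, data, or prompts.

```python
# Minimal sketch (not the paper's implementation): multi-label ICD coding
# framed as autoregressive generation with a prompt. The checkpoint, prompt
# template, toy note, and parsing regex are illustrative assumptions.
import re
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")  # placeholder checkpoint

note = "Admitted with community-acquired pneumonia and type 2 diabetes mellitus."  # toy note
prompt = f"Clinical note: {note}\nAssigned ICD-10 codes:"

# Greedy decoding; the model continues the prompt with a free-text code list.
output = generator(prompt, max_new_tokens=32, do_sample=False)[0]["generated_text"]
completion = output[len(prompt):]

# Parse the generated text into a set of ICD-10-like codes (letter + digits, optional dot).
codes = set(re.findall(r"\b[A-TV-Z]\d{2}(?:\.\d{1,4})?\b", completion))
print(codes)
```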

Context Variance Evaluation of Pretrained Language Models for Prompt-based Biomedical Knowledge Probing

no code implementations 18 Nov 2022 Zonghai Yao, Yi Cao, Zhichao Yang, Hong Yu

In contrast to previous known-unknown evaluation criteria, we propose the concept of "Misunderstand" in LAMA for the first time.

Knowledge Probing

Extracting Biomedical Factual Knowledge Using Pretrained Language Model and Electronic Health Record Context

no code implementations 26 Aug 2022 Zonghai Yao, Yi Cao, Zhichao Yang, Vijeta Deshpande, Hong Yu

To bring the LMs-as-KBs setting more in line with actual biomedical application scenarios, we add EHR notes as context to the prompt to improve the lower bound in the biomedical domain (see the sketch after this entry).

Language Modeling
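A minimal sketch of the idea described above: a cloze-style (LAMA-style) knowledge probe is run with and without an EHR note prepended as context, and the top predictions are compared. The checkpoint, toy note, and probe sentence are assumptions for illustration; they are not the authors' prompts or data.

```python
# Minimal sketch (not the paper's pipeline): cloze-style biomedical knowledge
# probing with and without an EHR note prepended as context. The checkpoint,
# toy note, and probe sentence are illustrative assumptions.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")  # placeholder masked LM

probe = "Metformin is used to treat [MASK]."  # LAMA-style cloze probe
ehr_note = (
    "Assessment: type 2 diabetes mellitus, poorly controlled. "
    "Plan: start metformin 500 mg twice daily."
)  # hypothetical EHR snippet used as context

# Compare the top prediction without and with the EHR context.
for label, text in [("no context", probe), ("EHR context", f"{ehr_note} {probe}")]:
    top = fill_mask(text)[0]
    print(f"{label}: {top['token_str']} (score={top['score']:.3f})")
```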

Improving Formality Style Transfer with Context-Aware Rule Injection

no code implementations ACL 2021 Zonghai Yao, Hong Yu

Models pre-trained on large-scale regular text corpora often do not work well for user-generated data where the language styles differ significantly from the mainstream text.

Decoder, Formality Style Transfer +2
