Search Results for author: Zhiyong Lu

Found 71 papers, 27 papers with code

Measuring the relative importance of full text sections for information retrieval from scientific literature.

no code implementations NAACL (BioNLP) 2021 Lana Yeganova, Won Gyu Kim, Donald Comeau, W John Wilbur, Zhiyong Lu

In this work we establish the connection between the BM25 score of a query term appearing in a section of a full text document and the probability of that document being clicked or identified as relevant.

Information Retrieval Retrieval

Automatic recognition of abdominal lymph nodes from clinical text

1 code implementation EMNLP (ClinicalNLP) 2020 Yifan Peng, SungWon Lee, Daniel C. Elton, Thomas Shen, Yu-Xing Tang, Qingyu Chen, Shuai Wang, Yingying Zhu, Ronald Summers, Zhiyong Lu

We then introduce an end-to-end approach based on the combination of rules and transformer-based methods to detect these abdominal lymph node mentions and classify their types from the MRI radiology reports.

Matching Patients to Clinical Trials with Large Language Models

no code implementations 27 Jul 2023 Qiao Jin, Zifeng Wang, Charalampos S. Floudas, Jimeng Sun, Zhiyong Lu

Second, the aggregated trial-level TrialGPT scores are highly correlated with expert eligibility annotations.

PubMed and Beyond: Biomedical Literature Search in the Age of Artificial Intelligence

no code implementations 18 Jul 2023 Qiao Jin, Robert Leaman, Zhiyong Lu

In response, we present a survey of literature search tools tailored to both general and specific information needs in biomedicine, with the objective of helping readers efficiently fulfill their information needs.

A scoping review on multimodal deep learning in biomedical images and texts

no code implementations 14 Jul 2023 Zhaoyi Sun, Mingquan Lin, Qingqing Zhu, Qianqian Xie, Fei Wang, Zhiyong Lu, Yifan Peng

In this scoping review, we aim to provide a comprehensive overview of the current state of the field and identify key concepts, types of studies, and research gaps with a focus on biomedical images and texts joint learning, mainly because these two were the most commonly available data types in MDL research.

Cross-Modal Retrieval Decision Making +5

BioCPT: Contrastive Pre-trained Transformers with Large-scale PubMed Search Logs for Zero-shot Biomedical Information Retrieval

1 code implementation 2 Jul 2023 Qiao Jin, Won Kim, Qingyu Chen, Donald C. Comeau, Lana Yeganova, John Wilbur, Zhiyong Lu

Experimental results show that BioCPT sets new state-of-the-art performance on five biomedical IR tasks, outperforming various baselines including much larger models such as GPT-3-sized cpt-text-XL.

Biomedical Information Retrieval Contrastive Learning +4

BioREx: Improving Biomedical Relation Extraction by Leveraging Heterogeneous Datasets

1 code implementation 19 Jun 2023 Po-Ting Lai, Chih-Hsuan Wei, Ling Luo, Qingyu Chen, Zhiyong Lu

State-of-the-art methods were used primarily to train machine learning models on individual RE datasets, such as protein-protein interaction and chemical-induced disease relation.

graph construction Multi-Task Learning +1

Utilizing Longitudinal Chest X-Rays and Reports to Pre-Fill Radiology Reports

no code implementations 14 Jun 2023 Qingqing Zhu, Tejas Sudharshan Mathai, Pritam Mukherjee, Yifan Peng, Ronald M. Summers, Zhiyong Lu

Pre-filling a radiology report holds promise in mitigating reporting errors, and despite efforts in the literature to generate medical reports, there exists a lack of approaches that exploit the longitudinal nature of patient visit records in the MIMIC-CXR dataset.

speech-recognition Speech Recognition

GeneGPT: Augmenting Large Language Models with Domain Tools for Improved Access to Biomedical Information

1 code implementation 19 Apr 2023 Qiao Jin, Yifan Yang, Qingyu Chen, Zhiyong Lu

In this paper, we present GeneGPT, a novel method for teaching LLMs to use the Web APIs of the National Center for Biotechnology Information (NCBI) for answering genomics questions.


LADER: Log-Augmented DEnse Retrieval for Biomedical Literature Search

no code implementations 10 Apr 2023 Qiao Jin, Andrew Shin, Zhiyong Lu

On all queries, LADER can improve the performance of a dense retriever by 24%-37% relative NDCG@10 while not requiring additional training, and further performance improvement is expected from more logs.


Bioformer: an efficient transformer language model for biomedical text mining

1 code implementation 3 Feb 2023 Li Fang, Qingyu Chen, Chih-Hsuan Wei, Zhiyong Lu, Kai Wang

We thoroughly evaluated the performance of Bioformer as well as existing biomedical BERT models including BioBERT and PubMedBERT on 15 benchmark datasets of four different biomedical NLP tasks: named entity recognition, relation extraction, question answering and document classification.

Document Classification Language Modelling +5

AIONER: All-in-one scheme-based biomedical named entity recognition using deep learning

1 code implementation 30 Nov 2022 Ling Luo, Chih-Hsuan Wei, Po-Ting Lai, Robert Leaman, Qingyu Chen, Zhiyong Lu

Biomedical named entity recognition (BioNER) seeks to automatically recognize biomedical entities in natural language text, serving as a necessary foundation for downstream text mining tasks and applications such as information extraction and question answering.

Multi-Task Learning named-entity-recognition +3

LitCovid in 2022: an information resource for the COVID-19 literature

no code implementations 27 Sep 2022 Qingyu Chen, Alexis Allot, Robert Leaman, Chih-Hsuan Wei, Elaheh Aghaarabi, John J. Guerrerio, Lilly Xu, Zhiyong Lu

LitCovid (https://www.ncbi.nlm.nih.gov/research/coronavirus/), first launched in February 2020, is a first-of-its-kind literature hub for tracking up-to-date published research on COVID-19.

Comprehensively identifying Long Covid articles with human-in-the-loop machine learning

no code implementations 16 Sep 2022 Robert Leaman, Rezarta Islamaj, Alexis Allot, Qingyu Chen, W. John Wilbur, Zhiyong Lu

A significant percentage of COVID-19 survivors experience ongoing multisystemic symptoms that often affect daily living, a condition known as Long Covid or post-acute-sequelae of SARS-CoV-2 infection.

Active Learning Specificity

Assigning Species Information to Corresponding Genes by a Sequence Labeling Framework

1 code implementation 8 May 2022 Ling Luo, Chih-Hsuan Wei, Po-Ting Lai, Qingyu Chen, Rezarta Islamaj Doğan, Zhiyong Lu

The automatic assignment of species information to the corresponding genes in a research article is a critically important step in the gene normalization task, whereby a gene mention is normalized and linked to a database record or identifier by a text-mining algorithm.

Benchmarking Binary Classification

BioRED: A Rich Biomedical Relation Extraction Dataset

1 code implementation 8 Apr 2022 Ling Luo, Po-Ting Lai, Chih-Hsuan Wei, Cecilia N Arighi, Zhiyong Lu

However, most existing benchmarking datasets for biomedical RE only focus on relations of a single type (e.g., protein-protein interactions) at the sentence level, greatly limiting the development of RE systems in biomedicine.

Benchmarking Binary Relation Extraction +1

Universal Lymph Node Detection in T2 MRI using Neural Networks

no code implementations 31 Mar 2022 Tejas Sudharshan Mathai, SungWon Lee, Thomas C. Shen, Zhiyong Lu, Ronald M. Summers

Results: Experiments on 122 test T2 MRI volumes revealed that VFNet achieved a 51.1% mAP and 78.7% recall at 4 false positives (FP) per volume, while the one-stage model ensemble achieved a mAP of 52.3% and sensitivity of 78.7% at 4 FP.

Radiology Text Analysis System (RadText): Architecture and Evaluation

1 code implementation 19 Mar 2022 Song Wang, Mingquan Lin, Ying Ding, George Shih, Zhiyong Lu, Yifan Peng

Analyzing radiology reports is a time-consuming and error-prone task, which raises the need for an efficient automated radiology report analysis system to alleviate the workloads of radiologists and encourage precise diagnosis.

De-identification named-entity-recognition +3

A Privacy-Preserving Unsupervised Domain Adaptation Framework for Clinical Text Analysis

no code implementations 18 Jan 2022 Qiyuan An, Ruijiang Li, Lin Gu, Hao Zhang, Qingyu Chen, Zhiyong Lu, Fei Wang, Yingying Zhu

To evaluate our proposed method's utility and privacy loss, we apply our model on a medical report disease label classification task using two noisy challenging clinical text datasets.

Inference Attack Membership Inference Attack +4

Perceiving and Modeling Density is All You Need for Image Dehazing

1 code implementation 18 Nov 2021 Tian Ye, Mingchao Jiang, Yunchen Zhang, Liang Chen, ErKang Chen, Pen Chen, Zhiyong Lu

However, because real captured haze varies while the degradation parameters of current networks are fixed, recent dehazing methods generalize poorly to real-world hazy images. To model real-world haze degradation, we propose perceiving and modeling density for uneven haze distribution.

Image Dehazing Single Image Dehazing

Lymph Node Detection in T2 MRI with Transformers

no code implementations 9 Nov 2021 Tejas Sudharshan Mathai, SungWon Lee, Daniel C. Elton, Thomas C. Shen, Yifan Peng, Zhiyong Lu, Ronald M. Summers

Identification of lymph nodes (LN) in T2 Magnetic Resonance Imaging (MRI) is an important step performed by radiologists during the assessment of lymphoproliferative diseases.

BERT-GT: Cross-sentence n-ary relation extraction with BERT and Graph Transformer

no code implementations 11 Jan 2021 Po-Ting Lai, Zhiyong Lu

A biomedical relation statement is commonly expressed in multiple sentences and consists of many concepts, including gene, disease, chemical, and mutation.

Benchmarking Binary Relation Extraction

A Comprehensive Dictionary and Term Variation Analysis for COVID-19 and SARS-CoV-2

1 code implementation EMNLP (NLP-COVID19) 2020 Robert Leaman, Zhiyong Lu

In this manuscript we present an extensive dictionary of terms used in the literature to refer to SARS-CoV-2 and COVID-19.

Artificial Intelligence (AI) in Action: Addressing the COVID-19 Pandemic with Natural Language Processing (NLP)

no code implementations 9 Oct 2020 Qingyu Chen, Robert Leaman, Alexis Allot, Ling Luo, Chih-Hsuan Wei, Shankai Yan, Zhiyong Lu

The COVID-19 pandemic has had a significant impact on society, both because of the serious health effects of COVID-19 and because of public health measures implemented to slow its spread.

Emotion Recognition Information Retrieval +7

PhenoTagger: A Hybrid Method for Phenotype Concept Recognition using Human Phenotype Ontology

no code implementations 17 Sep 2020 Ling Luo, Shankai Yan, Po-Ting Lai, Daniel Veltri, Andrew Oler, Sandhya Xirasagar, Rajarshi Ghosh, Morgan Similuk, Peter N. Robinson, Zhiyong Lu

In this paper, we propose PhenoTagger, a hybrid method that combines both dictionary and machine learning-based methods to recognize Human Phenotype Ontology (HPO) concepts in unstructured biomedical text.

BIG-bench Machine Learning

Navigating the landscape of COVID-19 research through literature analysis: A bird's eye view

no code implementations 7 Aug 2020 Lana Yeganova, Rezarta Islamaj, Qingyu Chen, Robert Leaman, Alexis Allot, Chin-Hsuan Wei, Donald C. Comeau, Won Kim, Yifan Peng, W. John Wilbur, Zhiyong Lu

In this study we analyze the LitCovid collection, 13,369 COVID-19 related articles found in PubMed as of May 15th, 2020, with the purpose of examining the landscape of literature and presenting it in a format that facilitates information navigation and understanding.

Clustering named-entity-recognition +2

COVID-19-CT-CXR: a freely accessible and weakly labeled chest X-ray and CT image collection on COVID-19 from biomedical literature

1 code implementation 11 Jun 2020 Yifan Peng, Yu-Xing Tang, Sung-Won Lee, Yingying Zhu, Ronald M. Summers, Zhiyong Lu

(1) We show that COVID-19-CT-CXR, when used as additional training data, is able to contribute to improved DL performance for the classification of COVID-19 and non-COVID-19 CT. (2) We collected CT images of influenza and trained a DL baseline to distinguish a diagnosis of COVID-19, influenza, or normal or other types of diseases on CT. (3) We trained an unsupervised one-class classifier from non-COVID-19 CXR and performed anomaly detection to detect COVID-19 CXR.

Anomaly Detection Computed Tomography (CT) +1

TeamTat: a collaborative text annotation tool

1 code implementation 24 Apr 2020 Rezarta Islamaj, Dongseop Kwon, Sun Kim, Zhiyong Lu

Manually annotated data is key to developing text-mining and information-extraction algorithms.

Management text annotation

BioConceptVec: creating and evaluating literature-based biomedical concept embeddings on a large scale

1 code implementation 23 Dec 2019 Qingyu Chen, Kyubum Lee, Shankai Yan, Sun Kim, Chih-Hsuan Wei, Zhiyong Lu

Capturing the semantics of related biological concepts, such as genes and mutations, is of significant importance to many research tasks in computational biology such as protein-protein interaction detection, gene-drug association prediction, and biomedical literature-based discovery.

Biomedical Mention Disambiguation using a Deep Learning Approach

no code implementations 23 Sep 2019 Chih-Hsuan Wei, Kyubum Lee, Robert Leaman, Zhiyong Lu

The priority ordering rule-based approach demonstrated F1-scores of 71.29% (micro-averaged) and 41.19% (macro-averaged), while the new disambiguation method demonstrated F1-scores of 91.94% (micro-averaged) and 85.42% (macro-averaged), a very substantial increase.

named-entity-recognition Named Entity Recognition +1

Deep learning with sentence embeddings pre-trained on biomedical corpora improves the performance of finding similar sentences in electronic medical records

no code implementations 6 Sep 2019 Qingyu Chen, Jingcheng Du, Sun Kim, W. John Wilbur, Zhiyong Lu

For the post challenge, the performance of both Random Forest and the Encoder Network was improved; in particular, the correlation of the Encoder Network was improved by ~13%.

Semantic Textual Similarity Sentence Embeddings +1

MULAN: Multitask Universal Lesion Analysis Network for Joint Lesion Detection, Tagging, and Segmentation

14 code implementations 12 Aug 2019 Ke Yan, You-Bao Tang, Yifan Peng, Veit Sandfort, Mohammadhadi Bagheri, Zhiyong Lu, Ronald M. Summers

When reading medical images such as a computed tomography (CT) scan, radiologists generally search across the image to find lesions, characterize and measure them, and then describe them in the radiological report.

Computed Tomography (CT) Lesion Detection +2

A deep learning approach for automated detection of geographic atrophy from color fundus photographs

1 code implementation 7 Jun 2019 Tiarnan D. Keenan, Shazia Dharssi, Yifan Peng, Qingyu Chen, Elvira Agrón, Wai T. Wong, Zhiyong Lu, Emily Y. Chew

Results: The deep learning models (GA detection, CGA detection from all eyes, and centrality detection from GA eyes) had AUC of 0.933-0.976, 0.939-0.976, and 0.827-0.888, respectively.


A self-attention based deep learning method for lesion attribute detection from CT reports

no code implementations 30 Apr 2019 Yifan Peng, Ke Yan, Veit Sandfort, Ronald M. Summers, Zhiyong Lu

In radiology, radiologists not only detect lesions from the medical image, but also describe them with various attributes such as their type, location, size, shape, and intensity.

Fine-grained lesion annotation in CT images with knowledge mined from radiology reports

no code implementations 4 Mar 2019 Ke Yan, Yifan Peng, Zhiyong Lu, Ronald M. Summers

To address this problem, we define a set of 145 labels based on RadLex to describe a large variety of lesions in the DeepLesion dataset.

MIMIC-CXR-JPG, a large publicly available database of labeled chest radiographs

no code implementations 21 Jan 2019 Alistair E. W. Johnson, Tom J. Pollard, Nathaniel R. Greenbaum, Matthew P. Lungren, Chih-ying Deng, Yifan Peng, Zhiyong Lu, Roger G. Mark, Seth J. Berkowitz, Steven Horng

Chest radiography is an extremely powerful imaging modality, allowing for a detailed inspection of a patient's thorax, but requiring specialized training for proper interpretation.

Exploring Semi-supervised Variational Autoencoders for Biomedical Relation Extraction

no code implementations 18 Jan 2019 Yijia Zhang, Zhiyong Lu

Experimental results show that our method effectively exploits the unlabeled data to improve the performance and reduce the dependence on labeled data.

Relation Extraction

DeepSeeNet: A deep learning model for automated classification of patient-based age-related macular degeneration severity from color fundus photographs

1 code implementation 19 Nov 2018 Yifan Peng, Shazia Dharssi, Qingyu Chen, Tiarnan D. Keenan, Elvira Agrón, Wai T. Wong, Emily Y. Chew, Zhiyong Lu

DeepSeeNet simulates the human grading process by first detecting individual AMD risk factors (drusen size, pigmentary abnormalities) for each eye and then calculating a patient-based AMD severity score using the AREDS Simplified Severity Scale.

Decision Making General Classification

ML-Net: multi-label classification of biomedical texts with deep neural networks

4 code implementations 13 Nov 2018 Jingcheng Du, Qingyu Chen, Yifan Peng, Yang Xiang, Cui Tao, Zhiyong Lu

Due to this nature, the multi-label text classification task is often considered to be more challenging compared to the binary or multi-class text classification problems.

Benchmarking Feature Engineering +4

BioSentVec: creating sentence embeddings for biomedical texts

4 code implementations 22 Oct 2018 Qingyu Chen, Yifan Peng, Zhiyong Lu

Sentence embeddings have become an essential part of today's natural language processing (NLP) systems, especially together with advanced deep learning methods.

Ranked #1 on Sentence Embeddings For Biomedical Texts on MedSTS (using extra training data)

Benchmarking Sentence Embeddings For Biomedical Texts +1

SingleCite: Towards an improved Single Citation Search in PubMed

no code implementations WS 2018 Lana Yeganova, Donald C. Comeau, Won Kim, W. John Wilbur, Zhiyong Lu

A search that is targeted at finding a specific document in databases is called a Single Citation search.

MeSH-based dataset for measuring the relevance of text retrieval

no code implementations WS 2018 Won Gyu Kim, Lana Yeganova, Donald Comeau, W. John Wilbur, Zhiyong Lu

Creating simulated search environments has been of significant interest in information retrieval, in both general and biomedical search domains.

Information Retrieval Retrieval +1

Personalized neural language models for real-world query auto completion

no code implementations NAACL 2018 Nicolas Fiorini, Zhiyong Lu

Query auto completion (QAC) systems are a standard part of search engines in industry, helping users formulate their query.

Language Modelling

A Fast Deep Learning Model for Textual Relevance in Biomedical Information Retrieval

no code implementations 26 Feb 2018 Sunil Mohan, Nicolas Fiorini, Sun Kim, Zhiyong Lu

Publications in the life sciences are characterized by a large technical vocabulary, with many lexical and semantic variations for expressing the same concept.

Biomedical Information Retrieval Information Retrieval +2

NegBio: a high-performance tool for negation and uncertainty detection in radiology reports

1 code implementation 16 Dec 2017 Yifan Peng, Xiaosong Wang, Le Lu, Mohammadhadi Bagheri, Ronald Summers, Zhiyong Lu

Negative and uncertain medical findings are frequent in radiology reports, but discriminating them from positive findings remains challenging for information extraction.


BioCreative VI Precision Medicine Track: creating a training corpus for mining protein-protein interactions affected by mutations

no code implementations WS 2017 Rezarta Islamaj Doğan, Andrew Chatr-aryamontri, Sun Kim, Chih-Hsuan Wei, Yifan Peng, Donald Comeau, Zhiyong Lu

The Precision Medicine Track in BioCreative VI aims to bring together the BioNLP community for a novel challenge focused on mining the biomedical literature in search of mutations and protein-protein interactions (PPI).

Relation Extraction

Deep learning for extracting protein-protein interactions from biomedical literature

no code implementations WS 2017 Yifan Peng, Zhiyong Lu

State-of-the-art methods for protein-protein interaction (PPI) extraction are primarily feature-based or kernel-based by leveraging lexical and syntactic information.

Benchmarking Cross-corpus +1

Challenges in clinical natural language processing for automated disorder normalization

no code implementations Journal of Biomedical Informatics 2015 Robert Leaman, Ritu Khare, Zhiyong Lu

Conclusion: Disorder mentions in text from clinical narratives use a rich vocabulary that results in high term variation, which we believe to be one of the primary causes of reduced performance in clinical narrative.

Learning-To-Rank Medical Named Entity Recognition +3
