Search Results for author: Cao Xiao

Found 71 papers, 34 papers with code

TriSum: Learning Summarization Ability from Large Language Models with Structured Rationale

no code implementations 15 Mar 2024 Pengcheng Jiang, Cao Xiao, Zifeng Wang, Parminder Bhatia, Jimeng Sun, Jiawei Han

To overcome this, we introduce TriSum, a framework for distilling LLMs' text summarization abilities into a compact, local model.

Text Summarization

Recent Advances in Predictive Modeling with Electronic Health Records

no code implementations 2 Feb 2024 Jiaqi Wang, Junyu Luo, Muchao Ye, Xiaochen Wang, Yuan Zhong, Aofei Chang, Guanjie Huang, Ziyi Yin, Cao Xiao, Jimeng Sun, Fenglong Ma

This survey systematically reviews recent advances in deep learning-based predictive models using EHR data.

PILOT: Legal Case Outcome Prediction with Case Law

no code implementations 28 Jan 2024 Lang Cao, Zifeng Wang, Cao Xiao, Jimeng Sun

We demonstrate the importance of accurately identifying precedent cases and mitigating the temporal shift when making predictions for case law, as our method shows a significant improvement over the prior methods that focus on civil law case outcome predictions.

Decision Making Retrieval

ConSequence: Synthesizing Logically Constrained Sequences for Electronic Health Record Generation

no code implementations 10 Dec 2023 Brandon Theodorou, Shrusti Jain, Cao Xiao, Jimeng Sun

Generative models can produce synthetic patient records for analytical tasks when real data is unavailable or limited.

Computational Efficiency

Zero-Resource Hallucination Prevention for Large Language Models

1 code implementation 6 Sep 2023 Junyu Luo, Cao Xiao, Fenglong Ma

Existing techniques for hallucination detection in language assistants rely on intricate fuzzy, specific free-language-based chain of thought (CoT) techniques or parameter-based methods that suffer from interpretability issues.

Hallucination

TREEMENT: Interpretable Patient-Trial Matching via Personalized Dynamic Tree-Based Memory Network

no code implementations 19 Jul 2023 Brandon Theodorou, Cao Xiao, Jimeng Sun

In recent years, machine learning models have been proposed for speeding up patient recruitment via automatically matching patients with clinical trials based on longitudinal patient electronic health records (EHR) data and eligibility criteria of clinical trials.

Bi-level Contrastive Learning for Knowledge-Enhanced Molecule Representations

no code implementations 2 Jun 2023 Pengcheng Jiang, Cao Xiao, Tianfan Fu, Jimeng Sun

In this paper, we propose a novel method called GODE, which takes into account the two-level structure of individual molecules.

Contrastive Learning Knowledge Graphs +4

FRAMM: Fair Ranking with Missing Modalities for Clinical Trial Site Selection

no code implementations 30 May 2023 Brandon Theodorou, Lucas Glass, Cao Xiao, Jimeng Sun

This paper focuses on the trial site selection task and proposes FRAMM, a deep reinforcement learning framework for fair trial site selection.

Fairness Imputation +1

GraphCare: Enhancing Healthcare Predictions with Personalized Knowledge Graphs

no code implementations 22 May 2023 Pengcheng Jiang, Cao Xiao, Adam Cross, Jimeng Sun

This is because personalized predictions require personalized knowledge graphs (KGs), which are difficult to generate from patient EHR data.

Decision Making Knowledge Graphs

MediTab: Scaling Medical Tabular Data Predictors via Data Consolidation, Enrichment, and Refinement

1 code implementation 20 May 2023 Zifeng Wang, Chufan Gao, Cao Xiao, Jimeng Sun

Tabular data prediction has been employed in medical applications such as patient health risk prediction.

SPOT: Sequential Predictive Modeling of Clinical Trial Outcome with Meta-Learning

no code implementations 7 Apr 2023 Zifeng Wang, Cao Xiao, Jimeng Sun

Accurate trial outcome prediction based on historical trial data promises better trial investment decisions and more trial success.

Meta-Learning

Synthesize High-dimensional Longitudinal Electronic Health Records via Hierarchical Autoregressive Language Model

1 code implementation 4 Apr 2023 Brandon Theodorou, Cao Xiao, Jimeng Sun

In this paper, we propose the Hierarchical Autoregressive Language mOdel (HALO) for generating longitudinal high-dimensional EHR, which preserves the statistical properties of real EHR and can be used to train accurate ML models without privacy concerns.

Language Modelling Variable Selection

Fast Online Value-Maximizing Prediction Sets with Conformal Cost Control

1 code implementation 2 Feb 2023 Zhen Lin, Shubhendu Trivedi, Cao Xiao, Jimeng Sun

We focus on a typical scenario where such requirements, separately encoding $\textit{value}$ and $\textit{cost}$, compete with each other.

AutoMap: Automatic Medical Code Mapping for Clinical Prediction Model Deployment

no code implementations 4 Mar 2022 Zhenbang Wu, Cao Xiao, Lucas M Glass, David M Liebovitz, Jimeng Sun

To tackle this problem, we propose AutoMap to automatically map the medical codes across different EHR systems in a coarse-to-fine manner: (1) Ontology-level Alignment: We leverage the ontology structure to learn a coarse alignment between the source and target medical coding systems; (2) Code-level Refinement: We refine the alignment at a fine-grained code level for the downstream tasks using a teacher-student framework.

Mortality Prediction

MedAttacker: Exploring Black-Box Adversarial Attacks on Risk Prediction Models in Healthcare

no code implementations 11 Dec 2021 Muchao Ye, Junyu Luo, Guanjie Zheng, Cao Xiao, Ting Wang, Fenglong Ma

Deep neural networks (DNNs) have been broadly adopted in health risk prediction to provide healthcare diagnoses and treatments.

Adversarial Attack Position +1

Differentiable Scaffolding Tree for Molecule Optimization

no code implementations ICLR 2022 Tianfan Fu, Wenhao Gao, Cao Xiao, Jacob Yasonik, Connor W. Coley, Jimeng Sun

The structural design of functional molecules, also called molecular optimization, is an essential chemical science and engineering task with important applications, such as drug discovery.

Combinatorial Optimization Drug Discovery

Differentiable Scaffolding Tree for Molecular Optimization

no code implementations 22 Sep 2021 Tianfan Fu, Wenhao Gao, Cao Xiao, Jacob Yasonik, Connor W. Coley, Jimeng Sun

The structural design of functional molecules, also called molecular optimization, is an essential chemical science and engineering task with important applications, such as drug discovery.

Combinatorial Optimization Drug Discovery

ATD: Augmenting CP Tensor Decomposition by Self Supervision

1 code implementation 15 Jun 2021 Chaoqi Yang, Cheng Qian, Navjot Singh, Cao Xiao, M Brandon Westover, Edgar Solomonik, Jimeng Sun

This paper addresses the above challenges by proposing augmented tensor decomposition (ATD), which effectively incorporates data augmentations and self-supervised learning (SSL) to boost downstream classification.

Data Augmentation Dimensionality Reduction +3

MTC: Multiresolution Tensor Completion from Partial and Coarse Observations

1 code implementation 14 Jun 2021 Chaoqi Yang, Navjot Singh, Cao Xiao, Cheng Qian, Edgar Solomonik, Jimeng Sun

Our MTC model explores tensor mode properties and leverages the hierarchy of resolutions to recursively initialize an optimization setup, and optimizes on the coupled system using alternating least squares.

Multi-version Tensor Completion for Time-delayed Spatio-temporal Data

no code implementations 11 May 2021 Cheng Qian, Nikos Kargas, Cao Xiao, Lucas Glass, Nicholas Sidiropoulos, Jimeng Sun

Recovering such missing or noisy (under-reported) elements of the input tensor can be viewed as a generalized tensor completion problem.

Missing Elements

Change Matters: Medication Change Prediction with Recurrent Residual Networks

no code implementations 5 May 2021 Chaoqi Yang, Cao Xiao, Lucas Glass, Jimeng Sun

Deep learning is revolutionizing predictive healthcare, including recommending medications to patients with complex health conditions.

SafeDrug: Dual Molecular Graph Encoders for Recommending Effective and Safe Drug Combinations

no code implementations 5 May 2021 Chaoqi Yang, Cao Xiao, Fenglong Ma, Lucas Glass, Jimeng Sun

On a benchmark dataset, SafeDrug is shown to reduce DDI by a relative 19.43% and to improve Jaccard similarity between recommended and actually prescribed drug combinations by 2.88% over previous approaches.

Machine Learning Applications for Therapeutic Tasks with Genomics Data

no code implementations 3 May 2021 Kexin Huang, Cao Xiao, Lucas M. Glass, Cathy W. Critchlow, Greg Gibson, Jimeng Sun

Thanks to the increasing availability of genomics and other biomedical data, many machine learning approaches have been proposed for a wide range of therapeutic discovery and development tasks.

BIG-bench Machine Learning

SCRIB: Set-classifier with Class-specific Risk Bounds for Blackbox Models

no code implementations 5 Mar 2021 Zhen Lin, Cao Xiao, Lucas Glass, M. Brandon Westover, Jimeng Sun

Despite deep learning (DL) success in classification problems, DL classifiers do not provide a sound mechanism to decide when to refrain from predicting.

Atrial Fibrillation Detection Classification +4

Therapeutics Data Commons: Machine Learning Datasets and Tasks for Drug Discovery and Development

2 code implementations 18 Feb 2021 Kexin Huang, Tianfan Fu, Wenhao Gao, Yue Zhao, Yusuf Roohani, Jure Leskovec, Connor W. Coley, Cao Xiao, Jimeng Sun, Marinka Zitnik

Here, we introduce Therapeutics Data Commons (TDC), the first unifying platform to systematically access and evaluate machine learning across the entire range of therapeutics.

BIG-bench Machine Learning Drug Discovery

HINT: Hierarchical Interaction Network for Trial Outcome Prediction Leveraging Web Data

1 code implementation 8 Feb 2021 Tianfan Fu, Kexin Huang, Cao Xiao, Lucas M. Glass, Jimeng Sun

Next, these embeddings will be fed into the knowledge embedding module to generate knowledge embeddings that are pretrained using external knowledge on pharmaco-kinetic properties and trial risk from the web.

Imputation

PyHealth: A Python Library for Health Predictive Models

2 code implementations 11 Jan 2021 Yue Zhao, Zhi Qiao, Cao Xiao, Lucas Glass, Jimeng Sun

PyHealth consists of a data preprocessing module, a predictive modeling module, and an evaluation module.

Benchmarking BIG-bench Machine Learning

STELAR: Spatio-temporal Tensor Factorization with Latent Epidemiological Regularization

no code implementations 8 Dec 2020 Nikos Kargas, Cheng Qian, Nicholas D. Sidiropoulos, Cao Xiao, Lucas M. Glass, Jimeng Sun

Accurate prediction of the transmission of epidemic diseases such as COVID-19 is crucial for implementing effective mitigation measures.

Attribute

FLANNEL: Focal Loss Based Neural Network Ensemble for COVID-19 Detection

no code implementations 30 Oct 2020 Zhi Qiao, Austin Bae, Lucas M. Glass, Cao Xiao, Jimeng Sun

This study tests whether deep neural networks can differentiate chest X-ray images of COVID-19 patients from those of patients with other pneumonia and of healthy patients.

UNITE: Uncertainty-based Health Risk Prediction Leveraging Multi-sourced Data

no code implementations 22 Oct 2020 Chacha Chen, Junjie Liang, Fenglong Ma, Lucas M. Glass, Jimeng Sun, Cao Xiao

However, existing uncertainty estimation approaches often failed in handling high-dimensional data, which are present in multi-sourced data.

Clustering Variational Inference

MIMOSA: Multi-constraint Molecule Sampling for Molecule Optimization

1 code implementation 5 Oct 2020 Tianfan Fu, Cao Xiao, Xinhao Li, Lucas M. Glass, Jimeng Sun

Molecule optimization is a fundamental task for accelerating drug discovery, with the goal of generating new valid molecules that maximize multiple drug properties while maintaining similarity to the input molecule.

Drug Discovery Type prediction +1

MolDesigner: Interactive Design of Efficacious Drugs with Deep Learning

1 code implementation 5 Oct 2020 Kexin Huang, Tianfan Fu, Dawood Khan, Ali Abid, Ali Abdalla, Abubakar Abid, Lucas M. Glass, Marinka Zitnik, Cao Xiao, Jimeng Sun

The efficacy of a drug depends on its binding affinity to the therapeutic target and pharmacokinetics.

SumGNN: Multi-typed Drug Interaction Prediction via Efficient Knowledge Graph Summarization

1 code implementation 4 Oct 2020 Yue Yu, Kexin Huang, Chao Zhang, Lucas M. Glass, Jimeng Sun, Cao Xiao

Furthermore, most previous works focus on binary DDI prediction whereas the multi-typed DDI pharmacological effect prediction is a more meaningful but harder task.

Data Integration Knowledge Graphs

COMPOSE: Cross-Modal Pseudo-Siamese Network for Patient Trial Matching

1 code implementation 15 Jun 2020 Junyi Gao, Cao Xiao, Lucas M. Glass, Jimeng Sun

The other path processes EHR with a multi-granularity memory network that encodes structured patient records into multiple levels based on medical ontology.

Fast Graph Attention Networks Using Effective Resistance Based Graph Sparsification

no code implementations 15 Jun 2020 Rakshith S Srinivasa, Cao Xiao, Lucas Glass, Justin Romberg, Jimeng Sun

The attention mechanism has demonstrated superior performance for inference over nodes in graph neural networks (GNNs); however, it results in a high computational burden during both training and inference.

Graph Attention Node Classification

CHEER: Rich Model Helps Poor Model via Knowledge Infusion

no code implementations 21 May 2020 Cao Xiao, Trong Nghia Hoang, Shenda Hong, Tengfei Ma, Jimeng Sun

There is a growing interest in applying deep learning (DL) to healthcare, driven by the availability of data with multiple feature channels in rich-data environments (e.g., intensive care units).

SkipGNN: Predicting Molecular Interactions with Skip-Graph Networks

1 code implementation 30 Apr 2020 Kexin Huang, Cao Xiao, Lucas Glass, Marinka Zitnik, Jimeng Sun

Here, we present SkipGNN, a graph neural network approach for the prediction of molecular interactions.

MolTrans: Molecular Interaction Transformer for Drug Target Interaction Prediction

1 code implementation 23 Apr 2020 Kexin Huang, Cao Xiao, Lucas Glass, Jimeng Sun

Drug target interaction (DTI) prediction is a foundational task for in silico drug discovery, which is costly and time-consuming due to the need for experimental search over a large drug compound space.

Drug Discovery molecular representation +1

SUOD: Accelerating Large-Scale Unsupervised Heterogeneous Outlier Detection

1 code implementation 11 Mar 2020 Yue Zhao, Xiyang Hu, Cheng Cheng, Cong Wang, Changlin Wan, Wen Wang, Jianing Yang, Haoping Bai, Zheng Li, Cao Xiao, Yunlong Wang, Zhi Qiao, Jimeng Sun, Leman Akoglu

Outlier detection (OD) is a key machine learning (ML) task for identifying abnormal objects from general samples, with numerous high-stakes applications including fraud detection and intrusion detection.

Dimensionality Reduction Fraud Detection +2

CLARA: Clinical Report Auto-completion

no code implementations 26 Feb 2020 Siddharth Biswal, Cao Xiao, Lucas M. Glass, M. Brandon Westover, Jimeng Sun

Most existing methods try to generate the whole reports from the raw input with limited success because 1) generated reports often contain errors that need manual review and correction, 2) it does not save time when doctors want to write additional information into the report, and 3) the generated reports are not customized based on individual doctors' preference.

EEG Sentence

REST: Robust and Efficient Neural Networks for Sleep Monitoring in the Wild

1 code implementation 29 Jan 2020 Rahul Duggal, Scott Freitas, Cao Xiao, Duen Horng Chau, Jimeng Sun

By deploying these models to an Android application on a smartphone, we quantitatively observe that REST allows models to achieve up to 17x energy reduction and 9x faster inference.

EEG Neural Network Compression +1

StageNet: Stage-Aware Neural Networks for Health Risk Prediction

1 code implementation 24 Jan 2020 Junyi Gao, Cao Xiao, Yasha Wang, Wen Tang, Lucas M. Glass, Jimeng Sun

Compared to the best baseline model, StageNet achieves up to 12% higher AUPRC for risk prediction task on two real-world patient datasets.

DeepEnroll: Patient-Trial Matching with Deep Embedding and Entailment Prediction

no code implementations 22 Jan 2020 Xingyao Zhang, Cao Xiao, Lucas M. Glass, Jimeng Sun

To address these challenges, we propose DeepEnroll, a cross-modal inference learning model that jointly encodes enrollment criteria (text) and patient records (tabular data) into a shared latent space for matching inference.

Sentence Sentence Embedding +1

Opportunities and Challenges of Deep Learning Methods for Electrocardiogram Data: A Systematic Review

1 code implementation 28 Dec 2019 Shenda Hong, Yuxi Zhou, Junyuan Shang, Cao Xiao, Jimeng Sun

Methods: We extracted papers that applied deep learning (deep neural network) models to ECG data that were published between Jan. 1st of 2010 and Feb. 29th of 2020 from Google Scholar, PubMed, and the DBLP.

Denoising Sleep Staging

CONAN: Complementary Pattern Augmentation for Rare Disease Detection

no code implementations 26 Nov 2019 Limeng Cui, Siddharth Biswal, Lucas M. Glass, Greg Lever, Jimeng Sun, Cao Xiao

How to further leverage patients with possibly uncertain diagnosis to improve detection?

Doctor2Vec: Dynamic Doctor Representation Learning for Clinical Trial Recruitment

no code implementations 23 Nov 2019 Siddharth Biswal, Cao Xiao, Lucas M. Glass, Elizabeth Milkovits, Jimeng Sun

We propose doctor2vec which simultaneously learns 1) doctor representations from EHR data and 2) trial representations from the description and categorical information about the trials.

Clinical Knowledge Representation Learning

CORE: Automatic Molecule Optimization Using Copy & Refine Strategy

1 code implementation 23 Nov 2019 Tianfan Fu, Cao Xiao, Jimeng Sun

The state-of-the-art approaches partition the molecules into a large set of substructures $S$ and grow the new molecule structure by iteratively predicting which substructure from $S$ to add.

CUP: Cluster Pruning for Compressing Deep Neural Networks

1 code implementation 19 Nov 2019 Rahul Duggal, Cao Xiao, Richard Vuduc, Jimeng Sun

With CUP, we overcome two limitations of prior work: (1) non-uniform pruning: CUP can efficiently determine the ideal number of filters to prune in each layer of a neural network.

Clustering

SLEEPER: interpretable Sleep staging via Prototypes from Expert Rules

no code implementations 14 Oct 2019 Irfan Al-Hussaini, Cao Xiao, M. Brandon Westover, Jimeng Sun

In this study, we propose Sleep staging via Prototypes from Expert Rules (SLEEPER), which combines deep learning models with expert defined rules using a prototype learning framework to generate simple interpretable models.

Automatic Sleep Stage Classification Sleep Staging

GENN: Predicting Correlated Drug-drug Interactions with Graph Energy Neural Networks

no code implementations 4 Oct 2019 Tengfei Ma, Junyuan Shang, Cao Xiao, Jimeng Sun

We propose the graph energy neural network (GENN) to explicitly model link type correlations.

Link Prediction

Rare Disease Detection by Sequence Modeling with Generative Adversarial Networks

no code implementations 1 Jul 2019 Kezi Yu, Yunlong Wang, Yong Cai, Cao Xiao, Emily Zhao, Lucas Glass, Jimeng Sun

Rare diseases affecting 350 million individuals are commonly associated with delay in diagnosis or misdiagnosis.

Predicting Treatment Initiation from Clinical Time Series Data via Graph-Augmented Time-Sensitive Model

no code implementations 1 Jul 2019 Fan Zhang, Tong Wu, Yunlong Wang, Yong Cai, Cao Xiao, Emily Zhao, Lucas Glass, Jimeng Sun

Many computational models have been proposed to extract temporal patterns from clinical time series, both for each patient and across patient groups, for predictive healthcare.

Time Series Time Series Analysis

Pre-training of Graph Augmented Transformers for Medication Recommendation

1 code implementation 2 Jun 2019 Junyuan Shang, Tengfei Ma, Cao Xiao, Jimeng Sun

G-BERT is the first to bring the language model pre-training schema into the healthcare domain and it achieved state-of-the-art performance on the medication recommendation task.

Language Modelling Representation Learning +1

CGNF: Conditional Graph Neural Fields

no code implementations ICLR 2019 Tengfei Ma, Cao Xiao, Junyuan Shang, Jimeng Sun

By integrating the conditional random fields (CRF) in the graph convolutional networks, we explicitly model a joint probability of the entire set of node labels, thus taking advantage of neighborhood label information in the node label prediction task.

General Classification Node Classification

MiME: Multilevel Medical Embedding of Electronic Health Records for Predictive Healthcare

1 code implementation NeurIPS 2018 Edward Choi, Cao Xiao, Walter F. Stewart, Jimeng Sun

Deep learning models exhibit state-of-the-art performance for many predictive healthcare tasks using electronic health records (EHR) data, but these models typically require training data volume that exceeds the capacity of most healthcare systems.

Disease Prediction

AWE: Asymmetric Word Embedding for Textual Entailment

no code implementations 11 Sep 2018 Tengfei Ma, Chiamin Wu, Cao Xiao, Jimeng Sun

Textual entailment refers to the directional relation between text fragments such that the "hypothesis" can be inferred from the "premise".

Natural Language Inference Paraphrase Identification +4

Constrained Generation of Semantically Valid Graphs via Regularizing Variational Autoencoders

1 code implementation NeurIPS 2018 Tengfei Ma, Jie Chen, Cao Xiao

We focus on the matrix representation of graphs and formulate penalty terms that regularize the output distribution of the decoder to encourage the satisfaction of validity constraints.

Time Series Time Series Analysis +1

GAMENet: Graph Augmented MEmory Networks for Recommending Medication Combination

1 code implementation 6 Sep 2018 Junyuan Shang, Cao Xiao, Tengfei Ma, Hongyan Li, Jimeng Sun

Recent progress in deep learning is revolutionizing the healthcare domain, including providing solutions for medication recommendation, especially recommending medication combinations for patients with complex health conditions.

RDPD: Rich Data Helps Poor Data via Imitation

1 code implementation 6 Sep 2018 Shenda Hong, Cao Xiao, Trong Nghia Hoang, Tengfei Ma, Hongyan Li, Jimeng Sun

In many situations, we need to build and deploy separate models in related environments with different data qualities.

Knowledge Distillation

Drug Similarity Integration Through Attentive Multi-view Graph Auto-Encoders

1 code implementation 28 Apr 2018 Tengfei Ma, Cao Xiao, Jiayu Zhou, Fei Wang

In this paper, we propose to learn accurate and interpretable similarity measures from multiple types of drug features.

FastGCN: Fast Learning with Graph Convolutional Networks via Importance Sampling

3 code implementations ICLR 2018 Jie Chen, Tengfei Ma, Cao Xiao

The graph convolutional networks (GCN) recently proposed by Kipf and Welling are an effective graph model for semi-supervised learning.

Node Classification

Patient Subtyping via Time-Aware LSTM Networks

1 code implementation KDD '17 Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 2017 Inci M. Baytas, Cao Xiao, Xi Zhang, Fei Wang, Anil K. Jain, Jiayu Zhou

We propose a patient subtyping model that leverages the proposed T-LSTM in an auto-encoder to learn a powerful single representation for sequential records of patients, which are then used to cluster patients into clinical subtypes.

Multivariate Time Series Forecasting
