no code implementations • 21 Feb 2025 • George H. Chen, Devavrat Shah
In terms of theory, our focus is on nonasymptotic statistical guarantees, which we state in the form of how much training data and what algorithm parameters ensure that a nearest neighbor prediction method achieves a user-specified error tolerance.
1 code implementation • 19 Nov 2024 • Mingzhu Liu, Angela H. Chen, George H. Chen
Time series foundation models are pre-trained on large datasets and are able to achieve state-of-the-art performance in diverse tasks.
2 code implementations • 1 Oct 2024 • George H. Chen
We further delve into two extensions of the basic time-to-event prediction setup: predicting which of several critical events will happen first along with the time until this earliest event happens (the competing risks setting), and predicting time-to-event outcomes given a time series that grows in length over time (the dynamic setting).
1 code implementation • 31 Aug 2024 • Shu Hu, George H. Chen
This decomposition does not hold for commonly used survival loss functions, including for the Cox proportional hazards model, its deep neural network variants, and many other recently developed models that use loss functions involving ranking or similarity score calculations.
1 code implementation • 10 Dec 2023 • Shahriar Noroozizadeh, Jeremy C. Weiss, George H. Chen
To solve this problem, we propose a supervised contrastive learning framework that learns an embedding representation for each time step of a patient time series.
1 code implementation • 17 Aug 2023 • Xiaobin Shen, Jonathan Elmer, George H. Chen
Our main experimental findings are that: (1) the classical Fine and Gray model, which uses only a patient's static features and summary statistics from the patient's latest hour's worth of EEG data, is highly competitive, achieving accuracy scores as high as the recently developed Dynamic-DeepHit model that uses substantially more of the patient's EEG data; and (2) in an ablation study, our choice of modeling three competing risks results in a model that is at least as accurate while learning more information than simpler models (using two competing risks or a standard survival analysis setup with no competing risks).
1 code implementation • 29 Jun 2023 • Yan Ju, Shu Hu, Shan Jia, George H. Chen, Siwei Lyu
Despite the development of effective deepfake detectors in recent years, recent studies have demonstrated that biases in the data used to train these detectors can lead to disparities in detection accuracy across different races and genders.
1 code implementation • 11 May 2023 • George H. Chen
We then show how these visualization ideas extend to handling raw inputs that are images.
1 code implementation • 18 Nov 2022 • Shu Hu, George H. Chen
We propose a general approach for training survival analysis models that minimizes a worst-case error across all subpopulations that are large enough (occurring with at least a user-specified minimum probability).
1 code implementation • 21 Jun 2022 • George H. Chen
On four standard survival analysis datasets of varying sizes (up to roughly 3 million data points), we show that survival kernets are highly competitive compared to various baselines tested in terms of time-dependent concordance index.
2 code implementations • 21 Jun 2022 • Kay Liu, Yingtong Dou, Yue Zhao, Xueying Ding, Xiyang Hu, Ruitong Zhang, Kaize Ding, Canyu Chen, Hao Peng, Kai Shu, Lichao Sun, Jundong Li, George H. Chen, Zhihao Jia, Philip S. Yu
To bridge this gap, we present what is, to the best of our knowledge, the first comprehensive benchmark for unsupervised outlier node detection on static attributed graphs, called BOND, with the following highlights.
no code implementations • 28 Mar 2022 • Gerardo Flores, George H. Chen, Tom Pollard, Joyce C. Ho, Tristan Naumann
A collection of invited non-archival papers for the Conference on Health, Inference, and Learning (CHIL) 2022.
2 code implementations • 2 Jan 2022 • Zheng Li, Yue Zhao, Xiyang Hu, Nicola Botta, Cezar Ionescu, George H. Chen
To address these issues, we present a simple yet effective algorithm called ECOD (Empirical-Cumulative-distribution-based Outlier Detection), which is inspired by the fact that outliers are often the "rare events" that appear in the tails of a distribution.
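The tail intuition above can be illustrated with a minimal sketch. It is a simplified illustration, not the authors' full ECOD algorithm: it scores each point by how far it sits in the left or right tail of each feature's empirical CDF and omits any further refinements of the published method. The function name `tail_scores` and the toy data are assumptions made for this example.

```python
import math

def tail_scores(X):
    """Score each row of X (a list of equal-length lists of floats).

    For every feature, estimate the left-tail probability (fraction of
    samples <= value) and right-tail probability (fraction >= value) from
    the empirical CDF, then sum negative log tail probabilities across
    features. A larger score means the row lies deeper in a tail.
    """
    n = len(X)
    d = len(X[0])
    scores = []
    for row in X:
        left_total = 0.0
        right_total = 0.0
        for j in range(d):
            column = [other[j] for other in X]
            # Both fractions are at least 1/n because the row counts itself.
            left = sum(1 for v in column if v <= row[j]) / n
            right = sum(1 for v in column if v >= row[j]) / n
            left_total += -math.log(left)
            right_total += -math.log(right)
        scores.append(max(left_total, right_total))
    return scores

# Toy data (hypothetical): four points near the origin and one far away.
X = [[0.0, 0.1], [0.1, 0.0], [-0.1, 0.05], [0.05, -0.1], [5.0, 5.0]]
scores = tail_scores(X)
most_outlying = scores.index(max(scores))
```

In this toy example the last row receives the largest score because it is the maximum of both features, so its right-tail probability is 1/5 in each dimension.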
2 code implementations • 26 Oct 2021 • Yue Zhao, George H. Chen, Zhihao Jia
Outlier detection (OD) is a key learning task for finding rare and deviant data samples, with many time-critical applications such as fraud detection and intrusion detection.
1 code implementation • 25 Jul 2020 • George H. Chen
We also show how to use kernel functions to construct prediction intervals of survival time estimates that are statistically valid for individuals similar to a test subject.
1 code implementation • 15 Jul 2020 • George H. Chen, Linhong Li, Ren Zuo, Amanda Coston, Jeremy C. Weiss
We present a neural network framework for learning a survival model to predict a time-to-event outcome while simultaneously learning a topic model that reveals feature relationships.
1 code implementation • 2 Jun 2020 • Helen Zhou, Cheng Cheng, Zachary C. Lipton, George H. Chen, Jeremy C. Weiss
Finally, the PEER score is provided in the form of a nomogram for direct calculation of patient risk, and can be used to highlight at-risk patients among critical care patients eligible for ECMO.
no code implementations • 1 Jun 2020 • Emaad Manzoor, George H. Chen, Dokyun Lee, Michael D. Smith
Deliberation among individuals online plays a key role in shaping the opinions that drive votes, purchases, donations and other critical offline behavior.
no code implementations • ICLR 2020 • George H. Chen, Linhong Li, Ren Zuo, Amanda Coston, Jeremy C. Weiss
The two approaches we propose differ in the generality of topic models they can learn.
1 code implementation • NeurIPS 2019 • Wei Ma, George H. Chen
Recently, various papers have shown that we can reduce this bias in MNAR matrix completion if we know the probabilities of different matrix entries being missing.
no code implementations • 17 Jul 2019 • Lynn H. Kaack, George H. Chen, M. Granger Morgan
The road freight sector is responsible for a large and growing share of greenhouse gas emissions, but reliable data on the amount of freight that is moved on roads in many parts of the world are scarce.
1 code implementation • 13 May 2019 • George H. Chen
We establish the first nonasymptotic error bounds for Kaplan-Meier-based nearest neighbor and kernel survival probability estimators where feature vectors reside in metric spaces.
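As a rough illustration of the kind of estimator described above, the sketch below applies the standard Kaplan-Meier product-limit formula to the k training points nearest a query. It assumes Euclidean distance (the source only requires a metric space), and the function names and toy data are hypothetical.

```python
import math

def kaplan_meier(times, events, t):
    """Kaplan-Meier estimate of P(T > t).

    times: observed times; events: 1 if the event occurred, 0 if censored.
    """
    survival = 1.0
    # Distinct event times up to and including t, in increasing order.
    event_times = sorted({y for y, e in zip(times, events) if e == 1 and y <= t})
    for u in event_times:
        at_risk = sum(1 for y in times if y >= u)
        deaths = sum(1 for y, e in zip(times, events) if y == u and e == 1)
        survival *= 1.0 - deaths / at_risk
    return survival

def knn_survival(train_x, train_times, train_events, query, k, t):
    """Kaplan-Meier estimate computed only over the k nearest neighbors of query."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    nearest = order[:k]
    return kaplan_meier(
        [train_times[i] for i in nearest],
        [train_events[i] for i in nearest],
        t,
    )

# Toy data (hypothetical): three subjects near 0 and two near 5.
train_x = [[0.0], [0.1], [0.2], [5.0], [5.1]]
train_times = [2, 4, 6, 1, 1]
train_events = [1, 0, 1, 1, 1]
estimate = knn_survival(train_x, train_times, train_events, [0.0], k=3, t=5)
```

Here the three nearest neighbors have one event at time 2 with three subjects at risk and one censoring at time 4, so the estimated survival probability at time 5 is 2/3.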
no code implementations • 2 Dec 2017 • George H. Chen, Jeremy C. Weiss
For example, by seeing "gallstones" in a document, we are fairly certain that the document is partially about medicine.
no code implementations • NeurIPS 2014 • Guy Bresler, George H. Chen, Devavrat Shah
Despite the prevalence of collaborative filtering in recommendation systems, there has been little theoretical development on why and how well it works, especially in the "online" setting, where items are recommended to users over time.
no code implementations • 22 Mar 2013 • George H. Chen, Christian Wachinger, Polina Golland
To this end, out-of-sample extensions are applied by constructing an interpolation function that maps from the input space to the low-dimensional manifold.
no code implementations • NeurIPS 2013 • George H. Chen, Stanislav Nikolov, Devavrat Shah
Our guiding hypothesis is that in many applications, such as forecasting which topics will become trends on Twitter, there aren't actually that many prototypical time series to begin with, relative to the number of time series we have access to; e.g., topics become trends on Twitter only in a few distinct manners, whereas we can collect massive amounts of Twitter data.