Search Results for author: Haoran Zhang

Found 49 papers, 24 papers with code

Essay Quality Signals as Weak Supervision for Source-based Essay Scoring

no code implementations EACL (BEA) 2021 Haoran Zhang, Diane Litman

However, because AES typically uses supervised machine learning, a human-graded essay corpus is still required to train the AES model.

Automated Essay Scoring

TimeXer: Empowering Transformers for Time Series Forecasting with Exogenous Variables

no code implementations 29 Feb 2024 Yuxuan Wang, Haixu Wu, Jiaxiang Dong, Yong Liu, Yunzhong Qiu, Haoran Zhang, Jianmin Wang, Mingsheng Long

Experimentally, TimeXer significantly improves time series forecasting with exogenous variables and achieves consistent state-of-the-art performance in twelve real-world forecasting benchmarks.

Time Series, Time Series Forecasting

Tokenization Is More Than Compression

no code implementations 28 Feb 2024 Craig W. Schmidt, Varshini Reddy, Haoran Zhang, Alec Alameddine, Omri Uzan, Yuval Pinter, Chris Tanner

Tokenization is a foundational step in Natural Language Processing (NLP) tasks, bridging raw text and language models.

Data Compression

Timer: Transformers for Time Series Analysis at Scale

1 code implementation 4 Feb 2024 Yong Liu, Haoran Zhang, Chenyu Li, Xiangdong Huang, Jianmin Wang, Mingsheng Long

Continuous progress has been achieved with the emergence of large language models, which exhibit unprecedented ability in few-shot generalization, scalability, and task generality; these abilities are, however, absent in time series models.

Anomaly Detection, Imputation, +2

A Literature Review on Fetus Brain Motion Correction in MRI

no code implementations 30 Jan 2024 Haoran Zhang, Yun Wang

This paper provides a comprehensive review of the latest advancements in fetal motion correction in MRI.

SciMMIR: Benchmarking Scientific Multi-modal Information Retrieval

1 code implementation 24 Jan 2024 Siwei Wu, Yizhi Li, Kang Zhu, Ge Zhang, Yiming Liang, Kaijing Ma, Chenghao Xiao, Haoran Zhang, Bohao Yang, Wenhu Chen, Wenhao Huang, Noura Al Moubayed, Jie Fu, Chenghua Lin

We further annotate the image-text pairs with two-level subset-subcategory hierarchy annotations to facilitate a more comprehensive evaluation of the baselines.

Benchmarking, Image Captioning, +3

CMMMU: A Chinese Massive Multi-discipline Multimodal Understanding Benchmark

1 code implementation 22 Jan 2024 Ge Zhang, Xinrun Du, Bei Chen, Yiming Liang, Tongxu Luo, Tianyu Zheng, Kang Zhu, Yuyang Cheng, Chunpu Xu, Shuyue Guo, Haoran Zhang, Xingwei Qu, Junjie Wang, Ruibin Yuan, Yizhi Li, Zekun Wang, Yudong Liu, Yu-Hsuan Tsai, Fengji Zhang, Chenghua Lin, Wenhao Huang, Wenhu Chen, Jie Fu

We introduce CMMMU, a new Chinese Massive Multi-discipline Multimodal Understanding benchmark designed to evaluate LMMs on tasks demanding college-level subject knowledge and deliberate reasoning in a Chinese context.

A Closer Look at AUROC and AUPRC under Class Imbalance

2 code implementations 11 Jan 2024 Matthew B. A. McDermott, Lasse Hyldig Hansen, Haoran Zhang, Giovanni Angelotti, Jack Gallifant

In machine learning (ML), a widespread adage is that the area under the precision-recall curve (AUPRC) is a superior metric for model comparison to the area under the receiver operating characteristic (AUROC) for binary classification tasks with class imbalance.

Binary Classification

The Limits of Fair Medical Imaging AI In The Wild

1 code implementation 11 Dec 2023 Yuzhe Yang, Haoran Zhang, Judy W Gichoya, Dina Katabi, Marzyeh Ghassemi

As artificial intelligence (AI) rapidly approaches human-level performance in medical imaging, it is crucial that it does not exacerbate or propagate healthcare disparities.

Fairness

Towards ultra-low-cost smartphone microscopy

no code implementations 28 Nov 2023 Haoran Zhang, Weiyi Zhang, Zirui Zuo, Jianlong Yang

The outbreak of COVID-19 exposed the inadequacy of our technical tools for home health surveillance, and recent studies have shown the potential of smartphones as a universal optical microscopic imaging platform for such applications.

Image Enhancement

iTransformer: Inverted Transformers Are Effective for Time Series Forecasting

4 code implementations 10 Oct 2023 Yong Liu, Tengge Hu, Haoran Zhang, Haixu Wu, Shiyu Wang, Lintao Ma, Mingsheng Long

These forecasters leverage Transformers to model the global dependencies over temporal tokens of time series, with each token formed by multiple variates of the same timestamp.

Time Series, Time Series Forecasting

Continuous Time Evidential Distributions for Irregular Time Series

1 code implementation 25 Jul 2023 Taylor W. Killian, Haoran Zhang, Thomas Hartvigsen, Ava P. Amini

Prevalent in many real-world settings such as healthcare, irregular time series are challenging to formulate predictions from.

Irregular Time Series, Time Series, +1

Real-World Video for Zoom Enhancement based on Spatio-Temporal Coupling

no code implementations 24 Jun 2023 Zhiling Guo, Yinqiang Zheng, Haoran Zhang, Xiaodan Shi, Zekun Cai, Ryosuke Shibasaki, Jinyue Yan

In recent years, single-frame image super-resolution (SR) has become more realistic by considering the zooming effect and using real-world short- and long-focus image pairs.

Image Super-Resolution

Cross-attention learning enables real-time nonuniform rotational distortion correction in OCT

no code implementations 7 Jun 2023 Haoran Zhang, Jianlong Yang, Jingqian Zhang, Shiqing Zhao, Aili Zhang

Nonuniform rotational distortion (NURD) correction is vital for endoscopic optical coherence tomography (OCT) imaging and its functional extensions, such as angiography and elastography.

Annotation-efficient learning for OCT segmentation

1 code implementation 6 May 2023 Haoran Zhang, Jianlong Yang, Ce Zheng, Shiqing Zhao, Aili Zhang

Compared to the widely-used U-Net model with 100% training data, our method only requires ~10% of the data for achieving the same segmentation accuracy, and it speeds the training up to ~3.5 times.

Segmentation

A Meta-Summary of Challenges in Building Products with ML Components -- Collecting Experiences from 4758+ Practitioners

no code implementations 31 Mar 2023 Nadia Nahar, Haoran Zhang, Grace Lewis, Shurui Zhou, Christian Kästner

Incorporating machine learning (ML) components into software products raises new software-engineering challenges and exacerbates existing challenges.

Change is Hard: A Closer Look at Subpopulation Shift

1 code implementation 23 Feb 2023 Yuzhe Yang, Haoran Zhang, Dina Katabi, Marzyeh Ghassemi

Machine learning models often perform poorly on subgroups that are underrepresented in the training data.

Model Selection

SimMTM: A Simple Pre-Training Framework for Masked Time-Series Modeling

1 code implementation NeurIPS 2023 Jiaxiang Dong, Haixu Wu, Haoran Zhang, Li Zhang, Jianmin Wang, Mingsheng Long

By relating masked modeling to manifold learning, SimMTM proposes to recover masked time points by the weighted aggregation of multiple neighbors outside the manifold, which eases the reconstruction task by assembling ruined but complementary temporal variations from multiple masked series.

Representation Learning, Time Series, +1

Efficient Estimation for Longitudinal Networks via Adaptive Merging

no code implementations 15 Nov 2022 Haoran Zhang, Junhui Wang

A longitudinal network consists of a sequence of temporal edges among multiple nodes, where the temporal edges are observed in real time.

Tensor Decomposition

"Why did the Model Fail?": Attributing Model Performance Changes to Distribution Shifts

1 code implementation 19 Oct 2022 Haoran Zhang, Harvineet Singh, Marzyeh Ghassemi, Shalmali Joshi

In this work, we introduce the problem of attributing performance differences between environments to distribution shifts in the underlying data generating mechanisms.

Signed Network Embedding with Application to Simultaneous Detection of Communities and Anomalies

no code implementations 8 Jul 2022 Haoran Zhang, Junhui Wang

This paper develops a unified embedding model for signed networks to disentangle the intertwined balance structure and anomaly effect, which can greatly facilitate the downstream analysis, including community detection, anomaly detection, and network inference.

Anomaly Detection, Community Detection, +2

The Road to Explainability is Paved with Bias: Measuring the Fairness of Explanations

no code implementations 6 May 2022 Aparna Balagopalan, Haoran Zhang, Kimia Hamidieh, Thomas Hartvigsen, Frank Rudzicz, Marzyeh Ghassemi

Across two different blackbox model architectures and four popular explainability methods, we find that the approximation quality of explanation models, also known as the fidelity, differs significantly between subgroups.

BIG-bench Machine Learning, Fairness

Target Confusion in End-to-end Speaker Extraction: Analysis and Approaches

no code implementations 4 Apr 2022 Zifeng Zhao, Dongchao Yang, Rongzhi Gu, Haoran Zhang, Yuexian Zou

However, its performance is often inferior to that of a blind source separation (BSS) counterpart with a similar network architecture, because the auxiliary speaker encoder may sometimes generate ambiguous speaker embeddings.

blind source separation, Metric Learning, +2

Improving the Fairness of Chest X-ray Classifiers

1 code implementation 23 Mar 2022 Haoran Zhang, Natalie Dullerud, Karsten Roth, Lauren Oakden-Rayner, Stephen Robert Pfohl, Marzyeh Ghassemi

We also find that methods which achieve group fairness do so by worsening performance for all groups.

Fairness

Reinforcement Learning from Demonstrations by Novel Interactive Expert and Application to Automatic Berthing Control Systems for Unmanned Surface Vessel

no code implementations 23 Feb 2022 Haoran Zhang, Chenkun Yin, Yanxin Zhang, Shangtai Jin, Zhenxuan Li

A new expert data generation method called Model Predictive Based Expert (MPBE), which combines Model Predictive Control and Deep Deterministic Policy Gradient, is developed to provide high-quality supervision data for RLfD algorithms.

Model Predictive Control, reinforcement-learning, +1

Learning Optimal Predictive Checklists

1 code implementation NeurIPS 2021 Haoran Zhang, Quaid Morris, Berk Ustun, Marzyeh Ghassemi

Our results show that our method can fit simple predictive checklists that perform well and that can easily be customized to obey a rich class of custom constraints.

Fairness

Differentiable Projection for Constrained Deep Learning

no code implementations 21 Nov 2021 Dou Huang, Haoran Zhang, Xuan Song, Ryosuke Shibasaki

In this paper, we propose to use a differentiable projection layer in DNN instead of directly solving time-consuming KKT conditions.

Image Segmentation, Semantic Segmentation

OneFlow: Redesign the Distributed Deep Learning Framework from Scratch

1 code implementation 28 Oct 2021 Jinhui Yuan, Xinqi Li, Cheng Cheng, Juncheng Liu, Ran Guo, Shenghang Cai, Chi Yao, Fei Yang, Xiaodong Yi, Chuan Wu, Haoran Zhang, Jie Zhao

Aiming at a simple, neat redesign of distributed deep learning frameworks for various parallelism paradigms, we present OneFlow, a novel distributed training framework based on an SBP (split, broadcast and partial-value) abstraction and the actor model.

An open GPS trajectory dataset and benchmark for travel mode detection

no code implementations 17 Sep 2021 Jinyu Chen, Haoran Zhang, Xuan Song, Ryosuke Shibasaki

In this study, we propose an open GPS trajectory dataset labeled with travel mode and a benchmark for travel mode detection.

A comparison of approaches to improve worst-case predictive model performance over patient subpopulations

1 code implementation 27 Aug 2021 Stephen R. Pfohl, Haoran Zhang, Yizhe Xu, Agata Foryciarz, Marzyeh Ghassemi, Nigam H. Shah

Predictive models for clinical outcomes that are accurate on average in a patient population may underperform drastically for some subpopulations, potentially introducing or reinforcing inequities in care access and quality.

Pulling Up by the Causal Bootstraps: Causal Data Augmentation for Pre-training Debiasing

1 code implementation 27 Aug 2021 Sindhu C. M. Gowda, Shalmali Joshi, Haoran Zhang, Marzyeh Ghassemi

This systematic investigation underlines the importance of accounting for the underlying data-generating mechanisms and fortifying data-preprocessing pipelines with a causal framework to develop methods robust to confounding biases.

Benchmarking, Data Augmentation, +1

Invariance-based Multi-Clustering of Latent Space Embeddings for Equivariant Learning

no code implementations 25 Jul 2021 Chandrajit Bajaj, Avik Roy, Haoran Zhang

Variational Autoencoders (VAEs) have been shown to be remarkably effective in recovering model latent spaces for several computer vision tasks.

Clustering

Reading Race: AI Recognises Patient's Racial Identity In Medical Images

no code implementations 21 Jul 2021 Imon Banerjee, Ananth Reddy Bhimireddy, John L. Burns, Leo Anthony Celi, Li-Ching Chen, Ramon Correa, Natalie Dullerud, Marzyeh Ghassemi, Shih-Cheng Huang, Po-Chih Kuo, Matthew P Lungren, Lyle Palmer, Brandon J Price, Saptarshi Purkayastha, Ayis Pyrros, Luke Oakden-Rayner, Chima Okechukwu, Laleh Seyyed-Kalantari, Hari Trivedi, Ryan Wang, Zachary Zaiman, Haoran Zhang, Judy W Gichoya

Methods: Using private and public datasets we evaluate: A) performance quantification of deep learning models to detect race from medical images, including the ability of these models to generalize to external environments and across multiple imaging modalities, B) assessment of possible confounding anatomic and phenotype population features, such as disease distribution and body habitus as predictors of race, and C) investigation into the underlying mechanism by which AI models can recognize race.

An Empirical Framework for Domain Generalization in Clinical Settings

1 code implementation 20 Mar 2021 Haoran Zhang, Natalie Dullerud, Laleh Seyyed-Kalantari, Quaid Morris, Shalmali Joshi, Marzyeh Ghassemi

In this work, we benchmark the performance of eight domain generalization methods on multi-site clinical time series and medical imaging data.

Domain Generalization, Time Series, +1

An Empirical Study of Representation Learning for Reinforcement Learning in Healthcare

1 code implementation 23 Nov 2020 Taylor W. Killian, Haoran Zhang, Jayakumar Subramanian, Mehdi Fatemi, Marzyeh Ghassemi

Reinforcement Learning (RL) has recently been applied to sequential estimation and prediction problems identifying and developing hypothetical treatment strategies for septic patients, with a particular focus on offline learning with observational data.

Open-Ended Question Answering, reinforcement-learning, +2

Automated Topical Component Extraction Using Neural Network Attention Scores from Source-based Essay Scoring

no code implementations ACL 2020 Haoran Zhang, Diane Litman

While automated essay scoring (AES) can reliably grade essays at scale, automated writing evaluation (AWE) additionally provides formative feedback to guide essay revision.

Automated Essay Scoring, Automated Writing Evaluation

Hurtful Words: Quantifying Biases in Clinical Contextual Word Embeddings

1 code implementation 11 Mar 2020 Haoran Zhang, Amy X. Lu, Mohamed Abdalla, Matthew McDermott, Marzyeh Ghassemi

In this work, we examine the extent to which embeddings may encode marginalized populations differently, and how this may lead to a perpetuation of biases and worsened performance on clinical tasks.

Fairness, Word Embeddings

Word Embedding for Response-To-Text Assessment of Evidence

no code implementations ACL 2017 Haoran Zhang, Diane Litman

Our long-term goal is to also use this scoring method to provide formative feedback to students and teachers about students' writing quality.

Automated Essay Scoring

Co-Attention Based Neural Network for Source-Dependent Essay Scoring

1 code implementation WS 2018 Haoran Zhang, Diane Litman

This paper presents an investigation of using a co-attention based neural network for source-dependent essay scoring.

Automated Essay Scoring

Dose-response modeling in high-throughput cancer drug screenings: An end-to-end approach

1 code implementation 13 Dec 2018 Wesley Tansey, Kathy Li, Haoran Zhang, Scott W. Linderman, Raul Rabadan, David M. Blei, Chris H. Wiggins

Personalized cancer treatments based on the molecular profile of a patient's tumor are an emerging and exciting class of treatments in oncology.

Applications

The Holdout Randomization Test for Feature Selection in Black Box Models

3 code implementations 1 Nov 2018 Wesley Tansey, Victor Veitch, Haoran Zhang, Raul Rabadan, David M. Blei

We propose the holdout randomization test (HRT), an approach to feature selection using black box predictive models.

Methodology
