no code implementations • 26 Nov 2024 • Xu Ouyang, Tao Ge, Thomas Hartvigsen, Zhisong Zhang, Haitao Mi, Dong Yu
To gain deeper insights into this trend, we study over 1500 quantized LLM checkpoints of various sizes and at different training levels (undertrained or fully trained) in a controlled setting, deriving scaling laws that relate quantization-induced degradation (QiD) to factors such as the number of training tokens, model size, and bit width.
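As a hedged illustration of how such scaling laws can be fit, the sketch below assumes a power-law form QiD ≈ k · D^a / (N^b · B^c), with D training tokens, N parameters, and B bit width; the functional form and the synthetic measurements are assumptions for demonstration, not the paper's derived law.

```python
# Hedged sketch: fit an ASSUMED power-law QiD scaling form,
# QiD ~ k * D^a / (N^b * B^c), by least squares in log space.
import numpy as np

rng = np.random.default_rng(0)
D = rng.uniform(1e9, 1e12, 200)            # training tokens (synthetic)
N = rng.uniform(1e8, 1e10, 200)            # parameter count (synthetic)
B = rng.choice([2.0, 3.0, 4.0, 8.0], 200)  # bit width (synthetic)
qid = 0.1 * D**0.5 / (N**0.3 * B**2.0) * rng.lognormal(0, 0.05, 200)

# log QiD = log k + a*log D - b*log N - c*log B  ->  linear regression
A = np.column_stack([np.ones_like(D), np.log(D), -np.log(N), -np.log(B)])
(logk, a, b, c), *_ = np.linalg.lstsq(A, np.log(qid), rcond=None)
print(f"fit: QiD ≈ {np.exp(logk):.3g} * D^{a:.2f} / (N^{b:.2f} * B^{c:.2f})")
```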
1 code implementation • 7 Nov 2024 • Walter Gerych, Haoran Zhang, Kimia Hamidieh, Eileen Pan, Maanas Sharma, Thomas Hartvigsen, Marzyeh Ghassemi
Vision-language model (VLM) embeddings have been shown to encode biases present in their training data, such as societal biases that ascribe negative characteristics to members of various racial and gender identities.
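As a minimal sketch of how embedding bias is often probed (an illustrative convention, not necessarily this paper's protocol), one can compare cosine similarities between group-term and attribute-term embeddings:

```python
# Hedged sketch: a simple association gap between two group embeddings and
# one attribute embedding. All vectors below are random stand-ins.
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def association_gap(group_a, group_b, attribute) -> float:
    # Positive: the attribute embeds closer to group A than to group B.
    return cosine(group_a, attribute) - cosine(group_b, attribute)

rng = np.random.default_rng(0)
g_a, g_b, attr = rng.normal(size=(3, 512))  # stand-ins for VLM embeddings
print(f"association gap: {association_gap(g_a, g_b, attr):+.3f}")
```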
no code implementations • 1 Nov 2024 • Kimia Hamidieh, Haoran Zhang, Walter Gerych, Thomas Hartvigsen, Marzyeh Ghassemi
Finally, we analyze the source of these biases, showing that the same harmful stereotypes we identify are also present in a large image-text dataset used to train CLIP models.
1 code implementation • 29 Oct 2024 • Hang Yin, Yao Su, LiPing Liu, Thomas Hartvigsen, Xin Dai, Xiangnan Kong
Spike train classification has recently become an important topic in the machine learning community, where each spike train is a binary event sequence characterized by temporal sparsity of the signals of interest and temporal noise.
no code implementations • 28 Oct 2024 • Matthew Landers, Taylor W. Killian, Hugo Barnes, Thomas Hartvigsen, Afsaneh Doryab
Reinforcement learning problems often involve large action spaces arising from the simultaneous execution of multiple sub-actions, resulting in combinatorial action spaces.
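One standard way to keep such spaces tractable (shown below as a generic baseline, not the paper's method) is to factor the Q-function into one head per sub-action, so greedy action selection never enumerates the exponential joint space:

```python
# Hedged sketch: an additively factored Q-network for combinatorial actions.
import torch
import torch.nn as nn

class FactoredQNetwork(nn.Module):
    def __init__(self, state_dim: int, sub_action_sizes: list[int], hidden: int = 128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        # One Q-head per sub-action dimension avoids enumerating the
        # exponential joint action space.
        self.heads = nn.ModuleList(nn.Linear(hidden, n) for n in sub_action_sizes)

    def forward(self, state: torch.Tensor) -> list[torch.Tensor]:
        h = self.encoder(state)
        return [head(h) for head in self.heads]

    def greedy_action(self, state: torch.Tensor) -> list[torch.Tensor]:
        # Under the additive factorization, the greedy joint action is just
        # the independent argmax of each head.
        return [q.argmax(dim=-1) for q in self(state)]

net = FactoredQNetwork(state_dim=10, sub_action_sizes=[4, 3, 5])
print(net.greedy_action(torch.randn(2, 10)))
```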
no code implementations • 23 Oct 2024 • Dongliang Guo, Mengxuan Hu, Zihan Guan, Junfeng Guo, Thomas Hartvigsen, Sheng Li
Through empirical studies of backdoor attacks on large pre-trained models (e.g., ViT), we identify the following unique challenges of attacking large pre-trained models: 1) the inability to manipulate, or even access, large training datasets, and 2) the substantial computational resources required for training or fine-tuning these models.
1 code implementation • 22 Oct 2024 • Bryan R. Christ, Zack Gottesman, Jonathan Kropko, Thomas Hartvigsen
MathNeuro builds on existing work by using weights and activations to calculate parameter importance, but isolates math-specific parameters by removing those important for general language tasks.
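A minimal sketch of that isolation step, assuming a Wanda-style |weight| × activation-norm importance score and a fixed top-k cutoff; the actual scoring and threshold choices are the paper's details and are not reproduced here:

```python
# Hedged sketch: flag parameters important on math data but not on
# general-language data, via importance-mask set difference.
import torch

def importance(weight: torch.Tensor, acts: torch.Tensor) -> torch.Tensor:
    # Wanda-style score: |W_ij| * norm of input feature j at this layer.
    # weight: (out, in); acts: (batch, in) inputs observed at this layer.
    return weight.abs() * acts.norm(dim=0)

def math_specific_mask(weight, math_acts, gen_acts, top_frac=0.05):
    def top_mask(score):
        k = max(1, int(score.numel() * top_frac))
        return score >= score.flatten().topk(k).values.min()
    # Important for math data but NOT for general-language data.
    return top_mask(importance(weight, math_acts)) & ~top_mask(importance(weight, gen_acts))

W = torch.randn(64, 32)
mask = math_specific_mask(W, torch.randn(8, 32), torch.randn(8, 32))
print(f"{mask.float().mean():.1%} of parameters flagged as math-specific")
```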
no code implementations • 30 Sep 2024 • Shan Chen, Mingye Gao, Kuleen Sasse, Thomas Hartvigsen, Brian Anthony, Lizhou Fan, Hugo Aerts, Jack Gallifant, Danielle Bitterman
Background: Large language models (LLMs) are trained to follow directions, but this introduces a vulnerability: they may blindly comply with user requests even when doing so produces wrong information.
1 code implementation • 11 Jul 2024 • Kumail Alhamoud, Yasir Ghunaim, Motasem Alfarra, Thomas Hartvigsen, Philip Torr, Bernard Ghanem, Adel Bibi, Marzyeh Ghassemi
In response, we introduce FedMedICL, a unified framework and benchmark to holistically evaluate federated medical imaging challenges, simultaneously capturing label, demographic, and temporal distribution shifts.
1 code implementation • 9 Jul 2024 • Arinbjorn Kolbeinsson, Kyle O'Brien, Tianjin Huang, ShangHua Gao, Shiwei Liu, Jonathan Richard Schwarz, Anurag Vaidya, Faisal Mahmood, Marinka Zitnik, Tianlong Chen, Thomas Hartvigsen
Test-time interventions for language models can enhance factual accuracy, mitigate harmful outputs, and improve model efficiency without costly retraining.
3 code implementations • 22 Jun 2024 • Mingtian Tan, Mike A. Merrill, Vinayak Gupta, Tim Althoff, Thomas Hartvigsen
Large language models (LLMs) are being applied to time series forecasting.
1 code implementation • 17 Jun 2024 • Jack Gallifant, Shan Chen, Pedro Moreira, Nikolaj Munch, Mingye Gao, Jackson Pond, Leo Anthony Celi, Hugo Aerts, Thomas Hartvigsen, Danielle Bitterman
Medical knowledge is context-dependent and requires consistent reasoning across various natural language expressions of semantically equivalent phrases.
no code implementations • 29 May 2024 • Shenghuan Sun, Alexander Schubert, Gregory M. Goldgof, Zhiqing Sun, Thomas Hartvigsen, Atul J. Butte, Ahmed Alaa
For this purpose, we propose a new alignment algorithm that uses symbolic representations of clinical reasoning to ground VLMs in medical knowledge.
2 code implementations • 15 May 2024 • Devansh Jain, Priyanshu Kumar, Samuel Gehman, Xuhui Zhou, Thomas Hartvigsen, Maarten Sap
Recent advances in large language models (LLMs) have led to their extensive global deployment, and ensuring their safety calls for comprehensive and multilingual toxicity evaluations.
1 code implementation • 23 Apr 2024 • Derek Powell, Walter Gerych, Thomas Hartvigsen
For example, in learning that a korat is a type of cat, you also infer that it is a mammal and has claws, keeping your model of the world consistent.
1 code implementation • 29 Feb 2024 • ShangHua Gao, Teddy Koker, Owen Queen, Thomas Hartvigsen, Theodoros Tsiligkaridis, Marinka Zitnik
We introduce UniTS, a unified multi-task time series model that utilizes task tokenization to integrate predictive and generative tasks into a single framework.
Ranked #9 on Time Series Forecasting on ETTh1 (336) Multivariate
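A minimal sketch of the task-tokenization idea, assuming learnable per-task tokens prepended to the embedded series so one shared transformer serves every task; shapes and names are illustrative, see the released code for the real architecture:

```python
# Hedged sketch: task tokens let one encoder serve multiple tasks.
import torch
import torch.nn as nn

class TaskTokenizedEncoder(nn.Module):
    def __init__(self, n_tasks: int, d_model: int = 64):
        super().__init__()
        self.task_tokens = nn.Parameter(torch.randn(n_tasks, 1, d_model))
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.embed = nn.Linear(1, d_model)  # embed each timestep value

    def forward(self, series: torch.Tensor, task_id: int) -> torch.Tensor:
        # series: (batch, time), univariate for simplicity
        x = self.embed(series.unsqueeze(-1))                    # (B, T, d)
        tok = self.task_tokens[task_id].expand(x.size(0), -1, -1)
        h = self.encoder(torch.cat([tok, x], dim=1))
        return h[:, 0]  # the task-token representation drives the task head

model = TaskTokenizedEncoder(n_tasks=3)
print(model(torch.randn(2, 48), task_id=1).shape)  # torch.Size([2, 64])
```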
2 code implementations • 24 Feb 2024 • Bryan R Christ, Jonathan Kropko, Thomas Hartvigsen
To be educational, problems must be solvable, have accurate answers, and, most importantly, be educationally appropriate.
1 code implementation • 13 Feb 2024 • Kyle O'Brien, Nathan Ng, Isha Puri, Jorge Mendez, Hamid Palangi, Yoon Kim, Marzyeh Ghassemi, Thomas Hartvigsen
Machine learning models for text classification often excel on in-distribution (ID) data but struggle with unseen out-of-distribution (OOD) inputs.
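A hedged sketch of one way to buy robustness at test time: classify several rewrites of the input and aggregate votes. The `paraphrase` function below is a hypothetical stand-in for any rewriter (for instance an LLM), not this paper's exact pipeline:

```python
# Hedged sketch: test-time augmentation by majority vote over rewrites.
from collections import Counter

def paraphrase(text: str, n: int) -> list[str]:
    # Hypothetical placeholder; a real system would call a rewriter here.
    return [text] * n

def robust_predict(classify, text: str, n: int = 5) -> str:
    votes = Counter(classify(t) for t in [text, *paraphrase(text, n)])
    return votes.most_common(1)[0][0]

# Toy classifier purely for demonstration.
print(robust_predict(lambda t: "positive" if "good" in t else "negative",
                     "this movie was good"))
```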
no code implementations • 6 Feb 2024 • Sujay Nagaraj, Walter Gerych, Sana Tonekaboni, Anna Goldenberg, Berk Ustun, Thomas Hartvigsen
We first demonstrate the importance of modelling the temporal nature of the label noise function and show that existing methods consistently underperform in this setting.
no code implementations • 1 Dec 2023 • Stefan Hegselmann, Antonio Parziale, Divya Shanmugam, Shengpu Tang, Mercy Nyamewaa Asiedu, Serina Chang, Thomas Hartvigsen, Harvineet Singh
A collection of the accepted Findings papers that were presented at the 3rd Machine Learning for Health symposium (ML4H 2023), which was held on December 10, 2023, in New Orleans, Louisiana, USA.
no code implementations • 4 Nov 2023 • Hang Yin, Yao Su, Xinyue Liu, Thomas Hartvigsen, Yanhua Li, Xiangnan Kong
We refer to such brain networks as multi-state, and modeling this mixture of states can help us understand human behavior.
1 code implementation • 25 Jul 2023 • Taylor W. Killian, Haoran Zhang, Thomas Hartvigsen, Ava P. Amini
Irregular time series are prevalent in many real-world settings, such as healthcare, yet they are challenging to formulate predictions from.
1 code implementation • 7 Apr 2023 • Tianhua Zhang, Hongyin Luo, Yung-Sung Chuang, Wei Fang, Luc Gaitskell, Thomas Hartvigsen, Xixin Wu, Danny Fox, Helen Meng, James Glass
Despite recent concerns about undesirable behaviors generated by large language models (LLMs), including non-factual, biased, and hateful language, we find that LLMs are inherently capable multi-task language checkers, based on their latent representations of natural and social knowledge.
no code implementations • 8 Feb 2023 • Thomas Hartvigsen, Jidapa Thadajarassiri, Xiangnan Kong, Elke Rundensteiner
Using this insight, we then propose CAT, a model that classifies multivariate irregular time series (ITS) by explicitly seeking highly relevant portions of an input series' timeline.
1 code implementation • NeurIPS 2023 • Thomas Hartvigsen, Swami Sankaranarayanan, Hamid Palangi, Yoon Kim, Marzyeh Ghassemi
We propose GRACE, a lifelong model editing method, which implements spot-fixes on streaming errors of a deployed model, ensuring minimal impact on unrelated inputs.
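A minimal sketch of the spot-fix mechanism: a codebook adaptor wrapping one layer, where each edit caches a key activation, a corrected output, and a deferral radius, and inputs far from every key pass through untouched. This compresses the method; see the paper for the full algorithm:

```python
# Hedged sketch: a codebook adaptor for lifelong spot-fixing.
import torch
import torch.nn as nn

class GraceAdaptor(nn.Module):
    def __init__(self, layer: nn.Module, eps: float = 1.0):
        super().__init__()
        self.layer, self.eps = layer, eps
        self.keys: list[torch.Tensor] = []  # cached input activations
        self.values = nn.ParameterList()    # learned replacement outputs

    def add_edit(self, key: torch.Tensor, value: torch.Tensor):
        self.keys.append(key.detach())
        self.values.append(nn.Parameter(value.clone()))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.layer(x)
        for key, value in zip(self.keys, self.values):
            hit = (x - key).norm(dim=-1) < self.eps  # within deferral radius
            out = torch.where(hit.unsqueeze(-1), value, out)
        return out

adaptor = GraceAdaptor(nn.Linear(8, 8))
x = torch.randn(8)
adaptor.add_edit(key=x, value=torch.zeros(8))
print(adaptor(x))         # edited: returns the cached value
print(adaptor(x + 10.0))  # far from every key: plain layer output
```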
1 code implementation • 11 Oct 2022 • Ramesh Doddaiah, Prathyush Parvatharaju, Elke Rundensteiner, Thomas Hartvigsen
Instead, when a classifier is choosing between many classes, an effective explanation must show what sets the chosen class apart from the rest.
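A minimal sketch of that contrastive view, using generic gradient saliency on the margin between the chosen class's logit and its best rival (an illustration, not the paper's method):

```python
# Hedged sketch: class-contrastive gradient saliency for a classifier.
import torch

def contrastive_saliency(model, x: torch.Tensor, target: int) -> torch.Tensor:
    x = x.clone().requires_grad_(True)
    logits = model(x)
    others = torch.cat([logits[..., :target], logits[..., target + 1:]], dim=-1)
    margin = logits[..., target] - others.max(dim=-1).values
    margin.sum().backward()
    return x.grad.abs()  # high values = timesteps separating target from rivals

model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(50, 5))
print(contrastive_saliency(model, torch.randn(1, 50), target=2).shape)
```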
1 code implementation • 21 Aug 2022 • Thomas Hartvigsen, Walter Gerych, Jidapa Thadajarassiri, Xiangnan Kong, Elke Rundensteiner
We bridge this gap and study early classification of irregular time series, a new setting for early classifiers that opens the door to more real-world problems.
no code implementations • LREC 2022 • Ruofan Hu, Dongyu Zhang, Dandan Tao, Thomas Hartvigsen, Hao Feng, Elke Rundensteiner
To accelerate the development of machine learning-based models for foodborne outbreak detection, we thus present TWEET-FID (TWEET-Foodborne Illness Detection), the first publicly available annotated dataset for multiple foodborne illness incident detection tasks.
no code implementations • 6 May 2022 • Aparna Balagopalan, Haoran Zhang, Kimia Hamidieh, Thomas Hartvigsen, Frank Rudzicz, Marzyeh Ghassemi
Across two different blackbox model architectures and four popular explainability methods, we find that the approximation quality of explanation models, also known as fidelity, differs significantly between subgroups.
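A minimal sketch of the measurement itself, assuming fidelity is read as per-subgroup agreement between the blackbox and its explanation surrogate:

```python
# Hedged sketch: per-subgroup fidelity as blackbox/surrogate agreement.
import numpy as np

def fidelity_by_group(blackbox_preds, surrogate_preds, groups):
    blackbox_preds, surrogate_preds, groups = map(
        np.asarray, (blackbox_preds, surrogate_preds, groups))
    return {g: float((blackbox_preds[groups == g] ==
                      surrogate_preds[groups == g]).mean())
            for g in np.unique(groups)}

# Toy example: fidelity differs sharply between groups A and B.
print(fidelity_by_group([1, 1, 0, 0, 1, 0],
                        [1, 1, 0, 1, 0, 1],
                        ["A", "A", "A", "B", "B", "B"]))
```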
1 code implementation • ACL 2022 • Thomas Hartvigsen, Saadia Gabriel, Hamid Palangi, Maarten Sap, Dipankar Ray, Ece Kamar
To help mitigate these issues, we create ToxiGen, a new large-scale and machine-generated dataset of 274k toxic and benign statements about 13 minority groups.
no code implementations • 1 Jan 2021 • Walter Gerych, Thomas Hartvigsen, Luke Buquicchio, Kavin Chandrasekaran, Hamid Mansoor, Abdulaziz Alajaji
In this work, we propose DeepSPU, the first method to address this sequential bias problem.
no code implementations • ACL 2020 • Cansu Sen, Thomas Hartvigsen, Biao Yin, Xiangnan Kong, Elke Rundensteiner
Motivated by human attention, computational attention mechanisms have been designed to help neural networks adjust their focus on specific parts of the input data.
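For reference, the core of most such mechanisms is a softmax-weighted mixture over the input; a minimal scaled dot-product attention sketch:

```python
# Minimal scaled dot-product attention: weights over the input are a
# softmax of query-key similarity, then used to mix the values.
import torch
import torch.nn.functional as F

def attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    scores = q @ k.transpose(-2, -1) / k.size(-1) ** 0.5
    return F.softmax(scores, dim=-1) @ v

q, k, v = (torch.randn(2, 5, 16) for _ in range(3))
print(attention(q, k, v).shape)  # torch.Size([2, 5, 16])
```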
no code implementations • 25 Sep 2019 • Thomas Hartvigsen, Cansu Sen, Xiangnan Kong, Elke Rundensteiner
As a result, even for high-dimensional hidden states, all dimensions are updated at every timestep, regardless of the choice of recurrent memory cell.
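A minimal sketch of the alternative this points toward: updating only a subset of hidden dimensions per step. The random mask below is a placeholder; the selection mechanism in the paper is learned and is not reproduced here:

```python
# Hedged sketch: a GRU cell where only a fraction of hidden dimensions
# accept the new value at each timestep.
import torch
import torch.nn as nn

class PartialUpdateGRUCell(nn.Module):
    def __init__(self, input_size: int, hidden_size: int, update_frac: float = 0.25):
        super().__init__()
        self.cell = nn.GRUCell(input_size, hidden_size)
        self.update_frac = update_frac

    def forward(self, x: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
        candidate = self.cell(x, h)
        # Only a random subset of dimensions updates this step (placeholder
        # for a learned selection mechanism).
        mask = (torch.rand_like(h) < self.update_frac).float()
        return mask * candidate + (1 - mask) * h

cell = PartialUpdateGRUCell(8, 16)
h = cell(torch.randn(2, 8), torch.zeros(2, 16))
print(h.shape)  # torch.Size([2, 16])
```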
1 code implementation • KDD 2019 • Thomas Hartvigsen, Cansu Sen, Xiangnan Kong, Elke Rundensteiner
Early classification of time series is the prediction of the class label of a time series before it is observed in its entirety.
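A minimal sketch of the setting, using a confidence-thresholded halting rule as a generic stand-in for a learned stopping policy:

```python
# Hedged sketch: classify a stream step by step and halt once confident.
import torch
import torch.nn as nn
import torch.nn.functional as F

class EarlyClassifier(nn.Module):
    def __init__(self, n_features: int, n_classes: int, hidden: int = 32):
        super().__init__()
        self.rnn = nn.GRU(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, series: torch.Tensor, threshold: float = 0.9):
        h = None
        for t in range(series.size(1)):
            _, h = self.rnn(series[:, t:t + 1], h)
            probs = F.softmax(self.head(h[-1]), dim=-1)
            conf, pred = probs.max(dim=-1)
            if bool((conf > threshold).all()):  # halt once confident
                return pred, t + 1
        return pred, series.size(1)             # fell back to the full series

clf = EarlyClassifier(n_features=3, n_classes=4)
pred, steps = clf(torch.randn(1, 20, 3))
print(f"predicted class {int(pred)} after {steps} of 20 steps")
```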