1 code implementation • 29 Apr 2024 • Aaron J. Li, Satyapriya Krishna, Himabindu Lakkaraju
The surge in the development of Large Language Models (LLMs) has led to improved performance on cognitive tasks, as well as an urgent need to align these models with human values in order to safely harness their power.
4 code implementations • 8 Apr 2024 • Bo Peng, Daniel Goldstein, Quentin Anthony, Alon Albalak, Eric Alcaide, Stella Biderman, Eugene Cheah, Xingjian Du, Teddy Ferdinan, Haowen Hou, Przemysław Kazienko, Kranthi Kiran GV, Jan Kocoń, Bartłomiej Koptyra, Satyapriya Krishna, Ronald McClelland Jr., Niklas Muennighoff, Fares Obeid, Atsushi Saito, Guangyu Song, Haoqin Tu, Stanisław Woźniak, Ruichong Zhang, Bingchen Zhao, Qihang Zhao, Peng Zhou, Jian Zhu, Rui-Jie Zhu
We present Eagle (RWKV-5) and Finch (RWKV-6), sequence models improving upon the RWKV (RWKV-4) architecture.
no code implementations • 9 Feb 2024 • Satyapriya Krishna, Chirag Agarwal, Himabindu Lakkaraju
The development of Large Language Models (LLMs) has notably transformed numerous sectors, offering impressive text generation capabilities.
no code implementations • 25 Jan 2024 • Stephen Casper, Carson Ezell, Charlotte Siegmann, Noam Kolt, Taylor Lynn Curtis, Benjamin Bucknall, Andreas Haupt, Kevin Wei, Jérémy Scheurer, Marius Hobbhahn, Lee Sharkey, Satyapriya Krishna, Marvin Von Hagen, Silas Alberti, Alan Chan, Qinyi Sun, Michael Gerovitch, David Bau, Max Tegmark, David Krueger, Dylan Hadfield-Menell
The effectiveness of an audit, however, depends on the degree of system access granted to auditors.
no code implementations • 6 Nov 2023 • Satyapriya Krishna
Large Language Models (LLMs) have demonstrated remarkable capabilities in performing complex cognitive tasks.
1 code implementation • 9 Oct 2023 • Nicholas Kroeger, Dan Ley, Satyapriya Krishna, Chirag Agarwal, Himabindu Lakkaraju
To this end, several approaches have been proposed in recent literature to explain the behavior of complex predictive models in a post hoc fashion.
no code implementations • 28 Sep 2023 • Satyapriya Krishna, Chirag Agarwal, Himabindu Lakkaraju
As machine learning models are increasingly being employed in various high-stakes settings, it becomes important to ensure that predictions of these models are not only adversarially robust, but also readily explainable to relevant stakeholders.
no code implementations • NeurIPS 2023 • Satyapriya Krishna, Jiaqi Ma, Dylan Slack, Asma Ghandeharioun, Sameer Singh, Himabindu Lakkaraju
Large Language Models (LLMs) have demonstrated remarkable capabilities in performing complex tasks.
no code implementations • 8 Feb 2023 • Satyapriya Krishna, Jiaqi Ma, Himabindu Lakkaraju
The Right to Explanation and the Right to be Forgotten are two important principles outlined to regulate algorithmic decision making and data usage in real-world applications.
1 code implementation • 8 Jul 2022 • Dylan Slack, Satyapriya Krishna, Himabindu Lakkaraju, Sameer Singh
In real-world evaluations with humans, 73% of healthcare workers (e.g., doctors and nurses) agreed they would use TalkToModel over baseline point-and-click systems for explainability in a disease prediction task, and 85% of ML professionals agreed TalkToModel was easier to use for computing explanations.
2 code implementations • 22 Jun 2022 • Chirag Agarwal, Dan Ley, Satyapriya Krishna, Eshika Saxena, Martin Pawelczyk, Nari Johnson, Isha Puri, Marinka Zitnik, Himabindu Lakkaraju
OpenXAI comprises the following key components: (i) a flexible synthetic data generator and a collection of diverse real-world datasets, pre-trained models, and state-of-the-art feature attribution methods, and (ii) open-source implementations of eleven quantitative metrics for evaluating the faithfulness, stability (robustness), and fairness of explanation methods, in turn providing comparisons of several explanation methods across a wide variety of metrics, models, and datasets.
no code implementations • Findings (ACL) 2022 • Umang Gupta, Jwala Dhamala, Varun Kumar, Apurv Verma, Yada Pruksachatkun, Satyapriya Krishna, Rahul Gupta, Kai-Wei Chang, Greg Ver Steeg, Aram Galstyan
Language models excel at generating coherent text, and model compression techniques such as knowledge distillation have enabled their use in resource-constrained settings.
no code implementations • ACL 2022 • Satyapriya Krishna, Rahul Gupta, Apurv Verma, Jwala Dhamala, Yada Pruksachatkun, Kai-Wei Chang
With the rapid growth in language processing applications, fairness has emerged as an important consideration in data-driven solutions.
no code implementations • 14 Mar 2022 • Chirag Agarwal, Nari Johnson, Martin Pawelczyk, Satyapriya Krishna, Eshika Saxena, Marinka Zitnik, Himabindu Lakkaraju
As attribution-based explanation methods are increasingly used to establish model trustworthiness in high-stakes situations, it is critical to ensure that these explanations are stable, e.g., robust to infinitesimal perturbations to an input.
no code implementations • 3 Feb 2022 • Satyapriya Krishna, Tessa Han, Alex Gu, Javin Pombra, Shahin Jabbari, Steven Wu, Himabindu Lakkaraju
To this end, we first conduct interviews with data scientists to understand what constitutes disagreement between explanations generated by different methods for the same model prediction, and introduce a novel quantitative framework to formalize this understanding.
1 code implementation • Findings (EMNLP) 2021 • Justin Payan, Yuval Merhav, He Xie, Satyapriya Krishna, Anil Ramakrishna, Mukund Sridhar, Rahul Gupta
There is an increasing interest in continuous learning (CL), as data privacy is becoming a priority for real-world machine learning applications.
no code implementations • Findings (ACL) 2021 • Yada Pruksachatkun, Satyapriya Krishna, Jwala Dhamala, Rahul Gupta, Kai-Wei Chang
Existing bias mitigation methods to reduce disparities in model outcomes across cohorts have focused on data augmentation, debiasing model embeddings, or adding fairness-based optimization objectives during training.
no code implementations • 3 Jun 2021 • Michiel de Jong, Satyapriya Krishna, Anuva Agarwal
Training a reinforcement learning agent to carry out natural language instructions is limited by the available supervision, i.e., knowing when the instruction has been carried out.
2 code implementations • EACL 2021 • Satyapriya Krishna, Rahul Gupta, Christophe Dupuy
We prove the theoretical privacy guarantee of our algorithm and assess its privacy leakage under Membership Inference Attacks (MIA) (Shokri et al., 2017) on models trained with transformed data.
1 code implementation • 27 Jan 2021 • Jwala Dhamala, Tony Sun, Varun Kumar, Satyapriya Krishna, Yada Pruksachatkun, Kai-Wei Chang, Rahul Gupta
To systematically study and benchmark social biases in open-ended language generation, we introduce the Bias in Open-Ended Language Generation Dataset (BOLD), a large-scale dataset that consists of 23,679 English text generation prompts for bias benchmarking across five domains: profession, gender, race, religion, and political ideology.
no code implementations • 16 May 2020 • Aarsh Patel, Rahul Gupta, Mukund Harakere, Satyapriya Krishna, Aman Alok, Peng Liu
In this research work, we aim to achieve classification parity across explicit as well as implicit sensitive features.
no code implementations • 25 Oct 2019 • Yunzhe Tao, Saurabh Gupta, Satyapriya Krishna, Xiong Zhou, Orchid Majumder, Vineet Khare
Training deep neural networks from scratch on natural language processing (NLP) tasks requires a significant amount of manually labeled text corpus and substantial time to converge, which usually cannot be satisfied by customers.