no code implementations • 21 Nov 2024 • Victor-Alexandru Pădurean, Paul Denny, Adish Singla
This paper introduces BugSpotter, an innovative tool that leverages an LLM to generate buggy code from a problem description and verify the synthesized bugs via a test suite.
no code implementations • 17 Jun 2024 • Chao Wen, Jacqueline Staub, Adish Singla
The benchmark comprises 85 real-world tasks from the Mini-level of the XLogoOnline environment, each requiring a combination of different skills such as spatial planning, basic programming, and logical reasoning.
no code implementations • 14 Jun 2024 • Victor-Alexandru Pădurean, Adish Singla
To further boost the performance of these models, we fine-tune them using a novel synthetic data generation methodology.
no code implementations • 7 Jun 2024 • Nachiket Kotalwar, Alkis Gotovos, Adish Singla
To boost the feedback quality of small models compatible with in-browser inference engines, we develop a fine-tuning pipeline based on GPT-4 generated synthetic data.
1 code implementation • 3 May 2024 • Georgios Tzannetos, Parameswaran Kamalaruban, Adish Singla
a target distribution over complex tasks.
no code implementations • 29 Apr 2024 • Bahar Radmehr, Adish Singla, Tanja Käser
In this paper, we aim to enhance the generalization capabilities of agents in open-ended text-based learning environments by integrating Reinforcement Learning (RL) with Large Language Models (LLMs).
no code implementations • 4 Mar 2024 • Andi Nika, Debmalya Mandal, Adish Singla, Goran Radanović
We note that we are the first to provide such a characterization of the problem of learning approximate Nash Equilibrium policies in offline two-player zero-sum Markov games under data corruption.
no code implementations • 4 Mar 2024 • Andi Nika, Debmalya Mandal, Parameswaran Kamalaruban, Georgios Tzannetos, Goran Radanović, Adish Singla
Moreover, we extend our analysis to the approximate optimization setting and derive exponentially decaying convergence rates for both RLHF and DPO.
1 code implementation • 10 Feb 2024 • Rati Devidze, Parameswaran Kamalaruban, Adish Singla
Reward functions are central in specifying the task we want a reinforcement learning agent to perform.
no code implementations • 9 Feb 2024 • Debmalya Mandal, Andi Nika, Parameswaran Kamalaruban, Adish Singla, Goran Radanović
We aim to design algorithms that identify a near-optimal policy from the corrupted data, with provable guarantees.
no code implementations • 2 Feb 2024 • Paul Denny, Sumit Gulwani, Neil T. Heffernan, Tanja Käser, Steven Moore, Anna N. Rafferty, Adish Singla
This survey article has grown out of the GAIED (pronounced "guide") workshop organized by the authors at the NeurIPS 2023 conference.
no code implementations • 27 Dec 2023 • Timo Klein, Susanna Weinberger, Adish Singla, Sebastian Tschiatschek
We consider the problem of third-person imitation learning with the additional challenge that the learner must select the perspective from which they observe the expert.
no code implementations • 26 Nov 2023 • Shubham Kumar Bharti, Stephen Wright, Adish Singla, Xiaojin Zhu
The goal of the teacher is to teach a realizable target policy to the learner using a minimum number of state demonstrations.
1 code implementation • 15 Oct 2023 • Manh Hung Nguyen, Sebastian Tschiatschek, Adish Singla
We instantiate several methods based on LLM-SS framework and evaluate them using an existing benchmark, StudentSyn, for student attempt synthesis in a visual programming domain.
2 code implementations • 5 Oct 2023 • Tung Phung, Victor-Alexandru Pădurean, Anjali Singh, Christopher Brooks, José Cambronero, Sumit Gulwani, Adish Singla, Gustavo Soares
We investigate the role of generative AI models in providing human tutor-style programming hints to help students resolve errors in their buggy programs.
no code implementations • 30 Jul 2023 • Adish Singla
In our work, we evaluate two models, ChatGPT (based on GPT-3.5) and GPT-4, in visual programming domains for various scenarios and assess performance using expert-based annotations.
no code implementations • 29 Jun 2023 • Tung Phung, Victor-Alexandru Pădurean, José Cambronero, Sumit Gulwani, Tobias Kohn, Rupak Majumdar, Adish Singla, Gustavo Soares
In our work, we systematically evaluate two models, ChatGPT (based on GPT-3.5) and GPT-4, and compare their performance with human tutors for a variety of scenarios.
2 code implementations • 5 Jun 2023 • Mridul Mahajan, Georgios Tzannetos, Goran Radanovic, Adish Singla
We present an information-theoretic framework to learn fixed-dimensional embeddings for tasks in reinforcement learning.
1 code implementation • 27 May 2023 • Alperen Tercan, Ahana Ghosh, Hasan Ferit Eniser, Maria Christakis, Adish Singla
We propose a novel synthesis algorithm that generates a progression of subtasks that are high-quality, well-spaced in terms of their complexity, and solving this progression leads to solving the reference task.
1 code implementation • 26 May 2023 • Victor-Alexandru Pădurean, Georgios Tzannetos, Adish Singla
Generative neural models hold great promise in enhancing programming education by synthesizing new content.
1 code implementation • 25 Apr 2023 • Georgios Tzannetos, Bárbara Gomes Ribeiro, Parameswaran Kamalaruban, Adish Singla
We consider the problem of curriculum design for reinforcement learning (RL) agents in contextual multi-task settings.
1 code implementation • 28 Mar 2023 • Ahana Ghosh, Sebastian Tschiatschek, Sam Devlin, Adish Singla
We introduce a scaffolding framework based on pop quizzes presented as multi-choice programming tasks.
1 code implementation • 27 Feb 2023 • Mohammad Mohammadi, Jonathan Nöther, Debmalya Mandal, Adish Singla, Goran Radanovic
In this paper, we study targeted poisoning attacks in a two-agent setting where an attacker implicitly poisons the effective environment of one of the agents by modifying the policy of its peer.
no code implementations • 7 Feb 2023 • Debmalya Mandal, Goran Radanovic, Jiarui Gan, Adish Singla, Rupak Majumdar
We show that minimizing regret with this new general discounting is equivalent to minimizing regret with uncertain episode lengths.
1 code implementation • 24 Jan 2023 • Tung Phung, José Cambronero, Sumit Gulwani, Tobias Kohn, Rupak Majumdar, Adish Singla, Gustavo Soares
We investigate using LLMs to generate feedback for fixing syntax errors in Python programs, a key scenario in introductory programming.
1 code implementation • 18 Nov 2022 • Shubham Kumar Bharti, Xuezhou Zhang, Adish Singla, Xiaojin Zhu
Instead, our defense mechanism sanitizes the backdoor policy by projecting observed states to a 'safe subspace', estimated from a small number of interactions with a clean (non-triggered) environment.
1 code implementation • 13 Jun 2022 • Maria Christakis, Hasan Ferit Eniser, Jörg Hoffmann, Adish Singla, Valentin Wüstholz
Here, we show the wide applicability of $k$-safety properties for machine-learning models and present the first specification language for expressing them.
no code implementations • 4 May 2022 • Sebastian Tschiatschek, Maria Knobelsdorf, Adish Singla
We consider the equity and fairness of curricula derived from Knowledge Tracing models.
1 code implementation • 3 May 2022 • Adish Singla, Nikitas Theodoropoulos
We introduce a novel benchmark, StudentSyn, centered around the following challenge: For a given student, synthesize the student's attempt on a new target task after observing the student's attempt on a fixed reference task.
no code implementations • 1 Apr 2022 • Stelios Triantafyllou, Adish Singla, Goran Radanovic
Responsibility attribution is complementary and aims to identify the extent to which decision makers (agents) are responsible for this outcome.
no code implementations • 6 Jan 2022 • Kiarash Banihashem, Adish Singla, Jiarui Gan, Goran Radanovic
This problem can be viewed as a dual to the problem of optimal reward poisoning attacks: instead of forcing an agent to adopt a specific policy, the reward designer incentivizes an agent to avoid taking actions that are inadmissible in certain states.
1 code implementation • NeurIPS 2021 • Rati Devidze, Goran Radanovic, Parameswaran Kamalaruban, Adish Singla
By being explicable, we seek to capture two properties: (a) informativeness so that the rewards speed up the agent's convergence, and (b) sparseness as a proxy for ease of interpretability of the rewards.
no code implementations • NeurIPS 2021 • Akash Kumar, Yuxin Chen, Adish Singla
This learning paradigm has been extensively studied when the learner receives worst-case or random counterexamples; in this paper, we consider the optimal teacher who picks best-case counterexamples to teach the target hypothesis within a hypothesis class.
no code implementations • NeurIPS 2021 • Chaoqi Wang, Adish Singla, Yuxin Chen
Our focus is to design a teaching algorithm that can provide an informative sequence of contrastive examples to the learner to speed up the learning process.
no code implementations • 22 Oct 2021 • Anshuman Chhabra, Adish Singla, Prasant Mohapatra
As a first step, we propose a fairness degrading attack algorithm for k-median clustering that operates under a whitebox threat model -- where the clustering algorithm, fairness notion, and the input dataset are known to the adversary.
no code implementations • 23 Sep 2021 • Eleni Straitouri, Adish Singla, Vahid Balazadeh Meresht, Manuel Gomez-Rodriguez
Methods to learn under algorithmic triage have predominantly focused on supervised learning settings where each decision, or prediction, is independent of each other.
no code implementations • NeurIPS 2021 • Stelios Triantafyllou, Adish Singla, Goran Radanovic
We formalize desirable properties of blame attribution in the setting of interest, and we analyze the relationship between these properties and the studied blame attribution methods.
no code implementations • 15 Jul 2021 • Adish Singla, Anna N. Rafferty, Goran Radanovic, Neil T. Heffernan
This survey article has grown out of the RL4ED workshop organized by the authors at the Educational Data Mining (EDM) 2021 conference.
1 code implementation • NeurIPS 2021 • Gaurav Yengera, Rati Devidze, Parameswaran Kamalaruban, Adish Singla
In particular, we study how to design a personalized curriculum over demonstrations to speed up the learner's convergence.
no code implementations • 1 Jun 2021 • Anshuman Chhabra, Adish Singla, Prasant Mohapatra
Extensive experiments on different clustering algorithms and fairness notions show that our algorithms can achieve desired levels of fairness on many real-world datasets with a very small percentage of antidote data added.
1 code implementation • 10 May 2021 • Junaid Ali, Muhammad Bilal Zafar, Adish Singla, Krishna P. Gummadi
Motivated by extensive literature in behavioral economics and behavioral psychology (prospect theory), we propose a notion of fair updates that we refer to as loss-averse updates.
no code implementations • 16 Feb 2021 • Amin Rakhsha, Xuezhou Zhang, Xiaojin Zhu, Adish Singla
We study black-box reward poisoning attacks against reinforcement learning (RL), in which an adversary aims to manipulate the rewards to mislead a sequence of RL agents with unknown algorithms to learn a nefarious policy in an environment unknown to the adversary a priori.
no code implementations • 10 Feb 2021 • Kiarash Banihashem, Adish Singla, Goran Radanovic
As a threat model, we consider attacks that minimally alter rewards to make the attacker's target policy uniquely optimal under the poisoned rewards, with the optimality gap specified by an attack parameter.
no code implementations • 21 Nov 2020 • Amin Rakhsha, Goran Radanovic, Rati Devidze, Xiaojin Zhu, Adish Singla
We provide lower/upper bounds on the attack cost, and instantiate our attacks in two settings: (i) an offline setting where the agent is doing planning in the poisoned environment, and (ii) an online setting where the agent is learning a policy with poisoned feedback.
no code implementations • 27 Oct 2020 • Akash Kumar, Hanqi Zhang, Adish Singla, Yuxin Chen
As a warm-up, we show that the teaching complexity is $\Theta(d)$ for the exact teaching of linear perceptrons in $\mathbb{R}^d$, and $\Theta(d^k)$ for kernel perceptron with a polynomial kernel of order $k$.
no code implementations • 17 Oct 2020 • Farnam Mansouri, Yuxin Chen, Ara Vartanian, Xiaojin Zhu, Adish Singla
We analyze several properties of the teaching complexity parameter $TD(\sigma)$ associated with different families of the preference functions, e.g., comparison to the VC dimension of the hypothesis class and additivity/sub-additivity of $TD(\sigma)$ over disjoint domains.
no code implementations • 25 Jun 2020 • Akash Kumar, Adish Singla, Yisong Yue, Yuxin Chen
We investigate the average teaching complexity of the task, i.e., the minimal number of samples (halfspace queries) required by a teacher to help a version-space learner in locating a randomly selected target.
no code implementations • 23 Jun 2020 • Parameswaran Kamalaruban, Rati Devidze, Volkan Cevher, Adish Singla
However, the applicability of potential-based reward shaping is limited in settings where (i) the state space is very large, and it is challenging to compute an appropriate potential function, (ii) the feedback signals are noisy, and even with shaped rewards the agent could be trapped in local optima, and (iii) changing the rewards alone is not sufficient, and effective shaping requires changing the dynamics.
1 code implementation • NeurIPS 2020 • Umair Z. Ahmed, Maria Christakis, Aleksandr Efremov, Nigel Fernandez, Ahana Ghosh, Abhik Roychoudhury, Adish Singla
Our task synthesis algorithm operates by first mutating code $\rm C^{in}$ to obtain a set of codes $\{\rm C^{out}\}$.
no code implementations • NeurIPS 2020 • Xuezhou Zhang, Yuzhe Ma, Adish Singla
To address these challenges, we propose the task-agnostic RL framework: In the exploration phase, the agent first collects trajectories by exploring the MDP without the guidance of a reward function.
no code implementations • 16 Jun 2020 • Xuezhou Zhang, Shubham Kumar Bharti, Yuzhe Ma, Adish Singla, Xiaojin Zhu
Our TDim results provide the minimum number of samples needed for reinforcement learning, and we discuss their connections to standard PAC-style RL sample complexity and teaching-by-demonstration sample complexity results.
1 code implementation • ICML 2020 • Amin Rakhsha, Goran Radanovic, Rati Devidze, Xiaojin Zhu, Adish Singla
We study a security threat to reinforcement learning where an attacker poisons the learning environment to force the agent into executing a target policy chosen by the attacker.
no code implementations • ICML 2020 • Xuezhou Zhang, Yuzhe Ma, Adish Singla, Xiaojin Zhu
In reward-poisoning attacks against reinforcement learning (RL), an attacker can perturb the environment reward $r_t$ into $r_t+\delta_t$ at each step, with the goal of forcing the RL agent to learn a nefarious policy.
no code implementations • 21 Mar 2020 • Rati Devidze, Farnam Mansouri, Luis Haug, Yuxin Chen, Adish Singla
Machine teaching studies the interaction between a teacher and a student/learner where the teacher selects training examples for the learner to learn a specific task.
no code implementations • 11 Feb 2020 • Vahid Balazadeh, Abir De, Adish Singla, Manuel Gomez-Rodriguez
Reinforcement learning agents have been mostly developed and evaluated under the assumption that they will operate in a fully autonomous manner -- they will take all actions.
no code implementations • NeurIPS 2019 • Farnam Mansouri, Yuxin Chen, Ara Vartanian, Xiaojin Zhu, Adish Singla
In our framework, each function $\sigma \in \Sigma$ induces a teacher-learner pair with teaching complexity as $TD(\sigma)$.
no code implementations • 5 Oct 2019 • Ahana Ghosh, Sebastian Tschiatschek, Hamed Mahdavi, Adish Singla
In the test phase, the AI agent has to interact with a user of unknown type.
no code implementations • 1 Sep 2019 • Abir De, Adish Singla, Utkarsh Upadhyay, Manuel Gomez-Rodriguez
As a result, she may feel compelled to use the feedback she receives to (re-)estimate her followers' preferences and decide which stories to share next to receive more (positive) feedback.
no code implementations • NeurIPS 2019 • Sebastian Tschiatschek, Ahana Ghosh, Luis Haug, Rati Devidze, Adish Singla
We study two teaching approaches: learner-agnostic teaching, where the teacher provides demonstrations from an optimal policy ignoring the learner's preferences, and learner-aware teaching, where the teacher accounts for the learner's preferences.
no code implementations • 28 May 2019 • Parameswaran Kamalaruban, Rati Devidze, Volkan Cevher, Adish Singla
We study the problem of inverse reinforcement learning (IRL) with the added twist that the learner is assisted by a helpful teacher.
1 code implementation • 22 May 2019 • Stratis Tsirtsis, Behzad Tabibian, Moein Khajehnejad, Adish Singla, Bernhard Schölkopf, Manuel Gomez-Rodriguez
Using this characterization, we first show that, in general, we cannot expect to find optimal decision policies in polynomial time and there are cases in which deterministic policies are suboptimal.
no code implementations • 16 May 2019 • Junaid Ali, Mahmoudreza Babaei, Abhijnan Chakraborty, Baharan Mirzasoleiman, Krishna P. Gummadi, Adish Singla
As we show in this paper, the time-criticality of the information could further exacerbate the disparity of influence across groups.
no code implementations • 27 Feb 2019 • Rishav Chourasia, Adish Singla
These methods typically work by employing an aggregation mechanism over actions of different RL algorithms.
no code implementations • 23 Jan 2019 • Goran Radanovic, Rati Devidze, David C. Parkes, Adish Singla
We consider a two-agent MDP framework where agents repeatedly solve a task in a collaborative setting.
no code implementations • 11 Dec 2018 • Paul Rolland, Ali Kavis, Alex Immer, Adish Singla, Volkan Cevher
We study the fundamental problem of learning an unknown, smooth probability function via pointwise Bernoulli tests.
no code implementations • 8 Nov 2018 • Teresa Yeo, Parameswaran Kamalaruban, Adish Singla, Arpit Merchant, Thibault Asselborn, Louis Faucon, Pierre Dillenbourg, Volkan Cevher
We consider the machine teaching problem in a classroom-like setting wherein the teacher has to deliver the same examples to a diverse group of students.
no code implementations • NeurIPS 2018 • Luis Haug, Sebastian Tschiatschek, Adish Singla
In this paper, we study the problem of learning from demonstrations in the setting where this is not the case, i.e., where there is a mismatch between the worldviews of the learner and the expert.
no code implementations • 2 Jul 2018 • Till Speicher, Hoda Heidari, Nina Grgic-Hlaca, Krishna P. Gummadi, Adish Singla, Adrian Weller, Muhammad Bilal Zafar
Further, our work reveals overlooked tradeoffs between different fairness notions: using our proposed measures, the overall individual-level unfairness of an algorithm can be decomposed into a between-group and a within-group component.
1 code implementation • NeurIPS 2018 • Isabel Valera, Adish Singla, Manuel Gomez Rodriguez
Societies often rely on human experts to take a wide variety of decisions affecting their members, from jail-or-release decisions taken by judges and stop-and-frisk decisions taken by police officers to accept-or-reject decisions taken by academics.
no code implementations • NeurIPS 2019 • Anette Hunziker, Yuxin Chen, Oisin Mac Aodha, Manuel Gomez Rodriguez, Andreas Krause, Pietro Perona, Yisong Yue, Adish Singla
Our framework is both generic, allowing the design of teaching schedules for different memory models, and also interactive, allowing the teacher to adapt the schedule to the underlying forgetting mechanisms of the learner.
no code implementations • NeurIPS 2018 • Yuxin Chen, Adish Singla, Oisin Mac Aodha, Pietro Perona, Yisong Yue
We highlight that adaptivity does not speed up the teaching process when considering existing models of version space learners, such as "worst-case" (the learner picks the next hypothesis randomly from the version space) and "preference-based" (the learner picks hypothesis according to some global preference).
no code implementations • 18 Jan 2018 • Xiaojin Zhu, Adish Singla, Sandra Zilles, Anna N. Rafferty
In this paper we try to organize machine teaching as a coherent set of ideas.
no code implementations • 24 Nov 2017 • Sebastian Tschiatschek, Adish Singla, Manuel Gomez Rodriguez, Arpit Merchant, Andreas Krause
The main objective of our work is to minimize the spread of misinformation by stopping the propagation of fake news in the network.
no code implementations • 17 Nov 2017 • Christoph Hirnschall, Adish Singla, Sebastian Tschiatschek, Andreas Krause
We provide formal guarantees on the performance of our algorithm and test the viability of our approach in a user study with data of apartments on Airbnb.
no code implementations • 16 Feb 2017 • Adish Singla, Hamed Hassani, Andreas Krause
In our setting, the feedback at any time $t$ is limited in a sense that it is only available to the expert $i^t$ that has been selected by the central algorithm (forecaster), i.e., only the expert $i^t$ receives feedback from the environment and gets to learn at time $t$.
no code implementations • 9 Feb 2017 • Christoph Hirnschall, Adish Singla, Sebastian Tschiatschek, Andreas Krause
We study an online multi-task learning setting, in which instances of related tasks arrive sequentially, and are handled by task-specific online learners.
no code implementations • 23 May 2016 • Adish Singla, Sebastian Tschiatschek, Andreas Krause
We propose an active learning algorithm that substantially reduces this sample complexity by exploiting the structural constraints on the version space of hemimetrics.
no code implementations • 23 Nov 2015 • Adish Singla, Sebastian Tschiatschek, Andreas Krause
When the underlying submodular function is unknown, users' feedback can provide noisy evaluations of the function that we seek to maximize.
no code implementations • 12 Aug 2015 • Adish Singla, Eric Horvitz, Pushmeet Kohli, Andreas Krause
Furthermore, we consider an embedding of the tasks and workers in an underlying graph that may arise from task similarities or social ties, and that can provide additional side-observations for faster learning.
no code implementations • 8 Aug 2015 • Besmira Nushi, Adish Singla, Anja Gruenheid, Erfan Zamanian, Andreas Krause, Donald Kossmann
Based on this intuitive idea, we introduce the Access Path Model (APM), a novel crowd model that leverages the notion of access paths as an alternative way of retrieving information.
no code implementations • 27 Apr 2015 • Yuyin Sun, Adish Singla, Dieter Fox, Andreas Krause
Hierarchies of concepts are useful in many applications from navigation to organization of objects.
no code implementations • 24 Apr 2015 • Adish Singla, Eric Horvitz, Pushmeet Kohli, Ryen White, Andreas Krause
How should we gather information in a network, where each node's visibility is limited to its local neighborhood?
no code implementations • 22 Apr 2014 • Adish Singla, Eric Horvitz, Ece Kamar, Ryen White
Users may be willing to share private information in return for better quality of service or for incentives, or in return for assurances about the nature and extent of the logging of data.
no code implementations • 10 Feb 2014 • Adish Singla, Ilija Bogunovic, Gábor Bartók, Amin Karbasi, Andreas Krause
How should we present training examples to learners to teach them classification rules?
no code implementations • 19 Aug 2013 • Adish Singla, Andreas Krause
Community sensing, fusing information from populations of privately-held sensors, presents a great opportunity to create efficient and cost-effective sensing applications.