no code implementations • ICML 2020 • Karthik Abinav Sankararaman, Soham De, Zheng Xu, W. Ronny Huang, Tom Goldstein
Through novel theoretical and experimental results, we show how the neural net architecture affects gradient confusion, and thus the efficiency of training.
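For intuition, gradient confusion in this line of work refers to per-sample gradients pointing in conflicting directions, i.e., some pair of per-sample gradients has an inner product below a negative threshold. A minimal sketch of measuring it on a toy linear-regression model follows; the model, data, and threshold `eta` are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: d-dimensional linear regression with squared loss.
n, d = 8, 5
X = rng.normal(size=(n, d))
y = rng.normal(size=n)
w = rng.normal(size=d)

# Per-sample gradient of 0.5 * (w·x_i - y_i)^2 with respect to w.
grads = (X @ w - y)[:, None] * X            # shape (n, d)

# Gradient confusion (as sketched here): some pair of per-sample
# gradients has a sufficiently negative inner product.
eta = 0.1                                    # illustrative threshold
inner = grads @ grads.T                      # pairwise inner products
confused = (inner[np.triu_indices(n, k=1)] < -eta).any()
print("gradient confusion present:", confused)
```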
1 code implementation • 21 Oct 2024 • Yun He, Di Jin, Chaoqi Wang, Chloe Bi, Karishma Mandyam, Hejia Zhang, Chen Zhu, Ning Li, Tengyu Xu, Hongjiang Lv, Shruti Bhosale, Chenguang Zhu, Karthik Abinav Sankararaman, Eryk Helenowski, Melanie Kambadur, Aditya Tayade, Hao Ma, Han Fang, Sinong Wang
To address this gap, we introduce Multi-IF, a new benchmark designed to assess LLMs' proficiency in following multi-turn and multilingual instructions.
no code implementations • 16 Oct 2024 • Chaoqi Wang, Zhuokai Zhao, Chen Zhu, Karthik Abinav Sankararaman, Michal Valko, Xuefei Cao, Zhaorun Chen, Madian Khabsa, Yuxin Chen, Hao Ma, Sinong Wang
However, current post-training methods such as reinforcement learning from human feedback (RLHF) and direct alignment from preference methods (DAP) primarily utilize single-sample comparisons.
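For reference, a "single-sample comparison" in direct alignment (DAP) methods means each preference signal contrasts exactly one chosen response against one rejected response, as in DPO-style losses. Below is a hedged sketch of such a pairwise loss; the function name, the value of beta, and the toy log-probabilities are assumptions for illustration, not the paper's method.

```python
import torch
import torch.nn.functional as F

def dpo_style_pairwise_loss(logp_chosen, logp_rejected,
                            ref_logp_chosen, ref_logp_rejected,
                            beta=0.1):
    """Single-sample (pairwise) preference loss in the style of DAP methods.

    Each batch element compares exactly one chosen and one rejected
    response, which is the 'single-sample comparison' referred to above.
    """
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    return -F.logsigmoid(margin).mean()

# Illustrative sequence log-probabilities for a batch of 4 comparisons.
lc = torch.tensor([-12.0, -9.5, -11.0, -8.0])   # policy, chosen
lr = torch.tensor([-13.0, -10.0, -10.5, -9.0])  # policy, rejected
rc = torch.tensor([-12.5, -9.8, -11.2, -8.5])   # reference, chosen
rr = torch.tensor([-12.8, -10.1, -10.9, -9.1])  # reference, rejected
print(dpo_style_pairwise_loss(lc, lr, rc, rr))
```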
no code implementations • 30 Sep 2024 • Tengyu Xu, Eryk Helenowski, Karthik Abinav Sankararaman, Di Jin, Kaiyan Peng, Eric Han, Shaoliang Nie, Chen Zhu, Hejia Zhang, Wenxuan Zhou, Zhouhao Zeng, Yun He, Karishma Mandyam, Arya Talabzadeh, Madian Khabsa, Gabriel Cohen, Yuandong Tian, Hao Ma, Sinong Wang, Han Fang
However, RLHF has limitations in multi-task learning (MTL) due to the challenges of reward hacking and extreme multi-objective optimization (i.e., trading off multiple and sometimes conflicting objectives).
1 code implementation • 29 Sep 2023 • Xiaotian Han, Hanqing Zeng, Yu Chen, Shaoliang Nie, Jingzhou Liu, Kanika Narang, Zahra Shakeri, Karthik Abinav Sankararaman, Song Jiang, Madian Khabsa, Qifan Wang, Xia Hu
We establish this equivalence mathematically by demonstrating that graph convolution networks (GCN) and simplified graph convolution (SGC) can be expressed as a form of Mixup.
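To make the Mixup analogy concrete: one propagation step with a row-normalized adjacency matrix replaces each node's features with a convex combination (a "mix") of its neighbors' features. The sketch below uses a toy graph and row-normalization as illustrative simplifications (SGC itself uses a symmetrically normalized adjacency); it is not the paper's construction.

```python
import numpy as np

# Toy 4-node undirected graph with self-loops (GCN/SGC convention).
A = np.array([[1, 1, 0, 0],
              [1, 1, 1, 0],
              [0, 1, 1, 1],
              [0, 0, 1, 1]], dtype=float)
X = np.arange(8, dtype=float).reshape(4, 2)    # node features

# Row-normalized propagation: each row of A_hat sums to 1, so the new
# feature of node i is a convex combination of its neighbors' features,
# i.e., a "mixup" of neighbor features (with uniform mixing weights here).
A_hat = A / A.sum(axis=1, keepdims=True)
K = 2                                           # SGC applies K propagation steps
X_mixed = np.linalg.matrix_power(A_hat, K) @ X

print(X_mixed)   # each row lies in the convex hull of the original rows
```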
2 code implementations • 27 Sep 2023 • Wenhan Xiong, Jingyu Liu, Igor Molybog, Hejia Zhang, Prajjwal Bhargava, Rui Hou, Louis Martin, Rashi Rungta, Karthik Abinav Sankararaman, Barlas Oguz, Madian Khabsa, Han Fang, Yashar Mehdad, Sharan Narang, Kshitiz Malik, Angela Fan, Shruti Bhosale, Sergey Edunov, Mike Lewis, Sinong Wang, Hao Ma
We also examine the impact of various design choices in the pretraining process, including the data mix and the training curriculum of sequence lengths. Our ablation experiments suggest that having abundant long texts in the pretraining dataset is not the key to achieving strong performance, and we empirically verify that long-context continual pretraining is more efficient and similarly effective compared to pretraining from scratch with long sequences.
no code implementations • 14 Nov 2022 • Aleksandrs Slivkins, Xingyu Zhou, Karthik Abinav Sankararaman, Dylan J. Foster
We consider contextual bandits with linear constraints (CBwLC), a variant of contextual bandits in which the algorithm consumes multiple resources subject to linear constraints on total consumption.
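A hedged way to write the CBwLC objective (the symbols below are notation chosen for illustration, not quoted from the paper): over T rounds the learner observes a context x_t, picks an action a_t, collects a reward, and consumes a d-dimensional resource vector, subject to linear constraints on total consumption.

```latex
\max \;\; \sum_{t=1}^{T} r_t(x_t, a_t)
\qquad \text{s.t.} \qquad
\sum_{t=1}^{T} c_{t,j}(x_t, a_t) \;\le\; B_j
\quad \text{for each resource } j = 1, \dots, d,
```

where $c_{t,j}$ denotes the consumption of resource $j$ at round $t$ and $B_j$ its budget.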
no code implementations • 2 Jun 2022 • Karthik Abinav Sankararaman, Sinong Wang, Han Fang
Transformers have become ubiquitous due to their dominant performance in various NLP and image processing tasks.
no code implementations • NeurIPS 2021 • Karthik Abinav Sankararaman, Aleksandrs Slivkins
Third, we provide a general "reduction" from BwK to bandits which takes advantage of some known helpful structure, and apply this reduction to combinatorial semi-bandits, linear contextual bandits, and multinomial-logit bandits.
no code implementations • 16 Mar 2021 • Vashist Avadhanula, Riccardo Colini-Baldeschi, Stefano Leonardi, Karthik Abinav Sankararaman, Okke Schrijvers
We modify the algorithm proposed in Badanidiyuru et al. to extend it to the case of multiple platforms, obtaining an algorithm for both the discrete and continuous bid-spaces.
no code implementations • 12 Mar 2021 • Soumya Basu, Karthik Abinav Sankararaman, Abishek Sankararaman
We design decentralized algorithms for regret minimization in two-sided matching markets with one-sided bandit feedback that significantly improve upon prior work (Liu et al. 2020a, 2020b; Sankararaman et al. 2020).
no code implementations • 14 Jul 2020 • Karthik Abinav Sankararaman, Anand Louis, Navin Goyal
First, for a large and well-studied class of LSEMs, namely "bow-free" models, we provide a sufficient condition on model parameters under which robust identifiability holds, thereby removing the restriction on paths required by prior work.
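For context, a "bow" in a linear SEM is a pair of variables connected by both a directed edge and a bidirected edge (correlated errors); a model is bow-free when no such pair exists. The snippet below generates data from a small bow-free linear SEM; the specific graph, coefficients, and error covariance are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Linear SEM:  X = B^T X + eps,  with DAG edges X1 -> X2 and X2 -> X3.
B = np.zeros((3, 3))
B[0, 1] = 0.8       # edge X1 -> X2
B[1, 2] = -0.5      # edge X2 -> X3

# Error covariance Omega: correlated errors only between X1 and X3.
# There is no directed edge between X1 and X3, so no pair has both a
# directed and a bidirected edge: the model is bow-free.
Omega = np.array([[1.0, 0.0, 0.3],
                  [0.0, 1.0, 0.0],
                  [0.3, 0.0, 1.0]])

# Sample n observations: each row solves x = (I - B^T)^{-1} eps.
n = 1000
eps = rng.multivariate_normal(np.zeros(3), Omega, size=n)
X = eps @ np.linalg.inv(np.eye(3) - B)
print(np.cov(X, rowvar=False).round(2))   # implied observational covariance
```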
no code implementations • 26 Jun 2020 • Abishek Sankararaman, Soumya Basu, Karthik Abinav Sankararaman
Online learning in a two-sided matching market, with demand-side agents continuously competing to be matched with the supply side (arms), abstracts the complex interactions under partial information on matching platforms (e.g., UpWork, TaskRabbit).
no code implementations • 1 Feb 2020 • Karthik Abinav Sankararaman, Aleksandrs Slivkins
Third, we provide a general "reduction" from BwK to bandits which takes advantage of some known helpful structure, and apply this reduction to combinatorial semi-bandits, linear contextual bandits, and multinomial-logit bandits.
1 code implementation • 18 Dec 2019 • Vedant Nanda, Pan Xu, Karthik Abinav Sankararaman, John P. Dickerson, Aravind Srinivasan
Moreover, if in such a scenario the assignment of requests to drivers (by the platform) is made only to maximize profit and/or minimize wait time for riders, requests of a certain type (e.g., from an unpopular pickup location or to an unpopular drop-off location) might never be assigned to a driver.
no code implementations • 30 Nov 2019 • Michael J. Curry, John P. Dickerson, Karthik Abinav Sankararaman, Aravind Srinivasan, Yuhao Wan, Pan Xu
Rideshare platforms such as Uber and Lyft dynamically dispatch drivers to match riders' requests.
no code implementations • 16 May 2019 • Karthik Abinav Sankararaman, Anand Louis, Navin Goyal
First, we prove that under a sufficient condition, for a certain sub-class of LSEMs that are bow-free (Brito and Pearl, 2002), parameter recovery is stable.
no code implementations • 28 Nov 2018 • Nicole Immorlica, Karthik Abinav Sankararaman, Robert Schapire, Aleksandrs Slivkins
We suggest a new algorithm for the stochastic version, which builds on the framework of regret minimization in repeated games and admits a substantially simpler analysis compared to prior work.
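At a high level, the repeated-game framing treats the Lagrangian of the BwK linear program as a zero-sum game: a primal player picks arms to maximize the Lagrangian while a dual player picks which resource to penalize, and both run off-the-shelf regret-minimizing algorithms. One common way to write such a Lagrangian, with horizon T and budget B (the exact scaling here is an assumption, not quoted from the paper), is

```latex
\mathcal{L}(a, j) \;=\; r(a) \;+\; 1 \;-\; \frac{T}{B}\, c_j(a),
```

where $r(a)$ is the expected reward of arm $a$ and $c_j(a)$ its expected consumption of resource $j$; an approximate equilibrium of this game approximately attains the optimal value of the underlying LP.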
no code implementations • 22 Apr 2018 • Brian Brubach, Karthik Abinav Sankararaman, Aravind Srinivasan, Pan Xu
On the upper bound side, we show that this framework, combined with a black-box adapted from Bansal et al. (Algorithmica, 2012), yields an online algorithm which nearly doubles the ratio to 0.46.
no code implementations • 23 May 2017 • Karthik Abinav Sankararaman, Aleksandrs Slivkins
We unify two prominent lines of work on multi-armed bandits: bandits with knapsacks (BwK) and combinatorial semi-bandits.