no code implementations • 24 Oct 2024 • Ankit Singh Rawat, Veeranjaneyulu Sadhanala, Afshin Rostamizadeh, Ayan Chakrabarti, Wittawat Jitkrittum, Vladimir Feinberg, Seungyeon Kim, Hrayr Harutyunyan, Nikunj Saunshi, Zachary Nado, Rakesh Shivanna, Sashank J. Reddi, Aditya Krishna Menon, Rohan Anil, Sanjiv Kumar
In particular, this paradigm relies on a small language model (SLM) to both (1) provide soft labels as additional training supervision, and (2) select a small subset of valuable ("informative" and "hard") training examples.
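As a rough illustration of the two SLM roles, here is a minimal sketch assuming a classification-style setup; the loss mix, temperature, and selection rule are hypothetical choices, not the paper's exact recipe.

```python
# Illustrative sketch: (1) SLM soft labels via distillation, (2) an
# SLM-loss-based rule for picking hard examples. Names are hypothetical.
import numpy as np

def softmax(z, T=1.0):
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)   # subtract max for stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distill_loss(student_logits, slm_logits, labels, alpha=0.5, T=2.0):
    """Mix hard-label cross-entropy with KL to temperature-softened SLM labels."""
    idx = np.arange(len(labels))
    ce = -np.log(softmax(student_logits)[idx, labels]).mean()
    p_slm = softmax(slm_logits, T)           # soft labels from the SLM
    q_stu = softmax(student_logits, T)
    kl = (p_slm * (np.log(p_slm) - np.log(q_stu))).sum(axis=-1).mean()
    return (1 - alpha) * ce + alpha * (T ** 2) * kl   # usual T^2 rescaling

def select_hard(slm_logits, labels, k):
    """One plausible 'hard example' rule: keep the k highest-SLM-loss examples."""
    nll = -np.log(softmax(slm_logits)[np.arange(len(labels)), labels])
    return np.argsort(-nll)[:k]
```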
no code implementations • 27 Jan 2023 • Seungyeon Kim, Ankit Singh Rawat, Manzil Zaheer, Sadeep Jayasumana, Veeranjaneyulu Sadhanala, Wittawat Jitkrittum, Aditya Krishna Menon, Rob Fergus, Sanjiv Kumar
Large neural models (such as Transformers) achieve state-of-the-art performance for information retrieval (IR).
no code implementations • 29 Dec 2021 • Veeranjaneyulu Sadhanala, Yu-Xiang Wang, Addison J. Hu, Ryan J. Tibshirani
We study a multivariate version of trend filtering, called Kronecker trend filtering or KTF, for the case in which the design points form a lattice in $d$ dimensions.
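For intuition, here is a minimal sketch of a KTF-style penalty on a two-dimensional lattice, assembled from Kronecker products of one-dimensional discrete difference operators; all names are illustrative, and refinements such as spacing weights for non-unit lattices are omitted.

```python
# Sketch: KTF-style penalty on a 2-d lattice via Kronecker products of
# 1-d discrete difference operators. Illustrative, not the paper's code.
import numpy as np

def diff_matrix(n, order):
    """order-th discrete difference operator on n evenly spaced points."""
    D = np.eye(n)
    for _ in range(order):
        D = D[1:, :] - D[:-1, :]   # compose one more first difference
    return D

def ktf_penalty(theta_grid, k):
    """Sum over axes of l1 norms of (k+1)st differences of the grid signal."""
    n1, n2 = theta_grid.shape
    theta = theta_grid.ravel()                            # row-major vectorization
    D_rows = np.kron(diff_matrix(n1, k + 1), np.eye(n2))  # differences across rows
    D_cols = np.kron(np.eye(n1), diff_matrix(n2, k + 1))  # differences within rows
    return np.abs(D_rows @ theta).sum() + np.abs(D_cols @ theta).sum()
```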
no code implementations • 24 Mar 2019 • Veeranjaneyulu Sadhanala, Yu-Xiang Wang, Aaditya Ramdas, Ryan J. Tibshirani
We present an extension of the Kolmogorov-Smirnov (KS) two-sample test, which can be more sensitive to differences in the tails.
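For reference, a minimal sketch of the classical two-sample KS statistic that the extension builds on, plus a rough tail-sensitive variant obtained by integrating the empirical CDF difference before taking the supremum; the variant illustrates the idea only and is not the paper's exact statistic.

```python
# Classical two-sample KS statistic, plus a rough "integrate the CDF
# difference first" variant for tail sensitivity. Illustrative names;
# scipy.stats.ks_2samp offers a production version of the classical test.
import numpy as np

def ks_2sample_stat(x, y):
    """sup_t |F_m(t) - G_n(t)| evaluated over the pooled sample points."""
    grid = np.sort(np.concatenate([x, y]))
    Fm = np.searchsorted(np.sort(x), grid, side="right") / len(x)
    Gn = np.searchsorted(np.sort(y), grid, side="right") / len(y)
    return np.abs(Fm - Gn).max()

def ks_integrated_stat(x, y, k=1):
    """Rough variant: integrate the empirical CDF difference k times
    (Riemann sums on the pooled grid) before taking the sup."""
    grid = np.sort(np.concatenate([x, y]))
    diff = (np.searchsorted(np.sort(x), grid, side="right") / len(x)
            - np.searchsorted(np.sort(y), grid, side="right") / len(y))
    w = np.diff(grid, prepend=grid[0])      # spacing weights for the integral
    for _ in range(k):
        diff = np.cumsum(diff * w)
    return np.abs(diff).max()
```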
no code implementations • NeurIPS 2017 • Veeranjaneyulu Sadhanala, Yu-Xiang Wang, James L. Sharpnack, Ryan J. Tibshirani
To move past this, we define two new higher-order TV classes, based on two ways of compiling the discrete derivatives of a parameter across the nodes.
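As a sketch of what "compiling the discrete derivatives across the nodes" can look like, here is one common recursive construction that alternates the edge incidence matrix with its transpose; the paper studies two specific constructions, and this code is illustrative rather than a faithful reproduction of either.

```python
# Sketch: higher-order discrete derivatives on a graph, built recursively
# from the edge incidence matrix. One common recipe; names illustrative.
import numpy as np

def incidence(edges, n):
    """Edge incidence matrix B: row e has +1/-1 at the endpoints of edge e."""
    B = np.zeros((len(edges), n))
    for e, (i, j) in enumerate(edges):
        B[e, i], B[e, j] = 1.0, -1.0
    return B

def graph_diff_operator(edges, n, order):
    """Order 1: B. Order 2: B^T B (the graph Laplacian). Alternate thereafter."""
    B = incidence(edges, n)
    D = B
    for k in range(2, order + 1):
        D = (B.T @ D) if k % 2 == 0 else (B @ D)
    return D

def higher_order_tv(theta, edges, order):
    """l1 norm of the order-th discrete derivative of theta over the graph."""
    return np.abs(graph_diff_operator(edges, len(theta), order) @ theta).sum()
```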
no code implementations • 16 Feb 2017 • Veeranjaneyulu Sadhanala, Ryan J. Tibshirani
We study additive models built with trend filtering, i.e., additive models whose components are each regularized by the (discrete) total variation of their $k$th (discrete) derivative, for a chosen integer $k \geq 0$.
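A minimal sketch of the penalty in question, assuming evenly spaced inputs within each component (general trend filtering rescales the differences by the input spacings); all names are illustrative.

```python
# Sketch: additive trend filtering penalty, evenly spaced inputs assumed.
import numpy as np

def trend_filter_penalty(fj_sorted, k):
    """l1 norm of the (k+1)st discrete differences of one component,
    evaluated at its sorted inputs."""
    theta = np.asarray(fj_sorted, dtype=float)
    for _ in range(k + 1):
        theta = theta[1:] - theta[:-1]   # repeated first differences
    return np.abs(theta).sum()

def additive_penalty(components, k):
    """Sum of trend filtering penalties over the d additive components."""
    return sum(trend_filter_penalty(fj, k) for fj in components)
```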
no code implementations • NeurIPS 2016 • Veeranjaneyulu Sadhanala, Yu-Xiang Wang, Ryan J. Tibshirani
Lastly, we investigate the problem of adaptivity of the total variation denoiser to these smaller Sobolev function spaces.
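For context, the 1-d total variation denoiser solves a fused-lasso-type problem; below is a minimal sketch using cvxpy as a generic convex solver (dedicated O(n) algorithms exist but are longer). The signal and tuning value are illustrative.

```python
# Sketch: 1-d total variation denoising via a generic convex solver.
import numpy as np
import cvxpy as cp

def tv_denoise(y, lam):
    """argmin_theta 0.5 * ||y - theta||^2 + lam * sum_i |theta_{i+1} - theta_i|"""
    theta = cp.Variable(len(y))
    obj = 0.5 * cp.sum_squares(y - theta) + lam * cp.norm1(cp.diff(theta))
    cp.Problem(cp.Minimize(obj)).solve()
    return theta.value

# Example: denoise a noisy piecewise-constant signal.
rng = np.random.default_rng(0)
y = np.repeat([0.0, 2.0, 1.0], 50) + 0.3 * rng.standard_normal(150)
theta_hat = tv_denoise(y, lam=2.0)
```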
no code implementations • 22 Sep 2014 • Yu-Xiang Wang, Veeranjaneyulu Sadhanala, Wei Dai, Willie Neiswanger, Suvrit Sra, Eric P. Xing
We develop parallel and distributed Frank-Wolfe algorithms: the former on shared-memory machines with mini-batching, and the latter in a delayed-update framework.
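A minimal sketch of the serial Frank-Wolfe iteration that these parallel and distributed variants build on, here over an l1 ball with the standard 2/(t+2) step size; the objective and all names are illustrative.

```python
# Sketch: serial Frank-Wolfe over an l1 ball via its linear minimization
# oracle (LMO). The parallel/distributed variants build on this loop.
import numpy as np

def frank_wolfe(grad, x0, radius, iters=200):
    """min f(x) s.t. ||x||_1 <= radius, given grad(x) = the gradient of f."""
    x = x0.copy()
    for t in range(iters):
        g = grad(x)
        i = np.argmax(np.abs(g))             # LMO over the l1 ball: the best
        s = np.zeros_like(x)                 # vertex is a signed, scaled
        s[i] = -radius * np.sign(g[i])       # coordinate vector
        gamma = 2.0 / (t + 2.0)              # standard step size
        x = (1 - gamma) * x + gamma * s
    return x

# Example: least squares over the l1 ball of radius 1.
rng = np.random.default_rng(0)
A, b = rng.standard_normal((50, 20)), rng.standard_normal(50)
x_hat = frank_wolfe(lambda x: 2 * A.T @ (A @ x - b), np.zeros(20), radius=1.0)
```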