no code implementations • 9 Jun 2023 • Sid Mittal, Vineet Gupta, Frederick Liu, Mukund Sundararajan
Among our contributions, we identify a hard prompt that adapts chain-of-thought prompting to policy violation tasks.
no code implementations • 3 Aug 2021 • Rohan Anil, Badih Ghazi, Vineet Gupta, Ravi Kumar, Pasin Manurangsi
In this work, we study the large-scale pretraining of BERT-Large with differentially private SGD (DP-SGD).
no code implementations • 1 Jan 2021 • Rohan Anil, Vineet Gupta, Tomer Koren, Kevin Regan, Yoram Singer
Optimization in machine learning, both theoretical and applied, is presently dominated by first-order gradient methods such as stochastic gradient descent.
1 code implementation • 20 Feb 2020 • Rohan Anil, Vineet Gupta, Tomer Koren, Kevin Regan, Yoram Singer
Optimization in machine learning, both theoretical and applied, is presently dominated by first-order gradient methods such as stochastic gradient descent.
1 code implementation • NeurIPS 2019 • Rohan Anil, Vineet Gupta, Tomer Koren, Yoram Singer
Adaptive gradient-based optimizers such as Adagrad and Adam are crucial for achieving state-of-the-art performance in machine translation and language modeling.
3 code implementations • 30 Jan 2019 • Rohan Anil, Vineet Gupta, Tomer Koren, Yoram Singer
Adaptive gradient-based optimizers such as Adagrad and Adam are crucial for achieving state-of-the-art performance in machine translation and language modeling.
Ranked #31 on Machine Translation on WMT2014 English-French
1 code implementation • ICLR 2019 • Hanie Sedghi, Vineet Gupta, Philip M. Long
We characterize the singular values of the linear transformation associated with a standard 2D multi-channel convolutional layer, enabling their efficient computation.
3 code implementations • ICML 2018 • Vineet Gupta, Tomer Koren, Yoram Singer
Preconditioned gradient methods are among the most general and powerful tools in optimization.
no code implementations • 20 Jun 2017 • Vineet Gupta, Tomer Koren, Yoram Singer
We describe a framework for deriving and analyzing online optimization algorithms that incorporate adaptive, data-dependent regularization, also termed preconditioning.
no code implementations • 22 Mar 2017 • Amit Daniely, Roy Frostig, Vineet Gupta, Yoram Singer
We describe and analyze a simple random feature scheme (RFS) from prescribed compositional kernels.
no code implementations • 19 Nov 2015 • Ramanathan V. Guha, Vineet Gupta
Messages often refer to entities such as people, places and events.