Search Results for author: David D. Cox

Found 14 papers, 4 papers with code

Power Scheduler: A Batch Size and Token Number Agnostic Learning Rate Scheduler

no code implementations • 23 Aug 2024 • Yikang Shen, Matthew Stallone, Mayank Mishra, Gaoyuan Zhang, Shawn Tan, Aditya Prasad, Adriana Meza Soria, David D. Cox, Rameswar Panda

This is not only because there is a complicated correlation between the learning rate, batch size, number of training tokens, model size, and other hyperparameters, but also because it is prohibitively expensive to perform a hyperparameter search for large language models with billions or trillions of parameters.
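
To make the scheduling problem concrete, here is a minimal sketch of a warmup / power-decay / cooldown learning-rate schedule in Python. All constants (`lr_max`, the decay exponent `a`, the cooldown fraction) are illustrative placeholders, not the values fitted in the paper:

```python
def power_lr(step, total_steps, lr_max=0.02, a=0.5,
             warmup_steps=100, cooldown_frac=0.1):
    """Illustrative warmup / power-decay / cooldown schedule.

    All constants are hypothetical; see the paper for the form and
    exponents actually fitted across batch sizes and token counts.
    """
    if step < warmup_steps:
        # linear warmup
        return lr_max * step / warmup_steps
    # power-law decay in the step (i.e. token) count
    lr = lr_max * (step / warmup_steps) ** (-a)
    cooldown_start = int(total_steps * (1 - cooldown_frac))
    if step >= cooldown_start:
        # linear cooldown to zero at the end of training
        lr *= (total_steps - step) / (total_steps - cooldown_start)
    return lr
```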

LAB: Large-Scale Alignment for ChatBots

no code implementations • 2 Mar 2024 • Shivchander Sudalairaj, Abhishek Bhandwaldar, Aldo Pareja, Kai Xu, David D. Cox, Akash Srivastava

This work introduces LAB (Large-scale Alignment for chatBots), a novel methodology designed to overcome the scalability challenges in the instruction-tuning phase of large language model (LLM) training.

Instruction Following • Language Modelling • +2

Neural Population Geometry Reveals the Role of Stochasticity in Robust Perception

1 code implementation • NeurIPS 2021 • Joel Dapello, Jenelle Feather, Hang Le, Tiago Marques, David D. Cox, Josh H. McDermott, James J. DiCarlo, SueYeon Chung

Adversarial examples are often cited by neuroscientists and machine learning researchers as an example of how computational models diverge from biological sensory systems.

Adversarial Robustness

Object-Centric Diagnosis of Visual Reasoning

no code implementations • 21 Dec 2020 • Jianwei Yang, Jiayuan Mao, Jiajun Wu, Devi Parikh, David D. Cox, Joshua B. Tenenbaum, Chuang Gan

In contrast, symbolic and modular models have a relatively better grounding and robustness, though at the cost of accuracy.

Object • Question Answering • +2

CZ-GEM: A Framework for Disentangled Representation Learning

no code implementations • ICLR 2020 • Akash Srivastava, Yamini Bansal, Yukun Ding, Bernhard Egger, Prasanna Sattigeri, Josh Tenenbaum, David D. Cox, Dan Gutfreund

In this work, we tackle a slightly more intricate scenario where the observations are generated from a conditional distribution of some known control variate and some latent noise variate.

Disentanglement
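
The setting in the abstract, observations drawn from a conditional distribution of a known control variate and a latent noise variate, can be illustrated with a toy generative process. The mixing function and dimensions below are invented for illustration, not the paper's model:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def generate(c, z):
    # x = g(c, z): the observation mixes the known control variate c
    # (e.g. pose or lighting) with latent noise z (everything else).
    # This nonlinearity is illustrative only.
    return np.tanh(np.outer(c, [1.0, 0.5]) + np.outer(z, [0.2, 1.0]))

c = rng.uniform(-1, 1, size=1000)   # observed control variate
z = rng.normal(size=1000)           # unobserved noise variate
x = generate(c, z)                  # observations x ~ p(x | c, z)
# A disentangling model would recover a representation in which the
# c-related and z-related factors occupy separate coordinates.
```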

Minnorm training: an algorithm for training over-parameterized deep neural networks

no code implementations • 3 Jun 2018 • Yamini Bansal, Madhu Advani, David D. Cox, Andrew M. Saxe

To solve this constrained optimization problem, our method employs Lagrange multipliers that act as integrators of error over training and identify 'support vector'-like examples.

Generalization Bounds
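
A minimal primal-dual sketch of this idea, assuming a linear model and illustrative step sizes (not the paper's exact algorithm): minimize the weight norm subject to fitting the data, with per-example Lagrange multipliers that accumulate, i.e. integrate, the training error.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
n, d = 20, 50                            # over-parameterized: d > n
X = rng.normal(size=(n, d)) / np.sqrt(d)
y = rng.choice([-1.0, 1.0], size=n)

# Minimize ||w||^2 subject to X @ w = y via the Lagrangian
#   L(w, lam) = 0.5 * ||w||^2 + lam @ (y - X @ w)
w, lam, eta = np.zeros(d), np.zeros(n), 0.1
for _ in range(2000):
    err = y - X @ w                      # per-example constraint violation
    lam += eta * err                     # multipliers integrate the error
    w -= eta * (w - X.T @ lam)           # gradient step on the primal
# Examples with the largest |lam| are the 'support vector'-like ones.
print("max violation:", np.abs(y - X @ w).max())
```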

Measuring and Understanding Sensory Representations within Deep Networks Using a Numerical Optimization Framework

no code implementations • 17 Feb 2015 • Chuan-Yung Tsai, David D. Cox

A central challenge in sensory neuroscience is describing how the activity of populations of neurons can represent useful features of the external environment.

Object Recognition

Hyperopt: A Python Library for Optimizing the Hyperparameters of Machine Learning Algorithms

1 code implementation • SciPy 2013 • James Bergstra, Dan Yamins, David D. Cox

Sequential model-based optimization (also known as Bayesian optimization) is one of the most efficient methods (per function evaluation) of function minimization.

Bayesian Optimization • BIG-bench Machine Learning • +2
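
Hyperopt's core interface is small; here is a minimal usage example of its sequential model-based optimization via the Tree-structured Parzen Estimator. The objective function and search-space bounds are toy illustrations:

```python
from hyperopt import fmin, tpe, hp

def objective(params):
    # Toy quadratic with minimum at x = 3, y = -1.
    x, y = params["x"], params["y"]
    return (x - 3) ** 2 + (y + 1) ** 2

space = {
    "x": hp.uniform("x", -10, 10),   # uniform prior over [-10, 10]
    "y": hp.normal("y", 0, 2),       # Gaussian prior, mean 0, sigma 2
}

# TPE proposes each new trial from a model of past evaluations.
best = fmin(fn=objective, space=space, algo=tpe.suggest, max_evals=100)
print(best)   # e.g. {'x': 2.9..., 'y': -0.9...}
```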
