Search Results for author: Arka Pal

Found 9 papers, 8 papers with code

LiveBench: A Challenging, Contamination-Free LLM Benchmark

1 code implementation • 27 Jun 2024 • Colin White, Samuel Dooley, Manley Roberts, Arka Pal, Ben Feuer, Siddhartha Jain, Ravid Shwartz-Ziv, Neel Jain, Khalid Saifullah, Siddartha Naidu, Chinmay Hegde, Yann LeCun, Tom Goldstein, Willie Neiswanger, Micah Goldblum

In this work, we introduce a new benchmark for LLMs designed to be immune to both test set contamination and the pitfalls of LLM judging and human crowdsourcing.

Instruction Following • Math

Large Language Models Must Be Taught to Know What They Don't Know

1 code implementation • 12 Jun 2024 • Sanyam Kapoor, Nate Gruver, Manley Roberts, Katherine Collins, Arka Pal, Umang Bhatt, Adrian Weller, Samuel Dooley, Micah Goldblum, Andrew Gordon Wilson

We show that a thousand graded examples are sufficient to outperform baseline methods and that training through the features of a model is necessary for good performance and tractable for large open-source models when using LoRA.
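
The LoRA recipe the authors find tractable can be approximated with the Hugging Face peft library; the model name, rank, and target modules below are illustrative assumptions, not the paper's exact configuration.

    # Hypothetical LoRA setup for fine-tuning an open-source LLM on
    # graded correctness examples (all hyperparameters illustrative).
    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, get_peft_model

    model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
    lora_config = LoraConfig(
        r=8,                                  # low-rank adapter dimension
        lora_alpha=16,                        # adapter scaling factor
        target_modules=["q_proj", "v_proj"],  # attention projections to adapt
        lora_dropout=0.05,
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, lora_config)
    model.print_trainable_parameters()  # only the small adapters are trained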

Smaug: Fixing Failure Modes of Preference Optimisation with DPO-Positive

1 code implementation • 20 Feb 2024 • Arka Pal, Deep Karkhanis, Samuel Dooley, Manley Roberts, Siddartha Naidu, Colin White

In this work, we first show theoretically that the standard DPO loss can lead to a reduction of the model's likelihood of the preferred examples, as long as the relative probability between the preferred and dispreferred classes increases.
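
The proposed fix, DPO-Positive (DPOP), adds a penalty that activates whenever the policy's log-likelihood of the preferred completion falls below the reference model's. A minimal PyTorch sketch of that idea follows; the function name and the beta/lam defaults are our own illustrative choices, not the paper's reported settings.

    import torch
    import torch.nn.functional as F

    def dpop_loss(pi_logp_w, pi_logp_l, ref_logp_w, ref_logp_l,
                  beta=0.1, lam=50.0):
        """Sketch of a DPO-Positive-style loss over summed sequence
        log-probabilities (w = preferred, l = dispreferred)."""
        # Standard DPO margin between the two log-probability ratios.
        margin = (pi_logp_w - ref_logp_w) - (pi_logp_l - ref_logp_l)
        # Penalty term: nonzero only when the policy has become *less*
        # likely than the reference to produce the preferred completion.
        penalty = torch.clamp(ref_logp_w - pi_logp_w, min=0.0)
        return -F.logsigmoid(beta * (margin - lam * penalty)).mean()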

Giraffe: Adventures in Expanding Context Lengths in LLMs

1 code implementation • 21 Aug 2023 • Arka Pal, Deep Karkhanis, Manley Roberts, Samuel Dooley, Arvind Sundararajan, Siddartha Naidu

To use these models on sequences longer than the train-time context length, one might employ techniques from the growing family of context length extrapolation methods, most of which focus on modifying the system of positional encodings used in the attention mechanism to indicate where tokens or activations are located in the input sequence.
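
One such family of methods linearly rescales rotary position indices so that a longer input is squeezed into the positional range seen at train time. A minimal sketch of that idea, with a function name and signature of our own choosing:

    import torch

    def rope_angles(seq_len, dim, base=10000.0, scale=1.0):
        # Rotary-embedding angles for positions 0..seq_len-1. Dividing
        # the positions by `scale` is linear position interpolation:
        # with scale=4, a 16k-token input maps onto the positional
        # range of a model trained with a 4k context.
        inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
        positions = torch.arange(seq_len).float() / scale
        return torch.outer(positions, inv_freq)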

Understanding disentangling in $\beta$-VAE

23 code implementations • 10 Apr 2018 • Christopher P. Burgess, Irina Higgins, Arka Pal, Loic Matthey, Nick Watters, Guillaume Desjardins, Alexander Lerchner

We present new intuitions and theoretical assessments of the emergence of disentangled representation in variational autoencoders.
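
Central to this analysis is a modified training objective with an explicitly controlled KL "capacity" $C$; the notation below is our reconstruction, not quoted from the listing:

    $$\mathcal{L} = \mathbb{E}_{q_\phi(z|x)}[\log p_\theta(x|z)] - \gamma \, \big| D_{KL}\big(q_\phi(z|x) \,\|\, p(z)\big) - C \big|$$

Here $C$ is increased gradually during training, so the encoder is permitted to use more of the latent channel as learning proceeds, rather than being uniformly penalized as in the original $\beta$-VAE objective.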

SCAN: Learning Hierarchical Compositional Visual Concepts

no code implementations • ICLR 2018 • Irina Higgins, Nicolas Sonnerat, Loic Matthey, Arka Pal, Christopher P. Burgess, Matko Bosnjak, Murray Shanahan, Matthew Botvinick, Demis Hassabis, Alexander Lerchner

SCAN learns concepts through fast symbol association, grounding them in disentangled visual primitives that are discovered in an unsupervised manner.

beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework

6 code implementations • ICLR 2017 • Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, Alexander Lerchner

Learning an interpretable factorised representation of the independent data generative factors of the world without supervision is an important precursor for the development of artificial intelligence that is able to learn and reason in the same way that humans do.
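
The $\beta$-VAE objective itself is the standard VAE evidence lower bound with the KL term reweighted by a single hyperparameter $\beta > 1$ (our transcription):

    $$\mathcal{L}(\theta, \phi; x, \beta) = \mathbb{E}_{q_\phi(z|x)}[\log p_\theta(x|z)] - \beta \, D_{KL}\big(q_\phi(z|x) \,\|\, p(z)\big)$$

Setting $\beta = 1$ recovers the ordinary VAE; larger $\beta$ pushes the approximate posterior toward the isotropic unit-Gaussian prior, which is what encourages the latent dimensions to capture independent generative factors.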

Disentanglement

Early Visual Concept Learning with Unsupervised Deep Learning

1 code implementation • 17 Jun 2016 • Irina Higgins, Loic Matthey, Xavier Glorot, Arka Pal, Benigno Uria, Charles Blundell, Shakir Mohamed, Alexander Lerchner

Automated discovery of early visual concepts from raw image data is a major open challenge in AI research.
