Search Results for author: Kushal Arora

Found 10 papers, 6 papers with code

A Critical Evaluation of AI Feedback for Aligning Large Language Models

1 code implementation • 19 Feb 2024 • Archit Sharma, Sedrick Keh, Eric Mitchell, Chelsea Finn, Kushal Arora, Thomas Kollar

RLAIF first performs supervised fine-tuning (SFT) using demonstrations from a teacher model and then further fine-tunes the model with reinforcement learning (RL), using feedback from a critic model.

Instruction Following, reinforcement-learning, +1

The Stable Entropy Hypothesis and Entropy-Aware Decoding: An Analysis and Algorithm for Robust Natural Language Generation

no code implementations • 14 Feb 2023 • Kushal Arora, Timothy J. O'Donnell, Doina Precup, Jason Weston, Jackie C. K. Cheung

State-of-the-art language generation models can degenerate when applied to open-ended generation problems such as text completion, story generation, or dialog modeling.

Story Generation

Lexi: Self-Supervised Learning of the UI Language

1 code implementation • 23 Jan 2023 • Pratyay Banerjee, Shweti Mahajan, Kushal Arora, Chitta Baral, Oriana Riva

Along with text, these resources include visual content such as UI screenshots and images of application icons referenced in the text.

Image Retrieval, Language Modelling, +2

Learning New Skills after Deployment: Improving open-domain internet-driven dialogue with human feedback

no code implementations • 5 Aug 2022 • Jing Xu, Megan Ung, Mojtaba Komeili, Kushal Arora, Y-Lan Boureau, Jason Weston

We then study various algorithms for improving from such feedback, including standard supervised learning, rejection sampling, model-guiding and reward-based learning, in order to make recommendations on which type of feedback and algorithms work best.

Retrieval

BlenderBot 3: a deployed conversational agent that continually learns to responsibly engage

2 code implementations • 5 Aug 2022 • Kurt Shuster, Jing Xu, Mojtaba Komeili, Da Ju, Eric Michael Smith, Stephen Roller, Megan Ung, Moya Chen, Kushal Arora, Joshua Lane, Morteza Behrooz, William Ngan, Spencer Poff, Naman Goyal, Arthur Szlam, Y-Lan Boureau, Melanie Kambadur, Jason Weston

We present BlenderBot 3, a 175B parameter dialogue model capable of open-domain conversation with access to the internet and a long-term memory, and having been trained on a large number of user defined tasks.

Continual Learning

DIRECTOR: Generator-Classifiers For Supervised Language Modeling

1 code implementation • 15 Jun 2022 • Kushal Arora, Kurt Shuster, Sainbayar Sukhbaatar, Jason Weston

Current language models achieve low perplexity but their resulting generations still suffer from toxic responses, repetitiveness and contradictions.

Language Modelling

A Compositional Approach to Language Modeling

no code implementations • 1 Apr 2016 • Kushal Arora, Anand Rangarajan

Traditional language models treat language as a finite state automaton on a probability space over words.

Language Modelling, Sentence

Contrastive Entropy: A new evaluation metric for unnormalized language models

no code implementations • 3 Jan 2016 • Kushal Arora, Anand Rangarajan

In this paper, we address the last problem and propose a new discriminative entropy based intrinsic metric that works for both traditional word level models and unnormalized language models like sentence level models.

Language Modelling, Sentence
