Search Results for author: Margaret Li

Found 16 papers, 7 papers with code

Overconfidence in the Face of Ambiguity with Adversarial Data

1 code implementation NAACL (DADC) 2022 Margaret Li, Julian Michael

Adversarial data collection has shown promise as a method for building models which are more robust to the spurious correlations that generally appear in naturalistic data.

Natural Language Inference

Predicting vs. Acting: A Trade-off Between World Modeling & Agent Modeling

no code implementations 2 Jul 2024 Margaret Li, Weijia Shi, Artidoro Pagnoni, Peter West, Ari Holtzman

RLHF-aligned LMs have shown unprecedented ability on both benchmarks and long-form text generation, yet they struggle with one foundational task: next-token prediction.

Text Generation
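The trade-off above is measured against ordinary next-token prediction. As an illustrative sketch (not the paper's evaluation code), held-out perplexity for a causal LM can be computed as below; the model name is a placeholder for any RLHF-aligned model.

```python
# A minimal sketch: next-token prediction quality of a causal LM,
# reported as perplexity on held-out text. "gpt2" is a stand-in.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; swap in any aligned causal LM
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

text = "Example held-out corpus text to score."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # With labels == input_ids, the model returns the mean
    # next-token cross-entropy over the sequence.
    loss = model(**inputs, labels=inputs["input_ids"]).loss

print(f"perplexity: {torch.exp(loss).item():.2f}")
```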

Breaking the Curse of Multilinguality with Cross-lingual Expert Language Models

1 code implementation 19 Jan 2024 Terra Blevins, Tomasz Limisiewicz, Suchin Gururangan, Margaret Li, Hila Gonen, Noah A. Smith, Luke Zettlemoyer

Despite their popularity in non-English NLP, multilingual language models often underperform monolingual ones due to inter-language competition for model parameters.

In-context Pretraining: Language Modeling Beyond Document Boundaries

1 code implementation 16 Oct 2023 Weijia Shi, Sewon Min, Maria Lomeli, Chunting Zhou, Margaret Li, Gergely Szilvasy, Rich James, Xi Victoria Lin, Noah A. Smith, Luke Zettlemoyer, Scott Yih, Mike Lewis

Large language models (LMs) are currently trained to predict tokens given document prefixes, enabling them to directly perform long-form generation and prompting-style tasks which can be reduced to document completion.

In-Context Learning, Language Modelling +1
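To make the departure from document-boundary training concrete, here is a rough sketch of the core idea, assuming a greedy nearest-neighbor ordering over TF-IDF similarities as a cheap stand-in for the paper's retrieval-based document ordering:

```python
# A rough sketch (not the released implementation): place related documents
# next to each other so each training context window spans semantically
# linked documents rather than random ones.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = ["doc about dogs ...", "doc about cats ...", "doc about puppies ..."]
sims = cosine_similarity(TfidfVectorizer().fit_transform(docs))
np.fill_diagonal(sims, -np.inf)  # a document is not its own neighbor

# Greedy nearest-neighbor chaining over the similarity matrix.
order, current = [0], 0
remaining = set(range(1, len(docs)))
while remaining:
    nxt = max(remaining, key=lambda j: sims[current, j])
    order.append(nxt)
    remaining.remove(nxt)
    current = nxt

# Concatenate in the new order, then chunk into context windows as usual.
training_stream = "\n\n".join(docs[i] for i in order)
```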

Scaling Expert Language Models with Unsupervised Domain Discovery

1 code implementation 24 Mar 2023 Suchin Gururangan, Margaret Li, Mike Lewis, Weijia Shi, Tim Althoff, Noah A. Smith, Luke Zettlemoyer

Large language models are typically trained densely: all parameters are updated with respect to all inputs.

Language Modelling
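A minimal sketch of the unsupervised domain discovery step, assuming k-means over TF-IDF document embeddings; the paper's pipeline operates at corpus scale, so this is illustrative only:

```python
# A minimal sketch, assuming k-means over document embeddings: cluster the
# corpus into "domains", then train one expert LM per cluster shard.
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

corpus = ["legal contract text ...", "python tutorial ...", "movie review ..."]
embeddings = TfidfVectorizer().fit_transform(corpus)

k = 2  # number of discovered domains == number of experts
domains = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(embeddings)

# Partition the corpus; each shard would then train its own expert LM,
# with inference ensembling experts by cluster proximity.
shards = {i: [d for d, c in zip(corpus, domains) if c == i] for i in range(k)}
```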

Branch-Train-Merge: Embarrassingly Parallel Training of Expert Language Models

2 code implementations 5 Aug 2022 Margaret Li, Suchin Gururangan, Tim Dettmers, Mike Lewis, Tim Althoff, Noah A. Smith, Luke Zettlemoyer

New expert language models (ELMs) are learned by branching from (mixtures of) ELMs in the current set, further training the parameters on data for the new domain, and then merging the resulting model back into the set for future use.
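A schematic sketch of the branch-and-merge mechanics under a simple parameter-averaging assumption; the merge weighting used in the paper may differ:

```python
# A schematic sketch, not the released implementation: branch a new ELM
# from a weighted mixture of existing experts, then merge it back.
import copy
import torch

def branch(elms, weights):
    """Initialize a new ELM as a weighted parameter average of existing ELMs."""
    new = copy.deepcopy(elms[0])
    with torch.no_grad():
        for name, p in new.named_parameters():
            p.copy_(sum(w * dict(m.named_parameters())[name]
                        for m, w in zip(elms, weights)))
    return new

def merge(elm_set, new_elm):
    """Add the newly trained expert back into the set for future branching."""
    elm_set.append(new_elm)
    return elm_set

# Usage: new = branch(elm_set, weights); train `new` on the new domain's
# data; elm_set = merge(elm_set, new).
```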

Don't Sweep your Learning Rate under the Rug: A Closer Look at Cross-modal Transfer of Pretrained Transformers

no code implementations 26 Jul 2021 Danielle Rothermel, Margaret Li, Tim Rocktäschel, Jakob Foerster

After carefully redesigning the empirical setup, we find that when tuning learning rates properly, pretrained transformers do outperform or match training from scratch in all of our tasks, but only as long as the entire model is finetuned.
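As an illustration of the kind of sweep the paper advocates, where `train_and_eval` is a hypothetical stand-in for a full finetuning run over the entire pretrained model:

```python
# Illustrative only: sweep learning rates, finetuning the whole model at
# each setting, and keep the best validation loss.
def train_and_eval(lr, finetune_all_params=True):
    # Hypothetical stub: in practice, finetune the entire pretrained
    # transformer at this learning rate and return validation loss.
    return abs(lr - 3e-4)  # toy stand-in so the sketch runs

lrs = [3e-5, 1e-4, 3e-4, 1e-3, 3e-3]
results = {lr: train_and_eval(lr) for lr in lrs}
best_lr = min(results, key=results.get)
print(f"best lr: {best_lr:g} (val loss {results[best_lr]:.4f})")
```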

Bot-Adversarial Dialogue for Safe Conversational Agents

no code implementations NAACL 2021 Jing Xu, Da Ju, Margaret Li, Y-Lan Boureau, Jason Weston, Emily Dinan

Conversational agents trained on large unlabeled corpora of human interactions will learn patterns and mimic behaviors therein, which include offensive or otherwise toxic behavior.

Recipes for Safety in Open-domain Chatbots

no code implementations 14 Oct 2020 Jing Xu, Da Ju, Margaret Li, Y-Lan Boureau, Jason Weston, Emily Dinan

Models trained on large unlabeled corpora of human interactions will learn patterns and mimic behaviors therein, which include offensive or otherwise toxic behavior and unwanted biases.

Open-Domain Conversational Agents: Current Progress, Open Problems, and Future Directions

no code implementations 22 Jun 2020 Stephen Roller, Y-Lan Boureau, Jason Weston, Antoine Bordes, Emily Dinan, Angela Fan, David Gunning, Da Ju, Margaret Li, Spencer Poff, Pratik Ringshia, Kurt Shuster, Eric Michael Smith, Arthur Szlam, Jack Urbanek, Mary Williamson

We present our view of what is necessary to build an engaging open-domain conversational agent: covering the qualities of such an agent, the pieces of the puzzle that have been built so far, and the gaping holes we have not filled yet.

Continual Learning

Don't Say That! Making Inconsistent Dialogue Unlikely with Unlikelihood Training

1 code implementation ACL 2020 Margaret Li, Stephen Roller, Ilia Kulikov, Sean Welleck, Y-Lan Boureau, Kyunghyun Cho, Jason Weston

Generative dialogue models currently suffer from a number of problems which standard maximum likelihood training does not address.
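Unlikelihood training (Welleck et al., 2019), which this work applies to dialogue, augments maximum likelihood with a term that pushes probability mass away from negative candidate tokens. A minimal token-level sketch:

```python
# A minimal sketch of token-level unlikelihood training: the standard MLE
# term plus -log(1 - p(c)) for each negative candidate token c.
import torch
import torch.nn.functional as F

def unlikelihood_loss(logits, targets, negative_candidates, alpha=1.0):
    """logits: (T, V); targets: (T,); negative_candidates: (T, V) 0/1 mask."""
    log_probs = F.log_softmax(logits, dim=-1)
    mle = F.nll_loss(log_probs, targets)  # usual likelihood term
    probs = log_probs.exp()
    # Penalize probability assigned to negative candidates.
    ul = -(torch.log1p(-probs.clamp(max=1 - 1e-6))
           * negative_candidates).sum(-1).mean()
    return mle + alpha * ul
```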

ACUTE-EVAL: Improved Dialogue Evaluation with Optimized Questions and Multi-turn Comparisons

no code implementations 6 Sep 2019 Margaret Li, Jason Weston, Stephen Roller

While dialogue remains an important end-goal of natural language research, the difficulty of evaluation is an oft-quoted reason why it remains troublesome to make real progress towards its solution.

Dialogue Evaluation
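A toy sketch of aggregating ACUTE-EVAL-style pairwise judgments, where annotators compare full multi-turn conversations from two models and pick a winner (illustrative; not the paper's analysis code):

```python
# Toy aggregation of pairwise A-vs-B judgments into a win rate,
# with a two-sided binomial test for significance.
from scipy.stats import binomtest

judgments = ["A", "A", "B", "A", "B", "A", "A"]  # toy annotator choices
wins_a = judgments.count("A")
result = binomtest(wins_a, n=len(judgments), p=0.5)
print(f"A win rate: {wins_a / len(judgments):.2f}, p = {result.pvalue:.3f}")
```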

I Know the Feeling: Learning to Converse with Empathy

no code implementations ICLR 2019 Hannah Rashkin, Eric Michael Smith, Margaret Li, Y-Lan Boureau

Beyond understanding what is being discussed, human communication requires an awareness of what someone is feeling.

Dialogue Generation

Towards Empathetic Open-domain Conversation Models: a New Benchmark and Dataset

9 code implementations ACL 2019 Hannah Rashkin, Eric Michael Smith, Margaret Li, Y-Lan Boureau

One challenge for dialogue agents is recognizing feelings in the conversation partner and replying accordingly, a key communicative skill.

Dialogue Generation
