Search Results for author: Eric Michael Smith

Found 17 papers, 8 papers with code

Towards Safety and Helpfulness Balanced Responses via Controllable Large Language Models

no code implementations • 1 Apr 2024 • Yi-Lin Tuan, Xilun Chen, Eric Michael Smith, Louis Martin, Soumya Batra, Asli Celikyilmaz, William Yang Wang, Daniel M. Bikel

As large language models (LLMs) become easily accessible nowadays, the trade-off between safety and helpfulness can significantly impact user experience.

ROBBIE: Robust Bias Evaluation of Large Generative Language Models

no code implementations • 29 Nov 2023 • David Esiobu, Xiaoqing Tan, Saghar Hosseini, Megan Ung, Yuchen Zhang, Jude Fernandes, Jane Dwivedi-Yu, Eleonora Presani, Adina Williams, Eric Michael Smith

In this work, our focus is two-fold: (1) Benchmarking: a comparison of 6 different prompt-based bias and toxicity metrics across 12 demographic axes and 5 families of generative LLMs.

Benchmarking Fairness

The Gender-GAP Pipeline: A Gender-Aware Polyglot Pipeline for Gender Characterisation in 55 Languages

1 code implementation • 31 Aug 2023 • Benjamin Muller, Belen Alastruey, Prangthip Hansanti, Elahe Kalbassi, Christophe Ropers, Eric Michael Smith, Adina Williams, Luke Zettlemoyer, Pierre Andrews, Marta R. Costa-jussà

We showcase it to report gender representation in WMT training data and development data for the News task, confirming that current data is skewed towards masculine representation.

Data Augmentation Text Generation

Llama 2: Open Foundation and Fine-Tuned Chat Models

14 code implementations • 18 Jul 2023 • Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine Babaei, Nikolay Bashlykov, Soumya Batra, Prajjwal Bhargava, Shruti Bhosale, Dan Bikel, Lukas Blecher, Cristian Canton Ferrer, Moya Chen, Guillem Cucurull, David Esiobu, Jude Fernandes, Jeremy Fu, Wenyin Fu, Brian Fuller, Cynthia Gao, Vedanuj Goswami, Naman Goyal, Anthony Hartshorn, Saghar Hosseini, Rui Hou, Hakan Inan, Marcin Kardas, Viktor Kerkez, Madian Khabsa, Isabel Kloumann, Artem Korenev, Punit Singh Koura, Marie-Anne Lachaux, Thibaut Lavril, Jenya Lee, Diana Liskovich, Yinghai Lu, Yuning Mao, Xavier Martinet, Todor Mihaylov, Pushkar Mishra, Igor Molybog, Yixin Nie, Andrew Poulton, Jeremy Reizenstein, Rashi Rungta, Kalyan Saladi, Alan Schelten, Ruan Silva, Eric Michael Smith, Ranjan Subramanian, Xiaoqing Ellen Tan, Binh Tang, Ross Taylor, Adina Williams, Jian Xiang Kuan, Puxin Xu, Zheng Yan, Iliyan Zarov, Yuchen Zhang, Angela Fan, Melanie Kambadur, Sharan Narang, Aurelien Rodriguez, Robert Stojnic, Sergey Edunov, Thomas Scialom

In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters.

Ranked #2 on Question Answering on PubChemQA

Arithmetic Reasoning +5

Improving Open Language Models by Learning from Organic Interactions

no code implementations • 7 Jun 2023 • Jing Xu, Da Ju, Joshua Lane, Mojtaba Komeili, Eric Michael Smith, Megan Ung, Morteza Behrooz, William Ngan, Rashel Moritz, Sainbayar Sukhbaatar, Y-Lan Boureau, Jason Weston, Kurt Shuster

We present BlenderBot 3x, an update on the conversational model BlenderBot 3, which is now trained using organic conversation and feedback data from participating users of the system in order to improve both its skills and safety.

BlenderBot 3: a deployed conversational agent that continually learns to responsibly engage

2 code implementations • 5 Aug 2022 • Kurt Shuster, Jing Xu, Mojtaba Komeili, Da Ju, Eric Michael Smith, Stephen Roller, Megan Ung, Moya Chen, Kushal Arora, Joshua Lane, Morteza Behrooz, William Ngan, Spencer Poff, Naman Goyal, Arthur Szlam, Y-Lan Boureau, Melanie Kambadur, Jason Weston

We present BlenderBot 3, a 175B parameter dialogue model capable of open-domain conversation with access to the internet and a long-term memory, and having been trained on a large number of user defined tasks.

Continual Learning

"I'm sorry to hear that": Finding New Biases in Language Models with a Holistic Descriptor Dataset

2 code implementations • 18 May 2022 • Eric Michael Smith, Melissa Hall, Melanie Kambadur, Eleonora Presani, Adina Williams

As language models grow in popularity, it becomes increasingly important to clearly measure all possible markers of demographic identity in order to avoid perpetuating existing societal harms.

Sentence

Human Evaluation of Conversations is an Open Problem: comparing the sensitivity of various methods for evaluating dialogue agents

no code implementations • NLP4ConvAI (ACL) 2022 • Eric Michael Smith, Orion Hsu, Rebecca Qian, Stephen Roller, Y-Lan Boureau, Jason Weston

At the heart of improving conversational AI is the open problem of how to evaluate conversations.

Dialogue Evaluation

Hi, my name is Martha: Using names to measure and mitigate bias in generative dialogue models

no code implementations • 7 Sep 2021 • Eric Michael Smith, Adina Williams

All AI models are susceptible to learning biases in data that they are trained on.

Multi-Modal Open-Domain Dialogue

no code implementations • EMNLP 2021 • Kurt Shuster, Eric Michael Smith, Da Ju, Jason Weston

Recent work in open-domain conversational agents has demonstrated that significant improvements in model engagingness and humanness metrics can be achieved via massive scaling in both pre-training data and model size (Adiwardana et al., 2020; Roller et al., 2020).

Ranked #1 on Visual Dialog on Wizard of Wikipedia

Visual Dialog

Controlling Style in Generated Dialogue

1 code implementation • 22 Sep 2020 • Eric Michael Smith, Diana Gonzalez-Rico, Emily Dinan, Y-Lan Boureau

Open-domain conversation models have become good at generating natural-sounding dialogue, using very large architectures with billions of trainable parameters.

Dialogue Generation

Open-Domain Conversational Agents: Current Progress, Open Problems, and Future Directions

no code implementations • 22 Jun 2020 • Stephen Roller, Y-Lan Boureau, Jason Weston, Antoine Bordes, Emily Dinan, Angela Fan, David Gunning, Da Ju, Margaret Li, Spencer Poff, Pratik Ringshia, Kurt Shuster, Eric Michael Smith, Arthur Szlam, Jack Urbanek, Mary Williamson

We present our view of what is necessary to build an engaging open-domain conversational agent: covering the qualities of such an agent, the pieces of the puzzle that have been built so far, and the gaping holes we have not filled yet.

Continual Learning

Can You Put it All Together: Evaluating Conversational Agents' Ability to Blend Skills

2 code implementations • ACL 2020 • Eric Michael Smith, Mary Williamson, Kurt Shuster, Jason Weston, Y-Lan Boureau

Being engaging, knowledgeable, and empathetic are all desirable general qualities in a conversational agent.

Zero-Shot Fine-Grained Style Transfer: Leveraging Distributed Continuous Style Representations to Transfer To Unseen Styles

no code implementations • 10 Nov 2019 • Eric Michael Smith, Diana Gonzalez-Rico, Emily Dinan, Y-Lan Boureau

Text style transfer is usually performed using attributes that can take a handful of discrete values (e.g., positive to negative reviews).

Attribute Style Transfer +1

I Know the Feeling: Learning to Converse with Empathy

no code implementations • ICLR 2019 • Hannah Rashkin, Eric Michael Smith, Margaret Li, Y-Lan Boureau

Beyond understanding what is being discussed, human communication requires an awareness of what someone is feeling.

Dialogue Generation

Towards Empathetic Open-domain Conversation Models: a New Benchmark and Dataset

9 code implementations • ACL 2019 • Hannah Rashkin, Eric Michael Smith, Margaret Li, Y-Lan Boureau

One challenge for dialogue agents is recognizing feelings in the conversation partner and replying accordingly, a key communicative skill.

Dialogue Generation

Multiple-Attribute Text Style Transfer

3 code implementations • 1 Nov 2018 • Sandeep Subramanian, Guillaume Lample, Eric Michael Smith, Ludovic Denoyer, Marc'Aurelio Ranzato, Y-Lan Boureau

The dominant approach to unsupervised "style transfer" in text is based on the idea of learning a latent representation, which is independent of the attributes specifying its "style".

Attribute Disentanglement +3
