Search Results for author: Charith Peris

Found 14 papers, 3 papers with code

Partial Federated Learning

no code implementations • 3 Mar 2024 • Tiantian Feng, Anil Ramakrishna, Jimit Majmudar, Charith Peris, Jixuan Wang, Clement Chung, Richard Zemel, Morteza Ziyadi, Rahul Gupta

Federated Learning (FL) is a popular algorithm to train machine learning models on user data constrained to edge devices (for example, mobile phones) due to privacy concerns.

Contrastive Learning Federated Learning

On the steerability of large language models toward data-driven personas

no code implementations • 8 Nov 2023 • Junyi Li, Ninareh Mehrabi, Charith Peris, Palash Goyal, Kai-Wei Chang, Aram Galstyan, Richard Zemel, Rahul Gupta

Large language models (LLMs) are known to generate biased responses where the opinions of certain groups and populations are underrepresented.

Collaborative Filtering Language Modelling +1

Coordinated Replay Sample Selection for Continual Federated Learning

no code implementations • 23 Oct 2023 • Jack Good, Jimit Majmudar, Christophe Dupuy, Jixuan Wang, Charith Peris, Clement Chung, Richard Zemel, Rahul Gupta

Continual Federated Learning (CFL) combines Federated Learning (FL), the decentralized learning of a central model on a number of client devices that may not communicate their data, and Continual Learning (CL), the learning of a model from a continual stream of data without keeping the entire history.

Continual Learning Federated Learning

Holistic Survey of Privacy and Fairness in Machine Learning

no code implementations • 28 Jul 2023 • Sina Shaham, Arash Hajisafi, Minh K Quan, Dinh C Nguyen, Bhaskar Krishnamachari, Charith Peris, Gabriel Ghinita, Cyrus Shahabi, Pubudu N. Pathirana

Privacy and fairness are two crucial pillars of responsible Artificial Intelligence (AI) and trustworthy Machine Learning (ML).

Fairness

Controlling the Extraction of Memorized Data from Large Language Models via Prompt-Tuning

1 code implementation • 19 May 2023 • Mustafa Safa Ozdayi, Charith Peris, Jack FitzGerald, Christophe Dupuy, Jimit Majmudar, Haidar Khan, Rahil Parikh, Rahul Gupta

We present a novel approach which uses prompt-tuning to control the extraction rates of memorized content in LLMs.

The Massively Multilingual Natural Language Understanding 2022 (MMNLU-22) Workshop and Competition

no code implementations • 13 Dec 2022 • Christopher Hench, Charith Peris, Jack FitzGerald, Kay Rottmann

Despite recent progress in Natural Language Understanding (NLU), the creation of multilingual NLU systems remains a challenge.

Intent Classification +3

Knowledge Distillation Transfer Sets and their Impact on Downstream NLU Tasks

no code implementations • 10 Oct 2022 • Charith Peris, Lizhen Tan, Thomas Gueudre, Turan Gojayev, Pan Wei, Gokmen Oz

Yet, the generic corpora used to pretrain the teacher and the corpora associated with the downstream target domain are often significantly different, which raises a natural question: should the student be distilled over the generic corpora, so as to learn from high-quality teacher predictions, or over the downstream task corpora to align with finetuning?

domain classification intent-classification +5

AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model

1 code implementation • 2 Aug 2022 • Saleh Soltan, Shankar Ananthakrishnan, Jack FitzGerald, Rahul Gupta, Wael Hamza, Haidar Khan, Charith Peris, Stephen Rawls, Andy Rosenbaum, Anna Rumshisky, Chandana Satya Prakash, Mukund Sridhar, Fabian Triefenbach, Apurv Verma, Gokhan Tur, Prem Natarajan

In this work, we demonstrate that multilingual large-scale sequence-to-sequence (seq2seq) models, pre-trained on a mixture of denoising and Causal Language Modeling (CLM) tasks, are more efficient few-shot learners than decoder-only models on various tasks.

Ranked #14 on Natural Language Inference on CommitmentBank

Causal Language Modeling Common Sense Reasoning +8

Alexa Teacher Model: Pretraining and Distilling Multi-Billion-Parameter Encoders for Natural Language Understanding Systems

no code implementations • 15 Jun 2022 • Jack FitzGerald, Shankar Ananthakrishnan, Konstantine Arkoudas, Davide Bernardi, Abhishek Bhagia, Claudio Delli Bovi, Jin Cao, Rakesh Chada, Amit Chauhan, Luoxin Chen, Anurag Dwarakanath, Satyam Dwivedi, Turan Gojayev, Karthik Gopalakrishnan, Thomas Gueudre, Dilek Hakkani-Tur, Wael Hamza, Jonathan Hueser, Kevin Martin Jose, Haidar Khan, Beiye Liu, Jianhua Lu, Alessandro Manzotti, Pradeep Natarajan, Karolina Owczarzak, Gokmen Oz, Enrico Palumbo, Charith Peris, Chandana Satya Prakash, Stephen Rawls, Andy Rosenbaum, Anjali Shenoy, Saleh Soltan, Mukund Harakere Sridhar, Liz Tan, Fabian Triefenbach, Pan Wei, Haiyang Yu, Shuai Zheng, Gokhan Tur, Prem Natarajan

We present results from a large-scale experiment on pretraining encoders with non-embedding parameter counts ranging from 700M to 9.3B, their subsequent distillation into smaller models ranging from 17M-170M parameters, and their application to the Natural Language Understanding (NLU) component of a virtual assistant system.

Cross-Lingual Natural Language Inference intent-classification +5

Differentially Private Decoding in Large Language Models

no code implementations • 26 May 2022 • Jimit Majmudar, Christophe Dupuy, Charith Peris, Sami Smaili, Rahul Gupta, Richard Zemel

Recent large-scale natural language processing (NLP) systems use a Large Language Model (LLM) pre-trained on massive and diverse corpora as a head start.

Language Modelling Large Language Model +1

MASSIVE: A 1M-Example Multilingual Natural Language Understanding Dataset with 51 Typologically-Diverse Languages

5 code implementations • 18 Apr 2022 • Jack FitzGerald, Christopher Hench, Charith Peris, Scott Mackie, Kay Rottmann, Ana Sanchez, Aaron Nash, Liam Urbach, Vishesh Kakarala, Richa Singh, Swetha Ranganath, Laurie Crist, Misha Britan, Wouter Leeuwis, Gokhan Tur, Prem Natarajan

We present the MASSIVE dataset--Multilingual Amazon Slu resource package (SLURP) for Slot-filling, Intent classification, and Virtual assistant Evaluation.

Ranked #1 on Slot Filling on MASSIVE

Intent Classification +4

Generative Adversarial Networks for Annotated Data Augmentation in Data Sparse NLU

no code implementations • ICON 2020 • Olga Golovneva, Charith Peris

In this paper, we present our results on boosting NLU model performance through training data augmentation using a sequential generative adversarial network (GAN).

Data Augmentation Generative Adversarial Network +3

Using multiple ASR hypotheses to boost i18n NLU performance

no code implementations • ICON 2020 • Charith Peris, Gokmen Oz, Khadige Abboud, Venkata sai Varada, Prashan Wanigasekara, Haidar Khan

For IC and NER multi-task experiments, when evaluating on the mismatched test set, we see improvements across all domains in German and in 17 out of 19 domains in Portuguese (improvements based on change in SeMER scores).

Abstractive Text Summarization Automatic Speech Recognition +10

Using Alternate Representations of Text for Natural Language Understanding

no code implementations • WS 2020 • Venkat Varada, Charith Peris, Yangsook Park, Christopher Dipersio

One of the core components of voice assistants is the Natural Language Understanding (NLU) model.

Natural Language Understanding
