Search Results for author: Ganesh Jawahar

Found 14 papers, 8 papers with code

LLM Performance Predictors are good initializers for Architecture Search

no code implementations • 25 Oct 2023 • Ganesh Jawahar, Muhammad Abdul-Mageed, Laks V. S. Lakshmanan, Dujian Ding

We show that HS-NAS performs very similarly to SOTA NAS across benchmarks, reduces search hours by roughly 50%, and in some cases improves latency, GFLOPs, and model size.

Machine Translation Neural Architecture Search

Mixture-of-Supernets: Improving Weight-Sharing Supernet Training with Architecture-Routed Mixture-of-Experts

no code implementations • 8 Jun 2023 • Ganesh Jawahar, Haichuan Yang, Yunyang Xiong, Zechun Liu, Dilin Wang, Fei Sun, Meng Li, Aasish Pappu, Barlas Oguz, Muhammad Abdul-Mageed, Laks V. S. Lakshmanan, Raghuraman Krishnamoorthi, Vikas Chandra

In addition, the proposed method achieves SOTA performance in NAS for building fast machine translation models, yielding a better latency-BLEU tradeoff than HAT, the state-of-the-art NAS for MT.

Language Modelling Machine Translation +2

Orca: Progressive Learning from Complex Explanation Traces of GPT-4

3 code implementations • 5 Jun 2023 • Subhabrata Mukherjee, Arindam Mitra, Ganesh Jawahar, Sahaj Agarwal, Hamid Palangi, Ahmed Awadallah

To address these challenges, we develop Orca (We are working with our legal team to publicly release a diff of the model weights in accordance with LLaMA's release policy to be published at https://aka.ms/orca-lm), a 13-billion parameter model that learns to imitate the reasoning process of LFMs.

Imitation Learning Knowledge Distillation

AutoMoE: Heterogeneous Mixture-of-Experts with Adaptive Computation for Efficient Neural Machine Translation

1 code implementation • 14 Oct 2022 • Ganesh Jawahar, Subhabrata Mukherjee, Xiaodong Liu, Young Jin Kim, Muhammad Abdul-Mageed, Laks V. S. Lakshmanan, Ahmed Hassan Awadallah, Sebastien Bubeck, Jianfeng Gao

Furthermore, existing MoE works do not consider computational constraints (e.g., FLOPs, latency) to guide their design.

Machine Translation Neural Architecture Search +1

Small Character Models Match Large Word Models for Autocomplete Under Memory Constraints

no code implementations • 6 Oct 2022 • Ganesh Jawahar, Subhabrata Mukherjee, Debadeepta Dey, Muhammad Abdul-Mageed, Laks V. S. Lakshmanan, Caio Cesar Teodoro Mendes, Gustavo Henrique de Rosa, Shital Shah

In this work, we study the more challenging open-domain setting consisting of low frequency user prompt patterns (or broad prompts, e.g., prompt about 93rd academy awards) and demonstrate the effectiveness of character-based language models.

Inductive Bias

Automatic Detection of Entity-Manipulated Text using Factual Knowledge

1 code implementation • ACL 2022 • Ganesh Jawahar, Muhammad Abdul-Mageed, Laks V. S. Lakshmanan

We propose a neural network based detector that detects manipulated news articles by reasoning about the facts mentioned in the article.

Contrastive Learning of Sociopragmatic Meaning in Social Media

1 code implementation • 15 Mar 2022 • Chiyu Zhang, Muhammad Abdul-Mageed, Ganesh Jawahar

Recent progress in representation and contrastive learning in NLP has not widely considered the class of sociopragmatic meaning (i.e., meaning in interaction within different language communities).

Contrastive Learning

Simple, Interpretable and Stable Method for Detecting Words with Usage Change across Corpora

1 code implementation • ACL 2020 • Hila Gonen, Ganesh Jawahar, Djamé Seddah, Yoav Goldberg

The problem of comparing two bodies of text and searching for words that differ in their usage between them arises often in digital humanities and computational social science.

Word Embeddings

Exploring Text-to-Text Transformers for English to Hinglish Machine Translation with Synthetic Code-Mixing

no code implementations • NAACL (CALCS) 2021 • Ganesh Jawahar, El Moatez Billah Nagoudi, Muhammad Abdul-Mageed, Laks V. S. Lakshmanan

We describe models focused on the understudied problem of translating between monolingual and code-mixed language pairs.

Language Modelling Machine Translation +1

Automatic Detection of Machine Generated Text: A Critical Survey

1 code implementation • COLING 2020 • Ganesh Jawahar, Muhammad Abdul-Mageed, Laks V. S. Lakshmanan

Detectors that can distinguish text generated by a text generative model (TGM) from human-written text play a vital role in mitigating such misuse of TGMs.

Contextualized Diachronic Word Representations

1 code implementation • WS 2019 • Ganesh Jawahar, Djamé Seddah

We devise a novel attentional model, based on Bernoulli word embeddings, that are conditioned on contextual extra-linguistic (social) features such as network, spatial and socio-economic variables, which are associated with Twitter users, as well as topic-based features.

Diachronic Word Embeddings Inductive Bias +1

What Does BERT Learn about the Structure of Language?

1 code implementation • ACL 2019 • Ganesh Jawahar, Benoît Sagot, Djamé Seddah

BERT is a recent language representation model that has performed surprisingly well in diverse language understanding benchmarks.

ELMoLex: Connecting ELMo and Lexicon Features for Dependency Parsing

no code implementations • CONLL 2018 • Ganesh Jawahar, Benjamin Muller, Amal Fethi, Louis Martin, Éric Villemonte de la Clergerie, Benoît Sagot, Djamé Seddah

We augment the deep Biaffine (BiAF) parser (Dozat and Manning, 2016) with novel features to perform competitively: we utilize an in-domain version of ELMo features (Peters et al., 2018), which provide context-dependent word representations; we utilize disambiguated, embedded, morphosyntactic features from lexicons (Sagot, 2018), which complement the existing feature set.

Dependency Parsing Language Modelling

Improving Distributed Representations of Tweets - Present and Future

no code implementations • ACL 2017 • Ganesh Jawahar

Information Retrieval Representation Learning +1
