1 code implementation • 25 Oct 2023 • Ganesh Jawahar, Muhammad Abdul-Mageed, Laks V. S. Lakshmanan, Dujian Ding
In this work, we utilize Large Language Models (LLMs) for a novel use case: constructing Performance Predictors (PP) that estimate the performance of specific deep neural network architectures on downstream tasks.
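A minimal sketch of the idea of prompting an LLM to act as a performance predictor for a candidate architecture. The prompt template, field names, and the `llm` callable are illustrative assumptions, not the paper's exact interface.

```python
# Sketch: use an LLM as a Performance Predictor (PP) by describing a candidate
# architecture in text and asking for a score. All names here are hypothetical.

def build_pp_prompt(task: str, architecture: dict) -> str:
    """Describe a candidate architecture and ask the LLM for a numeric score."""
    arch_desc = ", ".join(f"{k}={v}" for k, v in architecture.items())
    return (
        f"Task: {task}\n"
        f"Candidate architecture: {arch_desc}\n"
        "Predict the downstream performance of this architecture. "
        "Answer with a single number."
    )

def predict_performance(llm, task: str, architecture: dict) -> float:
    """`llm` is any callable mapping a prompt string to a text completion."""
    response = llm(build_pp_prompt(task, architecture))
    return float(response.strip())

if __name__ == "__main__":
    dummy_llm = lambda prompt: "27.4"  # stand-in for a real LLM call
    arch = {"layers": 6, "hidden_size": 512, "heads": 8}
    print(predict_performance(dummy_llm, "WMT14 En-De translation", arch))
```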
1 code implementation • 8 Jun 2023 • Ganesh Jawahar, Haichuan Yang, Yunyang Xiong, Zechun Liu, Dilin Wang, Fei Sun, Meng Li, Aasish Pappu, Barlas Oguz, Muhammad Abdul-Mageed, Laks V. S. Lakshmanan, Raghuraman Krishnamoorthi, Vikas Chandra
In NLP tasks such as machine translation and pre-trained language modeling, there is a significant performance gap between the supernet and training the same model architecture from scratch, which necessitates retraining once the optimal architecture has been identified.
4 code implementations • 5 Jun 2023 • Subhabrata Mukherjee, Arindam Mitra, Ganesh Jawahar, Sahaj Agarwal, Hamid Palangi, Ahmed Awadallah
To address these challenges, we develop Orca (We are working with our legal team to publicly release a diff of the model weights in accordance with LLaMA's release policy to be published at https://aka.ms/orca-lm), a 13-billion parameter model that learns to imitate the reasoning process of LFMs.
1 code implementation • 14 Oct 2022 • Ganesh Jawahar, Subhabrata Mukherjee, Xiaodong Liu, Young Jin Kim, Muhammad Abdul-Mageed, Laks V. S. Lakshmanan, Ahmed Hassan Awadallah, Sebastien Bubeck, Jianfeng Gao
Furthermore, existing MoE works do not consider computational constraints (e.g., FLOPs, latency) to guide their design.
no code implementations • 6 Oct 2022 • Ganesh Jawahar, Subhabrata Mukherjee, Debadeepta Dey, Muhammad Abdul-Mageed, Laks V. S. Lakshmanan, Caio Cesar Teodoro Mendes, Gustavo Henrique de Rosa, Shital Shah
In this work, we study the more challenging open-domain setting consisting of low-frequency user prompt patterns (or broad prompts, e.g., prompts about the 93rd Academy Awards) and demonstrate the effectiveness of character-based language models.
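A minimal sketch of character-level completion for broad prompts, using a toy character n-gram model in place of the neural character-based language models described above; the corpus and hyperparameters are illustrative only.

```python
# Toy character-level completion: count next-character frequencies per context
# and greedily extend a prefix. Stand-in for a neural character LM.
from collections import Counter, defaultdict

def train_char_ngram(corpus, n=4):
    """Count next-character frequencies for every (n-1)-character context."""
    model = defaultdict(Counter)
    for query in corpus:
        padded = " " * (n - 1) + query
        for i in range(len(query)):
            model[padded[i:i + n - 1]][padded[i + n - 1]] += 1
    return model

def complete(model, prefix, n=4, max_len=30):
    """Greedily extend a prefix one character at a time."""
    out = prefix
    for _ in range(max_len):
        context = (" " * (n - 1) + out)[-(n - 1):]
        if not model[context]:
            break
        out += model[context].most_common(1)[0][0]
    return out

queries = ["93rd academy awards winners", "93rd academy awards date"]
model = train_char_ngram(queries)
print(complete(model, "93rd aca"))
```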
1 code implementation • ACL 2022 • Ganesh Jawahar, Muhammad Abdul-Mageed, Laks V. S. Lakshmanan
We propose a neural network-based detector that identifies manipulated news articles by reasoning about the facts mentioned in them.
1 code implementation • 15 Mar 2022 • Chiyu Zhang, Muhammad Abdul-Mageed, Ganesh Jawahar
Recent progress in representation and contrastive learning in NLP has not widely considered the class of \textit{sociopragmatic meaning} (i.e., meaning in interaction within different language communities).
1 code implementation • ACL 2020 • Hila Gonen, Ganesh Jawahar, Djamé Seddah, Yoav Goldberg
The problem of comparing two bodies of text and searching for words that differ in their usage between them arises often in digital humanities and computational social science.
no code implementations • NAACL (CALCS) 2021 • Ganesh Jawahar, El Moatez Billah Nagoudi, Muhammad Abdul-Mageed, Laks V. S. Lakshmanan
We describe models focused at the understudied problem of translating between monolingual and code-mixed language pairs.
1 code implementation • COLING 2020 • Ganesh Jawahar, Muhammad Abdul-Mageed, Laks V. S. Lakshmanan
Detectors that can distinguish text generated by a TGM from human-written text play a vital role in mitigating such misuse of TGMs.
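A hedged sketch of a simple baseline detector that separates machine-generated from human-written text with TF-IDF features and logistic regression. This is an illustrative stand-in, not the detectors studied in the paper, and the tiny in-line dataset is fabricated purely for the example.

```python
# Baseline detector sketch: TF-IDF + logistic regression over a toy dataset.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "The senator announced a new infrastructure bill on Tuesday.",    # human-written
    "The the senator announced announced a bill bill on on Tuesday.", # machine-like
]
labels = [0, 1]  # 0 = human-written, 1 = TGM-generated

detector = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
detector.fit(texts, labels)
print(detector.predict(["A new bill was announced by the senator."]))
```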
1 code implementation • WS 2019 • Ganesh Jawahar, Djamé Seddah
We devise a novel attentional model, based on Bernoulli word embeddings, that is conditioned on contextual extra-linguistic (social) features associated with Twitter users, such as network, spatial, and socio-economic variables, as well as on topic-based features.
1 code implementation • ACL 2019 • Ganesh Jawahar, Benoît Sagot, Djamé Seddah
BERT is a recent language representation model that has performed surprisingly well on diverse language understanding benchmarks.
no code implementations • CONLL 2018 • Ganesh Jawahar, Benjamin Muller, Amal Fethi, Louis Martin, Éric Villemonte de la Clergerie, Benoît Sagot, Djamé Seddah
We augment the deep Biaffine (BiAF) parser (Dozat and Manning, 2016) with novel features to perform competitively: we utilize an in-domain version of ELMo features (Peters et al., 2018), which provide context-dependent word representations, and we utilize disambiguated, embedded, morphosyntactic features from lexicons (Sagot, 2018), which complement the existing feature set.
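A minimal sketch of the feature-augmentation idea: concatenate context-dependent (ELMo-style) word vectors with embedded morphosyntactic lexicon features before passing tokens to a biaffine parser. The dimensions and random tensors below are placeholders, not the actual in-domain ELMo or lexicon setup used in the paper.

```python
# Sketch: build the parser's token representations by concatenating
# context-dependent vectors with embedded lexicon-derived features.
import torch

batch, seq_len = 2, 10
elmo_dim, lexicon_dim = 1024, 64

elmo_reprs = torch.randn(batch, seq_len, elmo_dim)        # context-dependent word vectors
lexicon_feats = torch.randn(batch, seq_len, lexicon_dim)  # embedded morphosyntactic tags

parser_input = torch.cat([elmo_reprs, lexicon_feats], dim=-1)
print(parser_input.shape)  # torch.Size([2, 10, 1088])
```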