Search Results for author: Vedanuj Goswami

Found 17 papers, 10 papers with code

Llama 2: Open Foundation and Fine-Tuned Chat Models

14 code implementations • 18 Jul 2023 • Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine Babaei, Nikolay Bashlykov, Soumya Batra, Prajjwal Bhargava, Shruti Bhosale, Dan Bikel, Lukas Blecher, Cristian Canton Ferrer, Moya Chen, Guillem Cucurull, David Esiobu, Jude Fernandes, Jeremy Fu, Wenyin Fu, Brian Fuller, Cynthia Gao, Vedanuj Goswami, Naman Goyal, Anthony Hartshorn, Saghar Hosseini, Rui Hou, Hakan Inan, Marcin Kardas, Viktor Kerkez, Madian Khabsa, Isabel Kloumann, Artem Korenev, Punit Singh Koura, Marie-Anne Lachaux, Thibaut Lavril, Jenya Lee, Diana Liskovich, Yinghai Lu, Yuning Mao, Xavier Martinet, Todor Mihaylov, Pushkar Mishra, Igor Molybog, Yixin Nie, Andrew Poulton, Jeremy Reizenstein, Rashi Rungta, Kalyan Saladi, Alan Schelten, Ruan Silva, Eric Michael Smith, Ranjan Subramanian, Xiaoqing Ellen Tan, Binh Tang, Ross Taylor, Adina Williams, Jian Xiang Kuan, Puxin Xu, Zheng Yan, Iliyan Zarov, Yuchen Zhang, Angela Fan, Melanie Kambadur, Sharan Narang, Aurelien Rodriguez, Robert Stojnic, Sergey Edunov, Thomas Scialom

In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters.

Ranked #2 on Question Answering on PubChemQA

Arithmetic Reasoning +5

52,869

Paper
Code

Multilingual Speech-to-Speech Translation into Multiple Target Languages

no code implementations • 17 Jul 2023 • Hongyu Gong, Ning Dong, Sravya Popuri, Vedanuj Goswami, Ann Lee, Juan Pino

Despite a few studies on multilingual S2ST, their focus is the multilinguality on the source side, i. e., the translation from multiple source languages to one target language.

Language Identification Speech-to-Speech Translation +1

Paper
Add Code

Revisiting Machine Translation for Cross-lingual Classification

no code implementations • 23 May 2023 • Mikel Artetxe, Vedanuj Goswami, Shruti Bhosale, Angela Fan, Luke Zettlemoyer

Machine Translation (MT) has been widely used for cross-lingual classification, either by translating the test set into English and running inference with a monolingual model (translate-test), or translating the training set into the target languages and finetuning a multilingual model (translate-train).

Classification Cross-Lingual Transfer +2

Paper
Add Code

Towards Being Parameter-Efficient: A Stratified Sparsely Activated Transformer with Dynamic Capacity

1 code implementation • 3 May 2023 • Haoran Xu, Maha Elbayad, Kenton Murray, Jean Maillard, Vedanuj Goswami

Mixture-of-experts (MoE) models that employ sparse activation have demonstrated effectiveness in significantly increasing the number of parameters while maintaining low computational requirements per token.

Machine Translation Translation

Paper
Code

MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation

1 code implementation • 1 Mar 2023 • Mohamed Anwar, Bowen Shi, Vedanuj Goswami, Wei-Ning Hsu, Juan Pino, Changhan Wang

We introduce MuAViC, a multilingual audio-visual corpus for robust speech recognition and robust speech-to-text translation providing 1200 hours of audio-visual speech in 9 languages.

Audio-Visual Speech Recognition Robust Speech Recognition +4

335

Paper
Code

Language-Aware Multilingual Machine Translation with Self-Supervised Learning

1 code implementation • 10 Feb 2023 • Haoran Xu, Jean Maillard, Vedanuj Goswami

In this work, we first investigate how to utilize intra-distillation to learn more *language-specific* parameters and then show the importance of these language-specific parameters.

Cross-Lingual Transfer Denoising +3

Paper
Code

Causes and Cures for Interference in Multilingual Translation

no code implementations • 14 Dec 2022 • Uri Shaham, Maha Elbayad, Vedanuj Goswami, Omer Levy, Shruti Bhosale

Multilingual machine translation models can benefit from synergy between different language pairs, but also suffer from interference.

Machine Translation Translation

Paper
Add Code

No Language Left Behind: Scaling Human-Centered Machine Translation

7 code implementations • Meta AI 2022 • NLLB team, Marta R. Costa-jussà, James Cross, Onur Çelebi, Maha Elbayad, Kenneth Heafield, Kevin Heffernan, Elahe Kalbassi, Janice Lam, Daniel Licht, Jean Maillard, Anna Sun, Skyler Wang, Guillaume Wenzek, Al Youngblood, Bapi Akula, Loic Barrault, Gabriel Mejia Gonzalez, Prangthip Hansanti, John Hoffman, Semarley Jarrett, Kaushik Ram Sadagopan, Dirk Rowe, Shannon Spruit, Chau Tran, Pierre Andrews, Necip Fazil Ayan, Shruti Bhosale, Sergey Edunov, Angela Fan, Cynthia Gao, Vedanuj Goswami, Francisco Guzmán, Philipp Koehn, Alexandre Mourachko, Christophe Ropers, Safiyyah Saleem, Holger Schwenk, Jeff Wang

Driven by the goal of eradicating language barriers on a global scale, machine translation has solidified itself as a key focus of artificial intelligence research today.

Ranked #1 on Machine Translation on IWSLT2017 French-English (SacreBLEU metric)

Machine Translation Translation

29,237

Paper
Code

FLAVA: A Foundational Language And Vision Alignment Model

3 code implementations • CVPR 2022 • Amanpreet Singh, Ronghang Hu, Vedanuj Goswami, Guillaume Couairon, Wojciech Galuba, Marcus Rohrbach, Douwe Kiela

State-of-the-art vision and vision-and-language models rely on large-scale visio-linguistic pretraining for obtaining good performance on a variety of downstream tasks.

Ranked #4 on Image Retrieval on MS COCO

Image Retrieval Image-to-Text Retrieval +3

1,288

Paper
Code

Tricks for Training Sparse Translation Models

no code implementations • NAACL 2022 • Dheeru Dua, Shruti Bhosale, Vedanuj Goswami, James Cross, Mike Lewis, Angela Fan

Multi-task learning with an unbalanced data distribution skews model learning towards high resource tasks, especially when model capacity is fixed and fully shared across all tasks.

Machine Translation Multi-Task Learning +1

Paper
Add Code

Human-Adversarial Visual Question Answering

no code implementations • NeurIPS 2021 • Sasha Sheng, Amanpreet Singh, Vedanuj Goswami, Jose Alberto Lopez Magana, Wojciech Galuba, Devi Parikh, Douwe Kiela

Human subjects interact with a state-of-the-art VQA model, and for each image in the dataset, attempt to find a question where the model's predicted answer is incorrect.

Question Answering Visual Question Answering

Paper
Add Code

Creative Sketch Generation

1 code implementation • ICLR 2021 • Songwei Ge, Vedanuj Goswami, C. Lawrence Zitnick, Devi Parikh

Sketching or doodling is a popular creative activity that people engage in.

Generative Adversarial Network

103

Paper
Code

The Hateful Memes Challenge: Detecting Hate Speech in Multimodal Memes

5 code implementations • NeurIPS 2020 • Douwe Kiela, Hamed Firooz, Aravind Mohan, Vedanuj Goswami, Amanpreet Singh, Pratik Ringshia, Davide Testuggine

This work proposes a new challenge set for multimodal classification, focusing on detecting hate speech in multimodal memes.

Binary Classification Classification +1

5,414

Paper
Code

MoVie: Revisiting Modulated Convolutions for Visual Counting and Beyond

1 code implementation • ICLR 2021 • Duy-Kien Nguyen, Vedanuj Goswami, Xinlei Chen

This paper focuses on visual counting, which aims to predict the number of occurrences given a natural image and a query (e. g. a question or a category).

Ranked #1 on Object Counting on HowMany-QA

Object Counting Question Answering +1

5,414

Paper
Code

Are we pretraining it right? Digging deeper into visio-linguistic pretraining

no code implementations • 19 Apr 2020 • Amanpreet Singh, Vedanuj Goswami, Devi Parikh

Surprisingly, we show that automatically generated data in a domain closer to the downstream task (e. g., VQA v2) is a better choice for pretraining than "natural" data but of a slightly different domain (e. g., Conceptual Captions).

Visual Question Answering (VQA)

Paper
Add Code

12-in-1: Multi-Task Vision and Language Representation Learning

5 code implementations • CVPR 2020 • Jiasen Lu, Vedanuj Goswami, Marcus Rohrbach, Devi Parikh, Stefan Lee

Much of vision-and-language research focuses on a small but diverse set of independent tasks and supporting datasets often studied in isolation; however, the visually-grounded language understanding skills required for success at these tasks overlap significantly.

Image Retrieval Question Answering +3

790

Paper
Code

Only Time Can Tell: Discovering Temporal Data for Temporal Modeling

no code implementations • 19 Jul 2019 • Laura Sevilla-Lara, Shengxin Zha, Zhicheng Yan, Vedanuj Goswami, Matt Feiszli, Lorenzo Torresani

However, in current video datasets it has been observed that action classes can often be recognized without any temporal information from a single frame of video.

Benchmarking Motion Estimation +1

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.