Search Results for author: Harm de Vries

Found 27 papers, 19 papers with code

Talk The Walk: Navigating Grids in New York City through Grounded Dialogue

no code implementations • ICLR 2019 • Harm de Vries, Kurt Shuster, Dhruv Batra, Devi Parikh, Jason Weston, Douwe Kiela

We introduce "Talk The Walk", the first large-scale dialogue dataset grounded in action and perception.

Paper

StarCoder 2 and The Stack v2: The Next Generation

no code implementations • 29 Feb 2024 • Anton Lozhkov, Raymond Li, Loubna Ben allal, Federico Cassano, Joel Lamy-Poirier, Nouamane Tazi, Ao Tang, Dmytro Pykhtar, Jiawei Liu, Yuxiang Wei, Tianyang Liu, Max Tian, Denis Kocetkov, Arthur Zucker, Younes Belkada, Zijian Wang, Qian Liu, Dmitry Abulkhanov, Indraneil Paul, Zhuang Li, Wen-Ding Li, Megan Risdal, Jia Li, Jian Zhu, Terry Yue Zhuo, Evgenii Zheltonozhskii, Nii Osae Osae Dade, Wenhao Yu, Lucas Krauß, Naman jain, Yixuan Su, Xuanli He, Manan Dey, Edoardo Abati, Yekun Chai, Niklas Muennighoff, Xiangru Tang, Muhtasham Oblokulov, Christopher Akiki, Marc Marone, Chenghao Mou, Mayank Mishra, Alex Gu, Binyuan Hui, Tri Dao, Armel Zebaze, Olivier Dehaene, Nicolas Patry, Canwen Xu, Julian McAuley, Han Hu, Torsten Scholak, Sebastien Paquet, Jennifer Robinson, Carolyn Jane Anderson, Nicolas Chapados, Mostofa Patwary, Nima Tajbakhsh, Yacine Jernite, Carlos Muñoz Ferrandis, Lingming Zhang, Sean Hughes, Thomas Wolf, Arjun Guha, Leandro von Werra, Harm de Vries

Our large model, StarCoder2-15B, significantly outperforms other models of comparable size.

Ranked #25 on Code Generation on MBPP

Code Completion Code Generation +1

Paper

Astraios: Parameter-Efficient Instruction Tuning Code Large Language Models

2 code implementations • 1 Jan 2024 • Terry Yue Zhuo, Armel Zebaze, Nitchakarn Suppattarachai, Leandro von Werra, Harm de Vries, Qian Liu, Niklas Muennighoff

Through investigations across 5 tasks and 8 different datasets encompassing both code comprehension and code generation tasks, we find that FFT generally leads to the best downstream performance across all scales, and PEFT methods differ significantly in their efficacy based on the model scale.

Code Generation

966

Paper
Code

The BigCode Project Governance Card

no code implementations • 6 Dec 2023 • BigCode collaboration, Sean Hughes, Harm de Vries, Jennifer Robinson, Carlos Muñoz Ferrandis, Loubna Ben allal, Leandro von Werra, Jennifer Ding, Sebastien Paquet, Yacine Jernite

This document serves as an overview of the different mechanisms and areas of governance in the BigCode project.

Paper

RepoFusion: Training Code Models to Understand Your Repository

no code implementations • 19 Jun 2023 • Disha Shrivastava, Denis Kocetkov, Harm de Vries, Dzmitry Bahdanau, Torsten Scholak

We find these results to be a novel and compelling demonstration of the gains that training with repository context can bring.

Code Completion

Paper

StarCoder: may the source be with you!

4 code implementations • 9 May 2023 • Raymond Li, Loubna Ben allal, Yangtian Zi, Niklas Muennighoff, Denis Kocetkov, Chenghao Mou, Marc Marone, Christopher Akiki, Jia Li, Jenny Chim, Qian Liu, Evgenii Zheltonozhskii, Terry Yue Zhuo, Thomas Wang, Olivier Dehaene, Mishig Davaadorj, Joel Lamy-Poirier, João Monteiro, Oleh Shliazhko, Nicolas Gontier, Nicholas Meade, Armel Zebaze, Ming-Ho Yee, Logesh Kumar Umapathi, Jian Zhu, Benjamin Lipkin, Muhtasham Oblokulov, Zhiruo Wang, Rudra Murthy, Jason Stillerman, Siva Sankalp Patel, Dmitry Abulkhanov, Marco Zocca, Manan Dey, Zhihan Zhang, Nour Fahmy, Urvashi Bhattacharyya, Wenhao Yu, Swayam Singh, Sasha Luccioni, Paulo Villegas, Maxim Kunakov, Fedor Zhdanov, Manuel Romero, Tony Lee, Nadav Timor, Jennifer Ding, Claire Schlesinger, Hailey Schoelkopf, Jan Ebert, Tri Dao, Mayank Mishra, Alex Gu, Jennifer Robinson, Carolyn Jane Anderson, Brendan Dolan-Gavitt, Danish Contractor, Siva Reddy, Daniel Fried, Dzmitry Bahdanau, Yacine Jernite, Carlos Muñoz Ferrandis, Sean Hughes, Thomas Wolf, Arjun Guha, Leandro von Werra, Harm de Vries

The BigCode community, an open-scientific collaboration working on the responsible development of Large Language Models for Code (Code LLMs), introduces StarCoder and StarCoderBase: 15.5B parameter models with 8K context length, infilling capabilities and fast large-batch inference enabled by multi-query attention.

Ranked #43 on Code Generation on MBPP

8k Code Generation

7,142

Paper
Code

The StatCan Dialogue Dataset: Retrieving Data Tables through Conversations with Genuine Intents

1 code implementation • 3 Apr 2023 • Xing Han Lu, Siva Reddy, Harm de Vries

We introduce the StatCan Dialogue Dataset consisting of 19,379 conversation turns between agents working at Statistics Canada and online users looking for published data tables.

Ranked #1 on Table Retrieval on Statcan Dialogue Dataset

Dialogue Generation Table Retrieval

Paper
Code

SantaCoder: don't reach for the stars!

5 code implementations • 9 Jan 2023 • Loubna Ben allal, Raymond Li, Denis Kocetkov, Chenghao Mou, Christopher Akiki, Carlos Munoz Ferrandis, Niklas Muennighoff, Mayank Mishra, Alex Gu, Manan Dey, Logesh Kumar Umapathi, Carolyn Jane Anderson, Yangtian Zi, Joel Lamy Poirier, Hailey Schoelkopf, Sergey Troshin, Dmitry Abulkhanov, Manuel Romero, Michael Lappert, Francesco De Toni, Bernardo García del Río, Qian Liu, Shamik Bose, Urvashi Bhattacharyya, Terry Yue Zhuo, Ian Yu, Paulo Villegas, Marco Zocca, Sourab Mangrulkar, David Lansky, Huu Nguyen, Danish Contractor, Luis Villa, Jia Li, Dzmitry Bahdanau, Yacine Jernite, Sean Hughes, Daniel Fried, Arjun Guha, Harm de Vries, Leandro von Werra

The BigCode project is an open-scientific collaboration working on the responsible development of large language models for code.

Code Generation

7,142

Paper
Code

The Stack: 3 TB of permissively licensed source code

no code implementations • 20 Nov 2022 • Denis Kocetkov, Raymond Li, Loubna Ben allal, Jia Li, Chenghao Mou, Carlos Muñoz Ferrandis, Yacine Jernite, Margaret Mitchell, Sean Hughes, Thomas Wolf, Dzmitry Bahdanau, Leandro von Werra, Harm de Vries

Large Language Models (LLMs) play an ever-increasing role in the field of Artificial Intelligence (AI)--not only for natural language processing but also for code understanding and generation.

Paper

The Power of Prompt Tuning for Low-Resource Semantic Parsing

no code implementations • ACL 2022 • Nathan Schucher, Siva Reddy, Harm de Vries

Prompt tuning has recently emerged as an effective method for adapting pre-trained language models to a number of language understanding and generation tasks.

Semantic Parsing

Paper

TopiOCQA: Open-domain Conversational Question Answering with Topic Switching

1 code implementation • 2 Oct 2021 • Vaibhav Adlakha, Shehzaad Dhuliawala, Kaheer Suleman, Harm de Vries, Siva Reddy

On average, a conversation in our dataset spans 13 question-answer turns and involves four topics (documents).

Conversational Question Answering Retrieval +1

Paper
Code

DuoRAT: Towards Simpler Text-to-SQL Models

1 code implementation • NAACL 2021 • Torsten Scholak, Raymond Li, Dzmitry Bahdanau, Harm de Vries, Chris Pal

Recent neural text-to-SQL models can effectively translate natural language questions to corresponding SQL queries on unseen databases.

Text-To-SQL

Paper
Code

Towards Ecologically Valid Research on Language User Interfaces

no code implementations • 28 Jul 2020 • Harm de Vries, Dzmitry Bahdanau, Christopher Manning

To this end, we describe what we deem an ideal methodology for machine learning research on LUIs and categorize five common ways in which recent benchmarks deviate from it.

BIG-bench Machine Learning valid

Paper

Generative Compositional Augmentations for Scene Graph Prediction

1 code implementation • ICCV 2021 • Boris Knyazev, Harm de Vries, Cătălina Cangea, Graham W. Taylor, Aaron Courville, Eugene Belilovsky

However, test images might contain zero- and few-shot compositions of objects and relationships, e.g. <cup, on, surfboard>.

Graph Generation Language Modelling +1

126

Paper
Code

Graph Density-Aware Losses for Novel Compositions in Scene Graph Generation

1 code implementation • 17 May 2020 • Boris Knyazev, Harm de Vries, Cătălina Cangea, Graham W. Taylor, Aaron Courville, Eugene Belilovsky

We show that such models can suffer the most in their ability to generalize to rare compositions, evaluating two different models on the Visual Genome dataset and its more recent, improved version, GQA.

Graph Generation Scene Graph Generation

126

Paper
Code

CLOSURE: Assessing Systematic Generalization of CLEVR Models

3 code implementations • 12 Dec 2019 • Dzmitry Bahdanau, Harm de Vries, Timothy J. O'Donnell, Shikhar Murty, Philippe Beaudoin, Yoshua Bengio, Aaron Courville

In this work, we study how systematic the generalization of such models is, that is, to which extent they are capable of handling novel combinations of known linguistic constructs.

Few-Shot Learning Systematic Generalization +1

Paper
Code

Systematic Generalization: What Is Required and Can It Be Learned?

2 code implementations • ICLR 2019 • Dzmitry Bahdanau, Shikhar Murty, Michael Noukhovitch, Thien Huu Nguyen, Harm de Vries, Aaron Courville

Numerous models for grounded language understanding have been recently proposed, including (i) generic models that can be easily adapted to any given task and (ii) intuitively appealing modular models that require background knowledge to be instantiated.

Systematic Generalization Visual Question Answering (VQA)

Paper
Code

Visual Reasoning with Multi-hop Feature Modulation

1 code implementation • ECCV 2018 • Florian Strub, Mathieu Seurin, Ethan Perez, Harm de Vries, Jérémie Mary, Philippe Preux, Aaron Courville, Olivier Pietquin

Recent breakthroughs in computer vision and natural language processing have spurred interest in challenging multi-modal tasks such as visual question-answering and visual dialogue.

Question Answering Visual Dialog +2

Paper
Code

Talk the Walk: Navigating New York City through Grounded Dialogue

1 code implementation • 9 Jul 2018 • Harm de Vries, Kurt Shuster, Dhruv Batra, Devi Parikh, Jason Weston, Douwe Kiela

We introduce "Talk The Walk", the first large-scale dialogue dataset grounded in action and perception.

Navigate

113

Paper
Code

FiLM: Visual Reasoning with a General Conditioning Layer

6 code implementations • 22 Sep 2017 • Ethan Perez, Florian Strub, Harm de Vries, Vincent Dumoulin, Aaron Courville

We introduce a general-purpose conditioning method for neural networks called FiLM: Feature-wise Linear Modulation.

Ranked #3 on Visual Question Answering (VQA) on CLEVR-Humans

Image Retrieval with Multi-Modal Query Visual Question Answering (VQA) +1

304

Paper
Code

Learning Visual Reasoning Without Strong Priors

2 code implementations • 10 Jul 2017 • Ethan Perez, Harm de Vries, Florian Strub, Vincent Dumoulin, Aaron Courville

Previous work has operated under the assumption that visual reasoning calls for a specialized architecture, but we show that a general architecture with proper conditioning can learn to visually reason effectively.

Visual Reasoning

304

Paper
Code

Modulating early visual processing by language

3 code implementations • NeurIPS 2017 • Harm de Vries, Florian Strub, Jérémie Mary, Hugo Larochelle, Olivier Pietquin, Aaron Courville

It is commonly assumed that language refers to high-level visual concepts while leaving low-level visual processing unaffected.

Question Answering Visual Question Answering

Paper
Code

End-to-end optimization of goal-driven and visually grounded dialogue systems

2 code implementations • 15 Mar 2017 • Florian Strub, Harm de Vries, Jeremie Mary, Bilal Piot, Aaron Courville, Olivier Pietquin

End-to-end design of dialogue systems has recently become a popular research topic thanks to powerful tools such as encoder-decoder architectures for sequence-to-sequence learning.

Decoder Dialogue Management +2

Paper
Code

GuessWhat?! Visual object discovery through multi-modal dialogue

4 code implementations • CVPR 2017 • Harm de Vries, Florian Strub, Sarath Chandar, Olivier Pietquin, Hugo Larochelle, Aaron Courville

Our key contribution is the collection of a large-scale dataset consisting of 150K human-played games with a total of 800K visual question-answer pairs on 66K images.

Object Object Discovery

Paper
Code

Theano: A Python framework for fast computation of mathematical expressions

1 code implementation • 9 May 2016 • The Theano Development Team, Rami Al-Rfou, Guillaume Alain, Amjad Almahairi, Christof Angermueller, Dzmitry Bahdanau, Nicolas Ballas, Frédéric Bastien, Justin Bayer, Anatoly Belikov, Alexander Belopolsky, Yoshua Bengio, Arnaud Bergeron, James Bergstra, Valentin Bisson, Josh Bleecher Snyder, Nicolas Bouchard, Nicolas Boulanger-Lewandowski, Xavier Bouthillier, Alexandre de Brébisson, Olivier Breuleux, Pierre-Luc Carrier, Kyunghyun Cho, Jan Chorowski, Paul Christiano, Tim Cooijmans, Marc-Alexandre Côté, Myriam Côté, Aaron Courville, Yann N. Dauphin, Olivier Delalleau, Julien Demouth, Guillaume Desjardins, Sander Dieleman, Laurent Dinh, Mélanie Ducoffe, Vincent Dumoulin, Samira Ebrahimi Kahou, Dumitru Erhan, Ziye Fan, Orhan Firat, Mathieu Germain, Xavier Glorot, Ian Goodfellow, Matt Graham, Caglar Gulcehre, Philippe Hamel, Iban Harlouchet, Jean-Philippe Heng, Balázs Hidasi, Sina Honari, Arjun Jain, Sébastien Jean, Kai Jia, Mikhail Korobov, Vivek Kulkarni, Alex Lamb, Pascal Lamblin, Eric Larsen, César Laurent, Sean Lee, Simon Lefrancois, Simon Lemieux, Nicholas Léonard, Zhouhan Lin, Jesse A. Livezey, Cory Lorenz, Jeremiah Lowin, Qianli Ma, Pierre-Antoine Manzagol, Olivier Mastropietro, Robert T. McGibbon, Roland Memisevic, Bart van Merriënboer, Vincent Michalski, Mehdi Mirza, Alberto Orlandi, Christopher Pal, Razvan Pascanu, Mohammad Pezeshki, Colin Raffel, Daniel Renshaw, Matthew Rocklin, Adriana Romero, Markus Roth, Peter Sadowski, John Salvatier, François Savard, Jan Schlüter, John Schulman, Gabriel Schwartz, Iulian Vlad Serban, Dmitriy Serdyuk, Samira Shabanian, Étienne Simon, Sigurd Spieckermann, S. Ramana Subramanyam, Jakub Sygnowski, Jérémie Tanguay, Gijs van Tulder, Joseph Turian, Sebastian Urban, Pascal Vincent, Francesco Visin, Harm de Vries, David Warde-Farley, Dustin J. Webb, Matthew Willson, Kelvin Xu, Lijun Xue, Li Yao, Saizheng Zhang, Ying Zhang

Since its introduction, it has been one of the most used CPU and GPU mathematical compilers - especially in the machine learning community - and has shown steady performance improvements.

BIG-bench Machine Learning Clustering +2

9,856

Paper
Code

Can deep learning help you find the perfect match?

no code implementations • 2 May 2015 • Harm de Vries, Jason Yosinski

The answer to this question depends on the personal preferences of the one asking it.

Paper

Equilibrated adaptive learning rates for non-convex optimization

2 code implementations • NeurIPS 2015 • Yann N. Dauphin, Harm de Vries, Yoshua Bengio

Parameter-specific adaptive learning rate methods are computationally efficient ways to reduce the ill-conditioning problems encountered when training large deep networks.

Paper
Code
