no code implementations • 20 Sep 2024 • David Herel, Vojtech Bartek, Tomas Mikolov
Who is the US President?
no code implementations • 14 May 2024 • David Herel, Tomas Mikolov
How much is 56 times 37?
1 code implementation • 2 Apr 2024 • David Herel, Tomas Mikolov
In various fields of knowledge creation, including science, new ideas often build on pre-existing information.
no code implementations • 9 Feb 2024 • Shervin Minaee, Tomas Mikolov, Narjes Nikzad, Meysam Chenaghlu, Richard Socher, Xavier Amatriain, Jianfeng Gao
Large Language Models (LLMs) have drawn a lot of attention due to their strong performance on a wide range of natural language tasks, since the release of ChatGPT in November 2022.
1 code implementation • 28 Nov 2023 • David Herel, Tomas Mikolov
In this paper, we propose a simple framework that should help advance the state of the art in language modeling in terms of generalization.
Ranked #7 on Language Modelling on WikiText-103
1 code implementation • 8 Nov 2022 • David Herel, Hugo Cisneros, Tomas Mikolov
Our method outperforms existing sentence encoders used in adversarial attacks by achieving 1.2x - 5.1x better real attack success rate.
2 code implementations • 29 Sep 2022 • Hugo Cisneros, Josef Sivic, Tomas Mikolov
In this paper, we introduce a benchmark of increasingly difficult tasks together with a data efficiency metric to measure how quickly machine learning models learn from training data.
no code implementations • 27 Jun 2022 • David Herel, Dominika Zogatova, Matej Kripner, Tomas Mikolov
This leads to the emergence of novel behavior in the agents.
no code implementations • 1 Apr 2021 • Hugo Cisneros, Josef Sivic, Tomas Mikolov
Emergent processes in complex systems such as cellular automata can perform computations of increasing complexity, and could possibly lead to artificial evolution.
no code implementations • 15 Mar 2021 • Germán Kruszewski, Tomas Mikolov
One of the main goals of Artificial Life is to research the conditions for the emergence of life, not necessarily as it is, but as it could be.
no code implementations • 31 Aug 2020 • Barbora Hudcova, Tomas Mikolov
In order to develop systems capable of modeling artificial life, we need to identify which systems can produce complex behavior.
1 code implementation • 7 Apr 2020 • Germán Kruszewski, Ionut-Teodor Sorodoc, Tomas Mikolov
Online Continual Learning (OCL) studies learning over a continuous data stream without observing any single example more than once, a setting that is closer to the experience of humans and systems that must learn "on-the-wild".
1 code implementation • 17 Mar 2020 • Germán Kruszewski, Tomas Mikolov
An explanatory model for the emergence of evolvable units must display emerging structures that (1) preserve themselves in time (2) self-reproduce and (3) tolerate a certain amount of variation when reproducing.
1 code implementation • 4 Nov 2019 • Hugo Cisneros, Josef Sivic, Tomas Mikolov
In this paper we propose an approach for measuring growth of complexity of emerging patterns in complex systems such as cellular automata.
no code implementations • 14 Oct 2019 • Piotr Bojanowski, Onur Celebi, Tomas Mikolov, Edouard Grave, Armand Joulin
In this paper, we focus on the problem of adapting word vector-based models to new textual data.
no code implementations • 29 Sep 2019 • Carl Yang, Do Huy Hoang, Tomas Mikolov, Jiawei Han
Thanks to advances in mobile location services, people can now post about places to share their visiting experiences on the go.
4 code implementations • EMNLP 2018 • Armand Joulin, Piotr Bojanowski, Tomas Mikolov, Hervé Jégou, Edouard Grave
Continuous word representations learned separately on distinct languages can be aligned so that their words become comparable in a common space.
2 code implementations • LREC 2018 • Edouard Grave, Piotr Bojanowski, Prakhar Gupta, Armand Joulin, Tomas Mikolov
Distributed word representations, or word vectors, have recently been applied to many tasks in natural language processing, leading to state-of-the-art performance.
Ranked #12 on Only Connect Walls Dataset Task 1 (Grouping) on OCW (using extra training data)
5 code implementations • LREC 2018 • Tomas Mikolov, Edouard Grave, Piotr Bojanowski, Christian Puhrsch, Armand Joulin
Many Natural Language Processing applications nowadays rely on pre-trained word representations estimated from large text corpora such as news collections, Wikipedia and Web Crawl.
1 code implementation • 30 Oct 2017 • Armand Joulin, Edouard Grave, Piotr Bojanowski, Maximilian Nickel, Tomas Mikolov
This paper shows that a simple baseline based on a Bag-of-Words (BoW) representation learns surprisingly good knowledge graph embeddings.
no code implementations • 26 Mar 2017 • Alexander G. Ororbia II, Tomas Mikolov, David Reitter
The Differential State Framework (DSF) is a simple and high-performing design that unifies previously introduced gated neural models.
no code implementations • 31 Jan 2017 • Marco Baroni, Armand Joulin, Allan Jabri, Germán Kruszewski, Angeliki Lazaridou, Klemen Simonic, Tomas Mikolov
With machine learning successfully applied to new daunting problems almost every day, general AI starts looking like an attainable goal.
43 code implementations • 12 Dec 2016 • Armand Joulin, Edouard Grave, Piotr Bojanowski, Matthijs Douze, Hervé Jégou, Tomas Mikolov
We consider the problem of producing compact architectures for text classification, such that the full model fits in a limited amount of memory.
no code implementations • 18 Nov 2016 • Yacine Jernite, Edouard Grave, Armand Joulin, Tomas Mikolov
Recurrent neural networks (RNNs) have been used extensively and with increasing success to model various types of sequential data.
53 code implementations • TACL 2017 • Piotr Bojanowski, Edouard Grave, Armand Joulin, Tomas Mikolov
A vector representation is associated with each character $n$-gram, and each word is represented as the sum of these representations.
Ranked #3 on Word Similarity on WS353
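The subword idea in the entry above (a word vector formed as the sum of its character $n$-gram vectors) can be illustrated with a minimal sketch. The snippet only states the summation; everything else here is an assumption made for illustration: the `<` and `>` boundary markers, the 3 to 6 n-gram range, the names `char_ngrams` and `SubwordEmbeddings`, and the randomly initialised (untrained) vectors.

```python
import random


def char_ngrams(word, n_min=3, n_max=6):
    """Return the character n-grams of a word wrapped in boundary markers.

    The '<' and '>' markers and the 3..6 range are illustrative assumptions.
    """
    marked = f"<{word}>"
    grams = []
    for n in range(n_min, n_max + 1):
        # Slide a window of width n over the marked word.
        for i in range(len(marked) - n + 1):
            grams.append(marked[i:i + n])
    return grams


class SubwordEmbeddings:
    """Hypothetical lookup table holding one vector per character n-gram."""

    def __init__(self, dim=8, seed=0):
        self.dim = dim
        self._rng = random.Random(seed)
        self._table = {}

    def ngram_vector(self, gram):
        # Vectors are created lazily with random values; a trained model
        # would learn them from data instead.
        if gram not in self._table:
            self._table[gram] = [self._rng.uniform(-1.0, 1.0) for _ in range(self.dim)]
        return self._table[gram]

    def word_vector(self, word):
        # The word representation is the sum of its n-gram vectors.
        total = [0.0] * self.dim
        for gram in char_ngrams(word):
            vec = self.ngram_vector(gram)
            total = [t + v for t, v in zip(total, vec)]
        return total


emb = SubwordEmbeddings(dim=4)
print(char_ngrams("where", 3, 3))
print(emb.word_vector("where"))
```

Because the vector is built from n-grams rather than looked up per word, a word never seen before still receives a representation as long as it can be split into n-grams.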
64 code implementations • EACL 2017 • Armand Joulin, Edouard Grave, Piotr Bojanowski, Tomas Mikolov
This paper explores a simple and efficient baseline for text classification.
Ranked #1 on Sentiment Analysis on Sogou News
1 code implementation • 25 Nov 2015 • Tomas Mikolov, Armand Joulin, Marco Baroni
The development of intelligent machines is one of the biggest unsolved challenges in computer science.
1 code implementation • 23 Nov 2015 • Wojciech Zaremba, Tomas Mikolov, Armand Joulin, Rob Fergus
We present an approach for learning simple algorithms such as copying, multi-digit addition and single digit multiplication directly from examples.
1 code implementation • 19 Nov 2015 • Piotr Bojanowski, Armand Joulin, Tomas Mikolov
The first one consists in conditioning the character level representation on the previous word representation.
4 code implementations • NeurIPS 2015 • Armand Joulin, Tomas Mikolov
Despite the recent achievements in machine learning, we are still very far from achieving real artificial intelligence.
20 code implementations • 19 Feb 2015 • Jason Weston, Antoine Bordes, Sumit Chopra, Alexander M. Rush, Bart van Merriënboer, Armand Joulin, Tomas Mikolov
One long-term goal of machine learning research is to produce methods that are applicable to reasoning and natural language, in particular building an intelligent dialogue agent.
5 code implementations • 24 Dec 2014 • Tomas Mikolov, Armand Joulin, Sumit Chopra, Michael Mathieu, Marc'Aurelio Ranzato
In this paper, we show that learning longer term patterns in real data, such as in natural language, is perfectly possible using gradient descent.
4 code implementations • 17 Dec 2014 • Grégoire Mesnil, Tomas Mikolov, Marc'Aurelio Ranzato, Yoshua Bengio
Sentiment analysis is a common task in natural language processing that aims to detect polarity of a text document (typically a consumer review).
27 code implementations • 16 May 2014 • Quoc V. Le, Tomas Mikolov
Its construction gives our algorithm the potential to overcome the weaknesses of bag-of-words models.
Ranked #4 on Question Answering on QASent
2 code implementations • 19 Dec 2013 • Mohammad Norouzi, Tomas Mikolov, Samy Bengio, Yoram Singer, Jonathon Shlens, Andrea Frome, Greg S. Corrado, Jeffrey Dean
In other cases the semantic embedding space is established by an independent natural language processing task, and then the image transformation into that space is learned in a second stage.
Ranked #8 on Multi-label zero-shot learning on Open Images V4
3 code implementations • 11 Dec 2013 • Ciprian Chelba, Tomas Mikolov, Mike Schuster, Qi Ge, Thorsten Brants, Philipp Koehn, Tony Robinson
We propose a new benchmark corpus to be used for measuring progress in statistical language modeling.
Ranked #24 on Language Modelling on One Billion Word
no code implementations • NeurIPS 2013 • Andrea Frome, Greg S. Corrado, Jon Shlens, Samy Bengio, Jeff Dean, Marc'Aurelio Ranzato, Tomas Mikolov
Modern visual recognition systems are often limited in their ability to scale to large numbers of object categories.
Ranked #15 on Zero-Shot Action Recognition on Kinetics
51 code implementations • NeurIPS 2013 • Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, Jeffrey Dean
Motivated by this example, we present a simple method for finding phrases in text, and show that learning good vector representations for millions of phrases is possible.
8 code implementations • 17 Sep 2013 • Tomas Mikolov, Quoc V. Le, Ilya Sutskever
Dictionaries and phrase tables are the basis of modern statistical machine translation systems.
81 code implementations • 16 Jan 2013 • Tomas Mikolov, Kai Chen, Greg Corrado, Jeffrey Dean
We propose two novel model architectures for computing continuous vector representations of words from very large data sets.
no code implementations • 21 Nov 2012 • Razvan Pascanu, Tomas Mikolov, Yoshua Bengio
There are two widely known issues with properly training Recurrent Neural Networks: the vanishing and the exploding gradient problems detailed in Bengio et al. (1994).