Search Results for author: Tomas Mikolov

Found 42 papers, 27 papers with code

Collapse of Self-trained Language Models

1 code implementation • 2 Apr 2024 • David Herel, Tomas Mikolov

In various fields of knowledge creation, including science, new ideas often build on pre-existing information.

Large Language Models: A Survey

no code implementations • 9 Feb 2024 • Shervin Minaee, Tomas Mikolov, Narjes Nikzad, Meysam Chenaghlu, Richard Socher, Xavier Amatriain, Jianfeng Gao

Large Language Models (LLMs) have drawn a lot of attention due to their strong performance on a wide range of natural language tasks, since the release of ChatGPT in November 2022.

Advancing State of the Art in Language Modeling

1 code implementation • 28 Nov 2023 • David Herel, Tomas Mikolov

In this paper, we propose a simple framework that should help advance the state of the art in language modeling in terms of generalization.

Ranked #7 on Language Modelling on WikiText-103

Language Modelling

Preserving Semantics in Textual Adversarial Attacks

1 code implementation • 8 Nov 2022 • David Herel, Hugo Cisneros, Tomas Mikolov

Our method outperforms existing sentence encoders used in adversarial attacks by achieving 1.2x - 5.1x better real attack success rate.

Adversarial Attack Sentence +2

Benchmarking Learning Efficiency in Deep Reservoir Computing

2 code implementations • 29 Sep 2022 • Hugo Cisneros, Josef Sivic, Tomas Mikolov

In this paper, we introduce a benchmark of increasingly difficult tasks together with a data efficiency metric to measure how quickly machine learning models learn from training data.

Benchmarking

Emergence of Novelty in Evolutionary Algorithms

no code implementations • 27 Jun 2022 • David Herel, Dominika Zogatova, Matej Kripner, Tomas Mikolov

This leads to an emergence of a novel behavior of the agents.

Atari Games Evolutionary Algorithms

Visualizing computation in large-scale cellular automata

no code implementations • 1 Apr 2021 • Hugo Cisneros, Josef Sivic, Tomas Mikolov

Emergent processes in complex systems such as cellular automata can perform computations of increasing complexity, and could possibly lead to artificial evolution.

Clustering

Emergence of Self-Reproducing Metabolisms as Recursive Algorithms in an Artificial Chemistry

no code implementations • 15 Mar 2021 • Germán Kruszewski, Tomas Mikolov

One of the main goals of Artificial Life is to research the conditions for the emergence of life, not necessarily as it is, but as it could be.

Artificial Life

Classification of Complex Systems Based on Transients

no code implementations • 31 Aug 2020 • Barbora Hudcova, Tomas Mikolov

In order to develop systems capable of modeling artificial life, we need to identify which systems can produce complex behavior.

Artificial Life Classification +1

Evaluating Online Continual Learning with CALM

1 code implementation • 7 Apr 2020 • Germán Kruszewski, Ionut-Teodor Sorodoc, Tomas Mikolov

Online Continual Learning (OCL) studies learning over a continuous data stream without observing any single example more than once, a setting that is closer to the experience of humans and systems that must learn "on-the-wild".

Continual Learning Language Modelling

Combinatory Chemistry: Towards a Simple Model of Emergent Evolution

1 code implementation • 17 Mar 2020 • Germán Kruszewski, Tomas Mikolov

An explanatory model for the emergence of evolvable units must display emerging structures that (1) preserve themselves in time (2) self-reproduce and (3) tolerate a certain amount of variation when reproducing.

Artificial Life

Evolving Structures in Complex Systems

1 code implementation • 4 Nov 2019 • Hugo Cisneros, Josef Sivic, Tomas Mikolov

In this paper we propose an approach for measuring growth of complexity of emerging patterns in complex systems such as cellular automata.

Artificial Life

Updating Pre-trained Word Vectors and Text Classifiers using Monolingual Alignment

no code implementations • 14 Oct 2019 • Piotr Bojanowski, Onur Celebi, Tomas Mikolov, Edouard Grave, Armand Joulin

In this paper, we focus on the problem of adapting word vector-based models to new textual data.

Text Classification

Place Deduplication with Embeddings

no code implementations • 29 Sep 2019 • Carl Yang, Do Huy Hoang, Tomas Mikolov, Jiawei Han

Thanks to advances in mobile location services, people can now post about places and share their visiting experiences on the go.

Loss in Translation: Learning Bilingual Word Mapping with a Retrieval Criterion

4 code implementations • EMNLP 2018 • Armand Joulin, Piotr Bojanowski, Tomas Mikolov, Herve Jegou, Edouard Grave

Continuous word representations learned separately on distinct languages can be aligned so that their words become comparable in a common space.

regression Retrieval +2

Learning Word Vectors for 157 Languages

2 code implementations • LREC 2018 • Edouard Grave, Piotr Bojanowski, Prakhar Gupta, Armand Joulin, Tomas Mikolov

Distributed word representations, or word vectors, have recently been applied to many tasks in natural language processing, leading to state-of-the-art performance.

Ranked #12 on Only Connect Walls Dataset Task 1 (Grouping) on OCW (using extra training data)

Only Connect Walls Dataset Task 1 (Grouping)

Advances in Pre-Training Distributed Word Representations

5 code implementations • LREC 2018 • Tomas Mikolov, Edouard Grave, Piotr Bojanowski, Christian Puhrsch, Armand Joulin

Many Natural Language Processing applications nowadays rely on pre-trained word representations estimated from large text corpora such as news collections, Wikipedia and Web Crawl.

Fast Linear Model for Knowledge Graph Embeddings

1 code implementation • 30 Oct 2017 • Armand Joulin, Edouard Grave, Piotr Bojanowski, Maximilian Nickel, Tomas Mikolov

This paper shows that a simple baseline based on a Bag-of-Words (BoW) representation learns surprisingly good knowledge graph embeddings.

General Classification Knowledge Base Completion +2

Learning Simpler Language Models with the Differential State Framework

no code implementations • 26 Mar 2017 • Alexander G. Ororbia II, Tomas Mikolov, David Reitter

The Differential State Framework (DSF) is a simple and high-performing design that unifies previously introduced gated neural models.

Language Modelling

CommAI: Evaluating the first steps towards a useful general AI

no code implementations • 31 Jan 2017 • Marco Baroni, Armand Joulin, Allan Jabri, Germán Kruszewski, Angeliki Lazaridou, Klemen Simonic, Tomas Mikolov

With machine learning successfully applied to new daunting problems almost every day, general AI starts looking like an attainable goal.

BIG-bench Machine Learning Continual Learning +2

FastText.zip: Compressing text classification models

43 code implementations • 12 Dec 2016 • Armand Joulin, Edouard Grave, Piotr Bojanowski, Matthijs Douze, Hervé Jégou, Tomas Mikolov

We consider the problem of producing compact architectures for text classification, such that the full model fits in a limited amount of memory.

General Classification Quantization +2

Variable Computation in Recurrent Neural Networks

no code implementations • 18 Nov 2016 • Yacine Jernite, Edouard Grave, Armand Joulin, Tomas Mikolov

Recurrent neural networks (RNNs) have been used extensively and with increasing success to model various types of sequential data.

Enriching Word Vectors with Subword Information

53 code implementations • TACL 2017 • Piotr Bojanowski, Edouard Grave, Armand Joulin, Tomas Mikolov

A vector representation is associated to each character $n$-gram; words being represented as the sum of these representations.

Word Embeddings Word Similarity

Bag of Tricks for Efficient Text Classification

64 code implementations • EACL 2017 • Armand Joulin, Edouard Grave, Piotr Bojanowski, Tomas Mikolov

This paper explores a simple and efficient baseline for text classification.

Ranked #1 on Sentiment Analysis on Sogou News

Emotion Recognition in Conversation General Classification +2

A Roadmap towards Machine Intelligence

1 code implementation • 25 Nov 2015 • Tomas Mikolov, Armand Joulin, Marco Baroni

The development of intelligent machines is one of the biggest unsolved challenges in computer science.

Learning Simple Algorithms from Examples

1 code implementation • 23 Nov 2015 • Wojciech Zaremba, Tomas Mikolov, Armand Joulin, Rob Fergus

We present an approach for learning simple algorithms such as copying, multi-digit addition and single digit multiplication directly from examples.

Q-Learning

Alternative structures for character-level RNNs

1 code implementation • 19 Nov 2015 • Piotr Bojanowski, Armand Joulin, Tomas Mikolov

The first one consists of conditioning the character-level representation on the previous word representation.

Language Modelling

Inferring Algorithmic Patterns with Stack-Augmented Recurrent Nets

4 code implementations • NeurIPS 2015 • Armand Joulin, Tomas Mikolov

Despite the recent achievements in machine learning, we are still very far from achieving real artificial intelligence.

Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks

20 code implementations • 19 Feb 2015 • Jason Weston, Antoine Bordes, Sumit Chopra, Alexander M. Rush, Bart van Merriënboer, Armand Joulin, Tomas Mikolov

One long-term goal of machine learning research is to produce methods that are applicable to reasoning and natural language, in particular building an intelligent dialogue agent.

Question Answering Reading Comprehension

Learning Longer Memory in Recurrent Neural Networks

5 code implementations • 24 Dec 2014 • Tomas Mikolov, Armand Joulin, Sumit Chopra, Michael Mathieu, Marc'Aurelio Ranzato

In this paper, we show that learning longer term patterns in real data, such as in natural language, is perfectly possible using gradient descent.

Language Modelling

Ensemble of Generative and Discriminative Techniques for Sentiment Analysis of Movie Reviews

4 code implementations • 17 Dec 2014 • Grégoire Mesnil, Tomas Mikolov, Marc'Aurelio Ranzato, Yoshua Bengio

Sentiment analysis is a common task in natural language processing that aims to detect polarity of a text document (typically a consumer review).

Binary Classification General Classification +1

Using Neural Networks for Modeling and Representing Natural Languages

no code implementations • COLING 2014 • Tomas Mikolov

Machine Translation Speech Recognition

Distributed Representations of Sentences and Documents

27 code implementations • 16 May 2014 • Quoc V. Le, Tomas Mikolov

Its construction gives our algorithm the potential to overcome the weaknesses of bag-of-words models.

Ranked #4 on Question Answering on QASent

Question Answering Sentiment Analysis +1

Zero-Shot Learning by Convex Combination of Semantic Embeddings

2 code implementations • 19 Dec 2013 • Mohammad Norouzi, Tomas Mikolov, Samy Bengio, Yoram Singer, Jonathon Shlens, Andrea Frome, Greg S. Corrado, Jeffrey Dean

In other cases the semantic embedding space is established by an independent natural language processing task, and then the image transformation into that space is learned in a second stage.

Ranked #8 on Multi-label zero-shot learning on Open Images V4

Multi-label zero-shot learning

One Billion Word Benchmark for Measuring Progress in Statistical Language Modeling

3 code implementations • 11 Dec 2013 • Ciprian Chelba, Tomas Mikolov, Mike Schuster, Qi Ge, Thorsten Brants, Philipp Koehn, Tony Robinson

We propose a new benchmark corpus to be used for measuring progress in statistical language modeling.

Ranked #22 on Language Modelling on One Billion Word

Language Modelling

DeViSE: A Deep Visual-Semantic Embedding Model

no code implementations • NeurIPS 2013 • Andrea Frome, Greg S. Corrado, Jon Shlens, Samy Bengio, Jeff Dean, Marc'Aurelio Ranzato, Tomas Mikolov

Modern visual recognition systems are often limited in their ability to scale to large numbers of object categories.

Ranked #13 on Zero-Shot Action Recognition on Kinetics

Object Object Recognition +1

Distributed Representations of Words and Phrases and their Compositionality

51 code implementations • NeurIPS 2013 • Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, Jeffrey Dean

Motivated by this example, we present a simple method for finding phrases in text, and show that learning good vector representations for millions of phrases is possible.

Exploiting Similarities among Languages for Machine Translation

8 code implementations • 17 Sep 2013 • Tomas Mikolov, Quoc V. Le, Ilya Sutskever

Dictionaries and phrase tables are the basis of modern statistical machine translation systems.

Machine Translation Translation

Linguistic Regularities in Continuous Space Word Representations

no code implementations • NAACL 2013 • Tomas Mikolov, Wen-tau Yih, Geoffrey Zweig

Language Modelling

Combining Heterogeneous Models for Measuring Relational Similarity

no code implementations • NAACL 2013 • Alisa Zhila, Wen-tau Yih, Christopher Meek, Geoffrey Zweig, Tomas Mikolov

Question Answering

Efficient Estimation of Word Representations in Vector Space

77 code implementations • 16 Jan 2013 • Tomas Mikolov, Kai Chen, Greg Corrado, Jeffrey Dean

We propose two novel model architectures for computing continuous vector representations of words from very large data sets.

Word Similarity

On the difficulty of training Recurrent Neural Networks

no code implementations • 21 Nov 2012 • Razvan Pascanu, Tomas Mikolov, Yoshua Bengio

There are two widely known issues with properly training Recurrent Neural Networks, the vanishing and the exploding gradient problems detailed in Bengio et al. (1994).
