Search Results for author: Paul M. B. Vitanyi

Found 8 papers, 2 papers with code

Identification of Probabilities

no code implementations • 4 Aug 2017 • Paul M. B. Vitanyi, Nick Chater

But there is a more fundamental question: is the problem of inferring a probabilistic model from a sample possible even in principle?

Web Similarity in Sets of Search Terms using Database Queries

no code implementations • 20 Feb 2015 • Andrew R. Cohen, Paul M. B. Vitanyi

Normalized web distance (NWD) is a similarity or normalized semantic distance based on the World Wide Web or another large electronic database, for instance Wikipedia, and a search engine that returns reliable aggregate page counts.
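
For a single pair of search terms, a distance of this kind is usually computed from aggregate page counts in the form given for the Google similarity distance (listed further down this page): (max(log f(x), log f(y)) - log f(x, y)) / (log N - min(log f(x), log f(y))). The sketch below is only an illustration of that pairwise formula, not of the set-based method in this paper. The function name, argument names, and the page counts used in the usage note are hypothetical.

```python
import math


def pairwise_web_distance(fx: float, fy: float, fxy: float, n: float) -> float:
    """Pairwise page-count distance (illustrative sketch).

    fx, fy -- page counts for each term alone (assumed inputs)
    fxy    -- page count for pages containing both terms
    n      -- total number of indexed pages (or a normalizing constant)
    """
    # If either term, or the pair, never occurs, treat the distance as infinite.
    if fx <= 0 or fy <= 0 or fxy <= 0:
        return math.inf
    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    numerator = max(log_fx, log_fy) - log_fxy
    denominator = math.log(n) - min(log_fx, log_fy)
    return numerator / denominator
```

With made-up counts, two terms that always occur together (`fx == fy == fxy`) get distance 0, and the distance grows as the joint count shrinks relative to the individual counts.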

A Fast Quartet Tree Heuristic for Hierarchical Clustering

no code implementations • 12 Sep 2014 • Rudi L. Cilibrasi, Paul M. B. Vitanyi

We also present a greatly improved heuristic, reducing the running time by a factor of order a thousand to ten thousand.

Clustering

Algorithmic Identification of Probabilities

no code implementations • 28 Nov 2013 • Paul M. B. Vitanyi, Nick Chater

The problem is to identify a probability associated with a set of natural numbers, given an infinite data sequence of elements from the set.

Normalized Compression Distance of Multisets with Applications

no code implementations • 22 Dec 2012 • Andrew R. Cohen, Paul M. B. Vitanyi

We also applied the new NCD to handwritten digit recognition and improved classification accuracy significantly over that of pairwise NCD by incorporating both the pairwise NCD and the NCD for multisets.

Classification, General Classification, +1 more

Identification of Probabilities of Languages

no code implementations • 24 Aug 2012 • Paul M. B. Vitanyi, Nick Chater

There is an effective procedure to identify by infinite recurrence a nonempty subset of the computable measures according to which the data is typical.

Normalized Information Distance

1 code implementation • 15 Sep 2008 • Paul M. B. Vitanyi, Frank J. Balbach, Rudi L. Cilibrasi, Ming Li

These practical realizations of the normalized information distance can then be applied to machine learning tasks, especially clustering, to perform feature-free and parameter-free data mining.

Clustering, Machine Translation, +1 more
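
One such practical realization is the normalized compression distance (NCD), which replaces the uncomputable Kolmogorov complexity with the output length of a real compressor: NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)). The sketch below assumes zlib as the compressor, which is an arbitrary choice made for illustration and is not stated in this listing. The function names are hypothetical.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (stand-in for C(x))."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings."""
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)  # C(xy): compress the concatenation
    return (cxy - min(cx, cy)) / max(cx, cy)
```

Values near 0 suggest the two inputs share most of their structure, and values near 1 suggest they share little. Because no features or tuning parameters are involved, the resulting pairwise distances can be passed directly to a clustering method.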

The Google Similarity Distance

1 code implementation • 21 Dec 2004 • Rudi Cilibrasi, Paul M. B. Vitanyi

We conduct a massive randomized trial in binary classification using support vector machines to learn categories based on our Google distance, resulting in a mean agreement of 87% with the expert-crafted WordNet categories.

Binary Classification, Clustering, +1 more
