no code implementations • 4 Aug 2017 • Paul M. B. Vitanyi, Nick Chater
But there is a more fundamental question: is the problem of inferring a probabilistic model from a sample possible even in principle?
no code implementations • 20 Feb 2015 • Andrew R. Cohen, Paul M. B. Vitanyi
Normalized web distance (NWD) is a similarity or normalized semantic distance based on the World Wide Web or another large electronic database, for instance Wikipedia, and a search engine that returns reliable aggregate page counts.
no code implementations • 12 Sep 2014 • Rudi L. Cilibrasi, Paul M. B. Vitanyi
We also present a greatly improved heuristic, reducing the running time by a factor of order a thousand to ten thousand.
no code implementations • 28 Nov 2013 • Paul M. B. Vitanyi, Nick Chater
TThe problem is to identify a probability associated with a set of natural numbers, given an infinite data sequence of elements from the set.
no code implementations • 22 Dec 2012 • Andrew R. Cohen, Paul M. B. Vitanyi
We also applied the new NCD to handwritten digit recognition and improved classification accuracy significantly over that of pairwise NCD by incorporating both the pairwise and NCD for multisets.
no code implementations • 24 Aug 2012 • Paul M. B. Vitanyi, Nick Chater
There is an effective procedure to identify by infinite recurrence a nonempty subset of the computable measures according to which the data is typical.
1 code implementation • 15 Sep 2008 • Paul M. B. Vitanyi, Frank J. Balbach, Rudi L. Cilibrasi, Ming Li
These practical realizations of the normalized information distance can then be applied to machine learning tasks, expecially clustering, to perform feature-free and parameter-free data mining.
1 code implementation • 21 Dec 2004 • Rudi Cilibrasi, Paul M. B. Vitanyi
We conduct a massive randomized trial in binary classification using support vector machines to learn categories based on our Google distance, resulting in an a mean agreement of 87% with the expert crafted WordNet categories.