2 code implementations • 16 Oct 2021 • Haoyin Xu, Jayanta Dey, Sambit Panda, Joshua T. Vogelstein
Nonetheless, we found that those state-of-the-art algorithms suffer from a number of drawbacks, including performing very poorly on some problems and requiring a huge amount of memory on others.
2 code implementations • 31 Aug 2021 • Haoyin Xu, Kaleab A. Kinfu, Will LeVine, Sambit Panda, Jayanta Dey, Michael Ainsworth, Yu-Chung Peng, Madi Kusmanov, Florian Engert, Christopher M. White, Joshua T. Vogelstein, Carey E. Priebe
Empirically, we compare these two strategies on hundreds of tabular data settings, as well as several vision and auditory settings.
1 code implementation • 27 Dec 2019 • Cencheng Shen, Sambit Panda, Joshua T. Vogelstein
One major bottleneck is the testing process: because the null distribution of distance correlation depends on the underlying random variables and the choice of metric, a permutation test is typically required to estimate the null and compute the p-value, which is very costly for large amounts of data.
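To make the cost concrete, here is a minimal sketch of a permutation test for distance correlation. It assumes the standard biased sample statistic and Euclidean distances; the function names, `n_perm`, and `seed` are illustrative choices, not taken from the paper. Each permutation touches an n-by-n distance matrix, which is why the procedure becomes expensive as the sample size grows.

```python
import numpy as np


def centered_distances(x):
    """Double-centered Euclidean distance matrix of the rows of x."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()


def dcorr_from_centered(a, b):
    """Biased sample distance correlation from two centered distance matrices."""
    dcov2 = (a * b).mean()
    denom = np.sqrt((a * a).mean() * (b * b).mean())
    return 0.0 if denom == 0 else float(np.sqrt(max(dcov2, 0.0) / denom))


def dcorr_permutation_test(x, y, n_perm=200, seed=0):
    """Return (statistic, p-value); n_perm and seed are illustrative defaults."""
    rng = np.random.default_rng(seed)
    a, b = centered_distances(x), centered_distances(y)
    observed = dcorr_from_centered(a, b)
    null = np.empty(n_perm)
    for i in range(n_perm):
        # Permuting the samples of y is the same as permuting rows and columns of b.
        perm = rng.permutation(b.shape[0])
        null[i] = dcorr_from_centered(a, b[np.ix_(perm, perm)])
    # The +1 terms count the observed statistic as one draw from the null.
    pvalue = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, float(pvalue)


# Usage on synthetic data with a nonlinear dependence.
rng = np.random.default_rng(1)
x_demo = rng.normal(size=100)
y_demo = x_demo ** 2 + 0.1 * rng.normal(size=100)
stat_demo, p_demo = dcorr_permutation_test(x_demo, y_demo)
```

With `n_perm` permutations the total work is on the order of `n_perm * n**2`, and the smallest attainable p-value is `1 / (n_perm + 1)`.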
no code implementations • 20 Oct 2019 • Sambit Panda, Cencheng Shen, Ronan Perry, Jelle Zorn, Antoine Lutz, Carey E. Priebe, Joshua T. Vogelstein
The $k$-sample testing problem tests whether or not $k$ groups of data points are sampled from the same distribution.
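As an illustration of the problem statement, the sketch below runs a simple $k$-sample permutation test. The statistic (a sum of pairwise energy distances between groups) is a stand-in chosen for brevity and is an assumption of this example, not the method proposed in the paper; all function and parameter names are hypothetical.

```python
from itertools import combinations

import numpy as np


def energy_statistic(dist, labels):
    """Sum of pairwise energy distances between groups, from a pooled distance matrix."""
    groups = [np.flatnonzero(labels == g) for g in np.unique(labels)]
    total = 0.0
    for i, j in combinations(groups, 2):
        between = dist[np.ix_(i, j)].mean()
        within_i = dist[np.ix_(i, i)].mean()
        within_j = dist[np.ix_(j, j)].mean()
        total += 2.0 * between - within_i - within_j
    return total


def k_sample_permutation_test(samples, n_perm=200, seed=0):
    """Test whether all groups in `samples` share one distribution.

    Returns (statistic, p-value). Under the null, group labels are
    exchangeable, so shuffling them gives draws from the null distribution.
    """
    rng = np.random.default_rng(seed)
    pooled = np.vstack([np.asarray(s, dtype=float).reshape(len(s), -1) for s in samples])
    labels = np.repeat(np.arange(len(samples)), [len(s) for s in samples])
    dist = np.linalg.norm(pooled[:, None, :] - pooled[None, :, :], axis=-1)
    observed = energy_statistic(dist, labels)
    null = np.array(
        [energy_statistic(dist, rng.permutation(labels)) for _ in range(n_perm)]
    )
    pvalue = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return float(observed), float(pvalue)


# Usage: three groups, the third shifted away from the other two.
rng = np.random.default_rng(3)
groups_demo = [
    rng.normal(0.0, 1.0, size=(30, 2)),
    rng.normal(0.0, 1.0, size=(30, 2)),
    rng.normal(3.0, 1.0, size=(30, 2)),
]
k_stat, k_pvalue = k_sample_permutation_test(groups_demo)
```

A small p-value here is evidence against the null hypothesis that all $k$ groups come from the same distribution.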
4 code implementations • 3 Jul 2019 • Sambit Panda, Satish Palaniappan, Junhao Xiong, Eric W. Bridgeford, Ronak Mehta, Cencheng Shen, Joshua T. Vogelstein
We introduce hyppo, a unified library for performing multivariate hypothesis testing, including independence, two-sample, and k-sample testing.
no code implementations • 30 Nov 2018 • Cencheng Shen, Sambit Panda, Joshua T. Vogelstein
It has been demonstrated that these proximity matrices can be thought of as kernels, connecting the decision forest literature to the extensive kernel machine literature.
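The kernel view can be checked numerically on a small example. The sketch below assumes the common definition of forest proximity, the fraction of trees in which two samples fall in the same leaf; the excerpt itself does not spell out a definition. The `leaves` array holds made-up leaf assignments standing in for a trained forest's output (for instance, scikit-learn forests expose leaf indices through their `apply` method).

```python
import numpy as np


def proximity_matrix(leaves):
    """Proximity matrix from leaf indices.

    leaves: integer array of shape (n_samples, n_trees), where leaves[i, t]
    is the leaf that sample i reaches in tree t.
    Entry (i, j) of the result is the fraction of trees in which samples
    i and j share a leaf.
    """
    leaves = np.asarray(leaves)
    same_leaf = leaves[:, None, :] == leaves[None, :, :]  # shape (n, n, n_trees)
    return same_leaf.mean(axis=-1)


# Hypothetical leaf assignments: 20 samples, 10 trees, 4 leaves per tree.
rng = np.random.default_rng(0)
leaves = rng.integers(0, 4, size=(20, 10))

K = proximity_matrix(leaves)

# A valid kernel matrix is symmetric and positive semidefinite.
eigvals = np.linalg.eigvalsh(K)
is_symmetric = np.allclose(K, K.T)
is_psd = eigvals.min() > -1e-8
```

Positive semidefiniteness follows because each tree contributes a sum of outer products of leaf-indicator vectors, and an average of such matrices stays positive semidefinite.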