no code implementations • 4 Mar 2024 • Tung Le, Khai Nguyen, Shanlin Sun, Nhat Ho, Xiaohui Xie
In the realm of computer vision and graphics, accurately establishing correspondences between geometric 3D shapes is pivotal for applications like object tracking, registration, texture transfer, and statistical shape analysis.
no code implementations • 7 Feb 2024 • Huy Nguyen, Khai Nguyen, Nhat Ho
We consider the parameter estimation problem in the deviated Gaussian mixture of experts, in which the data are generated from $(1 - \lambda^{\ast}) g_0(Y|X) + \lambda^{\ast} \sum_{i = 1}^{k_{\ast}} p_{i}^{\ast} f(Y|(a_{i}^{\ast})^{\top}X + b_i^{\ast}, \sigma_{i}^{\ast})$, where $X, Y$ are respectively a covariate vector and a response variable, $g_{0}(Y|X)$ is a known function, $\lambda^{\ast} \in [0, 1]$ is the true but unknown mixing proportion, and $(p_{i}^{\ast}, a_{i}^{\ast}, b_{i}^{\ast}, \sigma_{i}^{\ast})$ for $1 \leq i \leq k_{\ast}$ are the unknown parameters of the Gaussian mixture of experts.
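As a concrete reading of this density, here is a minimal NumPy/SciPy sketch that evaluates the deviated mixture at a point; all function and variable names are illustrative, not from the paper:

```python
import numpy as np
from scipy.stats import norm

def deviated_moe_density(y, x, g0, lam, p, a, b, sigma):
    """(1 - lam) * g0(y|x) + lam * sum_i p_i * N(y | a_i^T x + b_i, sigma_i^2)."""
    mixture = sum(
        p_i * norm.pdf(y, loc=a_i @ x + b_i, scale=s_i)
        for p_i, a_i, b_i, s_i in zip(p, a, b, sigma)
    )
    return (1.0 - lam) * g0(y, x) + lam * mixture

# toy usage: take the known g0 to be a standard normal, ignoring x
g0 = lambda y, x: norm.pdf(y)
x = np.array([0.5, -1.0])
val = deviated_moe_density(
    y=0.3, x=x, g0=g0, lam=0.4,
    p=[0.6, 0.4],
    a=[np.array([1.0, 0.0]), np.array([0.0, 1.0])],
    b=[0.0, 0.5], sigma=[1.0, 2.0],
)
```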
no code implementations • 5 Feb 2024 • Huy Nguyen, Nhat Ho, Alessandro Rinaldo
The mixture of experts (MoE) model is a statistical machine learning design that aggregates multiple expert networks using a softmax gating function to form a more intricate and expressive model.
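For readers new to MoE, a toy sketch of softmax-gated aggregation with linear experts (a generic illustration, not the specific model studied in the paper):

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def moe_forward(x, gate_W, expert_Ws):
    """Softmax-gated aggregation: output = sum_k softmax(gate_W x)_k * (W_k x)."""
    gates = softmax(gate_W @ x)                      # (k,) mixing weights
    outputs = np.stack([W @ x for W in expert_Ws])   # (k, d_out) expert outputs
    return gates @ outputs                           # gate-weighted combination

rng = np.random.default_rng(0)
d, k, d_out = 4, 3, 2
y = moe_forward(
    rng.normal(size=d),
    rng.normal(size=(k, d)),
    [rng.normal(size=(d_out, d)) for _ in range(k)],
)
```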
no code implementations • 5 Feb 2024 • Xing Han, Huy Nguyen, Carl Harris, Nhat Ho, Suchi Saria
As machine learning models in critical fields increasingly grapple with multimodal data, they face the dual challenges of handling a wide array of modalities, often incomplete due to missing elements, and the temporal irregularity and sparsity of collected samples.
no code implementations • 4 Feb 2024 • Quang Pham, Giang Do, Huy Nguyen, TrungTin Nguyen, Chenghao Liu, Mina Sartipi, Binh T. Nguyen, Savitha Ramasamy, XiaoLi Li, Steven Hoi, Nhat Ho
Sparse mixture of experts (SMoE) offers an appealing solution for scaling up model complexity beyond the means of increasing the network's depth or width.
no code implementations • 3 Feb 2024 • Duy M. H. Nguyen, Nina Lukashina, Tai Nguyen, An T. Le, TrungTin Nguyen, Nhat Ho, Jan Peters, Daniel Sonntag, Viktor Zaverkin, Mathias Niepert
Contrary to prior work, we propose a novel 2D-3D aggregation mechanism based on a differentiable solver for the \emph{Fused Gromov-Wasserstein Barycenter} problem and the use of an efficient online conformer generation method based on distance geometry.
no code implementations • 29 Jan 2024 • Khai Nguyen, Shujian Zhang, Tam Le, Nhat Ho
From the RPD, we derive the random-path slicing distribution (RPSD) and two variants of sliced Wasserstein, i.e., the Random-Path Projection Sliced Wasserstein (RPSW) and the Importance Weighted Random-Path Projection Sliced Wasserstein (IWRPSW).
no code implementations • 28 Jan 2024 • Nicola Bariletto, Nhat Ho
Training machine learning and statistical models often involves optimizing a data-driven risk criterion.
no code implementations • 25 Jan 2024 • Huy Nguyen, Pedram Akbarian, Nhat Ho
We demonstrate that due to interactions between the temperature and other model parameters via some partial differential equations, the convergence rates of parameter estimations are slower than any polynomial rates, and could be as slow as $\mathcal{O}(1/\log(n))$, where $n$ denotes the sample size.
no code implementations • 4 Jan 2024 • Hien Dang, Tho Tran, Tan Nguyen, Nhat Ho
However, when the training dataset is class-imbalanced, some NC properties will no longer be true.
no code implementations • 18 Nov 2023 • Duy Minh Ho Nguyen, Tan Ngoc Pham, Nghiem Tuong Diep, Nghi Quoc Phan, Quang Pham, Vinh Tong, Binh T. Nguyen, Ngan Hoang Le, Nhat Ho, Pengtao Xie, Daniel Sonntag, Mathias Niepert
Constructing a robust model that can effectively generalize to test samples under distribution shifts remains a significant challenge in the field of medical imaging.
no code implementations • 22 Oct 2023 • Huy Nguyen, Pedram Akbarian, TrungTin Nguyen, Nhat Ho
The mixture-of-experts (MoE) model incorporates the power of multiple submodels via gating functions to achieve greater performance in numerous regression and classification applications.
no code implementations • 25 Sep 2023 • Huy Nguyen, Pedram Akbarian, Fanqi Yan, Nhat Ho
When the true number of experts $k_{\ast}$ is known, we demonstrate that the convergence rates of density and parameter estimation are both parametric in the sample size.
1 code implementation • 21 Sep 2023 • Khai Nguyen, Nicola Bariletto, Nhat Ho
Monte Carlo (MC) integration has been employed as the standard approximation method for the Sliced Wasserstein (SW) distance, whose analytical expression involves an intractable expectation.
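A minimal sketch of that standard MC approximation, assuming equal-size empirical measures with uniform weights so each one-dimensional projection is solved by sorting:

```python
import numpy as np

def sliced_wasserstein_mc(X, Y, n_projections=100, p=2, seed=0):
    """Monte Carlo estimate of SW_p between the empirical measures of X and Y.

    X, Y: (n, d) samples with uniform weights; each random projection reduces
    the problem to one-dimensional optimal transport, solved by sorting.
    """
    rng = np.random.default_rng(seed)
    theta = rng.normal(size=(n_projections, X.shape[1]))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)  # uniform on the sphere
    px, py = X @ theta.T, Y @ theta.T                      # (n, L) projections
    diffs = np.abs(np.sort(px, axis=0) - np.sort(py, axis=0)) ** p
    return diffs.mean() ** (1.0 / p)   # average W_p^p over directions, then root

rng = np.random.default_rng(1)
X, Y = rng.normal(size=(256, 5)), rng.normal(loc=1.0, size=(256, 5))
print(sliced_wasserstein_mc(X, Y))
```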
1 code implementation • NeurIPS 2023 • Duy M. H. Nguyen, Hoang Nguyen, Nghiem T. Diep, Tan N. Pham, Tri Cao, Binh T. Nguyen, Paul Swoboda, Nhat Ho, Shadi Albarqouni, Pengtao Xie, Daniel Sonntag, Mathias Niepert
While pre-trained deep networks on ImageNet and vision-language foundation models trained on web-scale data are prevailing approaches, their effectiveness on medical tasks is limited due to the significant domain shift between natural and medical images.
no code implementations • 13 Jun 2023 • Disha Makhija, Joydeep Ghosh, Nhat Ho
Moreover, the need for uncertainty quantification and data privacy constraints are often particularly amplified for clients that have limited local data.
no code implementations • 8 Jun 2023 • Hien Dang, Tho Tran, Tan Nguyen, Nhat Ho
Specifically, via a non-trivial theoretical analysis of the linear conditional VAE and the hierarchical VAE with two levels of latent variables, we prove that the causes of posterior collapse in these models include the correlation between the input and output of the conditional VAE and the effect of learnable encoder variance in the hierarchical VAE.
no code implementations • 27 May 2023 • Tung Le, Khai Nguyen, Shanlin Sun, Kun Han, Nhat Ho, Xiaohui Xie
The metric is defined via the sliced Wasserstein distance on meshes represented as probability measures, which generalizes the set-based approach.
1 code implementation • 12 May 2023 • Huy Nguyen, TrungTin Nguyen, Khai Nguyen, Nhat Ho
Originally introduced as a neural network for ensemble learning, mixture of experts (MoE) has recently become a fundamental building block of highly successful modern deep neural networks for heterogeneous data analysis in several applications of machine learning and statistics.
1 code implementation • 30 Apr 2023 • Khai Nguyen, Nhat Ho
To bridge the literature on variance reduction and the literature on the SW distance, we propose computationally efficient control variates to reduce the variance of the empirical estimation of the SW distance.
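The paper's control variates are built specifically for SW; the sketch below only illustrates the underlying variance-reduction principle, using a generic control variate whose mean is known:

```python
import numpy as np

def control_variate_estimate(f_samples, g_samples, g_mean):
    """Estimate E[f] as mean(f - beta * (g - E[g])) with the variance-optimal
    coefficient beta = Cov(f, g) / Var(g)."""
    beta = np.cov(f_samples, g_samples)[0, 1] / np.var(g_samples, ddof=1)
    return np.mean(f_samples - beta * (g_samples - g_mean))

# toy check: estimate E[exp(U)], U ~ Uniform(0, 1), with control variate g(U) = U
rng = np.random.default_rng(0)
u = rng.uniform(size=10_000)
print(control_variate_estimate(np.exp(u), u, g_mean=0.5))  # ~ e - 1 = 1.718...
```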
1 code implementation • NeurIPS 2023 • Khai Nguyen, Nhat Ho
The second approach optimizes for the best distribution, within a parametric family of distributions, that maximizes the expected distance.
1 code implementation • 12 Jan 2023 • Khai Nguyen, Dang Nguyen, Nhat Ho
Despite being efficient, Max-SW and its amortized version cannot guarantee the metricity property due to the sub-optimality of projected gradient ascent and the amortization gap.
1 code implementation • NeurIPS 2023 • Khai Nguyen, Tongzheng Ren, Nhat Ho
The sliced Wasserstein (SW) distance suffers from redundant projections due to independent, uniformly random projecting directions.
2 code implementations • 1 Jan 2023 • Hien Dang, Tho Tran, Stanley Osher, Hung Tran-The, Nhat Ho, Tan Nguyen
Modern deep neural networks have achieved impressive performance on tasks from image classification to natural language processing.
no code implementations • 4 Dec 2022 • Duy M. H. Nguyen, Hoang Nguyen, Mai T. N. Truong, Tri Cao, Binh T. Nguyen, Nhat Ho, Paul Swoboda, Shadi Albarqouni, Pengtao Xie, Daniel Sonntag
Recent breakthroughs in self-supervised learning (SSL) offer the ability to overcome the lack of labeled training samples by learning feature representations from unlabeled data.
1 code implementation • 28 Nov 2022 • Khang Nguyen, Hieu Nong, Vinh Nguyen, Nhat Ho, Stanley Osher, Tan Nguyen
Graph Neural Networks (GNNs) have been demonstrated to be inherently susceptible to the problems of over-smoothing and over-squashing.
no code implementations • 24 Nov 2022 • Hoang Phan, Lam Tran, Ngoc N. Tran, Nhat Ho, Dinh Phung, Trung Le
Multi-Task Learning (MTL) is a widely used and powerful learning paradigm for training deep neural networks that allows learning more than one objective with a single backbone.
no code implementations • 19 Oct 2022 • Dung Le, Huy Nguyen, Khai Nguyen, Trang Nguyen, Nhat Ho
The generalized sliced Wasserstein distance is a variant of the sliced Wasserstein distance that exploits the power of non-linear projection through a given defining function to better capture the complex structures of probability distributions.
no code implementations • 29 Sep 2022 • Anh Do, Duy Dinh, Tan Nguyen, Khuong Nguyen, Stanley Osher, Nhat Ho
Generative Flow Networks (GFlowNets) are recently proposed models for learning stochastic policies that generate compositional objects through sequences of actions, with probability proportional to a given reward function.
1 code implementation • 27 Sep 2022 • Khai Nguyen, Tongzheng Ren, Huy Nguyen, Litu Rout, Tan Nguyen, Nhat Ho
We explain the usage of these projections by introducing Hierarchical Radon Transform (HRT) which is constructed by applying Radon Transform variants recursively.
1 code implementation • 4 Jun 2022 • Hoang Phan, Ngoc Tran, Trung Le, Toan Tran, Nhat Ho, Dinh Phung
Furthermore, when analysing its asymptotic properties, we find that SVGD reduces exactly to a single-objective optimization problem, of which it can be viewed as a probabilistic version.
no code implementations • 1 Jun 2022 • Tan Nguyen, Minh Pham, Tam Nguyen, Khai Nguyen, Stanley J. Osher, Nhat Ho
Multi-head attention underlies the recent success of transformers, the state-of-the-art models that have achieved remarkable results in sequence modeling and beyond.
no code implementations • 27 May 2022 • Xing Han, Tongzheng Ren, Jing Hu, Joydeep Ghosh, Nhat Ho
To attain this goal, each time series is first assigned the forecast for its cluster representative, which can be considered as a "shrinkage prior" for the set of time series it represents.
no code implementations • 25 May 2022 • Disha Makhija, Nhat Ho, Joydeep Ghosh
As the field advances, two key challenges that still remain to be addressed are: (1) system heterogeneity - variability in the compute and/or data resources present on each client, and (2) lack of labeled data in certain federated settings.
no code implementations • 23 May 2022 • Tongzheng Ren, Fuheng Cui, Sujay Sanghavi, Nhat Ho
However, when the models are over-specified, namely, the chosen number of components to fit the data is larger than the unknown true number of components, EM needs a polynomial number of iterations in terms of the sample size to reach the final statistical radius; this is computationally expensive in practice.
no code implementations • 16 May 2022 • Nhat Ho, Tongzheng Ren, Sujay Sanghavi, Purnamrita Sarkar, Rachel Ward
Therefore, the total computational complexity of the EGD algorithm is \emph{optimal} and exponentially cheaper than that of the GD for solving parameter estimation in non-regular statistical models while being comparable to that of the GD in regular statistical settings.
2 code implementations • 4 Apr 2022 • Khai Nguyen, Nhat Ho
Finally, we demonstrate the favorable performance of CSW over the conventional sliced Wasserstein in comparing probability measures over images and in training deep generative modeling on images.
1 code implementation • 25 Mar 2022 • Khai Nguyen, Nhat Ho
Seeking informative projecting directions has been an important task in utilizing sliced Wasserstein distance in applications.
1 code implementation • 1 Mar 2022 • Hoang Phan, Trung Le, Trung Phung, Tuan Anh Bui, Nhat Ho, Dinh Phung
First, they purely focus on local regularization to strengthen model robustness, missing a global regularization effect which is useful in many real-world applications (e.g., domain adaptation, domain generalization, and adversarial machine learning).
1 code implementation • 17 Feb 2022 • Tudor Manole, Nhat Ho
These new loss functions accurately capture the heterogeneity in convergence rates of fitted mixture components, and we use them to sharpen existing pointwise and uniform convergence rates in various classes of mixture models.
no code implementations • 15 Feb 2022 • Disha Makhija, Xing Han, Nhat Ho, Joydeep Ghosh
With growing concerns regarding data privacy and the rapid increase in data volume, Federated Learning (FL) has become an important learning paradigm.
no code implementations • 9 Feb 2022 • Tongzheng Ren, Jiacheng Zhuo, Sujay Sanghavi, Nhat Ho
This computational complexity is cheaper than that of the fixed step-size gradient descent algorithm, which is of the order $\mathcal{O}(n^{\tau})$ for some $\tau > 1$, to reach the same statistical radius.
no code implementations • 5 Feb 2022 • Dat Do, Nhat Ho, XuanLong Nguyen
As we collect additional samples from a data population for which a density estimate may previously have been obtained by a black-box method, the increased complexity of the data set may cause the true density to deviate from the known estimate by a mixture distribution.
no code implementations • 10 Jan 2022 • Nhat Ho, Stephen G. Walker
We present simple conditions for Bayesian consistency in the supremum metric.
no code implementations • 29 Oct 2021 • Trung Le, Dat Do, Tuan Nguyen, Huy Nguyen, Hung Bui, Nhat Ho, Dinh Phung
We study the label shift problem between the source and target domains in general domain adaptation (DA) settings.
no code implementations • 29 Oct 2021 • Dang Nguyen, Trang Nguyen, Khai Nguyen, Dinh Phung, Hung Bui, Nhat Ho
To address this issue, we propose a novel model fusion framework, named CLAFusion, to fuse neural networks with a different number of layers, which we refer to as heterogeneous neural networks, via cross-layer alignment.
1 code implementation • 16 Oct 2021 • Tam Nguyen, Tan M. Nguyen, Dung D. Le, Duy Khuong Nguyen, Viet-Anh Tran, Richard G. Baraniuk, Nhat Ho, Stanley J. Osher
Inspired by this observation, we propose Transformer with a Mixture of Gaussian Keys (Transformer-MGK), a novel transformer architecture that replaces redundant heads in transformers with a mixture of keys at each head.
no code implementations • 15 Oct 2021 • Tongzheng Ren, Fuheng Cui, Alexia Atsidakou, Sujay Sanghavi, Nhat Ho
We study the statistical and computational complexities of the Polyak step size gradient descent algorithm under generalized smoothness and Lojasiewicz conditions on the population loss function, namely the limit of the empirical loss function as the sample size goes to infinity, together with a stability condition between the gradients of the empirical and population loss functions, namely a polynomial growth bound on the concentration between the gradients of the sample and population loss functions.
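For reference, the Polyak step size rule itself is short; a sketch assuming the optimal loss value $f^\ast$ is known, demonstrated on a toy quartic loss whose minimum is flat (the kind of degeneracy this line of work targets):

```python
import numpy as np

def polyak_gd(loss, grad, x0, f_star=0.0, n_iters=100):
    """Gradient descent with Polyak step size (loss(x) - f_star) / ||grad(x)||^2,
    assuming the optimal value f_star is known."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iters):
        g = grad(x)
        gnorm2 = g @ g
        if gnorm2 == 0.0:
            break
        x = x - (loss(x) - f_star) / gnorm2 * g
    return x

# toy quartic loss f(x) = ||x||^4: the minimum at 0 is flat (non-strongly convex)
loss = lambda x: (x @ x) ** 2
grad = lambda x: 4.0 * (x @ x) * x
print(polyak_gd(loss, grad, x0=np.ones(3)))  # contracts toward 0 geometrically
```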
no code implementations • 24 Aug 2021 • Khang Le, Dung Le, Huy Nguyen, Dat Do, Tung Pham, Nhat Ho
When the metric is the inner product, which we refer to as inner product Gromov-Wasserstein (IGW), we demonstrate that the optimal transportation plans of entropic IGW and its unbalanced variant are (unbalanced) Gaussian distributions.
2 code implementations • 22 Aug 2021 • Khai Nguyen, Dang Nguyen, The-Anh Vu-Le, Tung Pham, Nhat Ho
Mini-batch optimal transport (m-OT) has been widely used recently to deal with the memory issue of OT in large-scale applications.
no code implementations • 18 Aug 2021 • Khang Le, Huy Nguyen, Tung Pham, Nhat Ho
We demonstrate that the ApproxMPOT algorithm can approximate the optimal value of multimarginal POT problem with a computational complexity upper bound of the order $\tilde{\mathcal{O}}(m^3(n+1)^{m}/ \varepsilon^2)$ where $\varepsilon > 0$ stands for the desired tolerance.
no code implementations • 22 Jul 2021 • Nhat Ho, Stephen G. Walker
We introduce a class of integral theorems based on cyclic functions and Riemann sums approximating integrals.
no code implementations • 11 Jun 2021 • Nhat Ho, Stephen G. Walker
Taking the Fourier integral theorem as our starting point, in this paper we focus on natural Monte Carlo and fully nonparametric estimators of multivariate distributions and conditional distribution functions.
no code implementations • NeurIPS 2021 • Son Nguyen, Duong Nguyen, Khai Nguyen, Khoat Than, Hung Bui, Nhat Ho
Approximate inference in Bayesian deep networks faces a dilemma: how to yield high-fidelity posterior approximations while maintaining computational efficiency and scalability.
no code implementations • NeurIPS 2021 • Khang Le, Huy Nguyen, Quang Nguyen, Tung Pham, Hung Bui, Nhat Ho
We consider robust variants of the standard optimal transport, named robust optimal transport, where marginal constraints are relaxed via Kullback-Leibler divergence.
2 code implementations • 11 Feb 2021 • Khai Nguyen, Dang Nguyen, Quoc Nguyen, Tung Pham, Hung Bui, Dinh Phung, Trung Le, Nhat Ho
To address these problems, we propose a novel mini-batch scheme for optimal transport, named Batch of Mini-batches Optimal Transport (BoMb-OT), which finds the optimal coupling between mini-batches and can be seen as an approximation of a well-defined distance on the space of probability measures.
1 code implementation • ICCV 2021 • Trung Nguyen, Quang-Hieu Pham, Tam Le, Tung Pham, Nhat Ho, Binh-Son Hua
From this study, we propose to use sliced Wasserstein distance and its variants for learning representations of 3D point clouds.
no code implementations • 27 Jan 2021 • Jiacheng Zhuo, Jeongyeol Kwon, Nhat Ho, Constantine Caramanis
We consider solving the low rank matrix sensing problem with the Factorized Gradient Descent (FGD) method when the true rank is unknown and over-specified, which we refer to as over-parameterized matrix sensing.
no code implementations • 28 Dec 2020 • Nhat Ho, Stephen G. Walker
Starting with the Fourier integral theorem, we present natural Monte Carlo estimators of multivariate functions including densities, mixing densities, transition densities, regression functions, and the search for modes of multivariate density functions (modal regression).
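A hedged sketch of the sinc-kernel estimator that the Fourier integral theorem suggests for a multivariate density, with a single smoothing parameter R; names and the toy check are illustrative:

```python
import numpy as np

def fourier_density_estimate(y, X, R=10.0):
    """Monte Carlo sinc-kernel density estimate at point y from samples X (n, d):
    f_hat(y) = (1/n) sum_i prod_j sin(R (y_j - X_ij)) / (pi (y_j - X_ij)),
    with a single smoothing parameter R and no bandwidth matrix."""
    diff = y[None, :] - X                              # (n, d)
    kernel = np.sinc(R * diff / np.pi) * (R / np.pi)   # sin(R t)/(pi t), safe at t = 0
    return np.prod(kernel, axis=1).mean()

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 2))
print(fourier_density_estimate(np.zeros(2), X))  # ~ 1/(2 pi) = 0.159 for N(0, I)
```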
2 code implementations • ICLR 2021 • Khai Nguyen, Son Nguyen, Nhat Ho, Tung Pham, Hung Bui
To improve the discrepancy and consequently the relational regularization, we propose a new relational discrepancy, named spherical sliced fused Gromov Wasserstein (SSFG), that can find an important area of projections characterized by a von Mises-Fisher distribution.
no code implementations • NeurIPS 2020 • Tianyi Lin, Chenyou Fan, Nhat Ho, Marco Cuturi, Michael I. Jordan
Projection robust Wasserstein (PRW) distance, or Wasserstein projection pursuit (WPP), is a robust variant of the Wasserstein distance.
1 code implementation • 11 Jun 2020 • Mingzhang Yin, Nhat Ho, Bowei Yan, Xiaoning Qian, Mingyuan Zhou
This paper proposes a novel optimization method to solve the exact L0-regularized regression problem, which is also known as the best subset selection.
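As a reference point for what "exact" means here, best subset selection can be solved by brute-force enumeration of supports (exponential in the dimension; this states the problem, it is not the paper's proposed method):

```python
import numpy as np
from itertools import combinations

def best_subset(X, y, k):
    """Exact L0-constrained least squares by enumerating all size-k supports."""
    best_rss, best_support, best_beta = np.inf, None, None
    for support in combinations(range(X.shape[1]), k):
        Xs = X[:, support]
        beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
        resid = y - Xs @ beta
        rss = resid @ resid
        if rss < best_rss:
            best_rss, best_support, best_beta = rss, support, beta
    return best_support, best_beta

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))
y = X[:, [1, 4]] @ np.array([2.0, -3.0]) + 0.1 * rng.normal(size=100)
print(best_subset(X, y, k=2))  # recovers support (1, 4)
```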
no code implementations • 4 Jun 2020 • Jeongyeol Kwon, Nhat Ho, Constantine Caramanis
In the low SNR regime where the SNR is below $\mathcal{O}((d/n)^{1/4})$, we show that EM converges to a $\mathcal{O}((d/n)^{1/4})$ neighborhood of the true parameters, after $\mathcal{O}((n/d)^{1/2})$ iterations.
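This line of work typically considers the symmetric two-component Gaussian mixture, for which the EM update has a closed form; a minimal sketch under that assumption:

```python
import numpy as np

def em_symmetric_2gmm(X, theta0, sigma=1.0, n_iters=50):
    """EM for the mixture 0.5 N(theta, sigma^2 I) + 0.5 N(-theta, sigma^2 I);
    the E and M steps collapse into theta <- mean_i tanh(<theta, x_i>/sigma^2) x_i."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(n_iters):
        w = np.tanh(X @ theta / sigma**2)       # posterior label gap in [-1, 1]
        theta = (w[:, None] * X).mean(axis=0)   # closed-form M-step
    return theta

rng = np.random.default_rng(0)
true = np.array([1.0, 0.5])
signs = rng.choice([-1.0, 1.0], size=5000)
X = signs[:, None] * true + rng.normal(size=(5000, 2))
print(em_symmetric_2gmm(X, theta0=np.array([0.1, 0.1])))  # ~ +/- true
```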
1 code implementation • 1 Jun 2020 • Tudor Manole, Nhat Ho
We derive uniform convergence rates for the maximum likelihood estimator and minimax lower bounds for parameter estimation in two-component location-scale Gaussian mixture models with unequal variances.
no code implementations • 22 May 2020 • Nhat Ho, Koulik Khamaru, Raaz Dwivedi, Martin J. Wainwright, Michael I. Jordan, Bin Yu
Many statistical estimators are defined as the fixed point of a data-dependent operator, with estimators based on minimizing a cost function being an important special case.
1 code implementation • ICLR 2021 • Khai Nguyen, Nhat Ho, Tung Pham, Hung Bui
The Sliced-Wasserstein distance (SW) and its variant, the Max Sliced-Wasserstein distance (Max-SW), have been widely used in recent years due to their fast computation and scalability even when the probability measures lie in a very high-dimensional space.
no code implementations • NeurIPS 2020 • Tianyi Lin, Nhat Ho, Xi Chen, Marco Cuturi, Michael I. Jordan
We study the fixed-support Wasserstein barycenter problem (FS-WBP), which consists in computing the Wasserstein barycenter of $m$ discrete probability measures supported on a finite metric space of size $n$.
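A compact sketch of the classical iterative Bregman projections (IBP) baseline for the entropic FS-WBP, not the accelerated algorithm developed in the paper:

```python
import numpy as np

def ibp_barycenter(A, C, weights, eps=0.05, n_iters=300):
    """Entropic fixed-support barycenter via iterative Bregman projections.

    A: (n, m) columns are m measures on a shared n-point support; C: (n, n) costs.
    """
    K = np.exp(-C / eps)
    U, V = np.ones_like(A), np.ones_like(A)
    for _ in range(n_iters):
        U = A / (K @ V)
        # barycenter = weighted geometric mean of the scaled marginals K^T u_k
        b = np.exp((np.log(K.T @ U) * weights).sum(axis=1))
        V = b[:, None] / (K.T @ U)
    return b

grid = np.linspace(0.0, 1.0, 50)
C = (grid[:, None] - grid[None, :]) ** 2
a1 = np.exp(-((grid - 0.2) ** 2) / 0.005); a1 /= a1.sum()
a2 = np.exp(-((grid - 0.8) ** 2) / 0.005); a2 /= a2.sum()
b = ibp_barycenter(np.stack([a1, a2], axis=1), C, weights=np.array([0.5, 0.5]))
```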
1 code implementation • ICML 2020 • Khiem Pham, Khang Le, Nhat Ho, Tung Pham, Hung Bui
We provide a computational complexity analysis for the Sinkhorn algorithm that solves the entropic regularized Unbalanced Optimal Transport (UOT) problem between two measures of possibly different masses with at most $n$ components.
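A minimal sketch of the Sinkhorn scaling iterations for entropic UOT with KL marginal penalties of weight tau, assuming the standard damped updates with exponent tau/(tau + eps); a sketch, not the paper's exact setup:

```python
import numpy as np

def sinkhorn_uot(a, b, C, eps=0.1, tau=1.0, n_iters=500):
    """Sinkhorn scaling for entropic UOT with KL marginal penalties of weight tau;
    the usual multiplicative updates are damped by the exponent tau / (tau + eps)."""
    K = np.exp(-C / eps)
    u, v = np.ones_like(a), np.ones_like(b)
    fi = tau / (tau + eps)
    for _ in range(n_iters):
        u = (a / (K @ v)) ** fi
        v = (b / (K.T @ u)) ** fi
    return u[:, None] * K * v[None, :]   # transport plan

rng = np.random.default_rng(0)
x, y = rng.normal(size=(5, 1)), rng.normal(size=(7, 1))
C = (x - y.T) ** 2
a = np.full(5, 0.2)        # total mass 1.0 ...
b = np.full(7, 1.5 / 7)    # ... vs total mass 1.5: masses need not match
P = sinkhorn_uot(a, b, C, eps=0.05)
print(P.sum())             # total transported mass
```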
no code implementations • 11 Dec 2019 • Wenlong Mou, Nhat Ho, Martin J. Wainwright, Peter L. Bartlett, Michael I. Jordan
We study the problem of sampling from the power posterior distribution in Bayesian Gaussian mixture models, a robust version of the classical posterior.
1 code implementation • 10 Oct 2019 • Tam Le, Nhat Ho, Makoto Yamada
By leveraging a tree structure, we propose to align \textit{flows} from a root to each support, instead of the pair-wise tree metrics of supports, i.e., flows from one support to another, in GW.
no code implementations • 10 Oct 2019 • Tam Le, Viet Huynh, Nhat Ho, Dinh Phung, Makoto Yamada
We study in this paper a variant of the Wasserstein barycenter problem, which we refer to as the tree-Wasserstein barycenter, by leveraging a specific class of ground metrics, namely tree metrics, for the Wasserstein distance.
no code implementations • 30 Sep 2019 • Tianyi Lin, Nhat Ho, Marco Cuturi, Michael I. Jordan
This provides a first \textit{near-linear time} complexity bound guarantee for approximating the MOT problem and matches the best known complexity bound for the Sinkhorn algorithm in the classical OT setting when $m = 2$.
1 code implementation • 19 Sep 2019 • Viet Huynh, Nhat Ho, Nhan Dam, XuanLong Nguyen, Mikhail Yurochkin, Hung Bui, Dinh Phung
We propose a novel approach to the problem of multilevel clustering, which aims to simultaneously partition data in each group and discover grouping patterns among groups in a potentially large hierarchically structured corpus of data.
no code implementations • 9 Jul 2019 • Nhat Ho, Chiao-Yu Yang, Michael I. Jordan
We provide a theoretical treatment of over-specified Gaussian mixtures of experts with covariate-free gating networks.
no code implementations • 1 Jun 2019 • Tianyi Lin, Nhat Ho, Michael I. Jordan
We prove that APDAMD achieves the complexity bound of $\widetilde{O}(n^2\sqrt{\delta}\varepsilon^{-1})$ in which $\delta>0$ stands for the regularity of $\phi$.
no code implementations • 23 May 2019 • Chiao-Yu Yang, Eric Xia, Nhat Ho, Michael I. Jordan
In this work, we provide a rigorous study for the posterior distribution of the number of clusters in DPMM under different prior distributions on the parameters and constraints on the distributions of the data.
no code implementations • 23 May 2019 • Wenshuo Guo, Nhat Ho, Michael I. Jordan
First, we introduce the \emph{accelerated primal-dual randomized coordinate descent} (APDRCD) algorithm for computing the OT distance.
no code implementations • ICLR 2019 • Nhat Ho, Tan Nguyen, Ankit B. Patel, Anima Anandkumar, Michael I. Jordan, Richard G. Baraniuk
The conjugate prior yields a new regularizer for learning based on the paths rendered in the generative model for training CNNs: the Rendering Path Normalization (RPN).
no code implementations • 16 Apr 2019 • Nhat Ho, Tianyi Lin, Michael I. Jordan
We also conduct experiments on real datasets and the numerical results demonstrate the effectiveness of our algorithms.
no code implementations • 1 Feb 2019 • Raaz Dwivedi, Nhat Ho, Koulik Khamaru, Martin J. Wainwright, Michael I. Jordan, Bin Yu
We study a class of weakly identifiable location-scale mixture models for which the maximum likelihood estimates based on $n$ i.i.d.
no code implementations • 19 Jan 2019 • Tianyi Lin, Nhat Ho, Michael I. Jordan
We show that a greedy variant of the classical Sinkhorn algorithm, known as the \emph{Greenkhorn algorithm}, achieves a complexity bound of $\widetilde{\mathcal{O}}(n^2\varepsilon^{-2})$, improving on the previous best known bound of $\widetilde{\mathcal{O}}(n^2\varepsilon^{-3})$.
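A didactic sketch of the greedy rule: per iteration, only the single row or column whose marginal is furthest from its target is rescaled. Marginals are recomputed in full here for clarity; efficient implementations maintain them incrementally:

```python
import numpy as np

def rho(target, current):
    """Greenkhorn's violation measure between target and current marginals."""
    return current - target + target * np.log(target / current)

def greenkhorn(a, b, C, eps=0.05, n_iters=5000):
    K = np.exp(-C / eps)
    u, v = np.ones_like(a), np.ones_like(b)
    for _ in range(n_iters):
        P = u[:, None] * K * v[None, :]
        r, c = P.sum(axis=1), P.sum(axis=0)      # recomputed in full for clarity
        i, j = np.argmax(rho(a, r)), np.argmax(rho(b, c))
        if rho(a, r)[i] >= rho(b, c)[j]:
            u[i] = a[i] / (K[i] @ v)             # match row i's marginal exactly
        else:
            v[j] = b[j] / (K[:, j] @ u)          # match column j's marginal exactly
    return u[:, None] * K * v[None, :]

rng = np.random.default_rng(0)
C = rng.uniform(size=(6, 6))
a = b = np.full(6, 1.0 / 6.0)
P = greenkhorn(a, b, C)
```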
no code implementations • 15 Nov 2018 • Trung Le, Khanh Nguyen, Nhat Ho, Hung Bui, Dinh Phung
The underlying idea of deep domain adaptation is to bridge the gap between source and target domains in a joint space so that a supervised classifier trained on labeled source data can be nicely transferred to the target domain.
no code implementations • 1 Nov 2018 • Tan Nguyen, Nhat Ho, Ankit Patel, Anima Anandkumar, Michael I. Jordan, Richard G. Baraniuk
This conjugate prior yields a new regularizer based on paths rendered in the generative model for training CNNs: the Rendering Path Normalization (RPN).
no code implementations • 29 Oct 2018 • Nhat Ho, Viet Huynh, Dinh Phung, Michael I. Jordan
We propose a novel probabilistic approach to multilevel clustering problems based on composite transportation distance, which is a variant of transportation distance where the underlying metric is Kullback-Leibler divergence.
no code implementations • 1 Oct 2018 • Raaz Dwivedi, Nhat Ho, Koulik Khamaru, Michael I. Jordan, Martin J. Wainwright, Bin Yu
A line of recent work has analyzed the behavior of the Expectation-Maximization (EM) algorithm in the well-specified setting, in which the population likelihood is locally strongly concave around its maximizing argument.
1 code implementation • ICML 2017 • Nhat Ho, XuanLong Nguyen, Mikhail Yurochkin, Hung Hai Bui, Viet Huynh, Dinh Phung
We propose a novel approach to the problem of multilevel clustering, which aims to simultaneously partition data in each group and discover grouping patterns among groups in a potentially large hierarchically structured corpus of data.
no code implementations • 9 Sep 2016 • Nhat Ho, XuanLong Nguyen
Our study makes explicit the deep links between model singularities, parameter estimation convergence rates and minimax lower bounds, and the algebraic geometry of the parameter space for mixtures of continuous distributions.
no code implementations • 11 Jan 2015 • Nhat Ho, XuanLong Nguyen
This paper studies identifiability and convergence behaviors for parameters of multiple types in finite mixtures, and the effects of model fitting with extra mixing components.