no code implementations • 19 Apr 2024 • Mark Chiu Chong, Hien Duy Nguyen, TrungTin Nguyen
We consider the problem of estimating probability density functions based on sample data, using a finite mixture of densities from some component class.
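The paper treats general component classes; purely as an illustrative sketch of the basic idea, the snippet below fits a finite Gaussian mixture by EM with scikit-learn and evaluates the resulting density estimate. This is a generic example, not the estimator studied in the paper.

```python
# Minimal sketch (not the paper's method): density estimation with a
# finite Gaussian mixture fitted by EM, using scikit-learn.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Toy sample drawn from two Gaussian components.
data = np.concatenate([rng.normal(-2.0, 0.5, 500),
                       rng.normal(1.5, 1.0, 500)]).reshape(-1, 1)

# Fit a K-component mixture; K = 2 is chosen by hand here for illustration.
mixture = GaussianMixture(n_components=2, random_state=0).fit(data)

# score_samples returns log-density estimates at query points.
grid = np.linspace(-5, 5, 200).reshape(-1, 1)
density = np.exp(mixture.score_samples(grid))
```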
no code implementations • 4 Feb 2024 • Quang Pham, Giang Do, Huy Nguyen, TrungTin Nguyen, Chenghao Liu, Mina Sartipi, Binh T. Nguyen, Savitha Ramasamy, XiaoLi Li, Steven Hoi, Nhat Ho
Sparse mixture of experts (SMoE) offers an appealing solution to scale up the model complexity beyond the means of increasing the network's depth or width.
no code implementations • 3 Feb 2024 • Duy M. H. Nguyen, Nina Lukashina, Tai Nguyen, An T. Le, TrungTin Nguyen, Nhat Ho, Jan Peters, Daniel Sonntag, Viktor Zaverkin, Mathias Niepert
In contrast to prior work, we propose a novel 2D-3D aggregation mechanism based on a differentiable solver for the Fused Gromov-Wasserstein Barycenter problem, together with an efficient online conformer generation method based on distance geometry.
1 code implementation • 12 Dec 2023 • Giang Do, Khiem Le, Quang Pham, TrungTin Nguyen, Thanh-Nam Doan, Binh T. Nguyen, Chenghao Liu, Savitha Ramasamy, XiaoLi Li, Steven Hoi
By routing input tokens to only a few split experts, Sparse Mixture-of-Experts has enabled efficient training of large language models.
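For readers unfamiliar with the mechanism, the sketch below shows generic top-k routing in a sparse MoE layer in plain NumPy; the router, expert interfaces, and shapes are hypothetical and are not taken from the paper.

```python
# Generic top-k routing sketch for a sparse MoE layer (illustrative only).
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def topk_route(tokens, router_weights, experts, k=2):
    """Route each token to its k highest-scoring experts.

    tokens:         (n_tokens, d_model)
    router_weights: (d_model, n_experts) linear router
    experts:        list of callables, each mapping (d_model,) -> (d_model,)
    """
    gates = softmax(tokens @ router_weights)        # (n_tokens, n_experts)
    top_idx = np.argsort(-gates, axis=1)[:, :k]     # indices of the k best experts per token
    output = np.zeros_like(tokens)
    for t, token in enumerate(tokens):
        chosen = top_idx[t]
        weights = gates[t, chosen] / gates[t, chosen].sum()  # renormalise over the k experts
        for w, e in zip(weights, chosen):
            output[t] += w * experts[e](token)      # only k experts run per token
    return output
```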
no code implementations • 22 Oct 2023 • Huy Nguyen, Pedram Akbarian, TrungTin Nguyen, Nhat Ho
The mixture-of-experts (MoE) model combines the strengths of multiple submodels via gating functions to achieve greater performance in numerous regression and classification applications.
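A minimal sketch of a dense, softmax-gated MoE prediction step follows; the linear experts are chosen purely for illustration and this is not the specific model analysed in the paper.

```python
# Dense MoE sketch: softmax gating functions mix the predictions of
# simple linear experts (illustrative only).
import numpy as np

def moe_predict(x, gate_w, expert_w):
    """x: (n, d) inputs; gate_w: (d, m) gating weights; expert_w: (m, d) one linear expert per row."""
    logits = x @ gate_w                           # (n, m) gating scores
    logits -= logits.max(axis=1, keepdims=True)
    gates = np.exp(logits)
    gates /= gates.sum(axis=1, keepdims=True)     # softmax gate: input-dependent mixture weights
    expert_out = x @ expert_w.T                   # (n, m) each expert's prediction
    return (gates * expert_out).sum(axis=1)       # gate-weighted combination
```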
1 code implementation • 12 May 2023 • Huy Nguyen, TrungTin Nguyen, Khai Nguyen, Nhat Ho
Originally introduced as a neural network for ensemble learning, mixture of experts (MoE) has recently become a fundamental building block of highly successful modern deep neural networks for heterogeneous data analysis in several applications of machine learning and statistics.
no code implementations • 18 Apr 2021 • TrungTin Nguyen, Faicel Chamroukhi, Hien Duy Nguyen, Florence Forbes
This model selection criterion allows us to handle the challenging problem of inferring the number of mixture components, the degree of the polynomial mean functions, and the hidden block-diagonal structures of the covariance matrices, which reduce the number of parameters to be estimated and yield a trade-off between model complexity and sparsity.
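As a rough illustration of penalised model selection over the number of mixture components, the snippet below uses BIC with scikit-learn's GaussianMixture; the criterion proposed in the paper is different, so this is only a stand-in for the general idea.

```python
# Illustrative only: choosing the number of mixture components by BIC
# (a familiar stand-in for the paper's criterion, which is different).
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
data = np.concatenate([rng.normal(0.0, 1.0, 300),
                       rng.normal(4.0, 0.7, 300)]).reshape(-1, 1)

candidates = range(1, 7)
bics = [GaussianMixture(n_components=k, random_state=0).fit(data).bic(data)
        for k in candidates]
best_k = list(candidates)[int(np.argmin(bics))]   # lowest BIC wins
```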
1 code implementation • 6 Apr 2021 • TrungTin Nguyen, Hien Duy Nguyen, Faicel Chamroukhi, Florence Forbes
Mixtures of experts (MoE) are a popular class of statistical and machine learning models that have gained attention over the years for their flexibility and efficiency.
no code implementations • 22 Sep 2020 • TrungTin Nguyen, Hien D. Nguyen, Faicel Chamroukhi, Geoffrey J. McLachlan
Mixture of experts (MoE) provides a well-principled finite mixture model construction for prediction, allowing the gating network (mixture weights) to learn from the predictors (explanatory variables) together with the experts' network (mixture component densities).