no code implementations • 19 Mar 2025 • Abhi Kamboj, Minh N. Do
Multimodal alignment aims to construct a joint latent vector space where two modalities representing the same concept map to the same vector.
no code implementations • 3 Dec 2024 • Renan A. Rojas-Gomez, Minh N. Do
GIST replaces the standard Neural Style Transfer autoencoding framework with a multiscale image expansion, preserving scene details without the need for post-processing or training.
no code implementations • 2 Dec 2024 • Trung-Hieu Hoang, Duc Minh Vo, Minh N. Do
Test-time adaptation (TTA) has emerged as a promising solution to tackle the continual domain shift in machine learning by allowing model parameters to change at test time, via self-supervised learning on unlabeled testing data.
no code implementations • 11 Dec 2023 • Trung-Hieu Hoang, Mona Zehni, Huy Phan, Duc Minh Vo, Minh N. Do
We observe the poor generalization of state-of-the-art 3D pose lifters in the presence of corruption and establish two techniques to tackle this issue.
1 code implementation • 30 Nov 2023 • Trung-Hieu Hoang, Duc Minh Vo, Minh N. Do
Current test-time adaptation (TTA) approaches aim to adapt a machine learning model to environments that change continuously.
no code implementations • CVPR 2024 • Renan A. Rojas-Gomez, Teck-Yian Lim, Minh N. Do, Raymond A. Yeh
For computer vision, Vision Transformers (ViTs) have become one of the go-to deep net architectures.
1 code implementation • 9 Mar 2023 • Minh-Quan Le, Tam V. Nguyen, Trung-Nghia Le, Thanh-Toan Do, Minh N. Do, Minh-Triet Tran
To overcome the disadvantage of the point estimation mechanism, we propose a novel approach, dubbed MaskDiff, that models the underlying conditional distribution of a binary mask, conditioned on an object region and $K$-shot information.
1 code implementation • 20 Nov 2022 • Quan Nguyen, Hieu H. Pham, Kok-Seng Wong, Phi Le Nguyen, Truong Thao Nguyen, Minh N. Do
FedDCT reduces the memory requirements and allows low-end devices to participate in FL.
1 code implementation • 14 Oct 2022 • Renan A. Rojas-Gomez, Teck-Yian Lim, Alexander G. Schwing, Minh N. Do, Raymond A. Yeh
We propose learnable polyphase sampling (LPS), a pair of learnable down/upsampling layers that enable truly shift-invariant and equivariant convolutional networks.
1 code implementation • 12 Jul 2022 • Khoi-Nguyen C. Mac, Minh N. Do, Minh P. Vo
We validate the system on the EPIC-KITCHENS and UCF-101 datasets for action recognition and show that our proposed approach can greatly speed up inference with a tolerable loss of accuracy compared with state-of-the-art baselines.
no code implementations • 15 May 2022 • Trung-Hieu Hoang, Mona Zehni, Huaijin Xu, George Heintz, Christopher Zallek, Minh N. Do
In this paper, we propose an accessible vision-based exam and documentation solution called Digitized Neurological Examination (DNE) to expand exam biomarker recording options and clinical applications using a smartphone/tablet.
1 code implementation • 27 Nov 2021 • Vaishnavi Subramanian, Tanveer Syeda-Mahmood, Minh N. Do
We propose a two-stage prediction pipeline using pCCA embeddings generated with deflation for latent variable prediction by combining all the above.
1 code implementation • 24 Nov 2021 • Qian Jiang, Xiaofan Zhang, Deming Chen, Minh N. Do, Raymond A. Yeh
In this work, we propose End-to-end Hardware-aware DNAS (EH-DNAS), a seamless integration of end-to-end hardware benchmarking and fully automated DNAS to deliver hardware-efficient deep neural networks on various platforms, including Edge GPUs, Edge TPUs, Mobile CPUs, and customized accelerators.
no code implementations • 13 Aug 2021 • Spencer Markowitz, Corey Snyder, Yonina C. Eldar, Minh N. Do
Background foreground separation (BFS) is a popular computer vision problem where dynamic foreground objects are separated from the static background of a scene.
1 code implementation • 13 Jun 2021 • Renan A. Rojas-Gomez, Raymond A. Yeh, Minh N. Do, Anh Nguyen
Although unconditional feature inversion is the foundation of many image synthesis applications, training an inverter demands a high computational budget, large decoding capacity, and the imposition of conditions such as autoregressive priors.
1 code implementation • 9 Mar 2021 • Vaishnavi Subramanian, Tanveer Syeda-Mahmood, Minh N. Do
Effective understanding of a disease such as cancer requires fusing multiple sources of information captured across physical scales by multimodal data.
1 code implementation • 5 Feb 2020 • Vaishnavi Subramanian, Minh N. Do, Tanveer Syeda-Mahmood
Lung cancer has a high rate of recurrence in early-stage patients.
no code implementations • 9 Nov 2019 • Ramanpreet Singh Pahwa, Kennard Yanting Chan, Jiamin Bai, Vincensius Billy Saputra, Minh N. Do, Shaohui Foong
In this work, we design a UAV with a single rotating camera to accomplish the task.
4 code implementations • ICCV 2019 • Khoi-Nguyen C. Mac, Dhiraj Joshi, Raymond A. Yeh, JinJun Xiong, Rogerio S. Feris, Minh N. Do
Fine-grained action detection is an important task with numerous applications in robotics and human-computer interaction.
no code implementations • 27 Jun 2018 • Ramanpreet Singh Pahwa, Wei Kiat Leong, Shaohui Foong, Karianto Leman, Minh N. Do
Traditional image stitching algorithms use transforms such as homography to combine different views of a scene.
2 code implementations • 31 May 2018 • Benjamin Chidester, Minh N. Do, Jian Ma
Performance of neural networks can be significantly improved by encoding known invariance for particular tasks.
no code implementations • NeurIPS 2017 • Raymond A. Yeh, JinJun Xiong, Wen-mei W. Hwu, Minh N. Do, Alexander G. Schwing
Textual grounding is an important but challenging task for human-computer interaction, robotics and knowledge mining.
no code implementations • CVPR 2018 • Raymond A. Yeh, Minh N. Do, Alexander G. Schwing
Textual grounding, i.e., linking words to objects in images, is a challenging but important task for robotics and human-computer interaction.
1 code implementation • 25 Feb 2018 • Mona Zehni, Minh N. Do, Zhizhen Zhao
Instead of trying to locate the segment within the sequence through pair-wise matching, we propose a new approach that uses shift-invariant features to estimate both the underlying signal and the distribution of the positions of the segments.
Signal Processing
no code implementations • 19 Dec 2017 • Ramanpreet Singh Pahwa, Tian Tsong Ng, Minh N. Do
3D object proposals, quickly detected regions in a 3D scene that likely contain an object of interest, are an effective approach to improve the computational efficiency and accuracy of the object detection framework.
no code implementations • 8 Sep 2017 • Ramanpreet Singh Pahwa, Jiangbo Lu, Nianjuan Jiang, Tian Tsong Ng, Minh N. Do
Using efficient but robust registration enables us to combine multiple frames of a scene in near real time and generate 3D bounding boxes for potential 3D regions of interest.
no code implementations • 8 Sep 2017 • Ramanpreet Singh Pahwa, Minh N. Do, Tian Tsong Ng, Binh-Son Hua
Depth sensing devices have created various new applications in scientific and commercial research with the advent of Microsoft Kinect and PMD (Photon Mixing Device) cameras.
no code implementations • 15 Sep 2016 • Johann A. Bengua, Ho N. Phien, Hoang D. Tuan, Minh N. Do
This paper introduces matrix product state (MPS) decomposition as a new and systematic method to compress multidimensional data represented by higher-order tensors.
7 code implementations • CVPR 2017 • Raymond A. Yeh, Chen Chen, Teck Yian Lim, Alexander G. Schwing, Mark Hasegawa-Johnson, Minh N. Do
In this paper, we propose a novel method for semantic image inpainting, which generates the missing content by conditioning on the available data.
no code implementations • 14 Jul 2016 • Johann A. Bengua, Hoang D. Tuan, Ho N. Phien, Minh N. Do
The proposed framework performs image completion by concatenating copies of a single image that has missing entries into a third-order tensor, applying a dimensionality augmentation technique to the tensor, utilizing a tensor completion algorithm for recovering its missing entries, and finally extracting the recovered image from the tensor.
no code implementations • 5 Jun 2016 • Johann A. Bengua, Ho N. Phien, Hoang D. Tuan, Minh N. Do
The approach is based on the tensor train (TT) rank, which is able to capture hidden information from tensors thanks to its definition from a well-balanced matricization scheme.
Numerical Analysis • Data Structures and Algorithms
no code implementations • 27 Apr 2016 • Seungryong Kim, Dongbo Min, Bumsub Ham, Minh N. Do, Kwanghoon Sohn
In this paper, we propose a novel dense descriptor, called dense adaptive self-correlation (DASC), to estimate multi-modal and multi-spectral dense correspondences.
no code implementations • ICCV 2015 • Yu Li, Dongbo Min, Michael S. Brown, Minh N. Do, Jiangbo Lu
However, the quality of the PMBP solution is tightly coupled with the local window size, over which the raw data cost is aggregated to mitigate ambiguity in the data constraint.
no code implementations • ICCV 2015 • Siying Liu, Tian-Tsong Ng, Kalyan Sunkavalli, Minh N. Do, Eli Shechtman, Nathan Carr
In this work, we investigate the problem of automatically inferring the lattice structure of near-regular textures (NRT) in real-world images.
no code implementations • CVPR 2015 • Nianjuan Jiang, Daniel Lin, Minh N. Do, Jiangbo Lu
Most conventional structure-from-motion (SFM) techniques require camera pose estimation before computing any scene structure.
no code implementations • CVPR 2015 • Seungryong Kim, Dongbo Min, Bumsub Ham, Seungchul Ryu, Minh N. Do, Kwanghoon Sohn
To further improve the matching quality and runtime efficiency, we propose patch-wise receptive field pooling, in which a sampling pattern is optimized with discriminative learning.
no code implementations • 20 Apr 2015 • Yu Zhang, Xiu-Shen Wei, Jianxin Wu, Jianfei Cai, Jiangbo Lu, Viet-Anh Nguyen, Minh N. Do
Most existing works rely heavily on object/part detectors to build the correspondence between object parts by using object or object part annotations inside training images.
no code implementations • 2 Mar 2015 • Johann A. Bengua, Ho N. Phien, Hoang D. Tuan, Minh N. Do
This paper introduces matrix product state (MPS) decomposition as a computational tool for extracting features of multidimensional data represented by higher-order tensors.
no code implementations • CVPR 2013 • Jiangbo Lu, Hongsheng Yang, Dongbo Min, Minh N. Do
Recent studies on fast cost volume filtering based on efficient edge-aware filters have provided a fast alternative to solve discrete labeling problems, with the complexity independent of the support window size.