no code implementations • 8 Aug 2024 • Khanh Doan, Long Tung Vuong, Tuan Nguyen, Anh Tuan Bui, Quyen Tran, Thanh-Toan Do, Dinh Phung, Trung Le
Diffusion models (DM) have become fundamental components of generative models, excelling across various domains such as image creation, audio generation, and complex data interpolation.
2 code implementations • 20 Jul 2024 • Cuong Pham, Hoang Anh Dung, Cuong C. Nguyen, Trung Le, Dinh Phung, Gustavo Carneiro, Thanh-Toan Do
The transformation network modifies the original calibration data, and the modified data is then used as the training set to learn the quantized model, with the objective that the quantized model performs well on the original calibration data.
no code implementations • 9 Jul 2024 • Zheng Zhang, Wenjie Ai, Kevin Wells, David Rosewarne, Thanh-Toan Do, Gustavo Carneiro
This process has three options: 1) AI autonomously classifies, 2) learning to complement, where AI collaborates with users, and 3) learning to defer, where AI defers to users.
no code implementations • NeurIPS 2023 • Cuong Pham, Cuong C. Nguyen, Trung Le, Dinh Phung, Gustavo Carneiro, Thanh-Toan Do
Bayesian Neural Networks (BNNs) offer probability distributions for model parameters, enabling uncertainty quantification in predictions.
no code implementations • 18 Jun 2024 • Tuan-Luc Huynh, Thuy-Trang Vu, Weiqing Wang, Yinwei Wei, Trung Le, Dragan Gasevic, Yuan-Fang Li, Thanh-Toan Do
Differentiable Search Index (DSI) utilizes Pre-trained Language Models (PLMs) for efficient document retrieval without relying on external indexes.
no code implementations • 11 Jun 2024 • Van-Anh Nguyen, Quyen Tran, Tuan Truong, Thanh-Toan Do, Dinh Phung, Trung Le
Sharpness-aware minimization (SAM) has been instrumental in improving deep neural network training by minimizing both the training loss and the sharpness of the loss landscape, leading the model into flatter minima that are associated with better generalization properties.
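As a rough illustration of the idea in the sentence above, the following is a minimal sketch of one commonly described SAM update: take an ascent step of radius `rho` along the normalized gradient, then apply the gradient measured at that perturbed point to the original weights. The toy quadratic loss, the function names, and the values of `lr` and `rho` are assumptions made for this example and are not taken from the listed paper.

```python
import math

def sam_step(w, grad_fn, lr=0.1, rho=0.05):
    """One sharpness-aware update on a list of floats `w` (illustrative sketch).

    grad_fn(w) must return the gradient of the training loss at w.
    """
    g = grad_fn(w)
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm == 0.0:
        return list(w)  # stationary point: no ascent direction to follow
    # Ascent step: approximate worst-case point inside a ball of radius rho.
    w_adv = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    # Descent step: use the gradient at the perturbed point, applied to w.
    g_adv = grad_fn(w_adv)
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]

# Hypothetical toy problem: an anisotropic quadratic bowl.
def toy_loss(w):
    return 0.5 * (w[0] ** 2 + 4.0 * w[1] ** 2)

def toy_grad(w):
    return [w[0], 4.0 * w[1]]

w = [1.0, 1.0]
for _ in range(200):
    w = sam_step(w, toy_grad)
```

With a fixed `rho`, the iterate settles into a small neighbourhood of the minimum rather than converging exactly, because the perturbation never shrinks to zero.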
no code implementations • 25 Apr 2024 • Parul Gupta, Munawar Hayat, Abhinav Dhall, Thanh-Toan Do
Few-shot image synthesis entails generating diverse and realistic images of novel categories using only a few example images.
1 code implementation • 9 Mar 2024 • Cuong Pham, Van-Anh Nguyen, Trung Le, Dinh Phung, Gustavo Carneiro, Thanh-Toan Do
Inspired by the benefits of the frequency domain, we propose a novel module that functions as an attention mechanism in the frequency domain.
no code implementations • 1 Jan 2024 • Parul Gupta, Tuan Nguyen, Abhinav Dhall, Munawar Hayat, Trung Le, Thanh-Toan Do
The task of Visual Relationship Recognition (VRR) aims to identify relationships between two interacting objects in an image and is particularly challenging due to the widely spread and highly imbalanced distribution of <subject, relation, object> triplets.
no code implementations • 22 Nov 2023 • Zheng Zhang, Cuong Nguyen, Kevin Wells, Thanh-Toan Do, Gustavo Carneiro
The ill-posedness of the LNL task requires the adoption of strong assumptions or the use of multiple noisy labels per training image, resulting in accurate models that work well in isolation but fail to optimise human-AI collaborative classification (HAI-CC).
no code implementations • 29 Aug 2023 • Minh-Quan Le, Minh-Triet Tran, Trung-Nghia Le, Tam V. Nguyen, Thanh-Toan Do
Camouflaged object detection (COD) and camouflaged instance segmentation (CIS) aim to recognize and segment objects that are blended into their surroundings, respectively.
no code implementations • 29 Aug 2023 • Anh-Khoa Nguyen Vu, Thanh-Toan Do, Vinh-Tiep Nguyen, Tam Le, Minh-Triet Tran, Tam V. Nguyen
Our overarching goal is to train a generator that captures the data variations of the base dataset.
1 code implementation • NeurIPS 2023 • Van-Anh Nguyen, Trung Le, Anh Tuan Bui, Thanh-Toan Do, Dinh Phung
Interestingly, our developed theories allow us to flexibly incorporate the concept of sharpness awareness into training, whether it's a single model, ensemble models, or Bayesian Neural Networks, by considering specific forms of the center model distribution.
1 code implementation • 31 May 2023 • Arpit Garg, Cuong Nguyen, Rafael Felix, Thanh-Toan Do, Gustavo Carneiro
To address IDN, Label Noise Learning (LNL) incorporates a sample selection stage to differentiate clean and noisy-label samples.
1 code implementation • 15 Apr 2023 • Thanh-Danh Nguyen, Anh-Khoa Nguyen Vu, Nhat-Duy Nguyen, Vinh-Tiep Nguyen, Thanh Duc Ngo, Thanh-Toan Do, Minh-Triet Tran, Tam V. Nguyen
Camouflaged object detection and segmentation is a new and challenging research topic in computer vision.
1 code implementation • 15 Apr 2023 • Anh-Khoa Nguyen Vu, Thanh-Toan Do, Nhat-Duy Nguyen, Vinh-Tiep Nguyen, Thanh Duc Ngo, Tam V. Nguyen
In this paper, we exploit the hierarchical information to leverage discriminative and relevant features of base classes to effectively classify novel objects.
no code implementations • 20 Mar 2023 • Arpit Garg, Cuong Nguyen, Rafael Felix, Thanh-Toan Do, Gustavo Carneiro
The prevalence of noisy-label samples poses a significant challenge in deep learning, inducing overfitting effects.
1 code implementation • 9 Mar 2023 • Minh-Quan Le, Tam V. Nguyen, Trung-Nghia Le, Thanh-Toan Do, Minh N. Do, Minh-Triet Tran
To overcome the disadvantage of the point estimation mechanism, we propose a novel approach, dubbed MaskDiff, that models the underlying conditional distribution of a binary mask conditioned on an object region and $K$-shot information.
no code implementations • 4 Jan 2023 • Cuong Nguyen, Thanh-Toan Do, Gustavo Carneiro
To meet this requirement without relying on additional $2C - 2$ manual annotations per instance, we propose a method that automatically generates additional noisy labels by estimating the noisy label distribution based on nearest neighbours.
no code implementations • 4 Jan 2023 • Cuong Nguyen, Thanh-Toan Do, Gustavo Carneiro
Developing meta-learning algorithms that are unbiased toward a subset of training tasks often requires hand-designed criteria to weight tasks, potentially resulting in sub-optimal solutions.
no code implementations • 27 Oct 2022 • Cuong Pham, Tuan Hoang, Thanh-Toan Do
Knowledge distillation, which learns a lightweight student model by distilling knowledge from a cumbersome teacher model, is an attractive approach for learning compact deep neural networks (DNNs).
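For readers unfamiliar with the term, a minimal sketch of a widely used distillation objective follows: the KL divergence between temperature-softened teacher and student distributions, scaled by the squared temperature. This is a generic formulation and should not be read as the specific method of the listed paper; the function names and the default temperature are assumptions chosen for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        pt * math.log(pt / ps)
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0.0
    )
    return temperature ** 2 * kl

# Hypothetical logits for a three-class problem.
teacher = [5.0, 1.0, -2.0]
student = [2.0, 2.0, 0.0]
loss = distillation_loss(student, teacher)
```

In practice this term is usually combined with the ordinary cross-entropy on ground-truth labels; the weighting between the two is a tuning choice.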
1 code implementation • 14 Oct 2022 • Van-Anh Nguyen, Khanh Pham Dinh, Long Tung Vuong, Thanh-Toan Do, Quan Hung Tran, Dinh Phung, Trung Le
Our approach departs from the computational process of ViTs with a focus on visualizing the local and global information in input images and the latent feature embeddings at multiple levels.
1 code implementation • 2 Sep 2022 • Arpit Garg, Cuong Nguyen, Rafael Felix, Thanh-Toan Do, Gustavo Carneiro
Noisy labels are unavoidable yet troublesome in the ecosystem of deep learning because models can easily overfit them.
Ranked #1 on Learning with noisy labels on CIFAR-100
no code implementations • 13 Dec 2021 • Tuan Hoang, Thanh-Toan Do, Tam V. Nguyen, Ngai-Man Cheung
First, to learn informative representations that can preserve both intra- and inter-modal similarities, we leverage the recent advances in estimating variational lower-bound of MI to maximize the MI between the binary representations and input features and between binary representations of different modalities.
1 code implementation • 1 Nov 2021 • Manolis Pitsikalis, Thanh-Toan Do, Alexei Lisitsa, Shan Luo
The shipping industry is an important component of global trade and the economy; however, to ensure legal compliance and safety, it needs to be monitored.
1 code implementation • 9 Jun 2021 • Cuong C. Nguyen, Thanh-Toan Do, Gustavo Carneiro
We propose probabilistic task modelling -- a generative probabilistic model for collections of tasks used in meta-learning.
no code implementations • 31 Mar 2021 • Trung-Nghia Le, Yubo Cao, Tan-Cong Nguyen, Minh-Quan Le, Khanh-Duy Nguyen, Thanh-Toan Do, Minh-Triet Tran, Tam V. Nguyen
We also provide a benchmark suite for the task of camouflaged instance segmentation.
1 code implementation • 27 Jan 2021 • Cuong Nguyen, Thanh-Toan Do, Gustavo Carneiro
Recent advances in meta-learning have led to remarkable performance on several few-shot learning benchmarks.
no code implementations • 26 Dec 2020 • Tuan Hoang, Thanh-Toan Do, Tam V. Nguyen, Ngai-Man Cheung
With this approach, we can learn activation quantizers that minimize the quantization errors in the majority of channels.
1 code implementation • 23 Sep 2020 • Tuong Do, Binh X. Nguyen, Huy Tran, Erman Tjiputra, Quang D. Tran, Thanh-Toan Do
Different approaches have been proposed to Visual Question Answering (VQA).
1 code implementation • 9 Sep 2020 • Binh X. Nguyen, Binh D. Nguyen, Gustavo Carneiro, Erman Tjiputra, Quang D. Tran, Thanh-Toan Do
Based on pseudo labels, we propose a novel unsupervised metric loss which enforces the positive concentration and negative separation of samples in the embedding space.
no code implementations • 1 Aug 2020 • Tuan Hoang, Thanh-Toan Do, Tam V. Nguyen, Ngai-Man Cheung
This paper presents a novel framework, namely Deep Cross-modality Spectral Hashing (DCSH), to tackle the unsupervised learning problem of binary hash codes for efficient cross-modal retrieval.
no code implementations • Pattern Recognition Journal 2020 • Jinnan Yan, Trung-Nghia Le, Khanh-Duy Nguyen, Minh-Triet Tran, Thanh-Toan Do, Tam V. Nguyen
Unlike existing segmentation networks, our proposed network has two segmentation streams: the main stream and the mirror stream, corresponding to the original image and its flipped image, respectively.
Ranked #8 on Camouflaged Object Segmentation on CAMO
no code implementations • 15 May 2020 • Flávia Alves, Martin Gairing, Frans A. Oliehoek, Thanh-Toan Do
In HAR, the development of Activity Recognition models is dependent upon the data captured by these devices and the methods used to analyse them, which directly affect performance metrics.
1 code implementation • 5 Mar 2020 • Cuong Nguyen, Thanh-Toan Do, Gustavo Carneiro
We introduce a new and rigorously-formulated PAC-Bayes meta-learning algorithm that solves few-shot learning.
1 code implementation • ICCV 2019 • Tuong Do, Thanh-Toan Do, Huy Tran, Erman Tjiputra, Quang D. Tran
In Visual Question Answering (VQA), answers have a great correlation with question meaning and visual contents.
Ranked #2 on Visual Question Answering (VQA) on TDIUC
2 code implementations • 26 Sep 2019 • Binh D. Nguyen, Thanh-Toan Do, Binh X. Nguyen, Tuong Do, Erman Tjiputra, Quang D. Tran
Traditional approaches to Visual Question Answering (VQA) require a large amount of labeled data for training.
Ranked #11 on Medical Visual Question Answering on VQA-RAD (Open-ended Accuracy metric, using extra training data)
no code implementations • ICCV 2019 • Anh-Dzung Doan, Yasir Latif, Tat-Jun Chin, Yu Liu, Thanh-Toan Do, Ian Reid
Our experiments show that, compared to state-of-the-art techniques, our method has much greater potential for large-scale place recognition for autonomous driving.
1 code implementation • 27 Jul 2019 • Cuong Nguyen, Thanh-Toan Do, Gustavo Carneiro
We introduce a new, rigorously-formulated Bayesian meta-learning algorithm that learns a probability distribution of model parameter prior for few-shot learning.
no code implementations • 26 Apr 2019 • Toan Tran, Thanh-Toan Do, Ian Reid, Gustavo Carneiro
Deep learning models have demonstrated outstanding performance in several problems, but their training process tends to require immense amounts of computational and human resources for training and labeling, constraining the types of problems that can be tackled.
no code implementations • 24 Apr 2019 • Thanh-Toan Do, Khoa Le, Tuan Hoang, Huu Le, Tam V. Nguyen, Ngai-Man Cheung
This global vector is then subjected to a hashing function to generate a binary hash code.
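To make the last step concrete, here is a minimal sketch of one simple way a real-valued vector can be turned into a binary code: take the sign of a set of linear projections. The random Gaussian projection used here is an assumption for illustration only (the listed paper learns its hashing function), and all names are hypothetical.

```python
import random

def make_projection(dim, n_bits, seed=0):
    """Random Gaussian projection matrix with n_bits rows (illustrative only)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

def hash_code(vec, projection):
    """One bit per projection row: 1 if the dot product is non-negative, else 0."""
    return [
        1 if sum(w * x for w, x in zip(row, vec)) >= 0.0 else 0
        for row in projection
    ]

def hamming(a, b):
    """Number of positions at which two binary codes differ."""
    return sum(x != y for x, y in zip(a, b))

# Hypothetical 8-dimensional global descriptor hashed to 16 bits.
projection = make_projection(dim=8, n_bits=16)
vec = [0.3, -1.2, 0.5, 0.0, 2.1, -0.7, 0.9, -0.1]
code = hash_code(vec, projection)
```

Retrieval then compares codes by Hamming distance, which is cheap to compute and store compared with distances between the original real-valued vectors.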
no code implementations • CVPR 2019 • Thanh-Toan Do, Toan Tran, Ian Reid, Vijay Kumar, Tuan Hoang, Gustavo Carneiro
Another approach explored in the field relies on an ad-hoc linearization (in terms of N) of the triplet loss that introduces class centroids, which must be optimized using the whole training set for each mini-batch; this means that a naive implementation of this approach has run-time complexity O(N^2).
1 code implementation • 6 Apr 2019 • Huu Le, Thanh-Toan Do, Tuan Hoang, Ngai-Man Cheung
In particular, our work enables the use of randomized methods for point cloud registration without the need of putative correspondences.
no code implementations • 23 Mar 2019 • Anh Nguyen, Thanh-Toan Do, Ian Reid, Darwin G. Caldwell, Nikos G. Tsagarakis
We propose V2CNet, a new deep learning framework to automatically translate the demonstration videos to commands that can be directly used in robotic applications.
no code implementations • 5 Feb 2019 • Huu Le, Tuan Hoang, Qianggong Zhang, Thanh-Toan Do, Anders Eriksson, Michael Milford
In this paper, we present a novel 6-DOF localization system that, for the first time, simultaneously achieves all three characteristics: significantly sub-linear storage growth, agnosticism to image descriptors, and customizability to available storage and computational resources.
no code implementations • 12 Jan 2019 • Yu Liu, Lingqiao Liu, Hamid Rezatofighi, Thanh-Toan Do, Qinfeng Shi, Ian Reid
As the post-processing step for object detection, non-maximum suppression (GreedyNMS) has been widely used in most detectors for many years.
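For reference, the standard greedy procedure can be sketched as follows: sort detections by score, then keep a box only if its overlap with every already-kept box stays at or below a threshold. The box format `(x1, y1, x2, y2)`, the function names, and the 0.5 threshold are assumptions chosen for this example.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep box i only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep

# Two heavily overlapping boxes and one separate box.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = greedy_nms(boxes, scores)  # the second box is suppressed by the first
```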
3 code implementations • 20 Nov 2018 • Anh-Dzung Doan, Yasir Latif, Tat-Jun Chin, Yu Liu, Shin-Fang Ch'ng, Thanh-Toan Do, Ian Reid
Our approaches rely on local features with an encoding technique to represent an image as a single vector.
1 code implementation • 24 Oct 2018 • Huu Le, Anders Eriksson, Thanh-Toan Do, Michael Milford
This approach allows us to solve constrained K-Means where multiple types of constraints can be simultaneously enforced.
no code implementations • 3 Jul 2018 • Romain Cohendet, Claire-Hélène Demarty, Ngoc Duong, Mats Sjöberg, Bogdan Ionescu, Thanh-Toan Do, France Rennes
In this paper, we present the Predicting Media Memorability task, which is proposed as part of the MediaEval 2018 Benchmarking Initiative for Multimedia Evaluation.
2 code implementations • 16 Jun 2018 • Anh-Dzung Doan, Abdul Mohsi Jawaid, Thanh-Toan Do, Tat-Jun Chin
This document describes G2D, software that enables capturing videos from Grand Theft Auto V (GTA V), a popular role-playing game set in an expansive virtual city.
no code implementations • ECCV 2018 • Trung Pham, Vijay Kumar B G, Thanh-Toan Do, Gustavo Carneiro, Ian Reid
In this paper, we present a novel open-set semantic instance segmentation approach capable of segmenting all known and unknown object classes in images, based on the output of an object detector trained on known object classes.
1 code implementation • 16 Mar 2018 • Anh Nguyen, Thanh-Toan Do, Ian Reid, Darwin G. Caldwell, Nikos G. Tsagarakis
The key idea of our approach is the use of object descriptions to provide the detailed understanding of an object.
no code implementations • 28 Feb 2018 • Thanh-Toan Do, Ming Cai, Trung Pham, Ian Reid
Detecting objects and their 6D poses from only RGB images is an important task for many robotic applications.
no code implementations • 21 Feb 2018 • Thanh-Toan Do, Tuan Hoang, Dang-Khoa Le Tan, Trung Pham, Huu Le, Ngai-Man Cheung, Ian Reid
However, training deep hashing networks for the task is challenging due to the binary constraints on the hash codes, the similarity preserving property, and the requirement for a vast amount of labelled images.
no code implementations • 19 Feb 2018 • Tuan Hoang, Thanh-Toan Do, Huu Le, Dang-Khoa Le-Tan, Ngai-Man Cheung
For unsupervised data-dependent hashing, the two most important requirements are to preserve similarity in the low-dimensional feature space and to minimize the binary quantization loss.
no code implementations • 10 Feb 2018 • Ngoc-Trung Tran, Dang-Khoa Le Tan, Anh-Dzung Doan, Thanh-Toan Do, Tuan-Anh Bui, Mengxuan Tan, Ngai-Man Cheung
In order to overcome the resource constraints of mobile devices, we propose a system design that leverages the scalability advantage of image retrieval and accuracy of 3D model-based localization.
1 code implementation • 7 Feb 2018 • Thanh-Toan Do, Tuan Hoang, Dang-Khoa Le Tan, Huu Le, Tam V. Nguyen, Ngai-Man Cheung
In the large-scale image retrieval task, the two most important requirements are the discriminability of image representations and the efficiency in computation and storage of representations.
no code implementations • 8 Dec 2017 • Thanh-Toan Do, Tuan Hoang, Dang-Khoa Le Tan, Anh-Dzung Doan, Ngai-Man Cheung
This design has overcome a challenging problem in some previous works: optimizing non-smooth objective functions because of binarization.
no code implementations • 24 Nov 2017 • Dang-Khoa Le Tan, Thanh-Toan Do, Ngai-Man Cheung
Image hashing is a popular technique applied to large scale content-based visual retrieval due to its compact and efficient binary codes.
1 code implementation • 27 Oct 2017 • Huu Le, Tat-Jun Chin, Anders Eriksson, Thanh-Toan Do, David Suter
Further, our approach is naturally applicable to estimation problems with geometric residuals.
2 code implementations • 21 Sep 2017 • Thanh-Toan Do, Anh Nguyen, Ian Reid
We propose AffordanceNet, a new deep learning approach to simultaneously detect multiple objects and their affordances from RGB images.
no code implementations • 21 Sep 2017 • Trung Pham, Thanh-Toan Do, Niko Sünderhauf, Ian Reid
This paper presents SceneCut, a novel approach to jointly discover previously unseen objects and non-object surfaces using a single RGB-D image.
1 code implementation • 22 Aug 2017 • Anh Nguyen, Thanh-Toan Do, Darwin G. Caldwell, Nikos G. Tsagarakis
Our method first creates the event image from a list of events that occurs in a very short time interval, then a Stacked Spatial LSTM Network (SP-LSTM) is used to learn the camera pose.
1 code implementation • 4 Jul 2017 • Tuan Hoang, Thanh-Toan Do, Dang-Khoa Le Tan, Ngai-Man Cheung
Recent work adopts fine-tuned strategies to further improve the discriminative power of the descriptors.
no code implementations • 6 Apr 2017 • Tuan Hoang, Thanh-Toan Do, Dang-Khoa Le Tan, Ngai-Man Cheung
We introduce a novel approach to improve unsupervised hashing.
no code implementations • CVPR 2017 • Thanh-Toan Do, Dang-Khoa Le Tan, Trung T. Pham, Ngai-Man Cheung
This feature vector is then subjected to a hashing function that produces a binary hash code.
no code implementations • 19 Jul 2016 • Thanh-Toan Do, Anh-Dzung Doan, Duc-Thanh Nguyen, Ngai-Man Cheung
This paper proposes two approaches for inferring binary codes in two-step (supervised, unsupervised) hashing.
no code implementations • 18 Jul 2016 • Thanh-Toan Do, Anh-Dzung Doan, Ngai-Man Cheung
Our resulting optimization with these binary, independence, and balance constraints is difficult to solve.
no code implementations • 23 May 2016 • Thanh-Toan Do, Ngai-Man Cheung
The objective of this paper is to design an embedding method that maps local features describing an image (e.g., SIFT) to a higher-dimensional representation useful for the image retrieval problem.
no code implementations • 6 Jan 2016 • Yiren Zhou, Hossein Nejati, Thanh-Toan Do, Ngai-Man Cheung, Lynette Cheah
We address the vehicle detection and classification problems using Deep Neural Networks (DNNs) approaches.
no code implementations • 28 Aug 2015 • Thanh-Toan Do, Anh-Zung Doan, Ngai-Man Cheung
This paper addresses the problem of learning binary hash codes for large scale image search by proposing a novel hashing method based on deep neural network.
no code implementations • CVPR 2015 • Thanh-Toan Do, Quang D. Tran, Ngai-Man Cheung
The embedded vectors resulting from the function approximation process are then aggregated to form a single representation used in the image retrieval framework.