Search Results for author: Thanh-Toan Do

Found 72 papers, 28 papers with code

Connective Viewpoints of Signal-to-Noise Diffusion Models

no code implementations • 8 Aug 2024 • Khanh Doan, Long Tung Vuong, Tuan Nguyen, Anh Tuan Bui, Quyen Tran, Thanh-Toan Do, Dinh Phung, Trung Le

Diffusion models (DM) have become fundamental components of generative models, excelling across various domains such as image creation, audio generation, and complex data interpolation.

Audio Generation

MetaAug: Meta-Data Augmentation for Post-Training Quantization

2 code implementations • 20 Jul 2024 • Cuong Pham, Hoang Anh Dung, Cuong C. Nguyen, Trung Le, Dinh Phung, Gustavo Carneiro, Thanh-Toan Do

The transformation network modifies the original calibration data and the modified data will be used as the training set to learn the quantized model with the objective that the quantized model achieves a good performance on the original calibration data.

Data Augmentation Meta-Learning +1

Learning to Complement and to Defer to Multiple Users

no code implementations • 9 Jul 2024 • Zheng Zhang, Wenjie Ai, Kevin Wells, David Rosewarne, Thanh-Toan Do, Gustavo Carneiro

This process has three options: 1) AI autonomously classifies, 2) learning to complement, where AI collaborates with users, and 3) learning to defer, where AI defers to users.

Decision Making

Agnostic Sharpness-Aware Minimization

no code implementations • 11 Jun 2024 • Van-Anh Nguyen, Quyen Tran, Tuan Truong, Thanh-Toan Do, Dinh Phung, Trung Le

Sharpness-aware minimization (SAM) has been instrumental in improving deep neural network training by minimizing both the training loss and the sharpness of the loss landscape, leading the model into flatter minima that are associated with better generalization properties.

Meta-Learning

Conditional Distribution Modelling for Few-Shot Image Synthesis with Diffusion Models

no code implementations • 25 Apr 2024 • Parul Gupta, Munawar Hayat, Abhinav Dhall, Thanh-Toan Do

Few-shot image synthesis entails generating diverse and realistic images of novel categories using only a few example images.

Diversity Image Generation

Frequency Attention for Knowledge Distillation

1 code implementation • 9 Mar 2024 • Cuong Pham, Van-Anh Nguyen, Trung Le, Dinh Phung, Gustavo Carneiro, Thanh-Toan Do

Inspired by the benefits of the frequency domain, we propose a novel module that functions as an attention mechanism in the frequency domain.

Image Classification Knowledge Distillation +3

DiffAugment: Diffusion based Long-Tailed Visual Relationship Recognition

no code implementations • 1 Jan 2024 • Parul Gupta, Tuan Nguyen, Abhinav Dhall, Munawar Hayat, Trung Le, Thanh-Toan Do

The task of Visual Relationship Recognition (VRR) aims to identify relationships between two interacting objects in an image and is particularly challenging due to the widely-spread and highly imbalanced distribution of <subject, relation, object> triplets.

Object Relation

Learning to Complement with Multiple Humans

no code implementations • 22 Nov 2023 • Zheng Zhang, Cuong Nguyen, Kevin Wells, Thanh-Toan Do, Gustavo Carneiro

The ill-posedness of the LNL task requires the adoption of strong assumptions or the use of multiple noisy labels per training image, resulting in accurate models that work well in isolation but fail to optimise human-AI collaborative classification (HAI-CC).

Learning with noisy labels

Unveiling Camouflage: A Learnable Fourier-based Augmentation for Camouflaged Object Detection and Instance Segmentation

no code implementations • 29 Aug 2023 • Minh-Quan Le, Minh-Triet Tran, Trung-Nghia Le, Tam V. Nguyen, Thanh-Toan Do

Camouflaged object detection (COD) and camouflaged instance segmentation (CIS) aim to recognize and segment objects that are blended into their surroundings, respectively.

Diversity Generative Adversarial Network +4

Optimal Transport Model Distributional Robustness

1 code implementation • NeurIPS 2023 • Van-Anh Nguyen, Trung Le, Anh Tuan Bui, Thanh-Toan Do, Dinh Phung

Interestingly, our developed theories allow us to flexibly incorporate the concept of sharpness awareness into training, whether it's a single model, ensemble models, or Bayesian Neural Networks, by considering specific forms of the center model distribution.

Instance-dependent Noisy-label Learning with Graphical Model Based Noise-rate Estimation

1 code implementation • 31 May 2023 • Arpit Garg, Cuong Nguyen, Rafael Felix, Thanh-Toan Do, Gustavo Carneiro

To address IDN, Label Noise Learning (LNL) incorporates a sample selection stage to differentiate clean and noisy-label samples.

Instance-level Few-shot Learning with Class Hierarchy Mining

1 code implementation • 15 Apr 2023 • Anh-Khoa Nguyen Vu, Thanh-Toan Do, Nhat-Duy Nguyen, Vinh-Tiep Nguyen, Thanh Duc Ngo, Tam V. Nguyen

In this paper, we exploit the hierarchical information to leverage discriminative and relevant features of base classes to effectively classify novel objects.

Few-shot Instance Segmentation Few-Shot Learning +2

PASS: Peer-Agreement based Sample Selection for training with Noisy Labels

no code implementations • 20 Mar 2023 • Arpit Garg, Cuong Nguyen, Rafael Felix, Thanh-Toan Do, Gustavo Carneiro

The prevalence of noisy-label samples poses a significant challenge in deep learning, inducing overfitting effects.

MaskDiff: Modeling Mask Distribution with Diffusion Probabilistic Model for Few-Shot Instance Segmentation

1 code implementation • 9 Mar 2023 • Minh-Quan Le, Tam V. Nguyen, Trung-Nghia Le, Thanh-Toan Do, Minh N. Do, Minh-Triet Tran

To overcome the disadvantage of the point estimation mechanism, we propose a novel approach, dubbed MaskDiff, which models the underlying conditional distribution of a binary mask, which is conditioned on an object region and $K$-shot information.

Few-shot Instance Segmentation Few-Shot Learning +2

Towards the Identifiability in Noisy Label Learning: A Multinomial Mixture Approach

no code implementations • 4 Jan 2023 • Cuong Nguyen, Thanh-Toan Do, Gustavo Carneiro

To meet this requirement without relying on additional $2C - 2$ manual annotations per instance, we propose a method that automatically generates additional noisy labels by estimating the noisy label distribution based on nearest neighbours.

Task Weighting in Meta-learning with Trajectory Optimisation

no code implementations • 4 Jan 2023 • Cuong Nguyen, Thanh-Toan Do, Gustavo Carneiro

Developing meta-learning algorithms that are un-biased toward a subset of training tasks often requires hand-designed criteria to weight tasks, potentially resulting in sub-optimal solutions.

Few-Shot Learning

Collaborative Multi-Teacher Knowledge Distillation for Learning Low Bit-width Deep Neural Networks

no code implementations • 27 Oct 2022 • Cuong Pham, Tuan Hoang, Thanh-Toan Do

Knowledge distillation, which learns a lightweight student model by distilling knowledge from a cumbersome teacher model, is an attractive approach for learning compact deep neural networks (DNNs).

Knowledge Distillation Quantization

Vision Transformer Visualization: What Neurons Tell and How Neurons Behave?

1 code implementation • 14 Oct 2022 • Van-Anh Nguyen, Khanh Pham Dinh, Long Tung Vuong, Thanh-Toan Do, Quan Hung Tran, Dinh Phung, Trung Le

Our approach departs from the computational process of ViTs with a focus on visualizing the local and global information in input images and the latent feature embeddings at multiple levels.

Instance-Dependent Noisy Label Learning via Graphical Modelling

1 code implementation • 2 Sep 2022 • Arpit Garg, Cuong Nguyen, Rafael Felix, Thanh-Toan Do, Gustavo Carneiro

Noisy labels are unavoidable yet troublesome in the ecosystem of deep learning because models can easily overfit them.

Learning with noisy labels

Multi-Modal Mutual Information Maximization: A Novel Approach for Unsupervised Deep Cross-Modal Hashing

no code implementations • 13 Dec 2021 • Tuan Hoang, Thanh-Toan Do, Tam V. Nguyen, Ngai-Man Cheung

First, to learn informative representations that can preserve both intra- and inter-modal similarities, we leverage recent advances in estimating the variational lower bound of MI to maximize the MI between the binary representations and input features and between binary representations of different modalities.

Cross-Modal Retrieval Retrieval

Logic Rules Meet Deep Learning: A Novel Approach for Ship Type Classification

1 code implementation • 1 Nov 2021 • Manolis Pitsikalis, Thanh-Toan Do, Alexei Lisitsa, Shan Luo

The shipping industry is an important component of global trade and the economy; however, to ensure legal compliance and safety, it needs to be monitored.

Probabilistic task modelling for meta-learning

1 code implementation • 9 Jun 2021 • Cuong C. Nguyen, Thanh-Toan Do, Gustavo Carneiro

We propose probabilistic task modelling -- a generative probabilistic model for collections of tasks used in meta-learning.

Meta-Learning Variational Inference

Similarity of Classification Tasks

1 code implementation • 27 Jan 2021 • Cuong Nguyen, Thanh-Toan Do, Gustavo Carneiro

Recent advances in meta-learning have led to remarkable performance on several few-shot learning benchmarks.

Classification Few-Shot Learning +1

Direct Quantization for Training Highly Accurate Low Bit-width Deep Neural Networks

no code implementations • 26 Dec 2020 • Tuan Hoang, Thanh-Toan Do, Tam V. Nguyen, Ngai-Man Cheung

With this approach, we can learn activation quantizers that minimize the quantization errors in the majority of channels.

Image Classification Quantization

Deep Metric Learning Meets Deep Clustering: An Novel Unsupervised Approach for Feature Embedding

1 code implementation • 9 Sep 2020 • Binh X. Nguyen, Binh D. Nguyen, Gustavo Carneiro, Erman Tjiputra, Quang D. Tran, Thanh-Toan Do

Based on pseudo labels, we propose a novel unsupervised metric loss which enforces the positive concentration and negative separation of samples in the embedding space.

Benchmarking Clustering +2

Unsupervised Deep Cross-modality Spectral Hashing

no code implementations • 1 Aug 2020 • Tuan Hoang, Thanh-Toan Do, Tam V. Nguyen, Ngai-Man Cheung

This paper presents a novel framework, namely Deep Cross-modality Spectral Hashing (DCSH), to tackle the unsupervised learning problem of binary hash codes for efficient cross-modal retrieval.

Cross-Modal Retrieval Retrieval +1

MirrorNet: Bio-Inspired Camouflaged Object Segmentation

no code implementations • Pattern Recognition Journal 2020 • Jinnan Yan, Trung-Nghia Le, Khanh-Duy Nguyen, Minh-Triet Tran, Thanh-Toan Do, Tam V. Nguyen

Unlike existing networks for segmentation, our proposed network possesses two segmentation streams: the main stream and the mirror stream, corresponding to the original image and its flipped image, respectively.

Camouflaged Object Segmentation Camouflage Segmentation +3

Sensor Data for Human Activity Recognition: Feature Representation and Benchmarking

no code implementations • 15 May 2020 • Flávia Alves, Martin Gairing, Frans A. Oliehoek, Thanh-Toan Do

In HAR, the development of Activity Recognition models is dependent upon the data captured by these devices and the methods used to analyse them, which directly affect performance metrics.

Benchmarking Human Activity Recognition

Overcoming Data Limitation in Medical Visual Question Answering

2 code implementations • 26 Sep 2019 • Binh D. Nguyen, Thanh-Toan Do, Binh X. Nguyen, Tuong Do, Erman Tjiputra, Quang D. Tran

Traditional approaches for Visual Question Answering (VQA) require a large amount of labeled data for training.

Ranked #11 on Medical Visual Question Answering on VQA-RAD (Open-ended Accuracy metric, using extra training data)

Denoising Medical Visual Question Answering +3

Scalable Place Recognition Under Appearance Change for Autonomous Driving

no code implementations • ICCV 2019 • Anh-Dzung Doan, Yasir Latif, Tat-Jun Chin, Yu Liu, Thanh-Toan Do, Ian Reid

Our experiments show that, compared to state-of-the-art techniques, our method has much greater potential for large-scale place recognition for autonomous driving.

Autonomous Driving Visual Place Recognition

Uncertainty in Model-Agnostic Meta-Learning using Variational Inference

1 code implementation • 27 Jul 2019 • Cuong Nguyen, Thanh-Toan Do, Gustavo Carneiro

We introduce a new, rigorously-formulated Bayesian meta-learning algorithm that learns a probability distribution of model parameter prior for few-shot learning.

BIG-bench Machine Learning Classification +5

Bayesian Generative Active Deep Learning

no code implementations • 26 Apr 2019 • Toan Tran, Thanh-Toan Do, Ian Reid, Gustavo Carneiro

Deep learning models have demonstrated outstanding performance in several problems, but their training process tends to require immense amounts of computational and human resources for training and labeling, constraining the types of problems that can be tackled.

Active Learning Data Augmentation

A Theoretically Sound Upper Bound on the Triplet Loss for Improving the Efficiency of Deep Distance Metric Learning

no code implementations • CVPR 2019 • Thanh-Toan Do, Toan Tran, Ian Reid, Vijay Kumar, Tuan Hoang, Gustavo Carneiro

Another approach explored in the field relies on an ad-hoc linearization (in terms of N) of the triplet loss that introduces class centroids, which must be optimized using the whole training set for each mini-batch - this means that a naive implementation of this approach has run-time complexity O(N^2).

Metric Learning Retrieval

SDRSAC: Semidefinite-Based Randomized Approach for Robust Point Cloud Registration without Correspondences

1 code implementation • 6 Apr 2019 • Huu Le, Thanh-Toan Do, Tuan Hoang, Ngai-Man Cheung

In particular, our work enables the use of randomized methods for point cloud registration without the need for putative correspondences.

Graph Matching Point Cloud Registration

V2CNet: A Deep Learning Framework to Translate Videos to Commands for Robotic Manipulation

no code implementations • 23 Mar 2019 • Anh Nguyen, Thanh-Toan Do, Ian Reid, Darwin G. Caldwell, Nikos G. Tsagarakis

We propose V2CNet, a new deep learning framework to automatically translate the demonstration videos to commands that can be directly used in robotic applications.

Decoder

SASSE: Scalable and Adaptable 6-DOF Pose Estimation

no code implementations • 5 Feb 2019 • Huu Le, Tuan Hoang, Qianggong Zhang, Thanh-Toan Do, Anders Eriksson, Michael Milford

In this paper, we present a novel 6-DOF localization system that for the first time simultaneously achieves all the three characteristics: significantly sub-linear storage growth, agnosticism to image descriptors, and customizability to available storage and computational resources.

Benchmarking Pose Estimation +1

Learning Pairwise Relationship for Multi-object Detection in Crowded Scenes

no code implementations • 12 Jan 2019 • Yu Liu, Lingqiao Liu, Hamid Rezatofighi, Thanh-Toan Do, Qinfeng Shi, Ian Reid

As the post-processing step for object detection, non-maximum suppression (GreedyNMS) has been widely used in most detectors for many years.

Object Detection

A Binary Optimization Approach for Constrained K-Means Clustering

1 code implementation • 24 Oct 2018 • Huu Le, Anders Eriksson, Thanh-Toan Do, Michael Milford

This approach allows us to solve constrained K-Means where multiple types of constraints can be simultaneously enforced.

Clustering

MediaEval 2018: Predicting Media Memorability Task

no code implementations • 3 Jul 2018 • Romain Cohendet, Claire-Hélène Demarty, Ngoc Duong, Mats Sjöberg, Bogdan Ionescu, Thanh-Toan Do, France Rennes

In this paper, we present the Predicting Media Memorability task, which is proposed as part of the MediaEval 2018 Benchmarking Initiative for Multimedia Evaluation.

Benchmarking Memorization

G2D: from GTA to Data

2 code implementations • 16 Jun 2018 • Anh-Dzung Doan, Abdul Mohsi Jawaid, Thanh-Toan Do, Tat-Jun Chin

This document describes G2D, a software tool that enables capturing videos from Grand Theft Auto V (GTA V), a popular role-playing game set in an expansive virtual city.

3D Reconstruction Autonomous Driving +2

Bayesian Semantic Instance Segmentation in Open Set World

no code implementations • ECCV 2018 • Trung Pham, Vijay Kumar B G, Thanh-Toan Do, Gustavo Carneiro, Ian Reid

In this paper, we present a novel open-set semantic instance segmentation approach capable of segmenting all known and unknown object classes in images, based on the output of an object detector trained on known object classes.

Instance Segmentation Object +2

Object Captioning and Retrieval with Natural Language

1 code implementation • 16 Mar 2018 • Anh Nguyen, Thanh-Toan Do, Ian Reid, Darwin G. Caldwell, Nikos G. Tsagarakis

The key idea of our approach is the use of object descriptions to provide a detailed understanding of an object.

Object Retrieval

Deep-6DPose: Recovering 6D Object Pose from a Single RGB Image

no code implementations • 28 Feb 2018 • Thanh-Toan Do, Ming Cai, Trung Pham, Ian Reid

Detecting objects and their 6D poses from only RGB images is an important task for many robotic applications.

Benchmarking Instance Segmentation +5

Binary Constrained Deep Hashing Network for Image Retrieval without Manual Annotation

no code implementations • 21 Feb 2018 • Thanh-Toan Do, Tuan Hoang, Dang-Khoa Le Tan, Trung Pham, Huu Le, Ngai-Man Cheung, Ian Reid

However, training deep hashing networks for the task is challenging due to the binary constraints on the hash codes, the similarity preserving property, and the requirement for a vast amount of labelled images.

Deep Hashing Image Retrieval +1

Simultaneous Compression and Quantization: A Joint Approach for Efficient Unsupervised Hashing

no code implementations • 19 Feb 2018 • Tuan Hoang, Thanh-Toan Do, Huu Le, Dang-Khoa Le-Tan, Ngai-Man Cheung

For unsupervised data-dependent hashing, the two most important requirements are to preserve similarity in the low-dimensional feature space and to minimize the binary quantization loss.

Image Retrieval Quantization +1

On-device Scalable Image-based Localization via Prioritized Cascade Search and Fast One-Many RANSAC

no code implementations • 10 Feb 2018 • Ngoc-Trung Tran, Dang-Khoa Le Tan, Anh-Dzung Doan, Thanh-Toan Do, Tuan-Anh Bui, Mengxuan Tan, Ngai-Man Cheung

In order to overcome the resource constraints of mobile devices, we propose a system design that leverages the scalability advantage of image retrieval and accuracy of 3D model-based localization.

Image-Based Localization Image Retrieval +2

From Selective Deep Convolutional Features to Compact Binary Representations for Image Retrieval

1 code implementation • 7 Feb 2018 • Thanh-Toan Do, Tuan Hoang, Dang-Khoa Le Tan, Huu Le, Tam V. Nguyen, Ngai-Man Cheung

In the large-scale image retrieval task, the two most important requirements are the discriminability of image representations and the efficiency in computation and storage of representations.

Image Retrieval Retrieval

Compact Hash Code Learning with Binary Deep Neural Network

no code implementations • 8 Dec 2017 • Thanh-Toan Do, Tuan Hoang, Dang-Khoa Le Tan, Anh-Dzung Doan, Ngai-Man Cheung

This design has overcome a challenging problem in some previous works: optimizing non-smooth objective functions because of binarization.

Binarization Deep Hashing +1

Supervised Hashing with End-to-End Binary Deep Neural Network

no code implementations • 24 Nov 2017 • Dang-Khoa Le Tan, Thanh-Toan Do, Ngai-Man Cheung

Image hashing is a popular technique applied to large scale content-based visual retrieval due to its compact and efficient binary codes.

Image Retrieval Retrieval

Deterministic Approximate Methods for Maximum Consensus Robust Fitting

1 code implementation • 27 Oct 2017 • Huu Le, Tat-Jun Chin, Anders Eriksson, Thanh-Toan Do, David Suter

Further, our approach is naturally applicable to estimation problems with geometric residuals.

AffordanceNet: An End-to-End Deep Learning Approach for Object Affordance Detection

2 code implementations • 21 Sep 2017 • Thanh-Toan Do, Anh Nguyen, Ian Reid

We propose AffordanceNet, a new deep learning approach to simultaneously detect multiple objects and their affordances from RGB images.

Affordance Detection Object +2

SceneCut: Joint Geometric and Object Segmentation for Indoor Scenes

no code implementations • 21 Sep 2017 • Trung Pham, Thanh-Toan Do, Niko Sünderhauf, Ian Reid

This paper presents SceneCut, a novel approach to jointly discover previously unseen objects and non-object surfaces using a single RGB-D image.

Object Semantic Segmentation

Real-Time 6DOF Pose Relocalization for Event Cameras with Stacked Spatial LSTM Networks

1 code implementation • 22 Aug 2017 • Anh Nguyen, Thanh-Toan Do, Darwin G. Caldwell, Nikos G. Tsagarakis

Our method first creates the event image from a list of events that occur in a very short time interval; a Stacked Spatial LSTM Network (SP-LSTM) is then used to learn the camera pose.

Selective Deep Convolutional Features for Image Retrieval

1 code implementation • 4 Jul 2017 • Tuan Hoang, Thanh-Toan Do, Dang-Khoa Le Tan, Ngai-Man Cheung

Recent work adopts fine-tuned strategies to further improve the discriminative power of the descriptors.

Image Retrieval Retrieval

Binary Hashing with Semidefinite Relaxation and Augmented Lagrangian

no code implementations • 19 Jul 2016 • Thanh-Toan Do, Anh-Dzung Doan, Duc-Thanh Nguyen, Ngai-Man Cheung

This paper proposes two approaches for inferring binary codes in two-step (supervised, unsupervised) hashing.

Learning to Hash with Binary Deep Neural Network

no code implementations • 18 Jul 2016 • Thanh-Toan Do, Anh-Dzung Doan, Ngai-Man Cheung

Our resulting optimization with these binary, independence, and balance constraints is difficult to solve.

Binarization

Embedding based on function approximation for large scale image search

no code implementations • 23 May 2016 • Thanh-Toan Do, Ngai-Man Cheung

The objective of this paper is to design an embedding method that maps local features describing an image (e.g., SIFT) to a higher dimensional representation useful for the image retrieval problem.

Image Retrieval Retrieval

Discrete Hashing with Deep Neural Network

no code implementations • 28 Aug 2015 • Thanh-Toan Do, Anh-Zung Doan, Ngai-Man Cheung

This paper addresses the problem of learning binary hash codes for large scale image search by proposing a novel hashing method based on deep neural networks.

Image Retrieval

FAemb: A Function Approximation-Based Embedding Method for Image Retrieval

no code implementations • CVPR 2015 • Thanh-Toan Do, Quang D. Tran, Ngai-Man Cheung

The embedded vectors resulting from the function approximation process are then aggregated to form a single representation used in the image retrieval framework.

Image Retrieval Retrieval
