no code implementations • 4 Mar 2023 • Toan Nguyen, Minh Nhat Vu, An Vuong, Dzung Nguyen, Thieu Vo, Ngan Le, Anh Nguyen
Affordance detection is a challenging problem with a wide variety of robotic applications.
no code implementations • 30 Dec 2022 • Hasan Md Tusfiqur, Duy M. H. Nguyen, Mai T. N. Truong, Triet A. Nguyen, Binh T. Nguyen, Michael Barz, Hans-Juergen Profitlich, Ngoc T. T. Than, Ngan Le, Pengtao Xie, Daniel Sonntag
Diabetic Retinopathy (DR) is a leading cause of vision loss in the world, and early DR detection is necessary to prevent vision loss and support an appropriate treatment.
1 code implementation • 12 Dec 2022 • Khoa Vo, Kashu Yamazaki, Phong X. Nguyen, Phat Nguyen, Khoa Luu, Ngan Le
We choose video paragraph captioning and temporal action detection to illustrate the effectiveness of human perception-based contextual representation in video understanding.
no code implementations • 9 Dec 2022 • Hyekang Kevin Joo, Khoa Vo, Kashu Yamazaki, Ngan Le
Video anomaly detection (VAD) -- commonly formulated as a multiple-instance learning problem in a weakly-supervised manner due to its labor-intensive nature -- is a challenging problem in video surveillance where the frames of anomaly need to be localized in an untrimmed video.
1 code implementation • 28 Nov 2022 • Kashu Yamazaki, Khoa Vo, Sang Truong, Bhiksha Raj, Ngan Le
Video paragraph captioning aims to generate a multi-sentence description of an untrimmed video with several temporal event locations in coherent storytelling.
Ranked #2 on Video Captioning on ActivityNet Captions
no code implementations • 17 Nov 2022 • Pha Nguyen, Kha Gia Quach, Chi Nhan Duong, Son Lam Phung, Ngan Le, Khoa Luu
The development of autonomous vehicles generates a tremendous demand for a low-cost solution with a complete set of camera sensors capturing the environment around the car.
1 code implementation • 12 Oct 2022 • Minh Tran, Khoa Vo, Kashu Yamazaki, Arthur Fernandes, Michael Kidd, Ngan Le
AISFormer explicitly models the complex coherence between occluder, visible, amodal, and invisible masks within an object's regions of interest by treating them as learnable queries.
1 code implementation • 7 Oct 2022 • Tien-Phat Nguyen, Trong-Thang Pham, Tri Nguyen, Hieu Le, Dung Nguyen, Hau Lam, Phong Nguyen, Jennifer Fowler, Minh-Triet Tran, Ngan Le
The transformer expanding path models the temporal coherency between embryo images to ensure monotonic non-decreasing constraint and is optimized by a segmentation head.
1 code implementation • 5 Oct 2022 • Khoa Vo, Sang Truong, Kashu Yamazaki, Bhiksha Raj, Minh-Triet Tran, Ngan Le
PMR module represents each video snippet by a visual-linguistic feature, in which main actors and surrounding environment are represented by visual information, whereas relevant objects are depicted by linguistic features through an image-text model.
1 code implementation • 30 Sep 2022 • Thinh Phan, Duc Le, Patel Brijesh, Donald Adjeroh, Jingxian Wu, Morten Olgaard Jensen, Ngan Le
Electrocardiogram (ECG) signal is one of the most effective sources of information mainly employed for the diagnosis and prediction of cardiovascular diseases (CVDs) connected with the abnormalities in heart rhythm.
no code implementations • 11 Sep 2022 • Thanh-Dat Truong, Chi Nhan Duong, Ngan Le, Marios Savvides, Khoa Luu
We therefore introduce a new method named Attention-based Bijective Generative Adversarial Networks in a Distillation framework (DAB-GAN) to synthesize faces of a subject given his/her extracted face recognition features.
1 code implementation • 26 Jun 2022 • Kashu Yamazaki, Sang Truong, Khoa Vo, Michael Kidd, Chase Rainwater, Khoa Luu, Ngan Le
In this paper, we leverage the human perceiving process, which involves vision and language interaction, to generate a coherent paragraph description of untrimmed videos.
Ranked #3 on Video Captioning on ActivityNet Captions
no code implementations • 7 Jun 2022 • Pha Nguyen, Thanh-Dat Truong, Miaoqing Huang, Yi Liang, Ngan Le, Khoa Luu
Self-training crowd counting has not been attentively explored though it is one of the important challenges in computer vision.
no code implementations • 22 May 2022 • Thanh-Dat Truong, Naga Venkata Sai Raviteja Chappa, Xuan Bac Nguyen, Ngan Le, Ashley Dowling, Khoa Luu
Unsupervised domain adaptation is one of the challenging problems in computer vision.
no code implementations • 19 Apr 2022 • Pha Nguyen, Kha Gia Quach, Chi Nhan Duong, Ngan Le, Xuan-Bac Nguyen, Khoa Luu
The experimental results on the nuScenes dataset demonstrate the benefits of the proposed method to produce SOTA performance on the existing vision-based tracking dataset.
no code implementations • 16 Mar 2022 • Minh Tran, Viet-Khoa Vo-Ho, Kyle Quinn, Hien Nguyen, Khoa Luu, Ngan Le
We then provide recent developments of CapsNet for the task of medical image segmentation.
1 code implementation • 16 Mar 2022 • Khoa Vo, Kashu Yamazaki, Sang Truong, Minh-Triet Tran, Akihiro Sugimoto, Ngan Le
Temporal action proposal generation (TAPG) aims to estimate temporal intervals of actions in untrimmed videos, which is challenging yet plays an important role in many tasks of video analysis and understanding.
1 code implementation • 16 Mar 2022 • Ngoc-Vuong Ho, Tan Nguyen, Gia-Han Diep, Ngan Le, Binh-Son Hua
In this paper, we propose Point-Unet, a novel method that incorporates the efficiency of deep learning with 3D point clouds into volumetric segmentation.
2 code implementations • 16 Mar 2022 • Tan Nguyen, Binh-Son Hua, Ngan Le
Medical image segmentation has so far achieved promising results with Convolutional Neural Networks (CNNs).
no code implementations • 16 Mar 2022 • Viet-Khoa Vo-Ho, Kashu Yamazaki, Hieu Hoang, Minh-Triet Tran, Ngan Le
To address such limitations, meta-learning has been adopted in the scenarios of few-shot learning and multiple tasks.
no code implementations • 15 Jan 2022 • Minh Tran, Loi Ly, Binh-Son Hua, Ngan Le
Capsule network is a recent deep network architecture that has been applied successfully to medical image segmentation tasks.
1 code implementation • 21 Oct 2021 • Khoa Vo, Hyekang Joo, Kashu Yamazaki, Sang Truong, Kris Kitani, Minh-Triet Tran, Ngan Le
In this paper, we make an attempt to simulate that ability of a human by proposing Actor Environment Interaction (AEI) network to improve the video representation for temporal action proposals generation.
no code implementations • 25 Aug 2021 • Ngan Le, Vidhiwar Singh Rathour, Kashu Yamazaki, Khoa Luu, Marios Savvides
In this work, we provide a detailed review of recent and state-of-the-art research advances of deep reinforcement learning in computer vision.
1 code implementation • ICCV 2021 • Thanh-Dat Truong, Chi Nhan Duong, Ngan Le, Son Lam Phung, Chase Rainwater, Khoa Luu
Semantic segmentation aims to predict pixel-level labels.
1 code implementation • ICCV 2021 • Thanh-Dat Truong, Chi Nhan Duong, The De Vu, Hoang Anh Pham, Bhiksha Raj, Ngan Le, Khoa Luu
Therefore, this work introduces a new Audio-Visual Transformer approach to the problem of localizing and highlighting the main speaker in both audio and visual channels of a multi-speaker conversation video in the wild.
1 code implementation • IEEE EMBS 2021 • Minh Duc Le, Vidhiwar Singh Rathour, Quang Sang Truong, Quan Mai, Patel Brijesh, Ngan Le
The automatic classification of electrocardiogram (ECG) signals has played an important role in cardiovascular diseases diagnosis and prediction.
no code implementations • 17 Jul 2021 • Viet-Khoa Vo-Ho, Ngan Le, Kashu Yamazaki, Akihiro Sugimoto, Minh-Triet Tran
Temporal action proposal generation is an essential and challenging task that aims at localizing temporal intervals containing human actions in untrimmed videos.
no code implementations • 4 Dec 2020 • Ngan Le, Trung Le, Kashu Yamazaki, Toan Duc Bui, Khoa Luu, Marios Savvides
Our proposed Offset Curves (OsC) loss consists of three main fitting terms.
no code implementations • 3 Dec 2020 • Toan Duc Bui, Manh Nguyen, Ngan Le, Khoa Luu
To capture temporal structures in the medical images, we explore the displacement between the consecutive slices using a deformation field.
no code implementations • 3 Dec 2020 • Ngan Le, Kashu Yamazaki, Dat Truong, Kha Gia Quach, Marios Savvides
The first objective is performed by our proposed contextual brain tumor detection network, which plays the role of an attention gate and focuses only on the region around the brain tumor while ignoring the distant background, which is less correlated with the tumor.
no code implementations • 9 Apr 2020 • Thanh-Dat Truong, Chi Nhan Duong, Kha Gia Quach, Ngan Le, Tien D. Bui, Khoa Luu
This work presents a novel Lightweight Attentive Angular Distillation (LIAAD) approach to Large-scale Lightweight AiFR that overcomes these limitations.
no code implementations • 28 May 2019 • Thanh-Dat Truong, Chi Nhan Duong, Khoa Luu, Minh-Triet Tran, Ngan Le
However, it has been largely overlooked in the problem of recognition in new unseen domains.
no code implementations • 28 May 2019 • Thanh-Dat Truong, Khoa Luu, Chi Nhan Duong, Ngan Le, Minh-Triet Tran
This paper presents a novel deep learning based approach to tackle the problem of across unseen modalities.
1 code implementation • 25 May 2019 • Chi Nhan Duong, Khoa Luu, Kha Gia Quach, Ngan Le
In addition, this work introduces a novel Angular Distillation Loss for distilling the feature direction and the sample distributions of the teacher's hypersphere to its student.
no code implementations • 24 May 2019 • Thanh-Dat Truong, Khoa Luu, Chi Nhan Duong, Ngan Le, Minh-Triet Tran
The experiments on CIFAR-10, ImageNet and Celeb-HQ datasets have shown that our invertible $n \times n$ convolution helps to improve the performance of generative models significantly.
no code implementations • 28 Nov 2018 • Kha Gia Quach, Ngan Le, Chi Nhan Duong, Ibsa Jalata, Kaushik Roy, Khoa Luu
To demonstrate the robustness and effectiveness of each component in the proposed approach, three experiments were conducted: (i) evaluation on the AffectNet database to benchmark the proposed EmoNet for recognizing facial expression; (ii) evaluation on EmotiW2018 to benchmark the proposed deep feature level fusion mechanism NVPF; and (iii) evaluation of the proposed TNVPF on an innovative Group-level Emotion on Crowd Videos (GECV) dataset composed of 627 videos collected from publicly available sources.
no code implementations • 27 Nov 2018 • Chi Nhan Duong, Kha Gia Quach, Ibsa Jalata, Ngan Le, Khoa Luu
Deep neural networks have been widely used in numerous computer vision applications, particularly in face recognition.
no code implementations • CVPR 2019 • Chi Nhan Duong, Khoa Luu, Kha Gia Quach, Nghia Nguyen, Eric Patterson, Tien D. Bui, Ngan Le
This paper presents a novel approach to automatically synthesize age-progressed facial images in video sequences using Deep Reinforcement Learning.
1 code implementation • 12 Apr 2017 • Ngan Le, Kha Gia Quach, Khoa Luu, Marios Savvides, Chenchen Zhu
To address these issues and boost the classic variational LS methods to a new level of the learnable deep learning approaches, we propose a novel definition of contour evolution named Recurrent Level Set (RLS) to employ Gated Recurrent Unit under the energy minimization of a variational LS functional.