1 code implementation • 5 Apr 2025 • Essential AI: Darsh J Shah, Peter Rushton, Somanshu Singla, Mohit Parmar, Kurt Smith, Yash Vanjani, Ashish Vaswani, Adarsh Chaluvaraju, Andrew Hojel, Andrew Ma, Anil Thomas, Anthony Polloreno, Ashish Tanwer, Burhan Drak Sibai, Divya S Mansingka, Divya Shivaprasad, Ishaan Shah, Karl Stratos, Khoi Nguyen, Michael Callahan, Michael Pust, Mrinal Iyer, Philip Monk, Platon Mazarakis, Ritvik Kapila, Saurabh Srivastava, Tim Romanski
A language model's ability to reflect on its own reasoning provides a key advantage for solving complex problems.
no code implementations • 13 Dec 2024 • Hung Nguyen, Quang Qui-Vinh Nguyen, Khoi Nguyen, Rang Nguyen
Given an input video of a person and a new garment, the objective of this paper is to synthesize a new video where the person is wearing the specified garment while maintaining spatiotemporal consistency.
no code implementations • 5 Dec 2024 • Trong-Tung Nguyen, Quang Nguyen, Khoi Nguyen, Anh Tran, Cuong Pham
Recent advances in text-guided image editing enable users to perform image edits through simple text inputs, leveraging the extensive priors of multi-step diffusion-based text-to-image models.
no code implementations • 5 Dec 2024 • Quang Nguyen, Truong Vu, Trong-Tung Nguyen, Yuxin Wen, Preston K Robinette, Taylor T Johnson, Tom Goldstein, Anh Tran, Khoi Nguyen
By leveraging the contextual and semantic strengths of LLMs, our framework achieves promising results on the MagicBrush, AutoSplice, and PerfBrush (a novel diffusion-based dataset) datasets, outperforming previous approaches on the mIoU and F1-score metrics.
no code implementations • 3 Dec 2024 • Viet Nguyen, Anh Nguyen, Trung Dao, Khoi Nguyen, Cuong Pham, Toan Tran, Anh Tran
However, our study reveals its instability when handling different diffusion model backbones due to using a fixed guidance scale within the Variational Score Distillation (VSD) loss.
no code implementations • 27 Nov 2024 • Uy Dieu Tran, Minh Luu, Phong Ha Nguyen, Khoi Nguyen, Binh-Son Hua
Existing Score Distillation Sampling (SDS)-based methods have driven significant progress in text-to-3D generation.
no code implementations • 27 Nov 2024 • Duc-Hai Pham, Tung Do, Phong Nguyen, Binh-Son Hua, Khoi Nguyen, Rang Nguyen
We propose SharpDepth, a novel approach to monocular metric depth estimation that combines the metric accuracy of discriminative depth estimation methods (e.g., Metric3D, UniDepth) with the fine-grained boundary sharpness typically achieved by generative methods (e.g., Marigold, Lotus).
no code implementations • 25 Nov 2024 • Phuc Nguyen, Minh Luu, Anh Tran, Cuong Pham, Khoi Nguyen
Existing 3D instance segmentation methods frequently encounter issues with over-segmentation, leading to redundant and inaccurate 3D proposals that complicate downstream tasks.
no code implementations • 26 Aug 2024 • Trung Dao, Thuan Hoang Nguyen, Thanh Le, Duc Vu, Khoi Nguyen, Cuong Pham, Anh Tran
Remarkably, by combining the weights of models trained with efficient LoRA and full training, we achieve a new state-of-the-art one-step diffusion model, achieving an FID of 8.14 and surpassing all GAN-based and multi-step Stable Diffusion models.
no code implementations • 21 Aug 2024 • Phuc D. A. Nguyen, Minh Luu, Anh Tran, Cuong Pham, Khoi Nguyen
To mitigate this constraint, we propose a novel problem termed Open-Ended 3D Instance Segmentation (OE-3DIS), which eliminates the necessity for predefined class names during testing.
no code implementations • 21 Aug 2024 • Duc-Hai Pham, Duc-Dung Nguyen, Anh Pham, Tuan Ho, Phong Nguyen, Khoi Nguyen, Rang Nguyen
Accurate prediction of 3D semantic occupancy from 2D visual images is vital in enabling autonomous agents to comprehend their surroundings for planning and navigation.
3D Semantic Occupancy Prediction
3D Semantic Scene Completion
no code implementations • 23 Feb 2024 • Francis Engelmann, Ayca Takmaz, Jonas Schult, Elisabetta Fedele, Johanna Wald, Songyou Peng, Xi Wang, Or Litany, Siyu Tang, Federico Tombari, Marc Pollefeys, Leonidas Guibas, Hongbo Tian, Chunjie Wang, Xiaosheng Yan, Bingwen Wang, Xuanyang Zhang, Xiao Liu, Phuc Nguyen, Khoi Nguyen, Anh Tran, Cuong Pham, Zhening Huang, Xiaoyang Wu, Xi Chen, Hengshuang Zhao, Lei Zhu, Joan Lasenby
This report provides an overview of the challenge hosted at the OpenSUN3D Workshop on Open-Vocabulary 3D Scene Understanding held in conjunction with ICCV 2023.
1 code implementation • CVPR 2024 • Phuc D. A. Nguyen, Tuan Duc Ngo, Evangelos Kalogerakis, Chuang Gan, Anh Tran, Cuong Pham, Khoi Nguyen
We introduce Open3DIS, a novel solution designed to tackle the problem of Open-Vocabulary Instance Segmentation within 3D scenes.
Ranked #1 on 3D Open-Vocabulary Instance Segmentation on S3DIS
3D Instance Segmentation
3D Open-Vocabulary Instance Segmentation
no code implementations • 3 Dec 2023 • Quang Nguyen, Truong Vu, Cuong Pham, Anh Tran, Khoi Nguyen
In the ever-expanding digital landscape, safeguarding sensitive information remains paramount.
no code implementations • 2 Dec 2023 • Uy Dieu Tran, Minh Luu, Phong Ha Nguyen, Khoi Nguyen, Binh-Son Hua
Text-to-3D synthesis has recently emerged as a new approach to sampling 3D models by adopting pretrained text-to-image models as guiding visual priors.
1 code implementation • 26 Oct 2023 • Chau Pham, Truong Vu, Khoi Nguyen
To address this issue, we propose a novel method, LP-OVOD, that discards low-quality boxes by training a sigmoid linear classifier on pseudo labels retrieved from the top relevant region proposals to the novel text.
Ranked #6 on Open Vocabulary Object Detection on MSCOCO
1 code implementation • 25 Sep 2023 • Quang Nguyen, Truong Vu, Anh Tran, Khoi Nguyen
To address this, generative models have emerged as an effective solution for generating synthetic data.
1 code implementation • ICCV 2023 • Tuan Duc Ngo, Binh-Son Hua, Khoi Nguyen
Furthermore, we demonstrate the robustness of our approach, where we can adapt various state-of-the-art fully supervised methods to the weak supervision task by using our pseudo labels for training.
2 code implementations • CVPR 2023 • Tuan Duc Ngo, Binh-Son Hua, Khoi Nguyen
Existing 3D instance segmentation methods are dominated by the bottom-up design: a manually fine-tuned algorithm that groups points into clusters, followed by a refinement network.
Ranked #4 on 3D Instance Segmentation on ScanNet200
2 code implementations • 3 Oct 2022 • Hue Nguyen, Diep Tran, Khoi Nguyen, Rang Nguyen
Extremes of lighting (e.g., too much or too little light) often cause problems for both machine and human vision.
1 code implementation • 22 Jul 2022 • Thanh Nguyen, Chau Pham, Khoi Nguyen, Minh Hoai
We tackle a new task of few-shot object counting and detection.
Ranked #4 on Few-shot Object Counting and Detection on FSC147
Few-shot Object Counting and Detection
Few-Shot Object Detection
1 code implementation • 22 Jul 2022 • Tuan Ngo, Khoi Nguyen
This paper introduces a new problem in 3D point cloud: few-shot instance segmentation.
1 code implementation • 21 Jul 2022 • Khoi D. Nguyen, Quoc-Huy Tran, Khoi Nguyen, Binh-Son Hua, Rang Nguyen
To the best of our knowledge, our work is the first to explore transductive few-shot video classification.
1 code implementation • NeurIPS 2021 • Duong H. Le, Khoi D. Nguyen, Khoi Nguyen, Quoc-Huy Tran, Rang Nguyen, Binh-Son Hua
In this work, we propose to use out-of-distribution samples, i.e., unlabeled samples coming from outside the target classes, to improve few-shot learning.
1 code implementation • CVPR 2022 • Khoi Nguyen, Sinisa Todorovic
This paper addresses incremental few-shot instance segmentation, where a few examples of new object classes arrive when access to training examples of old classes is not available anymore, and the goal is to perform well on both old and new classes.
1 code implementation • Advances in Neural Information Processing Systems 2021 • Duong H. Le*, Khoi D. Nguyen*, Khoi Nguyen, Quoc-Huy Tran, Rang Nguyen, Binh-Son Hua
In this work, we propose to use out-of-distribution samples, i.e., unlabeled samples coming from outside the target classes, to improve few-shot learning.
1 code implementation • ICCV 2021 • Khoi Nguyen, Sinisa Todorovic
The resulting predictions on training images are taken as the pseudo-ground truth for the standard training of Mask-RCNN, which we use for amodal instance segmentation of test images.
no code implementations • 2 Aug 2021 • Khoi Nguyen, Yen Nguyen, Bao Le
Most successful semi-supervised learning approaches in computer vision focus on leveraging huge amounts of unlabeled data, learning a general representation via data augmentation and transformation, creating pseudo labels, implementing different loss functions, and eventually transferring this knowledge to smaller, more task-specific models.
1 code implementation • CVPR 2021 • Khoi Nguyen, Sinisa Todorovic
This paper is about few-shot instance segmentation, where training and test image sets do not share the same object classes.
no code implementations • 16 Aug 2020 • Khoi Nguyen, Sinisa Todorovic
This paper addresses unsupervised few-shot object recognition, where all training images are unlabeled, and test images are divided into queries and a few labeled support images per object class of interest.
1 code implementation • ICCV 2019 • Khoi Nguyen, Sinisa Todorovic
Finally, the target object is segmented in the query image by using a cosine similarity between the class feature vector and the query's feature map.
Ranked #80 on Few-Shot Semantic Segmentation on COCO-20i (5-shot)
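The ICCV 2019 entry above describes segmenting the target object via a cosine similarity between a class feature vector and the query's feature map. A minimal sketch of that idea follows, assuming the feature map is an H x W grid of feature vectors and that a fixed threshold turns similarities into a binary mask. The function names, the toy features, and the 0.5 threshold are hypothetical illustrations, not the paper's implementation.

```python
import math


def cosine_similarity(u, v, eps=1e-8):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + eps)  # eps guards against zero vectors


def segment_by_cosine(class_vector, feature_map, threshold=0.5):
    """Return a binary mask with 1 wherever a location's feature vector
    is similar enough to the class vector.

    feature_map: H x W nested list of C-dimensional feature vectors.
    threshold: assumed cut-off; the source does not specify one.
    """
    return [
        [1 if cosine_similarity(class_vector, pixel) > threshold else 0
         for pixel in row]
        for row in feature_map
    ]


# Toy 2 x 2 feature map with 2-dimensional features (illustrative values).
class_vector = [1.0, 0.0]
feature_map = [
    [[2.0, 0.1], [0.0, 1.0]],
    [[-1.0, 0.0], [1.0, 1.0]],
]
mask = segment_by_cosine(class_vector, feature_map)
print(mask)
```

Because cosine similarity ignores vector magnitude, the location with features `[2.0, 0.1]` matches the class vector as strongly as a unit-length vector pointing the same way would.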
no code implementations • 24 Jan 2019 • Yongjin Park, Abhishek Sarkar, Khoi Nguyen, Manolis Kellis
We can achieve necessary interpretation of GWAS in a causal mediation framework, looking to establish a sparse set of mediators between genetic and downstream variables, but there are several challenges.