no code implementations • 17 Dec 2024 • Weiguo Pian, Shijian Deng, Shentong Mo, Yunhui Guo, Yapeng Tian
In this paper, we introduce Modality-Inconsistent Continual Learning (MICL), a new continual learning scenario for Multimodal Large Language Models (MLLMs) that involves tasks with inconsistent modalities (image, audio, or video) and varying task types (captioning or question-answering).
no code implementations • 3 Dec 2024 • Sarthak Kumar Maharana, Baoming Zhang, Leonid Karlinsky, Rogerio Feris, Yunhui Guo
Although open-vocabulary classification models like Contrastive Language Image Pretraining (CLIP) have demonstrated strong zero-shot learning capabilities, their robustness to common image corruptions remains poorly understood.
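As a minimal sketch of the setting described above, the snippet below measures zero-shot CLIP predictions on a clean image versus a corrupted copy, using OpenAI's open-source clip package. The model variant, label set, and Gaussian-noise corruption are illustrative assumptions, not the paper's evaluation protocol.

```python
import torch
import numpy as np
from PIL import Image
import clip  # pip install git+https://github.com/openai/CLIP.git

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# Illustrative label set; zero-shot classification works with any class names.
class_names = ["dog", "cat", "car"]
text = clip.tokenize([f"a photo of a {c}" for c in class_names]).to(device)

def zero_shot_predict(pil_image):
    """Return the predicted class index for a single PIL image."""
    image = preprocess(pil_image).unsqueeze(0).to(device)
    with torch.no_grad():
        logits_per_image, _ = model(image, text)
    return logits_per_image.argmax(dim=-1).item()

def add_gaussian_noise(pil_image, sigma=25.0):
    """One common corruption type; severity is controlled by sigma."""
    arr = np.asarray(pil_image).astype(np.float32)
    noisy = np.clip(arr + np.random.normal(0, sigma, arr.shape), 0, 255)
    return Image.fromarray(noisy.astype(np.uint8))

img = Image.open("example.jpg")
print("clean:", class_names[zero_shot_predict(img)])
print("noisy:", class_names[zero_shot_predict(add_gaussian_noise(img))])
```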
no code implementations • 29 Nov 2024 • YuXuan Li, Yunhui Guo
Architecture plays an important role in determining the performance of deep neural networks.
1 code implementation • 5 Nov 2024 • Weiguo Pian, Yiyang Nan, Shijian Deng, Shentong Mo, Yunhui Guo, Yapeng Tian
The task is inherently challenging as our models must not only effectively utilize information from both modalities in current tasks but also preserve their cross-modal association in old tasks to mitigate catastrophic forgetting during audio-visual continual learning.
1 code implementation • 4 Oct 2024 • Ruiyu Mao, Sarthak Kumar Maharana, Rishabh K Iyer, Yunhui Guo
A key requirement for training an accurate 3D object detector is the availability of a large amount of LiDAR-based point cloud data.
no code implementations • 18 Jul 2024 • Qifan Zhang, Yunhui Guo, Yu Xiang
To this end, we conducted experiments to study the CDL problem with three prompt-based CL models, i.e., L2P, DualPrompt and CODA-Prompt, where we utilized logit distillation, feature distillation and prompt distillation for knowledge distillation from a teacher model to a student model.
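For concreteness, here is a minimal sketch of the logit-distillation component, one of the three distillation variants named above; the temperature and loss weighting are illustrative assumptions rather than the paper's settings.

```python
import torch
import torch.nn.functional as F

def logit_distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between temperature-softened teacher and student logits."""
    log_p_student = F.log_softmax(student_logits / temperature, dim=-1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * temperature ** 2

# Usage: combined with the task loss on the current continual-learning task, e.g.
# loss = task_loss + lambda_kd * logit_distillation_loss(student_out, teacher_out.detach())
```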
1 code implementation • 28 May 2024 • Yangxiao Lu, Jishnu Jaykumar P, Yunhui Guo, Nicholas Ruozzi, Yu Xiang
In instance segmentation tasks on seven core datasets of the BOP challenge, our method is around 4.5 times faster than the leading published RGB method and surpasses it by 3.6 AP.
no code implementations • 1 May 2024 • Yunhui Guo
Existing methods tackle this problem by analyzing the output of the pre-trained model or by comparing the pre-trained model with a probe model trained on the target dataset.
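As one representative of the output-analysis family mentioned above (not the method proposed in the paper), a simple transferability proxy scores a pre-trained model by its average confidence on unlabeled target data:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def mean_max_softmax_score(model, loader, device="cpu"):
    """Average maximum softmax probability of a pre-trained model on target
    data. A higher score loosely suggests the target distribution is easier
    for this model; assumes the loader yields (images, labels) pairs."""
    model.eval().to(device)
    scores = []
    for images, _ in loader:
        probs = F.softmax(model(images.to(device)), dim=-1)
        scores.append(probs.max(dim=-1).values)
    return torch.cat(scores).mean().item()
```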
1 code implementation • 15 Mar 2024 • Sarthak Kumar Maharana, Baoming Zhang, Yunhui Guo
Real-world vision models in dynamic environments face rapid shifts in domain distributions, leading to decreased recognition performance.
1 code implementation • 15 Mar 2024 • YuXuan Li, Sarthak Kumar Maharana, Yunhui Guo
Unfortunately, existing methodologies based on trigger sets are still susceptible to functionality-stealing attacks, potentially enabling adversaries to steal the functionality of the source model without a reliable means of verifying ownership.
1 code implementation • 10 Jan 2024 • Ruiyu Mao, Ouyang Xu, Yunhui Guo
The presence of unknown classes in the data can significantly impact the performance of existing active learning methods due to the uncertainty they introduce.
1 code implementation • CVPR 2024 • Wenjie Zhao, Jia Li, Xin Dong, Yu Xiang, Yunhui Guo
Semantic segmentation models, while effective for in-distribution categories, face challenges in real-world deployment due to encountering out-of-distribution (OoD) objects.
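A common baseline for scoring such OoD objects (shown here only as context, not as the paper's method) flags pixels where even the best in-distribution class scores low:

```python
import torch

@torch.no_grad()
def max_logit_ood_map(seg_logits):
    """Per-pixel OoD score from segmentation logits of shape
    [B, num_classes, H, W]; higher values indicate more anomalous pixels."""
    max_logit, _ = seg_logits.max(dim=1)  # [B, H, W]
    return -max_logit

# Usage: threshold the map to obtain an OoD mask, e.g.
# ood_mask = max_logit_ood_map(logits) > tau
```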
no code implementations • 18 Sep 2023 • Haoliang Wang, Chen Zhao, Yunhui Guo, Kai Jiang, Feng Chen
In this study, we introduce a novel problem, semantic OOD detection across domains, which simultaneously addresses both forms of distribution shift.
no code implementations • 13 Sep 2023 • Zhihang Ren, Jefferson Ortega, Yifan Wang, Zhimin Chen, Yunhui Guo, Stella X. Yu, David Whitney
Along with the dataset, we propose a new computer vision task to infer the affect of the selected character via both context and character information in each video frame.
1 code implementation • ICCV 2023 • Weiguo Pian, Shentong Mo, Yunhui Guo, Yapeng Tian
We demonstrate that joint audio-visual modeling can improve class-incremental learning, but current methods fail to preserve semantic similarity between audio and visual features as the number of incremental steps grows.
no code implementations • CVPR 2024 • Yunhui Guo, Youren Zhang, Yubei Chen, Stella X. Yu
With our feature mapper simply trained to spread out training instances in hyperbolic space, we observe that images move closer to the origin with congealing, validating our idea of unsupervised prototypicality discovery.
1 code implementation • 7 Feb 2023 • Yangxiao Lu, Ninad Khargonkar, Zesheng Xu, Charles Averill, Kamalesh Palanisamy, Kaiyu Hang, Yunhui Guo, Nicholas Ruozzi, Yu Xiang
By applying multi-object tracking and video object segmentation on the images collected via robot pushing, our system can generate segmentation masks of all the objects in these images in a self-supervised way.
1 code implementation • 24 Aug 2022 • Xiaofan Yu, Yunhui Guo, Sicun Gao, Tajana Rosing
To address the challenges, we propose Self-Supervised ContrAstive Lifelong LEarning without Prior Knowledge (SCALE) which can extract and memorize representations on the fly purely from the data continuum.
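For background, here is the standard contrastive (InfoNCE/NT-Xent) loss that self-supervised methods of this kind build on; SCALE's specific objective and memory mechanism differ, so treat this as a generic sketch.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(z1, z2, temperature=0.5):
    """Contrastive loss between two augmented views z1, z2 ([n, d]) of the
    same batch; representations are L2-normalized before comparison."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    n = z1.size(0)
    z = torch.cat([z1, z2], dim=0)        # [2n, d]
    sim = z @ z.t() / temperature         # pairwise similarities, [2n, 2n]
    sim.fill_diagonal_(float("-inf"))     # exclude self-pairs
    # The positive for sample i is its other view at index (i + n) mod 2n.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)
```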
1 code implementation • CVPR 2022 • Tsung-Wei Ke, Jyh-Jing Hwang, Yunhui Guo, Xudong Wang, Stella X. Yu
We enforce spatial consistency of grouping and bootstrap feature learning with co-segmentation among multiple views of the same image, and enforce semantic consistency across the grouping hierarchy with clustering transformers between coarse- and fine-grained features.
no code implementations • CVPR 2022 • Yunhui Guo, Haoran Guo, Stella Yu
We propose CO-SNE, which extends the Euclidean space visualization tool, t-SNE, to hyperbolic space.
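The key ingredient in moving t-SNE to hyperbolic space is replacing Euclidean distance with the geodesic distance of the Poincaré ball. The sketch below shows only that distance function, not CO-SNE's full objective:

```python
import torch

def poincare_distance(x, y, eps=1e-9):
    """Geodesic distance between points x, y inside the unit Poincare ball:
    d(x, y) = arcosh(1 + 2 ||x - y||^2 / ((1 - ||x||^2)(1 - ||y||^2)))."""
    sq_norm_x = (x * x).sum(-1)
    sq_norm_y = (y * y).sum(-1)
    sq_dist = ((x - y) ** 2).sum(-1)
    denom = (1 - sq_norm_x).clamp_min(eps) * (1 - sq_norm_y).clamp_min(eps)
    return torch.acosh(1 + 2 * sq_dist / denom)
```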
1 code implementation • CVPR 2022 • Yunhui Guo, Xudong Wang, Yubei Chen, Stella X. Yu
Hyperbolic space can naturally embed hierarchies, unlike Euclidean space.
no code implementations • ICLR 2020 • Yunhui Guo, Mingrui Liu, Yandong Li, Liqiang Wang, Tianbao Yang, Tajana Rosing
We evaluate the effectiveness of traditional attack methods such as FGSM and PGD. The results show that A-GEM still possesses strong continual learning ability in the presence of adversarial examples in the memory and simple defense techniques such as label smoothing can further alleviate the adversarial effects.
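For reference, FGSM, the simpler of the two attacks evaluated above, perturbs each input with a single signed-gradient step; the epsilon below is a conventional choice, not the paper's setting.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, images, labels, epsilon=8 / 255):
    """Fast Gradient Sign Method: one signed-gradient ascent step on the input."""
    images = images.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)
    loss.backward()
    adv = images + epsilon * images.grad.sign()
    return adv.clamp(0, 1).detach()  # keep pixels in the valid range
```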
2 code implementations • ECCV 2020 • Yunhui Guo, Noel C. Codella, Leonid Karlinsky, James V. Codella, John R. Smith, Kate Saenko, Tajana Rosing, Rogerio Feris
Extensive experiments on the proposed benchmark are performed to evaluate state-of-the-art meta-learning approaches, transfer learning approaches, and newer methods for cross-domain few-shot learning.
Ranked #3 on Cross-Domain Few-Shot on Plantae
no code implementations • 21 Nov 2019 • Yunhui Guo, Yandong Li, Liqiang Wang, Tajana Rosing
Fine-tuning is a popular transfer learning technique for deep neural networks where a few rounds of training are applied to the parameters of a pre-trained model to adapt them to a new task.
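A minimal sketch of this generic fine-tuning recipe (not the paper's specific setup): load pre-trained weights, swap the classification head for the new task, and train briefly with a small learning rate so the pre-trained parameters are adapted rather than overwritten.

```python
import torch
import torchvision

num_target_classes = 10  # illustrative: a new task with 10 classes

# Load an ImageNet-pre-trained backbone and replace its head.
model = torchvision.models.resnet18(weights="IMAGENET1K_V1")
model.fc = torch.nn.Linear(model.fc.in_features, num_target_classes)

# A small learning rate keeps updates close to the pre-trained solution.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
```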
1 code implementation • NeurIPS 2020 • Yunhui Guo, Mingrui Liu, Tianbao Yang, Tajana Rosing
This view leads to two improved schemes for episodic memory based lifelong learning, called MEGA-I and MEGA-II.
no code implementations • 25 Sep 2019 • Yunhui Guo, Mingrui Liu, Tianbao Yang, Tajana Rosing
In this paper, we introduce a novel and effective lifelong learning algorithm, called MixEd stochastic GrAdient (MEGA), which allows deep neural networks to acquire the ability of retaining performance on old tasks while learning new tasks.
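A rough sketch of the gradient-mixing idea the name suggests: the update direction balances the gradient of the new-task loss against the gradient of the loss on episodic memory. The fixed weights below are placeholders; MEGA-I and MEGA-II derive them adaptively, so this is only an illustration of the general scheme.

```python
import torch

def mixed_gradient_step(model, optimizer, new_loss, memory_loss,
                        alpha=0.5, beta=0.5):
    """One update mixing gradients from the current task and from episodic
    memory. alpha/beta are fixed placeholders here; the paper's MEGA
    variants set them adaptively from the two loss values."""
    optimizer.zero_grad()
    (alpha * new_loss + beta * memory_loss).backward()
    optimizer.step()
```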
1 code implementation • 3 Feb 2019 • Yunhui Guo, Yandong Li, Rogerio Feris, Liqiang Wang, Tajana Rosing
A model aware of the relationships between different domains can also be trained to work on new domains with less resources.
3 code implementations • CVPR 2019 • Yunhui Guo, Honghui Shi, Abhishek Kumar, Kristen Grauman, Tajana Rosing, Rogerio Feris
Transfer learning, which allows a source task to affect the inductive bias of the target task, is widely used in computer vision.
no code implementations • 13 Aug 2018 • Yunhui Guo
For all their popularity, deep neural networks are also criticized for consuming large amounts of memory and draining the battery life of devices during training and inference.
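As one concrete illustration of the memory concern (a generic remedy, not this paper's contribution), PyTorch's built-in post-training dynamic quantization stores linear-layer weights as 8-bit integers, cutting their memory roughly fourfold versus 32-bit floats:

```python
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(1024, 512), torch.nn.ReLU(), torch.nn.Linear(512, 10)
)

# Weights are quantized to int8; activations are quantized dynamically at runtime.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)
```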