no code implementations • 28 Jul 2024 • Tz-Ying Wu, Kyle Min, Subarna Tripathi, Nuno Vasconcelos
Video understanding typically requires fine-tuning the large backbone when adapting to new domains.
no code implementations • 9 Jun 2023 • Alakh Desai, Tz-Ying Wu, Subarna Tripathi, Nuno Vasconcelos
Research in scene graph generation (SGG) usually considers two-stage models, that is, detecting a set of entities, followed by combining them and labeling all possible relationships.
1 code implementation • CVPR 2024 • Tz-Ying Wu, Chih-Hui Ho, Nuno Vasconcelos
A new Prompt Tuning for Hierarchical Consistency (ProTeCt) technique is then proposed to calibrate classification across label set granularities.
2 code implementations • CVPR 2022 • Tz-Ying Wu, Gurumurthy Swaminathan, Zhizhong Li, Avinash Ravichandran, Nuno Vasconcelos, Rahul Bhotika, Stefano Soatto
We hypothesize that a strong base model can provide a good representation for novel classes and incremental learning can be done with small adaptations.
no code implementations • ICCV 2021 • Alakh Desai, Tz-Ying Wu, Subarna Tripathi, Nuno Vasconcelos
Significant effort has recently been devoted to modeling visual relations.
1 code implementation • ECCV 2020 • Tz-Ying Wu, Pedro Morgado, Pei Wang, Chih-Hui Ho, Nuno Vasconcelos
Motivated by this, a deep realistic taxonomic classifier (Deep-RTC) is proposed as a new solution to the long-tail problem, combining realism with hierarchical predictions.
1 code implementation • CVPR 2020 • Chih-Hui Ho, Bo Liu, Tz-Ying Wu, Nuno Vasconcelos
Multiview recognition has been well studied in the literature and achieves decent performance in object recognition and retrieval tasks.
1 code implementation • CVPR 2020 • Yiran Xu, Xiaoyin Yang, Lihang Gong, Hsuan-Chu Lin, Tz-Ying Wu, Yunsheng Li, Nuno Vasconcelos
The new paradigm lies between the end-to-end and pipelined approaches, and is inspired by how humans solve the problem.
no code implementations • ECCV 2018 • Tz-Ying Wu, Juan-Ting Lin, Tsun-Hsuang Wang, Chan-Wei Hu, Juan Carlos Niebles, Min Sun
In the closed-loop system, the ability to monitor the state of the task via rich sensory information is important but often less studied.
1 code implementation • ICCV 2017 • Tz-Ying Wu, Ting-An Chien, Cheng-Sheng Chan, Chan-Wei Hu, Min Sun
The core of the system is a novel Recurrent Neural Network (RNN) and Policy Network (PN), where the RNN encodes visual and motion observations to anticipate intention, and the PN parsimoniously triggers the process of visual observation to reduce computational requirements.