no code implementations • 17 Mar 2025 • Jianzheng Huang, Xianyu Mo, Ziling Liu, Jinyu Yang, Feng Zheng
Point tracking is becoming a powerful solver for motion estimation and video editing.
no code implementations • 17 Dec 2024 • Xinliang Zhu, Michael Huang, Han Ding, Jinyu Yang, Kelvin Chen, Tao Zhou, Tal Neiman, Ouye Xie, Son Tran, Benjamin Yao, Doug Gray, Anuj Bindal, Arnab Dhua
Through extensive experiments, we show that this change leads to a substantial improvement on the image-to-image matching problem.
1 code implementation • 3 Dec 2024 • Liqiong Wang, Teng Jin, Jinyu Yang, Ales Leonardis, Fangyi Wang, Feng Zheng
By open-sourcing our dataset and model, we aim to promote research and development in LMMs within the agricultural domain and contribute to tackling the challenges of agricultural pests and diseases.
1 code implementation • 23 Oct 2024 • Jinyu Yang, Qingwei Wang, Feng Zheng, Peng Chen, Aleš Leonardis, Deng-Ping Fan
Due to the unique characteristics of plant camouflage, including holes and irregular borders, we developed a new framework, named PCNet, dedicated to PCD.
no code implementations • 18 Jul 2024 • Sirnam Swetha, Jinyu Yang, Tal Neiman, Mamshad Nayeem Rizve, Son Tran, Benjamin Yao, Trishul Chilimbi, Mubarak Shah
In this work, we focus on enhancing the visual representations for MLLMs by combining high-frequency and detailed visual representations, obtained through masked image modeling (MIM), with semantically-enriched low-frequency representations captured by CL.
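One simple way to picture the combination described above is channel-wise concatenation of the two feature streams. This is a toy sketch under stated assumptions: the encoder outputs are stand-in vectors, and concatenation is an illustrative fusion choice, not necessarily the paper's actual mechanism.

```python
# Toy sketch: fuse detail-oriented (MIM-style, "high-frequency") features
# with semantic (CL-style, "low-frequency") features into one visual
# representation. Real encoders would be vision transformers.

def fuse(mim_feat, cl_feat):
    # simplest possible fusion: channel-wise concatenation
    return list(mim_feat) + list(cl_feat)

mim_feat = [0.2, -0.1]  # pretend masked-image-modeling features (fine detail)
cl_feat = [0.9, 0.8]    # pretend contrastive-learning features (semantics)
fused = fuse(mim_feat, cl_feat)  # 4-dim joint visual representation
```

In practice the fused representation would then feed the LLM side of the MLLM; a learned projection usually follows the fusion step.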
2 code implementations • 24 Jun 2024 • Henghui Ding, Chang Liu, Yunchao Wei, Nikhila Ravi, Shuting He, Song Bai, Philip Torr, Deshui Miao, Xin Li, Zhenyu He, YaoWei Wang, Ming-Hsuan Yang, Zhensong Xu, Jiangtao Yao, Chengjing Wu, Ting Liu, Luoqi Liu, Xinyu Liu, Jing Zhang, Kexin Zhang, Yuting Yang, Licheng Jiao, Shuyuan Yang, Mingqi Gao, Jingnan Luo, Jinyu Yang, Jungong Han, Feng Zheng, Bin Cao, Yisi Zhang, Xuanxu Lin, Xingjian He, Bo Zhao, Jing Liu, Feiyu Pan, Hao Fang, Xiankai Lu
Moreover, we provide a new motion expression guided video segmentation dataset MeViS to study the natural language-guided video understanding in complex environments.
1 code implementation • 11 Jun 2024 • Mingqi Gao, Jingnan Luo, Jinyu Yang, Jungong Han, Feng Zheng
Motion Expression guided Video Segmentation (MeViS), as an emerging task, poses many new challenges to the field of referring video object segmentation (RVOS).
no code implementations • 22 Feb 2024 • Ziling Liu, Jinyu Yang, Mingqi Gao, Feng Zheng
This paper introduces a novel and efficient system named Place-Anything, which facilitates the insertion of any object into any video solely based on a picture or text description of the target object or element.
1 code implementation • CVPR 2024 • Liqiong Wang, Jinyu Yang, Yanfu Zhang, Fangyi Wang, Feng Zheng
In this paper, we introduce Concealed Crop Detection (CCD), which extends classic COD to agricultural domains.
1 code implementation • 24 Apr 2023 • Jinyu Yang, Mingqi Gao, Zhe Li, Shang Gao, Fangjing Wang, Feng Zheng
Therefore, in this report, we propose Track Anything Model (TAM), which achieves high-performance interactive tracking and segmentation in videos.
1 code implementation • CVPR 2023 • Jinyu Yang, Shang Gao, Zhe Li, Feng Zheng, Aleš Leonardis
However, current research on aerial perception has mainly focused on limited categories, such as pedestrians or vehicles, and most scenes are captured in urban environments from a bird's-eye view.
no code implementations • 6 Nov 2022 • Shang Gao, Jinyu Yang, Zhe Li, Feng Zheng, Aleš Leonardis, Jingkuan Song
However, some existing RGBD trackers use the two modalities separately and thus some particularly useful shared information between them is ignored.
no code implementations • 29 Jul 2022 • Jinyu Yang, Zhe Li, Feng Zheng, Aleš Leonardis, Jingkuan Song
Multi-modal tracking has gained attention due to its ability to be more accurate and robust in complex scenarios than traditional RGB-based tracking.
Ranked #31 on RGB-T Tracking on LasHeR
1 code implementation • 26 Mar 2022 • Jinyu Yang, Zhe Li, Song Yan, Feng Zheng, Aleš Leonardis, Joni-Kristian Kämäräinen, Ling Shao
Particularly, we are the first to provide depth quality evaluation and analysis of tracking results in depth-friendly scenarios in RGBD tracking.
no code implementations • CVPR 2022 • Jiali Duan, Liqun Chen, Son Tran, Jinyu Yang, Yi Xu, Belinda Zeng, Trishul Chilimbi
Aligning signals from different modalities is an important step in vision-language representation learning as it affects the performance of later stages such as cross-modality fusion.
1 code implementation • CVPR 2022 • Jinyu Yang, Jiali Duan, Son Tran, Yi Xu, Sampath Chanda, Liqun Chen, Belinda Zeng, Trishul Chilimbi, Junzhou Huang
Besides CMA, TCL introduces an intra-modal contrastive objective to provide complementary benefits in representation learning.
Ranked #3 on Zero-Shot Cross-Modal Retrieval on COCO 2014
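The intra-modal contrastive objective mentioned above is typically an InfoNCE-style loss applied within one modality (e.g. image-to-image) rather than across modalities. A minimal sketch, assuming cosine-similarity scores are already computed and using a standard temperature-scaled softmax (the paper's exact loss formulation may differ):

```python
import math

def info_nce(sim_row, pos_idx, tau=0.07):
    """InfoNCE loss for one anchor.

    sim_row[i] is the anchor's similarity to candidate i; pos_idx marks the
    positive pair. Lower loss = positive stands out from the negatives.
    """
    logits = [s / tau for s in sim_row]
    m = max(logits)                          # subtract max for stability
    exps = [math.exp(l - m) for l in logits]
    return -math.log(exps[pos_idx] / sum(exps))

# Positive much more similar than negatives -> small loss
easy = info_nce([5.0, 0.0, 0.0], pos_idx=0, tau=1.0)
# All candidates equally similar -> loss is log(#candidates)
hard = info_nce([0.5, 0.5, 0.5], pos_idx=0)
```

The same function serves both cross-modal alignment (image vs. text similarities) and the intra-modal term (image vs. augmented image); only the similarity matrix changes.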
no code implementations • 22 Oct 2021 • Song Yan, Jinyu Yang, Ales Leonardis, Joni-Kristian Kamarainen
There are two potential reasons for the heuristics: 1) the lack of large RGBD tracking datasets to train deep RGBD trackers and 2) the long-term evaluation protocol of VOT RGBD that benefits from heuristics such as depth-based occlusion detection.
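A depth-based occlusion heuristic of the kind mentioned above can be pictured as watching for a sudden decrease in the target region's depth, which suggests another object moved in front of it. This is an illustrative sketch with an assumed threshold, not the benchmark's actual rule:

```python
def median(vals):
    s = sorted(vals)
    n = len(s)
    return (s[n // 2] + s[(n - 1) // 2]) / 2

def occluded(prev_region, cur_region, jump=0.5):
    """Heuristic: flag occlusion if the median depth of the target's
    bounding region suddenly drops by more than `jump` meters."""
    return median(prev_region) - median(cur_region) > jump

# Target was ~2.1 m away; region now reads ~1.1 m -> likely occluder in front
flag = occluded([2.0, 2.1, 2.2], [1.0, 1.1, 1.2])
```

Using the median rather than the mean makes the check robust to a few noisy depth pixels.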
1 code implementation • 31 Aug 2021 • Song Yan, Jinyu Yang, Jani Käpylä, Feng Zheng, Aleš Leonardis, Joni-Kristian Kämäräinen
RGBD (RGB plus depth) object tracking is gaining momentum as RGBD sensors have become popular in many application fields such as robotics. However, the best RGBD trackers are extensions of the state-of-the-art deep RGB trackers.
1 code implementation • 12 Aug 2021 • Jinyu Yang, Jingjing Liu, Ning Xu, Junzhou Huang
Despite the recent surge in applying Vision Transformers (ViT) to vision tasks, the capability of ViT to adapt cross-domain knowledge remains unexplored in the literature.
1 code implementation • ICCV 2021 • Jinyu Yang, Chunyuan Li, Weizhi An, Hehuan Ma, Yuzhi Guo, Yu Rong, Peilin Zhao, Junzhou Huang
Recent studies imply that deep neural networks are vulnerable to adversarial examples -- inputs with a slight but intentional perturbation are incorrectly classified by the network.
1 code implementation • ICCV 2021 • Song Yan, Jinyu Yang, Jani Kapyla, Feng Zheng, Ales Leonardis, Joni-Kristian Kamarainen
This can be explained by the fact that there are no sufficiently large RGBD datasets to 1) train "deep depth trackers" and to 2) challenge RGB trackers with sequences for which the depth cue is essential.
1 code implementation • 16 Dec 2020 • Jinyu Yang, Peilin Zhao, Yu Rong, Chaochao Yan, Chunyuan Li, Hehuan Ma, Junzhou Huang
Graph Neural Networks (GNNs) draw their strength from explicitly modeling the topological information of structured data.
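The "explicit modeling of topological information" above can be illustrated with the simplest GNN building block: a message-passing round where each node averages its own feature with its neighbors'. A toy sketch (dense lists instead of real graph tensors; actual GNN layers add learned weights and nonlinearities):

```python
def gnn_layer(features, adjacency):
    """One mean-aggregation message-passing round.

    features: list of per-node feature vectors.
    adjacency: dict mapping node index -> list of neighbor indices.
    Each node's new feature is the average over itself and its neighbors,
    so information flows along the graph's edges.
    """
    new_feats = []
    for v, feat in enumerate(features):
        neigh = [features[u] for u in adjacency[v]] + [feat]
        dim = len(feat)
        new_feats.append([sum(f[d] for f in neigh) / len(neigh)
                          for d in range(dim)])
    return new_feats

# Path graph 0 - 1 - 2 with scalar features
out = gnn_layer([[0.0], [1.0], [2.0]], {0: [1], 1: [0, 2], 2: [1]})
```

Stacking such layers lets information propagate k hops in k rounds, which is exactly how the topology enters the learned representation.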
1 code implementation • NeurIPS 2020 • Chaochao Yan, Qianggang Ding, Peilin Zhao, Shuangjia Zheng, Jinyu Yang, Yang Yu, Junzhou Huang
Retrosynthesis is the process of recursively decomposing target molecules into available building blocks.
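The recursive decomposition described above can be pictured with a toy single-step "template" table: if a molecule is not an available building block, look up its precursors and recurse on each. The template table and molecule names below are entirely hypothetical:

```python
# Hypothetical single-step reaction templates: product -> precursors
TEMPLATES = {"D": ["B", "C"], "C": ["A", "A"]}
BUILDING_BLOCKS = {"A", "B"}  # commercially available starting materials

def retrosynthesize(target):
    """Recursively decompose `target` into building blocks."""
    if target in BUILDING_BLOCKS:
        return [target]
    precursors = TEMPLATES.get(target)
    if precursors is None:
        raise ValueError(f"no known route to {target}")
    route = []
    for p in precursors:
        route.extend(retrosynthesize(p))
    return route

plan = retrosynthesize("D")  # D <- B + C, C <- A + A
```

Real retrosynthesis replaces the lookup with a learned single-step model and the plain recursion with a search over many candidate disconnections.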
no code implementations • ECCV 2020 • Jinyu Yang, Weizhi An, Sheng Wang, Xinliang Zhu, Chaochao Yan, Junzhou Huang
Unsupervised domain adaptation alleviates the need for pixel-wise annotation in semantic segmentation.
Ranked #28 on Domain Adaptation on SYNTHIA-to-Cityscapes
no code implementations • 9 Mar 2020 • Jinyu Yang, Weizhi An, Chaochao Yan, Peilin Zhao, Junzhou Huang
To achieve this goal, we design two cross-domain attention modules to adapt context dependencies from both spatial and channel views.
Ranked #29 on Domain Adaptation on SYNTHIA-to-Cityscapes
no code implementations • 24 Oct 2019 • Chunlei Liu, Wenrui Ding, Jinyu Yang, Vittorio Murino, Baochang Zhang, Jungong Han, Guodong Guo
In this paper, we propose a novel aggregation signature suitable for small object tracking, especially aiming for the challenge of sudden and large drift.
1 code implementation • 1 Oct 2019 • Chaochao Yan, Sheng Wang, Jinyu Yang, Tingyang Xu, Junzhou Huang
We investigate the posterior collapse problem of current RNN-based VAEs for molecule sequence generation.
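A standard mitigation for the posterior-collapse problem mentioned above is KL annealing: ramping the weight of the KL term in the VAE objective from 0 to 1 so the decoder cannot ignore the latent code early in training. This is a generic sketch of that common technique, not necessarily the paper's own remedy:

```python
def kl_weight(step, warmup=1000):
    """Linear KL-annealing schedule.

    Returns the multiplier on the KL divergence term of the VAE loss:
    0 at the start of training, growing linearly to 1 over `warmup` steps.
    """
    return min(1.0, step / warmup)

# total_loss = reconstruction_loss + kl_weight(step) * kl_divergence
```

Variants use cyclical schedules (repeatedly resetting the weight to 0), which empirically help sequence VAEs retain informative latents.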
no code implementations • 5 May 2019 • Gaoyang Li, Jinyu Yang, Chunguo Wu, Qin Ma
Recent research reveals that maximizing the margin distribution of the whole training dataset, rather than the minimal margin of a few support vectors, tends to achieve better generalization performance.
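The distinction above is between the single worst-case margin (the classic SVM quantity) and statistics of the whole margin distribution, such as its mean. A minimal sketch for a linear classifier with made-up data:

```python
import math

def margins(w, xs, ys):
    """Signed margins y_i * <w, x_i> of a linear classifier.

    Positive margin = correctly classified; larger = more confident.
    """
    return [y * sum(wi * xi for wi, xi in zip(w, x))
            for x, y in zip(xs, ys)]

ms = margins([1.0, 0.0], [[2, 0], [0.5, 1], [-1, 0]], [1, 1, -1])
min_margin = min(ms)               # classic hard-margin objective target
mean_margin = sum(ms) / len(ms)    # first moment of the margin distribution
```

Margin-distribution methods optimize statistics like the mean (and variance) of `ms` over the whole dataset instead of only pushing up `min_margin`.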
no code implementations • 26 Feb 2019 • Jinyu Yang, Bo Zhang
We first analyse the environment of the ITR and propose a relationship model for describing interactions of ITR with the students, the social milieu and the curriculum.
no code implementations • 25 Nov 2018 • Bo Zhang, Bin Chen, Jinyu Yang, Wenjing Yang, Jiankang Zhang
Motivated by Shannon's model and the recent rehabilitation of self-supervised artificial intelligence having a "World Model", this paper proposes a unified intelligence-communication (UIC) model for describing a single agent and any multi-agent system.