1 code implementation • 13 Mar 2025 • Xudong Tan, Peng Ye, Chongjun Tu, JianJian Cao, Yaoxin Yang, Lin Zhang, Dongzhan Zhou, Tao Chen
This insight introduces a novel information-preserving perspective, making it possible to maintain performance even under extreme token compression.
no code implementations • 3 Mar 2025 • Yongqi Huang, Peng Ye, Chenyu Huang, JianJian Cao, Lin Zhang, Baopu Li, Gang Yu, Tao Chen
Upcycled Mixture-of-Experts (MoE) models have shown great potential in various tasks by converting the original Feed-Forward Network (FFN) layers in pre-trained dense models into MoE layers.
1 code implementation • 3 Jun 2024 • Pengtao Chen, Mingzhu Shen, Peng Ye, JianJian Cao, Chongjun Tu, Christos-Savvas Bouganis, Yiren Zhao, Tao Chen
Based on this insight, we propose $\Delta$-DiT, an overall training-free inference acceleration framework: a tailored caching mechanism accelerates the rear DiT blocks in the early sampling stages and the front DiT blocks in the later stages.
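The stage-dependent caching idea can be illustrated with a minimal sketch. Everything here is an assumption for illustration: `blocks` is a list of callables standing in for DiT blocks, `delta_dit_step` and the `interval` parameter are hypothetical names, and this is not the authors' released code.

```python
# Hypothetical sketch of a delta-caching schedule: early sampling steps reuse
# cached feature deltas for the rear blocks, later steps for the front blocks.
def delta_dit_step(blocks, x, step, num_steps, cache, interval=2):
    mid = len(blocks) // 2
    skip_rear = step < num_steps // 2   # early stage -> rear blocks are skippable
    for i, block in enumerate(blocks):
        skippable = (i >= mid) if skip_rear else (i < mid)
        if skippable and step % interval != 0 and i in cache:
            x = x + cache[i]            # reuse the cached feature delta
        else:
            out = block(x)
            cache[i] = out - x          # store this block's delta for reuse
            x = out
    return x
```

On cache-hit steps the skippable half of the network costs only an addition, which is where the inference speed-up would come from under this scheme.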
1 code implementation • CVPR 2024 • JianJian Cao, Peng Ye, Shengze Li, Chong Yu, Yansong Tang, Jiwen Lu, Tao Chen
To this end, we propose a novel framework named Multimodal Alignment-Guided Dynamic Token Pruning (MADTP) for accelerating various VLTs.
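The core of alignment-guided pruning can be sketched in a few lines. The function name, the cosine-similarity scoring, and the `keep_ratio` threshold below are all assumptions for illustration, not the MADTP implementation.

```python
import numpy as np

# Hypothetical sketch: score each vision token by its best alignment with any
# text token, then keep only the top fraction of tokens.
def prune_tokens(vision_tokens, text_tokens, keep_ratio=0.5):
    # vision_tokens: (Nv, d), text_tokens: (Nt, d)
    v = vision_tokens / np.linalg.norm(vision_tokens, axis=-1, keepdims=True)
    t = text_tokens / np.linalg.norm(text_tokens, axis=-1, keepdims=True)
    align = (v @ t.T).max(axis=-1)                 # best cross-modal match per token
    k = max(1, int(len(vision_tokens) * keep_ratio))
    keep = np.argsort(-align)[:k]                  # indices of most-aligned tokens
    return vision_tokens[np.sort(keep)]            # preserve original token order
```

The point of guiding pruning with the language modality is that tokens irrelevant to the text query can be dropped without hurting the multimodal prediction.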
1 code implementation • 23 Jan 2024 • Shengze Li, JianJian Cao, Peng Ye, Yuhan Ding, Chongjun Tu, Tao Chen
Recently, foundational models such as CLIP and SAM have shown promising performance for the task of Zero-Shot Anomaly Segmentation (ZSAS).
no code implementations • 22 Jan 2024 • JianJian Cao, Beiya Dai, Yulin Li, Xiameng Qin, Jingdong Wang
Holi integrates the features of the two modalities via a cross-modal attention mechanism, which suppresses irrelevant redundancy under the guidance of positioning information from RoCo.
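A minimal sketch of such a gated cross-modal attention, assuming plain scaled dot-product attention with a per-region positioning gate (the function name and gating form are illustrative, not the authors' Holi/RoCo code):

```python
import numpy as np

# Visual features query language features; a positioning gate in [0, 1]
# suppresses language context for irrelevant regions.
def cross_modal_attention(visual, lang, gate):
    # visual: (Nv, d), lang: (Nl, d), gate: (Nv,)
    d = visual.shape[-1]
    scores = visual @ lang.T / np.sqrt(d)               # (Nv, Nl)
    attn = np.exp(scores - scores.max(axis=-1, keepdims=True))
    attn /= attn.sum(axis=-1, keepdims=True)            # row-wise softmax
    fused = attn @ lang                                 # language context per region
    return visual + gate[:, None] * fused               # gated residual fusion
```

With the gate at zero a region passes through unchanged, which is how positioning information can suppress redundant cross-modal interactions.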
2 code implementations • NeurIPS 2023 • Zhenfei Yin, Jiong Wang, JianJian Cao, Zhelun Shi, Dingning Liu, Mukai Li, Lu Sheng, Lei Bai, Xiaoshui Huang, Zhiyong Wang, Jing Shao, Wanli Ouyang
To the best of our knowledge, we present one of the very first open-source endeavors in the field, LAMM, encompassing a Language-Assisted Multi-Modal instruction tuning dataset, framework, and benchmark.
no code implementations • 23 Feb 2023 • Lin Zhan, Jiayuan Fan, Peng Ye, JianJian Cao
To address the above issues, we propose a multi-stage search architecture that handles the asymmetric spectral-spatial dimensions and captures significant features.
no code implementations • 20 Feb 2023 • Jiamu Sheng, Jiayuan Fan, Peng Ye, JianJian Cao
Despite substantial progress in no-reference image quality assessment (NR-IQA), previous models often suffer from over-fitting due to the limited scale of the datasets used, resulting in performance bottlenecks.
1 code implementation • 14 Dec 2021 • JianJian Cao, Xiameng Qin, Sanyuan Zhao, Jianbing Shen
In this paper, we focus on these two problems and propose a Graph Matching Attention (GMA) network.