1 code implementation • 20 Feb 2025 • Tao Ji, Bin Guo, Yuanbin Wu, Qipeng Guo, Lixing Shen, Zhan Chen, Xipeng Qiu, Qi Zhang, Tao Gui
For example, the KV cache size of Llama2-7B is reduced by 92.19%, with only a 0.5% drop in LongBench performance.
no code implementations • CVPR 2025 • Keyizhi Xu, Chi Zhang, Zhan Chen, Zhongyuan Wang, Chunxia Xiao, Chao Liang
Multi-exit neural networks represent a promising approach to enhancing model inference efficiency, yet like common neural networks, they suffer from significantly reduced robustness against adversarial attacks.
1 code implementation • 21 Aug 2024 • Enze Zhu, Zhan Chen, Dingkai Wang, Hanru Shi, Xiaoxuan Liu, Lei Wang
Semantic segmentation of high-resolution remote sensing images is vital in downstream applications such as land-cover mapping, urban planning and disaster assessment. Existing Transformer-based methods face a trade-off between accuracy and efficiency, while the recently proposed Mamba is known for its efficiency.
no code implementations • 1 Jul 2024 • Zhan Chen, Chen Tang, Lu Xiong
Additionally, to enhance the temporal consistency and causal relationships of the predictions, we propose a Time Series Memory framework to learn the conditional distribution models of the prediction outputs at future time steps from multivariate time series.
1 code implementation • 11 Jan 2024 • Binghai Wang, Rui Zheng, Lu Chen, Yan Liu, Shihan Dou, Caishuang Huang, Wei Shen, Senjie Jin, Enyu Zhou, Chenyu Shi, Songyang Gao, Nuo Xu, Yuhao Zhou, Xiaoran Fan, Zhiheng Xi, Jun Zhao, Xiao Wang, Tao Ji, Hang Yan, Lixing Shen, Zhan Chen, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Zuxuan Wu, Yu-Gang Jiang
We introduce a series of novel methods to mitigate the influence of incorrect and ambiguous preferences in the dataset and fully leverage high-quality preference data.
no code implementations • 12 Oct 2023 • Zhan Chen, Yidan Zhang, Xiyu Qi, Yongqiang Mao, Xin Zhou, Lulu Niu, Hui Wu, Lei Wang, Yunping Ge
MIB supplements the fixed sampling grid of the CNN in a conventional backbone network with tokens of different interaction ranges.
1 code implementation • 21 Jul 2023 • Kai Lei, Zhan Chen, Shuman Jia, Xiaoteng Zhang
In this study, we propose HVDetFusion, a multi-modal detection algorithm that supports not only camera-only input for detection but also fused radar and camera input.
Ranked #112 on 3D Object Detection on nuScenes
1 code implementation • 7 Jul 2022 • Zhan Chen, Hong Liu, Tianyu Guo, Zhengyan Chen, Pinhao Song, Hao Tang
First, SkeleMix utilizes the topological information of skeleton data to mix two skeleton sequences by randomly combining the cropped skeleton fragments (the trimmed view) with the remaining skeleton sequences (the truncated view).
2 code implementations • 28 Jun 2022 • Pinhao Song, Pengteng Li, Linhui Dai, Tao Wang, Zhan Chen
This work aims to solve the problem from two perspectives: uncertainty modeling and hard example mining.
Ranked #85 on Object Detection on COCO test-dev
1 code implementation • 27 Jun 2022 • Zhan Chen, Sicheng Li, Bing Yang, Qinghan Li, Hong Liu
To solve this problem, we present a multi-scale spatial graph convolution (MS-GC) module and a multi-scale temporal graph convolution (MT-GC) module to enrich the receptive field of the model in spatial and temporal dimensions.
1 code implementation • 7 Dec 2021 • Tianyu Guo, Hong Liu, Zhan Chen, Mengyuan Liu, Tao Wang, Runwei Ding
In this paper, to make better use of the movement patterns introduced by extreme augmentations, a Contrastive Learning framework utilizing Abundant Information Mining for self-supervised action Representation (AimCLR) is proposed.
Contrastive Learning
Few-Shot Skeleton-Based Action Recognition
no code implementations • Interspeech 2020 • Hong Liu, Zhan Chen, Bing Yang
Second, the hybrid visual stream is combined with the audio stream by an attention-based bidirectional synchronous fusion which allows bidirectional information interaction to resolve the asynchrony between the two modalities during fusion.
Ranked #5 on Landmark-based Lipreading on LRW
Audio-Visual Speech Recognition
Landmark-based Lipreading
no code implementations • 28 Aug 2020 • Siliang Tang, Qi Zhang, Tianpeng Zheng, Mengdi Zhou, Zhan Chen, Lixing Shen, Xiang Ren, Yueting Zhuang, ShiLiang Pu, Fei Wu
When patients need to take medicine, particularly more than one kind of drug simultaneously, they should be alerted to possible drug-drug interactions.
no code implementations • EMNLP 2018 • Huang Hu, Xianchao Wu, Bingfeng Luo, Chongyang Tao, Can Xu, Wei Wu, Zhan Chen
The 20 Questions (Q20) game is a well-known game which encourages deductive reasoning and creativity.