2 code implementations • ICCV 2023 • Bo Jiang, Shaoyu Chen, Qing Xu, Bencheng Liao, Jiajie Chen, Helong Zhou, Qian Zhang, Wenyu Liu, Chang Huang, Xinggang Wang
In this paper, we propose VAD, an end-to-end vectorized paradigm for autonomous driving, which models the driving scene as a fully vectorized representation.
no code implementations • 5 Dec 2022 • Bo Jiang, Shaoyu Chen, Xinggang Wang, Bencheng Liao, Tianheng Cheng, Jiajie Chen, Helong Zhou, Qian Zhang, Wenyu Liu, Chang Huang
Motion prediction is highly relevant to the perception of dynamic objects and static map elements in the scenarios of autonomous driving.
1 code implementation • 11 Aug 2022 • Chuanguang Yang, Zhulin An, Helong Zhou, Linhang Cai, Xiang Zhi, Jiwen Wu, Yongjun Xu, Qian Zhang
MixSKD mutually distills feature maps and probability distributions between the random pair of original images and their mixup images in a meaningful way.
2 code implementations • 23 Jul 2022 • Chuanguang Yang, Zhulin An, Helong Zhou, Fuzhen Zhuang, Yongjun Xu, Qian Zhang
This enables each network to learn extra contrastive knowledge from others, leading to better feature representations, thus improving the performance of visual recognition tasks.
no code implementations • 21 Jun 2022 • Yihan Hu, Wenxin Shao, Bo Jiang, Jiajie Chen, Siqi Chai, Zhening Yang, Jingyu Qian, Helong Zhou, Qiang Liu
In this report, we introduce our solution to the Occupancy and Flow Prediction challenge in the Waymo Open Dataset Challenges at CVPR 2022, which ranks 1st on the leaderboard.
1 code implementation • CVPR 2022 • Chuanguang Yang, Helong Zhou, Zhulin An, Xue Jiang, Yongjun Xu, Qian Zhang
Current Knowledge Distillation (KD) methods for semantic segmentation often guide the student to mimic the teacher's structured information generated from individual data samples.
1 code implementation • ACL 2022 • Xichen Pan, Peiyu Chen, Yichen Gong, Helong Zhou, Xinbing Wang, Zhouhan Lin
In particular, audio and visual front-ends are trained on large-scale unimodal datasets; components of both front-ends are then integrated into a larger multimodal framework that learns to transcribe parallel audio-visual data into characters through a combination of CTC and seq2seq decoding.
Ranked #2 on Automatic Speech Recognition (ASR) on LRS2
Audio-Visual Speech Recognition • Automatic Speech Recognition (ASR) • +7
4 code implementations • 1 Feb 2021 • Helong Zhou, Liangchen Song, Jiajie Chen, Ye Zhou, Guoli Wang, Junsong Yuan, Qian Zhang
The outputs from the teacher network are used as soft labels for supervising the training of a new network.
Ranked #25 on Knowledge Distillation on ImageNet
no code implementations • ICLR 2021 • Helong Zhou, Liangchen Song, Jiajie Chen, Ye Zhou, Guoli Wang, Junsong Yuan, Qian Zhang
In this paper, we investigate the bias-variance tradeoff brought by distillation with soft labels.
1 code implementation • 12 Jul 2019 • Qian Zhang, Jianjun Li, Meng Yao, Liangchen Song, Helong Zhou, Zhichao Li, Wenming Meng, Xuezhi Zhang, Guoli Wang
In this paper, we propose a novel network design mechanism for efficient embedded computing.
Ranked #5 on Face Verification on CFP-FP