1 code implementation • 6 Jul 2021 • Xiaomeng Chu, Jiajun Deng, Yao Li, Zhenxun Yuan, Yanyong Zhang, Jianmin Ji, Yu Zhang
As cameras are increasingly deployed in new application domains such as autonomous driving, performing 3D object detection on monocular images becomes an important task for visual scene understanding.
no code implementations • 27 Nov 2020 • Zhenxun Yuan, Xiao Song, Lei Bai, Wengang Zhou, Zhe Wang, Wanli Ouyang
As a special design of this transformer, the information encoded in the encoder is different from that in the decoder, i. e. the encoder encodes temporal-channel information of multiple frames while the decoder decodes the spatial-channel information for the current frame in a voxel-wise manner.
1 code implementation • 26 Dec 2018 • Yiheng Liu, Zhenxun Yuan, Wengang Zhou, Houqiang Li
How to explore the abundant spatial-temporal information in video sequences is the key to solve this problem.