no code implementations • 15 Mar 2024 • Yuanhang Zhang, Zhidi Lin, Yiyong Sun, Feng Yin, Carsten Fritsche
Deep state-space models (DSSMs) have gained popularity in recent years due to their potent modeling capacity for dynamic systems.
no code implementations • 13 Jul 2023 • Yuanhang Zhang, Jundong Liu
Path planning plays a crucial role in various autonomy applications, and RRT* is one of the leading solutions in this field.
no code implementations • 26 May 2023 • Zaibin Zhang, Yuanhang Zhang, Lijun Wang, Yifan Wang, Huchuan Lu
At the core of our method is the newly designed instance occupancy prediction (IOP) module, which aims to infer the point-level occupancy status of each instance in the frustum space.
no code implementations • 22 Jun 2022 • Yuanhang Zhang, Susan Liang, Shuang Yang, Shiguang Shan
This report presents a brief description of our winning solution to the AVA Active Speaker Detection (ASD) task at ActivityNet Challenge 2022.
2 code implementations • 24 Apr 2022 • Zhuohao Li, Fandi Gou, Qixin De, Leqi Ding, Yuanhang Zhang, Yunze Cai
The innovation of our method is the use of information fusion to compensate for the insufficient frame rate of the output images and to improve the robustness of target detection and depth estimation under monocular vision. Object detection is based on YOLO-v5.
no code implementations • 5 Aug 2021 • Yuanhang Zhang, Susan Liang, Shuang Yang, Xiao Liu, Zhongqin Wu, Shiguang Shan, Xilin Chen
Our solution is a novel, unified framework that focuses on jointly modeling multiple types of contextual information: spatial context to indicate the position and scale of each candidate's face, relational context to capture the visual relationships among the candidates and contrast audio-visual affinities with each other, and temporal context to aggregate long-term information and smooth out local uncertainties.
no code implementations • The ActivityNet Large-Scale Activity Recognition Challenge Workshop, CVPR 2021 • Yuanhang Zhang, Susan Liang, Shuang Yang, Xiao Liu, Zhongqin Wu, Shiguang Shan
This report presents a brief description of our method for the AVA Active Speaker Detection (ASD) task at ActivityNet Challenge 2021.
no code implementations • 15 Feb 2020 • Nicolas K. Fontaine, Yuanhang Zhang, Haoshuo Chen, Roland Ryf, David T. Neilson, Guifang Li, Mark Cappuzzo, Rose Kopf, Al Tate, Hugo Safar, Cristian Bolle, Mark Earnshaw, Joel Carpenter
We designed, fabricated and tested an optical hybrid that supports an octave of bandwidth (900-1800 nm) and below 4-dB insertion loss using multiplane light conversion.
Optics
no code implementations • The ActivityNet Large-Scale Activity Recognition Challenge Workshop, CVPR 2019 • Yuanhang Zhang, Jingyun Xiao, Shuang Yang, Shiguang Shan
This report describes the approach underlying our submission to the active speaker detection task (task B-2) of ActivityNet Challenge 2019.
Ranked #17 on Audio-Visual Active Speaker Detection on AVA-ActiveSpeaker (using extra training data)