no code implementations • 10 Apr 2024 • Diankun Zhang, Guoan Wang, Runwen Zhu, Jianbo Zhao, Xiwu Chen, Siyu Zhang, Jiahao Gong, Qibin Zhou, Wenyuan Zhang, Ningzi Wang, Feiyang Tan, Hangning Zhou, Ziyao Xu, Haotian Yao, Chi Zhang, Xiaojun Liu, Xiaoguang Di, Bin Li
End-to-end paradigms use a unified framework to implement multiple tasks in an autonomous driving system.
1 code implementation • 28 Feb 2024 • Shen Cai, Zhanhao Wu, Lingxi Guo, Jiachun Wang, Siyu Zhang, Junchi Yan, Shuhan Shen
Under the minimal $4$-point configuration, the first and the last similarity transformations in SKS are computed by two anchor points on target and source planes, respectively.
no code implementations • 17 Dec 2023 • Siyu Zhang, Wendong Mao, Huihong Shi, Zhongfeng Wang
Video compression is widely used in digital television, surveillance systems, and virtual reality.
no code implementations • 21 Nov 2023 • Sirui Cheng, Siyu Zhang, Jiayi Wu, Muchen Lan
Within the multimodal field, large vision-language models (LVLMs) have made significant progress due to their strong perception and reasoning capabilities in the visual and language systems.
no code implementations • 20 Oct 2023 • Siyu Zhang, Yeming Chen, Sirui Cheng, Yaoru Sun, Jun Yang, Lizhi Bai
It parses the entire image as a fine-to-coarse hierarchical structure of constituent visual patterns, and captures multiscale features by progressively merging adjacent superpixels as graph nodes.
1 code implementation • 24 Sep 2023 • Haoran Wang, Zeshen Tang, Leya Yang, Yaoru Sun, Fang Wang, Siyu Zhang, Yeming Chen
Here, we propose a goal-conditioned HRL framework named Guided Cooperation via Model-based Rollout (GCMR), aiming to bridge inter-layer information synchronization and cooperation by exploiting forward dynamics.
no code implementations • 18 Aug 2023 • Yeming Chen, Siyu Zhang, Yaoru Sun, Weijian Liang, Haoran Wang
In this work, we propose an efficient computation framework for multimodal alignment by introducing a novel visual semantic module to further improve the performance of the VL tasks.
1 code implementation • 11 Aug 2023 • Shangqing Tu, Zheyuan Zhang, Jifan Yu, Chunyang Li, Siyu Zhang, Zijun Yao, Lei Hou, Juanzi Li
However, few MOOC platforms are providing human or virtual teaching assistants to support learning for massive online students due to the complexity of real-world online education scenarios and the lack of training data.
no code implementations • 26 Jul 2023 • Siyu Zhang, Yeming Chen, Yaoru Sun, Fang Wang, Haibo Shi, Haoran Wang
Visual question answering (VQA) has been intensively studied as a multimodal task that requires effort in bridging vision and language to infer answers correctly.
1 code implementation • 18 Apr 2023 • Xiyang Wang, Chunyun Fu, JiaWei He, Mingguang Huang, Ting Meng, Siyu Zhang, Hangning Zhou, Ziyao Xu, Chi Zhang
In the classical tracking-by-detection (TBD) paradigm, detection and tracking are separately and sequentially conducted, and data association must be properly performed to achieve satisfactory tracking performance.
1 code implementation • 7 Oct 2022 • Haoqin Tu, Zhongliang Yang, Jinshuai Yang, Siyu Zhang, Yongfeng Huang
Visualization of the local latent prior well confirms the primary devotion in hidden space of the proposed model.
1 code implementation • CVPR 2022 • Jiaming Sun, ZiHao Wang, Siyu Zhang, Xingyi He, Hongcheng Zhao, Guofeng Zhang, Xiaowei Zhou
We propose a new method named OnePose for object pose estimation.
1 code implementation • 27 Feb 2022 • Yongdong Huang, Yuanzhan Li, Xulong Cao, Siyu Zhang, Shen Cai, Ting Lu, Jie Wang, Yuqi Liu
However, many previous works employ neural networks with fixed architecture and size to represent different 3D objects, which leads to excessive network parameters for simple objects and limited reconstruction accuracy for complex objects.
1 code implementation • 19 Jan 2022 • Yuanzhan Li, Yuqi Liu, Yujie Lu, Siyu Zhang, Shen Cai, Yanting Zhang
Compared to previous works, our method achieves high-fidelity, high-compression 3D object coding and reconstruction.
1 code implementation • Findings (ACL) 2021 • Siyu Zhang, Zhongliang Yang, Jinshuai Yang, Yongfeng Huang
Generative linguistic steganography mainly utilizes language models and applies steganographic sampling (stegosampling) to generate high-security steganographic text (stegotext).
1 code implementation • 31 May 2021 • Siyu Zhang, Hui Cao, Yuqi Liu, Shen Cai, Yanting Zhang, Yuanzhan Li, Xiaoyu Chi
Using deep learning techniques to process 3D objects has achieved many successes.
no code implementations • ICCV 2021 • Jiaming Sun, Yiming Xie, Siyu Zhang, Linghao Chen, Guofeng Zhang, Hujun Bao, Xiaowei Zhou
In this work, we propose a novel system for integrated 3D object detection and tracking, which uses a dynamic object occupancy map and previous object states as spatial-temporal memory to assist object detection in future frames.
1 code implementation • 9 Nov 2020 • Ji Luo, Hui Cao, Jie Wang, Siyu Zhang, Shen Cai
Voxel-based 3D object classification has been thoroughly studied in recent years.
no code implementations • 27 Jun 2020 • Lamei Zhang, Siyu Zhang, Bin Zou, Hongwei Dong
To handle this problem, in this paper, learning transferrable representations from unlabeled PolSAR data through convolutional architectures is explored for the first time.
1 code implementation • CVPR 2020 • Jiaming Sun, Linghao Chen, Yiming Xie, Siyu Zhang, Qinhong Jiang, Xiaowei Zhou, Hujun Bao
In this paper, we propose a novel system named Disp R-CNN for 3D object detection from stereo images.
1 code implementation • 25 Dec 2019 • Hui Cao, Haikuan Du, Siyu Zhang, Shen Cai
Unlike previous methods that use points, voxels, or multi-view images as inputs to a deep neural network (DNN), the proposed method constructs a class of more representative features, named infilling spheres, from a signed distance field (SDF).
no code implementations • 16 Nov 2019 • Hongwei Dong, Siyu Zhang, Bin Zou, Lamei Zhang
With DAS, the weight parameters and architecture parameters (which correspond to the hyperparameters rather than the topologies) can be optimized by stochastic gradient descent during training.