1 code implementation • 19 Apr 2024 • Kang Zeng, Hao Shi, Jiacheng Lin, Siyu Li, Jintao Cheng, Kaiwei Wang, Zhiyong Li, Kailun Yang
In this paper, we propose a novel LiDAR-based 3D Moving Object Segmentation with Motion-aware State Space Model, termed MambaMOS.
1 code implementation • 15 Mar 2024 • Qi Jiang, Zhonghua Yi, Shaohua Gao, Yao Gao, Xiaolong Qian, Hao Shi, Lei Sun, Zhijie Xu, Kailun Yang, Kaiwei Wang
Relying on paired synthetic data, existing learning-based Computational Aberration Correction (CAC) methods are confronted with the intricate and multifaceted synthetic-to-real domain gap, which leads to suboptimal performance in real-world applications.
no code implementations • 13 Mar 2024 • Hao Shi, Song Wang, Jiaming Zhang, Xiaoting Yin, Zhongdao Wang, Zhijian Zhao, Guangming Wang, Jianke Zhu, Kailun Yang, Kaiwei Wang
Vision-based occupancy prediction, also known as 3D Semantic Scene Completion (SSC), presents a significant challenge in computer vision.
no code implementations • 28 Feb 2024 • Hao Shi, Tatsuya Kawahara
Adapting an automatic speech recognition (ASR) system to unseen noise environments is crucial.
1 code implementation • 30 Jan 2024 • Jianbin Jiao, Xina Cheng, WeiJie Chen, Xiaoting Yin, Hao Shi, Kailun Yang
Due to the challenges in data collection, mainstream datasets of 3D human pose estimation are primarily composed of multi-view video data collected in laboratory environments, which contains rich spatial-temporal correlation information besides the image frame content.
1 code implementation • 8 Nov 2023 • Xiaoting Yin, Hao Shi, Jiaan Chen, Ze Wang, Yaozu Ye, Huajian Ni, Kailun Yang, Kaiwei Wang
Experiments on EV-3DPW demonstrate the robustness of our proposed 3D representation methods compared to traditional RGB images and event frame techniques under the same backbones.
1 code implementation • 4 Oct 2023 • Hao Shi, Chengshan Pang, Jiaming Zhang, Kailun Yang, Yuhao Wu, Huajian Ni, Yining Lin, Rainer Stiefelhagen, Kaiwei Wang
Roadside camera-driven 3D object detection is a crucial task in intelligent transportation systems, which extends the perception range beyond the limitations of vision-centric vehicles and enhances road safety.
Ranked #2 on 3D Object Detection on Rope3D
no code implementations • 28 Aug 2023 • Bingying Yue, Jianhao Li, Hao Shi, Yupei Wang, Honghu Zhong
However, it still has some drawbacks in SAR target classification, especially in fine-grained classification of aircraft: aircraft in SAR images have large intra-class diversity and inter-class similarity, and effective samples are scarce and hard to annotate.
1 code implementation • 14 Aug 2023 • Zhonghua Yi, Hao Shi, Kailun Yang, Qi Jiang, Yaozu Ye, Ze Wang, Huajian Ni, Kaiwei Wang
Based on the modeling method, we present FocusFlow, a framework consisting of 1) a mix loss function that combines a classic photometric loss function with our proposed Conditional Point Control Loss (CPCL) function for diverse point-wise supervision; 2) a conditioned controlling model that replaces the conventional feature encoder with our proposed Condition Control Encoder (CCE).
no code implementations • 11 Aug 2023 • Liang Chen, Yifei Yin, Hao Shi, Qingqing Sheng, Wei Li
The training image pairs are generated by the sub-sampler from real-world SAR images to estimate the noise distribution.
1 code implementation • 11 Jul 2023 • Yaozu Ye, Hao Shi, Kailun Yang, Ze Wang, Xiaoting Yin, Yining Lin, Mao Liu, Yaonan Wang, Kaiwei Wang
We then propose EVA-Flow, an EVent-based Anytime Flow estimation network to produce high-frame-rate event optical flow with only low-frame-rate optical flow ground truth for supervision.
1 code implementation • 22 Jun 2023 • Qi Jiang, Shaohua Gao, Yao Gao, Kailun Yang, Zhonghua Yi, Hao Shi, Lei Sun, Kaiwei Wang
In this paper, we propose a Panoramic Computational Imaging Engine (PCIE) to address minimalist and high-quality panoramic imaging.
1 code implementation • 11 Jun 2023 • Ze Wang, Kailun Yang, Hao Shi, Yufan Zhang, Zhijie Xu, Fei Gao, Kaiwei Wang
The purpose of our research is to unleash the potential of point-line odometry with large-FoV omnidirectional cameras, even for cameras with negative-plane FoV.
no code implementations • 18 May 2023 • Hao Shi, Kazuki Shimada, Masato Hirano, Takashi Shibuya, Yuichiro Koyama, Zhi Zhong, Shusuke Takahashi, Tatsuya Kawahara, Yuki Mitsufuji
At the decoded feature level, we fuse the two decoded features by generative and predictive decoders.
1 code implementation • 7 May 2023 • Siyu Li, Kailun Yang, Hao Shi, Jiaming Zhang, Jiacheng Lin, Zhifeng Teng, Zhiyong Li
At the same time, an Across-Space Loss (ASL) is designed to mitigate the negative impact of geometric distortions.
1 code implementation • 24 Mar 2023 • Hao Shi, Yu Li, Kailun Yang, Jiaming Zhang, Kunyu Peng, Alina Roitberg, Yaozu Ye, Huajian Ni, Kaiwei Wang, Rainer Stiefelhagen
This paper raises the new task of Fisheye Semantic Completion (FSC), where dense texture, structure, and semantics of a fisheye image are inferred even beyond the sensor field-of-view (FoV).
1 code implementation • 24 Mar 2023 • Ze Shi, Hao Shi, Kailun Yang, Zhe Yin, Yining Lin, Kaiwei Wang
To address this, we propose PanoVPR, a perspective-to-equirectangular (P2E) visual place recognition framework that employs sliding windows to eliminate feature truncation caused by hard cropping.
1 code implementation • 21 Mar 2023 • Zhifeng Teng, Jiaming Zhang, Kailun Yang, Kunyu Peng, Hao Shi, Simon Reiß, Ke Cao, Rainer Stiefelhagen
Seeing only a tiny part of the whole is not knowing the full circumstance.
1 code implementation • CVPR 2023 • Jiaming Zhang, Ruiping Liu, Hao Shi, Kailun Yang, Simon Reiß, Kunyu Peng, Haodong Fu, Kaiwei Wang, Rainer Stiefelhagen
To make this possible, we present the arbitrary cross-modal segmentation model CMNeXt.
Ranked #1 on Semantic Segmentation on DSEC
no code implementations • 21 Nov 2022 • Qi Jiang, Hao Shi, Shaohua Gao, Jiaming Zhang, Kailun Yang, Lei Sun, Huajian Ni, Kaiwei Wang
Further, we propose Computational Imaging Assisted Domain Adaptation (CIADA) to leverage prior knowledge of CI for robust performance in SSOA.
3 code implementations • 21 Nov 2022 • Hao Shi, Qi Jiang, Kailun Yang, Xiaoting Yin, Huajian Ni, Kaiwei Wang
In this paper, we propose the concept of online video inpainting for autonomous vehicles to expand the field of view, thereby enhancing scene visibility, perception, and system safety.
Ranked #1 on Seeing Beyond the Visible on KITTI360-EX
no code implementations • 2 Nov 2022 • Tongtong Song, Qiang Xu, Haoyu Lu, Longbiao Wang, Hao Shi, Yuqin Lin, Yanbing Yang, Jianwu Dang
It has two stages: the speech awareness (SA) stage and the language fusion (LF) stage.
3 code implementations • 12 Sep 2022 • Ze Wang, Kailun Yang, Hao Shi, Peng Li, Fei Gao, Jian Bai, Kaiwei Wang
As loop closure on wide-FoV panoramic data further comes with a large number of outliers, traditional outlier rejection methods are not directly applicable.
1 code implementation • 25 Jul 2022 • Jiaming Zhang, Kailun Yang, Hao Shi, Simon Reiß, Kunyu Peng, Chaoxiang Ma, Haodong Fu, Philip H. S. Torr, Kaiwei Wang, Rainer Stiefelhagen
In this paper, we address panoramic semantic segmentation which is under-explored due to two critical challenges: (1) image distortions and object deformations on panoramas; (2) lack of semantic annotations in the 360-degree imagery.
Ranked #1 on Semantic Segmentation on SynPASS
no code implementations • 29 Jun 2022 • Tongtong Song, Qiang Xu, Meng Ge, Longbiao Wang, Hao Shi, Yongjie Lv, Yuqin Lin, Jianwu Dang
The dual-encoder structure successfully utilizes two language-specific encoders (LSEs) for code-switching speech recognition.
1 code implementation • 13 Jun 2022 • Qi Jiang, Hao Shi, Lei Sun, Shaohua Gao, Kailun Yang, Kaiwei Wang
In this paper, we propose an Annular Computational Imaging (ACI) framework to break the optical limit of light-weight PAL design.
1 code implementation • 9 Jun 2022 • Jiaan Chen, Hao Shi, Yaozu Ye, Kailun Yang, Lei Sun, Kaiwei Wang
We then leverage the rasterized event point cloud as input to three different backbones, PointNet, DGCNN, and Point Transformer, with two linear layer decoders to predict the location of human keypoints.
Ranked #1 on 3D Human Pose Estimation on DHP19
no code implementations • 2 Jun 2022 • Hao Shi, Zi-Jiao Wang, Lan-Ru Zhai
Self-attention based models are widely used in news recommendation tasks.
no code implementations • 11 May 2022 • Shaohua Gao, Kailun Yang, Hao Shi, Kaiwei Wang, Jian Bai
However, while satisfying the need for large-FoV photographic imaging, panoramic imaging instruments are expected to have high resolution, no blind area, miniaturization, and multidimensional intelligent perception, and can be combined with artificial intelligence methods towards the next generation of intelligent instruments, enabling deeper understanding and more holistic perception of 360-degree real-world surrounding environments.
no code implementations • 7 Mar 2022 • Hao Shi, Qi Peng, Yiqi Zhuang
Moreover, a novel confidence weighted loss function is proposed to address the imbalance issue and it is implemented by a two-stage learning scheme. Through the two-stage learning, AFNet can focus on high-confidence samples with more valid information and extract effective representations, so as to improve the overall classification performance.
1 code implementation • 27 Feb 2022 • Hao Shi, Yifan Zhou, Kailun Yang, Xiaoting Yin, Ze Wang, Yaozu Ye, Zhe Yin, Shi Meng, Peng Li, Kaiwei Wang
PanoFlow achieves state-of-the-art performance on the public OmniFlowNet and the established FlowScape benchmarks.
1 code implementation • 25 Feb 2022 • Ze Wang, Kailun Yang, Hao Shi, Peng Li, Fei Gao, Kaiwei Wang
To tackle this issue, we propose LF-VIO, a real-time VIO framework for cameras with extremely large FoV.
1 code implementation • 2 Feb 2022 • Hao Shi, Yifan Zhou, Kailun Yang, Xiaoting Yin, Kaiwei Wang
In this paper, we propose CSFlow, a new deep network architecture for optical flow estimation in autonomous driving, which consists of two novel modules: the Cross Strip Correlation module (CSC) and the Correlation Regression Initialization module (CRI).
no code implementations • 18 Feb 2016 • Fuqiang Liu, Fukun Bi, Liang Chen, Hao Shi, Wei Liu
This letter proposes a synthetic aperture radar (SAR) image registration method named Feature-Area Optimization (FAO).