Search Results for author: Hao Shi

Found 34 papers, 22 papers with code

MambaMOS: LiDAR-based 3D Moving Object Segmentation with Motion-aware State Space Model

1 code implementation • 19 Apr 2024 • Kang Zeng, Hao Shi, Jiacheng Lin, Siyu Li, Jintao Cheng, Kaiwei Wang, Zhiyong Li, Kailun Yang

In this paper, we propose a novel LiDAR-based 3D Moving Object Segmentation method with a Motion-aware State Space Model, termed MambaMOS.

Real-World Computational Aberration Correction via Quantized Domain-Mixing Representation

1 code implementation • 15 Mar 2024 • Qi Jiang, Zhonghua Yi, Shaohua Gao, Yao Gao, Xiaolong Qian, Hao Shi, Lei Sun, Zhijie Xu, Kailun Yang, Kaiwei Wang

Relying on paired synthetic data, existing learning-based Computational Aberration Correction (CAC) methods are confronted with the intricate and multifaceted synthetic-to-real domain gap, which leads to suboptimal performance in real-world applications.

Unsupervised Domain Adaptation

OccFiner: Offboard Occupancy Refinement with Hybrid Propagation

no code implementations • 13 Mar 2024 • Hao Shi, Song Wang, Jiaming Zhang, Xiaoting Yin, Zhongdao Wang, Zhijian Zhao, Guangming Wang, Jianke Zhu, Kailun Yang, Kaiwei Wang

Vision-based occupancy prediction, also known as 3D Semantic Scene Completion (SSC), presents a significant challenge in computer vision.

3D Semantic Scene Completion

Investigation of Adapter for Automatic Speech Recognition in Noisy Environment

no code implementations • 28 Feb 2024 • Hao Shi, Tatsuya Kawahara

Adapting an automatic speech recognition (ASR) system to unseen noise environments is crucial.

Automatic Speech Recognition (ASR) +3

Towards Precise 3D Human Pose Estimation with Multi-Perspective Spatial-Temporal Relational Transformers

1 code implementation • 30 Jan 2024 • Jianbin Jiao, Xina Cheng, WeiJie Chen, Xiaoting Yin, Hao Shi, Kailun Yang

Due to the challenges in data collection, mainstream datasets of 3D human pose estimation are primarily composed of multi-view video data collected in laboratory environments, which contains rich spatial-temporal correlation information besides the image frame content.

3D Human Pose Estimation Scene Understanding

Rethinking Event-based Human Pose Estimation with 3D Event Representations

1 code implementation • 8 Nov 2023 • Xiaoting Yin, Hao Shi, Jiaan Chen, Ze Wang, Yaozu Ye, Huajian Ni, Kailun Yang, Kaiwei Wang

Experiments on EV-3DPW demonstrate the robustness of our proposed 3D representation methods compared to traditional RGB images and event frame techniques under the same backbones.

Autonomous Driving Pose Estimation

CoBEV: Elevating Roadside 3D Object Detection with Depth and Height Complementarity

1 code implementation • 4 Oct 2023 • Hao Shi, Chengshan Pang, Jiaming Zhang, Kailun Yang, Yuhao Wu, Huajian Ni, Yining Lin, Rainer Stiefelhagen, Kaiwei Wang

Roadside camera-driven 3D object detection is a crucial task in intelligent transportation systems, which extends the perception range beyond the limitations of vision-centric vehicles and enhances road safety.

Ranked #2 on 3D Object Detection on Rope3D

feature selection Monocular 3D Object Detection +1

MS-Net: A Multi-modal Self-supervised Network for Fine-Grained Classification of Aircraft in SAR Images

no code implementations • 28 Aug 2023 • Bingying Yue, Jianhao Li, Hao Shi, Yupei Wang, Honghu Zhong

However, it still has some drawbacks in SAR target classification, especially in fine-grained classification of aircraft: aircraft in SAR images have large intra-class diversity and inter-class similarity; the number of effective samples is insufficient, and they are hard to annotate.

Classification Earth Observation +1

FocusFlow: Boosting Key-Points Optical Flow Estimation for Autonomous Driving

1 code implementation • 14 Aug 2023 • Zhonghua Yi, Hao Shi, Kailun Yang, Qi Jiang, Yaozu Ye, Ze Wang, Huajian Ni, Kaiwei Wang

Based on the modeling method, we present FocusFlow, a framework consisting of 1) a mixed loss function that combines a classic photometric loss function with our proposed Conditional Point Control Loss (CPCL) function for diverse point-wise supervision; 2) a conditioned controlling model which replaces the conventional feature encoder with our proposed Condition Control Encoder (CCE).

Autonomous Driving Optical Flow Estimation +1

A Self-supervised SAR Image Despeckling Strategy Based on Parameter-sharing Convolutional Neural Networks

no code implementations • 11 Aug 2023 • Liang Chen, Yifei Yin, Hao Shi, Qingqing Sheng, Wei Li

The training image pairs are generated by the sub-sampler from real-world SAR images to estimate the noise distribution.

Sar Image Despeckling

Towards Anytime Optical Flow Estimation with Event Cameras

1 code implementation • 11 Jul 2023 • Yaozu Ye, Hao Shi, Kailun Yang, Ze Wang, Xiaoting Yin, Yining Lin, Mao Liu, Yaonan Wang, Kaiwei Wang

We then propose EVA-Flow, an EVent-based Anytime Flow estimation network to produce high-frame-rate event optical flow with only low-frame-rate optical flow ground truth for supervision.

Autonomous Driving Motion Estimation +1

Minimalist and High-Quality Panoramic Imaging with PSF-aware Transformers

1 code implementation • 22 Jun 2023 • Qi Jiang, Shaohua Gao, Yao Gao, Kailun Yang, Zhonghua Yi, Hao Shi, Lei Sun, Kaiwei Wang

In this paper, we propose a Panoramic Computational Imaging Engine (PCIE) to address minimalist and high-quality panoramic imaging.

Super-Resolution

LF-PGVIO: A Visual-Inertial-Odometry Framework for Large Field-of-View Cameras using Points and Geodesic Segments

1 code implementation • 11 Jun 2023 • Ze Wang, Kailun Yang, Hao Shi, Yufan Zhang, Zhijie Xu, Fei Gao, Kaiwei Wang

The purpose of our research is to unleash the potential of point-line odometry with large-FoV omnidirectional cameras, even for cameras with negative-plane FoV.

Line Detection

Diffusion-Based Speech Enhancement with Joint Generative and Predictive Decoders

no code implementations • 18 May 2023 • Hao Shi, Kazuki Shimada, Masato Hirano, Takashi Shibuya, Yuichiro Koyama, Zhi Zhong, Shusuke Takahashi, Tatsuya Kawahara, Yuki Mitsufuji

At the decoded feature level, we fuse the two features decoded by the generative and predictive decoders.

Speech Enhancement

Bi-Mapper: Holistic BEV Semantic Mapping for Autonomous Driving

1 code implementation • 7 May 2023 • Siyu Li, Kailun Yang, Hao Shi, Jiaming Zhang, Jiacheng Lin, Zhifeng Teng, Zhiyong Li

At the same time, an Across-Space Loss (ASL) is designed to mitigate the negative impact of geometric distortions.

Autonomous Driving

FishDreamer: Towards Fisheye Semantic Completion via Unified Image Outpainting and Segmentation

1 code implementation • 24 Mar 2023 • Hao Shi, Yu Li, Kailun Yang, Jiaming Zhang, Kunyu Peng, Alina Roitberg, Yaozu Ye, Huajian Ni, Kaiwei Wang, Rainer Stiefelhagen

This paper raises the new task of Fisheye Semantic Completion (FSC), where dense texture, structure, and semantics of a fisheye image are inferred even beyond the sensor field-of-view (FoV).

Image Outpainting Semantic Segmentation

PanoVPR: Towards Unified Perspective-to-Equirectangular Visual Place Recognition via Sliding Windows across the Panoramic View

1 code implementation • 24 Mar 2023 • Ze Shi, Hao Shi, Kailun Yang, Zhe Yin, Yining Lin, Kaiwei Wang

To address this, we propose PanoVPR, a perspective-to-equirectangular (P2E) visual place recognition framework that employs sliding windows to eliminate feature truncation caused by hard cropping.

Autonomous Driving Image Retrieval +2

360BEV: Panoramic Semantic Mapping for Indoor Bird's-Eye View

1 code implementation • 21 Mar 2023 • Zhifeng Teng, Jiaming Zhang, Kailun Yang, Kunyu Peng, Hao Shi, Simon Reiß, Ke Cao, Rainer Stiefelhagen

Seeing only a tiny part of the whole is not knowing the full circumstance.

Semantic Segmentation

Delivering Arbitrary-Modal Semantic Segmentation

1 code implementation • CVPR 2023 • Jiaming Zhang, Ruiping Liu, Hao Shi, Kailun Yang, Simon Reiß, Kunyu Peng, Haodong Fu, Kaiwei Wang, Rainer Stiefelhagen

To make this possible, we present the arbitrary cross-modal segmentation model CMNeXt.

Ranked #1 on Semantic Segmentation on DSEC

Segmentation Semantic Segmentation +1

Computational Imaging for Machine Perception: Transferring Semantic Segmentation beyond Aberrations

no code implementations • 21 Nov 2022 • Qi Jiang, Hao Shi, Shaohua Gao, Jiaming Zhang, Kailun Yang, Lei Sun, Huajian Ni, Kaiwei Wang

Further, we propose Computational Imaging Assisted Domain Adaptation (CIADA) to leverage prior knowledge of CI for robust performance in SSOA.

Scene Understanding Semantic Segmentation +1

Beyond the Field-of-View: Enhancing Scene Visibility and Perception with Clip-Recurrent Transformer

3 code implementations • 21 Nov 2022 • Hao Shi, Qi Jiang, Kailun Yang, Xiaoting Yin, Huajian Ni, Kaiwei Wang

In this paper, we propose the concept of online video inpainting for autonomous vehicles to expand the field of view, thereby enhancing scene visibility, perception, and system safety.

Ranked #1 on Seeing Beyond the Visible on KITTI360-EX

Autonomous Vehicles object-detection +4

Monolingual Recognizers Fusion for Code-switching Speech Recognition

no code implementations • 2 Nov 2022 • Tongtong Song, Qiang Xu, Haoyu Lu, Longbiao Wang, Hao Shi, Yuqin Lin, Yanbing Yang, Jianwu Dang

It has two stages: the speech awareness (SA) stage and the language fusion (LF) stage.

Automatic Speech Recognition (ASR) +1

LF-VISLAM: A SLAM Framework for Large Field-of-View Cameras with Negative Imaging Plane on Mobile Agents

3 code implementations • 12 Sep 2022 • Ze Wang, Kailun Yang, Hao Shi, Peng Li, Fei Gao, Jian Bai, Kaiwei Wang

As loop closure on wide-FoV panoramic data further comes with a large number of outliers, traditional outlier rejection methods are not directly applicable.

Autonomous Driving Simultaneous Localization and Mapping

Behind Every Domain There is a Shift: Adapting Distortion-aware Vision Transformers for Panoramic Semantic Segmentation

1 code implementation • 25 Jul 2022 • Jiaming Zhang, Kailun Yang, Hao Shi, Simon Reiß, Kunyu Peng, Chaoxiang Ma, Haodong Fu, Philip H. S. Torr, Kaiwei Wang, Rainer Stiefelhagen

In this paper, we address panoramic semantic segmentation which is under-explored due to two critical challenges: (1) image distortions and object deformations on panoramas; (2) lack of semantic annotations in the 360-degree imagery.

Ranked #1 on Semantic Segmentation on SynPASS

Pseudo Label Segmentation +2

Language-specific Characteristic Assistance for Code-switching Speech Recognition

no code implementations • 29 Jun 2022 • Tongtong Song, Qiang Xu, Meng Ge, Longbiao Wang, Hao Shi, Yongjie Lv, Yuqin Lin, Jianwu Dang

The dual-encoder structure successfully utilizes two language-specific encoders (LSEs) for code-switching speech recognition.

Speech Recognition

Annular Computational Imaging: Capture Clear Panoramic Images through Simple Lens

1 code implementation • 13 Jun 2022 • Qi Jiang, Hao Shi, Lei Sun, Shaohua Gao, Kailun Yang, Kaiwei Wang

In this paper, we propose an Annular Computational Imaging (ACI) framework to break the optical limit of light-weight PAL design.

Image Restoration

Efficient Human Pose Estimation via 3D Event Point Cloud

1 code implementation • 9 Jun 2022 • Jiaan Chen, Hao Shi, Yaozu Ye, Kailun Yang, Lei Sun, Kaiwei Wang

We then leverage the rasterized event point cloud as input to three different backbones, PointNet, DGCNN, and Point Transformer, with two linear layer decoders to predict the location of human keypoints.

Ranked #1 on 3D Human Pose Estimation on DHP19

3D Human Pose Estimation Edge-computing +1

DCAN: Diversified News Recommendation with Coverage-Attentive Networks

no code implementations • 2 Jun 2022 • Hao Shi, Zi-Jiao Wang, Lan-Ru Zhai

Self-attention based models are widely used in news recommendation tasks.

News Recommendation

Review on Panoramic Imaging and Its Applications in Scene Understanding

no code implementations • 11 May 2022 • Shaohua Gao, Kailun Yang, Hao Shi, Kaiwei Wang, Jian Bai

However, while satisfying the need for large-FoV photographic imaging, panoramic imaging instruments are expected to have high resolution, no blind area, miniaturization, and multidimensional intelligent perception, and can be combined with artificial intelligence methods towards the next generation of intelligent instruments, enabling deeper understanding and more holistic perception of 360-degree real-world surrounding environments.

Autonomous Driving Depth Estimation +4

An Improved Automatic Modulation Classification Scheme Based on Adaptive Fusion Network

no code implementations • 7 Mar 2022 • Hao Shi, Qi Peng, Yiqi Zhuang

Moreover, a novel confidence weighted loss function is proposed to address the imbalance issue and it is implemented by a two-stage learning scheme. Through the two-stage learning, AFNet can focus on high-confidence samples with more valid information and extract effective representations, so as to improve the overall classification performance.

Classification valid

PanoFlow: Learning 360° Optical Flow for Surrounding Temporal Understanding

1 code implementation • 27 Feb 2022 • Hao Shi, Yifan Zhou, Kailun Yang, Xiaoting Yin, Ze Wang, Yaozu Ye, Zhe Yin, Shi Meng, Peng Li, Kaiwei Wang

PanoFlow achieves state-of-the-art performance on the public OmniFlowNet and the established FlowScape benchmarks.

Autonomous Vehicles Optical Flow Estimation

LF-VIO: A Visual-Inertial-Odometry Framework for Large Field-of-View Cameras with Negative Plane

1 code implementation • 25 Feb 2022 • Ze Wang, Kailun Yang, Hao Shi, Peng Li, Fei Gao, Kaiwei Wang

To tackle this issue, we propose LF-VIO, a real-time VIO framework for cameras with extremely large FoV.

Autonomous Driving Visual Odometry

CSFlow: Learning Optical Flow via Cross Strip Correlation for Autonomous Driving

1 code implementation • 2 Feb 2022 • Hao Shi, Yifan Zhou, Kailun Yang, Xiaoting Yin, Kaiwei Wang

In this paper, we propose a new deep network architecture for optical flow estimation in autonomous driving--CSFlow, which consists of two novel modules: Cross Strip Correlation module (CSC) and Correlation Regression Initialization module (CRI).

Autonomous Driving Optical Flow Estimation

Feature-Area Optimization: A Novel SAR Image Registration Method

no code implementations • 18 Feb 2016 • Fuqiang Liu, Fukun Bi, Liang Chen, Hao Shi, Wei Liu

This letter proposes a synthetic aperture radar (SAR) image registration method named Feature-Area Optimization (FAO).

Image Registration
