Search Results for author: Yikang Li

Found 54 papers, 29 papers with code

RhyRNN: Rhythmic RNN for Recognizing Events in Long and Complex Videos

no code implementations ECCV 2020 Tianshu Yu, Yikang Li, Baoxin Li

We study the behavior of RhyRNN and empirically show that our method works well even when only event-level labels are available in the training stage (compared to algorithms requiring sub-activity labels for recognition), and thus is more practical when the sub-activity labels are missing or difficult to obtain.

LimSim: A Long-term Interactive Multi-scenario Traffic Simulator

1 code implementation 13 Jul 2023 Licheng Wen, Daocheng Fu, Song Mao, Pinlong Cai, Min Dou, Yikang Li, Yu Qiao

With the growing popularity of digital twin and autonomous driving in transportation, the demand for simulation systems capable of generating high-fidelity and reliable scenarios is increasing.

Autonomous Driving

StreetSurf: Extending Multi-view Implicit Surface Reconstruction to Street Views

1 code implementation 8 Jun 2023 Jianfei Guo, Nianchen Deng, Xinyang Li, Yeqi Bai, Botian Shi, Chiyu Wang, Chenjing Ding, Dongliang Wang, Yikang Li

We present a novel multi-view implicit surface reconstruction technique, termed StreetSurf, that is readily applicable to street view images in widely-used autonomous driving datasets, such as Waymo-perception sequences, without necessarily requiring LiDAR data.

Autonomous Driving Neural Rendering +2

Calib-Anything: Zero-training LiDAR-Camera Extrinsic Calibration Method Using Segment Anything

1 code implementation 5 Jun 2023 Zhaotong Luo, Guohang Yan, Yikang Li

Research on extrinsic calibration between Light Detection and Ranging (LiDAR) sensors and cameras is moving toward more accurate, automatic, and generic methods.

Camera Calibration

AD-PT: Autonomous Driving Pre-Training with Large-scale Point Cloud Dataset

1 code implementation NeurIPS 2023 Jiakang Yuan, Bo Zhang, Xiangchao Yan, Tao Chen, Botian Shi, Yikang Li, Yu Qiao

It is a long-term vision for the Autonomous Driving (AD) community that perception models can learn from a large-scale point cloud dataset to obtain unified representations that achieve promising results on different tasks or benchmarks.

Autonomous Driving Point Cloud Pre-training

SUG: Single-dataset Unified Generalization for 3D Point Cloud Classification

2 code implementations 16 May 2023 Siyuan Huang, Bo Zhang, Botian Shi, Peng Gao, Yikang Li, Hongsheng Li

In this paper, different from previous 2D DG works, we focus on the 3D DG problem and propose a Single-dataset Unified Generalization (SUG) framework that only leverages a single source dataset to alleviate the unforeseen domain differences faced by a well-trained source model.

3D Point Cloud Classification Domain Generalization +2

Catch Missing Details: Image Reconstruction with Frequency Augmented Variational Autoencoder

1 code implementation CVPR 2023 Xinmiao Lin, Yikang Li, Jenhao Hsiao, Chiuman Ho, Yu Kong

The popular VQ-VAE models reconstruct images by learning a discrete codebook, but suffer from rapid degradation of reconstruction quality as the compression rate rises.

Image Generation Image Reconstruction

Perception Imitation: Towards Synthesis-free Simulator for Autonomous Vehicles

no code implementations 19 Apr 2023 Xiaoliang Ju, Yiyang Sun, Yiming Hao, Yikang Li, Yu Qiao, Hongsheng Li

We propose a perception imitation method to simulate the results of a given perception model, and discuss a new heuristic route toward an autonomous driving simulator that requires no data synthesis.

Autonomous Driving

SCPNet: Semantic Scene Completion on Point Cloud

1 code implementation CVPR 2023 Zhaoyang Xia, Youquan Liu, Xin Li, Xinge Zhu, Yuexin Ma, Yikang Li, Yuenan Hou, Yu Qiao

We propose a simple yet effective label rectification strategy, which uses off-the-shelf panoptic segmentation labels to remove the traces of dynamic objects in completion labels, greatly improving the performance of deep models, especially on moving objects.

3D Semantic Scene Completion Knowledge Distillation +3
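The label-rectification strategy from the SCPNet abstract can be pictured as a simple masking operation. Below is a minimal NumPy sketch under stated assumptions: the class IDs and the free-space label are hypothetical placeholders, not SCPNet's actual label map.

```python
import numpy as np

# Hypothetical label IDs for this sketch; real datasets define their own maps.
DYNAMIC_CLASSES = [1, 2]   # e.g. car, pedestrian
FREE_LABEL = 0             # empty / unlabeled voxels

def rectify_completion_labels(completion, panoptic):
    """Reset voxels that the panoptic labels mark as dynamic objects,
    so motion trails left by moving objects do not pollute the
    completion ground truth."""
    rectified = completion.copy()
    rectified[np.isin(panoptic, DYNAMIC_CLASSES)] = FREE_LABEL
    return rectified
```

In practice the panoptic labels would first be aligned to the completion voxel grid; the masking step itself stays this simple.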

Rethinking Range View Representation for LiDAR Segmentation

no code implementations ICCV 2023 Lingdong Kong, Youquan Liu, Runnan Chen, Yuexin Ma, Xinge Zhu, Yikang Li, Yuenan Hou, Yu Qiao, Ziwei Liu

We show, for the first time, that a range view method is able to surpass its point, voxel, and multi-view-fusion counterparts on competing LiDAR semantic and panoptic segmentation benchmarks, i.e., SemanticKITTI, nuScenes, and ScribbleKITTI.

3D Semantic Segmentation Autonomous Driving +4

SensorX2car: Sensors-to-car calibration for autonomous driving in road scenarios

1 code implementation 18 Jan 2023 Guohang Yan, Zhaotong Luo, Zhuochun Liu, Yikang Li

However, most prior methods focus on extrinsic calibration between sensors, and few focus on the misalignment between the sensors and the vehicle coordinate system.

Autonomous Driving

CLIP2Scene: Towards Label-efficient 3D Scene Understanding by CLIP

1 code implementation CVPR 2023 Runnan Chen, Youquan Liu, Lingdong Kong, Xinge Zhu, Yuexin Ma, Yikang Li, Yuenan Hou, Yu Qiao, Wenping Wang

For the first time, our pre-trained network achieves annotation-free 3D semantic segmentation with 20.8% and 25.08% mIoU on nuScenes and ScanNet, respectively.

3D Semantic Segmentation Contrastive Learning +4

UniDA3D: Unified Domain Adaptive 3D Semantic Segmentation Pipeline

1 code implementation 20 Dec 2022 Ben Fei, Siyuan Huang, Jiakang Yuan, Botian Shi, Bo Zhang, Weidong Yang, Min Dou, Yikang Li

Different from previous studies that only focus on a single adaptation task, UniDA3D can tackle several adaptation tasks in 3D segmentation field, by designing a unified source-and-target active sampling strategy, which selects a maximally-informative subset from both source and target domains for effective model adaptation.

3D Semantic Segmentation Domain Generalization +2

LWSIS: LiDAR-guided Weakly Supervised Instance Segmentation for Autonomous Driving

1 code implementation 7 Dec 2022 Xiang Li, Junbo Yin, Botian Shi, Yikang Li, Ruigang Yang, Jianbing Shen

In this paper, we present a more artful framework, LiDAR-guided Weakly Supervised Instance Segmentation (LWSIS), which leverages off-the-shelf 3D data, i.e., point clouds together with 3D boxes, as natural weak supervision for training 2D image instance segmentation models.

Autonomous Driving Instance Segmentation +5

Analyzing Infrastructure LiDAR Placement with Realistic LiDAR Simulation Library

2 code implementations 29 Nov 2022 Xinyu Cai, Wentao Jiang, Runsheng Xu, Wenquan Zhao, Jiaqi Ma, Si Liu, Yikang Li

By simulating point cloud data under different LiDAR placements, we can evaluate the perception accuracy of these placements using multiple detection models.

Online LiDAR-Camera Extrinsic Parameters Self-checking

1 code implementation 19 Oct 2022 Pengjin Wei, Guohang Yan, Yikang Li, Kun Fang, Jie Yang, Wei Liu

This calibration task is multi-modal: the rich color and texture information captured by the camera and the accurate three-dimensional spatial information from the LiDAR are both vital for downstream tasks.

Autonomous Driving Binary Classification

Homogeneous Multi-modal Feature Fusion and Interaction for 3D Object Detection

no code implementations 18 Oct 2022 Xin Li, Botian Shi, Yuenan Hou, Xingjiao Wu, Tianlong Ma, Yikang Li, Liang He

To address these problems, we construct the homogeneous structure between the point cloud and images to avoid projective information loss by transforming the camera features into the LiDAR 3D space.

3D Object Detection Autonomous Driving +1

Open Vocabulary Multi-Label Classification with Dual-Modal Decoder on Aligned Visual-Textual Features

no code implementations 19 Aug 2022 Shichao Xu, Yikang Li, Jenhao Hsiao, Chiuman Ho, Zhu Qi

In computer vision, multi-label recognition is an important task with many real-world applications, but classifying previously unseen labels remains a significant challenge.

Classification Multi-Label Classification +1

Point-to-Voxel Knowledge Distillation for LiDAR Semantic Segmentation

no code implementations CVPR 2022 Yuenan Hou, Xinge Zhu, Yuexin Ma, Chen Change Loy, Yikang Li

This article addresses the problem of distilling knowledge from a large teacher model to a slim student network for LiDAR semantic segmentation.

Ranked #8 on LIDAR Semantic Segmentation on nuScenes (val mIoU metric)

3D Semantic Segmentation Knowledge Distillation +1
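The teacher-to-student distillation that this paper applies to LiDAR segmentation rests on matching temperature-softened class distributions. Here is a generic NumPy sketch of that core loss, not the paper's exact point-to-voxel formulation (which additionally distills voxel-level and relation statistics):

```python
import numpy as np

def softmax(z, T=1.0):
    """Numerically stable softmax with temperature T."""
    z = np.asarray(z, dtype=float) / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL divergence between temperature-softened teacher and student
    class distributions, averaged over points (or voxels); the T^2
    factor keeps gradient magnitudes comparable across temperatures."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    kl = np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)), axis=-1)
    return float(np.mean(kl)) * T * T
```

The loss is zero when the student reproduces the teacher's logits and grows as the distributions diverge.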

OpenCalib: A Multi-sensor Calibration Toolbox for Autonomous Driving

1 code implementation 27 May 2022 Guohang Yan, Zhuochun Liu, Chengjie Wang, Chunlei Shi, Pengjin Wei, Xinyu Cai, Tao Ma, Zhizheng Liu, Zebin Zhong, Yuqian Liu, Ming Zhao, Zheng Ma, Yikang Li

To this end, we present OpenCalib, a calibration toolbox that contains a rich set of various sensor calibration methods.

Autonomous Driving

Comprehensive Review of Deep Learning-Based 3D Point Cloud Completion Processing and Analysis

no code implementations 7 Mar 2022 Ben Fei, Weidong Yang, Wenming Chen, Zhijun Li, Yikang Li, Tao Ma, Xing Hu, Lipeng Ma

Point cloud completion is a generation and estimation problem derived from partial point clouds, and it plays a vital role in 3D computer vision applications.

Point Cloud Completion

β-DARTS: Beta-Decay Regularization for Differentiable Architecture Search

1 code implementation 3 Mar 2022 Peng Ye, Baopu Li, Yikang Li, Tao Chen, Jiayuan Fan, Wanli Ouyang

Neural Architecture Search (NAS) has attracted increasing attention in recent years because of its capability to design deep neural networks automatically.

Neural Architecture Search
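The title's Beta-Decay idea is to regularize the softmaxed architecture weights β = softmax(α) rather than the raw α. The NumPy sketch below is a rough stand-in for that idea, not the paper's exact Beta-Decay term; the penalty form and λ are assumptions for illustration.

```python
import numpy as np

def beta_regularizer(alpha, lam=1e-3):
    """Penalize the softmaxed architecture weights beta = softmax(alpha).

    Unlike plain weight decay on alpha, this penalty is invariant to a
    constant shift of alpha, since softmax ignores such shifts."""
    a = np.asarray(alpha, dtype=float)
    beta = np.exp(a - a.max())      # stable softmax
    beta = beta / beta.sum()
    return lam * float(np.sum(beta ** 2))
```

Because the sum of squares of a probability vector is minimized when it is uniform, the penalty discourages the architecture weights from collapsing onto a single operation too early.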

Multi-modal Sensor Fusion for Auto Driving Perception: A Survey

no code implementations 6 Feb 2022 Keli Huang, Botian Shi, Xiang Li, Xin Li, Siyuan Huang, Yikang Li

Multi-modal fusion is a fundamental task for the perception of an autonomous driving system, which has recently intrigued many researchers.

Autonomous Driving object-detection +3

β-DARTS: Beta-Decay Regularization for Differentiable Architecture Search

1 code implementation CVPR 2022 Peng Ye, Baopu Li, Yikang Li, Tao Chen, Jiayuan Fan, Wanli Ouyang

Neural Architecture Search (NAS) has attracted increasing attention in recent years because of its capability to design deep neural networks automatically.

Neural Architecture Search

MOC-GAN: Mixing Objects and Captions to Generate Realistic Images

no code implementations 6 Jun 2021 Tao Ma, Yikang Li

Correspondingly, MOC-GAN is proposed to mix the inputs of the two modalities to generate realistic images.

Implicit Relations

Perception Entropy: A Metric for Multiple Sensors Configuration Evaluation and Design

no code implementations 14 Apr 2021 Tao Ma, Zhizheng Liu, Yikang Li

To tackle these issues, we propose a novel method based on conditional entropy in Bayesian theory to evaluate the sensor configurations containing both cameras and LiDARs.

Autonomous Driving
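The abstract's conditional-entropy metric can be illustrated with a toy computation: score a sensor configuration by the average entropy of the class posteriors it yields. This is a minimal sketch of the general idea, not the paper's full Bayesian formulation.

```python
import numpy as np

def perception_entropy(posteriors):
    """Mean entropy (in nats) of per-object class posteriors obtained
    under a candidate sensor configuration; lower values mean the
    configuration leaves less uncertainty about the scene."""
    entropies = []
    for p in posteriors:
        p = np.asarray(p, dtype=float)
        p = p / p.sum()                                   # normalize
        entropies.append(-float(np.sum(p * np.log(p + 1e-12))))
    return float(np.mean(entropies))
```

A configuration whose detections are confidently peaked scores lower (better) than one that leaves the posteriors near uniform.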

CRLF: Automatic Calibration and Refinement based on Line Feature for LiDAR and Camera in Road Scenes

no code implementations 8 Mar 2021 Tao Ma, Zhizheng Liu, Guohang Yan, Yikang Li

For autonomous vehicles, accurate calibration between LiDAR and camera is a prerequisite for multi-sensor perception systems.

Autonomous Vehicles

Exploring the Hierarchy in Relation Labels for Scene Graph Generation

no code implementations 12 Sep 2020 Yi Zhou, Shuyang Sun, Chao Zhang, Yikang Li, Wanli Ouyang

By assigning each relationship a single label, current approaches formulate relationship detection as a classification problem.

Graph Generation Relation +2

Deep Learning of Determinantal Point Processes via Proper Spectral Sub-gradient

no code implementations ICLR 2020 Tianshu Yu, Yikang Li, Baoxin Li

Determinantal point processes (DPPs) are an effective tool for delivering diversity in many machine learning and computer vision tasks.

Point Processes
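The diversity property that the abstract attributes to DPPs comes from their determinant-based likelihood: a subset S of items has probability P(S) = det(L_S) / det(L + I), so near-orthogonal (diverse) items are favored over similar ones. A small NumPy sketch of that likelihood:

```python
import numpy as np

def dpp_log_likelihood(L, subset):
    """Log-probability of `subset` under a DPP with kernel matrix L:
    log det(L_S) - log det(L + I), where L_S is the principal
    submatrix of L indexed by the subset."""
    L = np.asarray(L, dtype=float)
    L_S = L[np.ix_(subset, subset)]
    _, logdet_S = np.linalg.slogdet(L_S)
    _, logdet_norm = np.linalg.slogdet(L + np.eye(len(L)))
    return float(logdet_S - logdet_norm)
```

With an identity kernel (fully dissimilar items) a pair is as likely as independent draws; raising the off-diagonal similarity shrinks det(L_S) and makes the pair less probable, which is exactly the diversity effect.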

Recognizing Video Events with Varying Rhythms

1 code implementation 14 Jan 2020 Yikang Li, Tianshu Yu, Baoxin Li

In this paper, we investigate the problem of recognizing long and complex events with varying action rhythms, which has not been considered in the literature but is a practical challenge.

Action Recognition

PasteGAN: A Semi-Parametric Method to Generate Image from Scene Graph

1 code implementation NeurIPS 2019 Yikang Li, Tao Ma, Yeqi Bai, Nan Duan, Sining Wei, Xiaogang Wang

Therefore, to generate images with preferred objects and rich interactions, we propose a semi-parametric method, PasteGAN, for generating an image from a scene graph and image crops: the spatial arrangement of the objects and their pairwise relationships are defined by the scene graph, while the object appearances are determined by the given crops.

Image Generation Object

Disentangling Pose from Appearance in Monochrome Hand Images

no code implementations 16 Apr 2019 Yikang Li, Chris Twigg, Yuting Ye, Lingling Tao, Xiaogang Wang

Hand pose estimation from a monocular 2D image is challenging due to variations in lighting, appearance, and background.

2D Pose Estimation Disentanglement +1

Perceive Where to Focus: Learning Visibility-aware Part-level Features for Partial Person Re-identification

1 code implementation CVPR 2019 Yifan Sun, Qin Xu, Ya-Li Li, Chi Zhang, Yikang Li, Shengjin Wang, Jian Sun

The visibility awareness allows VPM to extract region-level features and compare two images with focus on their shared regions (which are visible on both images).

Person Re-Identification

Plan-Recognition-Driven Attention Modeling for Visual Recognition

no code implementations 2 Dec 2018 Yantian Zha, Yikang Li, Tianshu Yu, Subbarao Kambhampati, Baoxin Li

We build an event recognition system, ER-PRN, which takes Pixel Dynamics Network as a subroutine, to recognize events based on observations augmented by plan-recognition-driven attention.

Mean Local Group Average Precision (mLGAP): A New Performance Metric for Hashing-based Retrieval

no code implementations 24 Nov 2018 Pak Lun Kevin Ding, Yikang Li, Baoxin Li

In this paper, we introduce a new metric named Mean Local Group Average Precision (mLGAP) for better evaluation of the performance of hashing-based retrieval.

Image Retrieval Retrieval

Question-Guided Hybrid Convolution for Visual Question Answering

no code implementations ECCV 2018 Peng Gao, Pan Lu, Hongsheng Li, Shuang Li, Yikang Li, Steven Hoi, Xiaogang Wang

Most state-of-the-art VQA methods fuse the high-level textual and visual features from the neural network and abandon the visual spatial information when learning multi-modal features. To address these problems, question-guided kernels generated from the input question are designed to convolve with visual features, capturing the textual-visual relationship at an early stage.

Question Answering Visual Question Answering
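The mechanism sketched in the abstract, predicting convolution kernels from the question and applying them to visual features, can be shown in a few lines. This NumPy sketch uses a 1×1 kernel and a random generator matrix as stand-ins; the paper's QGHC predicts k×k group-convolution kernels from a learned generator.

```python
import numpy as np

def question_guided_conv1x1(visual, question, W_gen):
    """Generate a 1x1 convolution kernel from the question embedding
    and apply it to the visual feature map.

    visual:   (C, H, W) visual features
    question: (q,)      question embedding
    W_gen:    (C*C, q)  kernel-generator weights (learned in practice)
    """
    C = visual.shape[0]
    kernel = (W_gen @ question).reshape(C, C)   # question-conditioned kernel
    # 1x1 convolution = per-pixel linear mix of channels
    return np.einsum("oc,chw->ohw", kernel, visual)
```

Because the kernel is a function of the question, the same image yields different feature maps for different questions, which is the point of question-guided convolution.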

Training Neural Networks by Using Power Linear Units (PoLUs)

1 code implementation 1 Feb 2018 Yikang Li, Pak Lun Kevin Ding, Baoxin Li

Experimental results show that our proposed activation function outperforms other state-of-the-art models with most networks.

Image Classification

Recognizing Plans by Learning Embeddings from Observed Action Distributions

no code implementations 5 Dec 2017 Yantian Zha, Yikang Li, Sriram Gopalakrishnan, Baoxin Li, Subbarao Kambhampati

The first involves resampling the distribution sequences into single action sequences, from which we can learn an action-affinity model based on learned action (word) embeddings for plan recognition.

Activity Recognition Word Embeddings

Semantically Consistent Image Completion with Fine-grained Details

no code implementations 26 Nov 2017 Pengpeng Liu, Xiaojuan Qi, Pinjia He, Yikang Li, Michael R. Lyu, Irwin King

Image completion has achieved significant progress due to advances in generative adversarial networks (GANs).

Image Inpainting

Visual Question Generation as Dual Task of Visual Question Answering

no code implementations CVPR 2018 Yikang Li, Nan Duan, Bolei Zhou, Xiao Chu, Wanli Ouyang, Xiaogang Wang

Visual question answering (VQA) and visual question generation (VQG) are two trending topics in computer vision that have so far been explored separately.

Question Answering Question Generation +2

Scene Graph Generation from Objects, Phrases and Region Captions

1 code implementation ICCV 2017 Yikang Li, Wanli Ouyang, Bolei Zhou, Kun Wang, Xiaogang Wang

Object detection, scene graph generation and region captioning, which are three scene understanding tasks at different semantic levels, are tied together: scene graphs are generated on top of objects detected in an image with their pairwise relationship predicted, while region captioning gives a language description of the objects, their attributes, relations, and other context information.

Graph Generation object-detection +3
