no code implementations • ECCV 2020 • Tianshu Yu, Yikang Li, Baoxin Li
We study the behavior of RhyRNN and empirically show that our method works well even when only event-level labels are available in the training stage (compared to algorithms requiring sub-activity labels for recognition), and thus is more practical when sub-activity labels are missing or difficult to obtain.
no code implementations • 13 Oct 2024 • Lexiang Hu, Yikang Li, Zhouchen Lin
However, constructing equivariant neural networks typically requires prior knowledge of data types and symmetries, which is difficult to achieve in most tasks.
1 code implementation • CVPR 2024 • Yikang Li, Yeqing Qiu, Yuxuan Chen, Lingshen He, Zhouchen Lin
In this paper we build affine equivariant networks based on differential invariants from the viewpoint of symmetric PDEs without discretizing or sampling the group.
1 code implementation • 7 Dec 2023 • Xin Li, Yeqi Bai, Pinlong Cai, Licheng Wen, Daocheng Fu, Bo Zhang, Xuemeng Yang, Xinyu Cai, Tao Ma, Jianfei Guo, Xing Gao, Min Dou, Yikang Li, Botian Shi, Yong Liu, Liang He, Yu Qiao
This paper explores the emerging knowledge-driven autonomous driving technologies.
no code implementations • ICCV 2023 • Wentao Jiang, Hao Xiang, Xinyu Cai, Runsheng Xu, Jiaqi Ma, Yikang Li, Gim Hee Lee, Si Liu
We define perceptual gain as the increased perceptual capability when a new LiDAR is placed.
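The perceptual-gain notion above lends itself to a simple greedy placement sketch. Everything below (the `score` callback and the greedy loop) is an illustrative assumption for intuition, not the paper's actual optimization procedure:

```python
# Hypothetical sketch: choose LiDAR placements greedily by perceptual gain.
# `score` is a caller-supplied function mapping a list of placements to a
# perceptual-capability value (e.g. detection accuracy in simulation).

def perceptual_gain(score, placed, candidate):
    """Increase in perceptual capability when `candidate` is added."""
    return score(placed + [candidate]) - score(placed)

def greedy_placement(score, candidates, k):
    """Pick k placements, each time adding the one with the largest gain."""
    placed = []
    candidates = list(candidates)
    for _ in range(k):
        best = max(candidates, key=lambda c: perceptual_gain(score, placed, c))
        placed.append(best)
        candidates.remove(best)
    return placed
```

For instance, with `score` defined as the number of distinct regions covered by the chosen placements, this loop reduces to classic greedy set cover.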
1 code implementation • ICCV 2023 • Youquan Liu, Runnan Chen, Xin Li, Lingdong Kong, Yuchen Yang, Zhaoyang Xia, Yeqi Bai, Xinge Zhu, Yuexin Ma, Yikang Li, Yu Qiao, Yuenan Hou
Besides, we construct the OpenPCSeg codebase, which is the largest and most comprehensive outdoor LiDAR segmentation codebase.
Ranked #2 on 3D Semantic Segmentation on SemanticKITTI (using extra training data)
1 code implementation • 13 Jul 2023 • Licheng Wen, Daocheng Fu, Song Mao, Pinlong Cai, Min Dou, Yikang Li, Yu Qiao
With the growing popularity of digital twin and autonomous driving in transportation, the demand for simulation systems capable of generating high-fidelity and reliable scenarios is increasing.
1 code implementation • ICCV 2023 • Tao Ma, Xuemeng Yang, Hongbin Zhou, Xin Li, Botian Shi, Junjie Liu, Yuchen Yang, Zhizheng Liu, Liang He, Yu Qiao, Yikang Li, Hongsheng Li
Extensive experiments on Waymo Open Dataset show our DetZero outperforms all state-of-the-art onboard and offboard 3D detection methods.
Ranked #1 on 3D Multi-Object Tracking on Waymo Open Dataset
1 code implementation • 8 Jun 2023 • Jianfei Guo, Nianchen Deng, Xinyang Li, Yeqi Bai, Botian Shi, Chiyu Wang, Chenjing Ding, Dongliang Wang, Yikang Li
We present a novel multi-view implicit surface reconstruction technique, termed StreetSurf, that is readily applicable to street view images in widely-used autonomous driving datasets, such as Waymo-perception sequences, without necessarily requiring LiDAR data.
1 code implementation • 5 Jun 2023 • Zhaotong Luo, Guohang Yan, Yikang Li
Research on extrinsic calibration between Light Detection and Ranging (LiDAR) sensors and cameras is moving toward more accurate, automatic, and generic methods.
1 code implementation • NeurIPS 2023 • Jiakang Yuan, Bo Zhang, Xiangchao Yan, Tao Chen, Botian Shi, Yikang Li, Yu Qiao
It is a long-term vision of the Autonomous Driving (AD) community that perception models can learn from a large-scale point cloud dataset to obtain unified representations that achieve promising results on different tasks or benchmarks.
2 code implementations • 16 May 2023 • Siyuan Huang, Bo Zhang, Botian Shi, Peng Gao, Yikang Li, Hongsheng Li
In this paper, different from previous 2D DG works, we focus on the 3D DG problem and propose a Single-dataset Unified Generalization (SUG) framework that only leverages a single source dataset to alleviate the unforeseen domain differences faced by a well-trained source model.
1 code implementation • CVPR 2023 • Xinmiao Lin, Yikang Li, Jenhao Hsiao, Chiuman Ho, Yu Kong
The popular VQ-VAE models reconstruct images by learning a discrete codebook, but suffer from rapid degradation of reconstruction quality as the compression rate rises.
no code implementations • 19 Apr 2023 • Xiaoliang Ju, Yiyang Sun, Yiming Hao, Yikang Li, Yu Qiao, Hongsheng Li
We propose a perception imitation method to simulate the results of a given perception model, and discuss a new heuristic route toward autonomous driving simulators that require no data synthesis.
1 code implementation • 21 Mar 2023 • Jingyang Lin, Hang Hua, Ming Chen, Yikang Li, Jenhao Hsiao, Chiuman Ho, Jiebo Luo
We propose a new joint video and text summarization task.
Ranked #1 on Video Summarization on VideoXum
1 code implementation • CVPR 2023 • Zhaoyang Xia, Youquan Liu, Xin Li, Xinge Zhu, Yuexin Ma, Yikang Li, Yuenan Hou, Yu Qiao
We propose a simple yet effective label rectification strategy, which uses off-the-shelf panoptic segmentation labels to remove the traces of dynamic objects in completion labels, greatly improving the performance of deep models especially for those moving objects.
Ranked #2 on 3D Semantic Scene Completion on SemanticKITTI
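The rectification idea can be sketched in a few lines, assuming hypothetical voxel label grids and illustrative class IDs (the paper's actual label format and dynamic-class list may differ):

```python
import numpy as np

# Hypothetical voxel label grids: completion labels accumulated over time may
# contain "traces" left by moving objects, while panoptic labels from the
# current scan tell us which voxels genuinely belong to dynamic classes.
FREE = 0
DYNAMIC_CLASSES = [10, 11, 15]  # e.g. car, bicycle, person (illustrative IDs)

def rectify_completion_labels(completion, panoptic):
    """Clear dynamic-object traces from completion labels using
    off-the-shelf panoptic segmentation labels (illustrative sketch)."""
    completion = completion.copy()
    # Voxels labeled dynamic in the completion labels but not confirmed
    # dynamic by the panoptic labels are treated as traces -> free space.
    is_trace = np.isin(completion, DYNAMIC_CLASSES) & \
               ~np.isin(panoptic, DYNAMIC_CLASSES)
    completion[is_trace] = FREE
    return completion
```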
1 code implementation • CVPR 2023 • Bo Zhang, Jiakang Yuan, Botian Shi, Tao Chen, Yikang Li, Yu Qiao
In this paper, we study the task of training a unified 3D detector from multiple datasets.
1 code implementation • CVPR 2023 • Jiakang Yuan, Bo Zhang, Xiangchao Yan, Tao Chen, Botian Shi, Yikang Li, Yu Qiao
Unsupervised Domain Adaptation (UDA) techniques have recently been explored in 3D cross-domain tasks.
no code implementations • ICCV 2023 • Lingdong Kong, Youquan Liu, Runnan Chen, Yuexin Ma, Xinge Zhu, Yikang Li, Yuenan Hou, Yu Qiao, Ziwei Liu
We show that, for the first time, a range-view method is able to surpass its point, voxel, and multi-view fusion counterparts on competitive LiDAR semantic and panoptic segmentation benchmarks, i.e., SemanticKITTI, nuScenes, and ScribbleKITTI.
Ranked #4 on 3D Semantic Segmentation on SemanticKITTI
no code implementations • 8 Mar 2023 • Xing Gao, Xiaogang Jia, Yikang Li, Hongkai Xiong
Due to the complex and changing interactions in dynamic scenarios, motion forecasting is a challenging problem in autonomous driving.
Ranked #1 on Trajectory Prediction on Argoverse2
1 code implementation • CVPR 2023 • Xin Li, Tao Ma, Yuenan Hou, Botian Shi, Yuchen Yang, Youquan Liu, Xingjiao Wu, Qin Chen, Yikang Li, Yu Qiao, Liang He
Notably, LoGoNet ranks 1st on the Waymo 3D object detection leaderboard and obtains 81.02 mAPH (L2) detection performance.
1 code implementation • 18 Jan 2023 • Guohang Yan, Zhaotong Luo, Zhuochun Liu, Yikang Li
However, most prior methods focus on extrinsic calibration between sensors, and few focus on the misalignment between the sensors and the vehicle coordinate system.
1 code implementation • CVPR 2023 • Runnan Chen, Youquan Liu, Lingdong Kong, Xinge Zhu, Yuexin Ma, Yikang Li, Yuenan Hou, Yu Qiao, Wenping Wang
For the first time, our pre-trained network achieves annotation-free 3D semantic segmentation with 20.8% and 25.08% mIoU on nuScenes and ScanNet, respectively.
1 code implementation • 20 Dec 2022 • Ben Fei, Siyuan Huang, Jiakang Yuan, Botian Shi, Bo Zhang, Weidong Yang, Min Dou, Yikang Li
Different from previous studies that focus only on a single adaptation task, UniDA3D can tackle several adaptation tasks in the 3D segmentation field by designing a unified source-and-target active sampling strategy, which selects a maximally informative subset from both source and target domains for effective model adaptation.
1 code implementation • 7 Dec 2022 • Xiang Li, Junbo Yin, Botian Shi, Yikang Li, Ruigang Yang, Jianbing Shen
In this paper, we present a more artful framework, LiDAR-guided Weakly Supervised Instance Segmentation (LWSIS), which leverages off-the-shelf 3D data, i.e., point clouds together with 3D boxes, as natural weak supervision for training 2D image instance segmentation models.
2 code implementations • 29 Nov 2022 • Xinyu Cai, Wentao Jiang, Runsheng Xu, Wenquan Zhao, Jiaqi Ma, Si Liu, Yikang Li
Through simulating point cloud data in different LiDAR placements, we can evaluate the perception accuracy of these placements using multiple detection models.
1 code implementation • 19 Oct 2022 • Pengjin Wei, Guohang Yan, Yikang Li, Kun Fang, Jie Yang, Wei Liu
This calibration task is multi-modal: the rich color and texture information captured by the camera and the accurate three-dimensional spatial information from the LiDAR are both highly significant for downstream tasks.
no code implementations • 18 Oct 2022 • Xin Li, Botian Shi, Yuenan Hou, Xingjiao Wu, Tianlong Ma, Yikang Li, Liang He
To address these problems, we construct the homogeneous structure between the point cloud and images to avoid projective information loss by transforming the camera features into the LiDAR 3D space.
no code implementations • 19 Aug 2022 • Shichao Xu, Yikang Li, Jenhao Hsiao, Chiuman Ho, Qi Zhu
In computer vision, multi-label recognition is an important task with many real-world applications, but classifying previously unseen labels remains a significant challenge.
no code implementations • CVPR 2022 • Yuenan Hou, Xinge Zhu, Yuexin Ma, Chen Change Loy, Yikang Li
This article addresses the problem of distilling knowledge from a large teacher model to a slim student network for LiDAR semantic segmentation.
Ranked #8 on 3D Semantic Segmentation on SemanticKITTI
1 code implementation • 27 May 2022 • Guohang Yan, Zhuochun Liu, Chengjie Wang, Chunlei Shi, Pengjin Wei, Xinyu Cai, Tao Ma, Zhizheng Liu, Zebin Zhong, Yuqian Liu, Ming Zhao, Zheng Ma, Yikang Li
To this end, we present OpenCalib, a calibration toolbox that contains a rich set of various sensor calibration methods.
no code implementations • 7 Mar 2022 • Ben Fei, Weidong Yang, Wenming Chen, Zhijun Li, Yikang Li, Tao Ma, Xing Hu, Lipeng Ma
Point cloud completion is a generation and estimation problem that recovers complete shapes from partial point clouds, and it plays a vital role in 3D computer vision applications.
1 code implementation • 7 Mar 2022 • Pengjin Wei, Guohang Yan, Yikang Li, Kun Fang, Xinyu Cai, Jie Yang, Wei Liu
Sensor-based environmental perception is a crucial part of the autonomous driving system.
1 code implementation • 3 Mar 2022 • Peng Ye, Baopu Li, Yikang Li, Tao Chen, Jiayuan Fan, Wanli Ouyang
Neural Architecture Search (NAS) has attracted increasing attention in recent years because of its capability to design deep neural networks automatically.
no code implementations • 6 Feb 2022 • Keli Huang, Botian Shi, Xiang Li, Xin Li, Siyuan Huang, Yikang Li
Multi-modal fusion is a fundamental task for the perception of an autonomous driving system, which has recently intrigued many researchers.
1 code implementation • CVPR 2022 • Peng Ye, Baopu Li, Yikang Li, Tao Chen, Jiayuan Fan, Wanli Ouyang
Neural Architecture Search (NAS) has attracted increasing attention in recent years because of its capability to design deep neural networks automatically.
no code implementations • 6 Jun 2021 • Tao Ma, Yikang Li
Correspondingly, a MOC-GAN is proposed to mix the inputs of two modalities to generate realistic images.
no code implementations • 14 Apr 2021 • Tao Ma, Zhizheng Liu, Yikang Li
To tackle these issues, we propose a novel method based on conditional entropy in Bayesian theory to evaluate the sensor configurations containing both cameras and LiDARs.
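As a rough, self-contained illustration of scoring with conditional entropy H(Y|X) (the joint probability tables below are toy placeholders, not the paper's probabilistic model of camera/LiDAR configurations):

```python
import numpy as np

def conditional_entropy(joint):
    """H(Y|X) in bits from a joint probability table p(x, y)
    (rows index x, columns index y)."""
    px = joint.sum(axis=1, keepdims=True)            # marginal p(x)
    # p(y|x), guarding against division by zero for impossible x
    cond = np.divide(joint, px, out=np.zeros_like(joint), where=px > 0)
    with np.errstate(divide="ignore", invalid="ignore"):
        log_cond = np.where(cond > 0, np.log2(cond), 0.0)
    return float(-(joint * log_cond).sum())

# Toy comparison of two hypothetical sensor configurations: the one whose
# measurements X leave less uncertainty about the scene state Y
# (lower H(Y|X)) is the better configuration.
config_a = np.array([[0.25, 0.25], [0.25, 0.25]])  # uninformative: 1 bit left
config_b = np.array([[0.5, 0.0], [0.0, 0.5]])      # deterministic: 0 bits left
```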
no code implementations • 8 Mar 2021 • Tao Ma, Zhizheng Liu, Guohang Yan, Yikang Li
For autonomous vehicles, an accurate calibration for LiDAR and camera is a prerequisite for multi-sensor perception systems.
no code implementations • 19 Dec 2020 • Yikang Li, Pulkit Goel, Varsha Kuppur Rajendra, Har Simrat Singh, Jonathan Francis, Kaixin Ma, Eric Nyberg, Alessandro Oltramari
Conditional text generation has been a challenging task that is yet to see human-level performance from state-of-the-art models.
no code implementations • 12 Sep 2020 • Yi Zhou, Shuyang Sun, Chao Zhang, Yikang Li, Wanli Ouyang
By assigning each relationship a single label, current approaches formulate relationship detection as a classification problem.
no code implementations • ICLR 2020 • Tianshu Yu, Yikang Li, Baoxin Li
Determinantal point processes (DPPs) are an effective tool for delivering diversity in multiple machine learning and computer vision tasks.
1 code implementation • 14 Jan 2020 • Yikang Li, Tianshu Yu, Baoxin Li
In this paper, we investigate the problem of recognizing long and complex events with varying action rhythms, which has not been considered in the literature but is a practical challenge.
1 code implementation • NeurIPS 2019 • Yikang Li, Tao Ma, Yeqi Bai, Nan Duan, Sining Wei, Xiaogang Wang
Therefore, to generate images with preferred objects and rich interactions, we propose a semi-parametric method, PasteGAN, for generating an image from a scene graph and image crops: the spatial arrangements of the objects and their pairwise relationships are defined by the scene graph, while the object appearances are determined by the given crops.
no code implementations • 16 Apr 2019 • Yikang Li, Chris Twigg, Yuting Ye, Lingling Tao, Xiaogang Wang
Hand pose estimation from the monocular 2D image is challenging due to the variation in lighting, appearance, and background.
1 code implementation • CVPR 2019 • Yifan Sun, Qin Xu, Ya-Li Li, Chi Zhang, Yikang Li, Shengjin Wang, Jian Sun
The visibility awareness allows VPM to extract region-level features and compare two images with focus on their shared regions (which are visible on both images).
Ranked #12 on Person Re-Identification on Market-1501-C
no code implementations • 2 Dec 2018 • Yantian Zha, Yikang Li, Tianshu Yu, Subbarao Kambhampati, Baoxin Li
We build an event recognition system, ER-PRN, which takes Pixel Dynamics Network as a subroutine, to recognize events based on observations augmented by plan-recognition-driven attention.
no code implementations • 24 Nov 2018 • Pak Lun Kevin Ding, Yikang Li, Baoxin Li
In this paper, we introduce a new metric named Mean Local Group Average Precision (mLGAP) for better evaluation of the performance of hashing-based retrieval.
no code implementations • ECCV 2018 • Peng Gao, Pan Lu, Hongsheng Li, Shuang Li, Yikang Li, Steven Hoi, Xiaogang Wang
Most state-of-the-art VQA methods fuse the high-level textual and visual features from the neural network and abandon the visual spatial information when learning multi-modal features. To address these problems, question-guided kernels generated from the input question are designed to convolve with visual features to capture the textual and visual relationship in the early stage.
Ranked #14 on Visual Question Answering (VQA) on CLEVR
1 code implementation • ECCV 2018 • Yikang Li, Wanli Ouyang, Bolei Zhou, Jianping Shi, Chao Zhang, Xiaogang Wang
Generating a scene graph to describe all the relations inside an image has gained increasing interest in recent years.
Ranked #1 on Scene Graph Generation on VRD
1 code implementation • 1 Feb 2018 • Yikang Li, Pak Lun Kevin Ding, Baoxin Li
Experimental results show that our proposed activation function outperforms other state-of-the-art models with most networks.
no code implementations • 5 Dec 2017 • Yantian Zha, Yikang Li, Sriram Gopalakrishnan, Baoxin Li, Subbarao Kambhampati
The first involves resampling the distribution sequences to single action sequences, from which we could learn an action affinity model based on learned action (word) embeddings for plan recognition.
no code implementations • 26 Nov 2017 • Pengpeng Liu, Xiaojuan Qi, Pinjia He, Yikang Li, Michael R. Lyu, Irwin King
Image completion has achieved significant progress due to advances in generative adversarial networks (GANs).
no code implementations • CVPR 2018 • Yikang Li, Nan Duan, Bolei Zhou, Xiao Chu, Wanli Ouyang, Xiaogang Wang
Recently, visual question answering (VQA) and visual question generation (VQG) have become two trending topics in computer vision, yet they have been explored separately.
1 code implementation • ICCV 2017 • Yikang Li, Wanli Ouyang, Bolei Zhou, Kun Wang, Xiaogang Wang
Object detection, scene graph generation and region captioning, which are three scene understanding tasks at different semantic levels, are tied together: scene graphs are generated on top of objects detected in an image with their pairwise relationship predicted, while region captioning gives a language description of the objects, their attributes, relations, and other context information.
Ranked #2 on Object Detection on Visual Genome
no code implementations • CVPR 2017 • Yikang Li, Wanli Ouyang, Xiaogang Wang, Xiao'ou Tang
In this paper, each visual relationship is considered as a phrase with three components.