no code implementations • 11 Mar 2025 • Runjian Chen, Wenqi Shao, Bo Zhang, Shaoshuai Shi, Li Jiang, Ping Luo
However, the over-reliance on real labeled data in LiDAR perception limits the scale of on-road attempts.
no code implementations • 2 Jan 2025 • Jiajun Deng, Tianyu He, Li Jiang, Tianyu Wang, Feras Dayoub, Ian Reid
Current 3D Large Multimodal Models (3D LMMs) have shown tremendous potential in 3D-vision-based dialogue and reasoning.
1 code implementation • 15 Dec 2024 • Guan Wang, Haoyi Niu, Jianxiong Li, Li Jiang, Jianming Hu, Xianyuan Zhan
Among various branches of offline reinforcement learning (RL) methods, goal-conditioned supervised learning (GCSL) has gained increasing popularity as it formulates the offline RL problem as a sequential modeling task, thereby bypassing the notoriously difficult credit assignment challenge of value learning in the conventional RL paradigm.
no code implementations • 31 Jul 2024 • Anurag Das, Xinting Hu, Li Jiang, Bernt Schiele
Specifically, we first propose Mask-Text Decoder that enhances the mask representations using rich textual data with the CLIP language model.
no code implementations • 9 Jun 2024 • Li Jiang, Xiao Li, Andre Milzarek, Junwen Qiu
Chung's lemma is a classical tool for establishing asymptotic convergence rates of (stochastic) optimization methods under strong convexity-type assumptions and appropriate polynomial diminishing step sizes.
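For orientation, the classical lemma is usually stated in roughly the following form. This is a sketch of the textbook version for the case c > p only, not the generalized result of the paper; the symbols a_k, c, d, and p are notation chosen here.

```latex
% Classical Chung's lemma (textbook form, case c > p).
% Let (a_k) be a nonnegative sequence satisfying, for all k >= 1,
\[
  a_{k+1} \le \Bigl(1 - \frac{c}{k}\Bigr) a_k + \frac{d}{k^{p+1}},
  \qquad d > 0,\quad c > p > 0 .
\]
% Then the sequence decays at the polynomial rate k^{-p}:
\[
  a_k \le \frac{d}{c - p}\,\frac{1}{k^{p}} + o\bigl(k^{-p}\bigr).
\]
```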
1 code implementation • 5 Jun 2024 • Li Jiang, Zhaowei Lu, Yuebing Gao, Yifan Wang
Image copy-move is an operation that replaces one part of the image with another part of the same image, which can be used for illegal purposes due to the potential semantic changes.
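The operation described above can be illustrated with a minimal sketch. It assumes a grayscale image stored as a list of rows; the function name `copy_move` and its parameters are hypothetical and are not taken from the paper's code.

```python
def copy_move(image, src, dst, size):
    """Return a copy of `image` in which the patch at `src` overwrites the region at `dst`.

    `src` and `dst` are (row, col) top-left corners; `size` is (height, width).
    """
    h, w = size
    sy, sx = src
    dy, dx = dst
    forged = [row[:] for row in image]  # leave the input untouched
    for r in range(h):
        for c in range(w):
            # Read from the original so overlapping regions behave predictably.
            forged[dy + r][dx + c] = image[sy + r][sx + c]
    return forged


# A 6x6 toy "image" whose pixel values encode their own position.
image = [[6 * r + c for c in range(6)] for r in range(6)]
forged = copy_move(image, src=(0, 0), dst=(3, 3), size=(2, 2))
```

After the call, the 2x2 block at rows 3-4, columns 3-4 of `forged` duplicates the top-left block, which is the kind of repeated region a copy-move detector looks for.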
no code implementations • 19 May 2024 • Li Jiang, Yusen Wu, Junwu Xiong, Jingqing Ruan, Yichuan Ding, Qingpei Guo, Zujie Wen, Jun Zhou, Xiaotie Deng
Preference datasets are essential for incorporating human preferences into pre-trained language models, playing a key role in the success of Reinforcement Learning from Human Feedback.
no code implementations • 17 May 2024 • Hao Zhou, Chengming Hu, Ye Yuan, Yufei Cui, Yili Jin, Can Chen, Haolun Wu, Dun Yuan, Li Jiang, Di Wu, Xue Liu, Charlie Zhang, Xianbin Wang, Jiangchuan Liu
Then, we introduce LLM-enabled key techniques and telecom applications in terms of generation, classification, optimization, and prediction problems.
1 code implementation • CVPR 2024 • Senqiao Yang, Zhuotao Tian, Li Jiang, Jiaya Jia
This paper introduces Unified Language-driven Zero-shot Domain Adaptation (ULDA), a novel task setting that enables a single model to adapt to diverse target domains without explicit domain-ID knowledge.
1 code implementation • CVPR 2024 • Bohao Peng, Xiaoyang Wu, Li Jiang, Yukang Chen, Hengshuang Zhao, Zhuotao Tian, Jiaya Jia
This exploration led to the creation of Omni-Adaptive 3D CNNs (OA-CNNs), a family of networks that integrates a lightweight module to greatly enhance the adaptivity of sparse CNNs at minimal computational cost.
Ranked #5 on
3D Semantic Segmentation
on SemanticKITTI
(val mIoU metric)
no code implementations • 20 Mar 2024 • Xiaosong Jia, Shaoshuai Shi, Zijun Chen, Li Jiang, Wenlong Liao, Tao He, Junchi Yan
As an essential task in autonomous driving (AD), motion prediction aims to predict the future states of surrounding objects for navigation.
1 code implementation • CVPR 2024 • Chengyao Wang, Li Jiang, Xiaoyang Wu, Zhuotao Tian, Bohao Peng, Hengshuang Zhao, Jiaya Jia
To address this issue, we propose GroupContrast, a novel approach that combines segment grouping and semantic-aware contrastive learning.
1 code implementation • 14 Mar 2024 • Haiyang Wang, Hao Tang, Li Jiang, Shaoshuai Shi, Muhammad Ferjad Naeem, Hongsheng Li, Bernt Schiele, LiWei Wang
Due to its simple design, this paradigm holds promise for narrowing the architectural gap between vision and language.
Ranked #2 on
Video Captioning
on MSVD-CTN
(using extra training data)
no code implementations • 14 Mar 2024 • Jie Li, Jiaying Wen, Tongxin Yang, Fenglin Cai, Miao Wei, Zhiwei Zhang, Li Jiang
In this paper, we introduce a new dataset in the medical field of Traumatic Brain Injury (TBI), called TBI-IT, which includes both electronic medical records (EMRs) and head CT images.
1 code implementation • 29 Jan 2024 • Jie Li, Yulong Xia, Tongxin Yang, Fenglin Cai, Miao Wei, Zhiwei Zhang, Li Jiang
Index Terms: HICH, deep learning, intraparenchymal hemorrhage, named entity recognition, novel dataset
1 code implementation • CVPR 2024 • Xiaoyang Wu, Li Jiang, Peng-Shuai Wang, Zhijian Liu, Xihui Liu, Yu Qiao, Wanli Ouyang, Tong He, Hengshuang Zhao
This paper is not motivated to seek innovation within the attention mechanism.
no code implementations • CVPR 2024 • Li Jiang, Shaoshuai Shi, Bernt Schiele
In dynamic 3D environments, the ability to recognize a diverse range of objects without the constraints of predefined categories is indispensable for real-world applications.
3D Semantic Segmentation
Open Vocabulary Semantic Segmentation
+2
1 code implementation • CVPR 2024 • Xinting Hu, Li Jiang, Bernt Schiele
We present S4Former, a novel approach to training Vision Transformers for Semi-Supervised Semantic Segmentation (S4).
1 code implementation • 19 Dec 2023 • Li Jiang, Zhaowei Lu
Therefore, this paper introduces entropy images to determine the coordinates and scales of keypoints based on the Scale Invariant Feature Transform detector, which makes the pre-processing more suitable for solving the above problems.
3 code implementations • 15 Dec 2023 • Xiaoyang Wu, Li Jiang, Peng-Shuai Wang, Zhijian Liu, Xihui Liu, Yu Qiao, Wanli Ouyang, Tong He, Hengshuang Zhao
This paper is not motivated to seek innovation within the attention mechanism.
Ranked #1 on
3D Semantic Segmentation
on ScanNet++
(using extra training data)
2 code implementations • 19 Sep 2023 • Shaocong Xu, Pengfei Li, Qianpu Sun, Xinyu Liu, Yang Li, Shihui Guo, Zhen Wang, Bo Jiang, Rui Wang, Kehua Sheng, Bo Zhang, Li Jiang, Hao Zhao, Yilun Chen
We demonstrate that learning different abstaining penalties, apart from point-wise penalty, for different types of (synthesized) outliers can further improve the performance.
1 code implementation • 16 Aug 2023 • Qingyun Wu, Gagan Bansal, Jieyu Zhang, Yiran Wu, Beibin Li, Erkang Zhu, Li Jiang, Xiaoyun Zhang, Shaokun Zhang, Jiale Liu, Ahmed Hassan Awadallah, Ryen W White, Doug Burger, Chi Wang
AutoGen is an open-source framework that allows developers to build LLM applications via multiple agents that can converse with each other to accomplish tasks.
no code implementations • 6 Jul 2023 • Li Jiang, Sijie Cheng, JieLin Qiu, Haoran Xu, Wai Kin Chan, Zhao Ding
The prevalent use of benchmarks in current offline reinforcement learning (RL) research has led to a neglect of the imbalance of real-world dataset distributions in the development of models.
1 code implementation • 30 Jun 2023 • Shaoshuai Shi, Li Jiang, Dengxin Dai, Bernt Schiele
Extensive experimental results demonstrate that the MTR framework achieves state-of-the-art performance on the highly-competitive motion prediction benchmarks, while the MTR++ framework surpasses its precursor, exhibiting enhanced performance and efficiency in predicting accurate multimodal future trajectories for multiple agents.
1 code implementation • NeurIPS 2023 • Peng Cheng, Xianyuan Zhan, Zhihao Wu, Wenjia Zhang, Shoucheng Song, Han Wang, Youfang Lin, Li Jiang
Based on extensive experiments, we find TSRL achieves great performance on small benchmark datasets with as few as 1% of the original samples, which significantly outperforms the recent offline RL algorithms in terms of data efficiency and generalizability. Code is available at: https://github.com/pcheng2/TSRL
1 code implementation • CVPR 2024 • Zhikai Zhang, Jian Ding, Li Jiang, Dengxin Dai, Gui-Song Xia
Based on the point features, we perform a bottom-up multicut algorithm to segment point clouds into coarse instance masks as pseudo labels, which are used to train a point cloud instance segmentation model.
no code implementations • CVPR 2023 • Li Jiang, Zetong Yang, Shaoshuai Shi, Vladislav Golyanik, Dengxin Dai, Bernt Schiele
Masked signal modeling has greatly advanced self-supervised pre-training for language and 2D images.
no code implementations • 18 Apr 2023 • Li Jiang, Ting Zhang, Qiruyi Zuo, Chenyu Tian, George P. Chan, Wai Kin, Chan
Spatiotemporal (ST) data collected by sensors can be represented as multi-variate time series, which is a sequence of data points listed in an order of time.
4 code implementations • 28 Mar 2023 • Haoran Xu, Li Jiang, Jianxiong Li, Zhuoran Yang, Zhaoran Wang, Victor Wai Kin Chan, Xianyuan Zhan
This gives a deeper understanding of why the in-sample learning paradigm works, i.e., it applies implicit value regularization to the policy.
2 code implementations • 21 Mar 2023 • Zhuotao Tian, Jiequan Cui, Li Jiang, Xiaojuan Qi, Xin Lai, Yixin Chen, Shu Liu, Jiaya Jia
Semantic segmentation is still a challenging task for parsing diverse contexts in different scenes, so a fixed classifier might not be able to address varying feature distributions well during testing.
1 code implementation • 15 Oct 2022 • Haoran Xu, Li Jiang, Jianxiong Li, Xianyuan Zhan
We decompose the conventional reward-maximizing policy in offline RL into a guide-policy and an execute-policy.
2 code implementations • 11 Oct 2022 • Xiaoyang Wu, Yixing Lao, Li Jiang, Xihui Liu, Hengshuang Zhao
In this work, we analyze the limitations of the Point Transformer and propose our powerful and efficient Point Transformer V2 model with novel designs that overcome the limitations of previous work.
Ranked #1 on
3D Semantic Segmentation
on nuScenes
3 code implementations • 27 Sep 2022 • Shaoshuai Shi, Li Jiang, Dengxin Dai, Bernt Schiele
Predicting multimodal future behavior of traffic participants is essential for robotic vehicles to make safe decisions.
2 code implementations • 20 Sep 2022 • Shaoshuai Shi, Li Jiang, Dengxin Dai, Bernt Schiele
In this report, we present the 1st place solution for the motion prediction track in the 2022 Waymo Open Dataset Challenges.
2 code implementations • 28 Jul 2022 • Yan Hu, Zhongxi Qiu, Dan Zeng, Li Jiang, Chen Lin, Jiang Liu
Vascular segmentation extracts blood vessels from images and serves as the basis for diagnosing various diseases, like ophthalmic diseases.
no code implementations • 3 Jul 2022 • Wu Zheng, Li Jiang, Fanbin Lu, Yangyang Ye, Chi-Wing Fu
To boost a detector for single-frame 3D object detection, we present a new approach to train it to simulate features and responses following a detector trained on multi-frame point clouds.
no code implementations • CVPR 2022 • Wu Zheng, Mingxuan Hong, Li Jiang, Chi-Wing Fu
This paper presents a new approach to boost a single-modality (LiDAR) 3D object detector by teaching it to simulate features and responses that follow a multi-modality (LiDAR-image) detector.
no code implementations • 6 Jun 2022 • Xiangjin Xie, Yangning Li, Wang Chen, Kai Ouyang, Li Jiang, Haitao Zheng
(2) linear combination significantly limits the sampling space for generating samples.
1 code implementation • 4 Apr 2022 • Runyu Ding, Jihan Yang, Li Jiang, Xiaojuan Qi
Deep learning approaches achieve prominent success in 3D semantic segmentation.
4 code implementations • CVPR 2022 • Xin Lai, Jianhui Liu, Li Jiang, LiWei Wang, Hengshuang Zhao, Shu Liu, Xiaojuan Qi, Jiaya Jia
In this paper, we propose Stratified Transformer that is able to capture long-range contexts and demonstrates strong generalization ability and high performance.
Ranked #19 on
Semantic Segmentation
on ScanNet
no code implementations • 11 Mar 2022 • Zhuoran Song, Yihong Xu, Han Li, Naifeng Jing, Xiaoyao Liang, Li Jiang
The training phase of deep neural networks (DNNs) consumes enormous processing time and energy.
1 code implementation • 9 Mar 2022 • Zhuoran Song, Yihong Xu, Zhezhi He, Li Jiang, Naifeng Jing, Xiaoyao Liang
We explore the sparsity in ViT and observe that informative patches and heads are sufficient for accurate image recognition.
1 code implementation • CVPR 2022 • Zetong Yang, Li Jiang, Yanan Sun, Bernt Schiele, Jiaya Jia
This is achieved by introducing an intermediate representation, i.e., Q-representation, in the querying stage to serve as a bridge between the embedding stage and task heads.
Ranked #7 on
Semantic Segmentation
on S3DIS
no code implementations • 30 Jan 2022 • Weidong Cao, Yilong Zhao, Adith Boloor, Yinhe Han, Xuan Zhang, Li Jiang
This paper presents a new PIM architecture to efficiently accelerate deep learning tasks by minimizing the required A/D conversions with analog accumulation and neural approximated peripheral circuits.
1 code implementation • 15 Dec 2021 • Yu Gong, Zhihan Xu, Zhezhi He, Weifeng Zhang, Xiaobing Tu, Xiaoyao Liang, Li Jiang
From the software perspective, we mathematically and systematically model the latency and resource utilization of the proposed heterogeneous accelerator, regarding varying system design configurations.
2 code implementations • ICCV 2021 • Li Jiang, Shaoshuai Shi, Zhuotao Tian, Xin Lai, Shu Liu, Chi-Wing Fu, Jiaya Jia
To address the high cost and challenges of 3D point-level labeling, we present a method for semi-supervised point cloud semantic segmentation to adopt unlabeled point clouds in training to boost the model performance.
2 code implementations • CVPR 2021 • Xin Lai, Zhuotao Tian, Li Jiang, Shu Liu, Hengshuang Zhao, LiWei Wang, Jiaya Jia
Semantic segmentation has made tremendous progress in recent years.
no code implementations • 10 May 2021 • Min Li, Yu Li, Ye Tian, Li Jiang, Qiang Xu
This paper presents AppealNet, a novel edge/cloud collaborative architecture that runs deep learning (DL) tasks more efficiently than state-of-the-art solutions.
no code implementations • 21 Apr 2021 • Yunyan Hong, Ailing Zeng, Min Li, Cewu Lu, Li Jiang, Qiang Xu
Video action recognition (VAR) is a primary task of video understanding, and untrimmed videos are more common in real-life scenes.
1 code implementation • CVPR 2021 • Wu Zheng, Weiliang Tang, Li Jiang, Chi-Wing Fu
Lastly, to better exploit hard targets, we design an ODIoU loss to supervise the student with constraints on the predicted box centers and orientations.
Ranked #1 on
Birds Eye View Object Detection
on KITTI Cars Easy
1 code implementation • CVPR 2021 • WenBo Hu, Hengshuang Zhao, Li Jiang, Jiaya Jia, Tien-Tsin Wong
Via the BPM, complementary 2D and 3D information can interact with each other at multiple architectural levels, such that advantages in these two visual domains can be combined for better scene recognition.
Ranked #21 on
Semantic Segmentation
on ScanNet
no code implementations • 4 Mar 2021 • Yu-Chen Guo, Li Jiang, Ji-Chong Yang
The search for new physics (NP) beyond the Standard Model is one of the most important tasks of high energy physics.
High Energy Physics - Phenomenology
no code implementations • 2 Mar 2021 • Fangxin Liu, Wenbo Zhao, Yilong Zhao, Zongwu Wang, Tao Yang, Zhezhi He, Naifeng Jing, Xiaoyao Liang, Li Jiang
However, it is challenging for crossbar architecture to exploit the sparsity in the DNN.
1 code implementation • 31 Jan 2021 • Shaoshuai Shi, Li Jiang, Jiajun Deng, Zhe Wang, Chaoxu Guo, Jianping Shi, Xiaogang Wang, Hongsheng Li
3D object detection is receiving increasing attention from both industry and academia thanks to its wide applications in various fields.
Ranked #2 on
3D Object Detection
on KITTI Cars Easy val
1 code implementation • ICCV 2021 • Fangxin Liu, Wenbo Zhao, Zhezhi He, Yanzhi Wang, Zongwu Wang, Changzhi Dai, Xiaoyao Liang, Li Jiang
Model quantization has emerged as a mandatory technique for efficient inference with advanced Deep Neural Networks (DNN).
24 code implementations • ICCV 2021 • Hengshuang Zhao, Li Jiang, Jiaya Jia, Philip Torr, Vladlen Koltun
For example, on the challenging S3DIS dataset for large-scale semantic scene segmentation, the Point Transformer attains an mIoU of 70.4% on Area 5, outperforming the strongest prior model by 3.3 absolute percentage points and crossing the 70% mIoU threshold for the first time.
Ranked #4 on
3D Semantic Segmentation
on S3DIS
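The mIoU figure quoted in this entry is a mean intersection-over-union across classes. A minimal sketch of that metric on flat label lists follows; it assumes one predicted and one ground-truth class index per point, and the function name `mean_iou` is illustrative. Benchmark evaluation scripts may accumulate counts over a whole dataset or treat absent classes differently.

```python
def mean_iou(pred, target, num_classes):
    """Average per-class IoU, skipping classes absent from both pred and target."""
    ious = []
    for cls in range(num_classes):
        # Points labelled `cls` by both prediction and ground truth.
        inter = sum(1 for p, t in zip(pred, target) if p == cls and t == cls)
        # Points labelled `cls` by either prediction or ground truth.
        union = sum(1 for p, t in zip(pred, target) if p == cls or t == cls)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
```

In this toy case the per-class IoUs are 1/3, 2/3, and 1/2, so `score` is 0.5.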
1 code implementation • 5 Dec 2020 • Wu Zheng, Weiliang Tang, Sijin Chen, Li Jiang, Chi-Wing Fu
Existing single-stage detectors for locating objects in point clouds often treat object localization and category classification as separate tasks, so the localization accuracy and classification confidence may not align well.
Ranked #3 on
Birds Eye View Object Detection
on KITTI Cars Easy
1 code implementation • CVPR 2022 • Zhuotao Tian, Xin Lai, Li Jiang, Shu Liu, Michelle Shu, Hengshuang Zhao, Jiaya Jia
Then, since context is essential for semantic segmentation, we propose the Context-Aware Prototype Learning (CAPL) that significantly improves performance by 1) leveraging the co-occurrence prior knowledge from support samples, and 2) dynamically enriching contextual information to the classifier, conditioned on the content of each query image.
2 code implementations • CVPR 2020 • Li Jiang, Hengshuang Zhao, Shaoshuai Shi, Shu Liu, Chi-Wing Fu, Jiaya Jia
Instance segmentation is an important task for scene understanding.
Ranked #9 on
3D Instance Segmentation
on STPLS3D
12 code implementations • CVPR 2020 • Shaoshuai Shi, Chaoxu Guo, Li Jiang, Zhe Wang, Jianping Shi, Xiaogang Wang, Hongsheng Li
We present a novel and high-performance 3D object detection framework, named PointVoxel-RCNN (PV-RCNN), for accurate 3D object detection from point clouds.
no code implementations • 10 Dec 2019 • Takuya Yoshioka, Igor Abramovski, Cem Aksoylar, Zhuo Chen, Moshe David, Dimitrios Dimitriadis, Yifan Gong, Ilya Gurvich, Xuedong Huang, Yan Huang, Aviv Hurvitz, Li Jiang, Sharon Koubi, Eyal Krupka, Ido Leichter, Changliang Liu, Partha Parthasarathy, Alon Vinnikov, Lingfeng Wu, Xiong Xiao, Wayne Xiong, Huaming Wang, Zhenghao Wang, Jun Zhang, Yong Zhao, Tianyan Zhou
This increases marginally to 1.6% when 50% of the attendees are unknown to the system.
1 code implementation • 10 Dec 2019 • Qi Yan, Li Jiang, Solmaz Kia
Optimal selection of which teammates a robot should take a relative measurement from such that the updated joint localization uncertainty of the team is minimized is an NP-hard problem.
Robotics
no code implementations • ICCV 2019 • Li Jiang, Hengshuang Zhao, Shu Liu, Xiaoyong Shen, Chi-Wing Fu, Jiaya Jia
To incorporate point features in the edge branch, we establish a hierarchical graph framework, where the graph is initialized from a coarse layer and gradually enriched along the point decoding process.
Ranked #46 on
Semantic Segmentation
on S3DIS Area5
no code implementations • 29 Aug 2019 • Geng Yuan, Xiaolong Ma, Caiwen Ding, Sheng Lin, Tianyun Zhang, Zeinab S. Jalali, Yilong Zhao, Li Jiang, Sucheta Soundarajan, Yanzhi Wang
Memristor-based weight pruning and weight quantization have been separately investigated and proven effective in reducing area and power consumption compared to the original DNN model.
no code implementations • 4 Apr 2019 • Hongyu Chen, Li Jiang
Ubiquitous anomalies endanger the security of our system constantly.
1 code implementation • 19 Oct 2018 • Haiyue Song, Chengwen Xu, Qiang Xu, Zhuoran Song, Naifeng Jing, Xiaoyao Liang, Li Jiang
We thus propose a novel approximate computing architecture with a Multiclass-Classifier and Multiple Approximators (MCMA).
no code implementations • ECCV 2018 • Li Jiang, Shaoshuai Shi, Xiaojuan Qi, Jiaya Jia
We propose to add geometric adversarial loss (GAL).
2 code implementations • 27 Jul 2018 • Zhenghao Peng, Xuyang Chen, Chengwen Xu, Naifeng Jing, Xiaoyao Liang, Cewu Lu, Li Jiang
To guarantee the approximation quality, existing works deploy two neural networks (NNs), e.g., an approximator and a predictor.
no code implementations • 23 May 2018 • Zhuoran Song, Ru Wang, Dongyu Ru, Hongru Huang, Zhenghao Peng, Jing Ke, Xiaoyao Liang, Li Jiang
In this paper, we propose the Approximate Random Dropout that replaces the conventional random dropout of neurons and synapses with regular and predefined patterns to eliminate the unnecessary computation and data access.
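The idea of replacing a random mask with a regular one can be sketched as follows. This is an illustration under assumptions, not the paper's actual pattern design: it keeps one unit in every `period` units with a randomly chosen offset, and the names `regular_dropout_mask` and `masked_matvec` are hypothetical. Rescaling of the kept activations is omitted for brevity.

```python
import random


def regular_dropout_mask(num_units, period, offset):
    """Keep one unit in every `period`, starting at `offset`; drop the rest."""
    return [(i - offset) % period == 0 for i in range(num_units)]


def masked_matvec(weights, x, mask):
    """Compute only the output rows the mask keeps; dropped rows stay 0."""
    out = [0.0] * len(weights)
    for i, keep in enumerate(mask):
        if keep:  # dropped rows are skipped entirely, so they cost no multiplies
            out[i] = sum(w * v for w, v in zip(weights[i], x))
    return out


rng = random.Random(0)
period = 2                       # keeps half the units (dropout rate 0.5)
offset = rng.randrange(period)   # the only random choice per iteration
mask = regular_dropout_mask(6, period, offset)
weights = [[1.0, 2.0] for _ in range(6)]
x = [1.0, 1.0]
out = masked_matvec(weights, x, mask)
```

Because the kept indices follow a fixed stride, the skipped rows are known in advance, which is what allows the computation and data access for dropped units to be avoided rather than computed and then zeroed.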