no code implementations • ECCV 2020 • Yu Zheng, Danyang Zhang, Sinan Xie, Jiwen Lu, Jie zhou
In this paper, we propose a Rotation-robust Intersection over Union ($\textit{RIoU}$) for 3D object detection, which aims to jointly learn the overlap of rotated bounding boxes.
no code implementations • ECCV 2020 • Guangyi Chen, Yongming Rao, Jiwen Lu, Jie zhou
Specifically, we disentangle the video representation into the temporal coherence and motion parts and randomly change the scale of the temporal motion features as the adversarial noise.
no code implementations • ECCV 2020 • Wenzhao Zheng, Jiwen Lu, Jie zhou
We employ a metric model and a layout encoder to map the RGB images and the ground-truth layouts to the embedding space, respectively, and a layout decoder to map the embeddings to the corresponding layouts, where the whole framework is trained in an end-to-end manner.
no code implementations • ECCV 2020 • Liangliang Ren, Yangyang Song, Jiwen Lu, Jie zhou
Unlike most existing works that define room layout on a 2D image, we model the layout in 3D as a configuration of the camera and the room.
no code implementations • ECCV 2020 • Guangyi Chen, Yuhao Lu, Jiwen Lu, Jie Zhou
Experimental results demonstrate that our DCML method explores credible and valuable training data and improves the performance of unsupervised domain adaptation.
no code implementations • ECCV 2020 • Ziwei Wang, Quan Zheng, Jiwen Lu, Jie zhou
In this paper, we propose a Deep Hashing method with Active Pairwise Supervision (DH-APS).
1 code implementation • 16 Mar 2023 • Yi Wei, Linqing Zhao, Wenzhao Zheng, Zheng Zhu, Jie zhou, Jiwen Lu
Towards a more comprehensive perception of a 3D scene, in this paper, we propose a SurroundOcc method to predict the 3D occupancy with multi-camera images.
1 code implementation • 7 Mar 2023 • XiaoFeng Wang, Zheng Zhu, Wenbo Xu, Yunpeng Zhang, Yi Wei, Xu Chi, Yun Ye, Dalong Du, Jiwen Lu, Xingang Wang
Towards a comprehensive benchmarking of surrounding perception algorithms, we propose OpenOccupancy, which is the first surrounding semantic occupancy perception benchmark.
1 code implementation • arXiv 2023 • Wenliang Zhao, Yongming Rao, Zuyan Liu, Benlin Liu, Jie zhou, Jiwen Lu
In this paper, we propose VPD (Visual Perception with a pre-trained Diffusion model), a new framework that exploits the semantic information of a pre-trained text-to-image diffusion model in visual perception tasks.
Ranked #1 on Monocular Depth Estimation on NYU-Depth V2
no code implementations • 23 Feb 2023 • Zhenyu Wu, Ziwei Wang, Jiwen Lu, Haibin Yan
Then we fuse the feature maps representing the visual information of multi-view RGB images and the pixel affinity learned from the clutter point cloud, where the acquired instance segmentation masks of multi-view RGB images are projected to partition the clutter point cloud.
1 code implementation • 15 Feb 2023 • Yuanhui Huang, Wenzhao Zheng, Yunpeng Zhang, Jie zhou, Jiwen Lu
To lift image features to the 3D TPV space, we further propose a transformer-based TPV encoder (TPVFormer) to obtain the TPV features effectively.
1 code implementation • 9 Feb 2023 • Wenliang Zhao, Lujia Bai, Yongming Rao, Jie zhou, Jiwen Lu
Combining UniP and UniC, we propose a unified predictor-corrector framework called UniPC for the fast sampling of DPMs, which has a unified analytical form for any order and can significantly improve the sampling quality over previous methods.
1 code implementation • 11 Jan 2023 • Xumin Yu, Yongming Rao, Ziyi Wang, Jiwen Lu, Jie zhou
In this paper, we present a new method that reformulates point cloud completion as a set-to-set translation problem and design a new model, called PoinTr, which adopts a Transformer encoder-decoder architecture for point cloud completion.
no code implementations • 10 Jan 2023 • Shuai Shen, Wenliang Zhao, Zibin Meng, Wanhua Li, Zheng Zhu, Jie zhou, Jiwen Lu
In this way, the proposed DiffTalk is capable of producing high-quality talking head videos in synchronization with the source audio, and more importantly, it can be naturally generalized across different identities without any further fine-tuning.
1 code implementation • 18 Dec 2022 • Borui Zhang, Wenzhao Zheng, Jie zhou, Jiwen Lu
Deep learning has revolutionized human society, yet the black-box nature of deep neural networks hinders further application to reliability-demanding industries.
no code implementations • 9 Dec 2022 • Yansong Tang, Jinpeng Liu, Aoyang Liu, Bin Yang, Wenxun Dai, Yongming Rao, Jiwen Lu, Jie zhou, Xiu Li
With its continuously growing popularity around the world, fitness activity analytics has become an emerging research topic in computer vision.
1 code implementation • 6 Dec 2022 • Muheng Li, Yueqi Duan, Jie zhou, Jiwen Lu
Previous approaches lack flexibility in both 3D data representation and shape generation, thereby failing to generate highly diversified 3D shapes conforming to the given text descriptions.
no code implementations • 17 Nov 2022 • Sichao Huang, Ziwei Wang, Jie zhou, Jiwen Lu
We compare our approach with existing robotic packing methods for irregular objects in a physics simulator.
1 code implementation • 17 Nov 2022 • Haojun Jiang, Jianke Zhang, Rui Huang, Chunjiang Ge, Zanlin Ni, Jiwen Lu, Jie zhou, Shiji Song, Gao Huang
However, as pre-trained models are scaling up, fully fine-tuning them on text-video retrieval datasets has a high risk of overfitting.
1 code implementation • 15 Nov 2022 • Chengkun Wang, Wenzhao Zheng, Xian Sun, Jiwen Lu, Jie zhou
We propose to learn a global probabilistic distribution for each pixel in the patch and a probabilistic metric to model the distance between distributions.
1 code implementation • 15 Oct 2022 • An Tao, Yueqi Duan, Yingqi Wang, Jiwen Lu, Jie zhou
In this paper, we investigate the dynamics-aware adversarial attack problem of adaptive neural networks.
1 code implementation • 12 Oct 2022 • Han Xiao, Wenzhao Zheng, Zheng Zhu, Jie zhou, Jiwen Lu
Data mixing strategies (e.g., CutMix) have shown the ability to greatly improve the performance of convolutional neural networks (CNNs).
1 code implementation • 11 Oct 2022 • Chengkun Wang, Wenzhao Zheng, Zheng Zhu, Jie zhou, Jiwen Lu
The pretrain-finetune paradigm in modern computer vision facilitates the success of self-supervised learning, which tends to achieve better transferability than supervised learning.
1 code implementation • 22 Aug 2022 • Yunpeng Zhang, Wenzhao Zheng, Zheng Zhu, Guan Huang, Jie zhou, Jiwen Lu
First, we extract multi-scale features and generate the perspective object proposals on each monocular image.
no code implementations • 7 Aug 2022 • Quan Zheng, Ziwei Wang, Jie zhou, Jiwen Lu
Explaining deep convolutional neural networks has been recently drawing increasing attention since it helps to understand the networks' internal operations and why they make certain decisions.
1 code implementation • 4 Aug 2022 • Ziyi Wang, Xumin Yu, Yongming Rao, Jie zhou, Jiwen Lu
Nowadays, pre-training big models on large-scale datasets has become a crucial topic in deep learning.
Ranked #7 on 3D Point Cloud Classification on ScanObjectNN (using extra training data)
4 code implementations • 28 Jul 2022 • Yongming Rao, Wenliang Zhao, Yansong Tang, Jie zhou, Ser-Nam Lim, Jiwen Lu
In this paper, we show that the key ingredients behind the vision Transformers, namely input-adaptive, long-range and high-order spatial interactions, can also be efficiently implemented with a convolution-based framework.
Ranked #17 on Semantic Segmentation on ADE20K
1 code implementation • 26 Jul 2022 • Cheng Ma, Jingyi Zhang, Jie zhou, Jiwen Lu
On the other hand, we propose a parallel network which includes two branches of cascaded lookup tables which process different components of the input low-resolution images.
1 code implementation • 24 Jul 2022 • Shuai Shen, Wanhua Li, Zheng Zhu, Yueqi Duan, Jie zhou, Jiwen Lu
Thus the facial radiance field can be flexibly adjusted to the new identity with few reference images.
1 code implementation • 18 Jul 2022 • Wanhua Li, Zhexuan Cao, Jianjiang Feng, Jie zhou, Jiwen Lu
As each sample is annotated with multiple attribute labels, these "words" will naturally form an unordered but meaningful "sentence", which depicts the semantic information of the corresponding sample.
1 code implementation • 17 Jul 2022 • Yansong Tang, Xingyu Liu, Xumin Yu, Danyang Zhang, Jiwen Lu, Jie zhou
Different from the conventional adversarial learning-based approaches for UDA, we utilize a self-supervision scheme to reduce the domain shift between two skeleton-based action datasets.
no code implementations • 12 Jul 2022 • Wanhua Li, Jiwen Lu, Abudukelimu Wuerkaixi, Jianjiang Feng, Jie zhou
Unlike most existing personalized methods that learn the parameters of a personalized estimator for each person in the training set, our method learns the mapping from identity information to age estimator parameters.
1 code implementation • 4 Jul 2022 • Yongming Rao, Zuyan Liu, Wenliang Zhao, Jie zhou, Jiwen Lu
We extend our method to hierarchical models including CNNs and hierarchical vision Transformers as well as more complex dense prediction tasks that require structured feature maps by formulating a more generic dynamic spatial sparsification framework with progressive sparsification and asymmetric computation for different spatial locations.
1 code implementation • CVPR 2022 • Han Xiao, Ziwei Wang, Zheng Zhu, Jie zhou, Jiwen Lu
Differentiable architecture search (DARTS) acquires the optimal architectures by optimizing the architecture parameters with gradient descent, which significantly reduces the search cost.
1 code implementation • 6 Jun 2022 • Wanhua Li, Xiaoke Huang, Zheng Zhu, Yansong Tang, Xiu Li, Jie zhou, Jiwen Lu
In this paper, we propose to learn the rank concepts from the rich semantic CLIP latent space.
1 code implementation • CVPR 2022 • Ziyi Wang, Yongming Rao, Xumin Yu, Jie zhou, Jiwen Lu
Conventional point cloud semantic segmentation methods usually employ an encoder-decoder architecture, where mid-level features are locally aggregated to extract geometric information.
1 code implementation • 19 May 2022 • Yunpeng Zhang, Zheng Zhu, Wenzhao Zheng, JunJie Huang, Guan Huang, Jie zhou, Jiwen Lu
Specifically, BEVerse first performs shared feature extraction and lifting to generate 4D BEV representations from multi-timestamp and multi-view images.
1 code implementation • 9 May 2022 • Chengkun Wang, Wenzhao Zheng, Zheng Zhu, Jie zhou, Jiwen Lu
This paper proposes an introspective deep metric learning (IDML) framework for uncertainty-aware comparisons of images.
no code implementations • ICCV 2021 • Zheng Zhu, Xianda Guo, Tian Yang, JunJie Huang, Jiankang Deng, Guan Huang, Dalong Du, Jiwen Lu, Jie zhou
In this paper, we contribute a new benchmark for Gait REcognition in the Wild (GREW).
no code implementations • 21 Apr 2022 • Zheng Zhu, Guan Huang, Jiankang Deng, Yun Ye, JunJie Huang, Xinze Chen, Jiagang Zhu, Tian Yang, Dalong Du, Jiwen Lu, Jie zhou
For a comprehensive evaluation of face matchers, three recognition tasks are performed under standard, masked and unbiased settings, respectively.
no code implementations • CVPR 2022 • Yu Zheng, Yueqi Duan, Jiwen Lu, Jie zhou, Qi Tian
A bathtub in a library, a sink in an office, a bed in a laundry room: such counter-intuitive examples suggest that the scene provides important prior knowledge for 3D object detection, which helps eliminate ambiguous detections of similar objects.
1 code implementation • CVPR 2022 • Jinglin Xu, Yongming Rao, Xumin Yu, Guangyi Chen, Jie zhou, Jiwen Lu
Most existing action quality assessment methods rely on the deep features of an entire video to predict the score, which is less reliable due to the non-transparent inference process and poor interpretability.
1 code implementation • 7 Apr 2022 • Yi Wei, Linqing Zhao, Wenzhao Zheng, Zheng Zhu, Yongming Rao, Guan Huang, Jiwen Lu, Jie zhou
In this paper, we propose a SurroundDepth method to incorporate the information from multiple surrounding views to predict depth maps across cameras.
no code implementations • IEEE Transactions on Image Processing 2022 • Wencheng Zhu, Yucheng Han, Jiwen Lu, Jie zhou
Then, we construct a temporal graph by using the aggregated representations of spatial graphs.
Ranked #1 on Video Summarization on TvSum
1 code implementation • 28 Mar 2022 • Yi Wei, Zibu Wei, Yongming Rao, Jiaxin Li, Jie zhou, Jiwen Lu
In this paper, we propose the LiDAR Distillation to bridge the domain gap induced by different LiDAR beams for 3D object detection.
1 code implementation • CVPR 2022 • Borui Zhang, Wenzhao Zheng, Jie zhou, Jiwen Lu
This paper proposes an attributable visual similarity learning (AVSL) framework for a more accurate and explainable similarity measure between images.
Ranked #2 on Metric Learning on CARS196 (using extra training data)
1 code implementation • CVPR 2022 • Muheng Li, Lei Chen, Yueqi Duan, Zhilan Hu, Jianjiang Feng, Jie zhou, Jiwen Lu
The generated text prompts are paired with corresponding video clips, and together co-train the text encoder and the video encoder via a contrastive approach.
Ranked #2 on Action Segmentation on GTEA
no code implementations • 26 Mar 2022 • Sha Yuan, Hanyu Zhao, Shuai Zhao, Jiahong Leng, Yangxiao Liang, Xiaozhi Wang, Jifan Yu, Xin Lv, Zhou Shao, Jiaao He, Yankai Lin, Xu Han, Zhenghao Liu, Ning Ding, Yongming Rao, Yizhao Gao, Liang Zhang, Ming Ding, Cong Fang, Yisen Wang, Mingsheng Long, Jing Zhang, Yinpeng Dong, Tianyu Pang, Peng Cui, Lingxiao Huang, Zheng Liang, HuaWei Shen, HUI ZHANG, Quanshi Zhang, Qingxiu Dong, Zhixing Tan, Mingxuan Wang, Shuo Wang, Long Zhou, Haoran Li, Junwei Bao, Yingwei Pan, Weinan Zhang, Zhou Yu, Rui Yan, Chence Shi, Minghao Xu, Zuobai Zhang, Guoqiang Wang, Xiang Pan, Mengjie Li, Xiaoyu Chu, Zijun Yao, Fangwei Zhu, Shulin Cao, Weicheng Xue, Zixuan Ma, Zhengyan Zhang, Shengding Hu, Yujia Qin, Chaojun Xiao, Zheni Zeng, Ganqu Cui, Weize Chen, Weilin Zhao, Yuan YAO, Peng Li, Wenzhao Zheng, Wenliang Zhao, Ziyi Wang, Borui Zhang, Nanyi Fei, Anwen Hu, Zenan Ling, Haoyang Li, Boxi Cao, Xianpei Han, Weidong Zhan, Baobao Chang, Hao Sun, Jiawen Deng, Chujie Zheng, Juanzi Li, Lei Hou, Xigang Cao, Jidong Zhai, Zhiyuan Liu, Maosong Sun, Jiwen Lu, Zhiwu Lu, Qin Jin, Ruihua Song, Ji-Rong Wen, Zhouchen Lin, LiWei Wang, Hang Su, Jun Zhu, Zhifang Sui, Jiajun Zhang, Yang Liu, Xiaodong He, Minlie Huang, Jian Tang, Jie Tang
With the rapid development of deep learning, training Big Models (BMs) for multiple downstream tasks has become a popular paradigm.
1 code implementation • CVPR 2022 • Tianpei Gu, Guangyi Chen, Junlong Li, Chunze Lin, Yongming Rao, Jie zhou, Jiwen Lu
Human behavior has the nature of indeterminacy, which requires the pedestrian trajectory prediction system to model the multi-modality of future motion states.
2 code implementations • CVPR 2022 • Xiuwei Xu, Yifan Wang, Yu Zheng, Yongming Rao, Jie zhou, Jiwen Lu
In this paper, we propose a weakly-supervised approach for 3D object detection, which makes it possible to train a strong 3D detector with position-level annotations (i.e., annotations of object centers).
1 code implementation • 22 Jan 2022 • Mantang Guo, Junhui Hou, Jing Jin, Hui Liu, Huanqiang Zeng, Jiwen Lu
To this end, we propose content-aware warping, which adaptively learns the interpolation weights for pixels of a relatively large neighborhood from their contextual information via a lightweight neural network.
no code implementations • 20 Jan 2022 • Kun Song, Junwei Han, Gong Cheng, Jiwen Lu, Feiping Nie
In this paper, we reveal that metric learning suffers from a serious inseparability problem without informative sample mining.
no code implementations • CVPR 2022 • Yunpeng Zhang, Wenzhao Zheng, Zheng Zhu, Guan Huang, Dalong Du, Jie zhou, Jiwen Lu
In this paper, we propose a general method to learn appropriate embeddings for dimension estimation in monocular 3D object detection.
2 code implementations • 22 Dec 2021 • Liang Pan, Tong Wu, Zhongang Cai, Ziwei Liu, Xumin Yu, Yongming Rao, Jiwen Lu, Jie zhou, Mingye Xu, Xiaoyuan Luo, Kexue Fu, Peng Gao, Manning Wang, Yali Wang, Yu Qiao, Junsheng Zhou, Xin Wen, Peng Xiang, Yu-Shen Liu, Zhizhong Han, Yuanjie Yan, Junyi An, Lifa Zhu, Changwei Lin, Dongrui Liu, Xin Li, Francisco Gómez-Fernández, Qinlong Wang, Yang Yang
Based on the MVP dataset, this paper reports methods and results in the Multi-View Partial Point Cloud Challenge 2021 on Completion and Registration.
1 code implementation • 17 Dec 2021 • An Tao, Yueqi Duan, He Wang, Ziyi Wu, Pengliang Ji, Haowen Sun, Jie zhou, Jiwen Lu
It results in a serious issue of lagged gradient, making the learned attack at the current step ineffective due to the architecture changes afterward.
1 code implementation • CVPR 2022 • Yongming Rao, Wenliang Zhao, Guangyi Chen, Yansong Tang, Zheng Zhu, Guan Huang, Jie zhou, Jiwen Lu
In this work, we present a new framework for dense prediction by implicitly and explicitly leveraging the pre-trained knowledge from CLIP.
1 code implementation • CVPR 2022 • Xumin Yu, Lulu Tang, Yongming Rao, Tiejun Huang, Jie zhou, Jiwen Lu
Inspired by BERT, we devise a Masked Point Modeling (MPM) task to pre-train point cloud Transformers.
Ranked #7 on Few-Shot 3D Point Cloud Classification on ModelNet40 5-way (10-shot) (using extra training data)
3D Point Cloud Linear Classification
Few-Shot 3D Point Cloud Classification
1 code implementation • 17 Oct 2021 • Yinghuan Shi, Jian Zhang, Tong Ling, Jiwen Lu, Yefeng Zheng, Qian Yu, Lei Qi, Yang Gao
In semi-supervised medical image segmentation, most previous works draw on the common assumption that higher entropy means higher uncertainty.
1 code implementation • 26 Sep 2021 • Cheng Ma, Yongming Rao, Jiwen Lu, Jie zhou
Firstly, we propose SPSR with gradient guidance (SPSR-G) by exploiting gradient maps of images to guide the recovery in two aspects.
no code implementations • 6 Sep 2021 • Wanhua Li, Jiwen Lu, Abudukelimu Wuerkaixi, Jianjiang Feng, Jie zhou
To address this, we propose a Star-shaped Reasoning Graph Network (S-RGN).
1 code implementation • ICCV 2021 • Yi Wei, Shaohui Liu, Yongming Rao, Wang Zhao, Jiwen Lu, Jie zhou
In this work, we present a new multi-view depth estimation method that utilizes both conventional reconstruction and learning-based priors over the recently proposed neural radiance fields (NeRF).
1 code implementation • 1 Sep 2021 • Haotong Qin, Yifu Ding, Xiangguo Zhang, Jiakai Wang, Xianglong Liu, Jiwen Lu
We first give a theoretical analysis showing that the diversity of synthetic samples is crucial for data-free quantization, while in existing approaches, the synthetic data, completely constrained by BN statistics, experimentally exhibit severe homogenization at the distribution and sample levels.
1 code implementation • ICCV 2021 • Wenzhao Zheng, Borui Zhang, Jiwen Lu, Jie zhou
This paper presents a deep relational metric learning (DRML) framework for image clustering and retrieval.
1 code implementation • ICCV 2021 • Xumin Yu, Yongming Rao, Ziyi Wang, Zuyan Liu, Jiwen Lu, Jie zhou
In this paper, we present a new method that reformulates point cloud completion as a set-to-set translation problem and design a new model, called PoinTr that adopts a transformer encoder-decoder architecture for point cloud completion.
Ranked #1 on Point Cloud Completion on ShapeNet (Chamfer Distance L2 metric)
1 code implementation • ICCV 2021 • Yongming Rao, Guangyi Chen, Jiwen Lu, Jie zhou
Unlike most existing methods that learn visual attention based on conventional likelihood, we propose to learn the attention with counterfactual causality, which provides a tool to measure the attention quality and a powerful supervisory signal to guide the learning process.
Ranked #6 on Fine-Grained Image Classification on FGVC Aircraft
1 code implementation • ICCV 2021 • Xumin Yu, Yongming Rao, Wenliang Zhao, Jiwen Lu, Jie zhou
Assessing action quality is challenging due to the subtle differences between videos and large variations in scores.
Ranked #2 on Action Quality Assessment on MTL-AQA
2 code implementations • ICCV 2021 • Yongming Rao, Benlin Liu, Yi Wei, Jiwen Lu, Cho-Jui Hsieh, Jie zhou
In particular, we propose to generate random layouts of a scene by making use of the objects in the synthetic CAD dataset and learn the 3D scene representation by applying object-level contrastive learning on two random scenes generated from the same set of synthetic objects.
no code implementations • 16 Aug 2021 • Zheng Zhu, Guan Huang, Jiankang Deng, Yun Ye, JunJie Huang, Xinze Chen, Jiagang Zhu, Tian Yang, Jia Guo, Jiwen Lu, Dalong Du, Jie zhou
There is a second phase of the challenge until October 1, 2021, and an ongoing leaderboard.
1 code implementation • ICCV 2021 • Wenliang Zhao, Yongming Rao, Ziyi Wang, Jiwen Lu, Jie zhou
Our method is model-agnostic, which can be applied to off-the-shelf backbone networks and metric learning methods.
Ranked #13 on Metric Learning on CUB-200-2011
1 code implementation • 11 Aug 2021 • Guangyi Chen, Tianpei Gu, Jiwen Lu, Jin-An Bao, Jie zhou
Experimental results demonstrate the superiority of our method, which outperforms the state-of-the-art methods by a large margin with limited computational cost.
Ranked #14 on Person Re-Identification on MSMT17
1 code implementation • ICCV 2021 • Ziwei Wang, Han Xiao, Jiwen Lu, Jie zhou
On the contrary, our GMPQ searches the mixed-quantization policy that can be generalized to large-scale datasets with only a small amount of data, so that the search cost is significantly reduced without performance degradation.
1 code implementation • ICCV 2021 • Ziwei Wang, Yunsong Wang, Ziyi Wu, Jiwen Lu, Jie zhou
In this paper, we propose an instance similarity learning (ISL) method for unsupervised feature representation.
1 code implementation • ICCV 2021 • Guangyi Chen, Junlong Li, Nuoxing Zhou, Liangliang Ren, Jiwen Lu
In this paper, we present a distribution discrimination (DisDis) method to predict personalized motion patterns by distinguishing the potential distributions.
1 code implementation • ICCV 2021 • Guangyi Chen, Junlong Li, Jiwen Lu, Jie zhou
Most existing methods learn to predict future trajectories by behavior clues from history trajectories and interaction clues from environments.
1 code implementation • 4 Jul 2021 • Linqing Zhao, Jiwen Lu, Jie zhou
To address this, we employ a late fusion strategy where we first learn the geometric and contextual similarities between the input and back-projected (from 2D pixels) point clouds and utilize them to guide the fusion of two modalities to further exploit complementary information.
Ranked #11 on Semantic Segmentation on ScanNet
3 code implementations • NeurIPS 2021 • Yongming Rao, Wenliang Zhao, Zheng Zhu, Jiwen Lu, Jie zhou
Recent advances in self-attention and pure multi-layer perceptron (MLP) models for vision have shown great potential in achieving promising performance with fewer inductive biases.
Ranked #9 on Image Classification on Stanford Cars (using extra training data)
1 code implementation • CVPR 2021 • Wenzhao Zheng, Chengkun Wang, Jiwen Lu, Jie zhou
In this paper, we propose a deep compositional metric learning (DCML) framework for effective and generalizable similarity measurement between images.
1 code implementation • CVPR 2021 • Shuai Shen, Wanhua Li, Zheng Zhu, Guan Huang, Dalong Du, Jiwen Lu, Jie zhou
To address the dilemma of large-scale training and efficient inference, we propose the STructure-AwaRe Face Clustering (STAR-FC) method.
no code implementations • CVPR 2021 • Guoli Wang, Jiaqi Ma, Qian Zhang, Jiwen Lu, Jie zhou
Many of them address it by generating fake frontal faces from extreme ones, but such methods struggle to maintain the identity information and suffer from high computational cost and uncontrolled disturbances.
1 code implementation • CVPR 2021 • Shuyan Li, Xiu Li, Jiwen Lu, Jie zhou
Most existing unsupervised video hashing methods are built on unidirectional models with less reliable training objectives, which underuse the correlations among frames and the similarity structure between videos.
1 code implementation • NeurIPS 2021 • Yongming Rao, Wenliang Zhao, Benlin Liu, Jiwen Lu, Jie zhou, Cho-Jui Hsieh
Based on this observation, we propose a dynamic token sparsification framework to prune redundant tokens progressively and dynamically based on the input.
Ranked #297 on Image Classification on ImageNet
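The progressive, input-dependent token pruning described in this entry can be illustrated with a minimal sketch. It assumes each token already has an importance score; in the paper the scores come from a learned module, which is not reproduced here. The names `prune_tokens`, `progressive_prune`, and `keep_ratios` are hypothetical and chosen only for illustration.

```python
def prune_tokens(tokens, scores, keep_ratio):
    """Keep the highest-scoring fraction of tokens, preserving their original order."""
    n_keep = max(1, int(len(tokens) * keep_ratio))
    # Indices of the n_keep highest scores.
    ranked = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)[:n_keep]
    kept = sorted(ranked)  # restore sequence order
    return [tokens[i] for i in kept], [scores[i] for i in kept]


def progressive_prune(tokens, scores, keep_ratios):
    """Apply pruning stage by stage (assumption: scores are reused across stages
    for simplicity, whereas a real model would re-predict them at each stage)."""
    for ratio in keep_ratios:
        tokens, scores = prune_tokens(tokens, scores, ratio)
    return tokens


tokens = ["t0", "t1", "t2", "t3", "t4", "t5", "t6", "t7"]
scores = [0.9, 0.1, 0.8, 0.3, 0.7, 0.2, 0.6, 0.4]  # made-up importance scores
kept = progressive_prune(tokens, scores, keep_ratios=[0.5, 0.5])
print(kept)
```

Because the scores depend on the input, different images would retain different tokens, which is the sense in which the pruning is dynamic.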
1 code implementation • 17 May 2021 • Yi Wei, Shang Su, Jiwen Lu, Jie zhou
To tackle this problem, we propose frustum-aware geometric reasoning (FGR) to detect vehicles in point clouds without any 3D annotations.
3 code implementations • CVPR 2021 • Yunpeng Zhang, Jiwen Lu, Jie zhou
The precise localization of 3D objects from a single image without depth information is a highly challenging problem.
Ranked #4 on Monocular 3D Object Detection on KITTI Cars Moderate
no code implementations • 6 Apr 2021 • Jiabin Zhang, Zheng Zhu, Jiwen Lu, JunJie Huang, Guan Huang, Jie zhou
To make a better trade-off between accuracy and efficiency, we propose a novel multi-person pose estimation framework, SIngle-network with Mimicking and Point Learning for Bottom-up Human Pose Estimation (SIMPLE).
no code implementations • CVPR 2021 • Wanhua Li, Shiwei Wang, Jiwen Lu, Jianjiang Feng, Jie zhou
In the end, the samples in the unbalanced train batch are re-weighted by the learned meta-miner to optimize the kinship models.
1 code implementation • CVPR 2021 • Wanhua Li, Xiaoke Huang, Jiwen Lu, Jianjiang Feng, Jie zhou
An ordinal distribution constraint is proposed to exploit the ordinal nature of regression.
1 code implementation • 24 Mar 2021 • Shuai Shen, Wanhua Li, Zheng Zhu, Guan Huang, Dalong Du, Jiwen Lu, Jie zhou
To address the dilemma of large-scale training and efficient inference, we propose the STructure-AwaRe Face Clustering (STAR-FC) method.
no code implementations • CVPR 2021 • Zheng Zhu, Guan Huang, Jiankang Deng, Yun Ye, JunJie Huang, Xinze Chen, Jiagang Zhu, Tian Yang, Jiwen Lu, Dalong Du, Jie zhou
In this paper, we contribute a new million-scale face benchmark containing noisy 4M identities/260M faces (WebFace260M) and cleaned 2M identities/42M faces (WebFace42M) training data, as well as an elaborately designed time-constrained evaluation protocol.
Ranked #1 on Face Verification on IJB-C (training dataset metric)
1 code implementation • 18 Feb 2021 • Wencheng Zhu, Jiahao Li, Jiwen Lu, Jie zhou
Specifically, we first compute a pixel-wise similarity matrix by using representations of reference and target pixels and then select top-rank reference pixels for target pixel classification.
One-shot visual object segmentation
Video Semantic Segmentation
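The two steps named in this entry, computing a pixel-wise similarity matrix and selecting top-ranked reference pixels to classify each target pixel, can be sketched as follows. This is an illustrative sketch only: cosine similarity and majority voting are assumptions, and the function names and toy features are hypothetical rather than taken from the paper.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def classify_target_pixels(target_feats, ref_feats, ref_labels, k=3):
    """Label each target pixel by a majority vote over its k most similar reference pixels."""
    predictions = []
    for t in target_feats:
        # One row of the target-by-reference similarity matrix.
        sims = [cosine(t, r) for r in ref_feats]
        top = sorted(range(len(ref_feats)), key=lambda j: sims[j], reverse=True)[:k]
        votes = Counter(ref_labels[j] for j in top)
        predictions.append(votes.most_common(1)[0][0])
    return predictions


# Toy per-pixel features (hypothetical values).
ref_feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
ref_labels = ["object", "object", "background", "background"]
target_feats = [[0.8, 0.2], [0.2, 0.8]]
preds = classify_target_pixels(target_feats, ref_feats, ref_labels, k=2)
print(preds)
```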
no code implementations • 2 Feb 2021 • Cheng Ma, Jiwen Lu, Jie zhou
As hashing becomes an increasingly appealing technique for large-scale image retrieval, multi-label hashing is also attracting more attention for the ability to exploit multi-level semantic contents.
no code implementations • 19 Jan 2021 • Lei He, Jiwen Lu, Guanghui Wang, Shiyu Song, Jie zhou
In this paper, we first introduce the concept of semantic objectness to exploit the geometric relationship of these two tasks through an analysis of the imaging process, then propose a Semantic Object Segmentation and Depth Estimation Network (SOSD-Net) based on the objectness assumption.
Ranked #44 on Semantic Segmentation on NYU Depth v2
no code implementations • ICCV 2021 • Bingyao Yu, Wanhua Li, Xiu Li, Jiwen Lu, Jie zhou
In this paper, we propose a Frequency-Aware Spatiotemporal Transformer (FAST) for video inpainting detection, which aims to simultaneously mine the traces of video inpainting from spatial, temporal, and frequency domains.
1 code implementation • 18 Dec 2020 • An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie zhou
Most existing point cloud instance and semantic segmentation methods rely heavily on strong supervision signals, which require point-level labels for every point in the scene.
1 code implementation • CVPR 2021 • Yi Wei, Ziyi Wang, Yongming Rao, Jiwen Lu, Jie zhou
In this paper, we propose a Point-Voxel Recurrent All-Pairs Field Transforms (PV-RAFT) method to estimate scene flow from point clouds.
1 code implementation • 1 Dec 2020 • Wencheng Zhu, Jiwen Lu, Jiahao Li, Jie Zhou
In this paper, we propose a Detect-to-Summarize network (DSNet) framework for supervised video summarization.
Ranked #2 on Video Summarization on TvSum
no code implementations • ECCV 2020 • Lijie Liu, Chufan Wu, Jiwen Lu, Lingxi Xie, Jie zhou, Qi Tian
Monocular 3D object detection aims to extract the 3D position and properties of objects from a 2D input image.
Ranked #16 on Vehicle Pose Estimation on KITTI Cars Hard
no code implementations • ECCV 2020 • Benlin Liu, Yongming Rao, Jiwen Lu, Jie zhou, Cho-Jui Hsieh
Knowledge Distillation (KD) has been one of the most popular methods to learn a compact model.
1 code implementation • ECCV 2020 • Wanhua Li, Yueqi Duan, Jiwen Lu, Jianjiang Feng, Jie zhou
Human beings are fundamentally sociable: we generally organize our social lives in terms of relations with other people.
1 code implementation • CVPR 2020 • Yansong Tang, Zanlin Ni, Jiahuan Zhou, Danyang Zhang, Jiwen Lu, Ying Wu, Jie zhou
Assessing action quality from videos has attracted growing attention in recent years.
Ranked #4 on Action Quality Assessment on AQA-7
no code implementations • 12 May 2020 • Shan Gu, Jianjiang Feng, Jiwen Lu, Jie zhou
Given a pair of fingerprints to match, we bypass the minutiae extraction step and take uniformly sampled points as key points.
no code implementations • 22 Apr 2020 • Wanhua Li, Yingqiang Zhang, Kangchen Lv, Jiwen Lu, Jianjiang Feng, Jie zhou
In this paper, we propose a graph-based kinship reasoning (GKR) network for kinship verification, which aims to effectively perform relational reasoning on the extracted features of an image pair.
1 code implementation • CVPR 2020 • Yongming Rao, Jiwen Lu, Jie zhou
Based on this hypothesis, we propose to learn point cloud representation by bidirectional reasoning between the local structures at different abstraction hierarchies and the global shape without human supervision.
2 code implementations • CVPR 2020 • Cheng Ma, Yongming Rao, Yean Cheng, Ce Chen, Jiwen Lu, Jie zhou
In this paper, we propose a structure-preserving super-resolution method to alleviate the above issue while maintaining the merits of GAN-based methods to generate perceptually pleasant details.
Ranked #38 on Image Super-Resolution on Urban100 - 4x upscaling
1 code implementation • CVPR 2020 • Cheng Ma, Zhenyu Jiang, Yongming Rao, Jiwen Lu, Jie zhou
In this paper, we propose a deep face super-resolution (FSR) method with iterative collaboration between two recurrent networks which focus on facial image recovery and landmark estimation respectively.
no code implementations • 20 Mar 2020 • Yansong Tang, Jiwen Lu, Jie zhou
We believe the introduction of the COIN dataset will promote future in-depth research on instructional video analysis for the community.
2 code implementations • CVPR 2020 • Ziwei Wang, Ziyi Wu, Jiwen Lu, Jie zhou
Conventional network binarization methods directly quantize the weights and activations in one-stage or two-stage detectors with constrained representational capacity, so that the information redundancy in the networks causes numerous false positives and degrades the performance significantly.
no code implementations • 23 Feb 2020 • Hao-Chiang Shao, Kang-Yu Liu, Chia-Wen Lin, Jiwen Lu
With their aid, DotFAN can learn a disentangled face representation and effectively generate face images of various facial attributes while preserving the identity of augmented faces.
no code implementations • 19 Dec 2019 • Peiyu Yu, Yongming Rao, Jiwen Lu, Jie zhou
Humans are able to perform fast and accurate object pose estimation even under severe occlusion by exploiting learned object model priors from everyday life.
1 code implementation • 18 Oct 2019 • Yinghuan Shi, Tiexin Qin, Yong Liu, Jiwen Lu, Yang Gao, Dinggang Shen
By introducing a unified optimization goal, DeepAugNet intends to combine data augmentation and deep model training in an end-to-end manner, which is realized by simultaneously training a hybrid architecture of a dueling deep Q-learning algorithm and a surrogate deep model.
1 code implementation • ICCV 2019 • Yongcheng Liu, Bin Fan, Gaofeng Meng, Jiwen Lu, Shiming Xiang, Chunhong Pan
Point cloud processing is very challenging, as the diverse shapes formed by irregular points are often indistinguishable.
Ranked #16 on 3D Part Segmentation on ShapeNet-Part
no code implementations • ICLR 2019 • Shaohui Liu*, Yi Wei*, Jiwen Lu, Jie Zhou
Unlike most existing evaluation frameworks which transfer the representation of ImageNet inception model to map images onto the feature space, our framework uses a specialized encoder to acquire fine-grained domain-specific representation.
no code implementations • CVPR 2019 • Lijie Liu, Jiwen Lu, Chunjing Xu, Qi Tian, Jie Zhou
In this paper, we propose to learn a deep fitting degree scoring network for monocular 3D object detection, which aims to score the fitting degree between proposals and objects conclusively.
Ranked #7 on Vehicle Pose Estimation on KITTI Cars Hard
no code implementations • CVPR 2019 • Yi Wei, Shaohui Liu, Wang Zhao, Jiwen Lu, Jie Zhou
In this paper, we present a new perspective towards image-based shape generation.
no code implementations • CVPR 2019 • Wanhua Li, Jiwen Lu, Jianjiang Feng, Chunjing Xu, Jie Zhou, Qi Tian
Existing methods for age estimation usually apply a divide-and-conquer strategy to deal with heterogeneous data caused by the non-stationary aging process.
2 code implementations • CVPR 2019 • Wenzhao Zheng, Zhaodong Chen, Jiwen Lu, Jie Zhou
This paper presents a hardness-aware deep metric learning (HDML) framework.
Ranked #27 on Metric Learning on CUB-200-2011 (using extra training data)
no code implementations • CVPR 2019 • Yansong Tang, Dajun Ding, Yongming Rao, Yu Zheng, Danyang Zhang, Lili Zhao, Jiwen Lu, Jie Zhou
There are a substantial number of instructional videos on the Internet, which enable us to acquire knowledge for completing various tasks.
no code implementations • ECCV 2018 • Chunze Lin, Jiwen Lu, Gang Wang, Jie Zhou
In this paper, we propose a graininess-aware deep feature learning method for pedestrian detection.
no code implementations • ECCV 2018 • Xin Yuan, Liangliang Ren, Jiwen Lu, Jie Zhou
In this paper, we propose a simple yet effective relaxation-free method to learn more effective binary codes via policy gradient for scalable image search.
no code implementations • ECCV 2018 • Lei Chen, Jiwen Lu, Zhanjie Song, Jie Zhou
In this paper, we propose a part-activated deep reinforcement learning (PA-DRL) method for action prediction.
no code implementations • ECCV 2018 • Minghao Guo, Jiwen Lu, Jie Zhou
In this paper, we propose a dual-agent deep reinforcement learning (DADRL) method for deformable face tracking, which generates bounding boxes and detects facial landmarks interactively from face videos.
no code implementations • ECCV 2018 • Liangliang Ren, Xin Yuan, Jiwen Lu, Ming Yang, Jie Zhou
Visual tracking is confronted by the dilemma to locate a target both accurately and efficiently, and make decisions online whether and how to adapt the appearance model or even restart tracking.
no code implementations • ECCV 2018 • Liangliang Ren, Jiwen Lu, Zifeng Wang, Qi Tian, Jie Zhou
To address this, we develop a deep prediction-decision network in our C-DRL, which simultaneously detects and predicts objects under a unified network via deep reinforcement learning.
no code implementations • ECCV 2018 • Xudong Lin, Yueqi Duan, Qiyuan Dong, Jiwen Lu, Jie Zhou
Deep metric learning has been extensively explored recently, which trains a deep neural network to produce discriminative embedding features.
no code implementations • CVPR 2018 • Yueqi Duan, Wenzhao Zheng, Xudong Lin, Jiwen Lu, Jie Zhou
Learning an effective distance metric between image pairs plays an important role in visual analysis, where the training procedure largely relies on hard negative samples.
no code implementations • CVPR 2018 • Yueqi Duan, Ziwei Wang, Jiwen Lu, Xudong Lin, Jie Zhou
Specifically, we design a deep reinforcement learning model to learn the structure of the graph for bitwise interaction mining, reducing the uncertainty of binary codes by maximizing the mutual information with inputs and related bits, so that the ambiguous bits receive additional instruction from the graph for confident binarization.
no code implementations • CVPR 2018 • Yansong Tang, Yi Tian, Jiwen Lu, Peiyang Li, Jie Zhou
In this paper, we propose a deep progressive reinforcement learning (DPRL) method for action recognition in skeleton-based videos, which aims to distil the most informative frames and discard ambiguous frames in sequences for recognizing actions.
Ranked #3 on Skeleton Based Action Recognition on UT-Kinect
no code implementations • CVPR 2018 • Yongming Rao, Dahua Lin, Jiwen Lu, Jie Zhou
In this paper, we propose a simple yet effective method to learn a globally optimized detector for object detection, which is a simple modification to the standard cross-entropy gradient inspired by the REINFORCE algorithm.
no code implementations • CVPR 2018 • Zhixiang Chen, Xin Yuan, Jiwen Lu, Qi Tian, Jie Zhou
This paper presents a discrepancy minimizing model to address the discrete optimization problem in hashing learning.
1 code implementation • 20 Mar 2018 • Shaohui Liu, Yi Wei, Jiwen Lu, Jie Zhou
Unlike most existing evaluation frameworks which transfer the representation of ImageNet inception model to map images onto the feature space, our framework uses a specialized encoder to acquire fine-grained domain-specific representation.
no code implementations • NeurIPS 2017 • Ji Lin, Yongming Rao, Jiwen Lu, Jie Zhou
In this paper, we propose a Runtime Neural Pruning (RNP) framework which prunes the deep neural network dynamically at runtime.
no code implementations • ICCV 2017 • Yongming Rao, Jiwen Lu, Jie Zhou
In this paper, we propose an attention-aware deep reinforcement learning (ADRL) method for video face recognition, which aims to discard the misleading and confounding frames and find the focuses of attention in face videos for person recognition.
no code implementations • ICCV 2017 • Venice Erin Liong, Jiwen Lu, Yap-Peng Tan, Jie Zhou
In this paper, we propose a cross-modal deep variational hashing (CMDVH) method to learn compact binary codes for cross-modality multimedia retrieval.
no code implementations • ICCV 2017 • Yongming Rao, Ji Lin, Jiwen Lu, Jie Zhou
In this paper, we propose a discriminative aggregation network (DAN) for video face recognition, which aims to integrate information from video frames effectively and efficiently.
no code implementations • 25 Sep 2017 • Xi Peng, Jiashi Feng, Shijie Xiao, Jiwen Lu, Zhang Yi, Shuicheng Yan
In this paper, we present a deep extension of Sparse Subspace Clustering, termed Deep Sparse Subspace Clustering (DSSC).
no code implementations • ICCV 2017 • Fangyu Liu, Shuaipeng Li, Liqiang Zhang, Chenghu Zhou, Rongtian Ye, Yuebin Wang, Jiwen Lu
Our method provides an automatic process that maps the raw data to the classification results.
no code implementations • CVPR 2017 • Yueqi Duan, Jiwen Lu, Ziwei Wang, Jianjiang Feng, Jie Zhou
In this paper, we propose an unsupervised feature learning method called deep binary descriptor with multi-quantization (DBD-MQ) for visual matching.
no code implementations • CVPR 2017 • Ji Lin, Liangliang Ren, Jiwen Lu, Jianjiang Feng, Jie Zhou
In this paper, we propose a consistent-aware deep learning (CADL) framework for person re-identification in a camera network.
no code implementations • European Conference on Computer Vision 2016 • Rahul Rama Varior, Bing Shuai, Jiwen Lu, Dong Xu, Gang Wang
Matching pedestrians across multiple camera views, known as human re-identification, is a challenging problem in visual surveillance.
no code implementations • CVPR 2016 • Anran Wang, Jianfei Cai, Jiwen Lu, Tat-Jen Cham
While convolutional neural networks (CNN) have been excellent for object recognition, the greater spatial variability in scene images typically meant that the standard full-image CNN features are suboptimal for scene classification.
no code implementations • CVPR 2016 • Kevin Lin, Jiwen Lu, Chu-Song Chen, Jie Zhou
In this paper, we propose a new unsupervised deep learning approach called DeepBit to learn compact binary descriptors for efficient visual object matching.
no code implementations • 6 Apr 2016 • Ziyan Wang, Jiwen Lu, Ruogu Lin, Jianjiang Feng, Jie Zhou
Specifically, we construct a pair of deep convolutional neural networks (CNNs) for the RGB and depth data, and concatenate them at the top layer of the network with a loss function which learns a new feature space where both the correlated part and the individual part of the RGB-D information are well modelled.
no code implementations • 4 Jan 2016 • Abrar H. Abdulnabi, Gang Wang, Jiwen Lu, Kui Jia
Each CNN will generate attribute-specific feature representations, and then we apply multi-task learning on the features to predict their attributes.
no code implementations • ICCV 2015 • Jiwen Lu, Venice Erin Liong, Jie Zhou
In this paper, we propose a simultaneous local binary feature learning and encoding (SLBFLE) method for face recognition.
no code implementations • ICCV 2015 • Xianglong Liu, Lei Huang, Cheng Deng, Jiwen Lu, Bo Lang
have enjoyed the benefits of complementary hash tables and information fusion over multiple views.
no code implementations • ICCV 2015 • Anran Wang, Jianfei Cai, Jiwen Lu, Tat-Jen Cham
We first construct deep CNN layers for color and depth separately, and then connect them with our carefully designed multi-modal layers, which fuse color and depth information by enforcing a common part to be shared by features of different modalities.
no code implementations • ICCV 2015 • Lin Ma, Jiwen Lu, Jianjiang Feng, Jie Zhou
It is desirable to combine multiple feature descriptors to improve the visual tracking performance because different features can provide complementary information to describe objects of interest.
no code implementations • ICCV 2015 • Lin Ma, Xiaoqin Zhang, Weiming Hu, Junliang Xing, Jiwen Lu, Jie Zhou
To address this, this paper presents a local subspace collaborative tracking method for robust visual tracking, where multiple linear and nonlinear subspaces are learned to better model the nonlinear relationship of object appearances.
no code implementations • 16 Nov 2015 • Siyuan Huang, Jiwen Lu, Jie Zhou, Anil K. Jain
In this paper, we propose a nonlinear local metric learning (NLML) method to improve the state-of-the-art performance of person re-identification on public datasets.
no code implementations • CVPR 2015 • Venice Erin Liong, Jiwen Lu, Gang Wang, Pierre Moulin, Jie Zhou
In this paper, we propose a new deep hashing (DH) approach to learn compact binary codes for large-scale visual search.
no code implementations • CVPR 2015 • Junlin Hu, Jiwen Lu, Yap-Peng Tan
Conventional metric learning methods usually assume that the training and test samples are captured in similar scenarios so that their distributions are assumed to be the same.
no code implementations • CVPR 2015 • Jiwen Lu, Gang Wang, Weihong Deng, Pierre Moulin, Jie Zhou
In this paper, we propose a multi-manifold deep metric learning (MMDML) method for image set classification, which aims to recognize an object of interest from a set of image instances captured from varying viewpoints or under varying illuminations.
no code implementations • 17 Nov 2014 • Xi Peng, Jiwen Lu, Zhang Yi, Rui Yan
In this paper, we address two challenging problems in unsupervised subspace learning: 1) how to automatically identify the feature dimension of the learned subspace (i.e., automatic subspace learning), and 2) how to learn the underlying subspace in the presence of Gaussian noise (i.e., robust subspace learning).
no code implementations • 4 Oct 2014 • Rahul Rama Varior, Gang Wang, Jiwen Lu
We model color feature generation as a learning problem by jointly learning a linear transformation and a dictionary to encode pixel values.
no code implementations • CVPR 2014 • Junlin Hu, Jiwen Lu, Yap-Peng Tan
This paper presents a new discriminative deep metric learning (DDML) method for face verification in the wild.
2 code implementations • 14 Apr 2014 • Tsung-Han Chan, Kui Jia, Shenghua Gao, Jiwen Lu, Zinan Zeng, Yi Ma
In this work, we propose a very simple deep learning network for image classification which comprises only the very basic data processing components: cascaded principal component analysis (PCA), binary hashing, and block-wise histograms.
Ranked #48 on Image Classification on MNIST
no code implementations • 6 Nov 2013 • Sheng Huang, Dan Yang, Fei Yang, Yongxin Ge, Xiaohong Zhang, Jiwen Lu
We present an improved Locality Preserving Projections (LPP) method, named Globality-Locality Preserving Projections (GLPP), to preserve both the global and local geometric structures of data.