Search Results for author: Rongrong Ji

Found 242 papers, 154 papers with code

FreeAnchor: Learning to Match Anchors for Visual Object Detection

4 code implementations • NeurIPS 2019 • Xiaosong Zhang, Fang Wan, Chang Liu, Rongrong Ji, Qixiang Ye

In this study, we propose a learning-to-match approach to break the IoU restriction, allowing objects to match anchors in a flexible manner.

Ranked #125 on Object Detection on COCO test-dev

Object object-detection +1

27,765

MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models

3 code implementations • 23 Jun 2023 • Chaoyou Fu, Peixian Chen, Yunhang Shen, Yulei Qin, Mengdan Zhang, Xu Lin, Jinrui Yang, Xiawu Zheng, Ke Li, Xing Sun, Yunsheng Wu, Rongrong Ji

Multimodal Large Language Model (MLLM) relies on the powerful LLM to perform multimodal tasks, showing amazing emergent abilities in recent studies, such as writing poems based on an image.

Benchmarking Language Modelling +3

8,890

Privacy-Preserving Face Recognition with Learnable Privacy Budgets in Frequency Domain

1 code implementation • 15 Jul 2022 • Jiazhen Ji, Huan Wang, Yuge Huang, Jiaxiang Wu, Xingkun Xu, Shouhong Ding, Shengchuan Zhang, Liujuan Cao, Rongrong Ji

This paper proposes a privacy-preserving face recognition method using differential privacy in the frequency domain.

Face Recognition Privacy Preserving

1,223

Hypergraph Neural Networks

2 code implementations • 25 Sep 2018 • Yifan Feng, Haoxuan You, Zizhao Zhang, Rongrong Ji, Yue Gao

In this paper, we present a hypergraph neural networks (HGNN) framework for data representation learning, which can encode high-order data correlation in a hypergraph structure.

Object Recognition Representation Learning

618

DeeperForensics Challenge 2020 on Real-World Face Forgery Detection: Methods and Results

2 code implementations • 18 Feb 2021 • Liming Jiang, Zhengkui Guo, Wayne Wu, Zhaoyang Liu, Ziwei Liu, Chen Change Loy, Shuo Yang, Yuanjun Xiong, Wei Xia, Baoying Chen, Peiyu Zhuang, Sili Li, Shen Chen, Taiping Yao, Shouhong Ding, Jilin Li, Feiyue Huang, Liujuan Cao, Rongrong Ji, Changlei Lu, Ganchao Tan

This paper reports methods and results in the DeeperForensics Challenge 2020 on real-world face forgery detection.

valid

524

Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large Language Models

1 code implementation • NeurIPS 2023 • Gen Luo, Yiyi Zhou, Tianhe Ren, Shengxin Chen, Xiaoshuai Sun, Rongrong Ji

To validate MMA, we apply it to a recent LLM called LLaMA and term the resulting large vision-language instructed model LaVIN.

Chatbot Natural Language Understanding +1

473

Aligning and Prompting Everything All at Once for Universal Visual Perception

2 code implementations • 4 Dec 2023 • Yunhang Shen, Chaoyou Fu, Peixian Chen, Mengdan Zhang, Ke Li, Xing Sun, Yunsheng Wu, Shaohui Lin, Rongrong Ji

However, predominant paradigms, driven by casting instance-level tasks as an object-word alignment, bring heavy cross-modality interaction, which is not effective in prompting object detection and visual grounding.

Object object-detection +6

415

Image-to-image Translation via Hierarchical Style Disentanglement

1 code implementation • CVPR 2021 • Xinyang Li, Shengchuan Zhang, Jie Hu, Liujuan Cao, Xiaopeng Hong, Xudong Mao, Feiyue Huang, Yongjian Wu, Rongrong Ji

Recently, image-to-image translation has made significant progress in achieving both multi-label (i.e., translation conditioned on different labels) and multi-style (i.e., generation with diverse styles) tasks.

Disentanglement Multimodal Unsupervised Image-To-Image Translation +1

383

Fast Learning of Temporal Action Proposal via Dense Boundary Generator

3 code implementations • 11 Nov 2019 • Chuming Lin, Jian Li, Yabiao Wang, Ying Tai, Donghao Luo, Zhipeng Cui, Chengjie Wang, Jilin Li, Feiyue Huang, Rongrong Ji

In this paper, we propose an efficient and unified framework to generate temporal action proposals named Dense Boundary Generator (DBG), which draws inspiration from boundary-sensitive methods and implements boundary classification and action completeness regression for densely distributed proposals.

Ranked #7 on Temporal Action Localization on FineAction

General Classification Optical Flow Estimation +2

344

HifiFace: 3D Shape and Semantic Prior Guided High Fidelity Face Swapping

1 code implementation • 18 Jun 2021 • YuHan Wang, Xu Chen, Junwei Zhu, Wenqing Chu, Ying Tai, Chengjie Wang, Jilin Li, Yongjian Wu, Feiyue Huang, Rongrong Ji

In this work, we propose a high fidelity face swapping method, called HifiFace, which can well preserve the face shape of the source face and generate photo-realistic results.

Ranked #7 on Face Swapping on FaceForensics++

3D Face Reconstruction Face Recognition +2

328

Siamese Box Adaptive Network for Visual Tracking

2 code implementations • CVPR 2020 • Zedu Chen, Bineng Zhong, Guorong Li, Shengping Zhang, Rongrong Ji

Most of the existing trackers usually rely on either a multi-scale searching scheme or pre-defined anchor boxes to accurately estimate the scale and aspect ratio of a target.

Visual Tracking

269

HRank: Filter Pruning using High-Rank Feature Map

2 code implementations • CVPR 2020 • Mingbao Lin, Rongrong Ji, Yan Wang, Yichen Zhang, Baochang Zhang, Yonghong Tian, Ling Shao

The principle behind our pruning is that low-rank feature maps contain less information, and thus pruned results can be easily reproduced.

Network Pruning Vocal Bursts Intensity Prediction

245

You Only Segment Once: Towards Real-Time Panoptic Segmentation

2 code implementations • CVPR 2023 • Jie Hu, Linyan Huang, Tianhe Ren, Shengchuan Zhang, Rongrong Ji, Liujuan Cao

To reduce the computational overhead, we design a feature pyramid aggregator for the feature map extraction, and a separable dynamic decoder for the panoptic kernel generation.

Panoptic Segmentation Segmentation

227

Multinomial Distribution Learning for Effective Neural Architecture Search

1 code implementation • ICCV 2019 • Xiawu Zheng, Rongrong Ji, Lang Tang, Baochang Zhang, Jianzhuang Liu, Qi Tian

Therefore, NAS can be transformed to a multinomial distribution learning problem, i.e., the distribution is optimized to have a high expectation of the performance.

Neural Architecture Search

207

ISTR: End-to-End Instance Segmentation with Transformers

1 code implementation • 3 May 2021 • Jie Hu, Liujuan Cao, Yao Lu, Shengchuan Zhang, Yan Wang, Ke Li, Feiyue Huang, Ling Shao, Rongrong Ji

However, such an upgrade is not applicable to instance segmentation, due to its significantly higher output dimensions compared to object detection.

Ranked #21 on Instance Segmentation on COCO test-dev

Instance Segmentation object-detection +3

200

Improving Image Captioning by Leveraging Intra- and Inter-layer Global Representation in Transformer Network

1 code implementation • 13 Dec 2020 • Jiayi Ji, Yunpeng Luo, Xiaoshuai Sun, Fuhai Chen, Gen Luo, Yongjian Wu, Yue Gao, Rongrong Ji

The latter contains a Global Adaptive Controller that can adaptively fuse the global information into the decoder to guide the caption generation.

Caption Generation Image Captioning

193

Dual-Level Collaborative Transformer for Image Captioning

1 code implementation • 16 Jan 2021 • Yunpeng Luo, Jiayi Ji, Xiaoshuai Sun, Liujuan Cao, Yongjian Wu, Feiyue Huang, Chia-Wen Lin, Rongrong Ji

Descriptive region features extracted by object detection networks have played an important role in the recent advancements of image captioning.

Descriptive Image Captioning +2

193

Towards Efficient Visual Adaption via Structural Re-parameterization

1 code implementation • 16 Feb 2023 • Gen Luo, Minglang Huang, Yiyi Zhou, Xiaoshuai Sun, Guannan Jiang, Zhiyu Wang, Rongrong Ji

Experimental results show the superior performance and efficiency of RepAdapter over state-of-the-art PETL methods.

Semantic Segmentation Transfer Learning

176

Rethinking Performance Estimation in Neural Architecture Search

1 code implementation • CVPR 2020 • Xiawu Zheng, Rongrong Ji, Qiang Wang, Qixiang Ye, Zhenguo Li, Yonghong Tian, Qi Tian

In this paper, we provide a novel yet systematic rethinking of PE in a resource constrained regime, termed budgeted PE (BPE), which precisely and effectively estimates the performance of an architecture sampled from an architecture space.

Neural Architecture Search

165

Feast Your Eyes: Mixture-of-Resolution Adaptation for Multimodal Large Language Models

1 code implementation • 5 Mar 2024 • Gen Luo, Yiyi Zhou, Yuxin Zhang, Xiawu Zheng, Xiaoshuai Sun, Rongrong Ji

Contrary to previous works, we study this problem from the perspective of image resolution, and reveal that a combination of low- and high-resolution visual features can effectively mitigate this shortcoming.

Ranked #57 on Visual Question Answering on MM-Vet

Visual Question Answering

157

Removing the Background by Adding the Background: Towards Background Robust Self-supervised Video Representation Learning

2 code implementations • CVPR 2021 • Jinpeng Wang, Yuting Gao, Ke Li, Yiqi Lin, Andy J. Ma, Hao Cheng, Pai Peng, Feiyue Huang, Rongrong Ji, Xing Sun

Then we force the model to pull the feature of the distracting video and the feature of the original video closer, so that the model is explicitly restricted to resist the background influence, focusing more on the motion changes.

Representation Learning Self-Supervised Learning

153

Filter Grafting for Deep Neural Networks

2 code implementations • CVPR 2020 • Fanxu Meng, Hao Cheng, Ke Li, Zhixin Xu, Rongrong Ji, Xing Sun, Guangming Lu

To better perform the grafting process, we develop an entropy-based criterion to measure the information of filters and an adaptive weighting strategy for balancing the grafted information among networks.

140

Filter Grafting for Deep Neural Networks: Reason, Method, and Cultivation

1 code implementation • 26 Apr 2020 • Hao Cheng, Fanxu Meng, Ke Li, Yuting Gao, Guangming Lu, Xing Sun, Rongrong Ji

To gain a universal improvement on both valid and invalid filters, we compensate grafting with distillation (Cultivation) to overcome the drawback of grafting.

valid

140

Channel Pruning via Automatic Structure Search

1 code implementation • 23 Jan 2020 • Mingbao Lin, Rongrong Ji, Yuxin Zhang, Baochang Zhang, Yongjian Wu, Yonghong Tian

In this paper, we propose a new channel pruning method based on the artificial bee colony algorithm (ABC), dubbed ABCPruner, which aims to efficiently find the optimal pruned structure, i.e., the channel number in each layer, rather than selecting "important" channels as previous works did.

137

Multi-task Collaborative Network for Joint Referring Expression Comprehension and Segmentation

1 code implementation • CVPR 2020 • Gen Luo, Yiyi Zhou, Xiaoshuai Sun, Liujuan Cao, Chenglin Wu, Cheng Deng, Rongrong Ji

In addition, we address a key challenge in this multi-task setup, i.e., the prediction conflict, with two innovative designs, namely Consistency Energy Maximization (CEM) and Adaptive Soft Non-Located Suppression (ASNLS).

Ranked #4 on Generalized Referring Expression Comprehension on gRefCOCO

Generalized Referring Expression Comprehension Referring Expression +2

131

STAR Loss: Reducing Semantic Ambiguity in Facial Landmark Detection

1 code implementation • CVPR 2023 • Zhenglin Zhou, Huaxia Li, Hong Liu, Nanyang Wang, Gang Yu, Rongrong Ji

To solve this problem, we propose a Self-adapTive Ambiguity Reduction (STAR) loss by exploiting the properties of semantic ambiguity.

Ranked #1 on Face Alignment on 300W

Face Alignment Facial Landmark Detection

130

SeqTR: A Simple yet Universal Network for Visual Grounding

3 code implementations • 30 Mar 2022 • Chaoyang Zhu, Yiyi Zhou, Yunhang Shen, Gen Luo, Xingjia Pan, Mingbao Lin, Chao Chen, Liujuan Cao, Xiaoshuai Sun, Rongrong Ji

In this paper, we propose a simple yet universal network termed SeqTR for visual grounding tasks, e.g., phrase localization, referring expression comprehension (REC) and segmentation (RES).

Referring Expression Referring Expression Comprehension +1

122

RSTNet: Captioning With Adaptive Attention on Visual and Non-Visual Words

1 code implementation • CVPR 2021 • Xuying Zhang, Xiaoshuai Sun, Yunpeng Luo, Jiayi Ji, Yiyi Zhou, Yongjian Wu, Feiyue Huang, Rongrong Ji

Then, we build a BERT-based language model to extract language context and propose an Adaptive-Attention (AA) module on top of a transformer decoder to adaptively measure the contribution of visual and language cues before making decisions for word prediction.

Image Captioning Language Modelling +2

118

Enhancing Unsupervised Video Representation Learning by Decoupling the Scene and the Motion

3 code implementations • 12 Sep 2020 • Jinpeng Wang, Yuting Gao, Ke Li, Jianguo Hu, Xinyang Jiang, Xiaowei Guo, Rongrong Ji, Xing Sun

Specifically, we construct a positive clip and a negative clip for each video.

Action Recognition Representation Learning

113

X-CLIP: End-to-End Multi-grained Contrastive Learning for Video-Text Retrieval

1 code implementation • 15 Jul 2022 • Yiwei Ma, Guohai Xu, Xiaoshuai Sun, Ming Yan, Ji Zhang, Rongrong Ji

However, cross-grained contrast, which is the contrast between coarse-grained representations and fine-grained representations, has rarely been explored in prior research.

Ranked #12 on Video Retrieval on MSVD

Contrastive Learning Retrieval +2

110

Asymmetric Co-Teaching for Unsupervised Cross Domain Person Re-Identification

1 code implementation • 3 Dec 2019 • Fengxiang Yang, Ke Li, Zhun Zhong, Zhiming Luo, Xing Sun, Hao Cheng, Xiaowei Guo, Feiyue Huang, Rongrong Ji, Shaozi Li

This procedure encourages that the selected training samples can be both clean and miscellaneous, and that the two models can promote each other iteratively.

Ranked #9 on Unsupervised Domain Adaptation on Market to Duke

Clustering Miscellaneous +2

107

Multiple Expert Brainstorming for Domain Adaptive Person Re-identification

2 code implementations • ECCV 2020 • Yunpeng Zhai, Qixiang Ye, Shijian Lu, Mengxi Jia, Rongrong Ji, Yonghong Tian

Often the best-performing deep neural models are ensembles of multiple base-level networks; nevertheless, ensemble learning with respect to domain adaptive person re-ID remains unexplored.

Domain Adaptive Person Re-Identification Ensemble Learning +1

103

Boosting Crowd Counting via Multifaceted Attention

1 code implementation • CVPR 2022 • Hui Lin, Zhiheng Ma, Rongrong Ji, YaoWei Wang, Xiaopeng Hong

Secondly, we design the Local Attention Regularization to supervise the training of LRA by minimizing the deviation among the attention for different feature locations.

Crowd Counting

100

CF-ViT: A General Coarse-to-Fine Method for Vision Transformer

1 code implementation • 8 Mar 2022 • Mengzhao Chen, Mingbao Lin, Ke Li, Yunhang Shen, Yongjian Wu, Fei Chao, Rongrong Ji

Our proposed CF-ViT is motivated by two important observations in modern ViT models: (1) The coarse-grained patch splitting can locate informative regions of an input image.

Rotated Binary Neural Network

2 code implementations • NeurIPS 2020 • Mingbao Lin, Rongrong Ji, Zihan Xu, Baochang Zhang, Yan Wang, Yongjian Wu, Feiyue Huang, Chia-Wen Lin

In this paper, for the first time, we explore the influence of angular bias on the quantization error and then introduce a Rotated Binary Neural Network (RBNN), which considers the angle alignment between the full-precision weight vector and its binarized version.

Binarization Quantization

SiMaN: Sign-to-Magnitude Network Binarization

2 code implementations • 16 Feb 2021 • Mingbao Lin, Rongrong Ji, Zihan Xu, Baochang Zhang, Fei Chao, Chia-Wen Lin, Ling Shao

In this paper, we show that our weight binarization provides an analytical solution by encoding high-magnitude weights into +1s, and 0s otherwise.

Binarization

Meta Architecture for Point Cloud Analysis

1 code implementation • CVPR 2023 • Haojia Lin, Xiawu Zheng, Lijiang Li, Fei Chao, Shanshan Wang, Yan Wang, Yonghong Tian, Rongrong Ji

However, the lack of a unified framework to interpret those networks makes any systematic comparison, contrast, or analysis challenging, and practically limits healthy development of the field.

Ranked #2 on 3D Semantic Segmentation on OpenTrench3D

3D Semantic Segmentation

Towards End-to-end Semi-supervised Learning for One-stage Object Detection

1 code implementation • 22 Feb 2023 • Gen Luo, Yiyi Zhou, Lei Jin, Xiaoshuai Sun, Rongrong Ji

In addition to this challenge, we also reveal two key issues in one-stage SSOD, which are low-quality pseudo-labeling and multi-task optimization conflict, respectively.

object-detection Object Detection +2

Active Teacher for Semi-Supervised Object Detection

1 code implementation • CVPR 2022 • Peng Mi, Jianghang Lin, Yiyi Zhou, Yunhang Shen, Gen Luo, Xiaoshuai Sun, Liujuan Cao, Rongrong Fu, Qiang Xu, Rongrong Ji

In this paper, we study teacher-student learning from the perspective of data initialization and propose a novel algorithm called Active Teacher (source code is available at https://github.com/HunterJ-Lin/ActiveTeacher) for semi-supervised object detection (SSOD).

Object object-detection +2

ARM: Any-Time Super-Resolution Method

1 code implementation • 21 Mar 2022 • Bohong Chen, Mingbao Lin, Kekai Sheng, Mengdan Zhang, Peixian Chen, Ke Li, Liujuan Cao, Rongrong Ji

To that effect, we construct an Edge-to-PSNR lookup table that maps the edge score of an image patch to the PSNR performance for each subnet, together with a set of computation costs for the subnets.

Image Super-Resolution

DiffRate: Differentiable Compression Rate for Efficient Vision Transformers

1 code implementation • ICCV 2023 • Mengzhao Chen, Wenqi Shao, Peng Xu, Mingbao Lin, Kaipeng Zhang, Fei Chao, Rongrong Ji, Yu Qiao, Ping Luo

Token compression aims to speed up large-scale vision transformers (e.g., ViTs) by pruning (dropping) or merging tokens.

Ranked #4 on Efficient ViTs on ImageNet-1K (with DeiT-S)

Efficient ViTs

X-Dreamer: Creating High-quality 3D Content by Bridging the Domain Gap Between Text-to-2D and Text-to-3D Generation

1 code implementation • 30 Nov 2023 • Yiwei Ma, Yijun Fan, Jiayi Ji, Haowei Wang, Xiaoshuai Sun, Guannan Jiang, Annan Shu, Rongrong Ji

Nevertheless, a substantial domain gap exists between 2D images and 3D assets, primarily attributed to variations in camera-related attributes and the exclusive presence of foreground objects.

3D Generation Text to 3D

Beyond Max-Margin: Class Margin Equilibrium for Few-shot Object Detection

2 code implementations • CVPR 2021 • Bohao Li, Boyu Yang, Chang Liu, Feng Liu, Rongrong Ji, Qixiang Ye

Few-shot object detection has made substantial progress by representing novel class objects using the feature representation learned upon a set of base class objects.

Ranked #14 on Few-Shot Object Detection on MS-COCO (10-shot)

Few-Shot Object Detection object-detection

TRAR: Routing the Attention Spans in Transformer for Visual Question Answering

1 code implementation • ICCV 2021 • Yiyi Zhou, Tianhe Ren, Chaoyang Zhu, Xiaoshuai Sun, Jianzhuang Liu, Xinghao Ding, Mingliang Xu, Rongrong Ji

Due to the superior ability of global dependency modeling, Transformer and its variants have become the primary choice of many vision-and-language tasks.

Question Answering Referring Expression +2

Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation

1 code implementation • 19 Dec 2023 • Sihan Liu, Yiwei Ma, Xiaoqing Zhang, Haowei Wang, Jiayi Ji, Xiaoshuai Sun, Rongrong Ji

Referring Remote Sensing Image Segmentation (RRSIS) is a new challenge that combines computer vision and natural language processing, delineating specific regions in aerial images as described by textual queries.

Image Segmentation Segmentation +1

Discover Cross-Modality Nuances for Visible-Infrared Person Re-Identification

1 code implementation • CVPR 2021 • Qiong Wu, Pingyang Dai, Jie Chen, Chia-Wen Lin, Yongjian Wu, Feiyue Huang, Bineng Zhong, Rongrong Ji

In this paper, we propose a joint Modality and Pattern Alignment Network (MPANet) to discover cross-modality nuances in different patterns for visible-infrared person Re-ID, which introduces a modality alleviation module and a pattern alignment module to jointly extract discriminative features.

Person Re-Identification

PAMS: Quantized Super-Resolution via Parameterized Max Scale

1 code implementation • ECCV 2020 • Huixia Li, Chenqian Yan, Shaohui Lin, Xiawu Zheng, Yuchao Li, Baochang Zhang, Fan Yang, Rongrong Ji

Specifically, most state-of-the-art SR models without batch normalization have a large dynamic quantization range, which also serves as another cause of performance drop.

Quantization Super-Resolution +1

Towards Optimal Structured CNN Pruning via Generative Adversarial Learning

1 code implementation • CVPR 2019 • Shaohui Lin, Rongrong Ji, Chenqian Yan, Baochang Zhang, Liujuan Cao, Qixiang Ye, Feiyue Huang, David Doermann

In this paper, we propose an effective structured pruning approach that jointly prunes filters as well as other structures in an end-to-end manner.

Filter Sketch for Network Pruning

1 code implementation • 23 Jan 2020 • Mingbao Lin, Liujuan Cao, Shaojie Li, Qixiang Ye, Yonghong Tian, Jianzhuang Liu, Qi Tian, Rongrong Ji

Our approach, referred to as FilterSketch, encodes the second-order information of pre-trained weights, which enables the representation capacity of pruned networks to be recovered with a simple fine-tuning procedure.

Network Pruning

Learning Efficient GANs for Image Translation via Differentiable Masks and co-Attention Distillation

1 code implementation • 17 Nov 2020 • Shaojie Li, Mingbao Lin, Yan Wang, Fei Chao, Ling Shao, Rongrong Ji

The latter simultaneously distills informative attention maps from both the generator and discriminator of a pre-trained model to the searched generator, effectively stabilizing the adversarial training of our light-weight model.

Translation

DDPNAS: Efficient Neural Architecture Search via Dynamic Distribution Pruning

1 code implementation • 28 May 2019 • Xiawu Zheng, Chenyi Yang, Shaokun Zhang, Yan Wang, Baochang Zhang, Yongjian Wu, Yunsheng Wu, Ling Shao, Rongrong Ji

With the proposed efficient network generation method, we directly obtain the optimal neural architectures on given constraints, which is practical for on-device models across diverse search spaces and constraints.

Neural Architecture Search

You Only Compress Once: Towards Effective and Elastic BERT Compression via Exploit-Explore Stochastic Nature Gradient

1 code implementation • 4 Jun 2021 • Shaokun Zhang, Xiawu Zheng, Chenyi Yang, Yuchao Li, Yan Wang, Fei Chao, Mengdi Wang, Shen Li, Jun Yang, Rongrong Ji

Motivated by the necessity of efficient inference across various constraints on BERT, we propose a novel approach, YOCO-BERT, to achieve compress once and deploy everywhere.

AutoML Model Compression

A Survivor in the Era of Large-Scale Pretraining: An Empirical Study of One-Stage Referring Expression Comprehension

1 code implementation • 17 Apr 2022 • Gen Luo, Yiyi Zhou, Jiamu Sun, Xiaoshuai Sun, Rongrong Ji

But the most encouraging finding is that with much less training overhead and parameters, SimREC can still achieve better performance than a set of large-scale pre-trained models, e.g., UNITER and VILLA, portraying the special role of REC in existing V&L research.

Data Augmentation Referring Expression +1

Neural Architecture Search With Representation Mutual Information

1 code implementation • CVPR 2022 • Xiawu Zheng, Xiang Fei, Lei Zhang, Chenglin Wu, Fei Chao, Jianzhuang Liu, Wei Zeng, Yonghong Tian, Rongrong Ji

Building upon RMI, we further propose a new search algorithm termed RMI-NAS, supported by a theorem that guarantees the global optimum of the searched architecture.

Neural Architecture Search

Exploring Target Representations for Masked Autoencoders

1 code implementation • 8 Sep 2022 • Xingbin Liu, Jinghao Zhou, Tao Kong, Xianming Lin, Rongrong Ji

Masked autoencoders have become popular training paradigms for self-supervised visual representation learning.

Ranked #6 on Self-Supervised Image Classification on ImageNet (finetuned)

Instance Segmentation Knowledge Distillation +6

Exploiting Kernel Sparsity and Entropy for Interpretable CNN Compression

1 code implementation • CVPR 2019 • Yuchao Li, Shaohui Lin, Baochang Zhang, Jianzhuang Liu, David Doermann, Yongjian Wu, Feiyue Huang, Rongrong Ji

The relationship between the input feature maps and 2D kernels is revealed in a theoretical framework, based on which a kernel sparsity and entropy (KSE) indicator is proposed to quantitate the feature map importance in a feature-agnostic manner to guide model compression.

Clustering Model Compression

Improving Face Recognition from Hard Samples via Distribution Distillation Loss

2 code implementations • ECCV 2020 • Yuge Huang, Pengcheng Shen, Ying Tai, Shaoxin Li, Xiaoming Liu, Jilin Li, Feiyue Huang, Rongrong Ji

To improve the performance on those hard samples for general tasks, we propose a novel Distribution Distillation Loss to narrow the performance gap between easy and hard samples, which is simple, effective, and generic for various types of facial variations.

Face Recognition

Attribute Guided Unpaired Image-to-Image Translation with Semi-supervised Learning

1 code implementation • 29 Apr 2019 • Xinyang Li, Jie Hu, Shengchuan Zhang, Xiaopeng Hong, Qixiang Ye, Chenglin Wu, Rongrong Ji

In particular, the benefits of AGUIT are two-fold: (1) It adopts a novel semi-supervised learning process by translating attributes of labeled data to unlabeled data, and then reconstructing the unlabeled data by a cycle consistency operation.

Attribute Disentanglement +2

Long-Range Feature Propagating for Natural Image Matting

1 code implementation • 25 Sep 2021 • Qinglin Liu, Haozhe Xie, Shengping Zhang, Bineng Zhong, Rongrong Ji

Finally, we use the matting module which takes the image, trimap and context features to estimate the alpha matte.

Ranked #6 on Image Matting on Composition-1K (using extra training data)

Image Matting

Super Vision Transformer

1 code implementation • 23 May 2022 • Mingbao Lin, Mengzhao Chen, Yuxin Zhang, Chunhua Shen, Rongrong Ji, Liujuan Cao

Experimental results on ImageNet demonstrate that our SuperViT can considerably reduce the computational costs of ViT models while even increasing performance.

ECO-TR: Efficient Correspondences Finding Via Coarse-to-Fine Refinement

1 code implementation • 25 Sep 2022 • Dongli Tan, Jiang-Jiang Liu, Xingyu Chen, Chao Chen, Ruixin Zhang, Yunhang Shen, Shouhong Ding, Rongrong Ji

In this paper, we propose an efficient structure named Efficient Correspondence Transformer (ECO-TR) by finding correspondences in a coarse-to-fine manner, which significantly improves the efficiency of the functional correspondence model.

Outlier Detection

Asynchronous Bidirectional Decoding for Neural Machine Translation

2 code implementations • 16 Jan 2018 • Xiangwen Zhang, Jinsong Su, Yue Qin, Yang Liu, Rongrong Ji, Hongji Wang

The dominant neural machine translation (NMT) models apply unified attentional encoder-decoder neural networks for translation.

Machine Translation NMT +1

1xN Pattern for Pruning Convolutional Neural Networks

1 code implementation • 31 May 2021 • Mingbao Lin, Yuxin Zhang, Yuchao Li, Bohong Chen, Fei Chao, Mengdi Wang, Shen Li, Yonghong Tian, Rongrong Ji

We also provide a workflow of filter rearrangement that first rearranges the weight matrix in the output channel dimension to derive more influential blocks for accuracy improvements and then applies similar rearrangement to the next-layer weights in the input channel dimension to ensure correct convolutional operations.

Network Pruning

Clover: Towards A Unified Video-Language Alignment and Fusion Model

1 code implementation • CVPR 2023 • Jingjia Huang, Yinan Li, Jiashi Feng, Xinglong Wu, Xiaoshuai Sun, Rongrong Ji

We then introduce Clover, a Correlated Video-Language pre-training method, towards a universal Video-Language model for solving multiple video understanding tasks with neither performance nor efficiency compromise.

Ranked #1 on Video Question Answering on LSMDC-FiB

Language Modelling Question Answering +10

ReCU: Reviving the Dead Weights in Binary Neural Networks

3 code implementations • ICCV 2021 • Zihan Xu, Mingbao Lin, Jianzhuang Liu, Jie Chen, Ling Shao, Yue Gao, Yonghong Tian, Rongrong Ji

We prove that reviving the "dead weights" by ReCU can result in a smaller quantization error.

Binarization Quantization

AutoDiffusion: Training-Free Optimization of Time Steps and Architectures for Automated Diffusion Model Acceleration

1 code implementation • ICCV 2023 • Lijiang Li, Huixia Li, Xiawu Zheng, Jie Wu, Xuefeng Xiao, Rui Wang, Min Zheng, Xin Pan, Fei Chao, Rongrong Ji

Therefore, we propose to search for the optimal time step sequence and compressed model architecture in a unified framework to achieve effective image generation for diffusion models without any further training.

Image Generation single-image-generation

Make Sharpness-Aware Minimization Stronger: A Sparsified Perturbation Approach

1 code implementation • 11 Oct 2022 • Peng Mi, Li Shen, Tianhe Ren, Yiyi Zhou, Xiaoshuai Sun, Rongrong Ji, DaCheng Tao

One of the popular solutions is Sharpness-Aware Minimization (SAM), which smooths the loss landscape via minimizing the maximized change of training loss when adding a perturbation to the weight.

Revisiting Discriminator in GAN Compression: A Generator-discriminator Cooperative Compression Scheme

1 code implementation • NeurIPS 2021 • Shaojie Li, Jie Wu, Xuefeng Xiao, Fei Chao, Xudong Mao, Rongrong Ji

In this work, we revisit the role of discriminator in GAN compression and design a novel generator-discriminator cooperative compression scheme for GAN compression, termed GCC.

Real-Time Image Demoireing on Mobile Devices

1 code implementation • 4 Feb 2023 • Yuxin Zhang, Mingbao Lin, Xunchao Li, Han Liu, Guozhi Wang, Fei Chao, Shuai Ren, Yafei Wen, Xiaoxin Chen, Rongrong Ji

In this paper, we launch the first study on accelerating demoireing networks and propose a dynamic demoireing acceleration method (DDA) towards a real-time deployment on mobile devices.

Training-free Transformer Architecture Search

1 code implementation • CVPR 2022 • Qinqin Zhou, Kekai Sheng, Xiawu Zheng, Ke Li, Xing Sun, Yonghong Tian, Jie Chen, Rongrong Ji

Recently, Vision Transformer (ViT) has achieved remarkable success in several computer vision tasks.

Solving Oscillation Problem in Post-Training Quantization Through a Theoretical Perspective

1 code implementation • CVPR 2023 • Yuexiao Ma, Huixia Li, Xiawu Zheng, Xuefeng Xiao, Rui Wang, Shilei Wen, Xin Pan, Fei Chao, Rongrong Ji

In particular, we first formulate the oscillation in PTQ and prove the problem is caused by the difference in module capacity.

Quantization

Lottery Jackpots Exist in Pre-trained Models

2 code implementations • 18 Apr 2021 • Yuxin Zhang, Mingbao Lin, Yunshan Zhong, Fei Chao, Rongrong Ji

Existing studies achieve the sparsity of neural networks via time-consuming weight training or complex searching on networks with expanded width, which greatly limits the applications of network pruning.

Network Pruning

Paper
Code

IntraQ: Learning Synthetic Images with Intra-Class Heterogeneity for Zero-Shot Network Quantization

1 code implementation • CVPR 2022 • Yunshan Zhong, Mingbao Lin, Gongrui Nan, Jianzhuang Liu, Baochang Zhang, Yonghong Tian, Rongrong Ji

In this paper, we observe an interesting phenomenon of intra-class heterogeneity in real data and show that existing methods fail to retain this property in their synthetic images, which causes a limited performance increase.

Quantization

Paper
Code

InterFormer: Real-time Interactive Image Segmentation

1 code implementation • ICCV 2023 • You Huang, Hao Yang, Ke Sun, Shengchuan Zhang, Liujuan Cao, Guannan Jiang, Rongrong Ji

Interactive image segmentation enables annotators to efficiently perform pixel-level annotation for segmentation tasks.

Computational Efficiency Image Segmentation +3

Paper
Code

CamoDiffusion: Camouflaged Object Detection via Conditional Diffusion Models

1 code implementation • 29 May 2023 • Zhongxi Chen, Ke Sun, Xianming Lin, Rongrong Ji

Due to the stochastic sampling process of diffusion, our model is capable of sampling multiple possible predictions from the mask distribution, avoiding the problem of overconfident point estimation.

Denoising Object +3

Paper
Code

Distilling a Powerful Student Model via Online Knowledge Distillation

1 code implementation • 26 Mar 2021 • Shaojie Li, Mingbao Lin, Yan Wang, Yongjian Wu, Yonghong Tian, Ling Shao, Rongrong Ji

Besides, a self-distillation module is adopted to convert the feature map of deeper layers into a shallower one.

Knowledge Distillation

Paper
Code

Dynamic Dual Trainable Bounds for Ultra-low Precision Super-Resolution Networks

1 code implementation • 8 Mar 2022 • Yunshan Zhong, Mingbao Lin, Xunchao Li, Ke Li, Yunhang Shen, Fei Chao, Yongjian Wu, Rongrong Ji

However, these methods suffer from severe performance degradation when quantizing the SR models to ultra-low precision (e.g., 2-bit and 3-bit) with the low-cost layer-wise quantizer.

Quantization Super-Resolution

Paper
Code

PixelFolder: An Efficient Progressive Pixel Synthesis Network for Image Generation

1 code implementation • 2 Apr 2022 • Jing He, Yiyi Zhou, Qi Zhang, Jun Peng, Yunhang Shen, Xiaoshuai Sun, Chao Chen, Rongrong Ji

Pixel synthesis is a promising research paradigm for image generation, which can well exploit pixel-wise prior knowledge for generation.

Image Generation regression

Paper
Code

Dynamic Sparse No Training: Training-Free Fine-tuning for Sparse LLMs

1 code implementation • 13 Oct 2023 • Yuxin Zhang, Lirui Zhao, Mingbao Lin, Yunyun Sun, Yiwu Yao, Xingjia Han, Jared Tanner, Shiwei Liu, Rongrong Ji

Inspired by Dynamic Sparse Training, DSnoT minimizes the reconstruction error between the dense and sparse LLMs by performing iterative weight pruning-and-growing on top of sparse LLMs.

Network Pruning

Paper
Code

Towards Optimal Discrete Online Hashing with Balanced Similarity

1 code implementation • 29 Jan 2019 • Mingbao Lin, Rongrong Ji, Hong Liu, Xiaoshuai Sun, Yongjian Wu, Yunsheng Wu

In this paper, we propose a novel supervised online hashing method, termed Balanced Similarity for Online Discrete Hashing (BSODH), to solve the above problems in a unified framework.

Retrieval

Paper
Code

Supervised Online Hashing via Hadamard Codebook Learning

1 code implementation • 28 Apr 2019 • Mingbao Lin, Rongrong Ji, Hong Liu, Yongjian Liu

Notably, the proposed HCOH can be embedded with supervised labels and is not limited to a predefined category number.

Retrieval Semantic Similarity +1

Paper
Code

Hadamard Matrix Guided Online Hashing

1 code implementation • 11 May 2019 • Mingbao Lin, Rongrong Ji, Hong Liu, Xiaoshuai Sun, Shen Chen, Qi Tian

We then treat the learning of hash functions as a set of binary classification problems to fit the assigned target code.

Binary Classification

Paper
Code

Towards Robustness Against Natural Language Word Substitutions

1 code implementation • ICLR 2021 • Xinshuai Dong, Anh Tuan Luu, Rongrong Ji, Hong Liu

Robustness against word substitutions has a well-defined and widely acceptable form, i.e., using semantically similar words as substitutions, and thus it is considered a fundamental stepping-stone towards broader robustness in natural language processing.

Natural Language Inference Sentiment Analysis

Paper
Code

X-Mesh: Towards Fast and Accurate Text-driven 3D Stylization via Dynamic Textual Guidance

1 code implementation • ICCV 2023 • Yiwei Ma, Xiaoqing Zhang, Xiaoshuai Sun, Jiayi Ji, Haowei Wang, Guannan Jiang, Weilin Zhuang, Rongrong Ji

Text-driven 3D stylization is a complex and crucial task in the fields of computer vision (CV) and computer graphics (CG), aimed at transforming a bare mesh to fit a target text.

Attribute

Paper
Code

Towards Visual Feature Translation

1 code implementation • CVPR 2019 • Jie Hu, Rongrong Ji, Hong Liu, Shengchuan Zhang, Cheng Deng, Qi Tian

In this paper, we make the first attempt towards visual feature translation to break through the barrier of using features across different visual search systems.

Translation

Paper
Code

An Information Theory-inspired Strategy for Automatic Network Pruning

1 code implementation • 19 Aug 2021 • Xiawu Zheng, Yuexiao Ma, Teng Xi, Gang Zhang, Errui Ding, Yuchao Li, Jie Chen, Yonghong Tian, Rongrong Ji

This practically limits the application of model compression when the model needs to be deployed on a wide range of devices.

AutoML Model Compression +1

Paper
Code

OMPQ: Orthogonal Mixed Precision Quantization

1 code implementation • 16 Sep 2021 • Yuexiao Ma, Taisong Jin, Xiawu Zheng, Yan Wang, Huixia Li, Yongjian Wu, Guannan Jiang, Wei Zhang, Rongrong Ji

Instead of solving a problem of the original integer programming, we propose to optimize a proxy metric, the concept of network orthogonality, which is highly correlated with the loss of the integer programming but also easy to optimize with linear programming.

AutoML Quantization

Paper
Code

DistilPose: Tokenized Pose Regression with Heatmap Distillation

1 code implementation • CVPR 2023 • Suhang Ye, Yingyi Zhang, Jie Hu, Liujuan Cao, Shengchuan Zhang, Lei Shen, Jun Wang, Shouhong Ding, Rongrong Ji

Specifically, DistilPose maximizes the transfer of knowledge from the teacher model (heatmap-based) to the student model (regression-based) through Token-distilling Encoder (TDE) and Simulated Heatmaps.

Knowledge Distillation Pose Estimation +1

Paper
Code

Pseudo-label Alignment for Semi-supervised Instance Segmentation

1 code implementation • ICCV 2023 • Jie Hu, Chen Chen, Liujuan Cao, Shengchuan Zhang, Annan Shu, Guannan Jiang, Rongrong Ji

Through extensive experiments conducted on the COCO and Cityscapes datasets, we demonstrate that PAIS is a promising framework for semi-supervised instance segmentation, particularly in cases where labeled data is severely limited.

Instance Segmentation Pseudo Label +3

Paper
Code

Pyramidal Person Re-IDentification via Multi-Loss Dynamic Training

1 code implementation • CVPR 2019 • Feng Zheng, Cheng Deng, Xing Sun, Xinyang Jiang, Xiaowei Guo, Zongqiao Yu, Feiyue Huang, Rongrong Ji

Most existing Re-IDentification (Re-ID) methods are highly dependent on precise bounding boxes that enable images to be aligned with each other.

Ranked #2 on Person Re-Identification on CUHK03-C

Person Re-Identification

Paper
Code

UWSOD: Toward Fully-Supervised-Level Capacity Weakly Supervised Object Detection

1 code implementation • NeurIPS 2020 • Yunhang Shen, Rongrong Ji, Zhiwei Chen, Yongjian Wu, Feiyue Huang

In this paper, we propose a unified WSOD framework, termed UWSOD, to develop a high-capacity general detection model with only image-level labels, which is self-contained and does not require external modules or additional supervision.

Object object-detection +2

Paper
Code

Network Pruning using Adaptive Exemplar Filters

1 code implementation • 20 Jan 2021 • Mingbao Lin, Rongrong Ji, Shaojie Li, Yan Wang, Yongjian Wu, Feiyue Huang, Qixiang Ye

Inspired by the face recognition community, we use a message passing algorithm Affinity Propagation on the weight matrices to obtain an adaptive number of exemplars, which then act as the preserved filters.

Face Recognition Network Pruning

Paper
Code

Dynamic Prototype Mask for Occluded Person Re-Identification

1 code implementation • 19 Jul 2022 • Lei Tan, Pingyang Dai, Rongrong Ji, Yongjian Wu

Although person re-identification has achieved an impressive improvement in recent years, the common occlusion case caused by different obstacles is still an unsettled issue in real application scenarios.

Person Re-Identification

Paper
Code

Beyond First Impressions: Integrating Joint Multi-modal Cues for Comprehensive 3D Representation

1 code implementation • 6 Aug 2023 • Haowei Wang, Jiji Tang, Jiayi Ji, Xiaoshuai Sun, Rongsheng Zhang, Yiwei Ma, Minda Zhao, Lincheng Li, Zeng Zhao, Tangjie Lv, Rongrong Ji

Insufficient synergy neglects the idea that a robust 3D representation should align with the joint vision-language space, rather than independently aligning with each modality.

Ranked #1 on Zero-shot 3D Point Cloud Classification on ModelNet40

3D Classification 3D Part Segmentation +5

Paper
Code

JM3D & JM3D-LLM: Elevating 3D Understanding with Joint Multi-modal Cues

1 code implementation • 14 Oct 2023 • Jiayi Ji, Haowei Wang, Changli Wu, Yiwei Ma, Xiaoshuai Sun, Rongrong Ji

The rising importance of 3D understanding, pivotal in computer vision, autonomous driving, and robotics, is evident.

Autonomous Driving Representation Learning

Paper
Code

Information Competing Process for Learning Diversified Representations

1 code implementation • NeurIPS 2019 • Jie Hu, Rongrong Ji, Shengchuan Zhang, Xiaoshuai Sun, Qixiang Ye, Chia-Wen Lin, Qi Tian

Learning representations with diversified information remains an open problem.

General Classification Image Classification +2

Paper
Code

Dual Distribution Alignment Network for Generalizable Person Re-Identification

1 code implementation • 27 Jul 2020 • Peixian Chen, Pingyang Dai, Jianzhuang Liu, Feng Zheng, Qi Tian, Rongrong Ji

Domain generalization (DG) serves as a promising solution to handle person Re-Identification (Re-ID), which trains the model using labels from the source domain alone, and then directly adopts the trained model to the target domain without model updating.

Domain Generalization Generalizable Person Re-identification

Paper
Code

Unified-Width Adaptive Dynamic Network for All-In-One Image Restoration

1 code implementation • 24 Jan 2024 • Yimin Xu, Nanxi Gao, Zhongyun Shan, Fei Chao, Rongrong Ji

In contrast to traditional image restoration methods, all-in-one image restoration techniques are gaining increased attention for their ability to restore images affected by diverse and unknown corruption types and levels.

Computational Efficiency Image Restoration

Paper
Code

Variational Neural Discourse Relation Recognizer

1 code implementation • EMNLP 2016 • Biao Zhang, Deyi Xiong, Jinsong Su, Qun Liu, Rongrong Ji, Hong Duan, Min Zhang

In order to perform efficient inference and learning, we introduce neural discourse relation models to approximate the prior and posterior distributions of the latent variable, and employ these approximated distributions to optimize a reparameterized variational lower bound.

Relation

Paper
Code

Carrying out CNN Channel Pruning in a White Box

1 code implementation • 24 Apr 2021 • Yuxin Zhang, Mingbao Lin, Chia-Wen Lin, Jie Chen, Feiyue Huang, Yongjian Wu, Yonghong Tian, Rongrong Ji

Specifically, to model the contribution of each channel to differentiating categories, we develop a class-wise mask for each channel, implemented in a dynamic training manner w.r.t.

Image Classification

Paper
Code

Training Compact CNNs for Image Classification using Dynamic-coded Filter Fusion

1 code implementation • 14 Jul 2021 • Mingbao Lin, Bohong Chen, Fei Chao, Rongrong Ji

Each filter in our DCFF is firstly given an inter-similarity distribution with a temperature parameter as a filter proxy, on top of which, a fresh Kullback-Leibler divergence based dynamic-coded criterion is proposed to evaluate the filter importance.

Image Classification

Paper
Code

Architecture Disentanglement for Deep Neural Networks

1 code implementation • ICCV 2021 • Jie Hu, Liujuan Cao, Qixiang Ye, Tong Tong, Shengchuan Zhang, Ke Li, Feiyue Huang, Rongrong Ji, Ling Shao

Based on the experimental results, we present three new findings that provide fresh insights into the inner logic of DNNs.

AutoML Disentanglement

Paper
Code

Prioritized Subnet Sampling for Resource-Adaptive Supernet Training

1 code implementation • 12 Sep 2021 • Bohong Chen, Mingbao Lin, Rongrong Ji, Liujuan Cao

At the end of training, our PSS-Net retains the best subnet in each pool to enable a fast switch between high-quality subnets for inference when the available resources vary.

Paper
Code

A Unified Framework for 3D Point Cloud Visual Grounding

1 code implementation • 23 Aug 2023 • Haojia Lin, Yongdong Luo, Xiawu Zheng, Lijiang Li, Fei Chao, Taisong Jin, Donghao Luo, Yan Wang, Liujuan Cao, Rongrong Ji

This elaborate design enables 3DRefTR to achieve both well-performing 3DRES and 3DREC capacities with only a 6% additional latency compared to the original 3DREC model.

Referring Expression Referring Expression Comprehension +1

Paper
Code

Fine-grained Data Distribution Alignment for Post-Training Quantization

1 code implementation • 9 Sep 2021 • Yunshan Zhong, Mingbao Lin, Mengzhao Chen, Ke Li, Yunhang Shen, Fei Chao, Yongjian Wu, Rongrong Ji

While post-training quantization is popular mostly because it avoids accessing the original complete training dataset, its poor performance also stems from scarce images.

Quantization

Paper
Code

Towards Lightweight Transformer via Group-wise Transformation for Vision-and-Language Tasks

1 code implementation • 16 Apr 2022 • Gen Luo, Yiyi Zhou, Xiaoshuai Sun, Yan Wang, Liujuan Cao, Yongjian Wu, Feiyue Huang, Rongrong Ji

Despite the exciting performance, Transformer is criticized for its excessive parameters and computation cost.

Image Classification

Paper
Code

Cycle Encoding of a StyleGAN Encoder for Improved Reconstruction and Editability

1 code implementation • 19 Jul 2022 • Xudong Mao, Liujuan Cao, Aurele T. Gnanha, Zhenguo Yang, Qing Li, Rongrong Ji

The recently proposed pivotal tuning model makes significant progress towards reconstruction and editability, by using a two-step approach that first inverts the input image into a latent code, called pivot code, and then alters the generator so that the input image can be accurately mapped into the pivot code.

Paper
Code

SMMix: Self-Motivated Image Mixing for Vision Transformers

1 code implementation • ICCV 2023 • Mengzhao Chen, Mingbao Lin, Zhihang Lin, Yuxin Zhang, Fei Chao, Rongrong Ji

Owing to the subtle design of its self-motivated paradigm, SMMix stands out for its smaller training overhead and better performance compared with other CutMix variants.

Paper
Code

Towards Local Visual Modeling for Image Captioning

1 code implementation • 13 Feb 2023 • Yiwei Ma, Jiayi Ji, Xiaoshuai Sun, Yiyi Zhou, Rongrong Ji

In this paper, we study the local visual modeling with grid features for image captioning, which is critical for generating accurate and detailed captions.

Image Captioning Object Recognition

Paper
Code

Automatic Network Pruning via Hilbert-Schmidt Independence Criterion Lasso under Information Bottleneck Principle

1 code implementation • ICCV 2023 • Song Guo, Lei Zhang, Xiawu Zheng, Yan Wang, Yuchao Li, Fei Chao, Chenglin Wu, Shengchuan Zhang, Rongrong Ji

In this paper, we try to solve this problem by introducing a principled and unified framework based on Information Bottleneck (IB) theory, which further guides us to an automatic pruning approach.

Network Pruning

Paper
Code

Projection & Probability-Driven Black-Box Attack

1 code implementation • CVPR 2020 • Jie Li, Rongrong Ji, Hong Liu, Jianzhuang Liu, Bineng Zhong, Cheng Deng, Qi Tian

For reducing the solution space, we first model the adversarial perturbation optimization problem as a process of recovering frequency-sparse perturbations with compressed sensing, under the setting that random noise in the low-frequency space is more likely to be adversarial.

Paper
Code

Learning Best Combination for Efficient N:M Sparsity

1 code implementation • 14 Jun 2022 • Yuxin Zhang, Mingbao Lin, Zhihang Lin, Yiting Luo, Ke Li, Fei Chao, Yongjian Wu, Rongrong Ji

In this paper, we show that the N:M learning can be naturally characterized as a combinatorial problem which searches for the best combination candidate within a finite collection.

Paper
Code

Instance Brownian Bridge as Texts for Open-vocabulary Video Instance Segmentation

1 code implementation • 18 Jan 2024 • Zesen Cheng, Kehan Li, Hao Li, Peng Jin, Chang Liu, Xiawu Zheng, Rongrong Ji, Jie Chen

To mold instance queries to follow Brownian bridge and accomplish alignment with class texts, we design Bridge-Text Alignment (BTA) to learn discriminative bridge-level representations of instances via contrastive objectives.

Instance Segmentation Semantic Segmentation +1

Paper
Code

Towards Compact ConvNets via Structure-Sparsity Regularized Filter Pruning

1 code implementation • 23 Jan 2019 • Shaohui Lin, Rongrong Ji, Yuchao Li, Cheng Deng, Xuelong Li

In this paper, we propose a novel filter pruning scheme, termed structured sparsity regularization (SSR), to simultaneously speed up the computation and reduce the memory overhead of CNNs, which can be well supported by various off-the-shelf deep learning libraries.

Domain Adaptation object-detection +2

Paper
Code

Towards Unified Token Learning for Vision-Language Tracking

1 code implementation • 27 Aug 2023 • Yaozong Zheng, Bineng Zhong, Qihua Liang, Guorong Li, Rongrong Ji, Xianxian Li

In this paper, we present a simple, flexible and effective vision-language (VL) tracking pipeline, termed MMTrack, which casts VL tracking as a token generation task.

Paper
Code

DiffAgent: Fast and Accurate Text-to-Image API Selection with Large Language Model

1 code implementation • 31 Mar 2024 • Lirui Zhao, Yue Yang, Kaipeng Zhang, Wenqi Shao, Yuxin Zhang, Yu Qiao, Ping Luo, Rongrong Ji

Text-to-image (T2I) generative models have attracted significant attention and found extensive applications within and beyond academic research.

Language Modelling Large Language Model

Paper
Code

LAB-Net: LAB Color-Space Oriented Lightweight Network for Shadow Removal

1 code implementation • 27 Aug 2022 • Hong Yang, Gongrui Nan, Mingbao Lin, Fei Chao, Yunhang Shen, Ke Li, Rongrong Ji

Finally, the LSA modules are further developed to fully use the prior information in non-shadow regions to cleanse the shadow regions.

Shadow Removal

Paper
Code

Towards Compact CNNs via Collaborative Compression

1 code implementation • CVPR 2021 • Yuchao Li, Shaohui Lin, Jianzhuang Liu, Qixiang Ye, Mengdi Wang, Fei Chao, Fan Yang, Jincheng Ma, Qi Tian, Rongrong Ji

Channel pruning and tensor decomposition have received extensive attention in convolutional neural network compression.

Neural Network Compression Tensor Decomposition

Paper
Code

Aha! Adaptive History-Driven Attack for Decision-Based Black-Box Models

1 code implementation • ICCV 2021 • Jie Li, Rongrong Ji, Peixian Chen, Baochang Zhang, Xiaopeng Hong, Ruixin Zhang, Shaoxin Li, Jilin Li, Feiyue Huang, Yongjian Wu

A common practice is to start from a large perturbation and then iteratively reduce it with a deterministic direction and a random one while keeping it adversarial.

Dimensionality Reduction

Paper
Code

Pruning Networks with Cross-Layer Ranking & k-Reciprocal Nearest Filters

1 code implementation • 15 Feb 2022 • Mingbao Lin, Liujuan Cao, Yuxin Zhang, Ling Shao, Chia-Wen Lin, Rongrong Ji

Then, we introduce a recommendation-based filter selection scheme where each filter recommends a group of its closest filters.

Image Classification Network Pruning

Paper
Code

Bi-directional Masks for Efficient N:M Sparse Training

1 code implementation • 13 Feb 2023 • Yuxin Zhang, Yiting Luo, Mingbao Lin, Yunshan Zhong, Jingjing Xie, Fei Chao, Rongrong Ji

We focus on addressing the dense backward propagation issue for training efficiency of N:M fine-grained sparsity that preserves at most N out of M consecutive weights and achieves practical speedups supported by the N:M sparse tensor core.

Paper
Code

A Real-time Global Inference Network for One-stage Referring Expression Comprehension

1 code implementation • 7 Dec 2019 • Yiyi Zhou, Rongrong Ji, Gen Luo, Xiaoshuai Sun, Jinsong Su, Xinghao Ding, Chia-Wen Lin, Qi Tian

Referring Expression Comprehension (REC) is an emerging research spot in computer vision, which refers to detecting the target region in an image given a text description.

feature selection Referring Expression +1

Paper
Code

Discriminator-Cooperated Feature Map Distillation for GAN Compression

1 code implementation • CVPR 2023 • Tie Hu, Mingbao Lin, Lizhou You, Fei Chao, Rongrong Ji

In contrast to conventional pixel-to-pixel match methods in feature map distillation, our DCD utilizes teacher discriminator as a transformation to drive intermediate results of the student generator to be perceptually close to corresponding outputs of the teacher generator.

Image Generation Knowledge Distillation

Paper
Code

Distribution-Flexible Subset Quantization for Post-Quantizing Super-Resolution Networks

1 code implementation • 10 May 2023 • Yunshan Zhong, Mingbao Lin, Jingjing Xie, Yuxin Zhang, Fei Chao, Rongrong Ji

Compared to the common iterative exhaustive search algorithm, our strategy avoids the enumeration of all possible combinations in the universal set, reducing the time complexity from exponential to linear.

Quantization Super-Resolution

Paper
Code

GuidedMix-Net: Learning to Improve Pseudo Masks Using Labeled Images as Reference

1 code implementation • 29 Jun 2021 • Peng Tu, Yawen Huang, Rongrong Ji, Feng Zheng, Ling Shao

To take advantage of the labeled examples and guide unlabeled data learning, we further propose a mask generation module to generate high-quality pseudo masks for the unlabeled data.

Ranked #1 on Semi-Supervised Semantic Segmentation on PASCAL VOC 2012 500 labels

Semi-Supervised Semantic Segmentation

Paper
Code

Face Sketch Synthesis Style Similarity: A New Structure Co-occurrence Texture Measure

1 code implementation • 9 Apr 2018 • Deng-Ping Fan, Shengchuan Zhang, Yu-Huan Wu, Ming-Ming Cheng, Bo Ren, Rongrong Ji, Paul L. Rosin

However, human perception of the similarity of two sketches will consider both structure and texture as essential factors and is not sensitive to slight ("pixel-level") mismatches.

Face Sketch Synthesis

Paper
Code

Scoot: A Perceptual Metric for Facial Sketches

1 code implementation • ICCV 2019 • Deng-Ping Fan, Shengchuan Zhang, Yu-Huan Wu, Yun Liu, Ming-Ming Cheng, Bo Ren, Paul L. Rosin, Rongrong Ji

In this paper, we design a perceptual metric, called Structure Co-Occurrence Texture (Scoot), which simultaneously considers the block-level spatial structure and co-occurrence texture statistics.

Face Sketch Synthesis SSIM

Paper
Code

MultiQuant: A Novel Multi-Branch Topology Method for Arbitrary Bit-width Network Quantization

1 code implementation • 14 May 2023 • Yunshan Zhong, Mingbao Lin, Yuyao Zhou, Mengzhao Chen, Yuxin Zhang, Fei Chao, Rongrong Ji

However, in this paper, we investigate existing methods and observe a significant accumulation of quantization errors caused by frequent bit-width switching of weights and activations, leading to limited performance.

Quantization

Paper
Code

DiffusionFace: Towards a Comprehensive Dataset for Diffusion-Based Face Forgery Analysis

1 code implementation • 27 Mar 2024 • Zhongxi Chen, Ke Sun, Ziyin Zhou, Xianming Lin, Xiaoshuai Sun, Liujuan Cao, Rongrong Ji

The rapid progress in deep learning has given rise to hyper-realistic facial forgery methods, leading to concerns related to misinformation and security risks.

Image Generation Misinformation

Paper
Code

Deep Instruction Tuning for Segment Anything Model

1 code implementation • 31 Mar 2024 • Xiaorui Huang, Gen Luo, Chaoyang Zhu, Bo Tong, Yiyi Zhou, Xiaoshuai Sun, Rongrong Ji

The Segment Anything Model (SAM) has recently exhibited powerful yet versatile capabilities on (un)conditional image segmentation tasks.

Image Segmentation Segmentation +1

Paper
Code

Semi-Supervised Panoptic Narrative Grounding

1 code implementation • 27 Oct 2023 • Danni Yang, Jiayi Ji, Xiaoshuai Sun, Haowei Wang, Yinan Li, Yiwei Ma, Rongrong Ji

Remarkably, our SS-PNG-NW+ outperforms fully-supervised models with only 30% and 50% supervision data, exceeding their performance by 0.8% and 1.1% respectively.

Data Augmentation Pseudo Label

Paper
Code

AffineQuant: Affine Transformation Quantization for Large Language Models

1 code implementation • 19 Mar 2024 • Yuexiao Ma, Huixia Li, Xiawu Zheng, Feng Ling, Xuefeng Xiao, Rui Wang, Shilei Wen, Fei Chao, Rongrong Ji

Among these techniques, Post-Training Quantization (PTQ) has emerged as a subject of considerable interest due to its noteworthy compression efficiency and cost-effectiveness in the context of training.

Quantization

Paper
Code

Adapting Pre-trained Language Models to Vision-Language Tasks via Dynamic Visual Prompting

1 code implementation • 1 Jun 2023 • Shubin Huang, Qiong Wu, Yiyi Zhou, WeiJie Chen, Rongsheng Zhang, Xiaoshuai Sun, Rongrong Ji

In addition, we experiment with DVP combined with the recently popular adapter approach to keep most of the parameters of PLMs intact when adapting to VL tasks, helping PLMs achieve a quick shift between single- and multi-modal tasks.

Transfer Learning Visual Prompting

Paper
Code

NICE: Improving Panoptic Narrative Detection and Segmentation with Cascading Collaborative Learning

1 code implementation • 17 Oct 2023 • Haowei Wang, Jiayi Ji, Tianyu Guo, Yilong Yang, Yiyi Zhou, Xiaoshuai Sun, Rongrong Ji

To address this, we introduce two cascading modules based on the barycenter of the mask, which are Coordinate Guided Aggregation (CGA) and Barycenter Driven Localization (BDL), responsible for segmentation and detection, respectively.

Segmentation Visual Grounding

Paper
Code

Learning to Learn Transferable Attack

1 code implementation • 10 Dec 2021 • Shuman Fang, Jie Li, Xianming Lin, Rongrong Ji

By treating the attack of both specific data and a modified model as a task, we expect the adversarial perturbations to adopt enough tasks for generalization.

Adversarial Attack Data Augmentation +1

Paper
Code

OptG: Optimizing Gradient-driven Criteria in Network Sparsity

1 code implementation • 30 Jan 2022 • Yuxin Zhang, Mingbao Lin, Mengzhao Chen, Fei Chao, Rongrong Ji

We prove that supermask training is to accumulate the criteria of gradient-driven sparsity for both removed and preserved weights, and it can partly solve the independence paradox.

Paper
Code

Shadow Removal by High-Quality Shadow Synthesis

1 code implementation • 8 Dec 2022 • Yunshan Zhong, Lizhou You, Yuxin Zhang, Fei Chao, Yonghong Tian, Rongrong Ji

Specifically, the encoder extracts the shadow feature of a region identity which is then paired with another region identity to serve as the generator input to synthesize a pseudo image.

Image Generation Shadow Removal +1

Paper
Code

Boosting the Cross-Architecture Generalization of Dataset Distillation through an Empirical Study

1 code implementation • 9 Dec 2023 • Lirui Zhao, Yuxin Zhang, Mingbao Lin, Fei Chao, Rongrong Ji

The poor cross-architecture generalization of dataset distillation greatly weakens its practical significance.

Inductive Bias

Paper
Code

Learning Image Demoireing from Unpaired Real Data

1 code implementation • 5 Jan 2024 • Yunshan Zhong, Yuyao Zhou, Yuxin Zhang, Fei Chao, Rongrong Ji

The proposed method, referred to as Unpaired Demoireing (UnDeM), synthesizes pseudo moire images from unpaired datasets, generating pairs with clean images for training demoireing models.

Paper
Code

EBFT: Effective and Block-Wise Fine-Tuning for Sparse LLMs

1 code implementation • 19 Feb 2024 • Song Guo, Fan Wu, Lei Zhang, Xiawu Zheng, Shengchuan Zhang, Fei Chao, Yiyu Shi, Rongrong Ji

For instance, on the Wikitext2 dataset with LlamaV1-7B at 70% sparsity, our proposed EBFT achieves a perplexity of 16.88, surpassing the state-of-the-art DSnoT with a perplexity of 75.14.

Paper
Code

Exploring Content Relationships for Distilling Efficient GANs

1 code implementation • 21 Dec 2022 • Lizhou You, Mingbao Lin, Tie Hu, Fei Chao, Rongrong Ji

This paper proposes a content relationship distillation (CRD) to tackle the over-parameterized generative adversarial networks (GANs) for the serviceability in cutting-edge devices.

Paper
Code

CAT: Collaborative Adversarial Training

1 code implementation • 27 Mar 2023 • Xingbin Liu, Huafeng Kuang, Xianming Lin, Yongjian Wu, Rongrong Ji

By revisiting the previous methods, we find different adversarial training methods have distinct robustness for sample instances.

Adversarial Robustness

Paper
Code

Latent Feature Relation Consistency for Adversarial Robustness

1 code implementation • 29 Mar 2023 • Xingbin Liu, Huafeng Kuang, Hong Liu, Xianming Lin, Yongjian Wu, Rongrong Ji

Deep neural networks have been applied in many computer vision tasks and achieved state-of-the-art performance.

Adversarial Robustness Relation

Paper
Code

Systematic Investigation of Sparse Perturbed Sharpness-Aware Minimization Optimizer

1 code implementation • 30 Jun 2023 • Peng Mi, Li Shen, Tianhe Ren, Yiyi Zhou, Tianshuo Xu, Xiaoshuai Sun, Tongliang Liu, Rongrong Ji, DaCheng Tao

Sharpness-Aware Minimization (SAM) is a popular solution that smooths the loss landscape by minimizing the maximized change of training loss when adding a perturbation to the weight.

Paper
Code

Fast Text-to-3D-Aware Face Generation and Manipulation via Direct Cross-modal Mapping and Geometric Regularization

1 code implementation • 11 Mar 2024 • Jinlu Zhang, Yiyi Zhou, Qiancheng Zheng, Xiaoxiong Du, Gen Luo, Jun Peng, Xiaoshuai Sun, Rongrong Ji

Text-to-3D-aware face (T3D Face) generation and manipulation is an emerging research hot spot in machine learning, which still suffers from low efficiency and poor quality.

Face Generation Text to 3D

Paper
Code

CutDiffusion: A Simple, Fast, Cheap, and Strong Diffusion Extrapolation Method

1 code implementation • 23 Apr 2024 • Mingbao Lin, Zhihang Lin, Wengyi Zhan, Liujuan Cao, Rongrong Ji

Transforming large pre-trained low-resolution diffusion models to cater to higher-resolution demands, i.e., diffusion extrapolation, significantly improves diffusion adaptability.

Paper
Code

Structured Modeling of Joint Deep Feature and Prediction Refinement for Salient Object Detection

1 code implementation • ICCV 2019 • Yingyue Xu, Dan Xu, Xiaopeng Hong, Wanli Ouyang, Rongrong Ji, Min Xu, Guoying Zhao

We formulate the CRF graphical model that involves message-passing of feature-feature, feature-prediction, and prediction-prediction, from the coarse scale to the finer scale, to update the features and the corresponding predictions.

object-detection RGB Salient Object Detection +1

Paper
Code

Shadow-Aware Dynamic Convolution for Shadow Removal

2 code implementations • 10 May 2022 • Yimin Xu, Mingbao Lin, Hong Yang, Fei Chao, Rongrong Ji

Inspired by the fact that the color mapping of the non-shadow region is easier to learn, our SADC processes the non-shadow region with a lightweight convolution module in a computationally cheap manner and recovers the shadow region with a more complicated convolution module to ensure the quality of image reconstruction.

Image Reconstruction Shadow Removal

Paper
Code

API-Net: Robust Generative Classifier via a Single Discriminator

1 code implementation • ECCV 2020 • Xinshuai Dong, Hong Liu, Rongrong Ji, Liujuan Cao, Qixiang Ye, Jianzhuang Liu, Qi Tian

On the contrary, a discriminative classifier only models the conditional distribution of labels given inputs, but benefits from effective optimization owing to its succinct structure.

Robust classification

Paper
Code

I&S-ViT: An Inclusive & Stable Method for Pushing the Limit of Post-Training ViTs Quantization

1 code implementation • 16 Nov 2023 • Yunshan Zhong, Jiawei Hu, Mingbao Lin, Mengzhao Chen, Rongrong Ji

Despite the scalable performance of vision transformers (ViTs), the dense computational costs (training & inference) undermine their position in industrial applications.

Quantization

Paper
Code

CerfGAN: A Compact, Effective, Robust, and Fast Model for Unsupervised Multi-Domain Image-to-Image Translation

no code implementations • 28 May 2018 • Xiao Liu, Shengchuan Zhang, Hong Liu, Xin Liu, Cheng Deng, Rongrong Ji

In principle, CerfGAN contains a novel component, i.e., a multi-class discriminator (MCD), which gives the model an extremely powerful ability to match multiple translation mappings.

Attribute Face Hallucination +4

Paper
Add Code

Action-Attending Graphic Neural Network

no code implementations • 17 Nov 2017 • Chaolong Li, Zhen Cui, Wenming Zheng, Chunyan Xu, Rongrong Ji, Jian Yang

The motion analysis of human skeletons is crucial for human action recognition, which is one of the most active topics in computer vision.

Action Analysis Action Recognition +3

Paper
Add Code

Deep Spatio-temporal Manifold Network for Action Recognition

no code implementations • 9 May 2017 • Ce Li, Chen Chen, Baochang Zhang, Qixiang Ye, Jungong Han, Rongrong Ji

Visual data such as videos are often sampled from a complex manifold.

Action Recognition Temporal Action Localization

Paper
Add Code

Output Constraint Transfer for Kernelized Correlation Filter in Tracking

no code implementations • 16 Dec 2016 • Baochang Zhang, Zhigang Li, Xian-Bin Cao, Qixiang Ye, Chen Chen, Linlin Shen, Alessandro Perina, Rongrong Ji

Kernelized Correlation Filter (KCF) is one of the state-of-the-art object trackers.

Bayesian Optimization

Paper
Add Code

Lattice-Based Recurrent Neural Network Encoders for Neural Machine Translation

no code implementations • 25 Sep 2016 • Jinsong Su, Zhixing Tan, Deyi Xiong, Rongrong Ji, Xiaodong Shi, Yang Liu

Neural machine translation (NMT) heavily relies on word-level modelling to learn semantic representations of input sentences.

Machine Translation NMT +2

Paper
Add Code

Ordinal Constrained Binary Code Learning for Nearest Neighbor Search

no code implementations • 19 Nov 2016 • Hong Liu, Rongrong Ji, Yongjian Wu, Feiyue Huang

Given a large-scale training data set, it is very expensive to embed such ranking tuples in binary code learning.

Retrieval Small Data Image Classification

Paper
Add Code

Video (GIF) Sentiment Analysis using Large-Scale Mid-Level Ontology

no code implementations • 2 Jun 2015 • Zheng Cai, Donglin Cao, Rongrong Ji

However, GIF sentiment analysis is quite challenging, not only because it hinges on spatio-temporal visual content abstraction, but also because the relationship between such abstraction and the final sentiment remains unknown. In this paper, we are dedicated to finding out such a relationship. We propose a SentiPairSequence-based spatiotemporal visual sentiment ontology, which forms the mid-level representations for GIF sentiment.

Sentiment Analysis

Paper
Add Code

PVNet: A Joint Convolutional Network of Point Cloud and Multi-View for 3D Shape Recognition

1 code implementation • 23 Aug 2018 • Haoxuan You, Yifan Feng, Rongrong Ji, Yue Gao

With the recent proliferation of deep learning, various deep models with different representations have achieved the state-of-the-art performance.

3D Object Recognition 3D Shape Classification +3

Paper
Code

Universal Perturbation Attack Against Image Retrieval

no code implementations • ICCV 2019 • Jie Li, Rongrong Ji, Hong Liu, Xiaopeng Hong, Yue Gao, Qi Tian

In this paper, we make the first attempt in attacking image retrieval systems.

Image Classification Image Retrieval +1

Paper
Add Code

PVRNet: Point-View Relation Neural Network for 3D Shape Recognition

no code implementations • 2 Dec 2018 • Haoxuan You, Yifan Feng, Xibin Zhao, Changqing Zou, Rongrong Ji, Yue Gao

More specifically, based on the relation score module, the point-single-view fusion feature is first extracted by fusing the point cloud feature and each single view feature with point-single-view relation, then the point-multi-view fusion feature is extracted by fusing the point cloud feature and the features of different numbers of views with point-multi-view relation.

3D Shape Classification 3D Shape Recognition +3

Paper
Add Code

GVCNN: Group-View Convolutional Neural Networks for 3D Shape Recognition

no code implementations • CVPR 2018 • Yifan Feng, Zizhao Zhang, Xibin Zhao, Rongrong Ji, Yue Gao

The proposed GVCNN framework is composed of a hierarchical view-group-shape architecture, i.e., from the view level, the group level and the shape level, which are organized using a grouping strategy.

3D Shape Classification 3D Shape Recognition +2

Paper
Add Code

Modulated Convolutional Networks

no code implementations • CVPR 2018 • Xiaodi Wang, Baochang Zhang, Ce Li, Rongrong Ji, Jungong Han, Xian-Bin Cao, Jianzhuang Liu

In this paper, we propose new Modulated Convolutional Networks (MCNs) to improve the portability of CNNs via binarized filters.

Paper
Add Code

GroupCap: Group-Based Image Captioning With Structured Relevance and Diversity Constraints

no code implementations • CVPR 2018 • Fuhai Chen, Rongrong Ji, Xiaoshuai Sun, Yongjian Wu, Jinsong Su

In offline optimization, we adopt an end-to-end formulation, which jointly trains the visual tree parser, the structured relevance and diversity constraints, as well as the LSTM based captioning model.

Image Captioning

Paper
Add Code

Generative Adversarial Learning Towards Fast Weakly Supervised Detection

no code implementations • CVPR 2018 • Yunhan Shen, Rongrong Ji, Shengchuan Zhang, WangMeng Zuo, Yan Wang

Without the need of annotating bounding boxes, the existing methods usually follow a two/multi-stage pipeline with an online compulsive stage to extract object proposals, which is an order of magnitude slower than fast fully supervised object detectors such as SSD [31] and YOLO [34].

Object object-detection +1

Paper
Add Code

Label Propagation from ImageNet to 3D Point Clouds

no code implementations • CVPR 2013 • Yan Wang, Rongrong Ji, Shih-Fu Chang

Our approach shows further major gains in accuracy when the training data from the target scenes is used, outperforming state-of-the-art approaches with far better efficiency.

Paper
Add Code

Towards 3D Object Detection With Bimodal Deep Boltzmann Machines Over RGBD Imagery

no code implementations • CVPR 2015 • Wei Liu, Rongrong Ji, Shaozi Li

In particular, we slide a 3D detection window in the 3D point cloud to match the exemplar shape, where the lack of training data in the 3D domain is overcome by (1) collecting 3D CAD models and 2D positive samples from the Internet.

3D Object Detection object-detection

Paper
Add Code

Understanding Image Structure via Hierarchical Shape Parsing

no code implementations • CVPR 2015 • Xian-Ming Liu, Rongrong Ji, Changhu Wang, Wei Liu, Bineng Zhong, Thomas S. Huang

A hierarchical shape parsing strategy is proposed to partition and organize image components into a hierarchical structure in the scale space.

Paper
Add Code

Cross-Modality Binary Code Learning via Fusion Similarity Hashing

no code implementations • CVPR 2017 • Hong Liu, Rongrong Ji, Yongjian Wu, Feiyue Huang, Baochang Zhang

In this paper, we propose a hashing scheme, termed Fusion Similarity Hashing (FSH), which explicitly embeds the graph-based fusion similarity across modalities into a common Hamming space.

Retrieval

Paper
Add Code

Top Rank Supervised Binary Coding for Visual Search

no code implementations • ICCV 2015 • Dongjin Song, Wei Liu, Rongrong Ji, David A. Meyer, John R. Smith

In this paper, we propose a novel supervised binary coding approach, namely Top Rank Supervised Binary Coding (Top-RSBC), which explicitly focuses on optimizing the precision of top positions in a Hamming-distance ranking list towards preserving the supervision information.

Image Retrieval

Paper
Add Code

Aurora Guard: Real-Time Face Anti-Spoofing via Light Reflection

no code implementations • 27 Feb 2019 • Yao Liu, Ying Tai, Jilin Li, Shouhong Ding, Chengjie Wang, Feiyue Huang, Dongyang Li, Wenshuai Qi, Rongrong Ji

In this paper, we propose a light reflection based face anti-spoofing method named Aurora Guard (AG), which is fast, simple, yet effective, and has already been deployed in real-world systems serving millions of users.

Face Anti-Spoofing General Classification

Paper
Add Code

Supervised Online Hashing via Similarity Distribution Learning

no code implementations • 31 May 2019 • Mingbao Lin, Rongrong Ji, Shen Chen, Feng Zheng, Xiaoshuai Sun, Baochang Zhang, Liujuan Cao, Guodong Guo, Feiyue Huang

In this paper, we propose to model the similarity distributions between the input data and the hashing codes, upon which a novel supervised online hashing method, dubbed as Similarity Distribution based Online Hashing (SDOH), is proposed, to keep the intrinsic semantic relationship in the produced Hamming space.

Retrieval

Paper
Add Code

Interpretable Neural Network Decoupling

no code implementations • ECCV 2020 • Yuchao Li, Rongrong Ji, Shaohui Lin, Baochang Zhang, Chenqian Yan, Yongjian Wu, Feiyue Huang, Ling Shao

More specifically, we introduce a novel architecture controlling module in each layer to encode the network architecture by a vector.

Network Interpretation

Paper
Add Code

Semi-Supervised Adversarial Monocular Depth Estimation

no code implementations • 6 Aug 2019 • Rongrong Ji, Ke Li, Yan Wang, Xiaoshuai Sun, Feng Guo, Xiaowei Guo, Yongjian Wu, Feiyue Huang, Jiebo Luo

In this paper, we address the problem of monocular depth estimation when only a limited number of training image-depth pairs are available.

Monocular Depth Estimation

Paper
Add Code

Scene-based Factored Attention for Image Captioning

no code implementations • 7 Aug 2019 • Chen Shen, Rongrong Ji, Fuhai Chen, Xiaoshuai Sun, Xiangming Li

Specifically, the proposed module first embeds the scene concepts into factored weights explicitly and attends the visual information extracted from the input image.

Caption Generation Image Captioning +1

Paper
Add Code

Bayesian Optimized 1-Bit CNNs

no code implementations • ICCV 2019 • Jiaxin Gu, Junhe Zhao, Xiao-Long Jiang, Baochang Zhang, Jianzhuang Liu, Guodong Guo, Rongrong Ji

Deep convolutional neural networks (DCNNs) have dominated the recent developments in computer vision through making various record-breaking models.

Paper
Add Code

Semantic-aware Image Deblurring

no code implementations • 9 Oct 2019 • Fuhai Chen, Rongrong Ji, Chengpeng Dai, Xiaoshuai Sun, Chia-Wen Lin, Jiayi Ji, Baochang Zhang, Feiyue Huang, Liujuan Cao

Specifically, we propose a novel Structured-Spatial Semantic Embedding model for image deblurring (termed S3E-Deblur), which introduces a novel Structured-Spatial Semantic tree model (S3-tree) to bridge two basic tasks in computer vision: image deblurring (ImD) and image captioning (ImC).

Deblurring Image Captioning +1

Paper
Add Code

Hadamard Codebook Based Deep Hashing

no code implementations • 21 Oct 2019 • Shen Chen, Liujuan Cao, Mingbao Lin, Yan Wang, Xiaoshuai Sun, Chenglin Wu, Jingfei Qiu, Rongrong Ji

Specifically, we utilize an off-the-shelf algorithm to generate a binary Hadamard codebook to satisfy the requirement of bit independence and bit balance, which subsequently serves as the desired outputs of the hash functions learning.

Deep Hashing Image Retrieval

Paper
Add Code
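
The bit-independence and bit-balance properties mentioned in the snippet above can be checked on a small Hadamard matrix. The sketch below uses Sylvester's construction, which is an assumption: the snippet does not say which off-the-shelf algorithm the authors use, and the helper name `sylvester_hadamard` is hypothetical.

```python
import numpy as np

def sylvester_hadamard(n):
    """Build an n x n Hadamard matrix with entries +1/-1 (n must be a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = np.array([[1]])
    while h.shape[0] < n:
        # Sylvester's doubling step: [[H, H], [H, -H]].
        h = np.block([[h, h], [h, -h]])
    return h

H = sylvester_hadamard(8)

# Drop the all-ones first row so that every remaining codeword is balanced.
codebook = H[1:]

# Independence: distinct rows are mutually orthogonal, so H @ H.T equals n * I.
gram = H @ H.T

# Balance: each remaining codeword has as many +1 entries as -1 entries.
row_sums = codebook.sum(axis=1)
```

Here `gram` is 8 times the identity matrix and every entry of `row_sums` is zero, which is what makes such rows convenient as target codes.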

Circulant Binary Convolutional Networks: Enhancing the Performance of 1-bit DCNNs with Circulant Back Propagation

no code implementations • CVPR 2019 • Chunlei Liu, Wenrui Ding, Xin Xia, Baochang Zhang, Jiaxin Gu, Jianzhuang Liu, Rongrong Ji, David Doermann

The CiFs can be easily incorporated into existing deep convolutional neural networks (DCNNs), which leads to new Circulant Binary Convolutional Networks (CBCNs).

Paper
Add Code

Beyond Universal Person Re-ID Attack

no code implementations • 30 Oct 2019 • Wenjie Ding, Xing Wei, Rongrong Ji, Xiaopeng Hong, Qi Tian, Yihong Gong

We propose a more universal adversarial perturbation (MUAP) method for both image-agnostic and model-insensitive person Re-ID attack.

General Classification Person Re-Identification

Paper
Add Code

Variational Structured Semantic Inference for Diverse Image Captioning

no code implementations • NeurIPS 2019 • Fuhai Chen, Rongrong Ji, Jiayi Ji, Xiaoshuai Sun, Baochang Zhang, Xuri Ge, Yongjian Wu, Feiyue Huang, Yan Wang

To model these two inherent diversities in image captioning, we propose a Variational Structured Semantic Inferring model (termed VSSI-cap) executed in a novel structured encoder-inferer-decoder schema.

Image Captioning

Paper
Add Code

Binarized Neural Architecture Search

no code implementations • 25 Nov 2019 • Hanlin Chen, Li'an Zhuo, Baochang Zhang, Xiawu Zheng, Jianzhuang Liu, David Doermann, Rongrong Ji

A variant, binarized neural architecture search (BNAS), with a search space of binarized convolutions, can produce extremely compressed models.

Neural Architecture Search

Paper
Add Code

Revisiting Image Aesthetic Assessment via Self-Supervised Feature Learning

no code implementations • 26 Nov 2019 • Kekai Sheng, Wei-Ming Dong, Menglei Chai, Guohui Wang, Peng Zhou, Feiyue Huang, Bao-Gang Hu, Rongrong Ji, Chongyang Ma

In this paper, we revisit the problem of image aesthetic assessment from the self-supervised feature learning perspective.

Paper
Add Code

ASFD: Automatic and Scalable Face Detector

no code implementations • 25 Mar 2020 • Bin Zhang, Jian Li, Yabiao Wang, Ying Tai, Chengjie Wang, Jilin Li, Feiyue Huang, Yili Xia, Wenjiang Pei, Rongrong Ji

In this paper, we propose a novel Automatic and Scalable Face Detector (ASFD), which is based on a combination of neural architecture search techniques as well as a new loss design.

Neural Architecture Search

Paper
Add Code

AD-Cluster: Augmented Discriminative Clustering for Domain Adaptive Person Re-identification

no code implementations • CVPR 2020 • Yunpeng Zhai, Shijian Lu, Qixiang Ye, Xuebo Shan, Jie Chen, Rongrong Ji, Yonghong Tian

Domain adaptive person re-identification (re-ID) is a challenging task, especially when person identities in target domains are unknown.

Ranked #8 on Unsupervised Domain Adaptation on Duke to Market

Clustering Domain Adaptive Person Re-Identification +2

Paper
Add Code

Cogradient Descent for Bilinear Optimization

no code implementations • CVPR 2020 • Li'an Zhuo, Baochang Zhang, Linlin Yang, Hanlin Chen, Qixiang Ye, David Doermann, Guodong Guo, Rongrong Ji

Conventional learning methods simplify the bilinear model by regarding two intrinsically coupled factors independently, which degrades the optimization procedure.

Image Reconstruction Network Pruning

Paper
Add Code

Learning Task-oriented Disentangled Representations for Unsupervised Domain Adaptation

no code implementations • 27 Jul 2020 • Pingyang Dai, Peixian Chen, Qiong Wu, Xiaopeng Hong, Qixiang Ye, Qi Tian, Rongrong Ji

This drawback limits the flexibility of UDA in complicated open-set tasks where no labels are shared between domains.

Retrieval Unsupervised Domain Adaptation

Paper
Add Code

Anti-Bandit Neural Architecture Search for Model Defense

no code implementations • ECCV 2020 • Hanlin Chen, Baochang Zhang, Song Xue, Xuan Gong, Hong Liu, Rongrong Ji, David Doermann

Deep convolutional neural networks (DCNNs) have dominated as the best performers in machine learning, but can be challenged by adversarial attacks.

Denoising Neural Architecture Search

Paper
Add Code

Dual Channel Hypergraph Collaborative Filtering

no code implementations • SIGKDD 2020 • Shuyi Ji, Yifan Feng, Rongrong Ji, Xibin Zhao, Wanwan Tang, Yue Gao

Second, the hypergraph structure is employed for modeling users and items with explicit hybrid high-order correlations.

Collaborative Filtering Recommendation Systems

Paper
Add Code

Enabling Deep Residual Networks for Weakly Supervised Object Detection

no code implementations • ECCV 2020 • Yunhang Shen, Rongrong Ji, Yan Wang, Zhiwei Chen, Feng Zheng, Feiyue Huang, Yunsheng Wu

Weakly supervised object detection (WSOD) has attracted extensive research attention due to its great flexibility of exploiting large-scale image-level annotation for detector training.

Object object-detection +1

Paper
Add Code

SSCGAN: Facial Attribute Editing via Style Skip Connections

no code implementations • ECCV 2020 • Wenqing Chu, Ying Tai, Chengjie Wang, Jilin Li, Feiyue Huang, Rongrong Ji

Each connection extracts the style feature of the latent feature maps in the encoder and then performs a residual learning based mapping function in the global information space guided by the target attributes.

Attribute Generative Adversarial Network

Paper
Add Code

Binarized Neural Architecture Search for Efficient Object Recognition

no code implementations • 8 Sep 2020 • Hanlin Chen, Li'an Zhuo, Baochang Zhang, Xiawu Zheng, Jianzhuang Liu, Rongrong Ji, David Doermann, Guodong Guo

In this paper, binarized neural architecture search (BNAS), with a search space of binarized convolutions, is introduced to produce extremely compressed models to reduce huge computational cost on embedded devices for edge computing.

Edge-computing Face Recognition +3

Paper
Add Code

Fast Class-wise Updating for Online Hashing

no code implementations • 1 Dec 2020 • Mingbao Lin, Rongrong Ji, Xiaoshuai Sun, Baochang Zhang, Feiyue Huang, Yonghong Tian, DaCheng Tao

To achieve fast online adaptivity, a class-wise updating method is developed to decompose the binary code learning and alternately renew the hash functions in a class-wise fashion, which addresses the burden of large amounts of training batches.

Paper
Add Code

Aurora Guard: Reliable Face Anti-Spoofing via Mobile Lighting System

no code implementations • 1 Feb 2021 • Jian Zhang, Ying Tai, Taiping Yao, Jia Meng, Shouhong Ding, Chengjie Wang, Jilin Li, Feiyue Huang, Rongrong Ji

Face authentication on mobile end has been widely applied in various scenarios.

Face Anti-Spoofing

Paper
Add Code

On Evolving Attention Towards Domain Adaptation

no code implementations • 25 Mar 2021 • Kekai Sheng, Ke Li, Xiawu Zheng, Jian Liang, WeiMing Dong, Feiyue Huang, Rongrong Ji, Xing Sun

However, considering that the configuration of attention, i.e., the type and the position of attention module, affects the performance significantly, it is more generalized to optimize the attention configuration automatically to be specialized for arbitrary UDA scenarios.

Ranked #1 on Partial Domain Adaptation on Office-Home

Partial Domain Adaptation Unsupervised Domain Adaptation

Paper
Add Code

Learnable Expansion-and-Compression Network for Few-shot Class-Incremental Learning

no code implementations • 6 Apr 2021 • Boyu Yang, Mingbao Lin, Binghao Liu, Mengying Fu, Chang Liu, Rongrong Ji, Qixiang Ye

By tentatively expanding network nodes, LEC-Net enlarges the representation capacity of features, alleviating feature drift of old network from the perspective of model regularization.

Few-Shot Class-Incremental Learning Incremental Learning

Paper
Add Code

Black-Box Dissector: Towards Erasing-based Hard-Label Model Stealing Attack

no code implementations • NeurIPS 2021 • Yixu Wang, Jie Li, Hong Liu, Yan Wang, Yongjian Wu, Feiyue Huang, Rongrong Ji

We argue this is due to the lack of rich information in the probability prediction and the overfitting caused by hard labels.

Self-Knowledge Distillation

Paper
Add Code

Local Relation Learning for Face Forgery Detection

no code implementations • 6 May 2021 • Shen Chen, Taiping Yao, Yang Chen, Shouhong Ding, Jilin Li, Rongrong Ji

Specifically, we propose a Multi-scale Patch Similarity Module (MPSM), which measures the similarity between features of local regions and forms a robust and generalized similarity pattern.

Relation

Paper
Add Code

Toward Joint Thing-and-Stuff Mining for Weakly Supervised Panoptic Segmentation

no code implementations • CVPR 2021 • Yunhang Shen, Liujuan Cao, Zhiwei Chen, Feihong Lian, Baochang Zhang, Chi Su, Yongjian Wu, Feiyue Huang, Rongrong Ji

To date, learning weakly supervised panoptic segmentation (WSPS) with only image-level labels remains unexplored.

Instance Segmentation Multiple Instance Learning +6

Paper
Add Code

Seminar Learning for Click-Level Weakly Supervised Semantic Segmentation

no code implementations • ICCV 2021 • Hongjun Chen, Jinbao Wang, Hong Cai Chen, XianTong Zhen, Feng Zheng, Rongrong Ji, Ling Shao

Annotation burden has become one of the biggest barriers to semantic segmentation.

Weakly supervised Semantic Segmentation Weakly-Supervised Semantic Segmentation

Paper
Add Code

Occlude Them All: Occlusion-Aware Attention Network for Occluded Person Re-ID

no code implementations • ICCV 2021 • Peixian Chen, Wenfeng Liu, Pingyang Dai, Jianzhuang Liu, Qixiang Ye, Mingliang Xu, Qi'an Chen, Rongrong Ji

To avoid such problematic models in occluded person ReID, we propose the Occlusion-Aware Mask Network (OAMN).

Person Re-Identification

Paper
Add Code
