Search Results for author: Sucheng Ren

Found 25 papers, 15 papers with code

Compress & Align: Curating Image-Text Data with Human Knowledge

no code implementations 11 Dec 2023 Lei Zhang, Fangxun Shu, Sucheng Ren, Bingchen Zhao, Hao Jiang, Cihang Xie

The massive growth of image-text data through web crawling inherently presents the challenge of variability in data quality.

Image Captioning · Text Retrieval

Rejuvenating image-GPT as Strong Visual Representation Learners

1 code implementation 4 Dec 2023 Sucheng Ren, Zeyu Wang, Hongru Zhu, Junfei Xiao, Alan Yuille, Cihang Xie

This paper enhances image-GPT (iGPT), one of the pioneering works that introduce autoregressive pretraining to predict next pixels for visual representation learning.

Representation Learning
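As a rough illustration of the objective this line of work builds on, the sketch below (hypothetical and NumPy-only; actual iGPT-style models train a transformer over quantized pixel tokens) shows how raster-flattening an image yields the (prefix, next-pixel) pairs that autoregressive pretraining learns to predict:

```python
import numpy as np

def next_pixel_pairs(image):
    """Flatten an image in raster order and emit (prefix, next-pixel) pairs,
    the supervision signal behind iGPT-style autoregressive pretraining."""
    seq = image.flatten()
    return [(seq[:t], seq[t]) for t in range(1, len(seq))]

img = np.arange(9).reshape(3, 3)   # toy 3x3 "image" with pixel values 0..8
pairs = next_pixel_pairs(img)      # the model predicts each target from its prefix
```

In the real setting the prefix is fed to a causal transformer and the target pixel (or pixel token) is predicted with a softmax over the value vocabulary.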

NPF-200: A Multi-Modal Eye Fixation Dataset and Method for Non-Photorealistic Videos

1 code implementation 23 Aug 2023 Ziyu Yang, Sucheng Ren, Zongwei Wu, Nanxuan Zhao, Junle Wang, Jing Qin, Shengfeng He

Non-photorealistic videos are in demand with the wave of the metaverse, but lack sufficient research attention.

Saliency Detection

SG-Former: Self-guided Transformer with Evolving Token Reallocation

1 code implementation ICCV 2023 Sucheng Ren, Xingyi Yang, Songhua Liu, Xinchao Wang

At the heart of our approach is a significance map, estimated through hybrid-scale self-attention and evolving during training, which reallocates tokens based on the significance of each region.

DeepMIM: Deep Supervision for Masked Image Modeling

1 code implementation 15 Mar 2023 Sucheng Ren, Fangyun Wei, Samuel Albanie, Zheng Zhang, Han Hu

Deep supervision, which adds extra supervision to the intermediate features of a neural network, was widely used for image classification in the early deep learning era, since it significantly reduces training difficulty and eases optimization, e.g., by mitigating vanishing gradients compared with vanilla training.

Image Classification · object-detection +2
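To make the recipe concrete, here is a minimal, hypothetical sketch of deep supervision: auxiliary linear heads score intermediate features against the same target, and their losses are added, down-weighted, to the final loss. The head shapes and the 0.3 weight are illustrative choices, not DeepMIM's actual design:

```python
import numpy as np

def mse(pred, target):
    return float(np.mean((pred - target) ** 2))

def deeply_supervised_loss(features, target, heads, aux_weight=0.3):
    """Final-layer loss plus down-weighted auxiliary losses computed on
    intermediate features -- the classic deep-supervision recipe."""
    preds = [f @ w for f, w in zip(features, heads)]
    final = mse(preds[-1], target)
    aux = sum(mse(p, target) for p in preds[:-1])
    return final + aux_weight * aux

# toy example: two intermediate feature maps and one final feature map
features = [np.ones((2, 4)), np.full((2, 4), 2.0), np.zeros((2, 4))]
heads = [np.full((4, 1), 0.25)] * 3   # toy linear prediction heads
target = np.zeros((2, 1))
loss = deeply_supervised_loss(features, target, heads)
```

In masked image modeling the "target" would be the reconstruction objective rather than a class label, but the loss composition is the same.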

TinyMIM: An Empirical Study of Distilling MIM Pre-trained Models

2 code implementations CVPR 2023 Sucheng Ren, Fangyun Wei, Zheng Zhang, Han Hu

Our tiny-sized TinyMIM model achieves 79.6% top-1 accuracy on ImageNet-1K image classification, setting a new record for small vision models of the same size and computation budget.

Image Classification · Semantic Segmentation

DynaST: Dynamic Sparse Transformer for Exemplar-Guided Image Generation

1 code implementation 13 Jul 2022 Songhua Liu, Jingwen Ye, Sucheng Ren, Xinchao Wang

Prior approaches, despite the promising results, have relied on either estimating dense attention to compute per-point matching, which is limited to only coarse scales due to the quadratic memory cost, or fixing the number of correspondences to achieve linear complexity, which lacks flexibility.

Face Generation · Style Transfer

A Simple Data Mixing Prior for Improving Self-Supervised Learning

1 code implementation CVPR 2022 Sucheng Ren, Huiyu Wang, Zhengqi Gao, Shengfeng He, Alan Yuille, Yuyin Zhou, Cihang Xie

More notably, our SDMP is the first method that successfully leverages data mixing to improve (rather than hurt) the performance of Vision Transformers in the self-supervised setting.

Representation Learning · Self-Supervised Learning
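For readers unfamiliar with data mixing, the sketch below shows plain mixup, the kind of sample blending such priors build on; it is a generic illustration, not SDMP's actual pipeline:

```python
import numpy as np

def mixup(x1, x2, alpha=1.0, rng=None):
    """Blend two samples with a Beta-sampled coefficient; data-mixing
    priors exploit the known relation between the mixture and its
    sources when constructing self-supervised targets."""
    rng = rng if rng is not None else np.random.default_rng()
    lam = float(rng.beta(alpha, alpha))
    return lam * x1 + (1.0 - lam) * x2, lam

x1, x2 = np.zeros(3), np.ones(3)
mixed, lam = mixup(x1, x2, rng=np.random.default_rng(0))
```

The mixing coefficient `lam` is known exactly, which is what lets a self-supervised method treat the relationship between `mixed`, `x1`, and `x2` as a free training signal.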

The Modality Focusing Hypothesis: Towards Understanding Crossmodal Knowledge Distillation

2 code implementations 13 Jun 2022 Zihui Xue, Zhengqi Gao, Sucheng Ren, Hang Zhao

Crossmodal knowledge distillation (KD) extends traditional knowledge distillation to the area of multimodal learning and demonstrates great success in various applications.

Knowledge Distillation · Transfer Learning

Glance to Count: Learning to Rank with Anchors for Weakly-supervised Crowd Counting

no code implementations 29 May 2022 Zheng Xiong, Liangyu Chai, Wenxi Liu, Yongtuo Liu, Sucheng Ren, Shengfeng He

To enable training under this new setting, we convert the crowd count regression problem to a ranking potential prediction problem.

Crowd Counting · Learning-To-Rank
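A minimal sketch of such a ranking objective (illustrative only, not the paper's exact formulation): since a containing region can hold no fewer people than a crop taken from it, a hinge penalty can enforce that ordering on the predicted ranking potentials:

```python
import numpy as np

def ranking_hinge(score_container, score_crop, margin=0.0):
    """Hinge penalty that is zero when the containing region's predicted
    potential already exceeds its crop's by at least `margin`, and grows
    linearly with the violation otherwise."""
    return float(np.maximum(0.0, score_crop - score_container + margin))
```

Pairs of (region, crop) scores can thus supervise a network with only ordinal, rather than exact-count, labels.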

Training-Free Robust Multimodal Learning via Sample-Wise Jacobian Regularization

no code implementations 5 Apr 2022 Zhengqi Gao, Sucheng Ren, Zihui Xue, Siting Li, Hang Zhao

Multimodal fusion emerges as an appealing technique for improving model performance on many tasks.

Self-supervision through Random Segments with Autoregressive Coding (RandSAC)

no code implementations 22 Mar 2022 Tianyu Hua, Yonglong Tian, Sucheng Ren, Michalis Raptis, Hang Zhao, Leonid Sigal

We illustrate that randomized serialization of the segments significantly improves performance and yields a distribution over spatially long-range (across-segment) and short-range (within-segment) predictions that is effective for feature learning.

Representation Learning · Self-Supervised Learning

Shunted Self-Attention via Multi-Scale Token Aggregation

1 code implementation CVPR 2022 Sucheng Ren, Daquan Zhou, Shengfeng He, Jiashi Feng, Xinchao Wang

This novel merging scheme enables the self-attention to learn relationships between objects with different sizes and simultaneously reduces the token numbers and the computational cost.
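The token-merging idea can be caricatured as pooling the key/value tokens at a per-head rate before attention; the sketch below is a simplified, hypothetical illustration, not the paper's exact aggregation scheme:

```python
import numpy as np

def pool_tokens(tokens, rate):
    """Average every `rate` consecutive tokens, shrinking the key/value
    sequence length that self-attention must cover. A small rate keeps
    fine detail; a large rate captures coarse, large-object context."""
    n, d = tokens.shape
    keep = n // rate
    return tokens[: keep * rate].reshape(keep, rate, d).mean(axis=1)

tokens = np.arange(8.0).reshape(4, 2)   # 4 tokens, dim 2
coarse = pool_tokens(tokens, rate=2)    # 2 merged tokens
```

Running different heads at different pooling rates is what lets one attention layer see objects at multiple scales while cutting token count and compute.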

Reducing Spatial Labeling Redundancy for Semi-supervised Crowd Counting

no code implementations 6 Aug 2021 Yongtuo Liu, Sucheng Ren, Liangyu Chai, Hanjie Wu, Jing Qin, Dan Xu, Shengfeng He

In this way, we can transfer the original spatial labeling redundancy caused by individual similarities to effective supervision signals on the unlabeled regions.

Crowd Counting

Fine-grained Domain Adaptive Crowd Counting via Point-derived Segmentation

no code implementations 6 Aug 2021 Yongtuo Liu, Dan Xu, Sucheng Ren, Hanjie Wu, Hongmin Cai, Shengfeng He

To this end, we propose to untangle domain-invariant crowd and domain-specific background from crowd images, and design a fine-grained domain adaptation method for crowd counting.

Crowd Counting · Domain Adaptation +1

Unifying Global-Local Representations in Salient Object Detection with Transformer

1 code implementation 5 Aug 2021 Sucheng Ren, Qiang Wen, Nanxuan Zhao, Guoqiang Han, Shengfeng He

In this paper, we introduce a new attention-based encoder, vision transformer, into salient object detection to ensure the globalization of the representations from shallow to deep layers.

object-detection · Object Detection +1

Co-advise: Cross Inductive Bias Distillation

no code implementations CVPR 2022 Sucheng Ren, Zhengqi Gao, Tianyu Hua, Zihui Xue, Yonglong Tian, Shengfeng He, Hang Zhao

Transformers have recently been adapted from the natural language processing community as a promising substitute for convolution-based neural networks in visual learning tasks.

Inductive Bias

Reciprocal Transformations for Unsupervised Video Object Segmentation

1 code implementation CVPR 2021 Sucheng Ren, Wenxi Liu, Yongtuo Liu, Haoxin Chen, Guoqiang Han, Shengfeng He

Additionally, to exclude information about moving background objects from the motion features, our transformation module reciprocally transforms the appearance features to enhance the motion features, focusing on moving objects with salient appearance while removing co-moving outliers.

Object · Optical Flow Estimation +3

Learning From the Master: Distilling Cross-Modal Advanced Knowledge for Lip Reading

no code implementations CVPR 2021 Sucheng Ren, Yong Du, Jianming Lv, Guoqiang Han, Shengfeng He

To these ends, we introduce a trainable "master" network which ingests both audio signals and silent lip videos instead of a pretrained teacher.

Lip Reading · Sentence +2

On Feature Decorrelation in Self-Supervised Learning

1 code implementation ICCV 2021 Tianyu Hua, Wenxiao Wang, Zihui Xue, Sucheng Ren, Yue Wang, Hang Zhao

In self-supervised representation learning, a common idea behind most of the state-of-the-art approaches is to enforce the robustness of the representations to predefined augmentations.

Representation Learning · Self-Supervised Learning

Multimodal Knowledge Expansion

1 code implementation ICCV 2021 Zihui Xue, Sucheng Ren, Zhengqi Gao, Hang Zhao

The popularity of multimodal sensors and the accessibility of the Internet have brought us a massive amount of unlabeled multimodal data.

Denoising · Knowledge Distillation +1

TENet: Triple Excitation Network for Video Salient Object Detection

no code implementations ECCV 2020 Sucheng Ren, Chu Han, Xin Yang, Guoqiang Han, Shengfeng He

In this paper, we propose a simple yet effective approach, named Triple Excitation Network, to reinforce the training of video salient object detection (VSOD) from three aspects, spatial, temporal, and online excitations.

Object · object-detection +2
