Search Results for author: Ruihuang Li

Found 20 papers, 16 papers with code

FreCaS: Efficient Higher-Resolution Image Generation via Frequency-aware Cascaded Sampling

1 code implementation 24 Oct 2024 Zhengqiang Zhang, Ruihuang Li, Lei Zhang

While image generation with diffusion models has achieved great success, generating images of higher resolution than the training size remains a challenging task due to the high computational cost.

Image Generation

Dense Multimodal Alignment for Open-Vocabulary 3D Scene Understanding

no code implementations 13 Jul 2024 Ruihuang Li, Zhengqiang Zhang, Chenhang He, Zhiyuan Ma, Vishal M. Patel, Lei Zhang

Recent vision-language pre-training models have exhibited remarkable generalization ability in zero-shot recognition tasks.

Scene Understanding Zero-Shot Learning

Source Prompt Disentangled Inversion for Boosting Image Editability with Diffusion Models

1 code implementation 17 Mar 2024 Ruibin Li, Ruihuang Li, Song Guo, Lei Zhang

Text-driven diffusion models have significantly advanced image editing performance by using text prompts as inputs.

Image Generation

ScatterFormer: Efficient Voxel Transformer with Scattered Linear Attention

1 code implementation 1 Jan 2024 Chenhang He, Ruihuang Li, Guowen Zhang, Lei Zhang

In this paper, we introduce ScatterFormer, which, to the best of our knowledge, is the first to directly apply attention to voxels across different windows as a single sequence.

Blocking

TMP: Temporal Motion Propagation for Online Video Super-Resolution

1 code implementation 15 Dec 2023 Zhengqiang Zhang, Ruihuang Li, Shi Guo, Yang Cao, Lei Zhang

Online video super-resolution (online-VSR) relies heavily on an effective alignment module to aggregate temporal information, while the strict latency requirement makes accurate and efficient alignment very challenging.

Video Super-Resolution

One-to-Few Label Assignment for End-to-End Dense Detection

1 code implementation CVPR 2023 Shuai Li, Minghan Li, Ruihuang Li, Chenhang He, Lei Zhang

The positive and negative weights of these soft anchors are dynamically adjusted during training so that they can contribute more to "representation learning" in the early training stage, and contribute more to "duplicated prediction removal" in the later stage.

Decoder Representation Learning

MSF: Motion-guided Sequential Fusion for Efficient 3D Object Detection from Point Cloud Sequences

1 code implementation CVPR 2023 Chenhang He, Ruihuang Li, Yabin Zhang, Shuai Li, Lei Zhang

Current top-performing multi-frame detectors mostly follow a Detect-and-Fuse framework, which extracts features from each frame of the sequence and fuses them to detect the objects in the current frame.

3D Object Detection Autonomous Driving +1

DynaMask: Dynamic Mask Selection for Instance Segmentation

1 code implementation CVPR 2023 Ruihuang Li, Chenhang He, Shuai Li, Yabin Zhang, Lei Zhang

Representative instance segmentation methods mostly segment different object instances with a mask of fixed resolution, e.g., a 28×28 grid.

Instance Segmentation Segmentation +1

Adversarial Style Augmentation for Domain Generalization

no code implementations 30 Jan 2023 Yabin Zhang, Bin Deng, Ruihuang Li, Kui Jia, Lei Zhang

By updating the model against the adversarial statistics perturbation during training, we allow the model to explore the worst-case domain and hence improve its generalization performance.

domain classification Domain Generalization +1

Point-DAE: Denoising Autoencoders for Self-supervised Point Cloud Learning

1 code implementation 13 Nov 2022 Yabin Zhang, Jiehong Lin, Ruihuang Li, Kui Jia, Lei Zhang

We also validate the effectiveness of affine transformation corruption with the Transformer backbones, where we decompose the reconstruction of the complete point cloud into the reconstructions of detailed local patches and rough global shape, alleviating the position leakage problem in the reconstruction.

3D Object Detection Decoder +3

Look Back and Forth: Video Super-Resolution with Explicit Temporal Difference Modeling

1 code implementation CVPR 2022 Takashi Isobe, Xu Jia, Xin Tao, Changlin Li, Ruihuang Li, Yongjie Shi, Jing Mu, Huchuan Lu, Yu-Wing Tai

Instead of directly feeding consecutive frames into a VSR model, we propose to compute the temporal difference between frames and divide those pixels into two subsets according to the level of difference.

Motion Compensation Optical Flow Estimation +1
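
The temporal-difference split mentioned in the abstract above (compute the difference between frames, then divide pixels into two subsets by the level of difference) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name, the grayscale nested-list frame layout, and the `threshold` parameter are all assumptions made for illustration.

```python
def split_by_temporal_difference(prev_frame, next_frame, threshold):
    """Split pixel positions into low- and high-difference subsets.

    Frames are 2D lists of grayscale intensities with the same shape.
    `threshold` is a hypothetical parameter chosen for illustration only.
    """
    low, high = [], []
    for y, (row_prev, row_next) in enumerate(zip(prev_frame, next_frame)):
        for x, (a, b) in enumerate(zip(row_prev, row_next)):
            # Absolute temporal difference at this pixel position.
            diff = abs(b - a)
            (high if diff > threshold else low).append((y, x))
    return low, high


prev_frame = [[10, 10], [10, 10]]
next_frame = [[10, 12], [50, 10]]
low, high = split_by_temporal_difference(prev_frame, next_frame, threshold=5)
print(low)   # [(0, 0), (0, 1), (1, 1)]
print(high)  # [(1, 0)]
```

Only the pixel whose intensity changed by more than the assumed threshold lands in the high-difference subset; how the two subsets are processed afterwards is not described in the snippet above and is left out here.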

Voxel Set Transformer: A Set-to-Set Approach to 3D Object Detection from Point Clouds

1 code implementation CVPR 2022 Chenhang He, Ruihuang Li, Shuai Li, Lei Zhang

VoxSeT is built upon a voxel-based set attention (VSA) module, which reduces the self-attention in each voxel by two cross-attentions and models features in a hidden space induced by a group of latent codes.

3D Object Detection object-detection

A Dual Weighting Label Assignment Scheme for Object Detection

1 code implementation CVPR 2022 Shuai Li, Chenhang He, Ruihuang Li, Lei Zhang

Existing LA methods mostly focus on the design of the pos weighting function, while the neg weight is directly derived from the pos weight.

Object object-detection +2

Exact Feature Distribution Matching for Arbitrary Style Transfer and Domain Generalization

2 code implementations CVPR 2022 Yabin Zhang, Minghan Li, Ruihuang Li, Kui Jia, Lei Zhang

In this work, we, for the first time to our best knowledge, propose to perform Exact Feature Distribution Matching (EFDM) by exactly matching the empirical Cumulative Distribution Functions (eCDFs) of image features, which could be implemented by applying the Exact Histogram Matching (EHM) in the image feature space.

Domain Generalization Style Transfer
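
The idea of matching empirical CDFs through exact histogram matching, as described in the abstract above, can be sketched for the simplest case. This is a rough illustration rather than the authors' code: it assumes two equal-length flat lists of feature values (for example, one flattened channel each), and it breaks ties by index order, which is an arbitrary choice of this sketch. The function name is hypothetical.

```python
def exact_histogram_match(content, style):
    """Rearrange the values of `style` to follow the rank order of `content`.

    Both inputs are equal-length flat lists of feature values
    (an assumption of this sketch).
    """
    if len(content) != len(style):
        raise ValueError("content and style must have the same length")
    # Indices of the content elements, from smallest to largest value.
    order = sorted(range(len(content)), key=lambda i: content[i])
    sorted_style = sorted(style)
    matched = [0.0] * len(content)
    # The k-th smallest content element receives the k-th smallest style value.
    for rank, idx in enumerate(order):
        matched[idx] = sorted_style[rank]
    return matched


content = [0.2, 0.9, 0.5, 0.1]
style = [10.0, 30.0, 20.0, 40.0]
matched = exact_histogram_match(content, style)
print(matched)  # [20.0, 40.0, 30.0, 10.0]
```

The output contains exactly the style values, so its empirical CDF equals that of the style features, while the relative ordering of the content features is preserved.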

T-SVDNet: Exploring High-Order Prototypical Correlations for Multi-Source Domain Adaptation

1 code implementation ICCV 2021 Ruihuang Li, Xu Jia, Jianzhong He, Shuaijun Chen, Qinghua Hu

Most existing domain adaptation methods focus on adaptation from only one source domain; however, in practice there are a number of relevant sources that could be leveraged to help improve performance on the target domain.

Unsupervised Domain Adaptation

Reciprocal Multi-Layer Subspace Learning for Multi-View Clustering

no code implementations ICCV 2019 Ruihuang Li, Changqing Zhang, Huazhu Fu, Xi Peng, Tianyi Zhou, Qinghua Hu

Multi-view clustering is a long-standing and important research topic; however, it remains challenging when handling high-dimensional data and simultaneously exploring the consistency and complementarity of different views.

Clustering
