1 code implementation • 30 Nov 2023 • Zipeng Qi, Guoxi Huang, Zebin Huang, Qin Guo, Jinwen Chen, Junyu Han, Jian Wang, Gang Zhang, Lufei Liu, Errui Ding, Jingdong Wang
The LRDiff framework constructs an image-rendering process with multiple layers, each of which applies vision guidance to steer the estimated denoising direction for a single object.
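The snippet above describes combining per-layer, per-object denoising directions. The actual LRDiff mechanism is not detailed here; the sketch below only illustrates the layered idea under the assumption that each object contributes a direction estimate weighted by a soft spatial mask (the function name and mask convention are hypothetical).

```python
import numpy as np

def layered_denoising_direction(directions, masks):
    """Combine per-object denoising directions via soft layer masks.

    directions: list of (H, W, C) arrays, one estimated direction per object
    masks: list of (H, W) soft masks (assumed to sum to <= 1 per pixel)
    """
    combined = np.zeros_like(directions[0])
    for direction, mask in zip(directions, masks):
        # Each layer contributes only inside its object's mask.
        combined += mask[..., None] * direction
    return combined

# Two objects splitting the canvas evenly (toy values, not real model output).
directions = [np.ones((4, 4, 3)), 2.0 * np.ones((4, 4, 3))]
masks = [np.full((4, 4), 0.5), np.full((4, 4), 0.5)]
combined = layered_denoising_direction(directions, masks)
```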
1 code implementation • NeurIPS 2023 • Guoxi Huang, Hongtao Fu, Adrian G. Bors
With the same level of computational complexity as ViT-Base and ViT-Large, we instantiate 4.5$\times$ and 2$\times$ deeper ViTs, dubbed ViT-S-54 and ViT-B-48.
no code implementations • 23 Nov 2022 • Guoxi Huang, Adrian G. Bors
The static appearance of a video may impede the ability of a deep neural network to learn motion-relevant features for video action recognition.
no code implementations • TIP 2022 • Guoxi Huang, Adrian G. Bors
Through experiments we show that the proposed MBPM can be used as a plug-in module in various CNN backbone architectures, significantly boosting their performance.
2 code implementations • 29 Mar 2021 • Guoxi Huang, Adrian G. Bors
We design a trainable Motion Band-Pass Module (MBPM) for separating busy information from quiet information in raw video data.
Ranked #15 on Action Recognition on UCF101
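The MBPM is described as a trainable band-pass filter that separates "busy" (motion-related) from "quiet" (static) video information. Its learned form is not given here; the sketch below uses a fixed difference-of-Gaussians temporal kernel as a stand-in for the band-pass behavior (the kernel choice and function names are assumptions, not the paper's implementation).

```python
import numpy as np

def dog_kernel(size=5, sigma1=0.6, sigma2=1.2):
    # Difference of Gaussians: a classic band-pass filter. In the MBPM the
    # kernel would be learned; here it is fixed for illustration.
    t = np.arange(size) - size // 2
    g1 = np.exp(-t**2 / (2 * sigma1**2)); g1 /= g1.sum()
    g2 = np.exp(-t**2 / (2 * sigma2**2)); g2 /= g2.sum()
    return g1 - g2  # sums to zero, so constant (static) signals are suppressed

def band_pass(video, kernel):
    # video: (T, H, W). Convolve along the time axis at every pixel,
    # splitting the clip into a "busy" part and its "quiet" remainder.
    T = video.shape[0]
    pad = len(kernel) // 2
    padded = np.pad(video, ((pad, pad), (0, 0), (0, 0)), mode="edge")
    busy = np.zeros(video.shape, dtype=float)
    for i, k in enumerate(kernel):
        busy += k * padded[i:i + T]
    quiet = video - busy
    return busy, quiet

# A perfectly static clip produces a (near-)zero busy response.
static = np.ones((8, 4, 4))
busy, quiet = band_pass(static, dog_kernel())
```

Because the kernel sums to zero, static content falls entirely into the quiet stream, which matches the stated goal of isolating motion-relevant information.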
1 code implementation • 17 Jul 2020 • Guoxi Huang, Adrian G. Bors
Convolutional Neural Networks (CNNs) model long-range dependencies by deeply stacking convolution operations with small window sizes, which makes optimization difficult.
Ranked #32 on Action Recognition on Something-Something V1
no code implementations • 11 Feb 2020 • Guoxi Huang, Adrian G. Bors
In this paper, we propose a new video representation learning method, named Temporal Squeeze (TS) pooling, which extracts the essential movement information from a long sequence of video frames and maps it into a small set of images, named Squeezed Images.
Ranked #43 on Action Recognition on UCF101 (using extra training data)
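Temporal Squeeze pooling maps a long frame sequence to a few Squeezed Images. The paper's pooling operator is not specified here; the sketch below assumes a learned per-output weighting over time, applied as a convex combination of frames (the softmax parameterization is an illustrative assumption).

```python
import numpy as np

def temporal_squeeze(frames, weights):
    """Pool T frames into S squeezed images.

    frames: (T, H, W, C) video clip
    weights: (S, T) scores, one row per squeezed image; in the real method
             these would be learned, here they are arbitrary.
    """
    # Softmax over time so each squeezed image is a convex combination
    # of the input frames.
    w = np.exp(weights - weights.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)
    return np.einsum("st,thwc->shwc", w, frames)

rng = np.random.default_rng(0)
frames = rng.random((64, 8, 8, 3))   # a long clip of 64 frames
weights = rng.random((2, 64))        # toy scores standing in for learned ones
squeezed = temporal_squeeze(frames, weights)  # two Squeezed Images
```

A downstream 2-D network could then consume the few squeezed images instead of all 64 frames, which is the compression the abstract describes.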