Search Results for author: Qing Song

Found 22 papers, 14 papers with code

E4C: Enhance Editability for Text-Based Image Editing by Harnessing Efficient CLIP Guidance

no code implementations • 15 Mar 2024 • Tianrui Huang, Pu Cao, Lu Yang, Chun Liu, Mengjie Hu, Zhiwei Liu, Qing Song

Diffusion-based image editing is a composite process of preserving the source image content and generating new content or applying modifications.

Text-based Image Editing

Controllable Generation with Text-to-Image Diffusion Models: A Survey

1 code implementation • 7 Mar 2024 • Pu Cao, Feng Zhou, Qing Song, Lu Yang

In the rapidly advancing realm of visual generation, diffusion models have revolutionized the landscape, marking a significant shift in capabilities with their impressive text-guided generative functions.

Denoising

Concept-centric Personalization with Large-scale Diffusion Priors

1 code implementation • 13 Dec 2023 • Pu Cao, Lu Yang, Feng Zhou, Tianrui Huang, Qing Song

In this work, we present the task of customizing large-scale diffusion priors for specific concepts as concept-centric personalization.

Diffusion Personalization

Large-Scale Person Detection and Localization using Overhead Fisheye Cameras

no code implementations • ICCV 2023 • Lu Yang, Liulei Li, Xueshi Xin, Yifan Sun, Qing Song, Wenguan Wang

Unlike existing efforts devoted to localizing tourist photos captured by perspective cameras, this article focuses on devising person positioning solutions using overhead fisheye cameras.

Human Detection

CoT-MISR: Marrying Convolution and Transformer for Multi-Image Super-Resolution

no code implementations • 12 Mar 2023 • Mingming Xiu, Yang Nie, Qing Song, Chun Liu

How to transform a low-resolution image to restore its high-resolution image information is a problem that researchers have been exploring.

Image Restoration, Image Super-Resolution

Faster Learning of Temporal Action Proposal via Sparse Multilevel Boundary Generator

1 code implementation • 6 Mar 2023 • Qing Song, Yang Zhou, Mengjie Hu, Chun Liu

Temporal action localization in videos presents significant challenges in the field of computer vision.

Temporal Action Localization

What Decreases Editing Capability? Domain-Specific Hybrid Refinement for Improved GAN Inversion

2 code implementations • 28 Jan 2023 • Pu Cao, Lu Yang, Dongxv Liu, Xiaoya Yang, Tianrui Huang, Qing Song

To tackle this problem, we introduce Domain-Specific Hybrid Refinement (DHR), which draws on the advantages and disadvantages of two mainstream refinement techniques to maintain editing ability with fidelity improvement.

Deep Learning Technique for Human Parsing: A Survey and Outlook

2 code implementations • 1 Jan 2023 • Lu Yang, Wenhe Jia, Shan Li, Qing Song

Human parsing aims to partition humans in images or videos into multiple pixel-level semantic parts.

Human Parsing

UV R-CNN: Stable and Efficient Dense Human Pose Estimation

no code implementations • 4 Nov 2022 • Wenhe Jia, Yilin Zhou, Xuhan Zhu, Mengjie Hu, Chun Liu, Qing Song

Dense pose estimation is a dense 3D prediction task for instance-level human analysis, aiming to map human pixels from an RGB image to a 3D surface of the human body.

Pose Estimation, Regression

TIVE: A Toolbox for Identifying Video Instance Segmentation Errors

1 code implementation • 17 Oct 2022 • Wenhe Jia, Lu Yang, Zilong Jia, Wenyi Zhao, Yilin Zhou, Qing Song

More importantly, as the fundamental model abilities demanded by the task, spatial segmentation and temporal association are still understudied in both evaluation and interaction mechanisms.

Instance Segmentation, Segmentation +2

LSAP: Rethinking Inversion Fidelity, Perception and Editability in GAN Latent Space

1 code implementation • 26 Sep 2022 • Pu Cao, Lu Yang, Dongxu Liu, Zhiwei Liu, Shan Li, Qing Song

In this work, we first point out that these two characteristics are related to the degree of alignment (or disalignment) of the inverse codes with the synthetic distribution.

SGM-Net: Semantic Guided Matting Net

no code implementations • 16 Aug 2022 • Qing Song, Wenfeng Sun, Donghan Yang, Mengjie Hu, Chun Liu

When the green screen is not available, the existing human matting methods need the help of additional inputs (such as trimap, background image, etc.).

Image Generation, Image Matting

A Survey on Long-Tailed Visual Recognition

no code implementations • 27 May 2022 • Lu Yang, He Jiang, Qing Song, Jun Guo

Data quality directly dominates the effect of deep learning models, and the long-tailed distribution is one of the factors affecting data quality.

Representation Learning

Quality-Aware Network for Face Parsing

1 code implementation • 14 Jun 2021 • Lu Yang, Qing Song, Xueshi Xin, Wenhe Jia, Zhiwei Liu

This is a very short technical report, which introduces the solution of the Team BUPT-CASIA for Short-video Face Parsing Track of The 3rd Person in Context (PIC) Workshop and Challenge at CVPR 2021.

Face Parsing, Human Parsing

CAT: Cross Attention in Vision Transformer

1 code implementation • 10 Jun 2021 • Hezheng Lin, Xing Cheng, Xiangyu Wu, Fan Yang, Dong Shen, Zhongyuan Wang, Qing Song, Wei Yuan

In this paper, we propose a new attention mechanism in Transformer termed Cross Attention, which alternates between attention within each image patch, rather than over the whole image, to capture local information, and attention between image patches, which are divided from single-channel feature maps, to capture global information.

Quality-Aware Network for Human Parsing

1 code implementation • 10 Mar 2021 • Lu Yang, Qing Song, Zhihui Wang, Zhiwei Liu, Songcen Xu, Zhihao Li

How to estimate the quality of the network output is an important issue, and currently there is no effective solution in the field of human parsing.

Human Parsing, Instance Segmentation +1

Renovating Parsing R-CNN for Accurate Multiple Human Parsing

1 code implementation • ECCV 2020 • Lu Yang, Qing Song, Zhihui Wang, Mengjie Hu, Chun Liu, Xueshi Xin, Wenhe Jia, Songcen Xu

Multiple human parsing aims to segment various human parts and associate each part with the corresponding instance simultaneously.

Human Parsing

CPM R-CNN: Calibrating Point-guided Misalignment in Object Detection

1 code implementation • 7 Mar 2020 • Bin Zhu, Qing Song, Lu Yang, Zhihui Wang, Chun Liu, Mengjie Hu

In object detection, offset-guided and point-guided regression dominate anchor-based and anchor-free methods, respectively.

Object Detection

High-speed Railway Fastener Detection and Localization Method based on convolutional neural network

no code implementations • 2 Jul 2019 • Qing Song, Yao Guo, Jianan Jiang, Chun Liu, Mengjie Hu

Railway transportation is the artery of China's national economy and plays an important role in the development of today's society.

Detector-in-Detector: Multi-Level Analysis for Human-Parts

2 code implementations • 19 Feb 2019 • Xiaojie Li, Lu Yang, Qing Song, Fuqiang Zhou

In particular, we adopt a region-based object detection structure with two carefully designed detectors to separately pay attention to the human body and body parts in a coarse-to-fine manner, which we call Detector-in-Detector network (DID-Net).

Body Detection, Face Detection +2