Search Results for author: Luchuan Song

Found 11 papers, 6 papers with code

Adaptive Super Resolution For One-Shot Talking-Head Generation

1 code implementation • 23 Mar 2024 • Luchuan Song, Pinxin Liu, Guojun Yin, Chenliang Xu

In this work, we propose an adaptive high-quality talking-head video generation method, which synthesizes high-resolution video without additional pre-trained modules.

Super-Resolution, Talking Head Generation +1

124

Emo-Avatar: Efficient Monocular Video Style Avatar through Texture Rendering

no code implementations • 1 Feb 2024 • Pinxin Liu, Luchuan Song, Daoan Zhang, Hang Hua, Yunlong Tang, Huaijin Tu, Jiebo Luo, Chenliang Xu

To address the above problems, we propose the Efficient Monocular Video Style Avatar (Emo-Avatar) through deferred neural rendering that enhances StyleGAN's capacity for producing dynamic, drivable portrait videos.

Contrastive Learning, Neural Rendering

Tri$^{2}$-plane: Volumetric Avatar Reconstruction with Feature Pyramid

1 code implementation • 17 Jan 2024 • Luchuan Song, Pinxin Liu, Lele Chen, Celong Liu, Chenliang Xu

Recent years have witnessed considerable achievements in facial avatar reconstruction with neural volume rendering.

Video Understanding with Large Language Models: A Survey

1 code implementation • 29 Dec 2023 • Yunlong Tang, Jing Bi, Siting Xu, Luchuan Song, Susan Liang, Teng Wang, Daoan Zhang, Jie An, Jingyang Lin, Rongyi Zhu, Ali Vosoughi, Chao Huang, Zeliang Zhang, Feng Zheng, JianGuo Zhang, Ping Luo, Jiebo Luo, Chenliang Xu

With the burgeoning growth of online video platforms and the escalating volume of video content, the demand for proficient video understanding tools has intensified markedly.

Video Understanding

650

IDRNet: Intervention-Driven Relation Network for Semantic Segmentation

1 code implementation • NeurIPS 2023 • Zhenchao Jin, Xiaowei Hu, Lingting Zhu, Luchuan Song, Li Yuan, Lequan Yu

Next, a deletion diagnostics procedure is conducted to model relations of these semantic-level representations via perceiving the network outputs, and the extracted relations are utilized to guide the semantic-level representations to interact with each other.

Relation, Relation Network +1

730

Emotional Listener Portrait: Neural Listener Head Generation with Emotion

no code implementations • ICCV 2023 • Luchuan Song, Guojun Yin, Zhenchao Jin, Xiaoyi Dong, Chenliang Xu

Listener head generation centers on generating non-verbal behaviors (e.g., smile) of a listener in reference to the information delivered by a speaker.

Optimal Boxes: Boosting End-to-End Scene Text Recognition by Adjusting Annotated Bounding Boxes via Reinforcement Learning

no code implementations • 25 Jul 2022 • Jingqun Tang, Wenming Qian, Luchuan Song, Xiena Dong, Lan Li, Xiang Bai

Text detection and recognition are essential components of a modern OCR system.

Domain Adaptation, Optical Character Recognition (OCR) +2

You Should Look at All Objects

1 code implementation • 16 Jul 2022 • Zhenchao Jin, Dongdong Yu, Luchuan Song, Zehuan Yuan, Lequan Yu

Feature pyramid network (FPN) is one of the key components for object detectors.

Cascaded Residual Density Network for Crowd Counting

no code implementations • 29 Jul 2021 • Kun Zhao, Luchuan Song, Bin Liu, Qi Chu, Nenghai Yu

Crowd counting is a challenging task due to issues such as scale variation and perspective variation in real crowd scenes.

Crowd Counting

Abnormal Behavior Detection Based on Target Analysis

no code implementations • 29 Jul 2021 • Luchuan Song, Bin Liu, Huihui Zhu, Qi Chu, Nenghai Yu

To this end, we propose a multivariate fusion method that analyzes each target through three branches: object, action and motion.

Object

ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis

2 code implementations • CVPR 2021 • Yinan He, Bei Gan, Siyu Chen, Yichun Zhou, Guojun Yin, Luchuan Song, Lu Sheng, Jing Shao, Ziwei Liu

To counter this emerging threat, we construct the ForgeryNet dataset, an extremely large face forgery dataset with unified annotations in image- and video-level data across four tasks: 1) Image Forgery Classification, including two-way (real / fake), three-way (real / fake with identity-replaced forgery approaches / fake with identity-remained forgery approaches), and n-way (real and 15 respective forgery approaches) classification.

Benchmarking, Classification +2
