Search Results for author: Teng Hu

Found 18 papers, 9 papers with code

A$^\text{T}$A: Adaptive Transformation Agent for Text-Guided Subject-Position Variable Background Inpainting

no code implementations • 2 Apr 2025 • Yizhe Tang, Zhimin Sun, Yuzhen Du, Ran Yi, Guangben Lu, Teng Hu, Luying Li, Lizhuang Ma, Fangyuan Zou

Existing background inpainting methods typically strictly preserve the subject's original position from the source image, resulting in inconsistencies between the subject and the generated background.

Image Inpainting, Position

Image Inversion: A Survey from GANs to Diffusion and Beyond

1 code implementation • 17 Feb 2025 • Yinan Chen, Jiangning Zhang, Yali Bi, Xiaobin Hu, Teng Hu, Zhucun Xue, Ran Yi, Yong Liu, Ying Tai

This paper provides a comprehensive review of the latest advancements in image inversion techniques, focusing on two main paradigms: Generative Adversarial Network (GAN) inversion and diffusion model inversion.

Generative Adversarial Network, Style Transfer +1

Improving Autoregressive Visual Generation with Cluster-Oriented Token Prediction

no code implementations • 1 Jan 2025 • Teng Hu, Jiangning Zhang, Ran Yi, Jieyu Weng, Yabiao Wang, Xianfang Zeng, Zhucun Xue, Lizhuang Ma

Leveraging the rearranged codebook, we propose a Cluster-oriented Cross-entropy Loss that guides the model to correctly predict the cluster where the token is located.

Prediction

EMOv2: Pushing 5M Vision Model Frontier

1 code implementation • 9 Dec 2024 • Jiangning Zhang, Teng Hu, Haoyang He, Zhucun Xue, Yabiao Wang, Chengjie Wang, Yong Liu, Xiangtai Li, Dacheng Tao

Our goal is to set up the new frontier of the 5M magnitude lightweight model on various downstream tasks.

Image Generation, model +2

Exploring Real&Synthetic Dataset and Linear Attention in Image Restoration

no code implementations • 5 Dec 2024 • Yuzhen Du, Teng Hu, Jiangning Zhang, Ran Yi, Chengming Xu, Xiaobin Hu, Kai Wu, Donghao Luo, Yabiao Wang, Lizhuang Ma

Instead of directly using Vision-RWKV, we replace the original Q-Shift in RWKV with a Depth-wise Convolution shift to better model local dependencies, combined with Bi-directional attention for comprehensive linear attention.

Image Restoration

AttentionPainter: An Efficient and Adaptive Stroke Predictor for Scene Painting

no code implementations • 21 Oct 2024 • Yizhe Tang, Yue Wang, Teng Hu, Ran Yi, Xin Tan, Lizhuang Ma, Yu-Kun Lai, Paul L. Rosin

Stroke-based Rendering (SBR) aims to decompose an input image into a sequence of parameterized strokes, which can be rendered into a painting that resembles the input image.

Reinforcement Learning

SaRA: High-Efficient Diffusion Model Fine-tuning with Progressive Sparse Low-Rank Adaptation

1 code implementation • 10 Sep 2024 • Teng Hu, Jiangning Zhang, Ran Yi, Hongrui Huang, Yabiao Wang, Lizhuang Ma

Inspired by model pruning which lightens large pre-trained models by removing unimportant parameters, we propose a novel model fine-tuning method to make full use of these ineffective parameters and enable the pre-trained model with new task-specified capabilities.

Video Generation

DualAnoDiff: Dual-Interrelated Diffusion Model for Few-Shot Anomaly Image Generation

1 code implementation • 24 Aug 2024 • Ying Jin, Jinlong Peng, Qingdong He, Teng Hu, Hao Chen, Jiafu Wu, Wenbing Zhu, Mingmin Chi, Jun Liu, Yabiao Wang, Chengjie Wang

In this paper, we overcome these challenges from a new perspective, simultaneously generating a pair of the overall image and the corresponding anomaly part.

Anomaly Classification, Anomaly Detection +3

SuperSVG: Superpixel-based Scalable Vector Graphics Synthesis

1 code implementation • CVPR 2024 • Teng Hu, Ran Yi, Baihong Qian, Jiangning Zhang, Paul L. Rosin, Yu-Kun Lai

Then, we propose a two-stage self-training framework, where a coarse-stage model is employed to reconstruct the main structure and a refinement-stage model is used for enriching the details.

Superpixels, Vector Graphics

MotionMaster: Training-free Camera Motion Transfer For Video Generation

no code implementations • 24 Apr 2024 • Teng Hu, Jiangning Zhang, Ran Yi, Yating Wang, Hongrui Huang, Jieyu Weng, Yabiao Wang, Lizhuang Ma

Furthermore, we propose a few-shot camera motion disentanglement method to extract the common camera motion from multiple videos with similar camera motions, which employs a window-based clustering technique to extract the common features in temporal attention maps of multiple videos.

Disentanglement, Motion Disentanglement +2

Plasticine3D: 3D Non-Rigid Editing with Text Guidance by Multi-View Embedding Optimization

no code implementations • 15 Dec 2023 • Yige Chen, Teng Hu, Yizhe Tang, Siyuan Chen, Ang Chen, Ran Yi

With the help of Score Distillation Sampling (SDS) and the rapid development of neural 3D representations, some methods have been proposed to perform 3D editing such as adding additional geometries, or overwriting textures.

3D Generation, Plasticine3D +1

AnomalyDiffusion: Few-Shot Anomaly Image Generation with Diffusion Model

1 code implementation • 10 Dec 2023 • Teng Hu, Jiangning Zhang, Ran Yi, Yuzhen Du, Xu Chen, Liang Liu, Yabiao Wang, Chengjie Wang

Existing anomaly inspection methods are limited in their performance due to insufficient anomaly data.

Image Generation

Stroke-based Neural Painting and Stylization with Dynamically Predicted Painting Region

2 code implementations • 7 Sep 2023 • Teng Hu, Ran Yi, Haokun Zhu, Liang Liu, Jinlong Peng, Yabiao Wang, Chengjie Wang, Lizhuang Ma

To solve the problem, we propose Compositional Neural Painter, a novel stroke-based rendering framework which dynamically predicts the next painting region based on the current canvas, instead of dividing the image plane uniformly into painting regions.

Style Transfer

Phasic Content Fusing Diffusion Model with Directional Distribution Consistency for Few-Shot Model Adaption

1 code implementation • ICCV 2023 • Teng Hu, Jiangning Zhang, Liang Liu, Ran Yi, Siqi Kou, Haokun Zhu, Xu Chen, Yabiao Wang, Chengjie Wang, Lizhuang Ma

To address these problems, we propose a novel phasic content fusing few-shot diffusion model with directional distribution consistency loss, which targets different learning objectives at distinct training stages of the diffusion model.

Domain Adaptation, model

DocPrompt: Large-scale continue pretrain for zero-shot and few-shot document question answering

no code implementations • 21 Aug 2023 • Sijin Wu, Dan Zhang, Teng Hu, Shikun Feng

In this paper, we propose Docprompt for document question answering tasks with powerful zero-shot and few-shot performance.

Question Answering
