Search Results for author: Bohan Zeng

Found 15 papers, 10 papers with code

Mavors: Multi-granularity Video Representation for Multimodal Large Language Model

no code implementations • 14 Apr 2025 • Yang Shi, Jiaheng Liu, Yushuo Guan, Zhenhua Wu, Yuanxing Zhang, ZiHao Wang, WeiHong Lin, Jingyun Hua, Zekun Wang, Xinlong Chen, Bohan Zeng, Wentao Zhang, Fuzheng Zhang, Wenjing Yang, Di Zhang

Long-context video understanding in multimodal large language models (MLLMs) faces a critical challenge: balancing computational efficiency with the retention of fine-grained spatio-temporal patterns.

Computational Efficiency, Language Modeling, +4 more

WideRange4D: Enabling High-Quality 4D Reconstruction with Wide-Range Movements and Scenes

1 code implementation • 17 Mar 2025 • Ling Yang, Kaixin Zhu, Juanxi Tian, Bohan Zeng, Mingbao Lin, Hongjuan Pei, Wentao Zhang, Shuicheng Yan

In this paper, we focus on 4D scene reconstruction with significant object spatial movements and propose a novel 4D reconstruction benchmark, WideRange4D.

3D Reconstruction, 4D Reconstruction

Any2AnyTryon: Leveraging Adaptive Position Embeddings for Versatile Virtual Clothing Tasks

no code implementations • 27 Jan 2025 • Hailong Guo, Bohan Zeng, Yiren Song, Wentao Zhang, Chuang Zhang, Jiaming Liu

However, the scarcity of paired garment-model data makes it challenging for existing methods to achieve high generalization and quality in VTON.

Position, Virtual Try-on

Semantic Score Distillation Sampling for Compositional Text-to-3D Generation

no code implementations • 11 Oct 2024 • Ling Yang, Zixiang Zhang, Junlin Han, Bohan Zeng, Runjia Li, Philip Torr, Wentao Zhang

To overcome these challenges, we introduce a novel SDS approach, Semantic Score Distillation Sampling (SemanticSDS), designed to effectively improve the expressiveness and accuracy of compositional text-to-3D generation.

3D Generation, Text to 3D

Trans4D: Realistic Geometry-Aware Transition for Compositional Text-to-4D Synthesis

no code implementations • 9 Oct 2024 • Bohan Zeng, Ling Yang, Siyu Li, Jiaming Liu, Zixiang Zhang, Juanxi Tian, Kaixin Zhu, Yongzhen Guo, Fu-Yun Wang, Minkai Xu, Stefano Ermon, Wentao Zhang

Then we propose a geometry-aware 4D transition network to realize a complex scene-level 4D transition based on the plan, which involves expressive geometrical object deformation.

Video Generation

EditWorld: Simulating World Dynamics for Instruction-Following Image Editing

1 code implementation • 23 May 2024 • Ling Yang, Bohan Zeng, Jiaming Liu, Hong Li, Minghao Xu, Wentao Zhang, Shuicheng Yan

Therefore, this work, EditWorld, introduces a new editing task, namely world-instructed image editing, which defines and categorizes the instructions grounded by various world scenarios.

Instruction Following

UV-IDM: Identity-Conditioned Latent Diffusion Model for Face UV-Texture Generation

1 code implementation • CVPR 2024 • Hong Li, Yutang Feng, Song Xue, Xuhui Liu, Bohan Zeng, Shanglin Li, Boyu Liu, Jianzhuang Liu, Shumin Han, Baochang Zhang

To solve these problems we introduce an Identity-Conditioned Latent Diffusion Model for face UV-texture generation (UV-IDM) to generate photo-realistic textures based on the Basel Face Model (BFM).

3D Face Reconstruction, Face Model, +1 more

IPDreamer: Appearance-Controllable 3D Object Generation with Complex Image Prompts

1 code implementation • 9 Oct 2023 • Bohan Zeng, Shanglin Li, Yutang Feng, Ling Yang, Hong Li, Sicheng Gao, Jiaming Liu, Conghui He, Wentao Zhang, Jianzhuang Liu, Baochang Zhang, Shuicheng Yan

Recent advances in 3D generation have been remarkable, with methods such as DreamFusion leveraging large-scale text-to-image diffusion-based models to guide 3D object generation.

3D Generation, Image to 3D, +2 more

Controllable Mind Visual Diffusion Model

1 code implementation • 17 May 2023 • Bohan Zeng, Shanglin Li, Xuhui Liu, Sicheng Gao, XiaoLong Jiang, Xu Tang, Yao Hu, Jianzhuang Liu, Baochang Zhang

Brain signal visualization has emerged as an active research area, serving as a critical interface between the human visual system and computer vision models.

Attribute, Image Generation, +1 more

Implicit Diffusion Models for Continuous Super-Resolution

1 code implementation • CVPR 2023 • Sicheng Gao, Xuhui Liu, Bohan Zeng, Sheng Xu, Yanjing Li, Xiaoyan Luo, Jianzhuang Liu, XianTong Zhen, Baochang Zhang

IDM integrates an implicit neural representation and a denoising diffusion model in a unified end-to-end framework, where the implicit neural representation is adopted in the decoding process to learn continuous-resolution representation.

Denoising, Image Super-Resolution

IDa-Det: An Information Discrepancy-aware Distillation for 1-bit Detectors

1 code implementation • 7 Oct 2022 • Sheng Xu, Yanjing Li, Bohan Zeng, Teli Ma, Baochang Zhang, Xianbin Cao, Peng Gao, Jinhu Lv

This explains why existing KD methods are less effective for 1-bit detectors, caused by a significant information discrepancy between the real-valued teacher and the 1-bit student.

Knowledge Distillation, Object Detection, +1 more

FNeVR: Neural Volume Rendering for Face Animation

1 code implementation • 21 Sep 2022 • Bohan Zeng, Boyu Liu, Hong Li, Xuhui Liu, Jianzhuang Liu, Dapeng Chen, Wei Peng, Baochang Zhang

In FNeVR, we design a 3D Face Volume Rendering (FVR) module to enhance the facial details for image rendering.

Talking Face Generation

TerViT: An Efficient Ternary Vision Transformer

no code implementations • 20 Jan 2022 • Sheng Xu, Yanjing Li, Teli Ma, Bohan Zeng, Baochang Zhang, Peng Gao, Jinhu Lv

Vision transformers (ViTs) have demonstrated great potential in various visual tasks, but suffer from expensive computational and memory cost problems when deployed on resource-constrained devices.
