Search Results for author: Haonan Lu

Found 19 papers, 10 papers with code

LAPTOP-Diff: Layer Pruning and Normalized Distillation for Compressing Diffusion Models

no code implementations17 Apr 2024 Dingkun Zhang, Sijia Li, Chen Chen, Qingsong Xie, Haonan Lu

To this end, we propose layer pruning and normalized distillation for compressing diffusion models (LAPTOP-Diff).

Knowledge Distillation

SCott: Accelerating Diffusion Models with Stochastic Consistency Distillation

no code implementations3 Mar 2024 Hongjian Liu, Qingsong Xie, Zhijie Deng, Chen Chen, Shixiang Tang, Fueyang Fu, Zheng-Jun Zha, Haonan Lu

In contrast to vanilla consistency distillation (CD), which distills the ODE-solver-based sampling process of a pretrained teacher model into a student, SCott explores the possibility, and validates the efficacy, of integrating stochastic differential equation (SDE) solvers into CD to fully unleash the potential of the teacher.

Text-to-Image Generation
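The idea of distilling a stochastic (rather than ODE-based) teacher trajectory can be illustrated with a toy sketch. Everything below is a hypothetical stand-in, not the paper's actual models: a one-dimensional Ornstein-Uhlenbeck SDE plays the teacher, a linear map plays the consistency student, and the loss simply asks the student's predictions at adjacent points of the stochastic trajectory to agree.

```python
import numpy as np

rng = np.random.default_rng(0)

def teacher_sde_step(x, dt, noise_scale=0.1):
    # Hypothetical teacher update: one Euler-Maruyama step of a toy SDE,
    # dx = -x dt + sigma dW, standing in for the diffusion teacher's
    # stochastic sampling trajectory.
    drift = -x * dt
    diffusion = noise_scale * np.sqrt(dt) * rng.standard_normal(x.shape)
    return x + drift + diffusion

def student(x, w):
    # Toy linear "consistency" student: maps a noisy state straight to a
    # prediction of the trajectory endpoint.
    return w * x

def scd_loss(x, dt, w):
    # Stochastic consistency distillation objective (sketch): the student's
    # outputs at two adjacent points of the teacher's *stochastic*
    # trajectory should coincide.
    x_next = teacher_sde_step(x, dt)
    return float(np.mean((student(x, w) - student(x_next, w)) ** 2))

loss = scd_loss(x=rng.standard_normal(8), dt=0.1, w=0.5)
```

In the real method the student and teacher are diffusion networks conditioned on the timestep; the sketch only captures the self-consistency-along-a-noisy-trajectory structure of the loss.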

Dream360: Diverse and Immersive Outdoor Virtual Scene Creation via Transformer-Based 360 Image Outpainting

no code implementations19 Jan 2024 Hao Ai, Zidong Cao, Haonan Lu, Chen Chen, Jian Ma, Pengyuan Zhou, Tae-Kyun Kim, Pan Hui, Lin Wang

To this end, we propose a transformer-based 360 image outpainting framework called Dream360, which can generate diverse, high-fidelity, and high-resolution panoramas from user-selected viewports, considering the spherical properties of 360 images.

Image Outpainting

MCAD: Multi-teacher Cross-modal Alignment Distillation for efficient image-text retrieval

no code implementations30 Oct 2023 Youbo Lei, Feifei He, Chen Chen, Yingbin Mo, Si Jia Li, Defeng Xie, Haonan Lu

Given the success of large-scale vision-language pretraining (VLP) models and the widespread industrial use of image-text retrieval, it is now critical to reduce model size and streamline mobile-device deployment.

Retrieval · Text Retrieval

MoEController: Instruction-based Arbitrary Image Manipulation with Mixture-of-Expert Controllers

no code implementations8 Sep 2023 Sijia Li, Chen Chen, Haonan Lu

In this work, we propose a method with mixture-of-expert (MOE) controllers to align the text-guided capacity of diffusion models with different kinds of human instructions, enabling our model to handle various open-domain image manipulation tasks with natural language instructions.

Image Generation · Image Manipulation
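The routing pattern behind mixture-of-expert controllers can be sketched in a few lines. This is a generic soft-gating sketch, not the paper's architecture: the gating matrix, the per-expert linear "controllers", and the dimensions are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_experts, d_inst, d_ctrl = 3, 16, 8

# Hypothetical parameters: a gating matrix and one small linear
# "controller" per expert (each imagined as specialised for one
# manipulation family, e.g. global style transfer vs. local editing).
W_gate = rng.standard_normal((d_inst, n_experts))
experts = [rng.standard_normal((d_inst, d_ctrl)) for _ in range(n_experts)]

def moe_controller(instruction_emb):
    # Soft routing: the gate scores each expert from the instruction
    # embedding; the control signal is the gate-weighted mixture of the
    # experts' outputs.
    logits = instruction_emb @ W_gate
    gate = np.exp(logits - logits.max())
    gate /= gate.sum()
    control = sum(g * (instruction_emb @ E) for g, E in zip(gate, experts))
    return gate, control

gate, control = moe_controller(rng.standard_normal(d_inst))
print(gate.shape, control.shape)  # (3,) (8,)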

Subject-Diffusion: Open Domain Personalized Text-to-Image Generation without Test-time Fine-tuning

1 code implementation21 Jul 2023 Jian Ma, Junhao Liang, Chen Chen, Haonan Lu

In this paper, we propose Subject-Diffusion, a novel open-domain personalized image generation model that requires no test-time fine-tuning and only a single reference image to support personalized generation of single or multiple subjects in any domain.

Diffusion Personalization Tuning Free · Text-to-Image Generation

Towards Language-guided Interactive 3D Generation: LLMs as Layout Interpreter with Generative Feedback

no code implementations25 May 2023 Yiqi Lin, Hao Wu, Ruichen Wang, Haonan Lu, Xiaodong Lin, Hui Xiong, Lin Wang

Generating and editing a 3D scene guided by natural language poses a challenge, primarily due to the complexity of specifying the positional relations and volumetric changes within the 3D space.

3D Generation

Compositional Text-to-Image Synthesis with Attention Map Control of Diffusion Models

1 code implementation23 May 2023 Ruichen Wang, Zekang Chen, Chen Chen, Jian Ma, Haonan Lu, Xiaodong Lin

Our approach produces a more semantically accurate synthesis by constraining the attention regions of each token in the prompt to the image.

Attribute · Image Generation
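Constraining each prompt token's attention to an image region can be sketched as a masked cross-attention step. This is a minimal illustration under assumed shapes, not the paper's implementation; in particular, where the per-token region masks come from (layouts, boxes, segmentation) is left open.

```python
import numpy as np

def masked_cross_attention(Q, K, V, token_masks):
    # Q: (n_pixels, d) image queries; K, V: (n_tokens, d) prompt tokens.
    # token_masks: (n_tokens, n_pixels) binary masks giving the image
    # region each token is allowed to attend over (an assumed input).
    scores = Q @ K.T / np.sqrt(Q.shape[1])              # (n_pixels, n_tokens)
    scores = np.where(token_masks.T > 0, scores, -1e9)  # suppress out-of-region
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ V                                  # (n_pixels, d)

rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))
K = rng.standard_normal((3, 8))
V = rng.standard_normal((3, 8))
masks = np.ones((3, 4))  # trivial masks: every token may attend everywhere
out = masked_cross_attention(Q, K, V, masks)
print(out.shape)  # (4, 8)
```

With non-trivial masks, attention weight for a (pixel, token) pair outside the token's region is driven to zero before the softmax, which is what steers each token's contribution to its designated area.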

Edit Everything: A Text-Guided Generative System for Images Editing

1 code implementation27 Apr 2023 Defeng Xie, Ruichen Wang, Jian Ma, Chen Chen, Haonan Lu, Dong Yang, Fobo Shi, Xiaodong Lin

We introduce a new generative system called Edit Everything, which can take image and text inputs and produce image outputs.

GlyphDraw: Seamlessly Rendering Text with Intricate Spatial Structures in Text-to-Image Generation

3 code implementations31 Mar 2023 Jian Ma, Mingjun Zhao, Chen Chen, Ruichen Wang, Di Niu, Haonan Lu, Xiaodong Lin

Recent breakthroughs in the field of language-guided image generation have yielded impressive achievements, enabling the creation of high-quality and diverse images based on user instructions. Although the synthesis performance is fascinating, one significant limitation of current image generation models is their insufficient ability to generate text coherently within images, particularly for complex glyph structures like Chinese characters.

Optical Character Recognition (OCR) · Text-to-Image Generation

CompoNeRF: Text-guided Multi-object Compositional NeRF with Editable 3D Scene Layout

no code implementations24 Mar 2023 Haotian Bai, Yuanhuiyi Lyu, Lutao Jiang, Sijia Li, Haonan Lu, Xiaodong Lin, Lin Wang

To tackle the issue of 'guidance collapse' and enhance consistency, we propose a novel framework, dubbed CompoNeRF, by integrating an editable 3D scene layout with object-specific and scene-wide guidance mechanisms.

Object · Text to 3D

AugNet: End-to-End Unsupervised Visual Representation Learning with Image Augmentation

1 code implementation11 Jun 2021 Mingxiang Chen, Zhanguo Chang, Haonan Lu, Bitao Yang, Zhuang Li, Liufang Guo, Zhecheng Wang

In our evaluations, the method outperforms state-of-the-art image retrieval algorithms on several out-of-domain image datasets.

Clustering · Image Augmentation · +4

DensE: An Enhanced Non-commutative Representation for Knowledge Graph Embedding with Adaptive Semantic Hierarchy

1 code implementation11 Aug 2020 Haonan Lu, Hailin Hu, Xiaodong Lin

This design principle leads to several advantages of our method:
(1) For composite relations, the corresponding diagonal relation matrices can be non-commutative, reflecting a predominant scenario in real-world applications.
(2) Our model preserves the natural interaction between relational operations and entity embeddings.
(3) The scaling operation provides the modeling power for the intrinsic semantic hierarchical structure of entities.
(4) The enhanced expressiveness of DensE is achieved with high computational efficiency in terms of both parameter size and training time.
(5) Modeling entities in Euclidean space instead of quaternion space keeps the direct geometrical interpretations of relational patterns.

Computational Efficiency · Entity Embeddings · +2
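Two of the listed properties, non-commutative relation composition and a distance-based score from rotation plus scaling in Euclidean space, can be checked concretely. The score function below is a hypothetical sketch in the spirit of the abstract, not the paper's exact parameterization.

```python
import numpy as np

def rot_z(theta):
    # 3D rotation about the z-axis (one factor of an SO(3) rotation).
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def rot_x(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

def dense_score(h, t, R, scale):
    # Hypothetical distance-based score: the relation acts on the head
    # entity as a rotation followed by a scaling in Euclidean space;
    # a smaller distance means a more plausible triple.
    return -np.linalg.norm(scale * (R @ h) - t)

# Non-commutativity of composed relations: rotations about different
# axes generally do not commute, i.e. R1 @ R2 != R2 @ R1.
R1, R2 = rot_z(0.5), rot_x(1.0)
print(np.allclose(R1 @ R2, R2 @ R1))  # False
```

A triple whose tail is exactly the rotated-and-scaled head scores 0 (the maximum), while mismatched tails score strictly below it, which is the usual shape of such distance-based knowledge-graph objectives.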
