Search Results for author: Yifeng Geng

Found 31 papers, 19 papers with code

AnyStory: Towards Unified Single and Multiple Subject Personalization in Text-to-Image Generation

1 code implementation · 16 Jan 2025 · Junjie He, Yuxiang Tuo, Binghui Chen, Chongyang Zhong, Yifeng Geng, Liefeng Bo

AnyStory not only achieves high-fidelity personalization for single subjects, but also for multiple subjects, without sacrificing subject fidelity.

Text-to-Image Generation

AnyText2: Visual Text Generation and Editing With Customizable Attributes

1 code implementation · 22 Nov 2024 · Yuxiang Tuo, Yifeng Geng, Liefeng Bo

As an extension of AnyText, this method allows for customization of attributes for each line of text, leading to improvements of 3.3% and 9.3% in text accuracy for Chinese and English, respectively.

Image Generation, Text Generation

GLDesigner: Leveraging Multi-Modal LLMs as Designer for Enhanced Aesthetic Text Glyph Layouts

no code implementations · 18 Nov 2024 · Junwen He, Yifan Wang, Lijun Wang, Huchuan Lu, Jun-Yan He, Chenyang Li, Hanyuan Chen, Jin-Peng Lan, Bin Luo, Yifeng Geng

Text logo design heavily relies on the creativity and expertise of professional designers, in which arranging element layouts is one of the most important procedures.

Layout Design, Layout Generation

Prune and Repaint: Content-Aware Image Retargeting for any Ratio

1 code implementation · 30 Oct 2024 · Feihong Shen, Chao Li, Yifeng Geng, Yongjian Deng, Hao Chen

By focusing on the content and structure of the foreground, our PruneRepaint approach adaptively avoids key content loss and deformation, while effectively mitigating artifacts with local repainting.

Image Retargeting

AdaptiveDrag: Semantic-Driven Dragging on Diffusion-Based Image Editing

1 code implementation · 16 Oct 2024 · Duosheng Chen, Binghui Chen, Yifeng Geng, Liefeng Bo

Next, we leverage a pre-trained diffusion model to optimize the latent, enabling the dragging of features from handle points to target points.

UniPortrait: A Unified Framework for Identity-Preserving Single- and Multi-Human Image Personalization

1 code implementation · 12 Aug 2024 · Junjie He, Yifeng Geng, Liefeng Bo

UniPortrait consists of only two plug-and-play modules: an ID embedding module and an ID routing module.

Layout Generation

MetaDesigner: Advancing Artistic Typography Through AI-Driven, User-Centric, and Multilingual WordArt Synthesis

no code implementations · 28 Jun 2024 · Jun-Yan He, Zhi-Qi Cheng, Chenyang Li, Jingdong Sun, Qi He, Wangmeng Xiang, Hanyuan Chen, Jin-Peng Lan, Xianhui Lin, Kang Zhu, Bin Luo, Yifeng Geng, Xuansong Xie, Alexander G. Hauptmann

MetaDesigner introduces a transformative framework for artistic typography synthesis, powered by Large Language Models (LLMs) and grounded in a user-centric design paradigm.

VirtualModel: Generating Object-ID-retentive Human-object Interaction Image by Diffusion Model for E-commerce Marketing

no code implementations · 16 May 2024 · Binghui Chen, Chongyang Zhong, Wangmeng Xiang, Yifeng Geng, Xuansong Xie

Due to the significant advances in large-scale text-to-image generation by diffusion models (DMs), controllable human image generation has been attracting much attention recently.

Human-Object Interaction Detection, Marketing

Strictly-ID-Preserved and Controllable Accessory Advertising Image Generation

no code implementations · 7 Apr 2024 · Youze Xue, Binghui Chen, Yifeng Geng, Xuansong Xie, Jiansheng Chen, Hongbing Ma

Customized generative text-to-image models have the ability to produce images that closely resemble a given subject.

Image Generation

ShoeModel: Learning to Wear on the User-specified Shoes via Diffusion Model

no code implementations · 7 Apr 2024 · Binghui Chen, Wenyu Li, Yifeng Geng, Xuansong Xie, WangMeng Zuo

Specifically, we propose a shoe-wearing system, called Shoe-Model, to generate plausible images of human legs interacting with the given shoes.

Image Generation, Marketing

Data-efficient Event Camera Pre-training via Disentangled Masked Modeling

no code implementations · 1 Mar 2024 · Zhenpeng Huang, Chao Li, Hao Chen, Yongjian Deng, Yifeng Geng, LiMin Wang

Our pre-training overcomes the limitations of previous methods, which either sacrifice temporal information by converting event sequences into 2D images so that pre-trained image models can be used, or directly employ paired image data for knowledge distillation to enhance the learning of event streams.

Knowledge Distillation, Self-Supervised Learning

WordArt Designer API: User-Driven Artistic Typography Synthesis with Large Language Models on ModelScope

no code implementations · 3 Jan 2024 · Jun-Yan He, Zhi-Qi Cheng, Chenyang Li, Jingdong Sun, Wangmeng Xiang, Yusen Hu, Xianhui Lin, Xiaoyang Kang, Zengke Jin, Bin Luo, Yifeng Geng, Xuansong Xie, Jingren Zhou

This paper introduces the WordArt Designer API, a novel framework for user-driven artistic typography synthesis utilizing Large Language Models (LLMs) on ModelScope.

FMViT: A multiple-frequency mixing Vision Transformer

no code implementations · 9 Nov 2023 · Wei Tan, Yifeng Geng, Xuansong Xie

On CoreML, FMViT outperforms MobileOne by 2.6% in top-1 accuracy on the ImageNet dataset (78.5% vs. 75.9%), with inference latency comparable to MobileOne.

AnyText: Multilingual Visual Text Generation And Editing

1 code implementation · 6 Nov 2023 · Yuxiang Tuo, Wangmeng Xiang, Jun-Yan He, Yifeng Geng, Xuansong Xie

Based on the AnyWord-3M dataset, we propose AnyText-benchmark for the evaluation of visual text generation accuracy and quality.

Image Generation, Optical Character Recognition (OCR)

Overcoming Topology Agnosticism: Enhancing Skeleton-Based Action Recognition through Redefined Skeletal Topology Awareness

1 code implementation · 19 May 2023 · Yuxuan Zhou, Zhi-Qi Cheng, Jun-Yan He, Bin Luo, Yifeng Geng, Xuansong Xie

As a remedy, we propose a threefold strategy: (1) We forge an innovative pathway that encodes bone connectivity by harnessing the power of graph distances.

Action Recognition, Skeleton Based Action Recognition

DAMO-StreamNet: Optimizing Streaming Perception in Autonomous Driving

1 code implementation · 30 Mar 2023 · Jun-Yan He, Zhi-Qi Cheng, Chenyang Li, Wangmeng Xiang, Binghui Chen, Bin Luo, Yifeng Geng, Xuansong Xie

Real-time perception, or streaming perception, is a crucial aspect of autonomous driving that has yet to be thoroughly explored in existing research.

Autonomous Driving

Optimal Proposal Learning for Deployable End-to-End Pedestrian Detection

no code implementations · CVPR 2023 · Xiaolin Song, Binghui Chen, Pengyu Li, Jun-Yan He, Biao Wang, Yifeng Geng, Xuansong Xie, Honggang Zhang

End-to-end pedestrian detection focuses on training a pedestrian detection model without the Non-Maximum Suppression (NMS) post-processing step.

Pedestrian Detection

LongShortNet: Exploring Temporal and Semantic Features Fusion in Streaming Perception

2 code implementations · 27 Oct 2022 · Chenyang Li, Zhi-Qi Cheng, Jun-Yan He, Pengyu Li, Bin Luo, Hanyuan Chen, Yifeng Geng, Jin-Peng Lan, Xuansong Xie

Streaming perception is a critical task in autonomous driving that requires balancing the latency and accuracy of the autopilot system.

Autonomous Driving

Learning to Focus: Cascaded Feature Matching Network for Few-shot Image Recognition

no code implementations · 13 Jan 2021 · Mengting Chen, Xinggang Wang, Heng Luo, Yifeng Geng, Wenyu Liu

By applying the proposed feature matching block in different layers of the few-shot recognition network, multi-scale information among the compared images can be incorporated into the final cascaded matching feature, which boosts the recognition performance further and generalizes better by learning on relationships.

Few-Shot Learning

Diversity Transfer Network for Few-Shot Learning

1 code implementation · 31 Dec 2019 · Mengting Chen, Yuxin Fang, Xinggang Wang, Heng Luo, Yifeng Geng, Xin-Yu Zhang, Chang Huang, Wenyu Liu, Bo Wang

The learning problem of the sample generation (i.e., diversity transfer) is solved by minimizing an effective meta-classification loss in a single-stage network, instead of the generative loss in previous works.

Diversity, Few-Shot Learning
