Search Results for author: Wenqing Zhang

Found 35 papers, 8 papers with code

ComposeAnyone: Controllable Layout-to-Human Generation with Decoupled Multimodal Conditions

1 code implementation • 21 Jan 2025 • Shiyue Zhang, Zheng Chong, Xi Lu, Wenqing Zhang, Haoxiang Li, Xujie Zhang, Jiehui Huang, Xiao Dong, Xiaodan Liang

Building on the success of diffusion models, significant advancements have been made in multimodal image generation tasks.

Image Generation

CatV2TON: Taming Diffusion Transformers for Vision-Based Virtual Try-On with Temporal Concatenation

no code implementations • 20 Jan 2025 • Zheng Chong, Wenqing Zhang, Shiyue Zhang, Jun Zheng, Xiao Dong, Haoxiang Li, Yiling Wu, Dongmei Jiang, Xiaodan Liang

Comprehensive experiments demonstrate that CatV2TON outperforms existing methods in both image and video try-on tasks, offering a versatile and reliable solution for realistic virtual try-ons across diverse scenarios.

Video Generation Virtual Try-on

Generating Multimodal Images with GAN: Integrating Text, Image, and Style

no code implementations • 4 Jan 2025 • Chaoyi Tan, Wenqing Zhang, Zhen Qi, Kowei Shih, Xinshi Li, Ao Xiang

In the field of computer vision, multimodal image generation has become a research hotspot, especially the task of integrating text, image, and style.

Image Generation

Mitigating Knowledge Conflicts in Language Model-Driven Question Answering

no code implementations • 18 Nov 2024 • Han Cao, Zhaoyang Zhang, Xiangtian Li, Chufan Wu, Hansong Zhang, Wenqing Zhang

In the context of knowledge-driven seq-to-seq generation tasks, such as document-based question answering and document summarization systems, two fundamental knowledge sources play crucial roles: the inherent knowledge embedded within model parameters and the external knowledge obtained through context.

Document Summarization Hallucination +3

Pseudo-Probability Unlearning: Towards Efficient and Privacy-Preserving Machine Unlearning

no code implementations • 4 Nov 2024 • Zihao Zhao, Yijiang Li, Yuchen Yang, Wenqing Zhang, Nuno Vasconcelos, Yinzhi Cao

Machine unlearning--enabling a trained model to forget specific data--is crucial for addressing biased data and adhering to privacy regulations like the General Data Protection Regulation (GDPR)'s "right to be forgotten".

Machine Unlearning Privacy Preserving

Asset pricing under model uncertainty with finite time and states

no code implementations • 23 Aug 2024 • Shuzhen Yang, Wenqing Zhang

In this study, we consider asset pricing under model uncertainty with a finite time and states structure, under a family of probabilities, and explore its relationship with the risk-neutral probability measure.

CatVTON: Concatenation Is All You Need for Virtual Try-On with Diffusion Models

1 code implementation • 21 Jul 2024 • Zheng Chong, Xiao Dong, Haoxiang Li, Shiyue Zhang, Wenqing Zhang, Xujie Zhang, Hanqing Zhao, Xiaodan Liang

Virtual try-on methods based on diffusion models achieve realistic try-on effects but often replicate the backbone network as a ReferenceNet or use additional image encoders to process condition inputs, leading to high training and inference costs.

Fashion Synthesis Image Generation +1

Uncertainty in the financial market and application to forecast abnormal financial fluctuations

no code implementations • 19 Mar 2024 • Shige Peng, Shuzhen Yang, Wenqing Zhang

The integration and innovation of finance and technology have gradually transformed the financial system into a complex one.

Debiasing Text-to-Image Diffusion Models

no code implementations • 22 Feb 2024 • Ruifei He, Chuhui Xue, Haoru Tan, Wenqing Zhang, Yingchen Yu, Song Bai, Xiaojuan Qi

Despite its simplicity, we show that IDA is efficient and converges quickly in resolving the social bias in TTI diffusion models.

Dataset Condensation via Generative Model

no code implementations • 14 Sep 2023 • David Junhao Zhang, Heng Wang, Chuhui Xue, Rui Yan, Wenqing Zhang, Song Bai, Mike Zheng Shou

Dataset condensation aims to condense a large dataset with a lot of training samples into a small set.

Dataset Condensation model

DiffCloth: Diffusion Based Garment Synthesis and Manipulation via Structural Cross-modal Semantic Alignment

no code implementations • ICCV 2023 • Xujie Zhang, BinBin Yang, Michael C. Kampffmeyer, Wenqing Zhang, Shiyue Zhang, Guansong Lu, Liang Lin, Hang Xu, Xiaodan Liang

Cross-modal garment synthesis and manipulation will significantly benefit the way fashion designers generate garments and modify their designs via flexible linguistic interfaces. Current approaches follow the general text-to-image paradigm and mine cross-modal relations via simple cross-attention modules, neglecting the structural correspondence between visual and textual representations in the fashion design domain.

Attribute Constituency Parsing +2

Free-ATM: Exploring Unsupervised Learning on Diffusion-Generated Images with Free Attention Masks

no code implementations • 13 Aug 2023 • David Junhao Zhang, Mutian Xu, Chuhui Xue, Wenqing Zhang, Xiaoguang Han, Song Bai, Mike Zheng Shou

Despite the rapid advancement of unsupervised learning in visual representation, it requires training on large-scale datasets that demand costly data collection, and pose additional challenges due to concerns regarding data privacy.

Contrastive Learning Image Classification +2

Lowis3D: Language-Driven Open-World Instance-Level 3D Scene Understanding

no code implementations • 1 Aug 2023 • Runyu Ding, Jihan Yang, Chuhui Xue, Wenqing Zhang, Song Bai, Xiaojuan Qi

To address this challenge, we propose to harness pre-trained vision-language (VL) foundation models that encode extensive knowledge from image-text pairs to generate captions for multi-view images of 3D scenes.

3D geometry 3D Open-Vocabulary Instance Segmentation +5

Fixed-point iterative algorithm for SVI model

no code implementations • 19 Jan 2023 • Shuzhen Yang, Wenqing Zhang

In this study, we develop an efficient iterative algorithm for the SVI model based on a fixed-point and least-square optimizer.

model

PV3D: A 3D Generative Model for Portrait Video Generation

no code implementations • 13 Dec 2022 • Zhongcong Xu, Jianfeng Zhang, Jun Hao Liew, Wenqing Zhang, Song Bai, Jiashi Feng, Mike Zheng Shou

While some prior works have applied such image GANs to unconditional 2D portrait video generation and static 3D portrait synthesis, there are few works successfully extending GANs for generating 3D-aware portrait videos.

Video Generation

Data-based Polymer-Unit Fingerprint (PUFp): A Newly Accessible Expression of Polymer Organic Semiconductors for Machine Learning

no code implementations • 3 Nov 2022 • Xinyue Zhang, Genwang Wei, Ye Sheng, Jiong Yang, Caichao Ye, Wenqing Zhang

By investigating the combinations of polymer units with mobility performance, a scheme for designing polymer OSC materials by combining ML approaches and PUFp information is proposed to not only passively predict OSC mobility but also actively provide structural guidance for new high-mobility OSC material design.

Is synthetic data from generative models ready for image recognition?

1 code implementation • 14 Oct 2022 • Ruifei He, Shuyang Sun, Xin Yu, Chuhui Xue, Wenqing Zhang, Philip Torr, Song Bai, Xiaojuan Qi

Recent text-to-image generation models have shown promising results in generating high-fidelity photo-realistic images.

Text-to-Image Generation Transfer Learning

Towards Understanding and Mitigating Dimensional Collapse in Heterogeneous Federated Learning

2 code implementations • 1 Oct 2022 • Yujun Shi, Jian Liang, Wenqing Zhang, Vincent Y. F. Tan, Song Bai

To remedy this problem caused by the data heterogeneity, we propose FedDecorr, a novel method that can effectively mitigate dimensional collapse in federated learning.

Federated Learning

Runner-Up Solution to ECCV 2022 Challenge on Out of Vocabulary Scene Text Understanding: Cropped Word Recognition

no code implementations • 4 Aug 2022 • Zhangzi Zhu, Yu Hao, Wenqing Zhang, Chuhui Xue, Song Bai

This report presents our 2nd place solution to the ECCV 2022 challenge on Out-of-Vocabulary Scene Text Understanding (OOV-ST): Cropped Word Recognition.

VMRF: View Matching Neural Radiance Fields

no code implementations • 6 Jul 2022 • Jiahui Zhang, Fangneng Zhan, Rongliang Wu, Yingchen Yu, Wenqing Zhang, Bai Song, Xiaoqin Zhang, Shijian Lu

With the feature transport plan as the guidance, a novel pose calibration technique is designed which rectifies the initially randomized camera poses by predicting relative pose transformations between the pair of rendered and real images.

NeRF Novel View Synthesis

Few Could Be Better Than All: Feature Sampling and Grouping for Scene Text Detection

no code implementations • CVPR 2022 • Jingqun Tang, Wenqing Zhang, Hongye Liu, Mingkun Yang, Bo Jiang, Guanglong Hu, Xiang Bai

Different from previous approaches that learn robust deep representations of scene text in a holistic manner, our method performs scene text detection based on a few representative features, which avoids the disturbance by background and reduces the computational cost.

Ranked #24 on Object Detection In Aerial Images on DOTA (using extra training data)

object-detection Object Detection In Aerial Images +2

Language Matters: A Weakly Supervised Vision-Language Pre-training Approach for Scene Text Detection and Spotting

no code implementations • 8 Mar 2022 • Chuhui Xue, Wenqing Zhang, Yu Hao, Shijian Lu, Philip Torr, Song Bai

Our network consists of an image encoder and a character-aware text encoder that extract visual and textual features, respectively, as well as a visual-textual decoder that models the interaction among textual and visual features for learning effective scene text representations.

Optical Character Recognition Optical Character Recognition (OCR) +2

SeqFormer: Sequential Transformer for Video Instance Segmentation

2 code implementations • 15 Dec 2021 • Junfeng Wu, Yi Jiang, Song Bai, Wenqing Zhang, Xiang Bai

Nevertheless, we observe that a stand-alone instance query suffices for capturing a time sequence of instances in a video, but attention mechanisms shall be done with each frame independently.

Instance Segmentation Semantic Segmentation +1

Contextual Text Detection

no code implementations • 29 Sep 2021 • Chuhui Xue, Jiaxing Huang, Wenqing Zhang, Shijian Lu, Song Bai, Changhu Wang

This paper presents Contextual Text Detection, a new setup that detects contextual text blocks for better understanding of texts in scenes.

Text Detection

I2C2W: Image-to-Character-to-Word Transformers for Accurate Scene Text Recognition

no code implementations • 18 May 2021 • Chuhui Xue, Jiaxing Huang, Wenqing Zhang, Shijian Lu, Changhu Wang, Song Bai

The first task focuses on image-to-character (I2C) mapping, which detects a set of character candidates from images based on different alignments of visual features in a non-sequential way.

Decoder Scene Text Recognition

Scene Text Detection with Scribble Lines

no code implementations • 9 Dec 2020 • Wenqing Zhang, Yang Qiu, Minghui Liao, Rui Zhang, Xiaolin Wei, Xiang Bai

It is a general labeling method for texts with various shapes and requires low labeling costs.

Scene Text Detection Text Detection

FedOCR: Communication-Efficient Federated Learning for Scene Text Recognition

no code implementations • 22 Jul 2020 • Wenqing Zhang, Yang Qiu, Song Bai, Rui Zhang, Xiaolin Wei, Xiang Bai

In this paper, we study how to make use of decentralized datasets for training a robust scene text recognizer while keeping them on local devices.

Federated Learning Privacy Preserving +1

FIS-GAN: GAN with Flow-based Importance Sampling

no code implementations • 6 Oct 2019 • Shiyu Yi, Donglin Zhan, Wenqing Zhang, Denglin Jiang, Kang An, Hao Wang

The training process of Generative Adversarial Networks (GANs), in most cases, applies uniform or Gaussian sampling in the latent space, which probably spends most of the computation on examples that are easy to generate and can already be handled properly.

Density Estimation Stochastic Optimization
