Search Results for author: Zhuohang Dang

Found 9 papers, 0 papers with code

ChatGen: Automatic Text-to-Image Generation From FreeStyle Chatting

no code implementations26 Nov 2024 Chengyou Jia, Changliang Xia, Zhuohang Dang, Weijia Wu, Hangwei Qian, Minnan Luo

Despite the significant advancements in text-to-image (T2I) generative models, users often face a trial-and-error challenge in practical scenarios.

Text-to-Image Generation

AgentStore: Scalable Integration of Heterogeneous Agents As Specialized Generalist Computer Assistant

no code implementations24 Oct 2024 Chengyou Jia, Minnan Luo, Zhuohang Dang, Qiushi Sun, Fangzhi Xu, Junlin Hu, Tianbao Xie, Zhiyong Wu

Comprehensive quantitative and qualitative results further demonstrate AgentStore's ability to enhance agent systems in both generalization and specialization, underscoring its potential for developing the specialized generalist computer assistant.

Disentangled Noisy Correspondence Learning

no code implementations10 Aug 2024 Zhuohang Dang, Minnan Luo, Jihong Wang, Chengyou Jia, Haochen Han, Herun Wan, Guang Dai, Xiaojun Chang, Jingdong Wang

Moreover, although intuitive, directly applying previous cross-modal disentanglement methods suffers from limited noise tolerance and disentanglement efficacy.

cross-modal alignment Cross-Modal Retrieval +2

Disentangled Representation Learning with Transmitted Information Bottleneck

no code implementations3 Nov 2023 Zhuohang Dang, Minnan Luo, Chengyou Jia, Guang Dai, Jihong Wang, Xiaojun Chang, Jingdong Wang

Encoding only the task-related information from the raw data, \ie, disentangled representation learning, can greatly contribute to the robustness and generalizability of models.

Disentanglement Variational Inference

PSDiff: Diffusion Model for Person Search with Iterative and Collaborative Refinement

no code implementations20 Sep 2023 Chengyou Jia, Minnan Luo, Zhuohang Dang, Guang Dai, Xiaojun Chang, Jingdong Wang

Dominant Person Search methods aim to localize and recognize query persons in a unified network, which jointly optimizes two sub-tasks, \ie, pedestrian detection and Re-IDentification (ReID).

Denoising Pedestrian Detection +2

SSMG: Spatial-Semantic Map Guided Diffusion Model for Free-form Layout-to-Image Generation

no code implementations20 Aug 2023 Chengyou Jia, Minnan Luo, Zhuohang Dang, Guang Dai, Xiaojun Chang, Mengmeng Wang, Jingdong Wang

Despite significant progress in Text-to-Image (T2I) generative models, even lengthy and complex text descriptions still struggle to convey detailed controls.

Diversity Form +1

Disentangled Generation with Information Bottleneck for Few-Shot Learning

no code implementations29 Nov 2022 Zhuohang Dang, Jihong Wang, Minnan Luo, Chengyou Jia, Caixia Yan, Qinghua Zheng

To these challenges, we propose a novel Information Bottleneck (IB) based Disentangled Generation Framework for FSL, termed as DisGenIB, that can simultaneously guarantee the discrimination and diversity of generated samples.

Disentanglement Diversity +1

Cannot find the paper you are looking for? You can Submit a new open access paper.