no code implementations • 18 Apr 2025 • Zhen Wen, Luoxuan Weng, Yinghao Tang, Runjin Zhang, Yuxin Liu, Bo Pan, Minfeng Zhu, Wei Chen
To explore the potential of multimodal prompting in visualization authoring, we design VisPilot, which enables users to easily create visualizations using multimodal prompts, including text, sketches, and direct manipulations on existing visualizations.
1 code implementation • 13 Mar 2025 • Yi Yang, Xiaoxuan He, Hongkun Pan, Xiyan Jiang, Yan Deng, Xingtao Yang, Haoyu Lu, Dacheng Yin, Fengyun Rao, Minfeng Zhu, Bo Zhang, Wei Chen
Existing visual-language models often struggle to effectively analyze and reason about visual content, resulting in suboptimal performance on complex reasoning tasks.
no code implementations • 3 Dec 2024 • Luoxuan Weng, Yinghao Tang, Yingchaojie Feng, Zhuo Chang, Ruiqin Chen, Haozhe Feng, Chen Hou, Danqing Huang, Yang Li, Huaming Rao, Haonan Wang, Canshi Wei, Xiaofeng Yang, Yuhui Zhang, Yifeng Zheng, Xiuqi Huang, Minfeng Zhu, Yuxin Ma, Bin Cui, Peng Chen, Wei Chen
To achieve this unification, we design a domain knowledge incorporation module tailored for enterprise-specific BI tasks, an inter-agent communication mechanism to facilitate information sharing across the BI workflow, and a cell-based context management strategy to enhance context utilization efficiency in BI notebooks.
no code implementations • 19 Aug 2024 • Xinyang Wang, Yi Yang, Minfeng Zhu, Kecheng Zheng, Shi Liu, Wei Chen
Recent advancements in pre-trained Vision-Language Models (VLMs) have highlighted the significant potential of prompt tuning for adapting these models to a wide range of downstream tasks.
no code implementations • 12 Apr 2024 • Yingchaojie Feng, Zhizhang Chen, Zhining Kang, Sijia Wang, Minfeng Zhu, Wei Zhang, Wei Chen
Addressing these concerns necessitates a comprehensive analysis of jailbreak prompts to evaluate LLMs' defensive capabilities and identify potential weaknesses.
1 code implementation • 21 Feb 2024 • Zhaorui Yang, Tianyu Pang, Haozhe Feng, Han Wang, Wei Chen, Minfeng Zhu, Qian Liu
The surge in Large Language Models (LLMs) has revolutionized natural language processing, but fine-tuning them for specific tasks often encounters challenges in balancing performance and preserving general instruction-following abilities.
1 code implementation • 18 Jul 2023 • Yingchaojie Feng, Xingbo Wang, Kam Kwai Wong, Sijia Wang, Yuhong Lu, Minfeng Zhu, Baicheng Wang, Wei Chen
Generative text-to-image models have gained great popularity among the public for their powerful capability to generate high-quality images based on natural language prompts.
1 code implementation • 13 Apr 2023 • Haozhe Feng, Zhaorui Yang, Hesun Chen, Tianyu Pang, Chao Du, Minfeng Zhu, Wei Chen, Shuicheng Yan
Recently, source-free domain adaptation (SFDA) has gained popularity due to the need to protect the data privacy of the source domain, but it suffers from catastrophic forgetting on the source domain due to the lack of data.
1 code implementation • CVPR 2022 • Bo Wang, Tao Wu, Minfeng Zhu, Peng Du
In particular, the stuff layouts can take amorphous shapes and fill up the missing regions left out by the instance layouts.
Ranked #2 on Layout-to-Image Generation on Visual Genome 128x128
3 code implementations • 21 Nov 2020 • Hao-Zhe Feng, Kezhi Kong, Minghao Chen, Tianye Zhang, Minfeng Zhu, Wei Chen
Semi-supervised variational autoencoders (VAEs) have obtained strong results, but have also encountered the challenge that good ELBO values do not always imply accurate inference results.
1 code implementation • 19 Nov 2020 • Hao-Zhe Feng, Zhaoyang You, Minghao Chen, Tianye Zhang, Minfeng Zhu, Fei Wu, Chao Wu, Wei Chen
(2) A dynamic weighting strategy named Consensus Focus to identify both the malicious and irrelevant domains.
Knowledge Distillation, Multi-Source Unsupervised Domain Adaptation (+2 more)
no code implementations • 1 Aug 2019 • Zhaosong Huang, Ye Zhao, Wei Chen, Shengjie Gao, Kejie Yu, Weixia Xu, Mingjie Tang, Minfeng Zhu, Mingliang Xu
Visual querying is essential for interactively exploring massive trajectory data.
4 code implementations • CVPR 2019 • Minfeng Zhu, Pingbo Pan, Wei Chen, Yi Yang
If the initial image is not well initialized, the following processes can hardly refine the image to a satisfactory quality.
Ranked #9 on Text-to-Image Generation on Multi-Modal-CelebA-HQ