1 code implementation • 18 Nov 2024 • Haoxing Chen, Zizheng Huang, Yan Hong, Yanshuo Wang, Zhongcai Lyu, Zhuoer Xu, Jun Lan, Zhangxuan Gu
Pre-trained vision-language models provide a robust foundation for efficient transfer learning across various downstream tasks.
1 code implementation • 30 May 2024 • Haoxing Chen, Yan Hong, Zizheng Huang, Zhuoer Xu, Zhangxuan Gu, Yaohui Li, Jun Lan, Huijia Zhu, Jianfu Zhang, Weiqiang Wang, Huaxiong Li
We believe that the GenVideo dataset and the DeMamba module will significantly advance the field of AI-generated video detection.
1 code implementation • 15 Apr 2024 • Haoxing Chen, Yaohui Li, Zizheng Huang, Yan Hong, Zhuoer Xu, Zhangxuan Gu, Jun Lan, Huijia Zhu, Weiqiang Wang
Recent advancements in efficient transfer learning (ETL) have shown remarkable success in fine-tuning VLMs within the scenario of limited data, introducing only a few parameters to harness task-specific insights from VLMs.
no code implementations • 28 Feb 2024 • Zhuoer Xu, Jianping Zhang, Shiwen Cui, Changhua Meng, Weiqiang Wang
Not only are these methods labor-intensive and require large budget costs, but the controllability of test prompt generation is lacking for the specific testing domain of LLM applications.
no code implementations • 20 Dec 2023 • Haoxing Chen, Yaohui Li, Zhangxuan Gu, Zhuoer Xu, Jun Lan, Huaxiong Li
Image harmonization is a crucial technique in image composition that aims to seamlessly match the background by adjusting the foreground of composite images.
1 code implementation • 21 Nov 2023 • Haoxing Chen, Yaohui Li, Yan Hong, Zizheng Huang, Zhuoer Xu, Zhangxuan Gu, Jun Lan, Huijia Zhu, Weiqiang Wang
Recent methods mainly focus on learning multi-modal features aligned with class names to enhance the generalization ability to unseen categories.
Ranked #1 on GZSL Video Classification on ActivityNet-GZSL (cls)
no code implementations • ICCV 2023 • Zhuoer Xu, Zhangxuan Gu, Jianping Zhang, Shiwen Cui, Changhua Meng, Weiqiang Wang
Transfer-based attackers craft adversarial examples against surrogate models and transfer them to victim models deployed in the black-box situation.
1 code implementation • 14 Jun 2023 • Jianping Zhang, Zhuoer Xu, Shiwen Cui, Changhua Meng, Weibin Wu, Michael R. Lyu
Therefore, in this paper, we aim to analyze the robustness of latent diffusion models more thoroughly.
1 code implementation • NeurIPS 2023 • Haoxing Chen, Zhuoer Xu, Zhangxuan Gu, Jun Lan, Xing Zheng, Yaohui Li, Changhua Meng, Huijia Zhu, Weiqiang Wang
Specifically, we build our model on a diffusion model and carefully modify the network structure to enable the model for drawing multilingual characters with the help of glyph and position information.
1 code implementation • CVPR 2023 • Zhangxuan Gu, Zhuoer Xu, Haoxing Chen, Jun Lan, Changhua Meng, Weiqiang Wang
Recent object detection approaches rely on pretrained vision-language models for image-text alignment.
1 code implementation • 8 Jan 2023 • Guanghui Zhu, Zhennan Zhu, Wenjie Wang, Zhuoer Xu, Chunfeng Yuan, Yihua Huang
Moreover, to improve the performance of the downstream graph learning task, attribute completion and the training of the heterogeneous GNN should be jointly optimized rather than viewed as two separate processes.
2 code implementations • 6 Dec 2022 • Zhangxuan Gu, Haoxing Chen, Zhuoer Xu, Jun Lan, Changhua Meng, Weiqiang Wang
Extensive experimental results on COCO and LVIS show that DiffusionInst achieves competitive performance compared to existing instance segmentation models with various backbones, such as ResNet and Swin Transformers.
Ranked #9 on Instance Segmentation on LVIS v1.0 val
1 code implementation • 7 Oct 2022 • Zhuoer Xu, Guanghui Zhu, Changhua Meng, Shiwen Cui, ZhenZhe Ying, Weiqiang Wang, Ming Gu, Yihua Huang
In this paper, we propose an efficient automated attacker called A2 to boost AT by generating the optimal perturbations on-the-fly during training.
1 code implementation • 17 Jan 2022 • ZhenZhe Ying, Zhuoer Xu, Zhifeng Li, Weiqiang Wang, Changhua Meng
Despite the success of deep learning in computer vision and natural language processing, Gradient Boosted Decision Tree (GBDT) is yet one of the most powerful tools for applications with tabular data such as e-commerce and FinTech.
1 code implementation • 17 Oct 2020 • Guanghui Zhu, Zhuoer Xu, Xu Guo, Chunfeng Yuan, Yihua Huang
Extensive experiments on classification and regression datasets demonstrate that DIFER can significantly improve the performance of various machine learning algorithms and outperform current state-of-the-art AutoFE methods in terms of both efficiency and performance.