1 code implementation • 2 Apr 2025 • Huayang Huang, Xiangye Jin, Jiaxu Miao, Yu Wu
The proliferation of text-to-image diffusion models (T2I DMs) has led to an increased presence of AI-generated images in daily life.
no code implementations • 18 Mar 2025 • Kaixin Shen, Ruijie Quan, Jiaxu Miao, Jun Xiao, Yi Yang
The rapid advancement of image editing techniques has raised concerns about their misuse for generating Not-Safe-for-Work (NSFW) content.
1 code implementation • 13 Feb 2025 • Xiaoliu Guan, Yu Wu, Huayang Huang, Xiao Liu, Jiaxu Miao, Yi Yang
In this paper, we propose a novel method for diffusion models from the perspective of the visual modality, which is more generic and fundamental for mitigating memorization.
no code implementations • 14 Nov 2024 • Chutian Meng, Fan Ma, Jiaxu Miao, Chi Zhang, Yi Yang, Yueting Zhuang
We use GPT-4V to bridge the gap between the reference image and the text input of the T2I model, allowing T2I models to understand image content.
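A minimal sketch of this bridging step, assuming the OpenAI Python client for the vision-language model and Hugging Face diffusers for the T2I model; the model names, file paths, and prompt wording are illustrative placeholders, not the paper's exact setup:

```python
# Minimal sketch: use a vision-language model to turn a reference image
# into a text prompt that a T2I diffusion model can consume.
# Model names, paths, and prompts are placeholders, not the paper's setup.
import base64
from openai import OpenAI
from diffusers import StableDiffusionPipeline

client = OpenAI()  # reads OPENAI_API_KEY from the environment

with open("reference.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

# Step 1: ask the VLM to describe the reference image in T2I-friendly terms.
resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image as a detailed text-to-image prompt."},
            {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
        ],
    }],
)
prompt = resp.choices[0].message.content

# Step 2: feed the generated description to a text-to-image diffusion model.
pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
image = pipe(prompt).images[0]
image.save("generated.png")
```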
1 code implementation • 22 Jul 2024 • Xiao Liu, Xiaoliu Guan, Yu Wu, Jiaxu Miao
In this paper, we propose a novel training framework for diffusion models from the perspective of the visual modality, which is more generic and fundamental for mitigating memorization.
no code implementations • 20 Jan 2024 • Yanlong Zang, Han Yang, Jiaxu Miao, Yi Yang
Image-based virtual try-on systems, which fit new garments onto human portraits, are gaining research attention. An ideal pipeline should preserve the static features of clothes (like textures and logos) while also generating dynamic elements (e.g., shadows, folds) that adapt to the model's pose and environment. Previous works fail specifically in generating dynamic features, as they trivially preserve the warped in-shop clothes by compositing with a predicted alpha mask. To resolve the dilemma between over-preservation and texture loss, we propose a novel diffusion-based Product-level virtual try-on pipeline, i.e., PLTON, which can preserve the fine details of logos and embroideries while producing realistic clothes shading and wrinkles. The main insights are threefold: 1) Adaptive Dynamic Rendering: we take a pre-trained diffusion model as a generative prior and tame it with image features, training a dynamic extractor from scratch to generate dynamic tokens that preserve high-fidelity semantic information.
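A toy sketch of the dynamic-token idea as described above: a small extractor maps the garment image to a handful of conditioning tokens that are appended to the text embeddings a frozen diffusion UNet cross-attends to. The module design, shapes, and names are hypothetical, not PLTON's actual architecture:

```python
# Toy sketch of a "dynamic extractor": map garment-image features to a few
# conditioning tokens and append them to the text embeddings that a frozen
# diffusion UNet cross-attends to. Shapes and design are hypothetical.
import torch
import torch.nn as nn

class DynamicExtractor(nn.Module):
    def __init__(self, in_ch=3, token_dim=768, num_tokens=8):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(in_ch, 64, 3, stride=2, padding=1), nn.SiLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.SiLU(),
            nn.AdaptiveAvgPool2d(4),                     # B x 128 x 4 x 4
        )
        self.to_tokens = nn.Linear(128 * 4 * 4, num_tokens * token_dim)
        self.num_tokens, self.token_dim = num_tokens, token_dim

    def forward(self, garment):                          # garment: B x 3 x H x W
        feats = self.backbone(garment).flatten(1)
        tokens = self.to_tokens(feats)
        return tokens.view(-1, self.num_tokens, self.token_dim)

extractor = DynamicExtractor()
garment = torch.randn(2, 3, 256, 256)
text_emb = torch.randn(2, 77, 768)                       # e.g. CLIP text embeddings
cond = torch.cat([text_emb, extractor(garment)], dim=1)  # B x (77+8) x 768
# `cond` would replace the plain text embeddings as the UNet's cross-attention context.
```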
no code implementations • ICCV 2023 • Chen Liang, Wenguan Wang, Jiaxu Miao, Yi Yang
Recent advances in semi-supervised semantic segmentation have been heavily reliant on pseudo labeling to compensate for limited labeled data, disregarding the valuable relational knowledge among semantic concepts.
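For context, the pseudo-labeling baseline this sentence refers to, in its minimal confidence-thresholded form; the model, threshold, and loss weighting are generic placeholders:

```python
# Minimal confidence-thresholded pseudo-labeling for semantic segmentation.
# `model` is any per-pixel classifier; the 0.95 threshold and ignore_index
# are common but arbitrary choices.
import torch
import torch.nn.functional as F

@torch.no_grad()
def pseudo_labels(model, unlabeled_images, threshold=0.95, ignore_index=255):
    logits = model(unlabeled_images)                 # B x C x H x W
    probs = logits.softmax(dim=1)
    conf, labels = probs.max(dim=1)                  # per-pixel confidence and class
    labels[conf < threshold] = ignore_index          # discard low-confidence pixels
    return labels

def semi_supervised_step(model, x_l, y_l, x_u, ignore_index=255):
    y_u = pseudo_labels(model, x_u, ignore_index=ignore_index)
    loss_l = F.cross_entropy(model(x_l), y_l, ignore_index=ignore_index)
    loss_u = F.cross_entropy(model(x_u), y_u, ignore_index=ignore_index)
    return loss_l + loss_u
```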
no code implementations • CVPR 2023 • Jiaxu Miao, Zongxin Yang, Leilei Fan, Yi Yang
In this work, we propose FedSeg, a basic federated learning approach for class-heterogeneous semantic segmentation.
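FedSeg builds on the standard federated learning skeleton; below is a generic FedAvg-style server aggregation sketch, which omits FedSeg's actual handling of class-heterogeneous clients:

```python
# Generic FedAvg-style server aggregation, the federated-learning skeleton
# that FedSeg builds on; FedSeg's specific treatment of class-heterogeneous
# segmentation clients is not shown here.
import copy

def fedavg(global_model, client_states, client_sizes):
    """Weighted-average client state dicts by local dataset size."""
    total = sum(client_sizes)
    avg_state = copy.deepcopy(client_states[0])
    for key in avg_state:
        avg_state[key] = sum(
            state[key].float() * (n / total)
            for state, n in zip(client_states, client_sizes)
        )
    global_model.load_state_dict(avg_state)
    return global_model
```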
1 code implementation • ICCV 2023 • Liangqi Li, Jiaxu Miao, Dahu Shi, Wenming Tan, Ye Ren, Yi Yang, ShiLiang Pu
Current methods for open-vocabulary object detection (OVOD) rely on a pre-trained vision-language model (VLM) to acquire the recognition ability.
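The VLM-based recognition step in its simplest form: score region crops against free-form class names with CLIP. This uses the OpenAI CLIP package; the class names and crop file are placeholders:

```python
# Simplest form of the VLM recognition step in open-vocabulary detection:
# score a region crop against arbitrary class names with CLIP.
import clip
import torch
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

class_names = ["a photo of a dog", "a photo of a skateboard", "a photo of a tent"]
text = clip.tokenize(class_names).to(device)
crop = preprocess(Image.open("region_crop.jpg")).unsqueeze(0).to(device)

with torch.no_grad():
    image_feat = model.encode_image(crop)
    text_feat = model.encode_text(text)
    image_feat = image_feat / image_feat.norm(dim=-1, keepdim=True)
    text_feat = text_feat / text_feat.norm(dim=-1, keepdim=True)
    scores = (image_feat @ text_feat.T).softmax(dim=-1)

print(dict(zip(class_names, scores.squeeze(0).tolist())))
```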
no code implementations • CVPR 2023 • Mengze Li, Han Wang, Wenqiao Zhang, Jiaxu Miao, Zhou Zhao, Shengyu Zhang, Wei Ji, Fei Wu
WINNER first builds the language decomposition tree in a bottom-up manner, upon which the structural attention mechanism and top-down feature backtracking jointly build a multi-modal decomposition tree, permitting a hierarchical understanding of unstructured videos.
3 code implementations • 5 Oct 2022 • Chen Liang, Wenguan Wang, Jiaxu Miao, Yi Yang
Going beyond this, we propose GMMSeg, a new family of segmentation models that rely on a dense generative classifier for the joint distribution p(pixel feature, class).
Ranked #4 on Out-of-Distribution Detection on ADE-OoD
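A minimal stand-alone sketch of the generative-classifier idea: fit one GMM per class over pixel features and classify by the Bayes posterior p(c|x) ∝ p(x|c)p(c). Here scikit-learn's offline EM stands in for GMMSeg's online optimization inside the segmentation network:

```python
# Minimal dense generative classifier in the spirit of GMMSeg: one GMM per
# class over pixel features, classification by Bayes rule
# p(c|x) ∝ p(x|c) p(c). scikit-learn's offline EM stands in for GMMSeg's
# actual online training inside the network.
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_class_gmms(features, labels, num_classes, components=3):
    """features: N x D pixel embeddings, labels: N class ids."""
    return [
        GaussianMixture(n_components=components).fit(features[labels == c])
        for c in range(num_classes)
    ]

def classify(gmms, features, priors):
    # log p(x|c) + log p(c) for every class, argmax over classes
    log_joint = np.stack(
        [g.score_samples(features) + np.log(p) for g, p in zip(gmms, priors)],
        axis=1,
    )
    return log_joint.argmax(axis=1)

# Toy usage with random "pixel features".
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 16))
y = rng.integers(0, 3, size=600)
gmms = fit_class_gmms(X, y, num_classes=3)
priors = np.bincount(y, minlength=3) / len(y)
pred = classify(gmms, X, priors)
```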
1 code implementation • 19 Jul 2022 • Haitian Zeng, Xin Yu, Jiaxu Miao, Yi Yang
We propose MHR-Net, a novel method for recovering Non-Rigid Shapes from Motion (NRSfM).
2 code implementations • 22 Mar 2022 • Zongxin Yang, Jiaxu Miao, Yunchao Wei, Wenguan Wang, Xiaohan Wang, Yi Yang
This paper delves into the challenges of achieving scalable and effective multi-object modeling for semi-supervised Video Object Segmentation (VOS).
1 code implementation • 18 Mar 2022 • Chen Liang, Wenguan Wang, Tianfei Zhou, Jiaxu Miao, Yawei Luo, Yi Yang
We explore the task of language-guided video segmentation (LVS).
Ranked #7 on Referring Expression Segmentation on A2D Sentences
no code implementations • ACL 2022 • Mengze Li, Tianbao Wang, Haoyu Zhang, Shengyu Zhang, Zhou Zhao, Jiaxu Miao, Wenqiao Zhang, Wenming Tan, Jin Wang, Peng Wang, ShiLiang Pu, Fei Wu
To achieve effective grounding under a limited annotation budget, we investigate one-shot video grounding, and learn to ground natural language in all video frames with solely one frame labeled, in an end-to-end manner.
1 code implementation • CVPR 2022 • Jiaxu Miao, Xiaohan Wang, Yu Wu, Wei Li, Xu Zhang, Yunchao Wei, Yi Yang
In contrast, our large-scale VIdeo Panoptic Segmentation in the Wild (VIPSeg) dataset provides 3,536 videos and 84,750 frames with pixel-level panoptic annotations, covering a wide range of real-world scenarios and categories.
1 code implementation • CVPR 2022 • Yunqiu Xu, Yifan Sun, Zongxin Yang, Jiaxu Miao, Yi Yang
How to align the source and target domains is critical to CDWSOD accuracy.
Ranked #1 on Weakly Supervised Object Detection on Clipart1k
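One generic way to align source and target feature distributions is to minimize a discrepancy measure between them; below is a simple (biased) RBF-kernel MMD loss, shown for illustration only and not this paper's specific alignment design:

```python
# One generic alignment objective: maximum mean discrepancy (MMD) between
# source and target feature batches with an RBF kernel. Illustrative only;
# not this paper's cross-domain alignment design.
import torch

def rbf_mmd(source, target, sigma=1.0):
    """source, target: N x D and M x D feature batches."""
    def kernel(a, b):
        d2 = torch.cdist(a, b).pow(2)
        return torch.exp(-d2 / (2 * sigma ** 2))
    return (kernel(source, source).mean()
            + kernel(target, target).mean()
            - 2 * kernel(source, target).mean())

src = torch.randn(32, 256)   # source-domain features
tgt = torch.randn(32, 256)   # target-domain features
loss_align = rbf_mmd(src, tgt)
```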
no code implementations • CVPR 2021 • Jiaxu Miao, Yunchao Wei, Yu Wu, Chen Liang, Guangrui Li, Yi Yang
To the best of our knowledge, our VSPW is the first attempt to tackle the challenging video scene parsing task in the wild by considering diverse scenarios.
no code implementations • CVPR 2020 • Jiaxu Miao, Yunchao Wei, Yi Yang
Interactive video object segmentation (iVOS) aims at efficiently harvesting high-quality segmentation masks of the target object in a video with user interactions.
Ranked #5 on Interactive Video Object Segmentation on DAVIS 2017 (AUC-J metric)
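The interaction protocol in schematic form: the user annotates one frame, the model propagates masks through the video, and the next round of interaction targets the worst frame. All model methods here are hypothetical stand-ins for the paper's modules:

```python
# Schematic iVOS interaction loop. `segment_from_scribble`, `propagate`,
# and `worst_frame` are hypothetical stand-ins, not this paper's modules.
def interactive_vos(video_frames, get_user_scribble, model, rounds=3):
    masks = [None] * len(video_frames)
    frame_idx = 0                                  # start on the first frame
    for _ in range(rounds):
        scribble = get_user_scribble(frame_idx)    # user marks the object
        masks[frame_idx] = model.segment_from_scribble(
            video_frames[frame_idx], scribble, masks[frame_idx]
        )
        masks = model.propagate(video_frames, masks, source=frame_idx)
        frame_idx = model.worst_frame(masks)       # request feedback where quality is lowest
    return masks
```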
1 code implementation • ICCV 2019 • Jiaxu Miao, Yu Wu, Ping Liu, Yuhang Ding, Yi Yang
Our method largely outperforms existing person re-id methods on three occlusion datasets, while maintaining top performance on two holistic datasets.
Ranked #30 on Person Re-Identification on Occluded-DukeMTMC