Search Results for author: Jiaxu Miao

Found 20 papers, 11 papers with code

Implicit Bias Injection Attacks against Text-to-Image Diffusion Models

1 code implementation 2 Apr 2025 Huayang Huang, Xiangye Jin, Jiaxu Miao, Yu Wu

The proliferation of text-to-image diffusion models (T2I DMs) has led to an increased presence of AI-generated images in daily life.

TarPro: Targeted Protection against Malicious Image Editing

no code implementations 18 Mar 2025 Kaixin Shen, Ruijie Quan, Jiaxu Miao, Jun Xiao, Yi Yang

The rapid advancement of image editing techniques has raised concerns about their misuse for generating Not-Safe-for-Work (NSFW) content.

Redistribute Ensemble Training for Mitigating Memorization in Diffusion Models

1 code implementation 13 Feb 2025 Xiaoliu Guan, Yu Wu, Huayang Huang, Xiao Liu, Jiaxu Miao, Yi Yang

In this paper, we propose a novel method for diffusion models from the perspective of visual modality, which is more generic and fundamental for mitigating memorization.

Image Generation, Memorization

Image Regeneration: Evaluating Text-to-Image Model via Generating Identical Image with Multimodal Large Language Models

no code implementations 14 Nov 2024 Chutian Meng, Fan Ma, Jiaxu Miao, Chi Zhang, Yi Yang, Yueting Zhuang

We use GPT4V to bridge the gap between the reference image and the text input for the T2I model, allowing T2I models to understand image content.

Image Generation

Iterative Ensemble Training with Anti-Gradient Control for Mitigating Memorization in Diffusion Models

1 code implementation 22 Jul 2024 Xiao Liu, Xiaoliu Guan, Yu Wu, Jiaxu Miao

In this paper, we propose a novel training framework for diffusion models from the perspective of visual modality, which is more generic and fundamental for mitigating memorization.

Data Augmentation, Memorization

Product-Level Try-on: Characteristics-preserving Try-on with Realistic Clothes Shading and Wrinkles

no code implementations 20 Jan 2024 Yanlong Zang, Han Yang, Jiaxu Miao, Yi Yang

Image-based virtual try-on systems, which fit new garments onto human portraits, are gaining research attention. An ideal pipeline should preserve the static features of clothes (like textures and logos) while also generating dynamic elements (e.g., shadows, folds) that adapt to the model's pose and environment. Previous works fail specifically in generating dynamic features, as they trivially preserve the warped in-shop clothes by compositing them with a predicted alpha mask. To resolve the dilemma between over-preservation and texture loss, we propose a novel diffusion-based Product-level virtual try-on pipeline, i.e., PLTON, which can preserve the fine details of logos and embroideries while producing realistic clothes shading and wrinkles. The main insights are threefold: 1) Adaptive Dynamic Rendering: We take a pre-trained diffusion model as a generative prior and tame it with image features, training a dynamic extractor from scratch to generate dynamic tokens that preserve high-fidelity semantic information.

Denoising, Virtual Try-on

Logic-induced Diagnostic Reasoning for Semi-supervised Semantic Segmentation

no code implementations ICCV 2023 Chen Liang, Wenguan Wang, Jiaxu Miao, Yi Yang

Recent advances in semi-supervised semantic segmentation have been heavily reliant on pseudo labeling to compensate for limited labeled data, disregarding the valuable relational knowledge among semantic concepts.

Diagnostic Segmentation +1

Distilling DETR with Visual-Linguistic Knowledge for Open-Vocabulary Object Detection

1 code implementation ICCV 2023 Liangqi Li, Jiaxu Miao, Dahu Shi, Wenming Tan, Ye Ren, Yi Yang, ShiLiang Pu

Current methods for open-vocabulary object detection (OVOD) rely on a pre-trained vision-language model (VLM) to acquire the recognition ability.

Knowledge Distillation Language Modeling +4

WINNER: Weakly-Supervised hIerarchical decompositioN and aligNment for Spatio-tEmporal Video gRounding

no code implementations CVPR 2023 Mengze Li, Han Wang, Wenqiao Zhang, Jiaxu Miao, Zhou Zhao, Shengyu Zhang, Wei Ji, Fei Wu

WINNER first builds the language decomposition tree in a bottom-up manner, upon which the structural attention mechanism and top-down feature backtracking jointly build a multi-modal decomposition tree, permitting a hierarchical understanding of unstructured videos.

Contrastive Learning, Spatio-Temporal Video Grounding +1

GMMSeg: Gaussian Mixture based Generative Semantic Segmentation Models

3 code implementations 5 Oct 2022 Chen Liang, Wenguan Wang, Jiaxu Miao, Yi Yang

Going beyond this, we propose GMMSeg, a new family of segmentation models that rely on a dense generative classifier for the joint distribution p(pixel feature, class).

Out-of-Distribution Detection, Segmentation +1

MHR-Net: Multiple-Hypothesis Reconstruction of Non-Rigid Shapes from 2D Views

1 code implementation 19 Jul 2022 Haitian Zeng, Xin Yu, Jiaxu Miao, Yi Yang

We propose MHR-Net, a novel method for recovering Non-Rigid Shapes from Motion (NRSfM).

Scalable Video Object Segmentation with Identification Mechanism

2 code implementations 22 Mar 2022 Zongxin Yang, Jiaxu Miao, Yunchao Wei, Wenguan Wang, Xiaohan Wang, Yi Yang

This paper delves into the challenges of achieving scalable and effective multi-object modeling for semi-supervised Video Object Segmentation (VOS).

Object Segmentation +3

End-to-End Modeling via Information Tree for One-Shot Natural Language Spatial Video Grounding

no code implementations ACL 2022 Mengze Li, Tianbao Wang, Haoyu Zhang, Shengyu Zhang, Zhou Zhao, Jiaxu Miao, Wenqiao Zhang, Wenming Tan, Jin Wang, Peng Wang, ShiLiang Pu, Fei Wu

To achieve effective grounding under a limited annotation budget, we investigate one-shot video grounding, and learn to ground natural language in all video frames with solely one frame labeled, in an end-to-end manner.

Descriptive Representation Learning +1

Large-Scale Video Panoptic Segmentation in the Wild: A Benchmark

1 code implementation CVPR 2022 Jiaxu Miao, Xiaohan Wang, Yu Wu, Wei Li, Xu Zhang, Yunchao Wei, Yi Yang

In contrast, our large-scale VIdeo Panoptic Segmentation in the Wild (VIPSeg) dataset provides 3,536 videos and 84,750 frames with pixel-level panoptic annotations, covering a wide range of real-world scenarios and categories.

Segmentation, Video Panoptic Segmentation

VSPW: A Large-scale Dataset for Video Scene Parsing in the Wild

no code implementations CVPR 2021 Jiaxu Miao, Yunchao Wei, Yu Wu, Chen Liang, Guangrui Li, Yi Yang

To the best of our knowledge, our VSPW is the first attempt to tackle the challenging video scene parsing task in the wild by considering diverse scenarios.

4k Scene Parsing

Memory Aggregation Networks for Efficient Interactive Video Object Segmentation

no code implementations CVPR 2020 Jiaxu Miao, Yunchao Wei, Yi Yang

Interactive video object segmentation (iVOS) aims at efficiently harvesting high-quality segmentation masks of the target object in a video with user interactions.

Interactive Video Object Segmentation Object +2
