Search Results for author: Yunchao Wei

Found 147 papers, 81 papers with code

Content-Consistent Matching for Domain Adaptive Semantic Segmentation

1 code implementation ECCV 2020 Guangrui Li, Guoliang Kang, Wu Liu, Yunchao Wei, Yi Yang

The target of CCM is to acquire those synthetic images that share similar distribution with the real ones in the target domain, so that the domain gap can be naturally alleviated by employing the content-consistent synthetic images for training.

Domain Adaptation Semantic Segmentation +1

Behavior Backdoor for Deep Learning Models

no code implementations 2 Dec 2024 Jiakai Wang, Pengfei Zhang, Renshuai Tao, Jian Yang, Hao Liu, Xianglong Liu, Yunchao Wei, Yao Zhao

Specifically, to adapt the optimization goal of behavior backdoor, we introduce the behavior-driven backdoor object optimizing method by a bi-target behavior backdoor training loss, thus we could guide the poisoned model optimization direction.

Backdoor Attack Deep Learning +1

BGM: Background Mixup for X-ray Prohibited Items Detection

no code implementations 30 Nov 2024 Weizhe Liu, Renshuai Tao, Hongguang Zhu, YunDa Sun, Yao Zhao, Yunchao Wei

The approach introduces 1) contour information of baggage and 2) variation of material information into the original image by Mixup at patch level.

Image Augmentation
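
The patch-level Mixup mentioned in the BGM summary above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `patch_mixup`, the square patch shape, and the fixed mixing weight `lam` are all assumptions made for illustration, and images are plain 2D lists of floats.

```python
def patch_mixup(image, other, top, left, size, lam):
    """Blend a square patch of `other` into `image` (hypothetical helper).

    image, other: 2D lists of floats with the same shape.
    top, left:    upper-left corner of the patch.
    size:         side length of the patch in pixels.
    lam:          weight kept for the original pixel, in [0, 1].
    Returns a new image; the inputs are not modified.
    """
    out = [row[:] for row in image]
    height, width = len(image), len(image[0])
    # Only pixels inside the patch are mixed; the rest stay untouched.
    for r in range(top, min(top + size, height)):
        for c in range(left, min(left + size, width)):
            out[r][c] = lam * image[r][c] + (1.0 - lam) * other[r][c]
    return out


# Usage: mix a 2x2 patch of a bright image into a dark one.
dark = [[0.0] * 4 for _ in range(4)]
bright = [[1.0] * 4 for _ in range(4)]
mixed = patch_mixup(dark, bright, top=1, left=1, size=2, lam=0.25)
```

In the paper the mixed-in content carries baggage contour and material information; here `other` is just a second image standing in for that source.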

ROSE: Revolutionizing Open-Set Dense Segmentation with Patch-Wise Perceptual Large Multimodal Model

no code implementations 29 Nov 2024 Kunyang Han, Yibo Hu, Mengxue Qu, Hailin Shi, Yao Zhao, Yunchao Wei

Advances in CLIP and large multimodal models (LMMs) have enabled open-vocabulary and free-text segmentation, yet existing models still require predefined category prompts, limiting free-form category self-generation.

Dual-view X-ray Detection: Can AI Detect Prohibited Items from Dual-view X-ray Images like Humans?

no code implementations 27 Nov 2024 Renshuai Tao, Haoyu Wang, Yuzhe Guo, Hairong Chen, Li Zhang, Xianglong Liu, Yunchao Wei, Yao Zhao

To emulate human intelligence in dual-view detection, we propose the Auxiliary-view Enhanced Network (AENet), a novel detection framework that leverages both the main and auxiliary views of the same object.

Who Can Withstand Chat-Audio Attacks? An Evaluation Benchmark for Large Language Models

1 code implementation 22 Nov 2024 Wanqi Yang, Yanda Li, Meng Fang, Yunchao Wei, Tianyi Zhou, Ling Chen

We evaluate six state-of-the-art LLMs with voice interaction capabilities, including Gemini-1.5-Pro, GPT-4o, and others, using three distinct evaluation methods on the CAA benchmark.

Collaborative Feature-Logits Contrastive Learning for Open-Set Semi-Supervised Object Detection

no code implementations 20 Nov 2024 Xinhao Zhong, Siyu Jiao, Yao Zhao, Yunchao Wei

However, in open-set scenarios, the unlabeled dataset contains both in-distribution (ID) classes and out-of-distribution (OOD) classes.

Contrastive Learning object-detection +2

Foundations and Recent Trends in Multimodal Mobile Agents: A Survey

1 code implementation 4 Nov 2024 Biao Wu, Yanda Li, Meng Fang, Zirui Song, Zhiwei Zhang, Yunchao Wei, Ling Chen

This survey provides a comprehensive review of mobile agent technologies, focusing on recent advancements that enhance real-time adaptability and multimodal interaction.

Survey

PSVMA+: Exploring Multi-granularity Semantic-visual Adaption for Generalized Zero-shot Learning

no code implementations 15 Oct 2024 Man Liu, Huihui Bai, Feng Li, Chunjie Zhang, Yunchao Wei, Meng Wang, Tat-Seng Chua, Yao Zhao

Generalized zero-shot learning (GZSL) endeavors to identify the unseen categories using knowledge from the seen domain, necessitating the intrinsic interactions between the visual features and attribute semantic features.

Attribute Diversity +1

Bridge the Points: Graph-based Few-shot Segment Anything Semantically

1 code implementation 9 Oct 2024 Anqi Zhang, Guangyu Gao, Jianbo Jiao, Chi Harold Liu, Yunchao Wei

Another subsequent Point-Mask Clustering module aligns the granularity of masks and selected points as a directed graph, based on mask coverage over points.

Few-Shot Semantic Segmentation Semantic Segmentation

Collapsed Language Models Promote Fairness

1 code implementation 6 Oct 2024 Jingxuan Xu, Wuyang Chen, Linyi Li, Yao Zhao, Yunchao Wei

To mitigate societal biases implicitly encoded in recent successful pretrained language models, a diverse array of approaches have been proposed to encourage model fairness, focusing on prompting, data augmentation, regularized fine-tuning, and more.

Data Augmentation Fairness +2

SLCA++: Unleash the Power of Sequential Fine-tuning for Continual Learning with Pre-training

1 code implementation 15 Aug 2024 Gengwei Zhang, Liyuan Wang, Guoliang Kang, Ling Chen, Yunchao Wei

Considering that the overly fast representation learning and the biased classification layer constitute this particular problem, we introduce the advanced Slow Learner with Classifier Alignment (SLCA++) framework to unleash the power of Seq FT, serving as a strong baseline approach for CLPT.

Continual Learning Image Classification +2

DreamLCM: Towards High-Quality Text-to-3D Generation via Latent Consistency Model

1 code implementation 6 Aug 2024 Yiming Zhong, Xiaolin Zhang, Yao Zhao, Yunchao Wei

Secondly, we propose a dual timestep strategy, increasing the consistency of guidance and optimizing 3D models from geometry to appearance in DreamLCM.

3D Generation Text to 3D

AppAgent v2: Advanced Agent for Flexible Mobile Interactions

no code implementations 5 Aug 2024 Yanda Li, Chi Zhang, Wanqi Yang, Bin Fu, Pei Cheng, Xin Chen, Ling Chen, Yunchao Wei

In the deployment phase, RAG technology enables efficient retrieval and update from this knowledge base, thereby empowering the agent to perform tasks effectively and accurately.

RAG

ACTRESS: Active Retraining for Semi-supervised Visual Grounding

no code implementations 3 Jul 2024 Weitai Kang, Mengxue Qu, Yunchao Wei, Yan Yan

Building upon this, ACTRESS consists of an active sampling strategy and a selective retraining strategy.

Binary Classification Visual Grounding

LayerMatch: Do Pseudo-labels Benefit All Layers?

no code implementations 20 Jun 2024 Chaoqi Liang, Guanglei Yang, Lifeng Qiao, Zitong Huang, Hongliang Yan, Yunchao Wei, WangMeng Zuo

Our approach, LayerMatch, which integrates these two strategies, can avoid the severe interference of noisy pseudo-labels in the linear classification layer while accelerating the clustering capability of the feature extraction layer.

Avg Clustering +1

Frozen CLIP: A Strong Backbone for Weakly Supervised Semantic Segmentation

1 code implementation CVPR 2024 Bingfeng Zhang, Siyue Yu, Yunchao Wei, Yao Zhao, Jimin Xiao

Specifically, the frozen CLIP model is applied as the backbone for semantic feature extraction, and a new decoder is designed to interpret extracted semantic features for final prediction.

Decoder Segmentation +2

Instructing Prompt-to-Prompt Generation for Zero-Shot Learning

no code implementations 5 Jun 2024 Man Liu, Huihui Bai, Feng Li, Chunjie Zhang, Yunchao Wei, Meng Wang, Tat-Seng Chua, Yao Zhao

Zero-shot learning (ZSL) aims to explore the semantic-visual interactions to discover comprehensive knowledge transferred from seen categories to classify unseen categories.

Domain Generalization Instruction Following +2

Intent3D: 3D Object Detection in RGB-D Scans Based on Human Intention

no code implementations 28 May 2024 Weitai Kang, Mengxue Qu, Jyoti Kini, Yunchao Wei, Mubarak Shah, Yan Yan

To achieve detection based on human intention, it relies on humans to observe the scene, reason out the target that aligns with their intention ("pillow" in this case), and finally provide a reference to the AI system, such as "A pillow on the couch".

3D Object Detection 3D visual grounding +2

ClassDiffusion: More Aligned Personalization Tuning with Explicit Class Guidance

1 code implementation 27 May 2024 Jiannan Huang, Jun Hao Liew, Hanshu Yan, Yuyang Yin, Yao Zhao, Yunchao Wei

Recent text-to-image customization works have been proven successful in generating images of given concepts by fine-tuning the diffusion models on a few examples.

Diffusion Personalization Video Generation

Diffusion4D: Fast Spatial-temporal Consistent 4D Generation via Video Diffusion Models

no code implementations 26 May 2024 Hanwen Liang, Yuyang Yin, Dejia Xu, Hanxue Liang, Zhangyang Wang, Konstantinos N. Plataniotis, Yao Zhao, Yunchao Wei

Building on this foundation, we propose a strategy to migrate the temporal consistency in video diffusion models to the spatial-temporal consistency required for 4D generation.

Transferable and Principled Efficiency for Open-Vocabulary Segmentation

1 code implementation CVPR 2024 Jingxuan Xu, Wuyang Chen, Yao Zhao, Yunchao Wei

In the context of efficient OVS, we target achieving performance that is comparable to or even better than prior OVS works based on large vision-language foundation models, by utilizing smaller models that incur lower training costs.

Model Compression

Learning Trimaps via Clicks for Image Matting

1 code implementation 30 Mar 2024 Chenyi Zhang, Yihan Hu, Henghui Ding, Humphrey Shi, Yao Zhao, Yunchao Wei

Despite significant advancements in image matting, existing models heavily depend on manually-drawn trimaps for accurate results in natural image scenarios.

Image Matting

Frequency-Aware Deepfake Detection: Improving Generalizability through Frequency Space Learning

1 code implementation 12 Mar 2024 Chuangchuang Tan, Yao Zhao, Shikui Wei, Guanghua Gu, Ping Liu, Yunchao Wei

Consequently, these detectors have exhibited a lack of proficiency in learning the frequency domain and tend to overfit to the artifacts present in the training data, leading to suboptimal performance on unseen sources.

DeepFake Detection Face Swapping

Learning Hierarchical Color Guidance for Depth Map Super-Resolution

no code implementations 12 Mar 2024 Runmin Cong, Ronghui Sheng, Hao Wu, Yulan Guo, Yunchao Wei, WangMeng Zuo, Yao Zhao, Sam Kwong

On the one hand, the low-level detail embedding module is designed to supplement high-frequency color information of depth features in a residual mask manner at the low-level stages.

Depth Map Super-Resolution

Data-Independent Operator: A Training-Free Artifact Representation Extractor for Generalizable Deepfake Detection

1 code implementation 11 Mar 2024 Chuangchuang Tan, Ping Liu, Renshuai Tao, Huan Liu, Yao Zhao, Baoyuan Wu, Yunchao Wei

Due to its unbias towards both the training and test sources, we define it as Data-Independent Operator (DIO) to achieve appealing improvements on unseen sources.

DeepFake Detection Face Swapping

4DGen: Grounded 4D Content Generation with Spatial-temporal Consistency

no code implementations 28 Dec 2023 Yuyang Yin, Dejia Xu, Zhangyang Wang, Yao Zhao, Yunchao Wei

Our pipeline facilitates controllable 4D generation, enabling users to specify the motion via monocular video or adopt image-to-video generations, thus offering superior control over content creation.

Prompt Engineering

Forgery-aware Adaptive Transformer for Generalizable Synthetic Image Detection

1 code implementation CVPR 2024 Huan Liu, Zichang Tan, Chuangchuang Tan, Yunchao Wei, Yao Zhao, Jingdong Wang

In this paper, we study the problem of generalizable synthetic image detection, aiming to detect forgery images from diverse generative methods, e.g., GANs and diffusion models.

Attribute Synthetic Image Detection

SegRefiner: Towards Model-Agnostic Segmentation Refinement with Discrete Diffusion Process

1 code implementation NeurIPS 2023 Mengyu Wang, Henghui Ding, Jun Hao Liew, Jiajun Liu, Yao Zhao, Yunchao Wei

We propose a model-agnostic solution called SegRefiner, which offers a novel perspective on this problem by interpreting segmentation refinement as a data generation process.

Denoising Dichotomous Image Segmentation +4

Rethinking the Up-Sampling Operations in CNN-based Generative Network for Generalizable Deepfake Detection

2 code implementations CVPR 2024 Chuangchuang Tan, Huan Liu, Yao Zhao, Shikui Wei, Guanghua Gu, Ping Liu, Yunchao Wei

Recently, the proliferation of highly realistic synthetic images, facilitated through a variety of GANs and Diffusions, has significantly heightened the susceptibility to misuse.

DeepFake Detection Face Swapping

Diffusion for Natural Image Matting

1 code implementation 10 Dec 2023 Yihan Hu, Yiheng Lin, Wei Wang, Yao Zhao, Yunchao Wei, Humphrey Shi

However, the presence of high computational overhead and the inconsistency of noise sampling between the training and inference processes pose significant obstacles to achieving this goal.

Decoder Image Matting

PixelLM: Pixel Reasoning with Large Multimodal Model

1 code implementation CVPR 2024 Zhongwei Ren, Zhicheng Huang, Yunchao Wei, Yao Zhao, Dongmei Fu, Jiashi Feng, Xiaojie Jin

PixelLM excels across various pixel-level image reasoning and understanding tasks, outperforming well-established methods in multiple benchmarks, including MUSE, single- and multi-referring segmentation.

Decoder Reasoning Segmentation +1

CoinSeg: Contrast Inter- and Intra- Class Representations for Incremental Segmentation

1 code implementation ICCV 2023 Zekang Zhang, Guangyu Gao, Jianbo Jiao, Chi Harold Liu, Yunchao Wei

However, most state-of-the-art methods use the freeze strategy for stability, which compromises the model's plasticity. In contrast, releasing parameter training for plasticity could lead to the best performance for all categories, but this requires discriminative feature representation. Therefore, we prioritize the model's plasticity and propose the Contrast inter- and intra-class representations for Incremental Segmentation (CoinSeg), which pursues discriminative representations for flexible parameter tuning.

Class-Incremental Semantic Segmentation Diversity

Learning Mask-aware CLIP Representations for Zero-Shot Segmentation

2 code implementations NeurIPS 2023 Siyu Jiao, Yunchao Wei, YaoWei Wang, Yao Zhao, Humphrey Shi

However, in the paper, we reveal that CLIP is insensitive to different mask proposals and tends to produce similar predictions for various mask proposals of the same image.

Open Vocabulary Semantic Segmentation Zero Shot Segmentation

Unified Frequency-Assisted Transformer Framework for Detecting and Grounding Multi-Modal Manipulation

no code implementations 18 Sep 2023 Huan Liu, Zichang Tan, Qiang Chen, Yunchao Wei, Yao Zhao, Jingdong Wang

Moreover, to address the semantic conflicts between image and frequency domains, the forgery-aware mutual module is developed to further enable the effective interaction of disparate image and frequency features, resulting in aligned and comprehensive visual forgery representations.

Decoder Misinformation

CTP: Towards Vision-Language Continual Pretraining via Compatible Momentum Contrast and Topology Preservation

1 code implementation 14 Aug 2023 Hongguang Zhu, Yunchao Wei, Xiaodan Liang, Chunjie Zhang, Yao Zhao

Regarding the growing nature of real-world data, such an offline training paradigm on ever-expanding data is unsustainable, because models lack the continual learning ability to accumulate knowledge constantly.

Continual Learning Continual Pretraining

CLE Diffusion: Controllable Light Enhancement Diffusion Model

no code implementations 13 Aug 2023 Yuyang Yin, Dejia Xu, Chuangchuang Tan, Ping Liu, Yao Zhao, Yunchao Wei

Low light enhancement has gained increasing importance with the rapid development of visual creation and editing.

Low-Light Image Enhancement

Disentangled Pre-training for Image Matting

1 code implementation 3 Apr 2023 Yanda Li, Zilong Huang, Gang Yu, Ling Chen, Yunchao Wei, Jianbo Jiao

The pre-training task is designed in a similar manner as image matting, where random trimap and alpha matte are generated to achieve an image disentanglement objective.

Disentanglement Image Matting

Progressive Semantic-Visual Mutual Adaption for Generalized Zero-Shot Learning

1 code implementation CVPR 2023 Man Liu, Feng Li, Chunjie Zhang, Yunchao Wei, Huihui Bai, Yao Zhao

Generalized Zero-Shot Learning (GZSL) identifies unseen categories by knowledge transferred from the seen domain, relying on the intrinsic interactions between visual and semantic information.

Attribute Decoder +1

Global Knowledge Calibration for Fast Open-Vocabulary Segmentation

1 code implementation ICCV 2023 Kunyang Han, Yong Liu, Jun Hao Liew, Henghui Ding, Yunchao Wei, Jiajun Liu, Yitong Wang, Yansong Tang, Yujiu Yang, Jiashi Feng, Yao Zhao

Recent advancements in pre-trained vision-language models, such as CLIP, have enabled the segmentation of arbitrary concepts solely from textual inputs, a process commonly referred to as open-vocabulary semantic segmentation (OVS).

Knowledge Distillation Open Vocabulary Semantic Segmentation +4

Learning To Segment Every Referring Object Point by Point

1 code implementation CVPR 2023 Mengxue Qu, Yu Wu, Yunchao Wei, Wu Liu, Xiaodan Liang, Yao Zhao

Extensive experiments show that our model achieves 52.06% in terms of accuracy (versus 58.93% in fully supervised setting) on RefCOCO+@testA, when only using 1% of the mask annotations.

Object Referring Expression +1

Adversarially Masking Synthetic To Mimic Real: Adaptive Noise Injection for Point Cloud Segmentation Adaptation

no code implementations CVPR 2023 Guangrui Li, Guoliang Kang, Xiaohan Wang, Yunchao Wei, Yi Yang

With the help of adversarial training, the masking module can learn to generate source masks to mimic the pattern of irregular target noise, thereby narrowing the domain gap.

Point Cloud Segmentation Semantic Segmentation

CTP: Towards Vision-Language Continual Pretraining via Compatible Momentum Contrast and Topology Preservation

1 code implementation ICCV 2023 Hongguang Zhu, Yunchao Wei, Xiaodan Liang, Chunjie Zhang, Yao Zhao

Regarding the growing nature of real-world data, such an offline training paradigm on ever-expanding data is unsustainable, because models lack the continual learning ability to accumulate knowledge constantly.

Continual Learning Continual Pretraining

Mask Matching Transformer for Few-Shot Segmentation

1 code implementation 5 Dec 2022 Siyu Jiao, Gengwei Zhang, Shant Navasardyan, Ling Chen, Yao Zhao, Yunchao Wei, Humphrey Shi

Typical methods follow the paradigm to firstly learn prototypical features from support images and then match query features in pixel-level to obtain segmentation results.

Few-Shot Semantic Segmentation Segmentation

Mining Unseen Classes via Regional Objectness: A Simple Baseline for Incremental Segmentation

1 code implementation 13 Nov 2022 Zekang Zhang, Guangyu Gao, Zhiyuan Fang, Jianbo Jiao, Yunchao Wei

Our MicroSeg is based on the assumption that background regions with strong objectness possibly belong to those concepts in the historical or future stages.

Class-Incremental Semantic Segmentation Continual Learning +1

VMFormer: End-to-End Video Matting with Transformer

1 code implementation 26 Aug 2022 Jiachen Li, Vidit Goel, Marianna Ohanyan, Shant Navasardyan, Yunchao Wei, Humphrey Shi

In this paper, we propose VMFormer: a transformer-based end-to-end method for video matting.

Decoder Video Matting

Instance As Identity: A Generic Online Paradigm for Video Instance Segmentation

1 code implementation 5 Aug 2022 Feng Zhu, Zongxin Yang, Xin Yu, Yi Yang, Yunchao Wei

In this work, we propose a new online VIS paradigm named Instance As Identity (IAI), which models temporal information for both detection and tracking in an efficient way.

Instance Segmentation Semantic Segmentation +1

SiRi: A Simple Selective Retraining Mechanism for Transformer-based Visual Grounding

1 code implementation 27 Jul 2022 Mengxue Qu, Yu Wu, Wu Liu, Qiqi Gong, Xiaodan Liang, Olga Russakovsky, Yao Zhao, Yunchao Wei

Particularly, SiRi conveys a significant principle to the research of visual grounding, i.e., a better initialized vision-language encoder would help the model converge to a better local minimum, advancing the performance accordingly.

Visual Grounding

Entity-Graph Enhanced Cross-Modal Pretraining for Instance-level Product Retrieval

no code implementations 17 Jun 2022 Xiao Dong, Xunlin Zhan, Yunchao Wei, XiaoYong Wei, YaoWei Wang, Minlong Lu, Xiaochun Cao, Xiaodan Liang

Our goal in this research is to study a more realistic environment in which we can conduct weakly-supervised multi-modal instance-level product retrieval for fine-grained product categories.

Retrieval

FisheyeEX: Polar Outpainting for Extending the FoV of Fisheye Lens

1 code implementation 12 Jun 2022 Kang Liao, Chunyu Lin, Yunchao Wei, Yao Zhao

For the distortion synthesis, we propose a spiral distortion-aware perception module, in which the learning path keeps consistent with the distortion prior of the fisheye image.

Image Outpainting

Cylin-Painting: Seamless 360° Panoramic Image Outpainting and Beyond

1 code implementation 18 Apr 2022 Kang Liao, Xiangyu Xu, Chunyu Lin, Wenqi Ren, Yunchao Wei, Yao Zhao

Motivated by this analysis, we present a Cylin-Painting framework that involves meaningful collaborations between inpainting and outpainting and efficiently fuses the different arrangements, with a view to leveraging their complementary benefits on a seamless cylinder.

Depth Estimation Image Outpainting +3

L2G: A Simple Local-to-Global Knowledge Transfer Framework for Weakly Supervised Semantic Segmentation

1 code implementation CVPR 2022 Peng-Tao Jiang, YuQi Yang, Qibin Hou, Yunchao Wei

Our framework conducts the global network to learn the captured rich object detail knowledge from a global view and thereby produces high-quality attention maps that can be directly used as pseudo annotations for semantic segmentation networks.

Object Transfer Learning +2

Scalable Video Object Segmentation with Identification Mechanism

2 code implementations 22 Mar 2022 Zongxin Yang, Jiaxu Miao, Yunchao Wei, Wenguan Wang, Xiaohan Wang, Yi Yang

This paper delves into the challenges of achieving scalable and effective multi-object modeling for semi-supervised Video Object Segmentation (VOS).

Object Segmentation +3

Large-Scale Video Panoptic Segmentation in the Wild: A Benchmark

1 code implementation CVPR 2022 Jiaxu Miao, Xiaohan Wang, Yu Wu, Wei Li, Xu Zhang, Yunchao Wei, Yi Yang

In contrast, our large-scale VIdeo Panoptic Segmentation in the Wild (VIPSeg) dataset provides 3,536 videos and 84,750 frames with pixel-level panoptic annotations, covering a wide range of real-world scenarios and categories.

Segmentation Video Panoptic Segmentation

Clicking Matters: Towards Interactive Human Parsing

no code implementations 11 Nov 2021 Yutong Gao, Liqian Liang, Congyan Lang, Songhe Feng, Yidong Li, Yunchao Wei

In this work, we focus on Interactive Human Parsing (IHP), which aims to segment a human image into multiple human body parts with guidance from users' interactions.

Human Parsing Image Segmentation +1

M5Product: Self-harmonized Contrastive Learning for E-commercial Multi-modal Pretraining

no code implementations CVPR 2022 Xiao Dong, Xunlin Zhan, Yangxin Wu, Yunchao Wei, Michael C. Kampffmeyer, XiaoYong Wei, Minlong Lu, YaoWei Wang, Xiaodan Liang

Despite the potential of multi-modal pre-training to learn highly discriminative feature representations from complementary data modalities, current progress is being slowed by the lack of large-scale modality-diverse datasets.

Contrastive Learning

Understanding and Accelerating Neural Architecture Search with Training-Free and Theory-Grounded Metrics

1 code implementation 26 Aug 2021 Wuyang Chen, Xinyu Gong, Junru Wu, Yunchao Wei, Humphrey Shi, Zhicheng Yan, Yi Yang, Zhangyang Wang

This work targets designing a principled and unified training-free framework for Neural Architecture Search (NAS), with high performance, low cost, and in-depth interpretation.

Neural Architecture Search

Generating Superpixels for High-resolution Images with Decoupled Patch Calibration

no code implementations 19 Aug 2021 Yaxiong Wang, Yunchao Wei, Xueming Qian, Li Zhu, Yi Yang

Superpixel segmentation has recently seen important progress benefiting from the advances in differentiable deep learning.

Segmentation Superpixels +1

Product1M: Towards Weakly Supervised Instance-Level Product Retrieval via Cross-modal Pretraining

1 code implementation ICCV 2021 Xunlin Zhan, Yangxin Wu, Xiao Dong, Yunchao Wei, Minlong Lu, Yichi Zhang, Hang Xu, Xiaodan Liang

In this paper, we investigate a more realistic setting that aims to perform weakly-supervised multi-modal instance-level product retrieval among fine-grained product categories.

Retrieval

LayerCAM: Exploring Hierarchical Class Activation Maps for Localization

3 code implementations IEEE 2021 Peng-Tao Jiang, Chang-Bin Zhang, Qibin Hou, Ming-Ming Cheng, Yunchao Wei

To evaluate the quality of the class activation maps produced by LayerCAM, we apply them to weakly-supervised object localization and semantic segmentation.

Object Semantic Segmentation +1

ReGO: Reference-Guided Outpainting for Scenery Image

1 code implementation 20 Jun 2021 Yaxiong Wang, Yunchao Wei, Xueming Qian, Li Zhu, Yi Yang

We aim to tackle the challenging yet practical scenery image outpainting task in this work.

Image Outpainting

Automated Deepfake Detection

no code implementations 20 Jun 2021 Ping Liu, Yuewei Lin, Yang He, Yunchao Wei, Liangli Zhen, Joey Tianyi Zhou, Rick Siow Mong Goh, Jingen Liu

In this paper, we propose to utilize Automated Machine Learning to adaptively search a neural architecture for deepfake detection.

BIG-bench Machine Learning DeepFake Detection +1

VSPW: A Large-scale Dataset for Video Scene Parsing in the Wild

no code implementations CVPR 2021 Jiaxu Miao, Yunchao Wei, Yu Wu, Chen Liang, Guangrui Li, Yi Yang

To the best of our knowledge, our VSPW is the first attempt to tackle the challenging video scene parsing task in the wild by considering diverse scenarios.

4k Scene Parsing

Affinity Attention Graph Neural Network for Weakly Supervised Semantic Segmentation

1 code implementation 8 Jun 2021 Bingfeng Zhang, Jimin Xiao, Jianbo Jiao, Yunchao Wei, Yao Zhao

More importantly, our approach can be readily applied to bounding box supervised instance segmentation task or other weakly supervised semantic segmentation tasks, with state-of-the-art or comparable performance among almost all weakly supervised tasks on PASCAL VOC or COCO dataset.

Box-supervised Instance Segmentation Graph Neural Network +4

Domain Consensus Clustering for Universal Domain Adaptation

1 code implementation CVPR 2021 Guangrui Li, Guoliang Kang, Yi Zhu, Yunchao Wei, Yi Yang

To better exploit the intrinsic structure of the target domain, we propose Domain Consensus Clustering (DCC), which exploits the domain consensus knowledge to discover discriminative clusters on both common samples and private ones.

Clustering domain classification +3

Associating Objects with Transformers for Video Object Segmentation

2 code implementations NeurIPS 2021 Zongxin Yang, Yunchao Wei, Yi Yang

The state-of-the-art methods learn to decode features with a single positive object and thus have to match and segment each target separately under multi-object scenarios, consuming multiple times computing resources.

Ranked #2 on Video Object Segmentation on DAVIS 2017 (test-dev) (using extra training data)

Object One-shot visual object segmentation +2

Cross-Modal Progressive Comprehension for Referring Segmentation

1 code implementation 15 May 2021 Si Liu, Tianrui Hui, Shaofei Huang, Yunchao Wei, Bo Li, Guanbin Li

In this paper, we propose a Cross-Modal Progressive Comprehension (CMPC) scheme to effectively mimic human behaviors and implement it as a CMPC-I (Image) module and a CMPC-V (Video) module to improve referring image and video segmentation models.

Attribute Image Segmentation +5

Decoupled Spatial Temporal Graphs for Generic Visual Grounding

no code implementations 18 Mar 2021 Qianyu Feng, Yunchao Wei, MingMing Cheng, Yi Yang

Visual grounding is a long-lasting problem in vision-language understanding due to its diversity and complexity.

Contrastive Learning Visual Grounding

AINet: Association Implantation for Superpixel Segmentation

no code implementations ICCV 2021 Yaxiong Wang, Yunchao Wei, Xueming Qian, Li Zhu, Yi Yang

However, simply applying a series of convolution operations with limited receptive fields can only implicitly perceive the relations between the pixel and its surrounding grids.

Segmentation

Towards Complete Scene and Regular Shape for Distortion Rectification by Curve-Aware Extrapolation

no code implementations ICCV 2021 Kang Liao, Chunyu Lin, Yunchao Wei, Feng Li, Shangrong Yang, Yao Zhao

To our knowledge, we are the first to tackle the challenging rectification via outpainting, and our curve-aware strategy can reach a rectification construction with complete content and regular shape.

Consistent Structural Relation Learning for Zero-Shot Segmentation

no code implementations NeurIPS 2020 Peike Li, Yunchao Wei, Yi Yang

Concretely, by exploring the pair-wise and list-wise structures, we impose the relations of generated visual features to be consistent with their counterparts in the semantic word embedding space.

Relation Semantic Segmentation +3

Delving Deep into Label Smoothing

2 code implementations 25 Nov 2020 Chang-Bin Zhang, Peng-Tao Jiang, Qibin Hou, Yunchao Wei, Qi Han, Zhen Li, Ming-Ming Cheng

Experiments demonstrate that based on the same classification models, the proposed approach can effectively improve the classification performance on CIFAR-100, ImageNet, and fine-grained datasets.

Classification General Classification

Referring Image Segmentation via Cross-Modal Progressive Comprehension

1 code implementation CVPR 2020 Shaofei Huang, Tianrui Hui, Si Liu, Guanbin Li, Yunchao Wei, Jizhong Han, Luoqi Liu, Bo Li

In addition to the CMPC module, we further leverage a simple yet effective TGFE module to integrate the reasoned multimodal features from different levels with the guidance of textual information.

Attribute Image Segmentation +2

DONet: Dual Objective Networks for Skin Lesion Segmentation

no code implementations 19 Aug 2020 Yaxiong Wang, Yunchao Wei, Xueming Qian, Li Zhu, Yi Yang

Skin lesion segmentation is a crucial step in the computer-aided diagnosis of dermoscopic images.

Lesion Segmentation Segmentation +2

Inter-Image Communication for Weakly Supervised Localization

1 code implementation ECCV 2020 Xiaolin Zhang, Yunchao Wei, Yi Yang

We learn a feature center for each category and realize the global feature consistency by forcing the object features to approach class-specific centers.

Object
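
The class-specific feature centers described in the summary above can be sketched as follows. This is an illustrative simplification, not the paper's method: the centers here are plain per-class means rather than learned parameters, and the names `class_centers` and `center_loss` are hypothetical.

```python
def class_centers(features, labels):
    """Return {label: mean feature vector} (assumed stand-in for learned centers)."""
    sums, counts = {}, {}
    for feat, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(feat)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], feat)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def center_loss(features, labels, centers):
    """Mean squared distance between each feature and its class center."""
    total = 0.0
    for feat, label in zip(features, labels):
        total += sum((v - c) ** 2 for v, c in zip(feat, centers[label]))
    return total / len(features)


# Usage: two samples of class 0 and one of class 1.
feats = [[0.0, 0.0], [2.0, 2.0], [5.0, 5.0]]
labels = [0, 0, 1]
centers = class_centers(feats, labels)
loss = center_loss(feats, labels, centers)
```

Minimizing a loss of this shape pulls object features of the same category toward a shared center, which is the consistency effect the summary refers to.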

Sketch-Guided Scenery Image Outpainting

no code implementations 17 Jun 2020 Yaxiong Wang, Yunchao Wei, Xueming Qian, Li Zhu, Yi Yang

In this work, we take the image outpainting one step forward by allowing users to harvest personal custom outpainting results using sketches as the guidance.

Decoder Image Outpainting

Omni-supervised Facial Expression Recognition via Distilled Data

no code implementations 18 May 2020 Ping Liu, Yunchao Wei, Zibo Meng, Weihong Deng, Joey Tianyi Zhou, Yi Yang

However, the performance of the current state-of-the-art facial expression recognition (FER) approaches is directly related to the labeled data for training.

Dataset Distillation Facial Expression Recognition +1

Referring Image Segmentation by Generative Adversarial Learning

no code implementations IEEE 2020 Shuang Qiu, Yao Zhao, Jianbo Jiao, Yunchao Wei, Shikui Wei

To this end, we propose to train the referring image segmentation model in a generative adversarial fashion, which well addresses the distribution similarity problem.

Image Segmentation Referring Expression +4

VehicleNet: Learning Robust Visual Representation for Vehicle Re-identification

3 code implementations 14 Apr 2020 Zhedong Zheng, Tao Ruan, Yunchao Wei, Yi Yang, Tao Mei

This stage relaxes the full alignment between the training and testing domains, as it is agnostic to the target vehicle domain.

Representation Learning Vehicle Re-Identification

Memory Aggregation Networks for Efficient Interactive Video Object Segmentation

no code implementations CVPR 2020 Jiaxu Miao, Yunchao Wei, Yi Yang

Interactive video object segmentation (iVOS) aims at efficiently harvesting high-quality segmentation masks of the target object in a video with user interactions.

Interactive Video Object Segmentation Object +2

Laplacian Denoising Autoencoder

no code implementations 30 Mar 2020 Jianbo Jiao, Linchao Bao, Yunchao Wei, Shengfeng He, Honghui Shi, Rynson Lau, Thomas S. Huang

This can be naturally generalized to span multiple scales with a Laplacian pyramid representation of the input data.

Denoising Self-Supervised Learning

Differential Treatment for Stuff and Things: A Simple Unsupervised Domain Adaptation Method for Semantic Segmentation

1 code implementation CVPR 2020 Zhonghao Wang, Mo Yu, Yunchao Wei, Rogerio Feris, JinJun Xiong, Wen-mei Hwu, Thomas S. Huang, Humphrey Shi

We consider the problem of unsupervised domain adaptation for semantic segmentation by easing the domain shift between the source domain (synthetic data) and the target domain (real data) in this work.

Semantic Segmentation Unsupervised Domain Adaptation

University-1652: A Multi-view Multi-source Benchmark for Drone-based Geo-localization

3 code implementations 27 Feb 2020 Zhedong Zheng, Yunchao Wei, Yi Yang

To our knowledge, University-1652 is the first drone-based geo-localization dataset and enables two new tasks, i.e., drone-view target localization and drone navigation.

Drone navigation Drone-view target localization +3

AlignSeg: Feature-Aligned Segmentation Networks

1 code implementation 24 Feb 2020 Zilong Huang, Yunchao Wei, Xinggang Wang, Wenyu Liu, Thomas S. Huang, Humphrey Shi

Aggregating features in terms of different convolutional blocks or contextual embeddings has been proven to be an effective way to strengthen feature representations for semantic segmentation.

Segmentation Semantic Segmentation

Reliability Does Matter: An End-to-End Weakly Supervised Semantic Segmentation Approach

1 code implementation 19 Nov 2019 Bingfeng Zhang, Jimin Xiao, Yunchao Wei, Ming-Jie Sun, Kai-Zhu Huang

Such reliable regions then directly serve as ground-truth labels for the parallel segmentation branch, where a newly designed dense energy loss function is adopted for optimization.

Image Classification Segmentation +2

Self-Correction for Human Parsing

2 code implementations 22 Oct 2019 Peike Li, Yunqiu Xu, Yunchao Wei, Yi Yang

To tackle the problem of learning with label noise, this work introduces a purification strategy, called Self-Correction for Human Parsing (SCHP), to progressively promote the reliability of the supervised labels as well as the learned models.

Human Parsing Human Part Segmentation +1

SPGNet: Semantic Prediction Guidance for Scene Parsing

no code implementations ICCV 2019 Bowen Cheng, Liang-Chieh Chen, Yunchao Wei, Yukun Zhu, Zilong Huang, JinJun Xiong, Thomas Huang, Wen-mei Hwu, Honghui Shi

The multi-scale context module refers to the operations to aggregate feature responses from a large spatial extent, while the single-stage encoder-decoder structure encodes the high-level semantic information in the encoder path and recovers the boundary information in the decoder path.

Decoder Pose Estimation +3

CCNet: Criss-Cross Attention for Semantic Segmentation

4 code implementations ICCV 2019 Zilong Huang, Xinggang Wang, Yunchao Wei, Lichao Huang, Humphrey Shi, Wenyu Liu, Thomas S. Huang

Compared with the non-local block, the proposed recurrent criss-cross attention module requires 11x less GPU memory.

Ranked #7 on Semantic Segmentation on FoodSeg103 (using extra training data)

Computational Efficiency Human Parsing +8

Self-similarity Grouping: A Simple Unsupervised Cross Domain Adaptation Approach for Person Re-identification

1 code implementation ICCV 2019 Yang Fu, Yunchao Wei, Guanshuo Wang, Yuqian Zhou, Honghui Shi, Thomas Huang

Building upon our SSG, we further introduce a clustering-guided semi-supervised approach named SSG++ to conduct one-shot domain adaptation in an open-set setting (i.e., the number of independent identities from the target domain is unknown).

Clustering One-Shot Learning +2

A Simple Non-i.i.d. Sampling Approach for Efficient Training and Better Generalization

no code implementations 23 Nov 2018 Bowen Cheng, Yunchao Wei, Jiahui Yu, Shiyu Chang, JinJun Xiong, Wen-mei Hwu, Thomas S. Huang, Humphrey Shi

While training on samples drawn from independent and identical distribution has been a de facto paradigm for optimizing image classification networks, humans learn new concepts in an easy-to-hard manner and on the selected examples progressively.

General Classification Image Classification +6

STA: Spatial-Temporal Attention for Large-Scale Video-based Person Re-Identification

no code implementations 9 Nov 2018 Yang Fu, Xiaoyang Wang, Yunchao Wei, Thomas Huang

Thus, a more robust clip-level feature representation can be generated according to a weighted sum operation guided by the mined 2-D attention score matrix.

Large-Scale Person Re-Identification Video-Based Person Re-Identification

Self-Erasing Network for Integral Object Attention

no code implementations NeurIPS 2018 Qibin Hou, Peng-Tao Jiang, Yunchao Wei, Ming-Ming Cheng

To test the quality of the generated attention maps, we employ the mined object regions as heuristic cues for learning semantic segmentation models.

Object Semantic Segmentation

SG-One: Similarity Guidance Network for One-Shot Semantic Segmentation

1 code implementation 22 Oct 2018 Xiaolin Zhang, Yunchao Wei, Yi Yang, Thomas Huang

In this way, the possibilities embedded in the produced similarity maps can be adapted to guide the process of segmenting objects.

Few-Shot Semantic Segmentation Segmentation +1

TS2C: Tight Box Mining with Surrounding Segmentation Context for Weakly Supervised Object Detection

no code implementations ECCV 2018 Yunchao Wei, Zhiqiang Shen, Bowen Cheng, Honghui Shi, JinJun Xiong, Jiashi Feng, Thomas Huang

This work provides a simple approach to discover tight object bounding boxes with only image-level supervision, called Tight box mining with Surrounding Segmentation Context (TS2C).

Multiple Instance Learning Object +4

Adversarial Complementary Learning for Weakly Supervised Object Localization

2 code implementations CVPR 2018 Xiaolin Zhang, Yunchao Wei, Jiashi Feng, Yi Yang, Thomas Huang

With such adversarial learning, the two parallel classifiers are forced to leverage complementary object regions for classification and can finally generate integral object localization together.

General Classification Object +1

Horizontal Pyramid Matching for Person Re-identification

1 code implementation 14 Apr 2018 Yang Fu, Yunchao Wei, Yuqian Zhou, Honghui Shi, Gao Huang, Xinchao Wang, Zhiqiang Yao, Thomas Huang

Despite the remarkable recent progress, person re-identification (Re-ID) approaches are still suffering from the failure cases where the discriminative body parts are missing.

Person Re-Identification

Left-Right Comparative Recurrent Model for Stereo Matching

no code implementations CVPR 2018 Zequn Jie, Pengfei Wang, Yonggen Ling, Bo Zhao, Yunchao Wei, Jiashi Feng, Wei Liu

Left-right consistency check is an effective way to enhance the disparity estimation by referring to the information from the opposite view.

Disparity Estimation Stereo Matching +1

Revisiting RCNN: On Awakening the Classification Power of Faster RCNN

7 code implementations ECCV 2018 Bowen Cheng, Yunchao Wei, Honghui Shi, Rogerio Feris, JinJun Xiong, Thomas Huang

Recent region-based object detectors are usually built with separate classification and localization branches on top of shared feature extraction networks.

Classification General Classification +1

Transferable Semi-supervised Semantic Segmentation

no code implementations 18 Nov 2017 Huaxin Xiao, Yunchao Wei, Yu Liu, Maojun Zhang, Jiashi Feng

The performance of deep learning based semantic segmentation models heavily depends on sufficient data with careful annotations.

Segmentation Semi-Supervised Semantic Segmentation

Learning to Segment Human by Watching YouTube

no code implementations 4 Oct 2017 Xiaodan Liang, Yunchao Wei, Liang Lin, Yunpeng Chen, Xiaohui Shen, Jianchao Yang, Shuicheng Yan

An intuition on human segmentation is that when a human is moving in a video, the video-context (e.g., appearance and motion clues) may potentially infer reasonable mask information for the whole human body.

Human Detection Segmentation +5

Regional Interactive Image Segmentation Networks

no code implementations ICCV 2017 Jun Hao Liew, Yunchao Wei, Wei Xiong, Sim-Heng Ong, Jiashi Feng

The interactive image segmentation model allows users to iteratively add new inputs for refinement until a satisfactory result is finally obtained.

Ranked #10 on Interactive Segmentation on SBD (NoC@85 metric)

Image Segmentation Interactive Segmentation +2

Self-explanatory Deep Salient Object Detection

no code implementations 18 Aug 2017 Huaxin Xiao, Jiashi Feng, Yunchao Wei, Maojun Zhang

By visualizing the differences, we can interpret the capability of different deep neural network based saliency detection models and demonstrate that our proposed model indeed uses a more reasonable structure for salient object detection.

Object object-detection +3

Perceptual Generative Adversarial Networks for Small Object Detection

no code implementations CVPR 2017 Jianan Li, Xiaodan Liang, Yunchao Wei, Tingfa Xu, Jiashi Feng, Shuicheng Yan

In this work, we address the small object detection problem by developing a single architecture that internally lifts representations of small objects to "super-resolved" ones, achieving characteristics similar to those of large objects and thus becoming more discriminative for detection.

Generative Adversarial Network Object +2

Multiple-Human Parsing in the Wild

2 code implementations 19 May 2017 Jianshu Li, Jian Zhao, Yunchao Wei, Congyan Lang, Yidong Li, Terence Sim, Shuicheng Yan, Jiashi Feng

To address the multi-human parsing problem, we introduce a new multi-human parsing (MHP) dataset and a novel multi-human parsing model named MH-Parser.

Multi-Human Parsing

IAN: The Individual Aggregation Network for Person Search

no code implementations 16 May 2017 Jimin Xiao, Yanchun Xie, Tammam Tillo, Kai-Zhu Huang, Yunchao Wei, Jiashi Feng

In addition, to relieve the negative effect caused by varying visual appearances of the same individual, IAN introduces a novel center loss that can increase the intra-class compactness of feature representations.

object-detection Object Detection +1

Deep Self-Taught Learning for Weakly Supervised Object Localization

no code implementations CVPR 2017 Zequn Jie, Yunchao Wei, Xiaojie Jin, Jiashi Feng, Wei Liu

To overcome this issue, we propose a deep self-taught learning approach, which makes the detector learn object-level features that are reliable for acquiring tight positive samples and afterwards re-train itself based on them.

Object Weakly Supervised Object Detection +1

Object Region Mining with Adversarial Erasing: A Simple Classification to Semantic Segmentation Approach

no code implementations CVPR 2017 Yunchao Wei, Jiashi Feng, Xiaodan Liang, Ming-Ming Cheng, Yao Zhao, Shuicheng Yan

We investigate a principled way to progressively mine discriminative object regions using classification networks to address the weakly-supervised semantic segmentation problem.

Classification General Classification +4

Bottom-Up Top-Down Cues for Weakly-Supervised Semantic Segmentation

no code implementations 7 Dec 2016 Qinbin Hou, Puneet Kumar Dokania, Daniela Massiceti, Yunchao Wei, Ming-Ming Cheng, Philip Torr

We focus on the following three aspects of EM: (i) initialization; (ii) latent posterior estimation (E-step) and (iii) the parameter update (M-step).

Weakly supervised Semantic Segmentation Weakly-Supervised Semantic Segmentation