no code implementations • 22 Dec 2024 • Bin Xia, Yuechen Zhang, Jingyao Li, Chengyao Wang, Yitong Wang, Xinglong Wu, Bei Yu, Jiaya Jia
We begin by analyzing existing frameworks and the requirements of downstream tasks, proposing a unified framework that integrates both T2I models and various editing tasks.
no code implementations • 11 Dec 2024 • Yang Li, Xinyu Zhou, Yitong Wang, Liangxin Qian, Jun Zhao
Transformer models have revolutionized AI, enabling applications like content generation and sentiment analysis.
no code implementations • 5 Dec 2024 • Hui Zhang, Dexiang Hong, Tingwei Gao, Yitong Wang, Jie Shao, Xinglong Wu, Zuxuan Wu, Yu-Gang Jiang
To inherit the advantages of MM-DiT, we use a separate set of network weights to process the layout, treating it as equally important as the image and text modalities.
no code implementations • 25 Nov 2024 • Yitong Wang, Xudong Xu, Li Ma, Haoran Wang, Bo Dai
By analyzing the components of PBR materials, we choose to consider albedo, roughness, metalness, and bump maps.
1 code implementation • 26 Jun 2024 • Yang Liu, Yitong Wang, Chenyue Feng
UniRec leverages sequence uniformity and item frequency to enhance performance, particularly improving the representation of non-uniform sequences and less-frequent items.
no code implementations • 7 Apr 2024 • Yuxi Ren, Jie Wu, Yanzuo Lu, Huafeng Kuang, Xin Xia, Xionghui Wang, Qianqian Wang, Yixing Zhu, Pan Xie, Shiyin Wang, Xuefeng Xiao, Yitong Wang, Min Zheng, Lean Fu
Recent advancements in diffusion-based generative image editing have sparked a profound revolution, reshaping the landscape of image outpainting and inpainting tasks.
1 code implementation • 1 Jan 2024 • Zhuoyan Luo, Yicheng Xiao, Yong Liu, Yitong Wang, Yansong Tang, Xiu Li, Yujiu Yang
Recent transformer-based models have dominated the Referring Video Object Segmentation (RVOS) task due to their superior performance.
no code implementations • 11 Dec 2023 • Yitong Wang, Chang Liu, Jun Zhao
In pursuit of enhancing the accessibility of AIGC services, the deployment of AIGC models (e.g., diffusion models) to edge servers and local devices has become a prevailing trend.
2 code implementations • CVPR 2024 • Yong Liu, Sule Bai, Guanbin Li, Yitong Wang, Yansong Tang
We attribute this to the in-vocabulary embedding and domain-biased CLIP prediction.
2 code implementations • CVPR 2024 • Yong Liu, Cairong Zhang, Yitong Wang, Jiahao Wang, Yujiu Yang, Yansong Tang
This paper aims to achieve universal segmentation at arbitrary semantic levels.
Ranked #1 on Referring Expression Segmentation on RefCOCOg-test (using extra training data)
1 code implementation • 27 Nov 2023 • Bin Xia, Shiyin Wang, Yingfan Tao, Yitong Wang, Jiaya Jia
In the first stage, we train the MLLM to grasp the properties of image generation and editing, enabling it to generate detailed prompts.
no code implementations • ICCV 2023 • Wenpeng Xiao, Wentao Liu, Yitong Wang, Bernard Ghanem, Bing Li
Considering the complexity of hair structure, we innovatively treat hair wisp extraction as an instance segmentation problem, where a hair wisp is referred to as an instance.
no code implementations • 26 Aug 2023 • Bin Xia, Yulun Zhang, Shiyin Wang, Yitong Wang, Xinglong Wu, Yapeng Tian, Wenming Yang, Radu Timofte, Luc van Gool
Compared to traditional DMs, the compact IPR enables DiffI2I to obtain more accurate outcomes and employ a lighter denoising network and fewer iterations.
1 code implementation • NeurIPS 2023 • Zhuoyan Luo, Yicheng Xiao, Yong Liu, Shuyan Li, Yitong Wang, Yansong Tang, Xiu Li, Yujiu Yang
To address this issue, we propose Semantic-assisted Object Cluster (SOC), which aggregates video content and textual guidance for unified temporal modeling and cross-modal alignment.
Ranked #2 on Referring Expression Segmentation on A2D Sentences (using extra training data)
no code implementations • CVPR 2023 • Jie Qin, Jie Wu, Pengxiang Yan, Ming Li, Ren Yuxi, Xuefeng Xiao, Yitong Wang, Rui Wang, Shilei Wen, Xin Pan, Xingang Wang
Recently, open-vocabulary learning has emerged to accomplish segmentation for arbitrary categories of text-based descriptions, which extends segmentation systems to more general-purpose application scenarios.
Ranked #7 on Open Vocabulary Panoptic Segmentation on ADE20K
1 code implementation • ICCV 2023 • Kunyang Han, Yong Liu, Jun Hao Liew, Henghui Ding, Yunchao Wei, Jiajun Liu, Yitong Wang, Yansong Tang, Yujiu Yang, Jiashi Feng, Yao Zhao
Recent advancements in pre-trained vision-language models, such as CLIP, have enabled the segmentation of arbitrary concepts solely from textual inputs, a process commonly referred to as open-vocabulary semantic segmentation (OVS).
1 code implementation • ICCV 2023 • Bin Xia, Yulun Zhang, Shiyin Wang, Yitong Wang, Xinglong Wu, Yapeng Tian, Wenming Yang, Luc van Gool
Diffusion models (DMs) have achieved SOTA performance by modeling the image synthesis process as a sequential application of a denoising network.
1 code implementation • 30 Nov 2022 • Bin Xia, Yulun Zhang, Yitong Wang, Yapeng Tian, Wenming Yang, Radu Timofte, Luc van Gool
It consists of a knowledge distillation based implicit degradation estimator network (KD-IDE) and an efficient SR network.
1 code implementation • 11 Oct 2022 • Yong Liu, Ran Yu, Jiahao Wang, Xinyuan Zhao, Yitong Wang, Yansong Tang, Yujiu Yang
Besides, we empirically find that low-frequency features should be enhanced in the encoder (backbone), while high-frequency features should be enhanced in the decoder (segmentation head).
2 code implementations • 2 Oct 2022 • Bin Xia, Yulun Zhang, Yitong Wang, Yapeng Tian, Wenming Yang, Radu Timofte, Luc van Gool
In this study, we reconsider components in binary convolution, such as residual connection, BatchNorm, activation function, and structure, for IR tasks.
no code implementations • 28 Sep 2022 • Yitong Wang, Jun Zhao
Compared to cloud computing, MEC is a distributed infrastructure that sits closer to users, and its convergence with other emerging technologies, including the Metaverse, 6G wireless communications, artificial intelligence (AI), and blockchain, also addresses the problems of network resource allocation, increased network load, and latency requirements.
1 code implementation • CVPR 2023 • Bin Xia, Jingwen He, Yulun Zhang, Yitong Wang, Yapeng Tian, Wenming Yang, Luc van Gool
In SSL, we design pruning schemes for several key components in VSR models, including residual blocks, recurrent networks, and upsampling networks.
no code implementations • CVPR 2021 • Qiang Zhou, Shiyin Wang, Yitong Wang, Zilong Huang, Xinggang Wang
Besides, an Amodal Human Perception dataset (AHP) is collected to settle the task of human de-occlusion.
3 code implementations • 24 Feb 2021 • Bing Li, Yuanlue Zhu, Yitong Wang, Chia-Wen Lin, Bernard Ghanem, Linlin Shen
Specifically, a new generator architecture is proposed to simultaneously transfer color/texture styles and transform local facial shapes into anime-like counterparts based on the style of a reference anime-face, while preserving the global structure of the source photo-face.
no code implementations • ECCV 2018 • Yitong Wang, Dihong Gong, Zheng Zhou, Xing Ji, Hao Wang, Zhifeng Li, Wei Liu, Tong Zhang
Extensive experiments conducted on the three public domain face aging datasets (MORPH Album 2, CACD-VS and FG-NET) have shown the effectiveness of the proposed approach and the value of the constructed CAF dataset on AIFR.
Ranked #3 on Age-Invariant Face Recognition on MORPH Album2
no code implementations • ICML 2018 • Li Shen, Peng Sun, Yitong Wang, Wei Liu, Tong Zhang
Specifically, we find that a large class of primal and primal-dual operator splitting algorithms are all special cases of VMOR-HPE.
10 code implementations • CVPR 2018 • Hao Wang, Yitong Wang, Zheng Zhou, Xing Ji, Dihong Gong, Jingchao Zhou, Zhifeng Li, Wei Liu
The central task of face recognition, including face verification and identification, involves face feature discrimination.
Ranked #3 on Face Verification on YouTube Faces DB
1 code implementation • 14 Sep 2017 • Yitong Wang, Xing Ji, Zheng Zhou, Hao Wang, Zhifeng Li
Face detection has achieved great success using the region-based methods.
Ranked #2 on Face Detection on FDDB
no code implementations • 4 Jun 2017 • Hao Wang, Zhifeng Li, Xing Ji, Yitong Wang
Faster R-CNN is one of the most representative and successful methods for object detection, and has become increasingly popular in various object detection applications.