no code implementations • ECCV 2020 • Lin Liu, Jianzhuang Liu, Shanxin Yuan, Gregory Slabaugh, Aleš Leonardis, Wengang Zhou, Qi Tian
When smartphone cameras are used to take photos of digital screens, moiré patterns usually result, severely degrading photo quality.
no code implementations • ECCV 2020 • Shuo Wang, Jun Yue, Jianzhuang Liu, Qi Tian, Meng Wang
It is a challenging problem since (1) the identification process is susceptible to over-fitting with limited samples of an object, and (2) the sample imbalance between a base (known knowledge) category and a novel category can easily bias the recognition results.
no code implementations • ECCV 2020 • Jianqiao An, Yucheng Shi, Yahong Han, Meijun Sun, Qi Tian
For a certain object in an image, the relationship between its central region and the peripheral region is not well utilized in existing superpixel segmentation methods.
no code implementations • ECCV 2020 • Kunyuan Du, Ya Zhang, Haibing Guan, Qi Tian, Shenggan Cheng, James Lin
Compared with low-bit models trained directly, the proposed framework brings 0.5% to 3.4% accuracy gains to three different quantization schemes.
1 code implementation • ECCV 2020 • Xinshuai Dong, Hong Liu, Rongrong Ji, Liujuan Cao, Qixiang Ye, Jianzhuang Liu, Qi Tian
On the contrary, a discriminative classifier only models the conditional distribution of labels given inputs, but benefits from effective optimization owing to its succinct structure.
no code implementations • ECCV 2020 • Xinzhe Han, Shuhui Wang, Chi Su, Weigang Zhang, Qingming Huang, Qi Tian
In this paper, we rethink the implicit reasoning process in VQA and propose a new formulation that maximizes the log-likelihood of the joint distribution of the observed question and predicted answer.
no code implementations • 3 Dec 2024 • Weijie Kong, Qi Tian, Zijian Zhang, Rox Min, Zuozhuo Dai, Jin Zhou, Jiangfeng Xiong, Xin Li, Bo Wu, Jianwei Zhang, Kathrina Wu, Qin Lin, Aladdin Wang, Andong Wang, Bai Jiawang, Changlin Li, Duojun Huang, Fang Yang, Hao Tan, Hongmei Wang, Jacob Song, Jiawang Bai, Jianbing Wu, Jinbao Xue, Joey Wang, Junkun Yuan, Kai Wang, Mengyang Liu, Pengyu Li, Shuai Li, Weiyan Wang, Wenqing Yu, Xinchi Deng, Yanxin Long, Yi Chen, Yutao Cui, Yuanbo Peng, Zhentao Yu, Zhiyu He, Zhiyong Xu, Zixiang Zhou, Zunnan Xu, Yangyu Tao, Qinglin Lu, Songtao Liu, Daquan Zhou, Hongfa Wang, Yong Yang, Di Wang, Yuhong Liu, Jie Jiang, Caesar Zhong
In this report, we introduce HunyuanVideo, an innovative open-source video foundation model that demonstrates performance in video generation comparable to, or even surpassing, that of leading closed-source models.
no code implementations • 21 Oct 2024 • Xumeng Han, Longhui Wei, Zhiyang Dou, Zipeng Wang, Chenhui Qiang, Xin He, Yingfei Sun, Zhenjun Han, Qi Tian
Mixture-of-Experts (MoE) models embody the divide-and-conquer concept and are a promising approach for increasing model capacity, demonstrating excellent scalability across multiple domains.
1 code implementation • 2 Sep 2024 • Qihua Chen, Yue Ma, Hongfa Wang, Junkun Yuan, Wenzhe Zhao, Qi Tian, Hongmei Wang, Shaobo Min, Qifeng Chen, Wei Liu
Coupling these two designs enables us to generate higher-resolution outpainting videos with rich content while keeping spatial and temporal consistency.
no code implementations • 23 Jul 2024 • Pengfei Chen, Lingxi Xie, Xinyue Huo, Xuehui Yu, Xiaopeng Zhang, Yingfei Sun, Zhenjun Han, Qi Tian
The Segment Anything model (SAM) has shown a generalized ability to group image pixels into patches, but applying it to semantic-aware segmentation still faces major challenges.
no code implementations • 5 Jul 2024 • Shengxiang Ji, Guanjun Wu, Jiemin Fang, Jiazhong Cen, Taoran Yi, Wenyu Liu, Qi Tian, Xinggang Wang
However, there is a dearth of research focusing on segmentation within 4D representations.
no code implementations • 2 Jul 2024 • Haoru Wang, Wentao Zhu, Luyi Miao, Yishu Xu, Feng Gao, Qi Tian, Yizhou Wang
Human motion generation is a critical task with a wide range of applications.
no code implementations • 26 Jun 2024 • Taoran Yi, Jiemin Fang, Zanwei Zhou, Junjie Wang, Guanjun Wu, Lingxi Xie, Xiaopeng Zhang, Wenyu Liu, Xinggang Wang, Qi Tian
The main idea is to bind Gaussians to reasonable geometry, which evolves over the whole generation process.
no code implementations • 25 Jun 2024 • Lin Liu, Quande Liu, Shengju Qian, Yuan Zhou, Wengang Zhou, Houqiang Li, Lingxi Xie, Qi Tian
Video generation is a challenging yet pivotal task in various industries, such as gaming, e-commerce, and advertising.
1 code implementation • CVPR 2024 • Zehong Ma, Shiliang Zhang, Longhui Wei, Qi Tian
Existing works have proposed different methods to embed category cues into the model, e.g., through few-shot fine-tuning, providing category names or textual descriptions to Vision-Language Models.
Ranked #6 on Open Vocabulary Object Detection on LVIS v1.0 (using extra training data)
no code implementations • 5 Jun 2024 • Jingyun Xue, Hongfa Wang, Qi Tian, Yue Ma, Andong Wang, Zhiyuan Zhao, Shaobo Min, Wenzhe Zhao, Kaihao Zhang, Heung-Yeung Shum, Wei Liu, Mengyang Liu, Wenhan Luo
While existing character image animation methods using pose sequences and reference images have shown promising performance, they tend to struggle with incoherent animation in complex scenarios, such as multiple character animation and body occlusion.
no code implementations • 15 May 2024 • Qianru Zhang, Haixin Wang, Cheng Long, Liangcai Su, Xingwei He, Jianlong Chang, Tailin Wu, Hongzhi Yin, Siu-Ming Yiu, Qi Tian, Christian S. Jensen
By integrating generative techniques and providing a standardized framework, the paper contributes to advancing the field and encourages researchers to explore the vast potential of generative techniques in spatial-temporal data mining.
no code implementations • 8 Apr 2024 • Jiannan Ge, Lingxi Xie, Hongtao Xie, Pandeng Li, Xiaopeng Zhang, Yongdong Zhang, Qi Tian
(1) Mutually-Refined Proposal Extraction.
1 code implementation • CVPR 2024 • Zhicai Wang, Longhui Wei, Tan Wang, Heyu Chen, Yanbin Hao, Xiang Wang, Xiangnan He, Qi Tian
Text-to-image (T2I) generative models have recently emerged as a powerful tool, enabling the creation of photo-realistic images and giving rise to a multitude of applications.
no code implementations • 27 Mar 2024 • Haitao Li, Qingyao Ai, Jia Chen, Qian Dong, Zhijing Wu, Yiqun Liu, Chong Chen, Qi Tian
However, general LLMs, which are developed on open-domain data, may lack the domain-specific knowledge essential for tasks in vertical domains such as law and medicine.
no code implementations • 27 Mar 2024 • Haitao Li, Qingyao Ai, Xinyan Han, Jia Chen, Qian Dong, Yiqun Liu, Chong Chen, Qi Tian
Most of the existing works focus on improving the representation ability for the contextualized embedding of the [CLS] token and calculate relevance using textual semantic similarity.
2 code implementations • 15 Feb 2024 • Chen Yang, Sikuang Li, Jiemin Fang, Ruofan Liang, Lingxi Xie, Xiaopeng Zhang, Wei Shen, Qi Tian
Then we construct a Gaussian repair model based on diffusion models to supplement the omitted object information, where Gaussians are further refined.
1 code implementation • 25 Jan 2024 • Sihang Li, Zhiyuan Liu, Yanchen Luo, Xiang Wang, Xiangnan He, Kenji Kawaguchi, Tat-Seng Chua, Qi Tian
Through 3D molecule-text alignment and 3D molecule-centric instruction tuning, 3D-MoLM establishes an integration of 3D molecular encoder and LM.
1 code implementation • 24 Jan 2024 • Yunjie Tian, Tianren Ma, Lingxi Xie, Jihao Qiu, Xi Tang, Yuan Zhang, Jianbin Jiao, Qi Tian, Qixiang Ye
In this study, we establish a baseline for a new task named multimodal multi-round referring and grounding (MRG), opening up a promising direction for instance-level multimodal dialogues.
1 code implementation • 12 Jan 2024 • Bowen Shi, Peisen Zhao, Zichen Wang, Yuhang Zhang, Yaoming Wang, Jin Li, Wenrui Dai, Junni Zou, Hongkai Xiong, Qi Tian, Xiaopeng Zhang
Vision-language foundation models, represented by Contrastive Language-Image Pre-training (CLIP), have gained increasing attention for jointly understanding both vision and textual tasks.
no code implementations • 12 Jan 2024 • Chang Yu, Junran Peng, Xiangyu Zhu, Zhaoxiang Zhang, Qi Tian, Zhen Lei
The text-to-image synthesis by diffusion models has recently shown remarkable performance in generating high-quality images.
1 code implementation • 9 Jan 2024 • Hongcheng Guo, Jian Yang, Jiaheng Liu, Jiaqi Bai, Boyang Wang, Zhoujun Li, Tieqiao Zheng, Bo Zhang, Junran Peng, Qi Tian
Log anomaly detection is a key component in the field of artificial intelligence for IT operations (AIOps).
no code implementations • 6 Jan 2024 • Xin He, Longhui Wei, Lingxi Xie, Qi Tian
Multimodal Large Language Models (MLLMs) are experiencing rapid growth, yielding a plethora of noteworthy contributions in recent months.
no code implementations • 28 Dec 2023 • Yuhang Zhang, Yuang Deng, Xiaopeng Zhang, Jie Li, Robert C. Qiu, Qi Tian
In DeLR, the query is based on region-level, and we only annotate the object region that is queried; 2) Instead of directly providing both localization and recognition annotations, we separately query the two components, and thus reduce the recognition budget with the pseudo class labels provided by the model.
1 code implementation • 25 Dec 2023 • Tianhao Shi, Yang Zhang, Zhijian Xu, Chong Chen, Fuli Feng, Xiangnan He, Qi Tian
Instead of dismissing the role of incremental learning, we attribute the lack of anticipated performance enhancement to a mismatch between the LLM4Rec architecture and incremental learning: LLM4Rec employs a single adaptation module for learning recommendations, limiting its ability to simultaneously capture long-term and short-term user preferences in the incremental learning context.
1 code implementation • 16 Dec 2023 • Yihang Zhai, Haixin Wang, Jianlong Chang, Xinlong Yang, Jinan Sun, Shikun Zhang, Qi Tian
Instruction tuning has shown promising potential for developing general-purpose AI capabilities using large-scale pre-trained models, and it has spurred growing research into integrating multimodal information for creative applications.
no code implementations • 7 Dec 2023 • Yabo Chen, Jiemin Fang, YuYang Huang, Taoran Yi, Xiaopeng Zhang, Lingxi Xie, Xinggang Wang, Wenrui Dai, Hongkai Xiong, Qi Tian
However, due to the high sparsity of the single input image, Zero-1-to-3 tends to produce geometry and appearance inconsistency across views, especially for complex objects.
1 code implementation • 6 Dec 2023 • Xumeng Han, Longhui Wei, Xuehui Yu, Zhiyang Dou, Xin He, Kuiran Wang, Zhenjun Han, Qi Tian
The recent Segment Anything Model (SAM) has emerged as a new paradigmatic vision foundation model, showcasing potent zero-shot generalization and flexible prompting.
no code implementations • 1 Dec 2023 • Jiazhong Cen, Jiemin Fang, Chen Yang, Lingxi Xie, Xiaopeng Zhang, Wei Shen, Qi Tian
This is achieved by attaching a scale-gated affinity feature to each 3D Gaussian to endow it with a new property for multi-granularity segmentation.
no code implementations • CVPR 2024 • Zelin Peng, Zhengqin Xu, Zhilin Zeng, Lingxi Xie, Qi Tian, Wei Shen
Parameter-efficient fine-tuning (PEFT) is an effective methodology to unleash the potential of large foundation models in novel scenarios with limited training data.
no code implementations • CVPR 2024 • Junjie Wang, Jiemin Fang, Xiaopeng Zhang, Lingxi Xie, Qi Tian
Specifically, we first extract the region of interest (RoI) corresponding to the text instruction, aligning it to 3D Gaussians.
no code implementations • 26 Nov 2023 • Hengtong Hu, Lingxi Xie, Xinyue Huo, Richang Hong, Qi Tian
An intriguing property of the setting is that the burden of annotation is largely alleviated compared with providing the accurate label.
1 code implementation • 23 Nov 2023 • Shulin Cao, Jiajie Zhang, Jiaxin Shi, Xin Lv, Zijun Yao, Qi Tian, Juanzi Li, Lei Hou
During reasoning, for leaf nodes, LLMs choose a more confident answer from Closed-book QA that employs parametric knowledge and Open-book QA that employs retrieved external knowledge, thus eliminating the negative retrieval problem.
1 code implementation • CVPR 2024 • Qifan Yu, Juncheng Li, Longhui Wei, Liang Pang, Wentao Ye, Bosheng Qin, Siliang Tang, Qi Tian, Yueting Zhuang
Multi-modal Large Language Models (MLLMs) tuned on machine-generated instruction-following data have demonstrated remarkable performance in various multi-modal understanding and generation tasks.
1 code implementation • NeurIPS 2023 • Jin Li, Yaoming Wang, Xiaopeng Zhang, Bowen Shi, Dongsheng Jiang, Chenglin Li, Wenrui Dai, Hongkai Xiong, Qi Tian
Specifically, at the intermediate layer of the ViT, we utilize a spatial-aware density-based clustering algorithm to select representative tokens from the token sequence.
1 code implementation • CVPR 2024 • Guanjun Wu, Taoran Yi, Jiemin Fang, Lingxi Xie, Xiaopeng Zhang, Wei Wei, Wenyu Liu, Qi Tian, Xinggang Wang
Representing and rendering dynamic scenes has been an important but challenging task.
1 code implementation • CVPR 2024 • Taoran Yi, Jiemin Fang, Junjie Wang, Guanjun Wu, Lingxi Xie, Xiaopeng Zhang, Wenyu Liu, Qi Tian, Xinggang Wang
In recent times, the generation of 3D assets from text prompts has shown impressive results.
1 code implementation • 26 Sep 2023 • Yuhui Xu, Lingxi Xie, Xiaotao Gu, Xin Chen, Heng Chang, Hengheng Zhang, Zhengsu Chen, Xiaopeng Zhang, Qi Tian
Recent years have witnessed the rapid development of large language models (LLMs).
no code implementations • 27 Aug 2023 • Yulin Wang, Yizeng Han, Chaofei Wang, Shiji Song, Qi Tian, Gao Huang
Over the past decade, deep learning models have exhibited considerable advancements, reaching or even exceeding human-level performance in a range of visual perception tasks.
1 code implementation • 16 Aug 2023 • Keqin Bao, Jizhi Zhang, Wenjie Wang, Yang Zhang, Zhengyi Yang, Yancheng Luo, Chong Chen, Fuli Feng, Qi Tian
As the focus on Large Language Models (LLMs) in the field of recommendation intensifies, the optimization of LLMs for recommendation purposes (referred to as LLM4Rec) assumes a crucial role in augmenting their effectiveness in providing recommendations.
1 code implementation • ICCV 2023 • Shuangrui Ding, Peisen Zhao, Xiaopeng Zhang, Rui Qian, Hongkai Xiong, Qi Tian
Based on the STA score, we are able to progressively prune the tokens without introducing any additional parameters or requiring further re-training.
no code implementations • 4 Aug 2023 • Shikun Sun, Longhui Wei, Junliang Xing, Jia Jia, Qi Tian
Recent score-based diffusion models (SBDMs) show promising results in unpaired image-to-image translation (I2I).
no code implementations • 2 Aug 2023 • Zixuan Ni, Longhui Wei, Jiacheng Li, Siliang Tang, Yueting Zhuang, Qi Tian
In this work, we propose a novel strategy named Degeneration-Tuning (DT) to shield contents of unwanted concepts from SD weights.
no code implementations • 20 Jul 2023 • Wentao Zhu, Xiaoxuan Ma, Dongwoo Ro, Hai Ci, Jinlu Zhang, Jiaxin Shi, Feng Gao, Qi Tian, Yizhou Wang
In this survey, we present a comprehensive literature review of human motion generation, which, to the best of our knowledge, is the first of its kind in this field.
no code implementations • 28 Jun 2023 • Bowen Shi, Xiaopeng Zhang, Yaoming Wang, Jin Li, Wenrui Dai, Junni Zou, Hongkai Xiong, Qi Tian
In order to better obtain both discrimination and diversity, we propose a simple but effective Hybrid Distillation strategy, which utilizes both the supervised/CL teacher and the MIM teacher to jointly guide the student model.
no code implementations • 14 Jun 2023 • Lingxi Xie, Longhui Wei, Xiaopeng Zhang, Kaifeng Bi, Xiaotao Gu, Jianlong Chang, Qi Tian
In this paper, we start with a conceptual definition of AGI and briefly review how NLP solves a wide range of tasks via a chat system.
no code implementations • 9 Jun 2023 • Lin Liu, Mingming Zhao, Shanxin Yuan, Wenlong Lyu, Wengang Zhou, Houqiang Li, Yanfeng Wang, Qi Tian
Specifically, Cube Mask Sampling Module (CMSM) is proposed to apply both spatial and channel mask sampling modeling to image compression in the pre-training stage.
no code implementations • 8 Jun 2023 • Mingming Zhao, Lin Liu, Lifu Liu, Mengke Li, Qi Tian
To achieve joint channel estimation and feedback, this paper proposes an encoder-decoder based network that unveils the intrinsic frequency-domain correlation within the CSI matrix.
no code implementations • 24 May 2023 • Jiajie Zhang, Shulin Cao, Tingjia Zhang, Xin Lv, Jiaxin Shi, Qi Tian, Juanzi Li, Lei Hou
To facilitate reasoning, we propose a novel two-stage XQA framework, Reasoning over Hierarchical Question Decomposition Tree (RoHT).
1 code implementation • 22 May 2023 • Yabo Zhang, Yuxiang Wei, Dongsheng Jiang, Xiaopeng Zhang, WangMeng Zuo, Qi Tian
Text-driven diffusion models have unlocked unprecedented abilities in image generation, whereas their video counterpart still lags behind due to the excessive training cost of temporal modeling.
no code implementations • 18 May 2023 • Yuan Zhou, Xin Chen, Yanrong Guo, Shijie Hao, Richang Hong, Qi Tian
Incremental few-shot semantic segmentation (IFSS) aims to incrementally extend a semantic segmentation model to novel classes using only a few pixel-level annotated samples, while preserving its segmentation capability on previously learned base categories.
no code implementations • 11 May 2023 • Zixuan Ni, Longhui Wei, Siliang Tang, Yueting Zhuang, Qi Tian
Moreover, we empirically and theoretically demonstrate how SD leads to a performance decline for CLIP on cross-modal retrieval tasks.
no code implementations • 10 May 2023 • Bruce X. B. Yu, Jianlong Chang, Haixin Wang, Lingbo Liu, Shijie Wang, Zhiyu Wang, Junfan Lin, Lingxi Xie, Haojie Li, Zhouchen Lin, Qi Tian, Chang Wen Chen
With the surprising development of pre-trained visual foundation models, visual tuning has moved beyond the standard modus operandi of fine-tuning the whole pre-trained model or just the fully connected layer.
1 code implementation • NeurIPS 2023 • Jiazhong Cen, Jiemin Fang, Zanwei Zhou, Chen Yang, Lingxi Xie, Xiaopeng Zhang, Wei Shen, Qi Tian
The Segment Anything Model (SAM) emerges as a powerful vision foundation model to generate high-quality 2D segmentation results.
1 code implementation • 22 Apr 2023 • Haitao Li, Qingyao Ai, Jia Chen, Qian Dong, Yueyue Wu, Yiqun Liu, Chong Chen, Qi Tian
Moreover, in contrast to the general retrieval, the relevance in the legal domain is sensitive to key legal elements.
no code implementations • 22 Apr 2023 • Xin Chen, Hengheng Zhang, Xiaotao Gu, Kaifeng Bi, Lingxi Xie, Qi Tian
The Mixture of Experts (MoE) model has become an important choice for large language models because of its scalability with sublinear computational complexity for training and inference.
1 code implementation • 12 Apr 2023 • Liping Bao, Longhui Wei, Xiaoyu Qiu, Wengang Zhou, Houqiang Li, Qi Tian
Recent research on unsupervised person re-identification (reID) has demonstrated that pre-training on unlabeled person images achieves better performance on downstream reID tasks than pre-training on ImageNet.
no code implementations • 7 Apr 2023 • Gaojie Wu, Wei-Shi Zheng, Yutong Lu, Qi Tian
In this work, we propose a ladder self-attention block with multiple branches and a progressive shift mechanism to develop a light-weight transformer backbone that requires fewer computing resources (e.g., a relatively small number of parameters and FLOPs), termed Progressive Shift Ladder Transformer (PSLT).
no code implementations • 21 Mar 2023 • Chen Ju, Zeqian Li, Peisen Zhao, Ya Zhang, Xiaopeng Zhang, Qi Tian, Yanfeng Wang, Weidi Xie
In this paper, we consider the problem of temporal action localization in low-shot (zero-shot and few-shot) scenarios, with the goal of detecting and classifying action instances from arbitrary categories within untrimmed videos, even those not seen at training time.
no code implementations • 17 Mar 2023 • Haixin Wang, Jianlong Chang, Xiao Luo, Jinan Sun, Zhouchen Lin, Qi Tian
Despite recent competitive performance across a range of vision tasks, vision Transformers still have an issue of heavy computational costs.
no code implementations • ICCV 2023 • Xinyue Huo, Lingxi Xie, Wengang Zhou, Houqiang Li, Qi Tian
Currently, a popular UDA framework lies in self-training which endows the model with two-fold abilities: (i) learning reliable semantics from the labeled images in the source domain, and (ii) adapting to the target domain via generating pseudo labels on the unlabeled images.
no code implementations • ICCV 2023 • Zelin Peng, Guanchun Wang, Lingxi Xie, Dongsheng Jiang, Wei Shen, Qi Tian
Seed area generation is usually the starting point of weakly supervised semantic segmentation (WSSS).
no code implementations • ICCV 2023 • Juncheng Li, Minghe Gao, Longhui Wei, Siliang Tang, Wenqiao Zhang, Mengze Li, Wei Ji, Qi Tian, Tat-Seng Chua, Yueting Zhuang
Prompt tuning, a recently emerging paradigm, enables the powerful vision-language pre-training models to adapt to downstream tasks in a parameter- and data-efficient way, by learning the "soft prompts" to condition frozen pre-training models.
no code implementations • 9 Mar 2023 • Ning Liao, Bowen Shi, Xiaopeng Zhang, Min Cao, Junchi Yan, Qi Tian
To explore prompt learning on the generative pre-trained visual model, as well as keeping the task consistency, we propose Visual Prompt learning as masked visual Token Modeling (VPTM) to transform the downstream visual classification into the pre-trained masked visual token prediction.
no code implementations • 9 Mar 2023 • Ning Liao, Xiaopeng Zhang, Min Cao, Junchi Yan, Qi Tian
In realistic open-set scenarios where labels of a part of testing data are totally unknown, when vision-language (VL) prompt learning methods encounter inputs related to unknown classes (i.e., not seen during training), they always predict them as one of the training classes.
no code implementations • 7 Mar 2023 • Jiacheng Li, Longhui Wei, Zongyuan Zhan, Xin He, Siliang Tang, Qi Tian, Yueting Zhuang
To better accelerate the generative transformers while keeping good generation quality, we propose Lformer, a semi-autoregressive text-to-image generation model.
no code implementations • 20 Feb 2023 • Chen Ju, Haicheng Wang, Jinxiang Liu, Chaofan Ma, Ya Zhang, Peisen Zhao, Jianlong Chang, Qi Tian
Temporal sentence grounding aims to detect the event timestamps described by the natural language query from given untrimmed videos.
no code implementations • 5 Feb 2023 • Zijian Zhang, Zhou Zhao, Jun Yu, Qi Tian
In this paper, we propose a novel and flexible conditional diffusion model by introducing conditions into the forward process.
no code implementations • CVPR 2023 • Shijie Wang, Jianlong Chang, Haojie Li, Zhihui Wang, Wanli Ouyang, Qi Tian
PLEor can leverage a pre-trained CLIP model to infer the discrepancies encompassing both pre-defined and unknown subcategories, called category-specific discrepancies, and transfer them to the backbone network trained in closed-set scenarios.
1 code implementation • CVPR 2023 • Yaoming Wang, Bowen Shi, Xiaopeng Zhang, Jin Li, Yuchen Liu, Wenrui Dai, Chenglin Li, Hongkai Xiong, Qi Tian
To mitigate the computational and storage demands, recent research has explored Parameter-Efficient Fine-Tuning (PEFT), which focuses on tuning a minimal number of parameters for efficient adaptation.
1 code implementation • CVPR 2023 • Ruipeng Zhang, Qinwei Xu, Jiangchao Yao, Ya Zhang, Qi Tian, Yanfeng Wang
Federated Domain Generalization (FedDG) attempts to learn a global model in a privacy-preserving manner that generalizes well to new clients possibly with domain shift.
1 code implementation • 26 Dec 2022 • Deng Li, Aming Wu, Yahong Han, Qi Tian
Considering the complexity and variability of real scene tasks, we propose a Prototype-guided Cross-task Knowledge Distillation (ProC-KD) approach to transfer the intrinsic local-level object knowledge of a large-scale teacher network to various task scenarios.
no code implementations • CVPR 2023 • Chen Ju, Kunhao Zheng, Jinxiang Liu, Peisen Zhao, Ya Zhang, Jianlong Chang, Yanfeng Wang, Qi Tian
And as a result, the dual-branch complementarity is effectively fused to promote a strong alliance.
1 code implementation • 14 Dec 2022 • Ziqing Fan, Yanfeng Wang, Jiangchao Yao, Lingjuan Lyu, Ya Zhang, Qi Tian
However, in addition to previous explorations for improvement in federated averaging, our analysis shows that another critical bottleneck is the poorer optima of client models in more heterogeneous conditions.
no code implementations • 12 Dec 2022 • Tianliang Zhang, Qixiang Ye, Baochang Zhang, Jianzhuang Liu, Xiaopeng Zhang, Qi Tian
FC-Net is based on the observation that the visible parts of pedestrians are selective and decisive for detection, and is implemented as a self-paced feature learning framework with a self-activation (SA) module and a feature calibration (FC) module.
no code implementations • 4 Dec 2022 • Qi Tian, Kun Kuang, Kelu Jiang, Furui Liu, Zhihua Wang, Fei Wu
The success of deep learning is partly attributed to the availability of massive data downloaded freely from the Internet.
no code implementations • 28 Nov 2022 • Qi Tian, Kun Kuang, Furui Liu, Baoxiang Wang
e.g., an agent is a random policy while other agents are medium policies.
1 code implementation • CVPR 2023 • Yunjie Tian, Lingxi Xie, Jihao Qiu, Jianbin Jiao, YaoWei Wang, Qi Tian, Qixiang Ye
iTPN comes with two elaborate designs: 1) the first pre-trained feature pyramid built upon a vision transformer (ViT).
4 code implementations • 3 Nov 2022 • Kaifeng Bi, Lingxi Xie, Hengheng Zhang, Xin Chen, Xiaotao Gu, Qi Tian
In this paper, we present Pangu-Weather, a deep-learning-based system for fast and accurate global weather forecasting.
1 code implementation • CVPR 2023 • Junfan Lin, Jianlong Chang, Lingbo Liu, Guanbin Li, Liang Lin, Qi Tian, Chang Wen Chen
During inference, instead of changing the motion generator, our method reformulates the input text into a masked motion as the prompt for the motion generator to "reconstruct" the motion.
no code implementations • 20 Oct 2022 • Min Cao, Cong Ding, Chen Chen, Junchi Yan, Qi Tian
Based on a natural assumption that images belonging to the same person identity should not match with images belonging to multiple different person identities across views, called the unicity of person matching on the identity level, we propose an end-to-end person unicity matching architecture for learning and refining the person matching relations.
2 code implementations • 14 Oct 2022 • Xiaoyan Zhang, Gaoyang Tang, Yingying Zhu, Qi Tian
The issue of image haze removal has attracted wide attention in recent years.
1 code implementation • 3 Oct 2022 • Bruce X. B. Yu, Jianlong Chang, Lingbo Liu, Qi Tian, Chang Wen Chen
Towards this goal, we propose a framework with a unified view of PETL called visual-PETL (V-PETL) to investigate the effects of different PETL techniques, data scales of downstream domains, positions of trainable parameters, and other aspects affecting the trade-off.
no code implementations • 1 Oct 2022 • Shuangrui Ding, Weidi Xie, Yabo Chen, Rui Qian, Xiaopeng Zhang, Hongkai Xiong, Qi Tian
In this paper, we consider the task of unsupervised object discovery in videos.
Ranked #3 on Unsupervised Object Segmentation on DAVIS 2016
no code implementations • 1 Oct 2022 • Binghao Liu, Boyu Yang, Lingxi Xie, Ren Wang, Qi Tian, Qixiang Ye
LDC is built upon a parameterized calibration unit (PCU), which initializes biased distributions for all classes based on classifier vectors (memory-free) and a single covariance matrix.
no code implementations • 23 Aug 2022 • Lin Liu, Junfeng An, Jianzhuang Liu, Shanxin Yuan, Xiangyu Chen, Wengang Zhou, Houqiang Li, Yanfeng Wang, Qi Tian
Low-light video enhancement (LLVE) is an important yet challenging task with many applications, such as photography and autonomous driving.
no code implementations • 22 Aug 2022 • Lingbo Liu, Jianlong Chang, Bruce X. B. Yu, Liang Lin, Qi Tian, Chang-Wen Chen
Previous methods usually fine-tune the entire network for each specific dataset, which makes it burdensome to store the massive parameters of these networks.
1 code implementation • 4 Aug 2022 • Juncheng Li, Xin He, Longhui Wei, Long Qian, Linchao Zhu, Lingxi Xie, Yueting Zhuang, Qi Tian, Siliang Tang
Large-scale vision-language pre-training has shown impressive advances in a wide range of downstream tasks.
1 code implementation • 3 Aug 2022 • Juncheng Li, Junlin Xie, Linchao Zhu, Long Qian, Siliang Tang, Wenqiao Zhang, Haochen Shi, Shengyu Zhang, Longhui Wei, Qi Tian, Yueting Zhuang
In this paper, we introduce a new task, named Temporal Emotion Localization in videos (TEL), which aims to detect human emotions and localize their corresponding temporal boundaries in untrimmed videos with aligned subtitles.
1 code implementation • 31 Jul 2022 • Yabo Chen, Yuchen Liu, Dongsheng Jiang, Xiaopeng Zhang, Wenrui Dai, Hongkai Xiong, Qi Tian
We also analyze how to build good views for the teacher branch to produce latent representation from the perspective of information bottleneck.
1 code implementation • 31 Jul 2022 • Maosen Li, Siheng Chen, Zijing Zhang, Lingxi Xie, Qi Tian, Ya Zhang
To address the first issue, we propose adaptive graph scattering, which leverages multiple trainable band-pass graph filters to decompose pose features into richer graph spectrum bands.
no code implementations • 29 Jul 2022 • Shijie Wang, Jianlong Chang, Zhihui Wang, Haojie Li, Wanli Ouyang, Qi Tian
In this paper, we develop Fine-grained Retrieval Prompt Tuning (FRPT), which steers a frozen pre-trained model to perform the fine-grained retrieval task from the perspectives of sample prompting and feature adaptation.
1 code implementation • CVPR 2023 • Chufeng Tang, Lingxi Xie, Xiaopeng Zhang, Xiaolin Hu, Qi Tian
Humans can recognize visual semantics at unlimited granularity, but existing visual recognition algorithms cannot achieve this goal.
no code implementations • 28 Jul 2022 • Xing Nie, Bolin Ni, Jianlong Chang, Gaomeng Meng, Chunlei Huo, Zhaoxiang Zhang, Shiming Xiang, Qi Tian, Chunhong Pan
To this end, we propose parameter-efficient Prompt tuning (Pro-tuning) to adapt frozen vision models to various downstream vision tasks.
1 code implementation • 23 Jul 2022 • Chufeng Tang, Lingxi Xie, Gang Zhang, Xiaopeng Zhang, Qi Tian, Xiaolin Hu
In this paper, we present an economical active learning setting, named active pointly-supervised instance segmentation (APIS), which starts with box-level annotations and iteratively samples a point within the box and asks if it falls on the object.
1 code implementation • 18 Jul 2022 • Xuejing Liu, Liang Li, Shuhui Wang, Zheng-Jun Zha, Zechao Li, Qi Tian, Qingming Huang
Second, most previous weakly supervised REG methods ignore the discriminative location and context of the referent, causing difficulties in distinguishing the target from other same-category objects.
no code implementations • 4 Jul 2022 • Wei Shen, Zelin Peng, Xuehui Wang, Huayu Wang, Jiazhong Cen, Dongsheng Jiang, Lingxi Xie, Xiaokang Yang, Qi Tian
Next, we summarize the existing label-efficient image segmentation methods from a unified perspective that discusses an important question: how to bridge the gap between weak supervision and dense prediction -- the current methods are mostly based on heuristic priors, such as cross-pixel similarity, cross-label constraint, cross-view consistency, and cross-image relation.
1 code implementation • 1 Jul 2022 • Mingkun Yang, Minghui Liao, Pu Lu, Jing Wang, Shenggao Zhu, Hualin Luo, Qi Tian, Xiang Bai
Inspired by the observation that humans learn to recognize the texts through both reading and writing, we propose to learn discrimination and generation by integrating contrastive learning and masked image modeling in our self-supervised method.
no code implementations • 19 Jun 2022 • Xin Xu, Wei Liu, Zheng Wang, Ruiming Hu, Qi Tian
Guided by original pedestrian images, one stream is employed to learn a camera-invariant global feature for the CC problem via filtering cross-camera interference factors.
Domain Generalization, Generalizable Person Re-identification
1 code implementation • 10 Jun 2022 • Haohang Xu, Shuangrui Ding, Xiaopeng Zhang, Hongkai Xiong, Qi Tian
Specifically, MRA consistently enhances the performance on supervised, semi-supervised as well as few-shot classification.
1 code implementation • 2 Jun 2022 • Ming Tao, Bing-Kun Bao, Hao Tang, Fei Wu, Longhui Wei, Qi Tian
To solve these limitations, we propose: (i) a Dynamic Editing Block (DEBlock) which composes different editing modules dynamically for various editing requirements.
1 code implementation • 30 May 2022 • Xiaosong Zhang, Yunjie Tian, Wei Huang, Qixiang Ye, Qi Dai, Lingxi Xie, Qi Tian
A key idea of efficient implementation is to discard the masked image patches (or tokens) throughout the target network (encoder), which requires the encoder to be a plain vision transformer (e.g., ViT), although hierarchical vision transformers (e.g., Swin Transformer) have potentially better properties in formulating vision inputs.
1 code implementation • 30 May 2022 • Jiemin Fang, Taoran Yi, Xinggang Wang, Lingxi Xie, Xiaopeng Zhang, Wenyu Liu, Matthias Nießner, Qi Tian
A multi-distance interpolation method is proposed and applied on voxel features to model both small and large motions.
no code implementations • 24 May 2022 • Feilong Chen, Xiuyi Chen, Jiaxin Shi, Duzhen Zhang, Jianlong Chang, Qi Tian
It also achieves gains of about +4.9 AR on COCO and +3.8 AR on Flickr30K over LightingDot and achieves comparable performance with the state-of-the-art (SOTA) fusion-based model METER.
1 code implementation • 24 May 2022 • Lunyiu Nie, Shulin Cao, Jiaxin Shi, Jiuding Sun, Qi Tian, Lei Hou, Juanzi Li, Jidong Zhai
Subject to the huge semantic gap between natural and formal languages, neural semantic parsing is typically bottlenecked by its complexity of dealing with both input semantics and output syntax.
3 code implementations • 18 Apr 2022 • Kaiwen Duan, Song Bai, Lingxi Xie, Honggang Qi, Qingming Huang, Qi Tian
Our approach, named CenterNet, detects each object as a triplet of keypoints (the top-left and bottom-right corners and the center keypoint).
Ranked #36 on Object Detection on COCO test-dev
no code implementations • CVPR 2022 • Yu Zheng, Yueqi Duan, Jiwen Lu, Jie Zhou, Qi Tian
A bathtub in a library, a sink in an office, a bed in a laundry room -- these counter-intuitive examples suggest that the scene provides important prior knowledge for 3D object detection, which helps eliminate ambiguous detections of similar objects.
no code implementations • CVPR 2022 • Xinyue Huo, Lingxi Xie, Hengtong Hu, Wengang Zhou, Houqiang Li, Qi Tian
Unsupervised domain adaptation (UDA) is an important topic in the computer vision community.
1 code implementation • 27 Mar 2022 • Yunjie Tian, Lingxi Xie, Jiemin Fang, Mengnan Shi, Junran Peng, Xiaopeng Zhang, Jianbin Jiao, Qi Tian, Qixiang Ye
The past year has witnessed a rapid development of masked image modeling (MIM).
1 code implementation • CVPR 2022 • Qing Chang, Junran Peng, Lingxi Xie, Jiajun Sun, Haoran Yin, Qi Tian, Zhaoxiang Zhang
However, due to high training costs and a lack of awareness of downstream usage, most self-supervised learning methods cannot accommodate the diversity of downstream scenarios, which involve various data domains, different vision tasks, and different latency constraints on models.
no code implementations • 11 Mar 2022 • Xiaohan Zhang, Songlin Dong, Jinjie Chen, Qi Tian, Yihong Gong, Xiaopeng Hong
In this paper, we focus on a new and challenging decentralized machine learning paradigm in which there are continuous inflows of data to be addressed and the data are stored in multiple repositories.
no code implementations • 11 Mar 2022 • Lin Liu, Lingxi Xie, Xiaopeng Zhang, Shanxin Yuan, Xiangyu Chen, Wengang Zhou, Houqiang Li, Qi Tian
In this paper, we propose a novel approach that embeds a task-agnostic prior into a transformer.
no code implementations • 10 Mar 2022 • Longhui Wei, Lingxi Xie, Wengang Zhou, Houqiang Li, Qi Tian
Recently, masked image modeling (MIM) has become a promising direction for visual pre-training.
3 code implementations • 29 Jan 2022 • Xue Yang, Yue Zhou, Gefan Zhang, Jirui Yang, Wentao Wang, Junchi Yan, Xiaopeng Zhang, Qi Tian
This is in contrast to recent Gaussian-modeling-based rotation detectors, e.g., GWD loss and KLD loss, which involve a human-specified distribution distance metric that requires additional hyperparameter tuning that varies across datasets and detectors.
8 code implementations • 10 Jan 2022 • Kai Han, Yunhe Wang, Chang Xu, Jianyuan Guo, Chunjing Xu, Enhua Wu, Qi Tian
The proposed C-Ghost module can be taken as a plug-and-play component to upgrade existing convolutional neural networks.
no code implementations • CVPR 2022 • Yuhang Zhang, Xiaopeng Zhang, Lingxi Xie, Jie Li, Robert C. Qiu, Hengtong Hu, Qi Tian
The Yes query is treated as positive pairs of the queried category for contrastive pulling, while the No query is treated as hard negative pairs for contrastive repelling.
1 code implementation • CVPR 2022 • Zhengcong Fei, Xu Yan, Shuhui Wang, Qi Tian
On one hand, the representation in shallow layers lacks high-level semantic and sufficient cross-modal fusion information for accurate prediction.
1 code implementation • CVPR 2022 • Sun-Ao Liu, Hongtao Xie, Hai Xu, Yongdong Zhang, Qi Tian
Current attention-based methods for semantic segmentation mainly model pixel relation through pairwise affinity and coarse segmentation.
1 code implementation • CVPR 2022 • Wenwen Pan, Haonan Shi, Zhou Zhao, Jieming Zhu, Xiuqiang He, Zhigeng Pan, Lianli Gao, Jun Yu, Fei Wu, Qi Tian
Audio-Guided video semantic segmentation is a challenging problem in visual analysis and editing, which automatically separates foreground objects from background in a video sequence according to the referring audio expressions.
no code implementations • CVPR 2022 • Hui Wu, Min Wang, Wengang Zhou, Houqiang Li, Qi Tian
To this end, we propose a flexible contextual similarity distillation framework to enhance the small query model and keep its output feature compatible with that of the large gallery model, which is crucial for asymmetric retrieval.
no code implementations • CVPR 2022 • Yadong Ding, Yu Wu, Chengyue Huang, Siliang Tang, Yi Yang, Longhui Wei, Yueting Zhuang, Qi Tian
Existing NAS-based meta-learning methods apply a two-stage strategy, i.e., first searching architectures and then re-training meta-weights on the searched architecture.
no code implementations • 20 Dec 2021 • Qi Tian, Kun Kuang, Baoxiang Wang, Furui Liu, Fei Wu
The node information compression aims to address the problem of what to communicate via learning compact node representations.
Multi-agent Reinforcement Learning, Reinforcement Learning
1 code implementation • 20 Dec 2021 • Xinzhe Han, Shuhui Wang, Chi Su, Qingming Huang, Qi Tian
Existing de-bias learning frameworks try to capture specific dataset bias by annotations but they fail to handle complicated OOD scenarios.
no code implementations • 17 Dec 2021 • Lin Liu, Shanxin Yuan, Jianzhuang Liu, Xin Guo, Youliang Yan, Qi Tian
For zero-shot image restoration, we design a novel model, termed SiamTrans, which is constructed by Siamese transformers, encoders, and decoders.
no code implementations • 16 Dec 2021 • Rui Liu, Yahong Han, YaoWei Wang, Qi Tian
In the second stage, augmented source and target data with pseudo labels are adopted to perform the self-training for prediction consistency.
no code implementations • 15 Dec 2021 • Gursimran Singh, Lingyang Chu, Lanjun Wang, Jian Pei, Qi Tian, Yong Zhang
In the real world, the frequency of occurrence of objects is naturally skewed forming long-tail class distributions, which results in poor performance on the statistically rare classes.
no code implementations • 5 Dec 2021 • Yunjie Tian, Lingxi Xie, Jiemin Fang, Jianbin Jiao, Qixiang Ye, Qi Tian
In this paper, we build the search algorithm upon a complicated search space with long-distance connections, and show that existing weight-sharing search algorithms mostly fail due to the existence of interleaved connections.
1 code implementation • 30 Nov 2021 • Jiemin Fang, Lingxi Xie, Xinggang Wang, Xiaopeng Zhang, Wenyu Liu, Qi Tian
Neural radiance fields (NeRF) have shown great potential in representing 3D scenes and synthesizing novel views, but the computational overhead of NeRF at the inference stage is still heavy.
1 code implementation • 25 Nov 2021 • Yunjie Tian, Lingxi Xie, Xiaopeng Zhang, Jiemin Fang, Haohang Xu, Wei Huang, Jianbin Jiao, Qi Tian, Qixiang Ye
In this paper, we propose a self-supervised visual representation learning approach which involves both generative and discriminative proxies, where we focus on the former part by requiring the target network to recover the original image based on the mid-level features.
Ranked #63 on Semantic Segmentation on Cityscapes test
no code implementations • 24 Nov 2021 • Jiazhong Cen, Zenkun Jiang, Lingxi Xie, Qi Tian, Xiaokang Yang, Wei Shen
Anomaly segmentation is a crucial task for safety-critical applications, such as autonomous driving in urban scenes, where the goal is to detect out-of-distribution (OOD) objects with categories which are unseen during training.
Ranked #10 on Anomaly Detection on Fishyscapes L&F
1 code implementation • 23 Nov 2021 • Zhaobo Qi, Shuhui Wang, Chi Su, Li Su, Qingming Huang, Qi Tian
Future activity anticipation is a challenging problem in egocentric vision.
no code implementations • 19 Nov 2021 • Xu Yan, Zhengcong Fei, Shuhui Wang, Qingming Huang, Qi Tian
Dense video captioning (DVC) aims to generate multi-sentence descriptions to elucidate the multiple events in the video, which is challenging and demands visual consistency, discoursal coherence, and linguistic diversity.
3 code implementations • 28 Oct 2021 • Hao Feng, Wengang Zhou, Jiajun Deng, Qi Tian, Houqiang Li
The iterative refinements make DocScanner converge to a robust and superior rectification performance, while the lightweight recurrent architecture ensures the running efficiency.
1 code implementation • 19 Oct 2021 • Peng Zhou, Lingxi Xie, Bingbing Ni, Qi Tian
The style-based GAN (StyleGAN) architecture achieved state-of-the-art results for generating high-quality images, but it lacks explicit and precise control over camera poses.
Ranked #1 on 3D-Aware Image Synthesis on FFHQ 256 x 256
1 code implementation • 11 Oct 2021 • Xu Yan, Zhengcong Fei, Zekang Li, Shuhui Wang, Qingming Huang, Qi Tian
Non-autoregressive image captioning with continuous iterative refinement, which eliminates the sequential dependence in sentence generation, can achieve performance comparable to its autoregressive counterparts with considerable acceleration.
no code implementations • 29 Sep 2021 • Hengtong Hu, Lingxi Xie, Yinquan Wang, Richang Hong, Meng Wang, Qi Tian
We investigate the problem of estimating uncertainty for training data, so that deep neural networks can make use of the results for learning from limited supervision.
no code implementations • 29 Sep 2021 • Mengbiao Zhao, Shixiong Xu, Jianlong Chang, Lingxi Xie, Jie Chen, Qi Tian
Having consumed huge amounts of training data and computational resource, large-scale pre-trained models are often considered key assets of AI service providers.
no code implementations • 7 Sep 2021 • Xiaoman Zhang, Weidi Xie, Chaoqin Huang, Yanfeng Wang, Ya Zhang, Xin Chen, Qi Tian
In this paper, we target self-supervised representation learning for zero-shot tumor segmentation.
no code implementations • ICCV 2021 • Xing Nie, Yongcheng Liu, Shaohong Chen, Jianlong Chang, Chunlei Huo, Gaofeng Meng, Qi Tian, Weiming Hu, Chunhong Pan
It can work in a purely data-driven manner and thus is capable of auto-creating a group of suitable convolutions for geometric shape modeling.
no code implementations • 25 Aug 2021 • Maosen Li, Siheng Chen, Yangheng Zhao, Ya Zhang, Yanfeng Wang, Qi Tian
The core of MST-GNN is a multiscale spatio-temporal graph that explicitly models the relations in motions at various spatial and temporal scales.
2 code implementations • ICCV 2021 • Zhuo Su, Wenzhe Liu, Zitong Yu, Dewen Hu, Qing Liao, Qi Tian, Matti Pietikäinen, Li Liu
A faster version of PiDiNet with fewer than 0.1M parameters can still achieve performance comparable to the state of the art at 200 FPS.
Ranked #2 on Edge Detection on BRIND
1 code implementation • ICCV 2021 • Xinzhe Han, Shuhui Wang, Chi Su, Qingming Huang, Qi Tian
Language bias is a critical issue in Visual Question Answering (VQA), where models often exploit dataset biases for the final decision without considering the image information.
Ranked #2 on Visual Question Answering (VQA) on VQA-CP
no code implementations • 26 Jul 2021 • Zixuan Ni, Haizhou Shi, Siliang Tang, Longhui Wei, Qi Tian, Yueting Zhuang
After investigating existing strategies, we observe that there is a lack of study on how to prevent the inter-phase confusion.
1 code implementation • 24 Jul 2021 • Xiujun Shu, Ge Li, Xiao Wang, Weijian Ruan, Qi Tian
The key to this task is to exploit cloth-irrelevant cues.
no code implementations • 21 Jul 2021 • Kunhong Wu, Yucheng Shi, Yahong Han, Yunfeng Shao, Bingshuai Li, Qi Tian
Existing unsupervised domain adaptation (UDA) methods can achieve promising performance without transferring data from source domain to target domain.
1 code implementation • NeurIPS 2021 • Xu Luo, Longhui Wei, Liangjian Wen, Jinrong Yang, Lingxi Xie, Zenglin Xu, Qi Tian
The category gap between training and evaluation has been characterised as one of the main obstacles to the success of Few-Shot Learning (FSL).
1 code implementation • 13 Jul 2021 • Shuhao Cui, Shuhui Wang, Junbao Zhuo, Liang Li, Qingming Huang, Qi Tian
Due to the domain discrepancy in visual domain adaptation, the performance of the source model degrades when it encounters high data density near the decision boundary in the target domain.
1 code implementation • ICLR 2022 • Haohang Xu, Jiemin Fang, Xiaopeng Zhang, Lingxi Xie, Xinggang Wang, Wenrui Dai, Hongkai Xiong, Qi Tian
Here, a bag of instances denotes a set of similar samples constructed by the teacher and grouped within a bag, and the goal of distillation is to aggregate compact representations over the student with respect to instances in a bag.
no code implementations • CVPR 2021 • Xinyue Huo, Lingxi Xie, Jianzhong He, Zijie Yang, Wengang Zhou, Houqiang Li, Qi Tian
Semi-supervised learning is a useful tool for image segmentation, mainly due to its ability to extract knowledge from unlabeled data to assist learning from labeled data.