no code implementations • 28 Mar 2024 • Jiacui Huang, Hongtao Zhang, Mingbo Zhao, Zhou Wu
To address this challenge, we propose a new method, Instance-aware Visual Language Map (IVLMap), which equips the robot with instance-level and attribute-level semantic mapping. The map is constructed autonomously by fusing RGBD video data collected by the robot agent with a specially designed natural language map indexing in the bird's-eye view.
no code implementations • 20 Mar 2023 • Jiaer Xia, Lei Tan, Pingyang Dai, Mingbo Zhao, Yongjian Wu, Liujuan Cao
To address this issue, we propose a novel transformer-based Attention Disturbance and Dual-Path Constraint Network (ADP) to enhance the generalization of attention networks.
no code implementations • 4 Nov 2022 • Bo Wang, Zhao Zhang, Mingbo Zhao, Xiaojie Jin, Mingliang Xu, Meng Wang
To obtain rich features, we use the Swin Transformer to calculate multi-level features, and then feed them into a novel dynamic multi-sight embedding module to exploit both global structure and local texture of input images.
no code implementations • 31 Mar 2022 • Lu Cheng, Mingbo Zhao
To track instances across the video, we adopt a data association strategy that matches the same instance across the video sequence, jointly learning target instance appearances and their affinities in a pair of video frames in an end-to-end fashion.
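The matching step of such a data association strategy can be sketched as follows. This is a minimal illustration, not the authors' method: it assumes the affinities between instances in two frames are already available as a matrix, and it uses a simple greedy one-to-one assignment. The function name `associate` and the `min_affinity` threshold are hypothetical.

```python
import numpy as np

def associate(affinity, min_affinity=0.5):
    """Greedily match instances in frame t (rows) to frame t+1 (columns).

    Assumption: affinity[i, j] is a similarity score, higher is better.
    Pairs below min_affinity are left unmatched.
    """
    affinity = np.asarray(affinity, dtype=float)
    matches = []
    used_rows, used_cols = set(), set()
    # Visit candidate pairs from highest to lowest affinity.
    for flat in np.argsort(-affinity, axis=None):
        i, j = (int(x) for x in np.unravel_index(flat, affinity.shape))
        if affinity[i, j] < min_affinity:
            break  # every remaining pair is weaker still
        if i in used_rows or j in used_cols:
            continue  # each instance is matched at most once
        matches.append((i, j))
        used_rows.add(i)
        used_cols.add(j)
    return sorted(matches)
```

A greedy pass keeps the sketch dependency-light; an optimal assignment (for example, the Hungarian algorithm) could be swapped in without changing the interface.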
no code implementations • 24 Nov 2021 • Yu Liu, Mingbo Zhao, Zhao Zhang, Haijun Zhang, Shuicheng Yan
Based on this dataset, we then propose the Arbitrary Virtual Try-On Network (AVTON) for all types of clothes, which can synthesize realistic try-on images by preserving and trading off characteristics of the target clothes and the reference person.
no code implementations • 27 Jul 2021 • Wenlong Cheng, Mingbo Zhao, Zhiling Ye, Shuhang Gu
In this paper, we propose a novel compression framework, Multi-scale Feature Aggregation Net based GAN (MFAGAN), for reducing the memory access cost of the generator.
Hardware Aware Neural Architecture Search Image Super-Resolution +1
1 code implementation • 23 Jul 2021 • Jicong Fan, Yiheng Tu, Zhao Zhang, Mingbo Zhao, Haijun Zhang
First, we propose to find the most reliable affinity matrix via grid search or Bayesian optimization among a set of candidates given by different AMC methods with different hyperparameters, where the reliability is quantified by the relative-eigen-gap of graph Laplacian introduced in this paper.
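The selection step can be sketched as a plain grid search over candidate affinity matrices. The snippet does not give the formula for the relative-eigen-gap, so the score below is an assumed form: the gap between the (k+1)-th smallest Laplacian eigenvalue and the mean of the k smallest, divided by that mean plus a small constant. The function names, the normalized Laplacian, and `eps` are likewise illustrative choices.

```python
import numpy as np

def normalized_laplacian(A):
    """Symmetric normalized Laplacian I - D^{-1/2} A D^{-1/2}."""
    d = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    return np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

def relative_eigen_gap(A, k, eps=1e-6):
    """Assumed score: (lambda_{k+1} - mean(lambda_1..k)) / (mean + eps)."""
    eigvals = np.sort(np.linalg.eigvalsh(normalized_laplacian(A)))
    head = eigvals[:k].mean()
    return (eigvals[k] - head) / (head + eps)

def select_affinity(candidates, k):
    """Grid search: return the index of the highest-scoring candidate."""
    scores = [relative_eigen_gap(A, k) for A in candidates]
    return int(np.argmax(scores)), scores
```

With k clusters, a candidate whose graph splits cleanly into k components has k near-zero eigenvalues followed by a large jump, so it scores far higher than a candidate with heavy cross-cluster connections.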
no code implementations • 23 Jan 2020 • Yanyan Wei, Zhao Zhang, Yang Wang, Haijun Zhang, Mingbo Zhao, Mingliang Xu, Meng Wang
Although supervised deep deraining networks have obtained impressive results on synthetic datasets, they still cannot obtain satisfactory results on real images due to weak generalization of rain removal capacity, i.e., the pre-trained models usually cannot handle new shapes and directions, which may lead to over-derained or under-derained results.
no code implementations • 13 Dec 2019 • Yan Zhang, Zhao Zhang, Zheng Zhang, Mingbo Zhao, Li Zhang, Zheng-Jun Zha, Meng Wang
In this paper, we investigate unsupervised deep representation learning and propose a novel framework, Deep Self-representative Concept Factorization Network (DSCF-Net), for clustering deep features.
no code implementations • 20 Nov 2019 • Huan Zhang, Zhao Zhang, Mingbo Zhao, Qiaolin Ye, Min Zhang, Meng Wang
Our method can jointly recover the underlying clean data, clean labels and clean weighting spaces by decomposing the original data, predicted soft labels or weights into a clean part plus an error part by fitting noise.
no code implementations • 29 May 2019 • Zhao Zhang, Lei Jia, Mingbo Zhao, Guangcan Liu, Meng Wang, Shuicheng Yan
A Kernel-Induced Label Propagation (Kernel-LP) framework is proposed for high-dimensional data classification; it maps the data into kernel space and uses the most informative patterns of the data there.
1 code implementation • CVPR 2019 • Yuan Gao, Jiayi Ma, Mingbo Zhao, Wei Liu, Alan L. Yuille
In this paper, we propose a novel Convolutional Neural Network (CNN) structure for general-purpose multi-task learning (MTL), which enables automatic feature fusing at every layer from different tasks.
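One way to picture layer-wise feature fusing between two task branches is sketched below. The snippet does not specify the fusing operation, so this assumes a channel-wise concatenation followed by a per-task 1x1 projection (a per-pixel linear map over channels). The function names, shapes, and the `alpha` initialization are hypothetical, not taken from the paper.

```python
import numpy as np

def fuse_features(feat_a, feat_b, w_a, w_b):
    """Fuse two task feature maps of shape (C, H, W).

    Assumption: w_a and w_b have shape (C, 2C) and act as 1x1 convolutions
    over the concatenated channels, one projection per task.
    """
    stacked = np.concatenate([feat_a, feat_b], axis=0)  # (2C, H, W)
    out_a = np.einsum("oc,chw->ohw", w_a, stacked)
    out_b = np.einsum("oc,chw->ohw", w_b, stacked)
    return out_a, out_b

def identity_init(channels, alpha=0.9):
    """Weights that start each task mostly on its own features."""
    eye = np.eye(channels)
    w_a = np.concatenate([alpha * eye, (1.0 - alpha) * eye], axis=1)
    w_b = np.concatenate([(1.0 - alpha) * eye, alpha * eye], axis=1)
    return w_a, w_b
```

In a trained network the projection weights would be learned, so each layer decides how much to borrow from the other task; the fixed `alpha` here only illustrates a starting point.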
Ranked #93 on Semantic Segmentation on NYU Depth v2