no code implementations • 30 Oct 2018 • Jiaming Liu, Yu Sun, Xiaojian Xu, Ulugbek S. Kamilov
In the past decade, sparsity-driven regularization has led to significant improvements in image reconstruction.
no code implementations • 24 Dec 2018 • Yipeng Sun, Chengquan Zhang, Zuming Huang, Jiaming Liu, Junyu Han, Errui Ding
Reading text from images remains challenging due to multi-orientation, perspective distortion and especially the curved nature of irregular text.
no code implementations • 2 Jan 2019 • Jiaming Liu, Chengquan Zhang, Yipeng Sun, Junyu Han, Errui Ding
However, text in the wild is usually perspectively distorted or curved, which can not be easily tackled by existing approaches.
1 code implementation • 29 Apr 2019 • Jiaming Liu, Chi-Hao Wu, Yuzhi Wang, Qin Xu, Yuqian Zhou, Haibin Huang, Chuan Wang, Shaofan Cai, Yifan Ding, Haoqiang Fan, Jue Wang
In this paper, we present new data pre-processing and augmentation techniques for DNN-based raw image denoising.
no code implementations • 8 May 2019 • Yifan Ding, Chuan Wang, Haibin Huang, Jiaming Liu, Jue Wang, Liqiang Wang
Compared with image inpainting, performing this task on video presents new challenges such as how to preserving temporal consistency and spatial details, as well as how to handle arbitrary input video size and length fast and efficiently.
1 code implementation • NeurIPS 2019 • Yu Sun, Jiaming Liu, Ulugbek S. Kamilov
In this work, we develop a new block coordinate RED algorithm that decomposes a large-scale estimation problem into a sequence of updates over a small subset of the unknown variables.
no code implementations • ICCV 2019 • Yi He, Jiayuan Shi, Chuan Wang, Haibin Huang, Jiaming Liu, Guanbin Li, Risheng Liu, Jue Wang
In this paper we present a new data-driven method for robust skin detection from a single human portrait image.
2 code implementations • 8 Aug 2019 • Liang Wu, Chengquan Zhang, Jiaming Liu, Junyu Han, Jingtuo Liu, Errui Ding, Xiang Bai
Specifically, we propose an end-to-end trainable style retention network (SRNet) that consists of three modules: text conversion module, background inpainting module and fusion module.
Ranked #1 on Image Inpainting on StreetView
no code implementations • 4 Sep 2019 • Zihui Wu, Yu Sun, Jiaming Liu, Ulugbek S. Kamilov
Regularization by denoising (RED) is a powerful framework for solving imaging inverse problems.
no code implementations • ICCV 2019 • Shaofan Cai, Xiaoshuai Zhang, Haoqiang Fan, Haibin Huang, Jiangyu Liu, Jiaming Liu, Jiaying Liu, Jue Wang, Jian Sun
Most previous image matting methods require a roughly-specificed trimap as input, and estimate fractional alpha values for all pixels that are in the unknown region of the trimap.
no code implementations • ICCV 2019 • Yipeng Sun, Jiaming Liu, Wei Liu, Junyu Han, Errui Ding, Jingtuo Liu
Most existing text reading benchmarks make it difficult to evaluate the performance of more advanced deep learning models in large vocabularies due to the limited amount of training data.
no code implementations • 20 Sep 2019 • Jiaming Liu, Yu Sun, Ulugbek S. Kamilov
We introduce a new algorithm for regularized reconstruction of multispectral (MS) images from noisy linear measurements.
1 code implementation • 20 Sep 2019 • He guo, Xiameng Qin, Jiaming Liu, Junyu Han, Jingtuo Liu, Errui Ding
Extracting entity from images is a crucial part of many OCR applications, such as entity recognition of cards, invoices, and receipts.
Entity Extraction using GAN Optical Character Recognition (OCR)
no code implementations • 15 Mar 2020 • Kaiyan Chen, Ming Wu, Jiaming Liu, Chuang Zhang
To further promote the research of ship detection, we introduced a new fine-grained ship detection datasets, which is named as FGSD.
no code implementations • 15 May 2020 • Xiaojian Xu, Yu Sun, Jiaming Liu, Brendt Wohlberg, Ulugbek S. Kamilov
Plug-and-play priors (PnP) is a methodology for regularized image reconstruction that specifies the prior through an image denoiser.
no code implementations • 29 Sep 2020 • Weijie Gan, Yu Sun, Cihat Eldeniz, Jiaming Liu, Hongyu An, Ulugbek S. Kamilov
One of the key limitations in conventional deep learning based image reconstruction is the need for registered pairs of training images containing a set of high-quality groundtruth images.
no code implementations • ICLR 2021 • Yu Sun, Jiaming Liu, Yiran Sun, Brendt Wohlberg, Ulugbek S. Kamilov
Regularization by denoising (RED) is a recently developed framework for solving inverse problems by integrating advanced denoisers as image priors.
1 code implementation • ECCV 2020 • Yuzhi Wang, Haibin Huang, Qin Xu, Jiaming Liu, Yiqun Liu, Jue Wang
Deep learning-based image denoising approaches have been extensively studied in recent years, prevailing in many public benchmark datasets.
no code implementations • 26 Nov 2020 • Mingyang Xie, Yu Sun, Jiaming Liu, Brendt Wohlberg, Ulugbek S. Kamilov
Cal-RED extends the traditional RED methodology to imaging problems that require the calibration of the measurement operator.
no code implementations • 1 Jan 2021 • Zhaoqing Wang, Jiaming Liu, Yangyuxuan Kang, Mingming Gong, Chuang Zhang, Ming Lu, Ming Wu
Graph Reasoning has shown great potential recently in modeling long-range dependencies, which are crucial for various computer vision tasks.
1 code implementation • 22 Jan 2021 • Jiaming Liu, Yu Sun, Weijie Gan, Xiaojian Xu, Brendt Wohlberg, Ulugbek S. Kamilov
Deep unfolding networks have recently gained popularity in the context of solving imaging inverse problems.
1 code implementation • 9 Feb 2021 • Yu Sun, Jiaming Liu, Mingyang Xie, Brendt Wohlberg, Ulugbek S. Kamilov
We propose Coordinate-based Internal Learning (CoIL) as a new deep-learning (DL) methodology for the continuous representation of measurements.
1 code implementation • NeurIPS 2021 • Jiaming Liu, M. Salman Asif, Brendt Wohlberg, Ulugbek S. Kamilov
The plug-and-play priors (PnP) and regularization by denoising (RED) methods have become widely used for solving inverse problems by leveraging pre-trained deep denoisers as image priors.
1 code implementation • 12 Jul 2021 • Weijie Gan, Yu Sun, Cihat Eldeniz, Jiaming Liu, Hongyu An, Ulugbek S. Kamilov
Deep neural networks for medical image reconstruction are traditionally trained using high-quality ground-truth images as training targets.
1 code implementation • ICCV 2021 • Jiaming Liu, Ming Lu, Kaixin Chen, Xiaoqi Li, Shizun Wang, Zhaoqing Wang, Enhua Wu, Yurong Chen, Chuang Zhang, Ming Wu
Internet video delivery has undergone a tremendous explosion of growth over the past few years.
1 code implementation • 3 Sep 2021 • Xiaojian Xu, Satya V. V. N. Kothapalli, Jiaming Liu, Sayan Kahali, Weijie Gan, Dmitriy A. Yablonskiy, Ulugbek S. Kamilov
LEARN-IMG performs motion correction on mGRE images and relies on the subsequent analysis for the estimation of $R_2^\ast$ maps, while LEARN-BIO directly performs motion- and $B0$-inhomogeneity-corrected $R_2^\ast$ estimation.
1 code implementation • 30 Nov 2021 • Shizun Wang, Ming Lu, Kaixin Chen, Jiaming Liu, Xiaoqi Li, Chuang Zhang, Ming Wu
However, existing methods mostly train the DNNs on uniformly sampled LR-HR patch pairs, which makes them fail to fully exploit informative patches within the image.
no code implementations • 5 Jan 2022 • Xingqun Qi, Muyi Sun, Zijian Wang, Jiaming Liu, Qi Li, Fang Zhao, Shanghang Zhang, Caifeng Shan
To preserve the generated faces being more structure-coordinated, the IRSG models inter-class structural relations among every facial component by graph representation learning.
Generative Adversarial Network Graph Representation Learning +1
no code implementations • 10 Feb 2022 • Yuyang Hu, Jiaming Liu, Xiaojian Xu, Ulugbek S. Kamilov
Regularization by denoising (RED) is a widely-used framework for solving inverse problems by leveraging image denoisers as image priors.
1 code implementation • 22 Mar 2022 • Shizun Wang, Jiaming Liu, Kaixin Chen, Xiaoqi Li, Ming Lu, Yandong Guo
Once the incremental capacity is below the threshold, the patch can exit at the specific layer.
no code implementations • 10 Apr 2022 • Weijie Gan, Cihat Eldeniz, Jiaming Liu, Sihao Chen, Hongyu An, Ulugbek S. Kamilov
We propose a new plug-and-play priors (PnP) based MR image reconstruction method that systematically enforces data consistency while also exploiting deep-learning priors.
no code implementations • CVPR 2022 • Changyong Shu, Hemao Wu, Hang Zhou, Jiaming Liu, Zhibin Hong, Changxing Ding, Junyu Han, Jingtuo Liu, Errui Ding, Jingdong Wang
Particularly, seamless blending is achieved with the help of a Semantic-Guided Color Reference Creation procedure and a Blending UNet.
1 code implementation • 3 May 2022 • Jinze Yu, Jiaming Liu, Xiaobao Wei, Haoyi Zhou, Yohei Nakata, Denis Gudovskiy, Tomoyuki Okuno, JianXin Li, Kurt Keutzer, Shanghang Zhang
To solve this problem, we propose an end-to-end cross-domain detection Transformer based on the mean teacher framework, MTTrans, which can fully exploit unlabeled target domain data in object detection training and transfer knowledge between domains via pseudo labels.
2 code implementations • CVPR 2022 • Licheng Tang, Yiyang Cai, Jiaming Liu, Zhibin Hong, Mingming Gong, Minhu Fan, Junyu Han, Jingtuo Liu, Errui Ding, Jingdong Wang
Instead of explicitly disentangling global or component-wise modeling, the cross-attention mechanism can attend to the right local styles in the reference glyphs and aggregate the reference styles into a fine-grained style representation for the given content glyphs.
1 code implementation • 25 May 2022 • Jiaming Liu, Xiaojian Xu, Weijie Gan, Shirin Shoushtari, Ulugbek S. Kamilov
However, the dependence of the computational/memory complexity of the measurement models in PnP/RED on the total number of measurements leaves DEQ impractical for many imaging applications.
1 code implementation • 20 Jul 2022 • Xiaoqi Li, Jiaming Liu, Shizun Wang, Cheng Lyu, Ming Lu, Yurong Chen, Anbang Yao, Yandong Guo, Shanghang Zhang
Our method significantly reduces the computational cost and achieves even better performance, paving the way for applying neural video delivery techniques to practical applications.
no code implementations • 26 Jul 2022 • Shirin Shoushtari, Jiaming Liu, Yuyang Hu, Ulugbek S. Kamilov
While the empirical performance and theoretical properties of DMBAs have been widely investigated, the existing work in the area has primarily focused on their performance when the desired image prior is known exactly.
no code implementations • 26 Aug 2022 • Jianing Li, Jiaming Liu, Xiaobao Wei, Jiyuan Zhang, Ming Lu, Lei Ma, Li Du, Tiejun Huang, Shanghang Zhang
In this paper, we propose a novel Uncertainty-Guided Depth Fusion (UGDF) framework to fuse the predictions of monocular and stereo depth estimation networks for spike camera.
1 code implementation • 26 Aug 2022 • Jiaming Liu, Qizhe Zhang, Jianing Li, Ming Lu, Tiejun Huang, Shanghang Zhang
Neuromorphic spike data, an upcoming modality with high temporal resolution, has shown promising potential in real-world applications due to its inherent advantage to overcome high-velocity motion blur.
no code implementations • 23 Sep 2022 • Tomas Kerepecky, Jiaming Liu, Xue Wen Ng, David W. Piston, Ulugbek S. Kamilov
Three-dimensional fluorescence microscopy often suffers from anisotropy, where the resolution along the axial direction is lower than that within the lateral imaging plane.
no code implementations • 27 Sep 2022 • Zhiliang Xu, Hang Zhou, Zhibin Hong, Ziwei Liu, Jiaming Liu, Zhizhi Guo, Junyu Han, Jingtuo Liu, Errui Ding, Jingdong Wang
Our core idea is to leverage a style-based generator to empower high-fidelity and robust face swapping, thus the generator's advantage can be adopted for optimizing identity similarity.
no code implementations • 5 Oct 2022 • Yuyang Hu, Weijie Gan, Chunwei Ying, Tongyao Wang, Cihat Eldeniz, Jiaming Liu, Yasheng Chen, Hongyu An, Ulugbek S. Kamilov
However, estimation of accurate CSMs is a challenging problem when measurements are highly undersampled.
no code implementations • 7 Oct 2022 • Weijie Gan, Chunwei Ying, Parna Eshraghi, Tongyao Wang, Cihat Eldeniz, Yuyang Hu, Jiaming Liu, Yasheng Chen, Hongyu An, Ulugbek S. Kamilov
Our numerical results on in-vivo MRI data show that SelfDEQ leads to state-of-the-art performance using only undersampled and noisy training data.
no code implementations • 1 Nov 2022 • Junhao Hu, Shirin Shoushtari, Zihao Zou, Jiaming Liu, Zhixin Sun, Ulugbek S. Kamilov
Deep model-based architectures (DMBAs) are widely used in imaging inverse problems to integrate physical measurement models and learned image priors.
no code implementations • 1 Nov 2022 • Shirin Shoushtari, Jiaming Liu, Ulugbek S. Kamilov
Phase retrieval refers to the problem of recovering an image from the magnitudes of its complex-valued linear measurements.
no code implementations • ICCV 2023 • Jiaming Liu, Rushil Anirudh, Jayaraman J. Thiagarajan, Stewart He, K. Aditya Mohan, Ulugbek S. Kamilov, Hyojin Kim
Limited-Angle Computed Tomography (LACT) is a non-destructive evaluation technique used in a variety of applications ranging from security to medicine.
no code implementations • 30 Nov 2022 • Jiaming Liu, Rongyu Zhang, Xiaoqi Li, Xiaowei Chi, Zehui Chen, Ming Lu, Yandong Guo, Shanghang Zhang
In this paper, we propose a Multi-space Alignment Teacher-Student (MATS) framework to ease the domain shift accumulation, which consists of a Depth-Aware Teacher (DAT) and a Geometric-space Aligned Student (GAS) model.
1 code implementation • 1 Dec 2022 • Jianing Li, Ming Lu, Jiaming Liu, Yandong Guo, Li Du, Shanghang Zhang
In this paper, we propose a unified framework named BEV-LGKD to transfer the knowledge in the teacher-student manner.
no code implementations • CVPR 2023 • Xiaowei Chi, Jiaming Liu, Ming Lu, Rongyu Zhang, Zhaoqing Wang, Yandong Guo, Shanghang Zhang
In order to find them, we further propose a LiDAR-guided sampling strategy to leverage the statistical distribution of LiDAR to determine the heights of local slices.
no code implementations • CVPR 2023 • Yulu Gan, Mingjie Pan, Rongyu Zhang, Zijian Ling, Lingran Zhao, Jiaming Liu, Shanghang Zhang
To enable the device model to deal with changing environments, we propose a new learning paradigm of Cloud-Device Collaborative Continual Adaptation, which encourages collaboration between cloud and device and improves the generalization of the device model.
1 code implementation • 9 Mar 2023 • Zihao Zou, Jiaming Liu, Brendt Wohlberg, Ulugbek S. Kamilov
ELDER is based on a regularization functional parameterized by a CNN and a deep equilibrium learning (DEQ) method for training the functional to be MSE-optimal at the fixed points of the reconstruction algorithm.
1 code implementation • 17 Mar 2023 • Senqiao Yang, Jiarui Wu, Jiaming Liu, Xiaoqi Li, Qizhe Zhang, Mingjie Pan, Yulu Gan, Zehui Chen, Shanghang Zhang
The visual prompts have provided an efficient manner in addressing visual cross-domain problems.
no code implementations • 13 Apr 2023 • Huicheng Pi, Senmao Tian, Ming Lu, Jiaming Liu, Yandong Guo, Shunli Zhang
In these works, omnidirectional frames are projected from the 3D sphere to a 2D plane by Equi-Rectangular Projection (ERP).
1 code implementation • CVPR 2023 • Senmao Tian, Ming Lu, Jiaming Liu, Yandong Guo, Yurong Chen, Shunli Zhang
Therefore, we design a strategy to build an Edge-to-Bit lookup table that maps the edge score of a patch to the bit of each layer during inference.
1 code implementation • 22 May 2023 • Jiaming Liu, Yangqiming Wang, Tongze Zhang, Yulu Fan, Qinli Yang, Junming Shao
Traditional semi-supervised learning tasks assume that both labeled and unlabeled data follow the same class distribution, but the realistic open-world scenarios are of more complexity with unknown novel classes mixed in the unlabeled set.
1 code implementation • 7 Jun 2023 • Jiaming Liu, Senqiao Yang, Peidong Jia, Renrui Zhang, Ming Lu, Yandong Guo, Wei Xue, Shanghang Zhang
Note that, our method can be regarded as a novel transfer paradigm for large-scale models, delivering promising results in adaptation to continually changing distributions.
no code implementations • 15 Jun 2023 • Mingjie Pan, Li Liu, Jiaming Liu, Peixiang Huang, Longlong Wang, Shanghang Zhang, Shaoqing Xu, Zhiyi Lai, Kuiyuan Yang
In this technical report, we present our solution, named UniOCC, for the Vision-Centric 3D occupancy prediction track in the nuScenes Open Dataset Challenge at CVPR 2023.
Ranked #3 on Prediction Of Occupancy Grid Maps on Occ3D-nuScenes
no code implementations • 21 Jun 2023 • Mingjie Pan, Yulu Gan, Fangxu Zhou, Jiaming Liu, Aimin Wang, Shanghang Zhang, Dawei Li
Since the diffusion model learns the universal structural distribution of biological tissues, which is independent of the axial resolution, DiffuseIR can reconstruct authentic images with unseen low-axial resolutions into a high-axial resolution without requiring re-training.
no code implementations • 1 Jul 2023 • Peidong Jia, Jiaming Liu, Senqiao Yang, Jiarui Wu, Xiaodong Xie, Shanghang Zhang
PDM comprehensively leverages the prompt memory to extract domain-specific knowledge and explicitly constructs a long-term memory space for the data distribution, which represents better domain diversity compared to existing methods.
1 code implementation • 24 Aug 2023 • Xiangyang Zhu, Renrui Zhang, Bowei He, Ziyu Guo, Jiaming Liu, Hao Dong, Peng Gao
However, the prior pre-training stage not only introduces excessive time overhead, but also incurs a significant domain gap on `unseen' classes.
3D Semantic Segmentation Few-shot 3D semantic segmentation +1
1 code implementation • ICCV 2023 • Yingping Liang, Jiaming Liu, Debing Zhang, Ying Fu
The accuracy of learning-based optical flow estimation models heavily relies on the realism of the training datasets.
1 code implementation • 18 Sep 2023 • Mingjie Pan, Jiaming Liu, Renrui Zhang, Peixiang Huang, Xiaoqi Li, Bing Wang, Hongwei Xie, Li Liu, Shanghang Zhang
3D occupancy prediction holds significant promise in the fields of robot perception and autonomous driving, which quantifies 3D scenes into grid cells with semantic labels.
no code implementations • 22 Sep 2023 • Xiaobao Wei, Renrui Zhang, Jiarui Wu, Jiaming Liu, Ming Lu, Yandong Guo, Shanghang Zhang
Firstly, to separate the target object from the scene, we propose a novel strategy to lift the multi-view 2D segmentation masks of SAM into a unified 3D variation field.
no code implementations • 24 Sep 2023 • Jiayi Ni, Senqiao Yang, Jiaming Liu, Xiaoqi Li, Wenyu Jiao, ran Xu, Zehui Chen, Yi Liu, Shanghang Zhang
In this paper, we propose a distribution-aware tuning (DAT) method to make the semantic segmentation CTTA efficient and practical in real-world applications.
no code implementations • 29 Sep 2023 • Shirin Shoushtari, Jiaming Liu, Edward P. Chandler, M. Salman Asif, Ulugbek S. Kamilov
Our second set of numerical results considers a simple and effective domain adaption strategy that closes the performance gap due to the use of mismatched denoisers.
1 code implementation • 5 Oct 2023 • Marien Renaud, Jiaming Liu, Valentin De Bortoli, Andrés Almansa, Ulugbek S. Kamilov
Posterior sampling has been shown to be a powerful Bayesian approach for solving imaging inverse problems.
1 code implementation • 9 Oct 2023 • Bohan Zeng, Shanglin Li, Yutang Feng, Hong Li, Sicheng Gao, Jiaming Liu, Huaxia Li, Xu Tang, Jianzhuang Liu, Baochang Zhang
Recent advances in 3D generation have been remarkable, with methods such as DreamFusion leveraging large-scale text-to-image diffusion-based models to supervise 3D generation.
no code implementations • 13 Oct 2023 • Xiaoqi Li, Yanzi Wang, Yan Shen, Ponomarenko Iaroslav, Haoran Lu, Qianxu Wang, Boshi An, Jiaming Liu, Hao Dong
This framework is designed to capture multiple perspectives of the target object and infer depth information to complement its geometry.
no code implementations • 20 Nov 2023 • Yuan Zhang, Tao Huang, Jiaming Liu, Tao Jiang, Kuan Cheng, Shanghang Zhang
(2) During the distillation period, a pixel-wise frequency mask is generated via Frequency Prompt, to localize those pixel of interests (PoIs) in various frequency bands.
no code implementations • 26 Nov 2023 • Zihao Zou, Jiaming Liu, Shirin Shoushtari, YuBo Wang, Weijie Gan, Ulugbek S. Kamilov
Face video restoration (FVR) is a challenging but important problem where one seeks to recover a perceptually realistic face videos from a low-quality input.
1 code implementation • 28 Nov 2023 • Shangkun Sun, Jiaming Liu, Thomas H. Li, Huaxia Li, Guoqing Liu, Wei Gao
To address this issue, multi-frame optical flow methods leverage adjacent frames to mitigate the local ambiguity.
1 code implementation • 5 Dec 2023 • Qizhe Zhang, Bocheng Zou, Ruichuan An, Jiaming Liu, Shanghang Zhang
Motivated by this, we propose Mixture of Sparse Adapters, or MoSA, as a novel Adapter Tuning method to fully unleash the potential of each parameter in the adapter.
2 code implementations • 11 Dec 2023 • Jiaming Liu, Yue Wu, Maoguo Gong, Qiguang Miao, Wenping Ma, Can Qin
3D Single Object Tracking (SOT) stands a forefront task of computer vision, proving essential for applications like autonomous driving.
no code implementations • 19 Dec 2023 • Jiaming Liu, ran Xu, Senqiao Yang, Renrui Zhang, Qizhe Zhang, Zehui Chen, Yandong Guo, Shanghang Zhang
To tackle these issues, we propose a continual self-supervised method, Adaptive Distribution Masked Autoencoders (ADMA), which enhances the extraction of target domain knowledge while mitigating the accumulation of distribution shifts.
no code implementations • 21 Dec 2023 • Senqiao Yang, Jiaming Liu, Ray Zhang, Mingjie Pan, Zoey Guo, Xiaoqi Li, Zehui Chen, Peng Gao, Yandong Guo, Shanghang Zhang
In this paper, we introduce LiDAR-LLM, which takes raw LiDAR data as input and harnesses the remarkable reasoning capabilities of LLMs to gain a comprehensive understanding of outdoor 3D scenes.
no code implementations • 24 Dec 2023 • Xiaoqi Li, Mingxu Zhang, Yiran Geng, Haoran Geng, Yuxing Long, Yan Shen, Renrui Zhang, Jiaming Liu, Hao Dong
By fine-tuning the injected adapters, we preserve the inherent common sense and reasoning ability of the MLLMs while equipping them with the ability for manipulation.
1 code implementation • 26 Dec 2023 • Yuxuan Zhang, Yiren Song, Jiaming Liu, Rui Wang, Jinpeng Yu, Hao Tang, Huaxia Li, Xu Tang, Yao Hu, Han Pan, Zhongliang Jing
Recent advancements in subject-driven image generation have led to zero-shot generation, yet precise selection and focus on crucial subject representations remain challenging.
no code implementations • 26 Dec 2023 • Guanqun Wang, Jiaming Liu, Chenxuan Li, Junpeng Ma, Yuan Zhang, Xinyu Wei, Kevin Zhang, Maurice Chong, Ray Zhang, Yijiang Liu, Shanghang Zhang
However, the deployment of these large-scale MLLMs on client devices is hindered by their extensive model parameters, leading to a notable decline in generalization capabilities when these models are compressed for device deployment.
no code implementations • 27 Dec 2023 • Rongyu Zhang, Yulin Luo, Jiaming Liu, Huanrui Yang, Zhen Dong, Denis Gudovskiy, Tomoyuki Okuno, Yohei Nakata, Kurt Keutzer, Yuan Du, Shanghang Zhang
In this work, we propose an efficient MoE architecture with weight sharing across the experts.
no code implementations • 28 Dec 2023 • Shanglin Li, Bohan Zeng, Yutang Feng, Sicheng Gao, Xuhui Liu, Jiaming Liu, Li Lin, Xu Tang, Yao Hu, Jianzhuang Liu, Baochang Zhang
We then propose a Region-IoU scheme for precise image layer extraction from an off-the-shelf segment model.
no code implementations • 27 Feb 2024 • Zehui Chen, Qiuchen Wang, Zhenyu Li, Jiaming Liu, Shanghang Zhang, Feng Zhao
In this report, we present our solution to the multi-task robustness track of the 1st Visual Continual Learning (VCL) Challenge at ICCV 2023 Workshop.
no code implementations • 12 Mar 2024 • Yuxuan Zhang, Lifu Wei, Qing Zhang, Yiren Song, Jiaming Liu, Huaxia Li, Xu Tang, Yao Hu, Haibo Zhao
Current makeup transfer methods are limited to simple makeup styles, making them difficult to apply in real-world scenarios.
no code implementations • 15 Mar 2024 • Edward P. Chandler, Shirin Shoushtari, Jiaming Liu, M. Salman Asif, Ulugbek S. Kamilov
A common issue with the learned models is that of a performance drop when there is a distribution shift between the training and testing data.
no code implementations • 16 Mar 2024 • Rui Wang, Hailong Guo, Jiaming Liu, Huaxia Li, Haibo Zhao, Xu Tang, Yao Hu, Hao Tang, Peipei Li
In this paper, we introduce StableGarment, a unified framework to tackle garment-centric(GC) generation tasks, including GC text-to-image, controllable GC text-to-image, stylized GC text-to-image, and robust virtual try-on.
no code implementations • 22 Mar 2024 • Hongzhi Gao, Zheng Chen, Zehui Chen, Lin Chen, Jiaming Liu, Shanghang Zhang, Feng Zhao
Training high-accuracy 3D detectors necessitates massive labeled 3D annotations with 7 degree-of-freedom, which is laborious and time-consuming.
1 code implementation • 26 Mar 2024 • Chenhongyi Yang, Zehui Chen, Miguel Espinosa, Linus Ericsson, Zhenyu Wang, Jiaming Liu, Elliot J. Crowley
In this paper, we further adapt the selective scanning process of Mamba to the visual domain, enhancing its ability to learn features from two-dimensional images by (i) a continuous 2D scanning process that improves spatial continuity by ensuring adjacency of tokens in the scanning sequence, and (ii) direction-aware updating which enables the model to discern the spatial relations of tokens by encoding directional information.