1 code implementation • 29 Feb 2024 • Zhihao Duan, Ming Lu, Justin Yang, Jiangpeng He, Zhan Ma, Fengqing Zhu
This paper explores the possibility of extending the capability of pre-trained neural image compressors (e. g., adapting to new data or target bitrates) without breaking backward compatibility, the ability to decode bitstreams encoded by the original model.
no code implementations • 17 Feb 2024 • Gaocheng Ma, Yinfeng Chai, Tianhao Jiang, Ming Lu, Tong Chen
Image compression has been the subject of extensive research for several decades, resulting in the development of well-known standards such as JPEG, JPEG2000, and H. 264/AVC.
no code implementations • 31 Jan 2024 • Jianing Li, Xi Nan, Ming Lu, Li Du, Shanghang Zhang
To overcome this limitation in MLLMs, we introduce Proximity Question Answering (Proximity QA), a novel framework designed to enable MLLMs to infer the proximity relationship between objects in images.
no code implementations • 21 Jan 2024 • Yichi Zhang, Zhihao Duan, Ming Lu, Dandan Ding, Fengqing Zhu, Zhan Ma
While convolution and self-attention are extensively used in learned image compression (LIC) for transform coding, this paper proposes an alternative called Contextual Clustering based LIC (CLIC) which primarily relies on clustering operations and local attention for correlation characterization and compact representation of an image.
1 code implementation • 9 Jan 2024 • Linshan Wu, Ming Lu, Leyuan Fang
Compared with the existing category alignment methods, our CR aims to regularize the correlation between different dimensions of the features and thus performs more robustly when dealing with the divergent category features of imbalanced and inconsistent distributions.
no code implementations • 6 Jan 2024 • Mengfei Li, Ming Lu, Xiaofang Li, Shanghang Zhang
First, existing methods assume enough high-quality images are available for training the NeRF model, ignoring real-world image degradation.
no code implementations • 12 Dec 2023 • Ming Lu, Zhihao Duan, Fengqing Zhu, Zhan Ma
Recently, probabilistic predictive coding that directly models the conditional distribution of latent features across successive frames for temporal redundancy removal has yielded promising results.
no code implementations • 3 Dec 2023 • Jianchen Zhao, Cheng-Ching Tseng, Ming Lu, Ruichuan An, Xiaobao Wei, He Sun, Shanghang Zhang
However, manually designing the partition scheme for a complex scene is very challenging and fails to jointly learn the partition and INRs.
no code implementations • 28 Nov 2023 • Xiaobao Wei, Jiajun Cao, Yizhu Jin, Ming Lu, Guangyu Wang, Shanghang Zhang
In contrast, implicit methods learn continuous representations for segmentation, which is crucial for medical image segmentation.
no code implementations • 28 Nov 2023 • Peng Chen, Xiaobao Wei, Ming Lu, Yitong Zhu, Naiming Yao, Xingyu Xiao, Hui Chen
To address the above limitations, we propose DiffusionTalker, a diffusion-based method that utilizes contrastive learning to personalize 3D facial animation and knowledge distillation to accelerate 3D animation generation.
no code implementations • 12 Oct 2023 • Yun Ye, Yanjie Pan, Qually Jiang, Ming Lu, Xiaoran Fang, Beryl Xu
Over-fitting-based image compression requires weights compactness for compression and fast convergence for practical use, posing challenges for deep convolutional neural networks (CNNs) based methods.
no code implementations • 22 Sep 2023 • Xiaobao Wei, Renrui Zhang, Jiarui Wu, Jiaming Liu, Ming Lu, Yandong Guo, Shanghang Zhang
Firstly, to separate the target object from the scene, we propose a novel strategy to lift the multi-view 2D segmentation masks of SAM into a unified 3D variation field.
no code implementations • ICCV 2023 • Yifan Zhang, Zhen Dong, Huanrui Yang, Ming Lu, Cheng-Ching Tseng, Yuan Du, Kurt Keutzer, Li Du, Shanghang Zhang
Multi-view 3D detection based on BEV (bird-eye-view) has recently achieved significant improvements.
1 code implementation • 8 Jun 2023 • Hejun Huang, Zuguo Chen, Ying Zou, Ming Lu, Chaoyang Chen
An efficient Channel Prior Convolutional Attention (CPCA) method is proposed in this paper, supporting the dynamic distribution of attention weights in both channel and spatial dimensions.
1 code implementation • 7 Jun 2023 • Jiaming Liu, Senqiao Yang, Peidong Jia, Renrui Zhang, Ming Lu, Yandong Guo, Wei Xue, Shanghang Zhang
Note that, our method can be regarded as a novel transfer paradigm for large-scale models, delivering promising results in adaptation to continually changing distributions.
1 code implementation • CVPR 2023 • Senmao Tian, Ming Lu, Jiaming Liu, Yandong Guo, Yurong Chen, Shunli Zhang
Therefore, we design a strategy to build an Edge-to-Bit lookup table that maps the edge score of a patch to the bit of each layer during inference.
no code implementations • 13 Apr 2023 • Huicheng Pi, Senmao Tian, Ming Lu, Jiaming Liu, Yandong Guo, Shunli Zhang
In these works, omnidirectional frames are projected from the 3D sphere to a 2D plane by Equi-Rectangular Projection (ERP).
no code implementations • 24 Mar 2023 • Yulin Luo, Rui Zhao, Xiaobao Wei, Jinwei Chen, Yijie Lu, Shenghao Xie, Tianyu Wang, Ruiqin Xiong, Ming Lu, Shanghang Zhang
Our MoWE achieves SOTA performance in upstream task on the proposed dataset and two public datasets, i. e. All-Weather and Rain/Fog-Cityscapes, and also have better perceptual results in downstream segmentation task compared to other methods.
2 code implementations • 16 Feb 2023 • Zhihao Duan, Ming Lu, Jack Ma, Yuning Huang, Zhan Ma, Fengqing Zhu
This paper addresses the problem of lossy image compression, a fundamental problem in image processing and information theory that is involved in many real-world applications.
1 code implementation • 15 Dec 2022 • Zhihao LI, Ming Lu, Xu Zhang, Xin Feng, M. Salman Asif, Zhan Ma
Conventional cameras capture image irradiance on a sensor and convert it to RGB images using an image signal processor (ISP).
no code implementations • CVPR 2023 • Xiaowei Chi, Jiaming Liu, Ming Lu, Rongyu Zhang, Zhaoqing Wang, Yandong Guo, Shanghang Zhang
In order to find them, we further propose a LiDAR-guided sampling strategy to leverage the statistical distribution of LiDAR to determine the heights of local slices.
1 code implementation • 1 Dec 2022 • Jianing Li, Ming Lu, Jiaming Liu, Yandong Guo, Li Du, Shanghang Zhang
In this paper, we propose a unified framework named BEV-LGKD to transfer the knowledge in the teacher-student manner.
no code implementations • 30 Nov 2022 • Jiaming Liu, Rongyu Zhang, Xiaoqi Li, Xiaowei Chi, Zehui Chen, Ming Lu, Yandong Guo, Shanghang Zhang
In this paper, we propose a Multi-space Alignment Teacher-Student (MATS) framework to ease the domain shift accumulation, which consists of a Depth-Aware Teacher (DAT) and a Geometric-space Aligned Student (GAS) model.
no code implementations • 5 Nov 2022 • Junqi Shi, Ming Lu, Zhan Ma
Quantizing a floating-point neural network to its fixed-point representation is crucial for Learned Image Compression (LIC) because it improves decoding consistency for interoperability and reduces space-time complexity for implementation.
1 code implementation • 4 Oct 2022 • Hejun Huang, Zuguo Chen, Chaoyang Chen, Ming Lu, Ying Zou
A network based on complementary consistency training, called CC-Net, has been proposed for semi-supervised left atrium image segmentation.
2 code implementations • 27 Aug 2022 • Zhihao Duan, Ming Lu, Zhan Ma, Fengqing Zhu
Recent research has shown a strong theoretical connection between variational autoencoders (VAEs) and the rate-distortion theory.
1 code implementation • 26 Aug 2022 • Jiaming Liu, Qizhe Zhang, Jianing Li, Ming Lu, Tiejun Huang, Shanghang Zhang
Neuromorphic spike data, an upcoming modality with high temporal resolution, has shown promising potential in real-world applications due to its inherent advantage to overcome high-velocity motion blur.
no code implementations • 26 Aug 2022 • Jianing Li, Jiaming Liu, Xiaobao Wei, Jiyuan Zhang, Ming Lu, Lei Ma, Li Du, Tiejun Huang, Shanghang Zhang
In this paper, we propose a novel Uncertainty-Guided Depth Fusion (UGDF) framework to fuse the predictions of monocular and stereo depth estimation networks for spike camera.
1 code implementation • 20 Jul 2022 • Xiaoqi Li, Jiaming Liu, Shizun Wang, Cheng Lyu, Ming Lu, Yurong Chen, Anbang Yao, Yandong Guo, Shanghang Zhang
Our method significantly reduces the computational cost and achieves even better performance, paving the way for applying neural video delivery techniques to practical applications.
1 code implementation • 19 Jul 2022 • Jingwang Ling, Zhibo Wang, Ming Lu, Quan Wang, Chen Qian, Feng Xu
Previous works on morphable models mostly focus on large-scale facial geometry but ignore facial details.
1 code implementation • 25 Apr 2022 • Ming Lu, Fangdong Chen, ShiLiang Pu, Zhan Ma
To this end, Integrated Convolution and Self-Attention (ICSA) unit is first proposed to form a content-adaptive transform to characterize and embed neighborhood information dynamically of any input.
1 code implementation • 22 Mar 2022 • Shizun Wang, Jiaming Liu, Kaixin Chen, Xiaoqi Li, Ming Lu, Yandong Guo
Once the incremental capacity is below the threshold, the patch can exit at the specific layer.
no code implementations • 26 Feb 2022 • Zhihao Duan, Ming Lu, Zhan Ma, Fengqing Zhu
End-to-end learned lossy image coders (LICs), as opposed to hand-crafted image codecs, have shown increasing superiority in terms of the rate-distortion performance.
no code implementations • 10 Jan 2022 • Ming Lu, Leyuan Fang, Muxing Li, Bob Zhang, Yi Zhang, Pedram Ghamisi
Therefore, we study how to utilize point labels to extract water bodies and propose a novel method called the neighbor feature aggregation network (NFANet).
1 code implementation • 30 Nov 2021 • Shizun Wang, Ming Lu, Kaixin Chen, Jiaming Liu, Xiaoqi Li, Chuang Zhang, Ming Wu
However, existing methods mostly train the DNNs on uniformly sampled LR-HR patch pairs, which makes them fail to fully exploit informative patches within the image.
no code implementations • 12 Nov 2021 • Ming Lu, Peiyao Guo, Huiqing Shi, Chuntong Cao, Zhan Ma
A Transformer-based Image Compression (TIC) approach is developed which reuses the canonical variational autoencoder (VAE) architecture with paired main and hyper encoder-decoders.
1 code implementation • ICCV 2021 • Jiaming Liu, Ming Lu, Kaixin Chen, Xiaoqi Li, Shizun Wang, Zhaoqing Wang, Enhua Wu, Yurong Chen, Chuang Zhang, Ming Wu
Internet video delivery has undergone a tremendous explosion of growth over the past few years.
1 code implementation • 11 Aug 2021 • Yikai Wang, Fuchun Sun, Ming Lu, Anbang Yao
We propose a compact and effective framework to fuse multimodal features at multiple layers in a single network.
Ranked #41 on Semantic Segmentation on NYU Depth v2
no code implementations • 5 Aug 2021 • Haojie Liu, Ming Lu, Zhiqi Chen, Xun Cao, Zhan Ma, Yao Wang
We further design a one-to-many decoder pipeline to generate multiple predictions from the CSTR, including vector-based resampling, adaptive kernel-based resampling, compensation mode selection maps and texture enhancements, and combines them adaptively to achieve more accurate inter prediction.
1 code implementation • 21 Apr 2021 • Ren Yang, Radu Timofte, Jing Liu, Yi Xu, Xinjian Zhang, Minyi Zhao, Shuigeng Zhou, Kelvin C. K. Chan, Shangchen Zhou, Xiangyu Xu, Chen Change Loy, Xin Li, Fanglong Liu, He Zheng, Lielin Jiang, Qi Zhang, Dongliang He, Fu Li, Qingqing Dang, Yibin Huang, Matteo Maggioni, Zhongqian Fu, Shuai Xiao, Cheng Li, Thomas Tanay, Fenglong Song, Wentao Chao, Qiang Guo, Yan Liu, Jiang Li, Xiaochao Qu, Dewang Hou, Jiayu Yang, Lyn Jiang, Di You, Zhenyu Zhang, Chong Mou, Iaroslav Koshelev, Pavel Ostyakov, Andrey Somov, Jia Hao, Xueyi Zou, Shijie Zhao, Xiaopeng Sun, Yiting Liao, Yuanzhi Zhang, Qing Wang, Gen Zhan, Mengxi Guo, Junlin Li, Ming Lu, Zhan Ma, Pablo Navarrete Michelini, Hai Wang, Yiyun Chen, Jingyu Guo, Liliang Zhang, Wenming Yang, Sijung Kim, Syehoon Oh, Yucong Wang, Minjie Cai, Wei Hao, Kangdi Shi, Liangyan Li, Jun Chen, Wei Gao, Wang Liu, XiaoYu Zhang, Linjie Zhou, Sixin Lin, Ru Wang
This paper reviews the first NTIRE challenge on quality enhancement of compressed video, with a focus on the proposed methods and results.
no code implementations • 1 Jan 2021 • Zhaoqing Wang, Jiaming Liu, Yangyuxuan Kang, Mingming Gong, Chuang Zhang, Ming Lu, Ming Wu
Graph Reasoning has shown great potential recently in modeling long-range dependencies, which are crucial for various computer vision tasks.
no code implementations • 1 Dec 2020 • Ming Lu, Tong Chen, Dandan Ding, Fengqing Zhu, Zhan Ma
Inspired by the facts that retinal cells actually segregate the visual scene into different attributes (e. g., spatial details, temporal motion) for respective neuronal processing, we propose to first decompose the input video into respective spatial texture frames (STF) at its native spatial resolution that preserve the rich spatial details, and the other temporal motion frames (TMF) at a lower spatial resolution that retain the motion smoothness; then compress them together using any popular video coder; and finally synthesize decoded STFs and TMFs for high-fidelity video reconstruction at the same resolution as its native input.
no code implementations • 17 Oct 2020 • Yunchao Wei, Shuai Zheng, Ming-Ming Cheng, Hang Zhao, LiWei Wang, Errui Ding, Yi Yang, Antonio Torralba, Ting Liu, Guolei Sun, Wenguan Wang, Luc van Gool, Wonho Bae, Junhyug Noh, Jinhwan Seo, Gunhee Kim, Hao Zhao, Ming Lu, Anbang Yao, Yiwen Guo, Yurong Chen, Li Zhang, Chuangchuang Tan, Tao Ruan, Guanghua Gu, Shikui Wei, Yao Zhao, Mariia Dobko, Ostap Viniavskyi, Oles Dobosevych, Zhendong Wang, Zhenyuan Chen, Chen Gong, Huanqing Yan, Jun He
The purpose of the Learning from Imperfect Data (LID) workshop is to inspire and facilitate the research in developing novel approaches that would harness the imperfect data and improve the data-efficiency during training.
no code implementations • 9 Jul 2020 • Haojie Liu, Ming Lu, Zhan Ma, Fan Wang, Zhihuang Xie, Xun Cao, Yao Wang
Over the past two decades, traditional block-based video coding has made remarkable progress and spawned a series of well-known standards such as MPEG-4, H. 264/AVC and H. 265/HEVC.
no code implementations • 13 Dec 2019 • Haojie Liu, Han Shen, Lichao Huang, Ming Lu, Tong Chen, Zhan Ma
Traditional video compression technologies have been developed over decades in pursuit of higher coding efficiency.
2 code implementations • ICCV 2019 • Ming Lu, Hao Zhao, Anbang Yao, Yurong Chen, Feng Xu, Li Zhang
Although plenty of methods have been proposed, a theoretical analysis of feature transform is still missing.
no code implementations • 3 May 2019 • Ming Lu, Ming Cheng, Yiling Xu, ShiLiang Pu, Qiu Shen, Zhan Ma
Networked video applications, e. g., video conferencing, often suffer from poor visual quality due to unexpected network fluctuation and limited bandwidth.
no code implementations • 19 Apr 2019 • Yiwen Guo, Ming Lu, WangMeng Zuo, Chang-Shui Zhang, Yurong Chen
Convolutional neural networks have been proven effective in a variety of image restoration tasks.
no code implementations • ICCV 2017 • Ming Lu, Hao Zhao, Anbang Yao, Feng Xu, Yurong Chen, Li Zhang
Our method decomposes the semantic style transfer problem into feature reconstruction part and feature decoder part.
1 code implementation • CVPR 2017 • Tao Kong, Fuchun Sun, Anbang Yao, Huaping Liu, Ming Lu, Yurong Chen
To address (a), we design the reverse connection, which enables the network to detect objects on multi-levels of CNNs.
no code implementations • CVPR 2017 • Hao Zhao, Ming Lu, Anbang Yao, Yiwen Guo, Yurong Chen, Li Zhang
In this paper, we propose an alternative method to estimate room layouts of cluttered indoor scenes.