Search Results for author: Ming Lu

Found 83 papers, 39 papers with code

VGNC: Reducing the Overfitting of Sparse-view 3DGS via Validation-guided Gaussian Number Control

no code implementations 20 Apr 2025 Lifeng Lin, Rongfeng Lu, Quan Chen, Haofan Ren, Ming Lu, Yaoqi Sun, Chenggang Yan, Anke Xue

Recently, many methods based on the 3D Gaussian Splatting (3DGS) framework have been proposed to address sparse-view 3D reconstruction.

3DGS 3D Reconstruction +2

TimeSearch: Hierarchical Video Search with Spotlight and Reflection for Human-like Long Video Understanding

no code implementations 2 Apr 2025 Junwen Pan, Rui Zhang, Xin Wan, Yuan Zhang, Ming Lu, Qi She

Motivated by human hierarchical temporal search strategies, we propose TimeSearch, a novel framework enabling LVLMs to understand long videos in a human-like manner.

Video Understanding

CoGen: 3D Consistent Video Generation via Adaptive Conditioning for Autonomous Driving

no code implementations 28 Mar 2025 Yishen Ji, Ziyue Zhu, Zhenxin Zhu, Kaixin Xiong, Ming Lu, Zhiqi Li, Lijun Zhou, Haiyang Sun, Bing Wang, Tong Lu

Recent progress in driving video generation has shown significant potential for enhancing self-driving systems by providing scalable and controllable training data.

3D Generation Autonomous Driving +1

MC-LLaVA: Multi-Concept Personalized Vision-Language Model

1 code implementation 24 Mar 2025 Ruichuan An, Sihan Yang, Ming Lu, Renrui Zhang, Kai Zeng, Yulin Luo, Jiajun Cao, Hao Liang, Ying Chen, Qi She, Shanghang Zhang, Wentao Zhang

To reduce the costs related to joint training, we propose a personalized textual prompt that uses visual token information to initialize concept tokens.

Language Modeling Language Modelling +2

DiffusionTalker: Efficient and Compact Speech-Driven 3D Talking Head via Personalizer-Guided Distillation

1 code implementation 23 Mar 2025 Peng Chen, Xiaobao Wei, Ming Lu, Hui Chen, Feng Tian

We further propose a personalizer enhancer during distillation to enhance the influence of embeddings on facial animation.

3D Face Animation

On Quantizing Neural Representation for Variable-Rate Video Coding

1 code implementation 17 Feb 2025 Junqi Shi, Zhujia Chen, Hanfei Li, Qi Zhao, Ming Lu, Tong Chen, Zhan Ma

This work introduces variable-rate INR-VC for the first time and lays a theoretical foundation for future research in rate-distortion optimization, advancing the field of video coding technology.

Quantization

CMamba: Learned Image Compression with State Space Models

no code implementations 7 Feb 2025 Zhuojie Wu, Heming Du, Shuyun Wang, Ming Lu, Haiyang Sun, Yandong Guo, Xin Yu

In this paper, we propose a hybrid Convolution and State Space Models (SSMs) based image compression framework, termed CMamba, to achieve superior rate-distortion performance with low computational complexity.

Image Compression State Space Models

SliceOcc: Indoor 3D Semantic Occupancy Prediction with Vertical Slice Representation

1 code implementation 28 Jan 2025 Jianing Li, Ming Lu, Hao Wang, Chenyang Gu, Wenzhao Zheng, Li Du, Shanghang Zhang

To utilize these slice features, we propose SliceOcc, an RGB camera-based model specifically tailored for indoor 3D semantic occupancy prediction.

3D Semantic Occupancy Prediction Autonomous Driving +1

Fast-RF-Shimming: Accelerate RF Shimming in 7T MRI using Deep Learning

no code implementations 21 Jan 2025 Zhengyi Lu, Hao Liang, Ming Lu, Xiao Wang, Xinqiang Yan, Yuankai Huo

This approach offers a faster and more efficient solution to RF shimming challenges in UHF MRI.

Towards Loss-Resilient Image Coding for Unstable Satellite Networks

no code implementations 20 Jan 2025 Hongwei Sha, Muchen Dong, Quanyou Luo, Ming Lu, Hao Chen, Zhan Ma

Geostationary Earth Orbit (GEO) satellite communication demonstrates significant advantages in emergency short burst data services.

Decoder Image Compression

MoVE-KD: Knowledge Distillation for VLMs with Mixture of Visual Encoders

no code implementations 3 Jan 2025 Jiajun Cao, Yuan Zhang, Tao Huang, Ming Lu, Qizhe Zhang, Ruichuan An, Ningning Ma, Shanghang Zhang

Visual encoders are fundamental components in vision-language models (VLMs), each showcasing unique strengths derived from various pre-trained visual foundation models.

Knowledge Distillation Mixture-of-Experts

Adaptive Rate Control for Deep Video Compression with Rate-Distortion Prediction

no code implementations 25 Dec 2024 Bowen Gu, Hao Chen, Ming Lu, Jie Yao, Zhan Ma

In this paper, we propose a neural network-based $\lambda$-domain rate control scheme for deep video compression, which determines the coding parameter $\lambda$ for each to-be-coded frame based on the rate-distortion-$\lambda$ (R-D-$\lambda$) relationships directly learned from uncompressed frames, achieving high rate control accuracy efficiently without the need for pre-encoding.

Video Compression

GraphAvatar: Compact Head Avatars with GNN-Generated 3D Gaussians

1 code implementation 18 Dec 2024 Xiaobao Wei, Peng Chen, Ming Lu, Hui Chen, Feng Tian

In this paper, we introduce a method called GraphAvatar that utilizes Graph Neural Networks (GNN) to generate 3D Gaussians for the head avatar.

3DGS NeRF

ASGDiffusion: Parallel High-Resolution Generation with Asynchronous Structure Guidance

no code implementations 9 Dec 2024 Yuming Li, Peidong Jia, Daiwei Hong, Yueru Jia, Qi She, Rui Zhao, Ming Lu, Shanghang Zhang

To solve the above limitations, we introduce a novel method named ASGDiffusion for parallel HR generation with Asynchronous Structure Guidance (ASG) using pre-trained diffusion models.

Denoising Image Generation

MixedGaussianAvatar: Realistically and Geometrically Accurate Head Avatar via Mixed 2D-3D Gaussian Splatting

1 code implementation 6 Dec 2024 Peng Chen, Xiaobao Wei, Qingpo Wuwu, Xinyi Wang, Xingyu Xiao, Ming Lu

We attach the 2D Gaussians to the triangular mesh of the FLAME model and connect additional 3D Gaussians to those 2D Gaussians where the rendering quality of 2DGS is inadequate, creating a mixed 2D-3D Gaussian representation.

3DGS NeRF

EMD: Explicit Motion Modeling for High-Quality Street Gaussian Splatting

no code implementations 23 Nov 2024 Xiaobao Wei, Qingpo Wuwu, Zhongyu Zhao, Zhuangzhe Wu, Nan Huang, Ming Lu, Ningning Ma, Shanghang Zhang

To address this, we propose Explicit Motion Decomposition (EMD), which models the motions of dynamic objects by introducing learnable motion embeddings to the Gaussians, enhancing the decomposition in street scenes.

Autonomous Driving

GazeGaussian: High-Fidelity Gaze Redirection with 3D Gaussian Splatting

no code implementations 20 Nov 2024 Xiaobao Wei, Peng Chen, Guangyu Li, Ming Lu, Hui Chen, Feng Tian

Comprehensive experiments show that GazeGaussian outperforms existing methods in rendering speed, gaze redirection accuracy, and facial synthesis across multiple datasets.

3DGS Gaze Estimation +2

MC-LLaVA: Multi-Concept Personalized Vision-Language Model

1 code implementation 18 Nov 2024 Ruichuan An, Sihan Yang, Ming Lu, Renrui Zhang, Kai Zeng, Yulin Luo, Jiajun Cao, Hao Liang, Ying Chen, Qi She, Shanghang Zhang, Wentao Zhang

To reduce the costs related to joint training, we propose a personalized textual prompt that uses visual token information to initialize concept tokens.

Language Modeling Language Modelling +2

PLGS: Robust Panoptic Lifting with 3D Gaussian Splatting

no code implementations 23 Oct 2024 Yu Wang, Xiaobao Wei, Ming Lu, Guoliang Kang

In this paper, we propose a new method called PLGS that enables 3DGS to generate consistent panoptic segmentation masks from noisy 2D segmentation masks while maintaining superior efficiency compared to NeRF-based methods.

3DGS NeRF +1

High-Efficiency Neural Video Compression via Hierarchical Predictive Learning

1 code implementation 3 Oct 2024 Ming Lu, Zhihao Duan, Wuyang Cong, Dandan Ding, Fengqing Zhu, Zhan Ma

This feature-space processing operates from the lowest to the highest scale of each frame, completely eliminating the need for the complexity-intensive motion estimation and compensation techniques that have been standard in video codecs for decades.

Motion Estimation Video Compression

All-in-One Image Coding for Joint Human-Machine Vision with Multi-Path Aggregation

1 code implementation 29 Sep 2024 Xu Zhang, Peiyao Guo, Ming Lu, Zhan Ma

Experimental results show that MPA achieves performance comparable to state-of-the-art methods in both task-specific and multi-objective optimization across human viewing and machine analysis tasks.

All Data Compression +4

ThermalGaussian: Thermal 3D Gaussian Splatting

1 code implementation 11 Sep 2024 Rongfeng Lu, Hangyu Chen, Zunjie Zhu, Yuhang Qin, Ming Lu, Le Zhang, Chenggang Yan, Anke Xue

In this work, we propose ThermalGaussian, the first thermal 3DGS approach capable of rendering high-quality images in RGB and thermal modalities.

3DGS NeRF

Accelerating block-level rate control for learned image compression

no code implementations 2 Sep 2024 Muchen Dong, Ming Lu, Zhan Ma

Despite the unprecedented compression efficiency achieved by deep learned image compression (LIC), existing methods usually approximate the desired bitrate by adjusting a single quality factor for a given input image, which may compromise the rate control results.

Image Compression

HINER: Neural Representation for Hyperspectral Image

1 code implementation 31 Jul 2024 Junqi Shi, Mingyi Jiang, Ming Lu, Tong Chen, Xun Cao, Zhan Ma

For downstream classification on compressed HSI, we theoretically demonstrate the task accuracy is not only related to the classification loss but also to the reconstruction fidelity through a first-order expansion of the accuracy degradation, and accordingly adapt the reconstruction by introducing Adaptive Spectral Weighting.

Classification Data Augmentation +1

3DRealCar: An In-the-wild RGB-D Car Dataset with 360-degree Views

no code implementations 7 Jun 2024 Xiaobiao Du, Haiyang Sun, Shuyun Wang, Zhuojie Wu, Hongwei Sheng, Jiaying Ying, Ming Lu, Tianqing Zhu, Kun Zhan, Xin Yu

(1) High-Volume: 2,500 cars are meticulously scanned by 3D scanners, obtaining car images and point clouds with real-world dimensions; (2) High-Quality: Each car is captured in an average of 200 dense, high-resolution 360-degree RGB-D views, enabling high-fidelity 3D reconstruction; (3) High-Diversity: The dataset contains various cars from over 100 brands, collected under three distinct lighting conditions, including reflective, standard, and dark.

3D Reconstruction

S³Gaussian: Self-Supervised Street Gaussians for Autonomous Driving

1 code implementation 30 May 2024 Nan Huang, Xiaobao Wei, Wenzhao Zheng, Pengju An, Ming Lu, Wei Zhan, Masayoshi Tomizuka, Kurt Keutzer, Shanghang Zhang

Photorealistic 3D reconstruction of street scenes is a critical technique for developing real-world simulators for autonomous driving.

3DGS 3D Reconstruction +3

Implicit Neural Image Field for Biological Microscopy Image Compression

1 code implementation 29 May 2024 Gaole Dai, Cheng-Ching Tseng, Qingpo Wuwu, Rongyu Zhang, Shaokang Wang, Ming Lu, Tiejun Huang, Yu Zhou, Ali Ata Tuz, Matthias Gunzer, Jianxu Chen, Shanghang Zhang

The rapid pace of innovation in biological microscopy imaging has led to large images, putting pressure on data storage and impeding efficient sharing, management, and visualization.

Image Compression Management

Three-layer deep learning network random trees for fault detection in chemical production process

no code implementations 1 May 2024 Ming Lu, Zhen Gao, Ying Zou, Zuguo Chen, Pei Li

With the development of technology, the chemical production process is becoming increasingly complex and large-scale, making fault detection particularly important.

Deep Learning Fault Detection

SpikeNVS: Enhancing Novel View Synthesis from Blurry Images via Spike Camera

no code implementations 10 Apr 2024 Gaole Dai, Zhenyu Wang, Qinwen Xu, Ming Lu, Wen Chen, Boxin Shi, Shanghang Zhang, Tiejun Huang

Since the spike camera relies on temporal integration instead of temporal differentiation used by event cameras, our proposed TfS loss maintains manageable training costs.

3DGS NeRF +1

Towards Backward-Compatible Continual Learning of Image Compression

1 code implementation CVPR 2024 Zhihao Duan, Ming Lu, Justin Yang, Jiangpeng He, Zhan Ma, Fengqing Zhu

This paper explores the possibility of extending the capability of pre-trained neural image compressors (e.g., adapting to new data or target bitrates) without breaking backward compatibility, the ability to decode bitstreams encoded by the original model.

Continual Learning Image Compression +1

TinyLIC-High efficiency lossy image compression method

no code implementations 17 Feb 2024 Gaocheng Ma, Yinfeng Chai, Tianhao Jiang, Ming Lu, Tong Chen

Image compression has been the subject of extensive research for several decades, resulting in the development of well-known standards such as JPEG, JPEG2000, and H.264/AVC.

Image Compression

Proximity QA: Unleashing the Power of Multi-Modal Large Language Models for Spatial Proximity Analysis

1 code implementation 31 Jan 2024 Jianing Li, Xi Nan, Ming Lu, Li Du, Shanghang Zhang

To overcome this limitation in MLLMs, we introduce Proximity Question Answering (Proximity QA), a novel framework designed to enable MLLMs to infer the proximity relationship between objects in images.

Multi-Task Learning Question Answering +1

Another Way to the Top: Exploit Contextual Clustering in Learned Image Coding

no code implementations 21 Jan 2024 Yichi Zhang, Zhihao Duan, Ming Lu, Dandan Ding, Fengqing Zhu, Zhan Ma

While convolution and self-attention are extensively used in learned image compression (LIC) for transform coding, this paper proposes an alternative called Contextual Clustering based LIC (CLIC) which primarily relies on clustering operations and local attention for correlation characterization and compact representation of an image.

Clustering Image Compression +3

Deep Covariance Alignment for Domain Adaptive Remote Sensing Image Segmentation

1 code implementation 9 Jan 2024 Linshan Wu, Ming Lu, Leyuan Fang

Compared with the existing category alignment methods, our CR aims to regularize the correlation between different dimensions of the features and thus performs more robustly when dealing with the divergent category features of imbalanced and inconsistent distributions.

Image Segmentation Segmentation +1

RustNeRF: Robust Neural Radiance Field with Low-Quality Images

no code implementations 6 Jan 2024 Mengfei Li, Ming Lu, Xiaofang Li, Shanghang Zhang

First, existing methods assume enough high-quality images are available for training the NeRF model, ignoring real-world image degradation.

NeRF Novel View Synthesis

Deep Hierarchical Video Compression

no code implementations 12 Dec 2023 Ming Lu, Zhihao Duan, Fengqing Zhu, Zhan Ma

Recently, probabilistic predictive coding that directly models the conditional distribution of latent features across successive frames for temporal redundancy removal has yielded promising results.

Video Compression

MoEC: Mixture of Experts Implicit Neural Compression

no code implementations 3 Dec 2023 Jianchen Zhao, Cheng-Ching Tseng, Ming Lu, Ruichuan An, Xiaobao Wei, He Sun, Shanghang Zhang

However, manually designing the partition scheme for a complex scene is very challenging and fails to jointly learn the partition and INRs.

Data Compression Mixture-of-Experts

I-MedSAM: Implicit Medical Image Segmentation with Segment Anything

1 code implementation 28 Nov 2023 Xiaobao Wei, Jiajun Cao, Yizhu Jin, Ming Lu, Guangyu Wang, Shanghang Zhang

To convert the SAM features and coordinates into continuous segmentation output, we utilize Implicit Neural Representation (INR) to learn an implicit segmentation decoder.

Decoder Image Segmentation +4

DiffusionTalker: Personalization and Acceleration for Speech-Driven 3D Face Diffuser

no code implementations 28 Nov 2023 Peng Chen, Xiaobao Wei, Ming Lu, Yitong Zhu, Naiming Yao, Xingyu Xiao, Hui Chen

To address the above limitations, we propose DiffusionTalker, a diffusion-based method that utilizes contrastive learning to personalize 3D facial animation and knowledge distillation to accelerate 3D animation generation.

3D Face Animation Contrastive Learning +1

Frequency-Aware Re-Parameterization for Over-Fitting Based Image Compression

no code implementations 12 Oct 2023 Yun Ye, Yanjie Pan, Qually Jiang, Ming Lu, Xiaoran Fang, Beryl Xu

Over-fitting-based image compression requires weights compactness for compression and fast convergence for practical use, posing challenges for deep convolutional neural networks (CNNs) based methods.

Image Compression Image Restoration

Channel prior convolutional attention for medical image segmentation

1 code implementation 8 Jun 2023 Hejun Huang, Zuguo Chen, Ying Zou, Ming Lu, Chaoyang Chen

An efficient Channel Prior Convolutional Attention (CPCA) method is proposed in this paper, supporting the dynamic distribution of attention weights in both channel and spatial dimensions.

Image Segmentation Medical Image Segmentation +2

ViDA: Homeostatic Visual Domain Adapter for Continual Test Time Adaptation

2 code implementations 7 Jun 2023 Jiaming Liu, Senqiao Yang, Peidong Jia, Renrui Zhang, Ming Lu, Yandong Guo, Wei Xue, Shanghang Zhang

Note that, our method can be regarded as a novel transfer paradigm for large-scale models, delivering promising results in adaptation to continually changing distributions.

Test-time Adaptation

CABM: Content-Aware Bit Mapping for Single Image Super-Resolution Network with Large Input

1 code implementation CVPR 2023 Senmao Tian, Ming Lu, Jiaming Liu, Yandong Guo, Yurong Chen, Shunli Zhang

Therefore, we design a strategy to build an Edge-to-Bit lookup table that maps the edge score of a patch to the bit of each layer during inference.

2k 4k +3

A Comprehensive Comparison of Projections in Omnidirectional Super-Resolution

no code implementations 13 Apr 2023 Huicheng Pi, Senmao Tian, Ming Lu, Jiaming Liu, Yandong Guo, Shunli Zhang

In these works, omnidirectional frames are projected from the 3D sphere to a 2D plane by Equi-Rectangular Projection (ERP).

ERP Super-Resolution

QARV: Quantization-Aware ResNet VAE for Lossy Image Compression

2 code implementations 16 Feb 2023 Zhihao Duan, Ming Lu, Jack Ma, Yuning Huang, Zhan Ma, Fengqing Zhu

This paper addresses the problem of lossy image compression, a fundamental problem in image processing and information theory that is involved in many real-world applications.

Image Compression Quantization

Efficient Visual Computing with Camera RAW Snapshots

1 code implementation 15 Dec 2022 Zhihao LI, Ming Lu, Xu Zhang, Xin Feng, M. Salman Asif, Zhan Ma

Conventional cameras capture image irradiance on a sensor and convert it to RGB images using an image signal processor (ISP).

Autonomous Driving Image Compression +2

BEV-SAN: Accurate BEV 3D Object Detection via Slice Attention Networks

no code implementations CVPR 2023 Xiaowei Chi, Jiaming Liu, Ming Lu, Rongyu Zhang, Zhaoqing Wang, Yandong Guo, Shanghang Zhang

In order to find them, we further propose a LiDAR-guided sampling strategy to leverage the statistical distribution of LiDAR to determine the heights of local slices.

3D Object Detection Autonomous Driving +1

BEVUDA: Multi-geometric Space Alignments for Domain Adaptive BEV 3D Object Detection

no code implementations 30 Nov 2022 Jiaming Liu, Rongyu Zhang, Xiaoqi Li, Xiaowei Chi, Zehui Chen, Ming Lu, Yandong Guo, Shanghang Zhang

In this paper, we propose a Multi-space Alignment Teacher-Student (MATS) framework to ease the domain shift accumulation, which consists of a Depth-Aware Teacher (DAT) and a Geometric-space Aligned Student (GAS) model.

3D Object Detection Autonomous Driving +4

Rate-Distortion Optimized Post-Training Quantization for Learned Image Compression

no code implementations 5 Nov 2022 Junqi Shi, Ming Lu, Zhan Ma

Quantizing a floating-point neural network to its fixed-point representation is crucial for Learned Image Compression (LIC) because it improves decoding consistency for interoperability and reduces space-time complexity for implementation.

Image Classification Image Compression +2

Complementary consistency semi-supervised learning for 3D left atrial image segmentation

1 code implementation 4 Oct 2022 Hejun Huang, Zuguo Chen, Chaoyang Chen, Ming Lu, Ying Zou

A network based on complementary consistency training, called CC-Net, has been proposed for semi-supervised left atrium image segmentation.

Image Segmentation Segmentation +1

Lossy Image Compression with Quantized Hierarchical VAEs

2 code implementations 27 Aug 2022 Zhihao Duan, Ming Lu, Zhan Ma, Fengqing Zhu

Recent research has shown a strong theoretical connection between variational autoencoders (VAEs) and the rate-distortion theory.

Image Compression Quantization

Uncertainty Guided Depth Fusion for Spike Camera

no code implementations 26 Aug 2022 Jianing Li, Jiaming Liu, Xiaobao Wei, Jiyuan Zhang, Ming Lu, Lei Ma, Li Du, Tiejun Huang, Shanghang Zhang

In this paper, we propose a novel Uncertainty-Guided Depth Fusion (UGDF) framework to fuse the predictions of monocular and stereo depth estimation networks for spike camera.

Autonomous Driving Stereo Depth Estimation

Unsupervised Spike Depth Estimation via Cross-modality Cross-domain Knowledge Transfer

1 code implementation 26 Aug 2022 Jiaming Liu, Qizhe Zhang, Xiaoqi Li, Jianing Li, Guanqun Wang, Ming Lu, Tiejun Huang, Shanghang Zhang

Neuromorphic spike data, an upcoming modality with high temporal resolution, has shown promising potential in autonomous driving by mitigating the challenges posed by high-velocity motion blur.

Autonomous Driving Depth Estimation +2

Efficient Meta-Tuning for Content-aware Neural Video Delivery

1 code implementation 20 Jul 2022 Xiaoqi Li, Jiaming Liu, Shizun Wang, Cheng Lyu, Ming Lu, Yurong Chen, Anbang Yao, Yandong Guo, Shanghang Zhang

Our method significantly reduces the computational cost and achieves even better performance, paving the way for applying neural video delivery techniques to practical applications.

Super-Resolution

Structure-aware Editable Morphable Model for 3D Facial Detail Animation and Manipulation

1 code implementation 19 Jul 2022 Jingwang Ling, Zhibo Wang, Ming Lu, Quan Wang, Chen Qian, Feng Xu

Previous works on morphable models mostly focus on large-scale facial geometry but ignore facial details.

High-Efficiency Lossy Image Coding Through Adaptive Neighborhood Information Aggregation

1 code implementation 25 Apr 2022 Ming Lu, Fangdong Chen, ShiLiang Pu, Zhan Ma

To this end, Integrated Convolution and Self-Attention (ICSA) unit is first proposed to form a content-adaptive transform to characterize and embed neighborhood information dynamically of any input.

Vocal Bursts Intensity Prediction

Adaptive Patch Exiting for Scalable Single Image Super-Resolution

1 code implementation 22 Mar 2022 Shizun Wang, Jiaming Liu, Kaixin Chen, Xiaoqi Li, Ming Lu, Yandong Guo

Once the incremental capacity is below the threshold, the patch can exit at the specific layer.

Image Super-Resolution

Opening the Black Box of Learned Image Coders

no code implementations 26 Feb 2022 Zhihao Duan, Ming Lu, Zhan Ma, Fengqing Zhu

End-to-end learned lossy image coders (LICs), as opposed to hand-crafted image codecs, have shown increasing superiority in terms of the rate-distortion performance.

NFANet: A Novel Method for Weakly Supervised Water Extraction from High-Resolution Remote Sensing Imagery

no code implementations 10 Jan 2022 Ming Lu, Leyuan Fang, Muxing Li, Bob Zhang, Yi Zhang, Pedram Ghamisi

Therefore, we study how to utilize point labels to extract water bodies and propose a novel method called the neighbor feature aggregation network (NFANet).

SamplingAug: On the Importance of Patch Sampling Augmentation for Single Image Super-Resolution

1 code implementation 30 Nov 2021 Shizun Wang, Ming Lu, Kaixin Chen, Jiaming Liu, Xiaoqi Li, Chuang Zhang, Ming Wu

However, existing methods mostly train the DNNs on uniformly sampled LR-HR patch pairs, which makes them fail to fully exploit informative patches within the image.

Data Augmentation Image Super-Resolution

Transformer-based Image Compression

no code implementations 12 Nov 2021 Ming Lu, Peiyao Guo, Huiqing Shi, Chuntong Cao, Zhan Ma

A Transformer-based Image Compression (TIC) approach is developed which reuses the canonical variational autoencoder (VAE) architecture with paired main and hyper encoder-decoders.

Image Compression Image Reconstruction

End-to-end Neural Video Coding Using a Compound Spatiotemporal Representation

no code implementations 5 Aug 2021 Haojie Liu, Ming Lu, Zhiqi Chen, Xun Cao, Zhan Ma, Yao Wang

We further design a one-to-many decoder pipeline to generate multiple predictions from the CSTR, including vector-based resampling, adaptive kernel-based resampling, compensation mode selection maps and texture enhancements, and combines them adaptively to achieve more accurate inter prediction.

Motion Compensation MS-SSIM +3

Contextual Graph Reasoning Networks

no code implementations 1 Jan 2021 Zhaoqing Wang, Jiaming Liu, Yangyuxuan Kang, Mingming Gong, Chuang Zhang, Ming Lu, Ming Wu

Graph Reasoning has shown great potential recently in modeling long-range dependencies, which are crucial for various computer vision tasks.

2D Human Pose Estimation Instance Segmentation +4

Decomposition, Compression, and Synthesis (DCS)-based Video Coding: A Neural Exploration via Resolution-Adaptive Learning

no code implementations 1 Dec 2020 Ming Lu, Tong Chen, Dandan Ding, Fengqing Zhu, Zhan Ma

Inspired by the facts that retinal cells actually segregate the visual scene into different attributes (e.g., spatial details, temporal motion) for respective neuronal processing, we propose to first decompose the input video into respective spatial texture frames (STF) at its native spatial resolution that preserve the rich spatial details, and the other temporal motion frames (TMF) at a lower spatial resolution that retain the motion smoothness; then compress them together using any popular video coder; and finally synthesize decoded STFs and TMFs for high-fidelity video reconstruction at the same resolution as its native input.

Motion Compensation Super-Resolution +2

Neural Video Coding using Multiscale Motion Compensation and Spatiotemporal Context Model

no code implementations 9 Jul 2020 Haojie Liu, Ming Lu, Zhan Ma, Fan Wang, Zhihuang Xie, Xun Cao, Yao Wang

Over the past two decades, traditional block-based video coding has made remarkable progress and spawned a series of well-known standards such as MPEG-4, H.264/AVC and H.265/HEVC.

Motion Compensation MS-SSIM +2

Learned Video Compression via Joint Spatial-Temporal Correlation Exploration

no code implementations 13 Dec 2019 Haojie Liu, Han Shen, Lichao Huang, Ming Lu, Tong Chen, Zhan Ma

Traditional video compression technologies have been developed over decades in pursuit of higher coding efficiency.

Optical Flow Estimation Video Compression

A Closed-form Solution to Universal Style Transfer

3 code implementations ICCV 2019 Ming Lu, Hao Zhao, Anbang Yao, Yurong Chen, Feng Xu, Li Zhang

Although plenty of methods have been proposed, a theoretical analysis of feature transform is still missing.

Form Style Transfer

Learned Quality Enhancement via Multi-Frame Priors for HEVC Compliant Low-Delay Applications

no code implementations 3 May 2019 Ming Lu, Ming Cheng, Yiling Xu, ShiLiang Pu, Qiu Shen, Zhan Ma

Networked video applications, e.g., video conferencing, often suffer from poor visual quality due to unexpected network fluctuation and limited bandwidth.

Decoder Video Compression
