no code implementations • 20 Apr 2025 • Lifeng Lin, Rongfeng Lu, Quan Chen, Haofan Ren, Ming Lu, Yaoqi Sun, Chenggang Yan, Anke Xue
Recently, many methods based on the 3D Gaussian Splatting (3DGS) framework have been proposed to address sparse-view 3D reconstruction.
no code implementations • 2 Apr 2025 • Junwen Pan, Rui Zhang, Xin Wan, Yuan Zhang, Ming Lu, Qi She
Motivated by human hierarchical temporal search strategies, we propose TimeSearch, a novel framework enabling LVLMs to understand long videos in a human-like manner.
no code implementations • 28 Mar 2025 • Yishen Ji, Ziyue Zhu, Zhenxin Zhu, Kaixin Xiong, Ming Lu, Zhiqi Li, Lijun Zhou, Haiyang Sun, Bing Wang, Tong Lu
Recent progress in driving video generation has shown significant potential for enhancing self-driving systems by providing scalable and controllable training data.
1 code implementation • 24 Mar 2025 • Ruichuan An, Sihan Yang, Ming Lu, Renrui Zhang, Kai Zeng, Yulin Luo, Jiajun Cao, Hao Liang, Ying Chen, Qi She, Shanghang Zhang, Wentao Zhang
To reduce the costs related to joint training, we propose a personalized textual prompt that uses visual token information to initialize concept tokens.
1 code implementation • 23 Mar 2025 • Peng Chen, Xiaobao Wei, Ming Lu, Hui Chen, Feng Tian
We further propose a personalizer enhancer during distillation to enhance the influence of embeddings on facial animation.
no code implementations • 17 Mar 2025 • Ruichuan An, Kai Zeng, Ming Lu, Sihan Yang, Renrui Zhang, Huitong Ji, Qizhe Zhang, Yulin Luo, Hao Liang, Wentao Zhang
Vision-Language Models (VLMs) have demonstrated exceptional performance in various multi-modal tasks.
1 code implementation • 17 Feb 2025 • Junqi Shi, Zhujia Chen, Hanfei Li, Qi Zhao, Ming Lu, Tong Chen, Zhan Ma
This work introduces variable-rate INR-VC for the first time and lays a theoretical foundation for future research in rate-distortion optimization, advancing the field of video coding technology.
no code implementations • 7 Feb 2025 • Zhuojie Wu, Heming Du, Shuyun Wang, Ming Lu, Haiyang Sun, Yandong Guo, Xin Yu
In this paper, we propose a hybrid Convolution and State Space Models (SSMs) based image compression framework, termed CMamba, to achieve superior rate-distortion performance with low computational complexity.
1 code implementation • 28 Jan 2025 • Jianing Li, Ming Lu, Hao Wang, Chenyang Gu, Wenzhao Zheng, Li Du, Shanghang Zhang
To utilize these slice features, we propose SliceOcc, an RGB camera-based model specifically tailored for indoor 3D semantic occupancy prediction.
no code implementations • 21 Jan 2025 • Zhengyi Lu, Hao Liang, Ming Lu, Xiao Wang, Xinqiang Yan, Yuankai Huo
This approach offers a faster and more efficient solution to RF shimming challenges in UHF MRI.
no code implementations • 20 Jan 2025 • Hongwei Sha, Muchen Dong, Quanyou Luo, Ming Lu, Hao Chen, Zhan Ma
Geostationary Earth Orbit (GEO) satellite communication demonstrates significant advantages in emergency short burst data services.
no code implementations • 3 Jan 2025 • Jiajun Cao, Yuan Zhang, Tao Huang, Ming Lu, Qizhe Zhang, Ruichuan An, Ningning Ma, Shanghang Zhang
Visual encoders are fundamental components in vision-language models (VLMs), each showcasing unique strengths derived from various pre-trained visual foundation models.
no code implementations • 25 Dec 2024 • Bowen Gu, Hao Chen, Ming Lu, Jie Yao, Zhan Ma
In this paper, we propose a neural network-based $\lambda$-domain rate control scheme for deep video compression, which determines the coding parameter $\lambda$ for each to-be-coded frame based on the rate-distortion-$\lambda$ (R-D-$\lambda$) relationships directly learned from uncompressed frames, achieving high rate control accuracy efficiently without the need for pre-encoding.
1 code implementation • 18 Dec 2024 • Xiaobao Wei, Peng Chen, Ming Lu, Hui Chen, Feng Tian
In this paper, we introduce a method called GraphAvatar that utilizes Graph Neural Networks (GNN) to generate 3D Gaussians for the head avatar.
no code implementations • 9 Dec 2024 • Yuming Li, Peidong Jia, Daiwei Hong, Yueru Jia, Qi She, Rui Zhao, Ming Lu, Shanghang Zhang
To solve the above limitations, we introduce a novel method named ASGDiffusion for parallel HR generation with Asynchronous Structure Guidance (ASG) using pre-trained diffusion models.
1 code implementation • 6 Dec 2024 • Peng Chen, Xiaobao Wei, Qingpo Wuwu, Xinyi Wang, Xingyu Xiao, Ming Lu
We attach the 2D Gaussians to the triangular mesh of the FLAME model and connect additional 3D Gaussians to those 2D Gaussians where the rendering quality of 2DGS is inadequate, creating a mixed 2D-3D Gaussian representation.
1 code implementation • 2 Dec 2024 • Qizhe Zhang, Aosong Cheng, Ming Lu, Zhiyong Zhuo, Minqi Wang, Jiajun Cao, Shaobo Guo, Qi She, Shanghang Zhang
Most existing methods assess the importance of visual tokens based on the text-visual cross-attentions in LLMs.
no code implementations • 23 Nov 2024 • Xiaobao Wei, Qingpo Wuwu, Zhongyu Zhao, Zhuangzhe Wu, Nan Huang, Ming Lu, Ningning Ma, Shanghang Zhang
To address this, we propose Explicit Motion Decomposition (EMD), which models the motions of dynamic objects by introducing learnable motion embeddings to the Gaussians, enhancing the decomposition in street scenes.
no code implementations • 20 Nov 2024 • Xiaobao Wei, Peng Chen, Guangyu Li, Ming Lu, Hui Chen, Feng Tian
Comprehensive experiments show that GazeGaussian outperforms existing methods in rendering speed, gaze redirection accuracy, and facial synthesis across multiple datasets.
1 code implementation • 18 Nov 2024 • Ruichuan An, Sihan Yang, Ming Lu, Renrui Zhang, Kai Zeng, Yulin Luo, Jiajun Cao, Hao Liang, Ying Chen, Qi She, Shanghang Zhang, Wentao Zhang
To reduce the costs related to joint training, we propose a personalized textual prompt that uses visual token information to initialize concept tokens.
no code implementations • 23 Oct 2024 • Yu Wang, Xiaobao Wei, Ming Lu, Guoliang Kang
In this paper, we propose a new method called PLGS that enables 3DGS to generate consistent panoptic segmentation masks from noisy 2D segmentation masks while maintaining superior efficiency compared to NeRF-based methods.
1 code implementation • 3 Oct 2024 • Ming Lu, Zhihao Duan, Wuyang Cong, Dandan Ding, Fengqing Zhu, Zhan Ma
This feature-space processing operates from the lowest to the highest scale of each frame, completely eliminating the need for the complexity-intensive motion estimation and compensation techniques that have been standard in video codecs for decades.
1 code implementation • 29 Sep 2024 • Xu Zhang, Peiyao Guo, Ming Lu, Zhan Ma
Experimental results show that MPA achieves performance comparable to state-of-the-art methods in both task-specific and multi-objective optimization across human viewing and machine analysis tasks.
1 code implementation • 11 Sep 2024 • Rongfeng Lu, Hangyu Chen, Zunjie Zhu, Yuhang Qin, Ming Lu, Le Zhang, Chenggang Yan, Anke Xue
In this work, we propose ThermalGaussian, the first thermal 3DGS approach capable of rendering high-quality images in RGB and thermal modalities.
no code implementations • 2 Sep 2024 • Muchen Dong, Ming Lu, Zhan Ma
Despite the unprecedented compression efficiency achieved by deep learned image compression (LIC), existing methods usually approximate the desired bitrate by adjusting a single quality factor for a given input image, which may compromise the rate control results.
1 code implementation • 31 Jul 2024 • Junqi Shi, Mingyi Jiang, Ming Lu, Tong Chen, Xun Cao, Zhan Ma
For downstream classification on compressed HSI, we theoretically demonstrate the task accuracy is not only related to the classification loss but also to the reconstruction fidelity through a first-order expansion of the accuracy degradation, and accordingly adapt the reconstruction by introducing Adaptive Spectral Weighting.
1 code implementation • 24 Jul 2024 • Xiaobiao Du, Haiyang Sun, Ming Lu, Tianqing Zhu, Xin Yu
With this dataset, we make the generative model more robust to cars.
no code implementations • 7 Jun 2024 • Xiaobiao Du, Haiyang Sun, Shuyun Wang, Zhuojie Wu, Hongwei Sheng, Jiaying Ying, Ming Lu, Tianqing Zhu, Kun Zhan, Xin Yu
(1) High-Volume: 2,500 cars are meticulously scanned by 3D scanners, obtaining car images and point clouds with real-world dimensions; (2) High-Quality: Each car is captured in an average of 200 dense, high-resolution 360-degree RGB-D views, enabling high-fidelity 3D reconstruction; (3) High-Diversity: The dataset contains various cars from over 100 brands, collected under three distinct lighting conditions, including reflective, standard, and dark.
1 code implementation • 30 May 2024 • Nan Huang, Xiaobao Wei, Wenzhao Zheng, Pengju An, Ming Lu, Wei Zhan, Masayoshi Tomizuka, Kurt Keutzer, Shanghang Zhang
Photorealistic 3D reconstruction of street scenes is a critical technique for developing real-world simulators for autonomous driving.
1 code implementation • 29 May 2024 • Gaole Dai, Cheng-Ching Tseng, Qingpo Wuwu, Rongyu Zhang, Shaokang Wang, Ming Lu, Tiejun Huang, Yu Zhou, Ali Ata Tuz, Matthias Gunzer, Jianxu Chen, Shanghang Zhang
The rapid pace of innovation in biological microscopy imaging has led to large images, putting pressure on data storage and impeding efficient sharing, management, and visualization.
no code implementations • 1 May 2024 • Ming Lu, Zhen Gao, Ying Zou, Zuguo Chen, Pei Li
With the development of technology, the chemical production process is becoming increasingly complex and large-scale, making fault detection particularly important.
no code implementations • 10 Apr 2024 • Gaole Dai, Zhenyu Wang, Qinwen Xu, Ming Lu, Wen Chen, Boxin Shi, Shanghang Zhang, Tiejun Huang
Since the spike camera relies on temporal integration instead of temporal differentiation used by event cameras, our proposed TfS loss maintains manageable training costs.
1 code implementation • CVPR 2024 • Zhihao Duan, Ming Lu, Justin Yang, Jiangpeng He, Zhan Ma, Fengqing Zhu
This paper explores the possibility of extending the capability of pre-trained neural image compressors (e.g., adapting to new data or target bitrates) without breaking backward compatibility, the ability to decode bitstreams encoded by the original model.
no code implementations • 17 Feb 2024 • Gaocheng Ma, Yinfeng Chai, Tianhao Jiang, Ming Lu, Tong Chen
Image compression has been the subject of extensive research for several decades, resulting in the development of well-known standards such as JPEG, JPEG2000, and H.264/AVC.
1 code implementation • 31 Jan 2024 • Jianing Li, Xi Nan, Ming Lu, Li Du, Shanghang Zhang
To overcome this limitation in MLLMs, we introduce Proximity Question Answering (Proximity QA), a novel framework designed to enable MLLMs to infer the proximity relationship between objects in images.
no code implementations • 21 Jan 2024 • Yichi Zhang, Zhihao Duan, Ming Lu, Dandan Ding, Fengqing Zhu, Zhan Ma
While convolution and self-attention are extensively used in learned image compression (LIC) for transform coding, this paper proposes an alternative called Contextual Clustering based LIC (CLIC) which primarily relies on clustering operations and local attention for correlation characterization and compact representation of an image.
1 code implementation • 9 Jan 2024 • Linshan Wu, Ming Lu, Leyuan Fang
Compared with the existing category alignment methods, our CR aims to regularize the correlation between different dimensions of the features and thus performs more robustly when dealing with the divergent category features of imbalanced and inconsistent distributions.
no code implementations • 6 Jan 2024 • Mengfei Li, Ming Lu, Xiaofang Li, Shanghang Zhang
First, existing methods assume enough high-quality images are available for training the NeRF model, ignoring real-world image degradation.
no code implementations • 12 Dec 2023 • Ming Lu, Zhihao Duan, Fengqing Zhu, Zhan Ma
Recently, probabilistic predictive coding that directly models the conditional distribution of latent features across successive frames for temporal redundancy removal has yielded promising results.
no code implementations • 3 Dec 2023 • Jianchen Zhao, Cheng-Ching Tseng, Ming Lu, Ruichuan An, Xiaobao Wei, He Sun, Shanghang Zhang
However, manually designing the partition scheme for a complex scene is very challenging and fails to jointly learn the partition and INRs.
1 code implementation • 28 Nov 2023 • Xiaobao Wei, Jiajun Cao, Yizhu Jin, Ming Lu, Guangyu Wang, Shanghang Zhang
To convert the SAM features and coordinates into continuous segmentation output, we utilize Implicit Neural Representation (INR) to learn an implicit segmentation decoder.
no code implementations • 28 Nov 2023 • Peng Chen, Xiaobao Wei, Ming Lu, Yitong Zhu, Naiming Yao, Xingyu Xiao, Hui Chen
To address the above limitations, we propose DiffusionTalker, a diffusion-based method that utilizes contrastive learning to personalize 3D facial animation and knowledge distillation to accelerate 3D animation generation.
no code implementations • 12 Oct 2023 • Yun Ye, Yanjie Pan, Qually Jiang, Ming Lu, Xiaoran Fang, Beryl Xu
Over-fitting-based image compression requires weights compactness for compression and fast convergence for practical use, posing challenges for deep convolutional neural networks (CNNs) based methods.
1 code implementation • CVPR 2024 • Xiaobao Wei, Renrui Zhang, Jiarui Wu, Jiaming Liu, Ming Lu, Yandong Guo, Shanghang Zhang
NTO3D lifts the 2D masks and features of SAM into the 3D neural field for high-quality neural target object 3D reconstruction.
no code implementations • ICCV 2023 • Yifan Zhang, Zhen Dong, Huanrui Yang, Ming Lu, Cheng-Ching Tseng, Yuan Du, Kurt Keutzer, Li Du, Shanghang Zhang
Multi-view 3D detection based on BEV (bird-eye-view) has recently achieved significant improvements.
1 code implementation • 8 Jun 2023 • Hejun Huang, Zuguo Chen, Ying Zou, Ming Lu, Chaoyang Chen
An efficient Channel Prior Convolutional Attention (CPCA) method is proposed in this paper, supporting the dynamic distribution of attention weights in both channel and spatial dimensions.
2 code implementations • 7 Jun 2023 • Jiaming Liu, Senqiao Yang, Peidong Jia, Renrui Zhang, Ming Lu, Yandong Guo, Wei Xue, Shanghang Zhang
Note that our method can be regarded as a novel transfer paradigm for large-scale models, delivering promising results in adaptation to continually changing distributions.
1 code implementation • CVPR 2023 • Senmao Tian, Ming Lu, Jiaming Liu, Yandong Guo, Yurong Chen, Shunli Zhang
Therefore, we design a strategy to build an Edge-to-Bit lookup table that maps the edge score of a patch to the bit of each layer during inference.
no code implementations • 13 Apr 2023 • Huicheng Pi, Senmao Tian, Ming Lu, Jiaming Liu, Yandong Guo, Shunli Zhang
In these works, omnidirectional frames are projected from the 3D sphere to a 2D plane by Equi-Rectangular Projection (ERP).
no code implementations • 24 Mar 2023 • Yulin Luo, Rui Zhao, Xiaobao Wei, Jinwei Chen, Yijie Lu, Shenghao Xie, Tianyu Wang, Ruiqin Xiong, Ming Lu, Shanghang Zhang
To this end, we propose a method called Weather-aware Multi-scale MoE (WM-MoE) based on Transformer for blind weather removal.
2 code implementations • 16 Feb 2023 • Zhihao Duan, Ming Lu, Jack Ma, Yuning Huang, Zhan Ma, Fengqing Zhu
This paper addresses the problem of lossy image compression, a fundamental problem in image processing and information theory that is involved in many real-world applications.
1 code implementation • 15 Dec 2022 • Zhihao LI, Ming Lu, Xu Zhang, Xin Feng, M. Salman Asif, Zhan Ma
Conventional cameras capture image irradiance on a sensor and convert it to RGB images using an image signal processor (ISP).
no code implementations • CVPR 2023 • Xiaowei Chi, Jiaming Liu, Ming Lu, Rongyu Zhang, Zhaoqing Wang, Yandong Guo, Shanghang Zhang
In order to find them, we further propose a LiDAR-guided sampling strategy to leverage the statistical distribution of LiDAR to determine the heights of local slices.
1 code implementation • 1 Dec 2022 • Jianing Li, Ming Lu, Jiaming Liu, Yandong Guo, Li Du, Shanghang Zhang
In this paper, we propose a unified framework named BEV-LGKD to transfer knowledge in a teacher-student manner.
no code implementations • 30 Nov 2022 • Jiaming Liu, Rongyu Zhang, Xiaoqi Li, Xiaowei Chi, Zehui Chen, Ming Lu, Yandong Guo, Shanghang Zhang
In this paper, we propose a Multi-space Alignment Teacher-Student (MATS) framework to ease the domain shift accumulation, which consists of a Depth-Aware Teacher (DAT) and a Geometric-space Aligned Student (GAS) model.
no code implementations • 5 Nov 2022 • Junqi Shi, Ming Lu, Zhan Ma
Quantizing a floating-point neural network to its fixed-point representation is crucial for Learned Image Compression (LIC) because it improves decoding consistency for interoperability and reduces space-time complexity for implementation.
1 code implementation • 4 Oct 2022 • Hejun Huang, Zuguo Chen, Chaoyang Chen, Ming Lu, Ying Zou
A network based on complementary consistency training, called CC-Net, has been proposed for semi-supervised left atrium image segmentation.
2 code implementations • 27 Aug 2022 • Zhihao Duan, Ming Lu, Zhan Ma, Fengqing Zhu
Recent research has shown a strong theoretical connection between variational autoencoders (VAEs) and the rate-distortion theory.
no code implementations • 26 Aug 2022 • Jianing Li, Jiaming Liu, Xiaobao Wei, Jiyuan Zhang, Ming Lu, Lei Ma, Li Du, Tiejun Huang, Shanghang Zhang
In this paper, we propose a novel Uncertainty-Guided Depth Fusion (UGDF) framework to fuse the predictions of monocular and stereo depth estimation networks for spike cameras.
1 code implementation • 26 Aug 2022 • Jiaming Liu, Qizhe Zhang, Xiaoqi Li, Jianing Li, Guanqun Wang, Ming Lu, Tiejun Huang, Shanghang Zhang
Neuromorphic spike data, an upcoming modality with high temporal resolution, has shown promising potential in autonomous driving by mitigating the challenges posed by high-velocity motion blur.
1 code implementation • 20 Jul 2022 • Xiaoqi Li, Jiaming Liu, Shizun Wang, Cheng Lyu, Ming Lu, Yurong Chen, Anbang Yao, Yandong Guo, Shanghang Zhang
Our method significantly reduces the computational cost and achieves even better performance, paving the way for applying neural video delivery techniques to practical applications.
1 code implementation • 19 Jul 2022 • Jingwang Ling, Zhibo Wang, Ming Lu, Quan Wang, Chen Qian, Feng Xu
Previous works on morphable models mostly focus on large-scale facial geometry but ignore facial details.
1 code implementation • 25 Apr 2022 • Ming Lu, Fangdong Chen, ShiLiang Pu, Zhan Ma
To this end, an Integrated Convolution and Self-Attention (ICSA) unit is first proposed to form a content-adaptive transform that dynamically characterizes and embeds the neighborhood information of any input.
1 code implementation • 22 Mar 2022 • Shizun Wang, Jiaming Liu, Kaixin Chen, Xiaoqi Li, Ming Lu, Yandong Guo
Once the incremental capacity is below the threshold, the patch can exit at the specific layer.
no code implementations • 26 Feb 2022 • Zhihao Duan, Ming Lu, Zhan Ma, Fengqing Zhu
End-to-end learned lossy image coders (LICs), as opposed to hand-crafted image codecs, have shown increasing superiority in terms of the rate-distortion performance.
no code implementations • 10 Jan 2022 • Ming Lu, Leyuan Fang, Muxing Li, Bob Zhang, Yi Zhang, Pedram Ghamisi
Therefore, we study how to utilize point labels to extract water bodies and propose a novel method called the neighbor feature aggregation network (NFANet).
1 code implementation • 30 Nov 2021 • Shizun Wang, Ming Lu, Kaixin Chen, Jiaming Liu, Xiaoqi Li, Chuang Zhang, Ming Wu
However, existing methods mostly train the DNNs on uniformly sampled LR-HR patch pairs, which makes them fail to fully exploit informative patches within the image.
no code implementations • 12 Nov 2021 • Ming Lu, Peiyao Guo, Huiqing Shi, Chuntong Cao, Zhan Ma
A Transformer-based Image Compression (TIC) approach is developed which reuses the canonical variational autoencoder (VAE) architecture with paired main and hyper encoder-decoders.
1 code implementation • ICCV 2021 • Jiaming Liu, Ming Lu, Kaixin Chen, Xiaoqi Li, Shizun Wang, Zhaoqing Wang, Enhua Wu, Yurong Chen, Chuang Zhang, Ming Wu
Internet video delivery has undergone a tremendous explosion of growth over the past few years.
1 code implementation • 11 Aug 2021 • Yikai Wang, Fuchun Sun, Ming Lu, Anbang Yao
We propose a compact and effective framework to fuse multimodal features at multiple layers in a single network.
Ranked #53 on Semantic Segmentation on NYU Depth v2
no code implementations • 5 Aug 2021 • Haojie Liu, Ming Lu, Zhiqi Chen, Xun Cao, Zhan Ma, Yao Wang
We further design a one-to-many decoder pipeline to generate multiple predictions from the CSTR, including vector-based resampling, adaptive kernel-based resampling, compensation mode selection maps and texture enhancements, and combine them adaptively to achieve more accurate inter prediction.
1 code implementation • 21 Apr 2021 • Ren Yang, Radu Timofte, Jing Liu, Yi Xu, Xinjian Zhang, Minyi Zhao, Shuigeng Zhou, Kelvin C. K. Chan, Shangchen Zhou, Xiangyu Xu, Chen Change Loy, Xin Li, Fanglong Liu, He Zheng, Lielin Jiang, Qi Zhang, Dongliang He, Fu Li, Qingqing Dang, Yibin Huang, Matteo Maggioni, Zhongqian Fu, Shuai Xiao, Cheng Li, Thomas Tanay, Fenglong Song, Wentao Chao, Qiang Guo, Yan Liu, Jiang Li, Xiaochao Qu, Dewang Hou, Jiayu Yang, Lyn Jiang, Di You, Zhenyu Zhang, Chong Mou, Iaroslav Koshelev, Pavel Ostyakov, Andrey Somov, Jia Hao, Xueyi Zou, Shijie Zhao, Xiaopeng Sun, Yiting Liao, Yuanzhi Zhang, Qing Wang, Gen Zhan, Mengxi Guo, Junlin Li, Ming Lu, Zhan Ma, Pablo Navarrete Michelini, Hai Wang, Yiyun Chen, Jingyu Guo, Liliang Zhang, Wenming Yang, Sijung Kim, Syehoon Oh, Yucong Wang, Minjie Cai, Wei Hao, Kangdi Shi, Liangyan Li, Jun Chen, Wei Gao, Wang Liu, XiaoYu Zhang, Linjie Zhou, Sixin Lin, Ru Wang
This paper reviews the first NTIRE challenge on quality enhancement of compressed video, with a focus on the proposed methods and results.
no code implementations • 1 Jan 2021 • Zhaoqing Wang, Jiaming Liu, Yangyuxuan Kang, Mingming Gong, Chuang Zhang, Ming Lu, Ming Wu
Graph Reasoning has shown great potential recently in modeling long-range dependencies, which are crucial for various computer vision tasks.
no code implementations • 1 Dec 2020 • Ming Lu, Tong Chen, Dandan Ding, Fengqing Zhu, Zhan Ma
Inspired by the fact that retinal cells segregate the visual scene into different attributes (e.g., spatial details, temporal motion) for respective neuronal processing, we propose to first decompose the input video into spatial texture frames (STF) at its native spatial resolution, which preserve the rich spatial details, and temporal motion frames (TMF) at a lower spatial resolution, which retain the motion smoothness; then compress them together using any popular video coder; and finally synthesize the decoded STFs and TMFs for high-fidelity video reconstruction at the same resolution as the native input.
no code implementations • 17 Oct 2020 • Yunchao Wei, Shuai Zheng, Ming-Ming Cheng, Hang Zhao, LiWei Wang, Errui Ding, Yi Yang, Antonio Torralba, Ting Liu, Guolei Sun, Wenguan Wang, Luc van Gool, Wonho Bae, Junhyug Noh, Jinhwan Seo, Gunhee Kim, Hao Zhao, Ming Lu, Anbang Yao, Yiwen Guo, Yurong Chen, Li Zhang, Chuangchuang Tan, Tao Ruan, Guanghua Gu, Shikui Wei, Yao Zhao, Mariia Dobko, Ostap Viniavskyi, Oles Dobosevych, Zhendong Wang, Zhenyuan Chen, Chen Gong, Huanqing Yan, Jun He
The purpose of the Learning from Imperfect Data (LID) workshop is to inspire and facilitate the research in developing novel approaches that would harness the imperfect data and improve the data-efficiency during training.
no code implementations • 9 Jul 2020 • Haojie Liu, Ming Lu, Zhan Ma, Fan Wang, Zhihuang Xie, Xun Cao, Yao Wang
Over the past two decades, traditional block-based video coding has made remarkable progress and spawned a series of well-known standards such as MPEG-4, H.264/AVC and H.265/HEVC.
no code implementations • 13 Dec 2019 • Haojie Liu, Han Shen, Lichao Huang, Ming Lu, Tong Chen, Zhan Ma
Traditional video compression technologies have been developed over decades in pursuit of higher coding efficiency.
3 code implementations • ICCV 2019 • Ming Lu, Hao Zhao, Anbang Yao, Yurong Chen, Feng Xu, Li Zhang
Although plenty of methods have been proposed, a theoretical analysis of feature transform is still missing.
no code implementations • 3 May 2019 • Ming Lu, Ming Cheng, Yiling Xu, ShiLiang Pu, Qiu Shen, Zhan Ma
Networked video applications, e.g., video conferencing, often suffer from poor visual quality due to unexpected network fluctuation and limited bandwidth.
no code implementations • 19 Apr 2019 • Yiwen Guo, Ming Lu, WangMeng Zuo, Chang-Shui Zhang, Yurong Chen
Convolutional neural networks have been proven effective in a variety of image restoration tasks.
no code implementations • ICCV 2017 • Ming Lu, Hao Zhao, Anbang Yao, Feng Xu, Yurong Chen, Li Zhang
Our method decomposes the semantic style transfer problem into a feature reconstruction part and a feature decoder part.
1 code implementation • CVPR 2017 • Tao Kong, Fuchun Sun, Anbang Yao, Huaping Liu, Ming Lu, Yurong Chen
To address (a), we design the reverse connection, which enables the network to detect objects on multi-levels of CNNs.
no code implementations • CVPR 2017 • Hao Zhao, Ming Lu, Anbang Yao, Yiwen Guo, Yurong Chen, Li Zhang
In this paper, we propose an alternative method to estimate room layouts of cluttered indoor scenes.