no code implementations • 11 Feb 2025 • Hongyu An, Xinfeng Zhang, Shijie Zhao, Li Zhang
To synthesize more compression-lost details and refine temporal consistency, we propose a novel Spatial Degradation-Aware and Temporal Consistent (SDATC) diffusion model for compressed VSR.
no code implementations • 10 Feb 2025 • Lv Tang, Jun Zhu, Xinfeng Zhang, Li Zhang, Siwei Ma, Qingming Huang
Furthermore, to enhance the capture of dynamics between frames within a sequence, we implement a dynamic frame-level adjustment (DFA).
no code implementations • 23 Dec 2024 • Qi Zhang, Shanshe Wang, Xinfeng Zhang, Siwei Ma, Jingshan Pan, Wen Gao
It is meaningful to predict the perceptual quality of compressed images for both humans and machines, which guides the optimization for compression.
1 code implementation • 15 Oct 2024 • Hongyu An, Xinfeng Zhang, Li Zhang, Ruiqin Xiong
Omnidirectional video (ODV) can provide an immersive experience and is widely utilized in the field of virtual reality and augmented reality.
no code implementations • 2 Oct 2024 • Gai Zhang, Xinfeng Zhang, Lv Tang, Yue Li, Kai Zhang, Li Zhang
For decades, video compression technology has been a prominent research area.
1 code implementation • 21 Aug 2024 • Xiao Han, Xinfeng Zhang, Yiling Wu, Zhenduo Zhang, Zhe Wu
To this end, we introduce the Kolmogorov-Arnold Network (KAN), which has better mathematical properties and interpretability, into time series forecasting research.
1 code implementation • 15 Apr 2024 • Zheng Chen, Zongwei Wu, Eduard Zamfir, Kai Zhang, Yulun Zhang, Radu Timofte, Xiaokang Yang, Hongyuan Yu, Cheng Wan, Yuxin Hong, Zhijuan Huang, Yajun Zou, Yuan Huang, Jiamin Lin, Bingnan Han, Xianyu Guan, Yongsheng Yu, Daoan Zhang, Xuanwu Yin, Kunlong Zuo, Jinhua Hao, Kai Zhao, Kun Yuan, Ming Sun, Chao Zhou, Hongyu An, Xinfeng Zhang, Zhiyuan Song, Ziyue Dong, Qing Zhao, Xiaogang Xu, Pengxu Wei, Zhi-chao Dou, Gui-ling Wang, Chih-Chung Hsu, Chia-Ming Lee, Yi-Shiuan Chou, Cansu Korkmaz, A. Murat Tekalp, Yubin Wei, Xiaole Yan, Binren Li, Haonan Chen, Siqi Zhang, Sihan Chen, Amogh Joshi, Nikhil Akalwadi, Sampada Malagi, Palani Yashaswini, Chaitra Desai, Ramesh Ashok Tabib, Ujwala Patil, Uma Mudenagudi, Anjali Sarvaiya, Pooja Choksy, Jagrit Joshi, Shubh Kawa, Kishor Upla, Sushrut Patwardhan, Raghavendra Ramachandra, Sadat Hossain, Geongi Park, S. M. Nadim Uddin, Hao Xu, Yanhui Guo, Aman Urumbekov, Xingzhuo Yan, Wei Hao, Minghan Fu, Isaac Orais, Samuel Smith, Ying Liu, Wangwang Jia, Qisheng Xu, Kele Xu, Weijun Yuan, Zhan Li, Wenqin Kuang, Ruijin Guan, Ruting Deng, Zhao Zhang, Bo wang, Suiyi Zhao, Yan Luo, Yanyan Wei, Asif Hussain Khan, Christian Micheloni, Niki Martinel
This paper reviews the NTIRE 2024 challenge on image super-resolution ($\times$4), highlighting the solutions proposed and the outcomes obtained.
3 code implementations • 6 May 2023 • Yufeng Huang, Jiji Tang, Zhuo Chen, Rongsheng Zhang, Xinfeng Zhang, WeiJie Chen, Zeng Zhao, Zhou Zhao, Tangjie Lv, Zhipeng Hu, Wen Zhang
In this paper, we present an end-to-end framework, Structure-CLIP, which integrates Scene Graph Knowledge (SGK) to enhance multi-modal structured representations.
1 code implementation • ICCV 2023 • Wenkang Shan, Zhenhua Liu, Xinfeng Zhang, Zhao Wang, Kai Han, Shanshe Wang, Siwei Ma, Wen Gao
On the other hand, JPMA is proposed to assemble multiple hypotheses generated by D3DP into a single 3D pose for practical use.
no code implementations • ICCV 2023 • Lv Tang, Xinfeng Zhang, Gai Zhang, Xiaoqi Ma
Video compression has always been a popular research area, where many traditional and deep video compression methods have been proposed.
1 code implementation • 13 Nov 2022 • Qi Zhang, Shanshe Wang, Xinfeng Zhang, Chuanmin Jia, Zhao Wang, Siwei Ma, Wen Gao
Each score is derived from machine perceptual differences between original and compressed images.
no code implementations • 6 Sep 2022 • Jiguo Li, Chuanmin Jia, Xinfeng Zhang, Siwei Ma, Wen Gao
With the recent advances in cross-modal translation and generation, in this paper we propose cross modal compression (CMC), a semantic compression framework for visual data, to transform the highly redundant visual data (such as image, video, etc.)
1 code implementation • 9 Jun 2022 • Zheng Chang, Xinfeng Zhang, Shanshe Wang, Siwei Ma, Wen Gao
To solve the information loss problem, the proposed model aims to preserve the spatiotemporal information for videos during the feature extraction and the state transitions, respectively.
no code implementations • 20 Apr 2022 • Zheng Chang, Xinfeng Zhang, Shanshe Wang, Siwei Ma, Wen Gao
In this paper, we propose a SpatioTemporal-Aware Unit (STAU) for video prediction and beyond by exploring the significant spatiotemporal correlations in videos.
1 code implementation • CVPR 2022 • Zheng Chang, Xinfeng Zhang, Shanshe Wang, Siwei Ma, Wen Gao
In this paper, we propose a Spatiotemporal Residual Predictive Model (STRPM) for high-resolution video prediction.
1 code implementation • 15 Mar 2022 • Wenkang Shan, Zhenhua Liu, Xinfeng Zhang, Shanshe Wang, Siwei Ma, Wen Gao
In Stage II, the pre-trained encoder is loaded into the STMO model and fine-tuned.
Ranked #11 on Monocular 3D Human Pose Estimation on Human3.6M
1 code implementation • NeurIPS 2021 • Zheng Chang, Xinfeng Zhang, Shanshe Wang, Siwei Ma, Yan Ye, Xiang Xinguang, Wen Gao
The attention module aims to learn an attention map based on the correlations between the current spatial state and the historical spatial states.
Ranked #19 on Video Prediction on Moving MNIST
1 code implementation • 29 Jul 2021 • Wenkang Shan, Haopeng Lu, Shanshe Wang, Xinfeng Zhang, Wen Gao
To alleviate these two problems, we propose a relative information encoding method that yields positional and temporal enhanced representations.
Ranked #14 on Monocular 3D Human Pose Estimation on Human3.6M
no code implementations • 19 Jul 2020 • Yuqing Liu, Xinfeng Zhang, Shanshe Wang, Siwei Ma, Wen Gao
Based on the observation, in this paper, we build a sequential hierarchical learning super-resolution network (SHSR) for effective image SR.
Ranked #15 on Image Super-Resolution on Manga109 - 3x upscaling
no code implementations • 26 May 2020 • Lingbo Yang, Pan Wang, Chang Liu, Zhanning Gao, Peiran Ren, Xinfeng Zhang, Shanshe Wang, Siwei Ma, Xian-Sheng Hua, Wen Gao
Human pose transfer (HPT) is an emerging research topic with huge potential in fashion design, media production, online advertising and virtual reality.
1 code implementation • 26 May 2020 • Lingbo Yang, Pan Wang, Xinfeng Zhang, Shanshe Wang, Zhanning Gao, Peiran Ren, Xuansong Xie, Siwei Ma, Wen Gao
The ability to produce convincing textural details is essential for the fidelity of synthesized person images.
Ranked #4 on Pose Transfer on Deep-Fashion
no code implementations • 21 Apr 2020 • Shurun Wang, Shiqi Wang, Wenhan Yang, Xinfeng Zhang, Shanshe Wang, Siwei Ma, Wen Gao
In particular, we study feature and texture compression in a scalable coding framework, where the base layer serves as the deep learning feature and the enhancement layer aims to perfectly reconstruct the texture.
1 code implementation • 7 Apr 2020 • Jiguo Li, Xinfeng Zhang, Chuanmin Jia, Jizheng Xu, Li Zhang, Yue Wang, Siwei Ma, Wen Gao
In this paper, we attempt to translate the speech signals into the image signals without the transcription stage.
Multimedia • Sound • Audio and Speech Processing
1 code implementation • 7 Apr 2020 • Jiguo Li, Xinfeng Zhang, Jizheng Xu, Li Zhang, Yue Wang, Siwei Ma, Wen Gao
Due to the widespread deployment of fingerprint/face/speaker recognition systems, attacking deep learning based biometric systems has drawn more and more attention.
Audio and Speech Processing • Cryptography and Security • Sound
1 code implementation • 7 Apr 2020 • Jiguo Li, Xinfeng Zhang, Chuanmin Jia, Jizheng Xu, Li Zhang, Yue Wang, Siwei Ma, Wen Gao
Attacking deep learning based biometric systems has drawn more and more attention with the wide deployment of fingerprint/face/speaker recognition systems, given that neural networks are vulnerable to adversarial examples, which have been intentionally perturbed to remain almost imperceptible to humans.
no code implementations • 18 Feb 2020 • Sheng Shi, Xinfeng Zhang, Wei Fan
Explainability is a gateway between Artificial Intelligence and society as the current popular deep learning models are generally weak in explaining the reasoning process and prediction results.
no code implementations • 4 Nov 2019 • Sheng Shi, Xinfeng Zhang, Wei Fan
Despite their outstanding contribution to the significant progress of Artificial Intelligence (AI), deep learning models remain mostly black boxes, which are extremely weak in explaining the reasoning process and prediction results.
no code implementations • 19 Sep 2019 • Yongbing Zhang, Yangzhe Liu, Xiu Li, Shaowei Jiang, Krishna Dixit, Xinfeng Zhang, Xiangyang Ji
Since the optimal parameters of the PgNN can be derived by minimizing the difference between the model-generated images and real captured angle-varied images corresponding to the same scene, the proposed PgNN avoids the need for the massive training data required by traditional supervised methods.
1 code implementation • ICCV 2019 • Wentao Cheng, Weisi Lin, Kan Chen, Xinfeng Zhang
Image-based localization (IBL) aims to estimate the 6DOF camera pose for a given query image.
no code implementations • 7 Apr 2019 • Siwei Ma, Xinfeng Zhang, Chuanmin Jia, Zhenghui Zhao, Shiqi Wang, Shanshe Wang
The deep convolutional neural network (CNN), which has driven the resurgence of neural networks in recent years and achieved great success in both the artificial intelligence and signal processing fields, also provides a novel and promising solution for image and video compression.
no code implementations • 14 Mar 2019 • Shurun Wang, Shiqi Wang, Xinfeng Zhang, Shanshe Wang, Siwei Ma, Wen Gao
In this paper, we propose a scalable image compression scheme, including a base layer for feature representation and an enhancement layer for texture representation.
no code implementations • 27 May 2018 • Xinfeng Zhang, Su Yang, Xinjian Zhang, Weishan Zhang, Jiulong Zhang
In crowded scenes, detection and localization of abnormal behaviors is challenging because high crowd density makes object segmentation and tracking extremely difficult.
no code implementations • 25 Sep 2017 • Chuanmin Jia, Shiqi Wang, Xinfeng Zhang, Shanshe Wang, Siwei Ma
Deep learning has demonstrated tremendous breakthroughs in the area of image/video processing.
Multimedia