MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation

1 code implementation • 19 Dec 2022 • Ludan Ruan, Yiyang Ma, Huan Yang, Huiguo He, Bei Liu, Jianlong Fu, Nicholas Jing Yuan, Qin Jin, Baining Guo

To generate joint audio-video pairs, we propose a novel Multi-Modal Diffusion model (i.e., MM-Diffusion), with two-coupled denoising autoencoders.

Denoising, FAD

A Cross-Residual Learning for Image Recognition

1 code implementation • 22 Nov 2022 • Jun Liang, Songsen Yu, Huan Yang

ResNets and their variants play an important role in various fields of image recognition.

Fine-Grained Image Style Transfer with Visual Transformers

1 code implementation • 11 Oct 2022 • Jianbo Wang, Huan Yang, Jianlong Fu, Toshihiko Yamasaki, Baining Guo

Such a design usually destroys the spatial information of the input images and fails to transfer fine-grained style patterns into style transfer results.

Style Transfer

AI Illustrator: Translating Raw Descriptions into Images by Prompt-based Cross-Modal Generation

1 code implementation • 7 Sep 2022 • Yiyang Ma, Huan Yang, Bei Liu, Jianlong Fu, Jiaying Liu

To address this issue, we propose a Prompt-based Cross-Modal Generation Framework (PCM-Frame) to leverage two powerful pre-trained models, including CLIP and StyleGAN.

Image Generation

4D LUT: Learnable Context-Aware 4D Lookup Table for Image Enhancement

no code implementations • 5 Sep 2022 • Chengxu Liu, Huan Yang, Jianlong Fu, Xueming Qian

In particular, we first introduce a lightweight context encoder and a parameter encoder to learn a context map for the pixel-level category and a group of image-adaptive coefficients, respectively.

Image Enhancement

Language-Guided Face Animation by Recurrent StyleGAN-based Generator

1 code implementation • 11 Aug 2022 • Tiankai Hang, Huan Yang, Bei Liu, Jianlong Fu, Xin Geng, Baining Guo

Specifically, we propose a recurrent motion generator to extract a series of semantic and motion information from the language and feed it along with visual information to a pre-trained StyleGAN to generate high-quality frames.

Image Manipulation

Learning Spatiotemporal Frequency-Transformer for Compressed Video Super-Resolution

1 code implementation • 5 Aug 2022 • Zhongwei Qiu, Huan Yang, Jianlong Fu, Dongmei Fu

First, we divide a video frame into patches, and transform each patch into DCT spectral maps in which each channel represents a frequency band.

Video Enhancement, Video Super-Resolution
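
The patch-wise DCT step in the entry above can be illustrated with a minimal sketch. It is not the authors' implementation: the 8x8 patch size, the grayscale input, the orthonormal DCT-II, and the function names `dct_matrix` and `frame_to_dct_maps` are all assumptions made for illustration.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Return the orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n)[:, None]  # frequency index (rows)
    i = np.arange(n)[None, :]  # sample index (columns)
    mat = np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    mat[0] *= 1.0 / np.sqrt(2.0)  # scale the DC row so the matrix is orthonormal
    return mat * np.sqrt(2.0 / n)


def frame_to_dct_maps(frame: np.ndarray, patch: int = 8) -> np.ndarray:
    """Split a grayscale frame (H, W) into patches and apply a 2D DCT to each.

    Returns an array of shape (patch * patch, H // patch, W // patch) in which
    channel c holds frequency band c of every patch (hypothetical layout).
    """
    h, w = frame.shape
    if h % patch or w % patch:
        raise ValueError("frame size must be a multiple of the patch size")
    d = dct_matrix(patch)
    # Rearrange into (rows of patches, cols of patches, patch, patch).
    blocks = frame.reshape(h // patch, patch, w // patch, patch).transpose(0, 2, 1, 3)
    coeffs = d @ blocks @ d.T  # 2D DCT of every patch via broadcasting
    return coeffs.reshape(h // patch, w // patch, patch * patch).transpose(2, 0, 1)
```

With this layout, channel 0 collects the DC term of every patch and higher channels collect progressively higher frequency bands, so a network can treat each band as a separate feature map.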

Online Video Super-Resolution with Convolutional Kernel Bypass Graft

no code implementations • 4 Aug 2022 • Jun Xiao, Xinyang Jiang, Ningxin Zheng, Huan Yang, Yifan Yang, Yuqing Yang, Dongsheng Li, Kin-Man Lam

Then, our proposed CKBG method enhances this lightweight base model by bypassing the original network with "kernel grafts", which are extra convolutional kernels containing the prior knowledge of external pretrained image SR models.

Transfer Learning, Video Super-Resolution

TTVFI: Learning Trajectory-Aware Transformer for Video Frame Interpolation

no code implementations • 19 Jul 2022 • Chengxu Liu, Huan Yang, Jianlong Fu, Xueming Qian

In particular, we formulate the warped features with inconsistent motions as query tokens, and formulate relevant regions in a motion trajectory from two original consecutive frames into keys and values.

Video Frame Interpolation

Degradation-Guided Meta-Restoration Network for Blind Super-Resolution

no code implementations • 3 Jul 2022 • Fuzhi Yang, Huan Yang, Yanhong Zeng, Jianlong Fu, Hongtao Lu

The extractor estimates the degradations in LR inputs and guides the meta-restoration modules to predict restoration parameters for different degradations on-the-fly.

Blind Super-Resolution, Image Restoration

UID2021: An Underwater Image Dataset for Evaluation of No-reference Quality Assessment Metrics

1 code implementation • 19 Apr 2022 • Guojia Hou, YuXuan Li, Huan Yang, Kunqian Li, Zhenkuan Pan

Achieving subjective and objective quality assessment of underwater images is of high significance in underwater visual perception and image/video processing.

Image Enhancement, Image Quality Assessment

Learning Trajectory-Aware Transformer for Video Super-Resolution

1 code implementation • CVPR 2022 • Chengxu Liu, Huan Yang, Jianlong Fu, Xueming Qian

Existing approaches usually align and aggregate video frames from limited adjacent frames (e.g., 5 or 7 frames), which prevents these approaches from achieving satisfactory results.

Video Super-Resolution

Collaborative Learning in General Graphs with Limited Memorization: Learnability, Complexity and Reliability

no code implementations • 29 Jan 2022 • Feng Li, Xuyang Yuan, Lina Wang, Huan Yang, Dongxiao Yu, Weifeng Lv, Xiuzhen Cheng

We consider a K-armed bandit problem in general graphs where agents are arbitrarily connected and each of them has limited memorization and communication bandwidth.


Advancing High-Resolution Video-Language Representation with Large-Scale Video Transcriptions

1 code implementation • CVPR 2022 • Hongwei Xue, Tiankai Hang, Yanhong Zeng, Yuchong Sun, Bei Liu, Huan Yang, Jianlong Fu, Baining Guo

To enable VL pre-training, we jointly optimize the HD-VILA model by a hybrid Transformer that learns rich spatiotemporal features, and a multimodal Transformer that enforces interactions of the learned video features with diversified texts.

Retrieval, Super-Resolution

Improving Visual Quality of Image Synthesis by A Token-based Generator with Transformers

no code implementations • NeurIPS 2021 • Yanhong Zeng, Huan Yang, Hongyang Chao, Jianbo Wang, Jianlong Fu

Given a sequence of style tokens, the TokenGAN is able to control the image synthesis by assigning the styles to the content tokens through the attention mechanism of a Transformer.

Image Generation

Learning Fine-Grained Motion Embedding for Landscape Animation

no code implementations • 6 Sep 2021 • Hongwei Xue, Bei Liu, Huan Yang, Jianlong Fu, Houqiang Li, Jiebo Luo

To tackle this problem, we propose a model named FGLA to generate high-quality and realistic videos by learning Fine-Grained motion embedding for Landscape Animation.

Domain-Aware Universal Style Transfer

1 code implementation • ICCV 2021 • Kibeom Hong, Seogkyu Jeon, Huan Yang, Jianlong Fu, Hyeran Byun

To this end, we design a novel domainness indicator that captures the domainness value from the texture and structural features of reference images.

Style Transfer

Friedel Oscillations of Vortex Bound States Under Extreme Quantum Limit in KCa2Fe4As4F2

no code implementations • 24 Feb 2021 • Xiaoyu Chen, Wen Duan, Xinwei Fan, Wenshan Hong, Kailun Chen, Huan Yang, Shiliang Li, Huiqian Luo, Hai-Hu Wen

We report the observation of discrete vortex bound states with the energy levels deviating from the widely believed ratio of 1:3:5 in the vortices of an iron-based superconductor KCa2Fe4As4F2 through scanning tunneling microscopy (STM).

Superconductivity, Strongly Correlated Electrons

Single particle tunneling spectroscopy and superconducting gaps in layered iron based superconductor KCa$_{2}$Fe$_{4}$As$_{4}$F$_{2}$

no code implementations • 17 Feb 2021 • Wen Duan, Kailun Chen, Wenshan Hong, Xiaoyu Chen, Huan Yang, Shiliang Li, Huiqian Luo, Hai-Hu Wen

On the second type of surface, which is rarely obtained, the fully gapped feature can still be observed in the tunneling spectra, although multiple gaps are obtained either from a single spectrum or separate ones, and the gap values determined from coherence peaks lie mainly in the range from 4 to 8 meV.


A Lyman-α protocluster at redshift 6.9

no code implementations • 25 Jan 2021 • Weida Hu, Junxian Wang, Leopoldo Infante, James E. Rhoads, Zhen-Ya Zheng, Huan Yang, Sangeeta Malhotra, L. Felipe Barrientos, Chunyan Jiang, Jorge González-López, Gonzalo Prieto, Lucia A. Perez, Pascale Hibon, Gaspar Galaz, Alicia Coughlin, Santosh Harish, Xu Kong, Wenyong Kang, Ali Ahmad Khostovan, John Pharo, Francisco Valdes, Isak Wold, Alistair R. Walker, XianZhong Zheng

Here we report the discovery of the protocluster LAGER-z7OD1 at a redshift of 6.93, when the Universe was only 770 million years old and could be experiencing rapid evolution of the neutral hydrogen fraction in the intergalactic medium.

Astrophysics of Galaxies

Formation Rate of Extreme Mass Ratio Inspirals in Active Galactic Nuclei

no code implementations • 22 Jan 2021 • Zhen Pan, Huan Yang

In this work, we calculate the rate of EMRIs of an alternative formation channel: EMRI formation assisted by the accretion flow around accreting massive black holes.

High Energy Astrophysical Phenomena, General Relativity and Quantum Cosmology

Reduced Reference Perceptual Quality Model and Application to Rate Control for 3D Point Cloud Compression

no code implementations • 25 Nov 2020 • Qi Liu, Hui Yuan, Raouf Hamzaoui, Honglei Su, Junhui Hou, Huan Yang

In rate-distortion optimization, the encoder settings are determined by maximizing a reconstruction quality measure subject to a constraint on the bit rate.
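
The constrained selection described above can be sketched as a search over a finite set of candidate encoder settings. This is only an illustration, not the paper's method: the dictionary keys `bitrate` and `quality`, the function name `pick_setting`, and the assumption that both values are already measured for each candidate are hypothetical.

```python
from typing import Optional


def pick_setting(candidates: list[dict], max_bitrate: float) -> Optional[dict]:
    """Return the candidate with the highest quality whose bit rate fits the budget.

    Each candidate is a dict with hypothetical keys "bitrate" and "quality".
    Returns None when no candidate satisfies the bit-rate constraint.
    """
    feasible = [c for c in candidates if c["bitrate"] <= max_bitrate]
    if not feasible:
        return None
    return max(feasible, key=lambda c: c["quality"])
```

An exhaustive search like this is only practical for a small candidate set; a quality model that predicts `quality` from the settings would replace the measured values in practice.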


Full Reference Screen Content Image Quality Assessment by Fusing Multi-level Structure Similarity

1 code implementation • 7 Aug 2020 • Chenglizhao Chen, Hongmeng Zhao, Huan Yang, Chong Peng, Teng Yu

Screen content images (SCIs) usually comprise various content types with sharp edges, in which the artifacts or distortions can be well sensed by the vanilla structure similarity measurement in a full reference manner.

Image Quality Assessment

Physical properties revealed by transport measurements on superconducting Nd$_{0.8}$Sr$_{0.2}$NiO$_{2}$ thin films

no code implementations • 9 Jul 2020 • Ying Xiang, Qing Li, Yueying Li, Huan Yang, Yuefeng Nie, Hai-Hu Wen

The angle-dependent resistivity at a fixed temperature and different magnetic fields cannot be scaled to one curve, which deviates from the prediction of the anisotropic Ginzburg-Landau theory.

Superconductivity, Materials Science, Strongly Correlated Electrons

Learning Texture Transformer Network for Image Super-Resolution

1 code implementation • CVPR 2020 • Fuzhi Yang, Huan Yang, Jianlong Fu, Hongtao Lu, Baining Guo

In this paper, we propose a novel Texture Transformer Network for Image Super-Resolution (TTSR), in which the LR and Ref images are formulated as queries and keys in a transformer, respectively.

Hard Attention, Image Generation
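
As a rough illustration of treating LR features as queries and Ref features as keys, the sketch below computes a relevance matrix and a hard (argmax) match per query. The cosine-similarity relevance, the flat feature-vector inputs, and the name `relevance_and_hard_index` are assumptions for illustration; the entry above does not specify how TTSR computes attention.

```python
import numpy as np


def relevance_and_hard_index(queries: np.ndarray, keys: np.ndarray):
    """Match each query feature to its most relevant key feature.

    queries: (num_q, dim) features, e.g. from LR patches (assumed layout).
    keys:    (num_k, dim) features, e.g. from Ref patches (assumed layout).
    Returns the (num_q, num_k) cosine-similarity matrix and, for each query,
    the index of the best-matching key (hard attention).
    """
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    k = keys / np.linalg.norm(keys, axis=1, keepdims=True)
    relevance = q @ k.T
    return relevance, relevance.argmax(axis=1)
```

The returned indices could then be used to gather the matching Ref features for each LR position, while the maximum relevance value can serve as a confidence weight.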

Application of Structural Similarity Analysis of Visually Salient Areas and Hierarchical Clustering in the Screening of Similar Wireless Capsule Endoscopic Images

no code implementations • 1 Apr 2020 • Rui Nie, Huan Yang, Hejuan Peng, Wenbin Luo, Weiya Fan, Jie Zhang, Jing Liao, Fang Huang, Yufeng Xiao

Small intestinal capsule endoscopy is the mainstream method for inspecting small intestinal lesions, but a single small intestinal capsule endoscopy will produce 60,000 to 120,000 images, the majority of which are similar and have no diagnostic value.

Unsupervised Extraction of Video Highlights Via Robust Recurrent Auto-encoders

no code implementations • ICCV 2015 • Huan Yang, Baoyuan Wang, Stephen Lin, David Wipf, Minyi Guo, Baining Guo

With the growing popularity of short-form video sharing platforms such as Instagram and Vine, there has been an increasing need for techniques that automatically extract highlights from video.

