STAR: A Structure-Aware Lightweight Transformer for Real-Time Image Enhancement

ICCV 2021 · Zhaoyang Zhang, Yitong Jiang, Jun Jiang, Xiaogang Wang, Ping Luo, Jinwei Gu ·

Image and video enhancement such as color constancy, low light enhancement, and tone mapping on smartphones is challenging because high-quality images should be achieved efficiently with a limited resource budget. Unlike prior works that either used very deep CNNs or large Transformer models, we propose a \underline s eman\underline t ic-\underline a wa\underline r e lightweight Transformer, termed STAR, for real-time image enhancement. STAR is formulated to capture long-range dependencies between image patches, which naturally and implicitly captures the semantic relationships of different regions in an image. STAR is a general architecture that can be easily adapted to different image enhancement tasks. Extensive experiments show that STAR can effectively boost the quality and efficiency of many tasks such as illumination enhancement, auto white balance, and photo retouching, which are indispensable components for image processing on smartphones. For example, STAR reduces model complexity and improves image quality compared to the recent state-of-the-art [??] on the MIT-Adobe FiveK dataset [??] (i.e., 1.8dB PSNR improvements with 25% parameters and 13% float operations.)

PDF Abstract