no code implementations • 29 Jul 2024 • Chia-Hao Kao, Cheng Chien, Yu-Jen Tseng, Yi-Hsin Chen, Alessandro Gnutti, Shao-Yuan Lo, Wen-Hsiao Peng, Riccardo Leonardi
MLLMs have extended the success of large language models to modalities (e.g., images) beyond text, but their billion scale hinders deployment on resource-constrained end devices.
no code implementations • 20 Feb 2024 • Zong-Lin Gao, Sang NguyenQuang, Wen-Hsiao Peng, Xiem HoangVan
To mitigate the domain shift, we present an online motion resolution adaptation (OMRA) method.
no code implementations • 20 Feb 2024 • Yi-Hsin Chen, Kuan-Wei Ho, Shiau-Rung Tsai, Guan-Hsun Lin, Alessandro Gnutti, Wen-Hsiao Peng, Riccardo Leonardi
Instead of training separate decoders for these tasks, we incorporate two add-on modules to adapt a pre-trained image decoder from performing the standard image reconstruction to joint decoding and denoising.
no code implementations • 12 Jan 2024 • Alessandro Gnutti, Stefano Della Fiore, Mattia Savardi, Yi-Hsin Chen, Riccardo Leonardi, Wen-Hsiao Peng
In this paper, we introduce a novel direction that harnesses LiDAR depth maps to enhance the compression of the corresponding RGB camera images.
no code implementations • 25 Dec 2023 • Yi-Hsin Chen, Hong-Sheng Xie, Cheng-Wei Chen, Zong-Lin Gao, Martin Benjak, Wen-Hsiao Peng, Jörn Ostermann
Conditional coding has lately emerged as the mainstream approach to learned video compression.
no code implementations • 22 Sep 2023 • Chia-Hao Kao, Yi-Hsin Chen, Cheng Chien, Wei-Chen Chiu, Wen-Hsiao Peng
This paper presents a Transformer-based image compression system that allows for a variable image quality objective according to the user's preference.
1 code implementation • ICCV 2023 • Su-Kai Chen, Hung-Lin Yen, Yu-Lun Liu, Min-Hung Chen, Hou-Ning Hu, Wen-Hsiao Peng, Yen-Yu Lin
To address this, we propose the continuous exposure value representation (CEVR), which uses an implicit function to generate LDR images with arbitrary EVs, including those unseen during training.
1 code implementation • ICCV 2023 • Si-Cun Chen, Yi-Hsin Chen, Yen-Yu Lin, Wen-Hsiao Peng
We motivate the use of forward motion from the perspective of learning individual motion trajectories, as opposed to learning a mixture of motion trajectories with backward motion.
1 code implementation • ICCV 2023 • Yi-Hsin Chen, Ying-Chieh Weng, Chia-Hao Kao, Cheng Chien, Wei-Chen Chiu, Wen-Hsiao Peng
This work aims to transfer a Transformer-based image compression codec from human perception to machine perception without fine-tuning the codec.
1 code implementation • 18 May 2023 • Chia-Hao Kao, Ying-Chieh Weng, Yi-Hsin Chen, Wei-Chen Chiu, Wen-Hsiao Peng
Our prompt generation networks generate content-adaptive tokens according to the input image, an ROI mask, and a rate parameter.
no code implementations • CVPR 2023 • David Alexandre, Hsueh-Ming Hang, Wen-Hsiao Peng
The rate-distortion performance of our scheme is slightly lower than that of the state-of-the-art learned B-frame coding scheme, B-CANF, but outperforms other learned B-frame coding schemes.
no code implementations • 13 Feb 2023 • Chih-Hsuan Lin, Yi-Hsin Chen, Wen-Hsiao Peng
This paper introduces an online motion rate adaptation scheme for learned video compression, with the aim of achieving content-adaptive coding on individual test sequences to mitigate the domain gap between training and test data.
no code implementations • 29 Dec 2022 • Mu-Jung Chen, Hong-Sheng Xie, Cheng Chien, Wen-Hsiao Peng, Hsueh-Ming Hang
Most learned video codecs operate internally in the RGB domain for P-frame coding.
1 code implementation • 22 Oct 2022 • Shih-Po Lee, Niraj Prakash Kini, Wen-Hsiao Peng, Ching-Wen Ma, Jenq-Neng Hwang
In addition to the benchmark, we propose a cross-modality training framework that trains on ground-truth 2D keypoints representing human body joints. These keypoints are systematically generated by a pre-trained 2D pose estimation network from a monocular camera input image, avoiding laborious manual annotation.
no code implementations • 15 Oct 2022 • Yung-Han Ho, Chih-Hsuan Lin, Peng-Yu Chen, Mu-Jung Chen, Chih-Peng Chang, Wen-Hsiao Peng, Hsueh-Ming Hang
To adapt our codec to YUV 4:2:0 content, we adopt a simple strategy of using space-to-depth and depth-to-space conversions.
no code implementations • 27 Sep 2022 • Yung-Han Ho, Chia-Hao Kao, Wen-Hsiao Peng, Ping-Chun Hsieh
Recently, the dual-critic design was proposed to update the actor by alternating between the rate and distortion critics.
1 code implementation • 5 Sep 2022 • Mu-Jung Chen, Yi-Hsin Chen, Wen-Hsiao Peng
Our B*-frames allow greater flexibility in specifying the group-of-pictures (GOP) structure by reusing the B-frame codec to mimic P-frame coding, without the need for an additional, separate P-frame codec.
1 code implementation • 12 Jul 2022 • Yung-Han Ho, Chih-Peng Chang, Peng-Yu Chen, Alessandro Gnutti, Wen-Hsiao Peng
CANF-VC represents a new attempt that leverages the conditional ANF to learn a video generative model for conditional inter-frame coding.
no code implementations • 10 Mar 2022 • Yung-Han Ho, Yun Liang, Chia-Hao Kao, Wen-Hsiao Peng
More recently, the dual-critic design was proposed to update the actor network by alternating between the rate and distortion critics.
1 code implementation • 18 Jul 2021 • Yung-Han Ho, Chih-Chun Chan, Wen-Hsiao Peng, Hsueh-Ming Hang, Marek Domanski
This paper introduces an end-to-end learned image compression system, termed ANFIC, based on Augmented Normalizing Flows (ANF).
no code implementations • 5 Apr 2021 • Yung-Han Ho, Guo-Lun Jin, Yun Liang, Wen-Hsiao Peng, Xiaobo Li
This paper introduces a dual-critic reinforcement learning (RL) framework to address the problem of frame-level bit allocation in HEVC/H.265.
1 code implementation • 31 Mar 2021 • Shun-Yi Pan, Cheng-You Lu, Shih-Po Lee, Wen-Hsiao Peng
One common approach to this task is to propagate the activation scores of Class Activation Maps (CAMs) using a random-walk mechanism in order to arrive at complete pseudo labels for training a semantic segmentation network in a fully-supervised manner.
1 code implementation • CVPR 2021 • Yan-Cheng Huang, Yi-Hsin Chen, Cheng-You Lu, Hui-Po Wang, Wen-Hsiao Peng, Ching-Chun Huang
Our Long Short-Term Memory Video Rescaling Network (LSTM-VRN) leverages temporal information in the low-resolution video to form an explicit prediction of the missing high-frequency information for upscaling.
1 code implementation • 16 Mar 2021 • Shih-Po Lee, Si-Cun Chen, Wen-Hsiao Peng
Moreover, we introduce a guided spatially-varying convolution for fusing segmentations derived from the previous and current frames, to mitigate propagation error and enable lightweight feature extraction on non-keyframes.
1 code implementation • 15 Dec 2020 • Cheng-Hsun Lei, Yi-Hsin Chen, Wen-Hsiao Peng, Wei-Chen Chiu
In this paper, we address the problem of distillation-based class-incremental learning with a single head.
no code implementations • ICLR 2018 • Hui-Po Wang, Wen-Hsiao Peng, Wei-Jan Ko
Most deep latent factor models choose simple priors for the sake of simplicity or tractability, or because it is unclear what prior to use.
no code implementations • 24 Jul 2019 • Yen-Wei Chang, Wen-Hsiao Peng
This paper tackles the problem of learning a questioner in the goal-oriented visual dialog task.
1 code implementation • CVPR 2019 • Wei-Lun Chang, Hui-Po Wang, Wen-Hsiao Peng, Wei-Chen Chiu
In this paper, we tackle the problem of unsupervised domain adaptation for the task of semantic segmentation, where we attempt to transfer the knowledge learned from synthetic datasets with ground-truth labels to real-world images without any annotation.
Ranked #25 on Image-to-Image Translation on SYNTHIA-to-Cityscapes
no code implementations • 20 Feb 2019 • David Alexandre, Chih-Peng Chang, Wen-Hsiao Peng, Hsueh-Ming Hang
We propose a lossy image compression system using the deep-learning autoencoder structure to participate in the Challenge on Learned Image Compression (CLIC) 2018.