no code implementations • 25 Oct 2024 • Fengbin Zhu, Ziyang Liu, Xiang Yao Ng, Haohui Wu, Wenjie Wang, Fuli Feng, Chao Wang, Huanbo Luan, Tat Seng Chua
Large Vision-Language Models (LVLMs) have achieved remarkable performance in many vision-language tasks, yet their capabilities in fine-grained visual understanding remain insufficiently evaluated.
1 code implementation • 3 Apr 2020 • Wentian Li, Xidong Feng, Haotian An, Xiang Yao Ng, Yu-Jin Zhang
In this work, we propose a deep reinforcement learning-based method to reconstruct corrupted images with meaningful pixel-wise operations (e.g., edge-enhancing filters), so that the reconstruction process is transparent to users.