1 code implementation • 21 Aug 2024 • Maksim Smirnov, Aleksandr Gushchin, Anastasia Antsiferova, Dmitry Vatolin, Radu Timofte, Ziheng Jia, ZiCheng Zhang, Wei Sun, Jiaying Qian, Yuqin Cao, Yinan Sun, Yuxin Zhu, Xiongkuo Min, Guangtao Zhai, Kanjar De, Qing Luo, Ao-Xiang Zhang, Peng Zhang, Haibo Lei, Linyan Jiang, Yaqing Li, Wenhui Meng, Xiaoheng Tan, Haiqiang Wang, Xiaozhong Xu, Shan Liu, Zhenzhong Chen, Zhengxue Cheng, Jiahao Xiao, Jun Xu, Chenlong He, Qi Zheng, Ruoxi Zhu, Min Li, Yibo Fan, Zhengzhong Tu
The challenge aimed to evaluate the performance of VQA methods on a diverse dataset of 459 videos, encoded with 14 codecs of various compression standards (AVC/H. 264, HEVC/H. 265, AV1, and VVC/H. 266) and containing a comprehensive collection of compression artifacts.
no code implementations • 11 Apr 2024 • Yinan Sun, Xiongkuo Min, Huiyu Duan, Guangtao Zhai
Finally, considering the effect of text descriptions on visual attention, while most existing saliency models ignore this impact, we further propose a text-guided saliency (TGSal) prediction model, which extracts and integrates both image features and text features to predict the image saliency under various text-description conditions.
no code implementations • 14 Feb 2024 • Yang Qian, Yinan Sun, Ali Kargarandehkordi, Parnian Azizian, Onur Cezmi Mutlu, Saimourya Surabhi, Pingyi Chen, Zain Jabbar, Dennis Paul Wall, Peter Washington
These results highlight the potential of using unstructured and unlabeled videos as a valuable source of diverse and dynamic content for training foundation models.
no code implementations • 1 Aug 2022 • Yandong Ji, Zhongyu Li, Yinan Sun, Xue Bin Peng, Sergey Levine, Glen Berseth, Koushil Sreenath
Developing algorithms to enable a legged robot to shoot a soccer ball to a given target is a challenging problem that combines robot motion control and planning into one task.