no code implementations • 25 Apr 2024 • Juyong Lee, Taywon Min, Minyong An, Dongyoon Hahm, Haeone Lee, Changyeon Kim, Kimin Lee
In this work, we introduce B-MoCA: a novel benchmark with interactive environments for evaluating and developing mobile device control agents.
1 code implementation • 2 Apr 2024 • KyuYoung Kim, Jongheon Jeong, Minyong An, Mohammad Ghavamzadeh, Krishnamurthy Dvijotham, Jinwoo Shin, Kimin Lee
To investigate this issue in depth, we introduce the Text-Image Alignment Assessment (TIA2) benchmark, which comprises a diverse collection of text prompts, images, and human annotations.