31 Aug 2023 • Katherine Deng, Arijit Ray, Reuben Tan, Saadia Gabriel, Bryan A. Plummer, Kate Saenko
We further see that current captioning metrics based on large vision-language models also fail to correlate with human preferences.