no code implementations • 13 Dec 2023 • Tsung-Han Wu, Giscard Biamby, David Chan, Lisa Dunlap, Ritwik Gupta, Xudong Wang, Joseph E. Gonzalez, Trevor Darrell
Current open-source Large Multimodal Models (LMMs) excel at tasks such as open-vocabulary language grounding and segmentation but can suffer under false premises when queries imply the existence of something that is not actually present in the image.
1 code implementation • 28 Nov 2022 • Grace Luo, Giscard Biamby, Trevor Darrell, Daniel Fried, Anna Rohrbach
We propose the task of Geolocation via Guidebook Grounding that uses a dataset of StreetView images from a diverse set of locations and an associated textual guidebook for GeoGuessr, a popular interactive geolocation game.
1 code implementation • NAACL 2022 • Giscard Biamby, Grace Luo, Trevor Darrell, Anna Rohrbach
Detecting out-of-context media, such as "mis-captioned" images on Twitter, is a relevant problem, especially in domains of high public significance.
no code implementations • 20 Aug 2021 • Michael Laielli, Giscard Biamby, Dian Chen, Ritwik Gupta, Adam Loeffler, Phat Dat Nguyen, Ross Luo, Trevor Darrell, Sayna Ebrahimi
Active learning for object detection is conventionally achieved by applying techniques developed for classification in a way that aggregates individual detections into image-level selection criteria.
no code implementations • 18 Dec 2020 • Sayna Ebrahimi, William Gan, Dian Chen, Giscard Biamby, Kamyar Salahi, Michael Laielli, Shizhan Zhu, Trevor Darrell
Active learning aims to develop label-efficient algorithms by querying the most representative samples to be labeled by a human annotator.