Search Results for author: Tobias Weyand

Found 12 papers, 7 papers with code

VideoPrism: A Foundational Visual Encoder for Video Understanding

no code implementations • 20 Feb 2024 • Long Zhao, Nitesh B. Gundavarapu, Liangzhe Yuan, Hao Zhou, Shen Yan, Jennifer J. Sun, Luke Friedman, Rui Qian, Tobias Weyand, Yue Zhao, Rachel Hornung, Florian Schroff, Ming-Hsuan Yang, David A. Ross, Huisheng Wang, Hartwig Adam, Mikhail Sirotenko, Ting Liu, Boqing Gong

We introduce VideoPrism, a general-purpose video encoder that tackles diverse video understanding tasks with a single frozen model.

Question Answering Video Question Answering +1

VideoGLUE: Video General Understanding Evaluation of Foundation Models

1 code implementation • 6 Jul 2023 • Liangzhe Yuan, Nitesh Bharadwaj Gundavarapu, Long Zhao, Hao Zhou, Yin Cui, Lu Jiang, Xuan Yang, Menglin Jia, Tobias Weyand, Luke Friedman, Mikhail Sirotenko, Huisheng Wang, Florian Schroff, Hartwig Adam, Ming-Hsuan Yang, Ting Liu, Boqing Gong

We evaluate existing foundation models' video understanding capabilities using a carefully designed experiment protocol consisting of three hallmark tasks (action recognition, temporal localization, and spatiotemporal localization), eight datasets well received by the community, and four adaptation methods tailoring a foundation model (FM) for a downstream task.

Action Recognition Temporal Localization +1

76,589

Improving Fairness in Large-Scale Object Recognition by CrowdSourced Demographic Information

no code implementations • 2 Jun 2022 • Zu Kim, André Araujo, Bingyi Cao, Cam Askew, Jack Sim, Mike Green, N'Mah Fodiatu Yilla, Tobias Weyand

We showcase its application to the landmark recognition domain, presenting a detailed analysis and the final fairer landmark rankings.

Cultural Vocal Bursts Intensity Prediction Fairness +2

Towards A Fairer Landmark Recognition Dataset

no code implementations • 19 Aug 2021 • Zu Kim, André Araujo, Bingyi Cao, Cam Askew, Jack Sim, Mike Green, N'Mah Fodiatu Yilla, Tobias Weyand

To create a more comprehensive and equitable dataset, we start by defining the fair relevance of a landmark to the world population.

Landmark Recognition

Nutrition5k: Towards Automatic Nutritional Understanding of Generic Food

1 code implementation • CVPR 2021 • Quin Thames, Arjun Karpur, Wade Norris, Fangting Xia, Liviu Panait, Tobias Weyand, Jack Sim

Understanding the nutritional content of food from visual data is a challenging computer vision problem, with the potential to have a positive and widespread impact on public health.

Nutrition

130

Google Landmarks Dataset v2 - A Large-Scale Benchmark for Instance-Level Recognition and Retrieval

1 code implementation • CVPR 2020 • Tobias Weyand, Andre Araujo, Bingyi Cao, Jack Sim

GLDv2 is the largest such dataset to date by a large margin, including over 5M images and 200k distinct instance labels.

Image Retrieval Retrieval +1

719

Google Landmarks Dataset v2 -- A Large-Scale Benchmark for Instance-Level Recognition and Retrieval

4 code implementations • 3 Apr 2020 • Tobias Weyand, Andre Araujo, Bingyi Cao, Jack Sim

GLDv2 is the largest such dataset to date by a large margin, including over 5M images and 200k distinct instance labels.

Ranked #1 on Landmark Recognition on Google Landmarks Dataset v2 (recognition, validation) (using extra training data)

Image Retrieval Landmark Recognition +2

76,590

CPlaNet: Enhancing Image Geolocalization by Combinatorial Partitioning of Maps

no code implementations • ECCV 2018 • Paul Hongsuck Seo, Tobias Weyand, Jack Sim, Bohyung Han

Image geolocalization is the task of identifying the location depicted in a photo based only on its visual information.

Ranked #1 on Photo geolocation estimation on Im2GPS (Reference images metric)

Photo geolocation estimation

MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications

153 code implementations • 17 Apr 2017 • Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco Andreetto, Hartwig Adam

We present a class of efficient models called MobileNets for mobile and embedded vision applications.

Ranked #238 on Object Detection on COCO test-dev

General Classification Image Classification +1

182,441

Large-Scale Image Retrieval with Attentive Deep Local Features

12 code implementations • ICCV 2017 • Hyeonwoo Noh, Andre Araujo, Jack Sim, Tobias Weyand, Bohyung Han

We propose an attentive local feature descriptor suitable for large-scale image retrieval, referred to as DELF (DEep Local Feature).

Ranked #2 on Image Retrieval on Oxf105k

Image Retrieval Retrieval

76,590

PlaNet - Photo Geolocation with Convolutional Neural Networks

1 code implementation • 17 Feb 2016 • Tobias Weyand, Ilya Kostrikov, James Philbin

Is it possible to build a system to determine the location where a photo was taken using just its pixels?

Ranked #1 on Photo geolocation estimation on Im2GPS (Reference images metric)

Image Retrieval Photo geolocation estimation +1

Visual Landmark Recognition from Internet Photo Collections: A Large-Scale Evaluation

no code implementations • 18 Sep 2014 • Tobias Weyand, Bastian Leibe

We evaluate how different choices of methods and parameters for the individual pipeline steps affect overall system performance and examine their effects for different query categories such as buildings, paintings or sculptures.

Clustering Landmark Recognition
