1 code implementation • 9 Aug 2024 • Haider Al-Tahan, Quentin Garrido, Randall Balestriero, Diane Bouchacourt, Caner Hazirbas, Mark Ibrahim
To facilitate a systematic evaluation of VLM progress, we introduce UniBench: a unified implementation of 50+ VLM benchmarks spanning a comprehensive range of carefully categorized capabilities from object recognition to spatial awareness, counting, and much more.
no code implementations • 11 Feb 2024 • Caner Hazirbas, Alicia Sun, Yonathan Efroni, Mark Ibrahim
Despite the remarkable performance of foundation vision-language models, the shared representation space for text and vision can also encode harmful label associations detrimental to fairness.
no code implementations • 26 Sep 2023 • Jiachen Sun, Mark Ibrahim, Melissa Hall, Ivan Evtimov, Z. Morley Mao, Cristian Canton Ferrer, Caner Hazirbas
Inspired by the success of textual prompting, several studies have investigated the efficacy of visual prompt tuning.
no code implementations • 20 Jun 2023 • Maxim Maximov, Tim Meinhardt, Ismail Elezi, Zoe Papakipos, Caner Hazirbas, Cristian Canton Ferrer, Laura Leal-Taixé
To highlight the importance of privacy issues and motivate future research, we motivate and introduce the Pedestrian Dataset De-Identification (PDI) task.
1 code implementation • 11 Apr 2023 • Laura Gustafson, Megan Richards, Melissa Hall, Caner Hazirbas, Diane Bouchacourt, Mark Ibrahim
As an example, we show that mitigating a model's vulnerability to texture can improve performance on the lower income level.
no code implementations • 8 Mar 2023 • Bilal Porgali, Vítor Albiero, Jordan Ryda, Cristian Canton Ferrer, Caner Hazirbas
This paper introduces a new large consent-driven dataset aimed at assisting in the evaluation of algorithmic bias and robustness of computer vision and audio speech models in regards to 11 attributes that are self-provided or labeled by trained annotators.
1 code implementation • CVPR 2023 • Zhiheng Li, Ivan Evtimov, Albert Gordo, Caner Hazirbas, Tal Hassner, Cristian Canton Ferrer, Chenliang Xu, Mark Ibrahim
Key to advancing the reliability of vision systems is understanding whether existing methods can overcome multiple shortcuts or struggle in a Whac-A-Mole game, i. e., where mitigating one shortcut amplifies reliance on others.
Ranked #1 on Out-of-Distribution Generalization on ImageNet-W
no code implementations • 10 Nov 2022 • Caner Hazirbas, Yejin Bang, Tiezheng Yu, Parisa Assar, Bilal Porgali, Vítor Albiero, Stefan Hermanek, Jacqueline Pan, Emily McReynolds, Miranda Bogen, Pascale Fung, Cristian Canton Ferrer
Developing robust and fair AI systems require datasets with comprehensive set of labels that can help ensure the validity and legitimacy of relevant measurements.
no code implementations • 3 Nov 2022 • Badr Youbi Idrissi, Diane Bouchacourt, Randall Balestriero, Ivan Evtimov, Caner Hazirbas, Nicolas Ballas, Pascal Vincent, Michal Drozdzal, David Lopez-Paz, Mark Ibrahim
Equipped with ImageNet-X, we investigate 2, 200 current recognition models and study the types of mistakes as a function of model's (1) architecture, e. g. transformer vs. convolutional, (2) learning paradigm, e. g. supervised vs. self-supervised, and (3) training procedures, e. g., data augmentation.
no code implementations • CVPR 2022 • Vikash Sehwag, Caner Hazirbas, Albert Gordo, Firat Ozgenel, Cristian Canton Ferrer
We observe that uniform sampling from diffusion models predominantly samples from high-density regions of the data manifold.
1 code implementation • 15 Feb 2022 • Priya Goyal, Adriana Romero Soriano, Caner Hazirbas, Levent Sagun, Nicolas Usunier
Systematic diagnosis of fairness, harms, and biases of computer vision systems is an important step towards building socially responsible systems.
no code implementations • 18 Nov 2021 • Chunxi Liu, Michael Picheny, Leda Sari, Pooja Chitkara, Alex Xiao, Xiaohui Zhang, Mark Chou, Andres Alvarado, Caner Hazirbas, Yatharth Saraf
This paper presents initial Speech Recognition results on "Casual Conversations" -- a publicly released 846 hour corpus designed to help researchers evaluate their computer vision and audio models for accuracy across a diverse set of metadata, including age, gender, and skin tone.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +2
no code implementations • 17 Jun 2021 • Ousmane Amadou Dia, Theofanis Karaletsos, Caner Hazirbas, Cristian Canton Ferrer, Ilknur Kaynar Kabul, Erik Meijer
Under this threat model, we create adversarial examples by perturbing only regions in the inputs where a classifier is uncertain.
no code implementations • 6 Apr 2021 • Caner Hazirbas, Joanna Bitton, Brian Dolhansky, Jacqueline Pan, Albert Gordo, Cristian Canton Ferrer
The videos were recorded in multiple U. S. states with a diverse set of adults in various age, gender and apparent skin tone groups.
1 code implementation • 19 Jan 2018 • Nikolaus Mayer, Eddy Ilg, Philipp Fischer, Caner Hazirbas, Daniel Cremers, Alexey Dosovitskiy, Thomas Brox
The finding that very large networks can be trained efficiently and reliably has led to a paradigm shift in computer vision from engineered solutions to learning formulations.
1 code implementation • ICCV 2017 • Tim Meinhardt, Michael Moeller, Caner Hazirbas, Daniel Cremers
While variational methods have been among the most powerful tools for solving linear inverse problems in imaging, deep (convolutional) neural networks have recently taken the lead in many challenging benchmarks.
5 code implementations • 4 Apr 2017 • Caner Hazirbas, Sebastian Georg Soyer, Maximilian Christian Staab, Laura Leal-Taixé, Daniel Cremers
Depth from focus (DFF) is one of the classical ill-posed inverse problems in computer vision.
no code implementations • ICCV 2017 • Florian Walch, Caner Hazirbas, Laura Leal-Taixé, Torsten Sattler, Sebastian Hilsenbeck, Daniel Cremers
In this work we propose a new CNN+LSTM architecture for camera pose regression for indoor and outdoor scenes.