Search Results for author: Olga Russakovsky

Found 52 papers, 28 papers with code

CornerNet-Lite: Efficient Keypoint Based Object Detection

6 code implementations · 18 Apr 2019 · Hei Law, Yun Teng, Olga Russakovsky, Jia Deng

Together these two variants address the two critical use cases in efficient object detection: improving efficiency without sacrificing accuracy, and improving accuracy at real-time efficiency.

Object · Object Detection +1

Remember the Past: Distilling Datasets into Addressable Memories for Neural Networks

2 code implementations · 6 Jun 2022 · Zhiwei Deng, Olga Russakovsky

We propose an algorithm that compresses the critical information of a large dataset into compact addressable memories.

Continual Learning
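The one-line summary above — compressing a dataset into compact addressable memories — can be sketched in a few lines. The sizes, names (`memory`, `addresses`, `retrieve`), and the plain linear read below are illustrative assumptions for the general idea, not the paper's actual parameterization or training procedure:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a shared memory of K basis vectors in d dimensions,
# plus per-class address (coefficient) matrices that are the only
# class-specific parameters.
K, d, n_classes, per_class = 16, 32, 3, 4

memory = rng.normal(size=(K, d))                        # shared addressable memory
addresses = rng.normal(size=(n_classes, per_class, K))  # learned per-class addresses

def retrieve(class_id):
    """Compose synthetic training examples for one class as linear reads
    of the shared memory (addresses @ memory)."""
    return addresses[class_id] @ memory                 # (per_class, d)

# A distilled "dataset" is just the retrieved examples for every class.
batch = np.concatenate([retrieve(c) for c in range(n_classes)])
print(batch.shape)  # (12, 32)
```

In a real distillation setup both `memory` and `addresses` would be optimized so that a network trained on the retrieved examples approximates one trained on the full dataset; the appeal for continual learning is that only the small per-class address matrices grow with new classes.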

Vision-Language Dataset Distillation

2 code implementations · 15 Aug 2023 · Xindi Wu, Byron Zhang, Zhiwei Deng, Olga Russakovsky

In this work, we design the first vision-language dataset distillation method, building on the idea of trajectory matching.

Image Classification · Image-to-Text Retrieval +2
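Trajectory matching, the idea this method builds on, can be sketched with a toy linear model: roll a student out on the distilled data and penalize its distance from a pre-recorded expert checkpoint. All names and sizes here are hypothetical, and the actual method matches multimodal network checkpoints rather than a single weight vector:

```python
import numpy as np

# Toy linear "network": theta in R^d. We match the parameter trajectory of a
# student trained on distilled data to a pre-recorded expert trajectory.
d, lr, student_steps = 8, 0.1, 5
rng = np.random.default_rng(1)

expert_start = rng.normal(size=d)
expert_end = rng.normal(size=d)          # expert params after N real-data steps

x_syn = rng.normal(size=(4, d))          # distilled inputs (the learnable data)
y_syn = rng.normal(size=4)               # distilled targets

theta = expert_start.copy()
for _ in range(student_steps):           # short student rollout on synthetic data
    grad = x_syn.T @ (x_syn @ theta - y_syn) / len(y_syn)
    theta -= lr * grad

# Trajectory-matching loss: distance to the expert endpoint, normalized by
# how far the expert itself moved.
loss = np.sum((theta - expert_end) ** 2) / np.sum((expert_start - expert_end) ** 2)
print(round(float(loss), 4))
```

Gradients of this loss with respect to `x_syn` and `y_syn` (via the unrolled student steps) are what update the distilled data in practice.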

ImageNet Large Scale Visual Recognition Challenge

12 code implementations · 1 Sep 2014 · Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C. Berg, Li Fei-Fei

The ImageNet Large Scale Visual Recognition Challenge is a benchmark in object category classification and detection on hundreds of object categories and millions of images.

General Classification · Image Classification +4

End-to-end Learning of Action Detection from Frame Glimpses in Videos

1 code implementation · CVPR 2016 · Serena Yeung, Olga Russakovsky, Greg Mori, Li Fei-Fei

In this work we introduce a fully end-to-end approach for action detection in videos that learns to directly predict the temporal bounds of actions.

Ranked #9 on Temporal Action Localization on THUMOS’14 (mAP IoU@0.2 metric)

Action Detection · Temporal Action Localization

What Actions are Needed for Understanding Human Actions in Videos?

1 code implementation · ICCV 2017 · Gunnar A. Sigurdsson, Olga Russakovsky, Abhinav Gupta

We present the many kinds of information that will be needed to achieve substantial gains in activity understanding: objects, verbs, intent, and sequential reasoning.

Benchmarking

Fair Attribute Classification through Latent Space De-biasing

1 code implementation · CVPR 2021 · Vikram V. Ramaswamy, Sunnie S. Y. Kim, Olga Russakovsky

Fairness in visual recognition is becoming a prominent and critical topic of discussion as recognition systems are deployed at scale in the real world.

Attribute Classification +2

A Study of Face Obfuscation in ImageNet

1 code implementation · 10 Mar 2021 · Kaiyu Yang, Jacqueline Yau, Li Fei-Fei, Jia Deng, Olga Russakovsky

In this paper, we explore the effects of face obfuscation on the popular ImageNet challenge visual recognition benchmark.

Attribute · Object +5

What's the Point: Semantic Segmentation with Point Supervision

1 code implementation · 6 Jun 2015 · Amy Bearman, Olga Russakovsky, Vittorio Ferrari, Li Fei-Fei

The semantic image segmentation task presents a trade-off between test time accuracy and training-time annotation cost.

Image Segmentation · Semantic Segmentation
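The cheap end of the trade-off above is point supervision: a segmentation loss computed only at the handful of annotated pixels. This is a minimal sketch of that idea (the paper's full objective also includes image-level and objectness terms, which are omitted here):

```python
import numpy as np

def point_supervised_loss(logits, point_labels):
    """Cross-entropy evaluated only at annotated pixels.

    logits: (H, W, C) per-pixel class scores.
    point_labels: (H, W) with the class index at annotated pixels, -1 elsewhere.
    """
    mask = point_labels >= 0
    z = logits[mask]                                   # (P, C): annotated pixels only
    z = z - z.max(axis=1, keepdims=True)               # numerically stable log-softmax
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(z)), point_labels[mask]].mean()

logits = np.zeros((4, 4, 3))                           # uniform predictions
labels = -np.ones((4, 4), dtype=int)
labels[1, 2] = 0                                       # a single annotated point
print(round(float(point_supervised_loss(logits, labels)), 4))  # ln(3) ≈ 1.0986
```

Unannotated pixels contribute nothing to the gradient, which is exactly why a few clicks per image suffice as a training signal.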

Multi-Query Video Retrieval

1 code implementation · 10 Jan 2022 · Zeyu Wang, Yu Wu, Karthik Narasimhan, Olga Russakovsky

Retrieving target videos based on text descriptions is a task of great practical value and has received increasing attention over the past few years.

Retrieval · Video Retrieval

[Re] Don't Judge an Object by Its Context: Learning to Overcome Contextual Bias

1 code implementation · RC 2020 · Sunnie S. Y. Kim, Sharon Zhang, Nicole Meister, Olga Russakovsky

The implementation of most (7 of 10) methods was straightforward, especially after we received additional details from the original authors.

Attribute

HIVE: Evaluating the Human Interpretability of Visual Explanations

1 code implementation · 6 Dec 2021 · Sunnie S. Y. Kim, Nicole Meister, Vikram V. Ramaswamy, Ruth Fong, Olga Russakovsky

As AI technology is increasingly applied to high-impact, high-risk domains, there have been a number of new methods aimed at making AI models more human interpretable.

Decision Making

Point and Ask: Incorporating Pointing into Visual Question Answering

1 code implementation · 27 Nov 2020 · Arjun Mani, Nobline Yoo, Will Hinthorn, Olga Russakovsky

Concretely, we (1) introduce and motivate point-input questions as an extension of VQA, (2) define three novel classes of questions within this space, and (3) for each class, introduce both a benchmark dataset and a series of baseline models to handle its unique challenges.

Question Answering · Visual Question Answering

Directional Bias Amplification

1 code implementation · 24 Feb 2021 · Angelina Wang, Olga Russakovsky

We introduce and analyze a new, decoupled metric for measuring bias amplification, $\text{BiasAmp}_{\rightarrow}$ (Directional Bias Amplification).

Fairness
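A simplified version of the core quantity can be sketched for a single attribute and a single task. Note this is an undirected, heavily reduced illustration: the paper's $\text{BiasAmp}_{\rightarrow}$ additionally decouples the direction of amplification (attribute-to-task vs. task-to-attribute) and signs each term by whether the pair is positively correlated in the data:

```python
import numpy as np

def bias_amp_simplified(attr, task_true, task_pred):
    """How much more the predictions associate attribute a with task t
    than the ground truth does (one-attribute, one-task sketch only)."""
    p_true = task_true[attr == 1].mean() - task_true.mean()   # correlation in labels
    p_pred = task_pred[attr == 1].mean() - task_pred.mean()   # correlation in predictions
    return p_pred - p_true                                    # > 0 means amplification

attr = np.array([1, 1, 1, 1, 0, 0, 0, 0])
task_true = np.array([1, 1, 0, 0, 1, 0, 0, 0])   # mild correlation with attr
task_pred = np.array([1, 1, 1, 0, 0, 0, 0, 0])   # predictions exaggerate it
print(bias_amp_simplified(attr, task_true, task_pred))  # → 0.25
```

A positive value indicates the model amplified the attribute-task correlation beyond what was present in the training labels.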

Towards Unique and Informative Captioning of Images

1 code implementation · ECCV 2020 · Zeyu Wang, Berthy Feng, Karthik Narasimhan, Olga Russakovsky

We find that modern captioning systems return higher likelihoods for incorrect distractor sentences compared to ground truth captions, and that evaluation metrics like SPICE can be 'topped' using simple captioning systems relying on object detectors.

Image Captioning · Re-Ranking

Understanding and Evaluating Racial Biases in Image Captioning

1 code implementation · ICCV 2021 · Dora Zhao, Angelina Wang, Olga Russakovsky

Image captioning is an important task for benchmarking visual reasoning and for enabling accessibility for people with vision impairments.

Benchmarking · Image Captioning +1

Overlooked factors in concept-based explanations: Dataset choice, concept learnability, and human capability

1 code implementation · CVPR 2023 · Vikram V. Ramaswamy, Sunnie S. Y. Kim, Ruth Fong, Olga Russakovsky

Second, we find that concepts in the probe dataset are often less salient and harder to learn than the classes they claim to explain, calling into question the correctness of the explanations.

CARETS: A Consistency And Robustness Evaluative Test Suite for VQA

1 code implementation · ACL 2022 · Carlos E. Jimenez, Olga Russakovsky, Karthik Narasimhan

We introduce CARETS, a systematic test suite to measure consistency and robustness of modern VQA models through a series of six fine-grained capability tests.

Negation · Question Generation +2

SiRi: A Simple Selective Retraining Mechanism for Transformer-based Visual Grounding

1 code implementation · 27 Jul 2022 · Mengxue Qu, Yu Wu, Wu Liu, Qiqi Gong, Xiaodan Liang, Olga Russakovsky, Yao Zhao, Yunchao Wei

In particular, SiRi conveys a significant principle for visual grounding research, i.e., a better-initialized vision-language encoder helps the model converge to a better local minimum, improving performance accordingly.

Visual Grounding

Overwriting Pretrained Bias with Finetuning Data

1 code implementation · ICCV 2023 · Angelina Wang, Olga Russakovsky

Transfer learning is beneficial by allowing the expressive features of models pretrained on large-scale datasets to be finetuned for the target task of smaller, more domain-specific datasets.

Attribute · Transfer Learning

Efficient, Self-Supervised Human Pose Estimation with Inductive Prior Tuning

1 code implementation · 6 Nov 2023 · Nobline Yoo, Olga Russakovsky

We (1) analyze the relationship between reconstruction quality and pose estimation accuracy, (2) develop a model pipeline that outperforms the baseline which inspired our work, using less than one-third the amount of training data, and (3) offer a new metric suitable for self-supervised settings that measures the consistency of predicted body part length proportions.

2D Human Pose Estimation · Pose Estimation

Predictive-Corrective Networks for Action Detection

no code implementations · CVPR 2017 · Achal Dave, Olga Russakovsky, Deva Ramanan

While deep feature learning has revolutionized techniques for static-image understanding, the same does not quite hold for video processing.

Action Detection · Optical Flow Estimation +2

Learning to Learn from Noisy Web Videos

no code implementations · CVPR 2017 · Serena Yeung, Vignesh Ramanathan, Olga Russakovsky, Liyue Shen, Greg Mori, Li Fei-Fei

Our method uses Q-learning to learn a data labeling policy on a small labeled training dataset, and then uses this to automatically label noisy web data for new visual concepts.

Action Recognition · Q-Learning +1
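The Q-learned labeling policy described above can be illustrated with a tiny tabular sketch. The states, actions, reward function, and sizes below are invented for illustration; the paper's policy instead scores real web-video candidates by their effect on a downstream classifier:

```python
import numpy as np

# States: coarse "candidate quality" buckets (2 = clean, 0/1 = noisy).
# Actions: 0 = skip the candidate, 1 = keep (label) it.
rng = np.random.default_rng(0)
n_states, n_actions = 3, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.5, 0.9, 0.1

def reward(state, action):
    """Simulated downstream effect: keeping a clean candidate helps,
    keeping a noisy one hurts, skipping is neutral."""
    return (1.0 if state == 2 else -1.0) if action == 1 else 0.0

state = rng.integers(n_states)
for _ in range(2000):
    # epsilon-greedy action selection
    action = rng.integers(n_actions) if rng.random() < eps else int(Q[state].argmax())
    r = reward(state, action)
    next_state = rng.integers(n_states)          # next candidate arrives
    # standard Q-learning update
    Q[state, action] += alpha * (r + gamma * Q[next_state].max() - Q[state, action])
    state = next_state

# The learned policy keeps clean candidates and skips noisy ones.
print(int(Q[2].argmax()), int(Q[0].argmax()))
```

The interesting part in the paper is that the reward comes from how the chosen data changes a classifier's validation performance, so the policy transfers to new concepts without hand-tuned filtering rules.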

Crowdsourcing in Computer Vision

no code implementations · 7 Nov 2016 · Adriana Kovashka, Olga Russakovsky, Li Fei-Fei, Kristen Grauman

Computer vision systems require large amounts of manually annotated data to properly learn challenging visual concepts.

Object Recognition

Much Ado About Time: Exhaustive Annotation of Temporal Data

no code implementations · 25 Jul 2016 · Gunnar A. Sigurdsson, Olga Russakovsky, Ali Farhadi, Ivan Laptev, Abhinav Gupta

We conclude that the optimal strategy is to ask as many questions as possible in a HIT (up to 52 binary questions after watching a 30-second video clip in our experiments).

Joint calibration of Ensemble of Exemplar SVMs

no code implementations · CVPR 2015 · Davide Modolo, Alexander Vezhnevets, Olga Russakovsky, Vittorio Ferrari

We formulate joint calibration as a constrained optimization problem and devise an efficient optimization algorithm to find its global optimum.

Object Detection

Best of Both Worlds: Human-Machine Collaboration for Object Annotation

no code implementations · CVPR 2015 · Olga Russakovsky, Li-Jia Li, Li Fei-Fei

This paper brings together the latest advancements in object detection and in crowd engineering into a principled framework for accurately and efficiently localizing objects in images.

Object · Object Detection +1

Human uncertainty makes classification more robust

no code implementations · ICCV 2019 · Joshua C. Peterson, Ruairidh M. Battleday, Thomas L. Griffiths, Olga Russakovsky

We then show that, while contemporary classifiers fail to exhibit human-like uncertainty on their own, explicit training on our dataset closes this gap, supports improved generalization to increasingly out-of-training-distribution test datasets, and confers robustness to adversarial attacks.

Classification · General Classification

Compositional Temporal Visual Grounding of Natural Language Event Descriptions

no code implementations · 4 Dec 2019 · Jonathan C. Stroud, Ryan McCaffrey, Rada Mihalcea, Jia Deng, Olga Russakovsky

Temporal grounding entails establishing a correspondence between natural language event descriptions and their visual depictions.

Visual Grounding

Take the Scenic Route: Improving Generalization in Vision-and-Language Navigation

no code implementations · 31 Mar 2020 · Felix Yu, Zhiwei Deng, Karthik Narasimhan, Olga Russakovsky

In the Vision-and-Language Navigation (VLN) task, an agent with egocentric vision navigates to a destination given natural language instructions.

Vision and Language Navigation

A Technical and Normative Investigation of Social Bias Amplification

no code implementations · 1 Jan 2021 · Angelina Wang, Olga Russakovsky

The conversation around the fairness of machine learning models is growing and evolving.

Fairness

Scaling Fair Learning to Hundreds of Intersectional Groups

no code implementations · 29 Sep 2021 · Eric Zhao, De-An Huang, Hao Liu, Zhiding Yu, Anqi Liu, Olga Russakovsky, Anima Anandkumar

In real-world applications, however, there are multiple protected attributes yielding a large number of intersectional protected groups.

Attribute · Fairness +1

Towards Intersectionality in Machine Learning: Including More Identities, Handling Underrepresentation, and Performing Evaluation

1 code implementation · 10 May 2022 · Angelina Wang, Vikram V. Ramaswamy, Olga Russakovsky

In this work, we grapple with questions that arise along three stages of the machine learning pipeline when incorporating intersectionality as multiple demographic attributes: (1) which demographic attributes to include as dataset labels, (2) how to handle the progressively smaller size of subgroups during model training, and (3) how to move beyond existing evaluation metrics when benchmarking model fairness for more subgroups.

Attribute · Benchmarking +2

ELUDE: Generating interpretable explanations via a decomposition into labelled and unlabelled features

no code implementations · 15 Jun 2022 · Vikram V. Ramaswamy, Sunnie S. Y. Kim, Nicole Meister, Ruth Fong, Olga Russakovsky

Specifically, we develop a novel explanation framework ELUDE (Explanation via Labelled and Unlabelled DEcomposition) that decomposes a model's prediction into two parts: one that is explainable through a linear combination of the semantic attributes, and another that is dependent on the set of uninterpretable features.

Attribute
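The two-part decomposition described above can be sketched with ordinary least squares: fit the model's scores as a linear function of labelled semantic attributes, and treat the residual as the part requiring unlabelled features. The data, sizes, and the plain `lstsq` fit are illustrative stand-ins for the framework's actual feature spaces:

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_attr = 100, 5
attrs = rng.normal(size=(n, n_attr))            # labelled semantic attribute features
# Synthetic "model scores": mostly attribute-explainable, plus a small residual.
scores = attrs @ rng.normal(size=n_attr) + 0.1 * rng.normal(size=n)

# Explainable part: best linear combination of the labelled attributes.
w, *_ = np.linalg.lstsq(attrs, scores, rcond=None)
explained = attrs @ w
residual = scores - explained                   # part needing unlabelled features

frac_unexplained = residual.var() / scores.var()
print(float(round(frac_unexplained, 3)))
```

The size of the unexplained fraction is informative in itself: a large residual means the chosen attribute vocabulary cannot account for what the model actually uses.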

Gender Artifacts in Visual Datasets

no code implementations · ICCV 2023 · Nicole Meister, Dora Zhao, Angelina Wang, Vikram V. Ramaswamy, Ruth Fong, Olga Russakovsky

Gender biases are known to exist within large-scale visual datasets and can be reflected or even amplified in downstream models.

Predicting Word Learning in Children from the Performance of Computer Vision Systems

no code implementations · 7 Jul 2022 · Sunayana Rane, Mira L. Nencheva, Zeyu Wang, Casey Lew-Williams, Olga Russakovsky, Thomas L. Griffiths

The performance of the computer vision systems is correlated with human judgments of the concreteness of words, which are in turn a predictor of children's word learning, suggesting that these models are capturing the relationship between words and visual phenomena.

Image Captioning

"Help Me Help the AI": Understanding How Explainability Can Support Human-AI Interaction

no code implementations · 2 Oct 2022 · Sunnie S. Y. Kim, Elizabeth Anne Watkins, Olga Russakovsky, Ruth Fong, Andrés Monroy-Hernández

Despite the proliferation of explainable AI (XAI) methods, little is understood about end-users' explainability needs and behaviors around XAI explanations.

Explainable Artificial Intelligence (XAI)

UFO: A unified method for controlling Understandability and Faithfulness Objectives in concept-based explanations for CNNs

no code implementations · 27 Mar 2023 · Vikram V. Ramaswamy, Sunnie S. Y. Kim, Ruth Fong, Olga Russakovsky

In this work, we propose UFO, a unified method for controlling Understandability and Faithfulness Objectives in concept-based explanations.

Art and the science of generative AI: A deeper dive

no code implementations · 7 Jun 2023 · Ziv Epstein, Aaron Hertzmann, Laura Herman, Robert Mahari, Morgan R. Frank, Matthew Groh, Hope Schroeder, Amy Smith, Memo Akten, Jessica Fjeld, Hany Farid, Neil Leach, Alex Pentland, Olga Russakovsky

A new class of tools, colloquially called generative AI, can produce high-quality artistic media for visual arts, concept art, music, fiction, literature, video, and animation.

ICON$^2$: Reliably Benchmarking Predictive Inequity in Object Detection

no code implementations · 7 Jun 2023 · Sruthi Sudhakar, Viraj Prabhu, Olga Russakovsky, Judy Hoffman

As computer vision systems are being increasingly deployed at scale in high-stakes applications like autonomous driving, concerns about social bias in these systems are rising.

Attribute · Autonomous Driving +5

ImageNet-OOD: Deciphering Modern Out-of-Distribution Detection Algorithms

1 code implementation · 3 Oct 2023 · William Yang, Byron Zhang, Olga Russakovsky

Through comprehensive experiments, we show that OOD detectors are more sensitive to covariate shift than to semantic shift, and that the benefits of recent OOD detection algorithms on semantic shift detection are minimal.

Out-of-Distribution Detection · Out of Distribution (OOD) Detection
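For context on what such experiments evaluate, here is the maximum softmax probability (MSP) score — a standard OOD-detection baseline, not this paper's contribution — which flags inputs with flat, low-confidence predictions as out-of-distribution:

```python
import numpy as np

def msp_score(logits):
    """Maximum softmax probability per example: a classic OOD baseline.
    Low MSP (a flat softmax) suggests the input is out-of-distribution."""
    z = logits - logits.max(axis=1, keepdims=True)       # numerical stability
    probs = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    return probs.max(axis=1)

in_dist = np.array([[4.0, 0.0, 0.0]])   # confident, peaked prediction
ood = np.array([[1.0, 0.9, 1.1]])       # flat, uncertain prediction
print(bool(msp_score(in_dist)[0] > msp_score(ood)[0]))  # → True
```

The paper's point is about what scores like this actually respond to: a detector can rank covariate-shifted inputs as "more OOD" than semantically novel ones, so benchmarks must separate the two shift types.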

Unseen Image Synthesis with Diffusion Models

no code implementations · 13 Oct 2023 · Ye Zhu, Yu Wu, Zhiwei Deng, Olga Russakovsky, Yan Yan

While the current trend in the generative field is scaling up towards larger models and more training data for generalized domain representations, we go the opposite direction in this work by synthesizing unseen domain images without additional training.

Denoising · Image Generation

DETER: Detecting Edited Regions for Deterring Generative Manipulations

no code implementations · 16 Dec 2023 · Sai Wang, Ye Zhu, Ruoyu Wang, Amaya Dharmasiri, Olga Russakovsky, Yu Wu

While face swapping and attribute editing are performed on similar face regions such as eyes and nose, the inpainting operation can be performed on random image regions, removing the spurious correlations of previous datasets.

Attribute · Face Swapping +1
