1 code implementation • ECCV 2020 • Tai-Yin Chiu, Danna Gurari
The general framework for fast universal style transfer consists of an autoencoder and a feature transformation at the bottleneck.
no code implementations • 21 Apr 2024 • Stuti Pandey, Josh Myers-Dean, Jarek Reynolds, Danna Gurari
Accordingly, we explore the abilities of modern foundation vision language models (VLMs) in interpreting such tests.
no code implementations • 17 Dec 2023 • Mengchen Liu, Chongyan Chen, Danna Gurari
While there is much excitement about the potential of large multimodal models (LMMs), a comprehensive evaluation is critical to establish their true capabilities and limitations.
1 code implementation • 27 Nov 2023 • Chongyan Chen, Mengchen Liu, Noel Codella, Yunsheng Li, Lu Yuan, Danna Gurari
Visual Question Answering (VQA) entails answering questions about images.
1 code implementation • ICCV 2023 • Chongyan Chen, Samreen Anjum, Danna Gurari
Visual question answering is a task of predicting the answer to a question about an image.
no code implementations • 20 Jul 2023 • Josh Myers-Dean, Yifei Fan, Brian Price, Wilson Chan, Danna Gurari
Interactive segmentation entails a human marking an image to guide how a model either creates or edits a segmentation.
1 code implementation • 14 May 2023 • Maniratnam Mandal, Deepti Ghadiyaram, Danna Gurari, Alan C. Bovik
The photographs taken by visually impaired users often suffer from one or both of two kinds of quality issues: technical quality (distortions), and semantic quality, such as framing and aesthetic composition.
no code implementations • 12 Jan 2023 • Jarek Reynolds, Chandra Kanth Nagesh, Danna Gurari
Salient object detection is the task of producing a binary mask for an image that deciphers which pixels belong to the foreground object versus background.
1 code implementation • CVPR 2023 • Reza Akbarian Bafghi, Danna Gurari
Our goal is to improve upon the status quo for designing image classification models trained in one domain that perform well on images from another domain.
no code implementations • 12 Oct 2022 • Tai-Yin Chiu, Danna Gurari
Photorealistic style transfer is the task of synthesizing a realistic-looking image when adapting the content from one image to appear in the style of another image.
no code implementations • 24 Jul 2022 • Yu-Yun Tseng, Alexander Bell, Danna Gurari
Compared to existing few-shot object detection and instance segmentation datasets, our dataset is the first to locate holes in objects (e.g., found in 12.3% of our segmentations), it shows objects that occupy a much larger range of sizes relative to the images, and text is over five times more common in our objects (e.g., found in 22.4% of our segmentations).
1 code implementation • CVPR 2022 • Tai-Yin Chiu, Danna Gurari
To our knowledge, this is the first knowledge distillation method for photorealistic style transfer.
1 code implementation • CVPR 2022 • Chongyan Chen, Samreen Anjum, Danna Gurari
Visual question answering is the task of answering questions about images.
no code implementations • 21 Dec 2021 • Josh Myers-Dean, Yinan Zhao, Brian Price, Scott Cohen, Danna Gurari
Generalized few-shot semantic segmentation was introduced to move beyond only evaluating few-shot segmentation models on novel classes to include testing their ability to remember base classes.
1 code implementation • 22 Oct 2021 • Tai-Yin Chiu, Danna Gurari
First, we introduce blockwise training to perform coarse-to-fine feature transformations that enable state-of-the-art stylization strength in a single autoencoder in place of the inefficient cascade of four autoencoders used in PhotoWCT.
no code implementations • 29 Sep 2020 • Samreen Anjum, Chi Lin, Danna Gurari
Crowdsourcing is a valuable approach for tracking objects in videos in a more scalable manner than possible with domain experts.
1 code implementation • 6 Apr 2020 • Yinan Zhao, Brian Price, Scott Cohen, Danna Gurari
We demonstrate how to increase overall model capacity to achieve improved performance, by introducing objectness, which is class-agnostic and so not prone to overfitting, for complementary use with class-specific features.
no code implementations • CVPR 2020 • Tai-Yin Chiu, Yinan Zhao, Danna Gurari
We introduce a new large-scale dataset that links the assessment of image quality issues to two practical vision tasks: image captioning and visual question answering.
no code implementations • ECCV 2020 • Danna Gurari, Yinan Zhao, Meng Zhang, Nilavra Bhattacharya
While an important problem in the vision community is to design algorithms that can automatically caption images, few publicly-available datasets for algorithm development directly address the interests of real users.
no code implementations • 19 Dec 2019 • Nilavra Bhattacharya, Danna Gurari
We present a visualization tool to exhaustively search and browse through a set of large-scale machine learning datasets.
no code implementations • ICCV 2019 • Nilavra Bhattacharya, Qing Li, Danna Gurari
Visual question answering is the task of returning the answer to a question about an image.
no code implementations • ICCV 2019 • Yinan Zhao, Brian Price, Scott Cohen, Danna Gurari
Many people search for foreground objects to use when editing images.
no code implementations • CVPR 2019 • Danna Gurari, Qing Li, Chi Lin, Yinan Zhao, Anhong Guo, Abigale Stangl, Jeffrey P. Bigham
We introduce the first visual privacy dataset originating from people who are blind in order to better understand their privacy disclosures and to encourage the development of algorithms that can assist in preventing their unintended disclosures.
no code implementations • 30 Apr 2019 • Danna Gurari, Yinan Zhao, Suyog Dutt Jain, Margrit Betke, Kristen Grauman
We propose a resource allocation framework for predicting how best to allocate a fixed budget of human annotation effort in order to collect higher quality segmentations for a given batch of images and automated methods.
no code implementations • 22 Mar 2018 • Yinan Zhao, Brian Price, Scott Cohen, Danna Gurari
Deep generative models have shown success in automatically synthesizing missing image regions using surrounding context.
1 code implementation • CVPR 2018 • Danna Gurari, Qing Li, Abigale J. Stangl, Anhong Guo, Chi Lin, Kristen Grauman, Jiebo Luo, Jeffrey P. Bigham
The study of algorithms to automatically answer visual questions currently is motivated by visual question answering (VQA) datasets constructed in artificial VQA settings.
no code implementations • 30 Apr 2017 • Danna Gurari, Kun He, Bo Xiong, Jianming Zhang, Mehrnoosh Sameki, Suyog Dutt Jain, Stan Sclaroff, Margrit Betke, Kristen Grauman
We propose the ambiguity problem for the foreground object segmentation task and motivate the importance of estimating and accounting for this ambiguity when designing vision systems.
no code implementations • 29 Aug 2016 • Danna Gurari, Kristen Grauman
Visual question answering (VQA) systems are emerging from a desire to empower users to ask any natural language question about visual content and receive a valid answer in response.
no code implementations • CVPR 2016 • Danna Gurari, Suyog Jain, Margrit Betke, Kristen Grauman
We propose a resource allocation framework for predicting how best to allocate a fixed budget of human annotation effort in order to collect higher quality segmentations for a given batch of images and automated methods.