Search Results for author: Scott Cohen

Found 59 papers, 21 papers with code

MetaShadow: Object-Centered Shadow Detection, Removal, and Synthesis

no code implementations 3 Dec 2024 Tianyu Wang, Jianming Zhang, Haitian Zheng, Zhihong Ding, Scott Cohen, Zhe Lin, Wei Xiong, Chi-Wing Fu, Luis Figueroa, Soo Ye Kim

MetaShadow combines the strengths of two cooperative components: Shadow Analyzer, for object-centered shadow detection and removal, and Shadow Synthesizer, for reference-based controllable shadow synthesis.

Object Shadow Detection And Removal

FairDeDup: Detecting and Mitigating Vision-Language Fairness Disparities in Semantic Dataset Deduplication

no code implementations CVPR 2024 Eric Slyman, Stefan Lee, Scott Cohen, Kushal Kafle

Recent dataset deduplication techniques have demonstrated that content-aware dataset pruning can dramatically reduce the cost of training Vision-Language Pretrained (VLP) models without significant performance losses compared to training on the original dataset.

Fairness

FINEMATCH: Aspect-based Fine-grained Image and Text Mismatch Detection and Correction

no code implementations 23 Apr 2024 Hang Hua, Jing Shi, Kushal Kafle, Simon Jenni, Daoan Zhang, John Collomosse, Scott Cohen, Jiebo Luo

To address this, we propose FineMatch, a new aspect-based fine-grained text and image matching benchmark, focusing on text and image mismatch detection and correction.

Hallucination In-Context Learning +2

Latent Feature-Guided Diffusion Models for Shadow Removal

no code implementations 4 Dec 2023 Kangfu Mei, Luis Figueroa, Zhe Lin, Zhihong Ding, Scott Cohen, Vishal M. Patel

Recovering textures under shadows has remained a challenging problem due to the difficulty of inferring shadow-free scenes from shadow images.

Shadow Removal

SCoRD: Subject-Conditional Relation Detection with Text-Augmented Data

1 code implementation 24 Aug 2023 Ziyan Yang, Kushal Kafle, Zhe Lin, Scott Cohen, Zhihong Ding, Vicente Ordonez

To solve this problem, we propose an auto-regressive model that, given a subject, predicts its relations, objects, and object locations by casting this output as a sequence of tokens.

Object Relation

GamutMLP: A Lightweight MLP for Color Loss Recovery

no code implementations CVPR 2023 Hoang M. Le, Brian Price, Scott Cohen, Michael S. Brown

Inspired by neural implicit representations for 2D images, we propose a method that optimizes a lightweight multi-layer-perceptron (MLP) model during the gamut reduction step to predict the clipped values.

TopNet: Transformer-based Object Placement Network for Image Compositing

1 code implementation CVPR 2023 Sijie Zhu, Zhe Lin, Scott Cohen, Jason Kuen, Zhifei Zhang, Chen Chen

Given a background image and a segmented object, the goal is to train a model to predict plausible placements (location and scale) of the object for compositing.

Object

ObjectStitch: Object Compositing With Diffusion Model

no code implementations CVPR 2023 Yizhi Song, Zhifei Zhang, Zhe Lin, Scott Cohen, Brian Price, Jianming Zhang, Soo Ye Kim, Daniel Aliaga

Object compositing based on 2D images is a challenging problem since it typically involves multiple processing stages such as color harmonization, geometry correction and shadow generation to generate realistic results.

Data Augmentation Object

ObjectStitch: Generative Object Compositing

1 code implementation 2 Dec 2022 Yizhi Song, Zhifei Zhang, Zhe Lin, Scott Cohen, Brian Price, Jianming Zhang, Soo Ye Kim, Daniel Aliaga

Object compositing based on 2D images is a challenging problem since it typically involves multiple processing stages such as color harmonization, geometry correction and shadow generation to generate realistic results.

Data Augmentation Object

GALA: Toward Geometry-and-Lighting-Aware Object Search for Compositing

no code implementations 31 Mar 2022 Sijie Zhu, Zhe Lin, Scott Cohen, Jason Kuen, Zhifei Zhang, Chen Chen

To move a step further, this paper proposes GALA (Geometry-and-Lighting-Aware), a generic foreground object search method with discriminative modeling on geometry and lighting compatibility for open-world image compositing.

Object

CM-GAN: Image Inpainting with Cascaded Modulation GAN and Object-Aware Training

1 code implementation 22 Mar 2022 Haitian Zheng, Zhe Lin, Jingwan Lu, Scott Cohen, Eli Shechtman, Connelly Barnes, Jianming Zhang, Ning Xu, Sohrab Amirghodsi, Jiebo Luo

We propose cascaded modulation GAN (CM-GAN), a new network design consisting of an encoder with Fourier convolution blocks that extract multi-scale feature representations from the input image with holes and a dual-stream decoder with a novel cascaded global-spatial modulation block at each scale level.

Decoder Image Inpainting

Generalized Few-Shot Semantic Segmentation: All You Need is Fine-Tuning

no code implementations 21 Dec 2021 Josh Myers-Dean, Yinan Zhao, Brian Price, Scott Cohen, Danna Gurari

Generalized few-shot semantic segmentation was introduced to move beyond only evaluating few-shot segmentation models on novel classes to include testing their ability to remember base classes.

Generalized Few-Shot Semantic Segmentation Meta-Learning +3

Learning to Predict Visual Attributes in the Wild

no code implementations CVPR 2021 Khoi Pham, Kushal Kafle, Zhe Lin, Zhihong Ding, Scott Cohen, Quan Tran, Abhinav Shrivastava

In this paper, we introduce a large-scale in-the-wild visual attribute prediction dataset consisting of over 927K attribute annotations for over 260K object instances.

Attribute Contrastive Learning +2

AESOP: Abstract Encoding of Stories, Objects, and Pictures

2 code implementations ICCV 2021 Hareesh Ravi, Kushal Kafle, Scott Cohen, Jonathan Brandt, Mubbasir Kapadia

Visual storytelling and story comprehension are uniquely human skills that play a central role in how we learn about and experience the world.

Story Completion Visual Storytelling

Semantic Layout Manipulation with High-Resolution Sparse Attention

1 code implementation 14 Dec 2020 Haitian Zheng, Zhe Lin, Jingwan Lu, Scott Cohen, Jianming Zhang, Ning Xu, Jiebo Luo

A core problem of this task is how to transfer visual details from the input images to the new semantic layout while making the resulting image visually realistic.

Decoder Vocal Bursts Intensity Prediction

PhraseCut: Language-based Image Segmentation in the Wild

1 code implementation CVPR 2020 Chenyun Wu, Zhe Lin, Scott Cohen, Trung Bui, Subhransu Maji

We consider the problem of segmenting image regions given a natural language phrase, and study it on a novel dataset of 77,262 images and 345,486 phrase-region pairs.

Attribute Diversity +3

Objectness-Aware Few-Shot Semantic Segmentation

1 code implementation 6 Apr 2020 Yinan Zhao, Brian Price, Scott Cohen, Danna Gurari

We demonstrate how to increase overall model capacity to achieve improved performance, by introducing objectness, which is class-agnostic and so not prone to overfitting, for complementary use with class-specific features.

Few-Shot Semantic Segmentation Segmentation +1

DeepStrip: High Resolution Boundary Refinement

no code implementations 25 Mar 2020 Peng Zhou, Brian Price, Scott Cohen, Gregg Wilensky, Larry S. Davis

In this paper, we target refining the boundaries in high resolution images given low resolution masks.

Vocal Bursts Intensity Prediction

Getting to 99% Accuracy in Interactive Segmentation

3 code implementations 17 Mar 2020 Marco Forte, Brian Price, Scott Cohen, Ning Xu, François Pitié

We propose a novel interactive architecture and a novel training scheme that are both tailored to better exploit the user workflow.

Deep Learning Interactive Segmentation

Deep Visual Template-Free Form Parsing

3 code implementations 5 Sep 2019 Brian Davis, Bryan Morse, Scott Cohen, Brian Price, Chris Tensmeyer

Automatic, template-free extraction of information from form images is challenging due to the variety of form layouts.

Answering Questions about Data Visualizations using Efficient Bimodal Fusion

1 code implementation 5 Aug 2019 Kushal Kafle, Robik Shrestha, Brian Price, Scott Cohen, Christopher Kanan

Chart question answering (CQA) is a newly proposed visual question answering (VQA) task where an algorithm must answer questions about data visualizations, e.g., bar charts, pie charts, and line graphs.

Chart Question Answering Optical Character Recognition +3

Figure Captioning with Reasoning and Sequence-Level Training

no code implementations 7 Jun 2019 Charles Chen, Ruiyi Zhang, Eunyee Koh, Sungchul Kim, Scott Cohen, Tong Yu, Ryan Rossi, Razvan Bunescu

In this work, we investigate the problem of figure captioning where the goal is to automatically generate a natural language description of the figure.

Image Captioning Reinforcement Learning

Image Recoloring Based on Object Color Distributions

1 code implementation Eurographics 2019 - Short Papers 2019 Mahmoud Afifi, Brian Price, Scott Cohen, Michael S. Brown

We present a method to perform automatic image recoloring based on the distribution of colors associated with objects present in an image.

Object Segmentation +1

YouTube-VOS: Sequence-to-Sequence Video Object Segmentation

4 code implementations ECCV 2018 Ning Xu, Linjie Yang, Yuchen Fan, Jianchao Yang, Dingcheng Yue, Yuchen Liang, Brian Price, Scott Cohen, Thomas Huang

End-to-end sequential learning to explore spatial-temporal features for video segmentation is largely limited by the scale of available video segmentation datasets, i.e., even the largest video segmentation dataset only contains 90 short video clips.

Ranked #12 on Video Object Segmentation on YouTube-VOS 2018 (F-Measure (Unseen) metric)

Image Segmentation Object +7

Start, Follow, Read: End-to-End Full-Page Handwriting Recognition

1 code implementation ECCV 2018 Curtis Wigington, Chris Tensmeyer, Brian Davis, William Barrett, Brian Price, Scott Cohen

Despite decades of research, offline handwriting recognition (HWR) of degraded historical documents remains a challenging problem, which if solved could greatly improve the searchability of online cultural heritage archives.

Handwriting Recognition Handwritten Text Recognition +4

Interactive Boundary Prediction for Object Selection

no code implementations ECCV 2018 Hoang Le, Long Mai, Brian Price, Scott Cohen, Hailin Jin, Feng Liu

Instead of relying on pre-defined low-level image features, our method adaptively predicts object boundaries according to image content and user interactions.

Decoder Image Segmentation +4

Concept Mask: Large-Scale Segmentation from Semantic Concepts

no code implementations ECCV 2018 Yufei Wang, Zhe Lin, Xiaohui Shen, Jianming Zhang, Scott Cohen

Then, we refine and extend the embedding network to predict an attention map, using a curated dataset with bounding box annotations on 750 concepts.

Image Segmentation Segmentation +1

Guided Image Inpainting: Replacing an Image Region by Pulling Content from Another Image

no code implementations 22 Mar 2018 Yinan Zhao, Brian Price, Scott Cohen, Danna Gurari

Deep generative models have shown success in automatically synthesizing missing image regions using surrounding context.

Image Inpainting

Discriminability objective for training descriptive captions

1 code implementation CVPR 2018 Ruotian Luo, Brian Price, Scott Cohen, Gregory Shakhnarovich

One property that remains lacking in image captions generated by contemporary methods is discriminability: being able to tell two images apart given the caption for one of them.

Caption Generation Descriptive +1

Group-Theme Recoloring for Multi-Image Color Consistency

1 code implementation Pacific Graphics 2017 Rang Nguyen, Brian Price, Scott Cohen, Michael S. Brown

Methods such as color transfer are effective in making an image share similar colors with a target image; however, color transfer is not suitable for modifying multiple images.

Deep GrabCut for Object Selection

no code implementations 2 Jul 2017 Ning Xu, Brian Price, Scott Cohen, Jimei Yang, Thomas Huang

In this paper, we propose a novel segmentation approach that uses a rectangle as a soft constraint by transforming it into a Euclidean distance map.

Decoder Instance Segmentation +4

Depth From Defocus in the Wild

no code implementations CVPR 2017 Huixuan Tang, Scott Cohen, Brian Price, Stephen Schiller, Kiriakos N. Kutulakos

We consider the problem of two-frame depth from defocus in conditions unsuitable for existing methods yet typical of everyday photography: a handheld cellphone camera, a small aperture, a non-stationary scene and sparse surface texture.

Relationship Proposal Networks

no code implementations CVPR 2017 Ji Zhang, Mohamed Elhoseiny, Scott Cohen, Walter Chang, Ahmed Elgammal

We demonstrate the ability of our Rel-PN to localize relationships with only a few thousand proposals.

Scene Understanding

Skeleton Key: Image Captioning by Skeleton-Attribute Decomposition

no code implementations CVPR 2017 Yufei Wang, Zhe Lin, Xiaohui Shen, Scott Cohen, Garrison W. Cottrell

Furthermore, our algorithm can generate descriptions with varied length, benefiting from the separate control of the skeleton and attributes.

Attribute Image Captioning +2

Deep Image Matting

8 code implementations CVPR 2017 Ning Xu, Brian Price, Scott Cohen, Thomas Huang

We evaluate our algorithm on the image matting benchmark, our testing set, and a wide variety of real images.

Decoder Semantic Image Matting

SURGE: Surface Regularized Geometry Estimation from a Single Image

no code implementations NeurIPS 2016 Peng Wang, Xiaohui Shen, Bryan Russell, Scott Cohen, Brian Price, Alan L. Yuille

This paper introduces an approach to regularize 2.5D surface normal and depth predictions at each pixel given a single input image.

Progressive Attention Networks for Visual Attribute Prediction

1 code implementation 8 Jun 2016 Paul Hongsuck Seo, Zhe Lin, Scott Cohen, Xiaohui Shen, Bohyung Han

We propose a novel attention model that can accurately attend to target objects of various scales and shapes in images.

Attribute Hard Attention

Two Illuminant Estimation and User Correction Preference

no code implementations CVPR 2016 Dongliang Cheng, Abdelrahman Abdelhamed, Brian Price, Scott Cohen, Michael S. Brown

Existing methods attempt to estimate a spatially varying illumination map; however, results are error-prone and the resulting illumination maps are too low-resolution to be used for proper spatially varying white-balance correction.

Vocal Bursts Valence Prediction

Interactive Segmentation on RGBD Images via Cue Selection

no code implementations CVPR 2016 Jie Feng, Brian Price, Scott Cohen, Shih-Fu Chang

While these methods achieve better results than color-based methods, they are still limited in either using depth as an additional color channel or simply combining depth with color in a linear way.

Image Retrieval Image Segmentation +5

Automatic Annotation of Structured Facts in Images

no code implementations WS 2016 Mohamed Elhoseiny, Scott Cohen, Walter Chang, Brian Price, Ahmed Elgammal

Motivated by the application of fact-level image understanding, we present an automatic method for data collection of structured visual facts from images with captions.

Beyond White: Ground Truth Colors for Color Constancy Correction

no code implementations ICCV 2015 Dongliang Cheng, Brian Price, Scott Cohen, Michael S. Brown

A limitation in color constancy research is the inability to establish ground truth colors for evaluating corrected images.

Color Constancy

Sherlock: Scalable Fact Learning in Images

no code implementations 16 Nov 2015 Mohamed Elhoseiny, Scott Cohen, Walter Chang, Brian Price, Ahmed Elgammal

We show that learning visual facts in a structured way enables not only a uniform but also generalizable visual understanding.

Multiview Learning Retrieval

Towards Unified Depth and Semantic Prediction From a Single Image

no code implementations CVPR 2015 Peng Wang, Xiaohui Shen, Zhe Lin, Scott Cohen, Brian Price, Alan L. Yuille

By allowing for interactions between the depth and semantic information, the joint network provides more accurate depth prediction than a state-of-the-art CNN trained solely for depth prediction [5].

Depth Estimation Depth Prediction +1

Effective Learning-Based Illuminant Estimation Using Simple Features

no code implementations CVPR 2015 Dongliang Cheng, Brian Price, Scott Cohen, Michael S. Brown

More recent state-of-the-art methods employ learning-based techniques that produce better results, but often rely on complex features and have long evaluation and training times.

Color Constancy

PatchCut: Data-Driven Object Segmentation via Local Shape Transfer

no code implementations CVPR 2015 Jimei Yang, Brian Price, Scott Cohen, Zhe Lin, Ming-Hsuan Yang

The transferred local shape masks constitute a patch-level segmentation solution space and we thus develop a novel cascade algorithm, PatchCut, for coarse-to-fine object segmentation.

Object Object Discovery +2

Joint Object and Part Segmentation using Deep Learned Potentials

no code implementations ICCV 2015 Peng Wang, Xiaohui Shen, Zhe Lin, Scott Cohen, Brian Price, Alan Yuille

Segmenting semantic objects from images and parsing them into their respective semantic parts are fundamental steps towards detailed object understanding in computer vision.

Object Segmentation +1

Semantic Object Selection

no code implementations CVPR 2014 Ejaz Ahmed, Scott Cohen, Brian Price

With the tag provided by the user we do a text query of an image database to gather exemplars of the object.

Image Retrieval Object +5

Large Displacement Optical Flow from Nearest Neighbor Fields

no code implementations CVPR 2013 Zhuoyuan Chen, Hailin Jin, Zhe Lin, Scott Cohen, Ying Wu

We use approximate nearest neighbor fields to compute an initial motion field and use a robust algorithm to compute a set of similarity transformations as the motion candidates for segmentation.

Motion Estimation Motion Segmentation +2

Improving Image Matting Using Comprehensive Sampling Sets

no code implementations CVPR 2013 Ehsan Shahrian, Deepu Rajan, Brian Price, Scott Cohen

The first is that the range in which the foreground and background are sampled is often limited to such an extent that the true foreground and background colors are not present.

Image Matting

Fast Image Super-Resolution Based on In-Place Example Regression

no code implementations CVPR 2013 Jianchao Yang, Zhe Lin, Scott Cohen

Extensive experiments on benchmark and real-world images demonstrate that our algorithm can produce natural-looking results with sharp edges and preserved fine details, while the current state-of-the-art algorithms are prone to visual artifacts.

Image Super-Resolution regression
