Search Results for author: Alexander C. Berg

Found 38 papers, 14 papers with code

Learning from Models and Data for Visual Grounding

no code implementations 20 Mar 2024 Ruozhen He, Paola Cascante-Bonilla, Ziyan Yang, Alexander C. Berg, Vicente Ordonez

We introduce SynGround, a novel framework that combines data-driven learning and knowledge transfer from various large-scale pretrained models to enhance the visual grounding capabilities of a pretrained vision-and-language model.

Language Modelling Large Language Model +2

Improved Visual Grounding through Self-Consistent Explanations

no code implementations 7 Dec 2023 Ruozhen He, Paola Cascante-Bonilla, Ziyan Yang, Alexander C. Berg, Vicente Ordonez

Vision-and-language models trained to match images with text can be combined with visual explanation methods to point to the locations of specific objects in an image.

Language Modelling Large Language Model +1

Neural Pseudo-Label Optimism for the Bank Loan Problem

no code implementations NeurIPS 2021 Aldo Pacchiano, Shaun Singh, Edward Chou, Alexander C. Berg, Jakob Foerster

The lender only observes whether a customer will repay a loan if the loan is issued to begin with, and thus modeled decisions affect what data is available to the lender for future decisions.

Decision Making Pseudo Label

Boundary IoU: Improving Object-Centric Image Segmentation Evaluation

2 code implementations CVPR 2021 Bowen Cheng, Ross Girshick, Piotr Dollár, Alexander C. Berg, Alexander Kirillov

We perform an extensive analysis across different error types and object sizes and show that Boundary IoU is significantly more sensitive than the standard Mask IoU measure to boundary errors for large objects and does not over-penalize errors on smaller objects.

Image Segmentation Object +2

A Mask-RCNN Baseline for Probabilistic Object Detection

no code implementations 9 Aug 2019 Phil Ammirato, Alexander C. Berg

The Probabilistic Object Detection Challenge evaluates object detection methods using a new evaluation measure, Probability-based Detection Quality (PDQ), on a new synthetic image dataset.

Object object-detection +1

IMP: Instance Mask Projection for High Accuracy Semantic Segmentation of Things

no code implementations ICCV 2019 Cheng-Yang Fu, Tamara L. Berg, Alexander C. Berg

In addition, the instance mask projection operator works well on other (non-clothing) datasets, providing an improvement of 3 points in mIOU on Thing classes of Cityscapes, a self-driving dataset, on top of a state-of-the-art approach.

Instance Segmentation Scene Segmentation +2

Low Power Inference for On-Device Visual Recognition with a Quantization-Friendly Solution

no code implementations 12 Mar 2019 Chen Feng, Tao Sheng, Zhiyu Liang, Shaojie Zhuo, Xiaopeng Zhang, Liang Shen, Matthew Ardi, Alexander C. Berg, Yiran Chen, Bo Chen, Kent Gauen, Yung-Hsiang Lu

The IEEE Low-Power Image Recognition Challenge (LPIRC) is an annual competition started in 2015 that encourages joint hardware and software solutions for computer vision systems with low latency and power.

Quantization

RetinaMask: Learning to predict masks improves state-of-the-art single-shot detection for free

52 code implementations 10 Jan 2019 Cheng-Yang Fu, Mykhailo Shvets, Alexander C. Berg

COCO test-dev results are up to 41.4 mAP for RetinaMask-101 vs 39.1 mAP for RetinaNet-101, while the runtime is the same during evaluation.

Object Detection

Target Driven Instance Detection

1 code implementation 13 Mar 2018 Phil Ammirato, Cheng-Yang Fu, Mykhailo Shvets, Jana Kosecka, Alexander C. Berg

While state-of-the-art general object detectors are getting better and better, there are not many systems specifically designed to take advantage of the instance detection problem.

Object

Meta-Tracker: Fast and Robust Online Adaptation for Visual Object Trackers

no code implementations ECCV 2018 Eunbyung Park, Alexander C. Berg

The meta learning is driven by the goal of deep networks that can quickly be adapted to robustly model a particular target in future frames.

Meta-Learning

Video Highlight Prediction Using Audience Chat Reactions

no code implementations EMNLP 2017 Cheng-Yang Fu, Joon Lee, Mohit Bansal, Alexander C. Berg

Sports channel video portals offer an exciting domain for research on multimodal, multilingual analysis.

Transformation-Grounded Image Generation Network for Novel 3D View Synthesis

2 code implementations CVPR 2017 Eunbyung Park, Jimei Yang, Ersin Yumer, Duygu Ceylan, Alexander C. Berg

Instead of taking a 'blank slate' approach, we first explicitly infer the parts of the geometry visible both in the input and novel views and then re-cast the remaining synthesis problem as image completion.

Image Generation Novel View Synthesis

A Dataset for Developing and Benchmarking Active Vision

no code implementations 27 Feb 2017 Phil Ammirato, Patrick Poirson, Eunbyung Park, Jana Kosecka, Alexander C. Berg

We present a new public dataset with a focus on simulating robotic vision tasks in everyday indoor environments using real imagery.

Benchmarking General Classification +5

Synthesizing Training Data for Object Detection in Indoor Scenes

no code implementations 25 Feb 2017 Georgios Georgakis, Arsalan Mousavian, Alexander C. Berg, Jana Kosecka

In this work we explore the ability of using synthetically generated composite images for training state-of-the-art object detectors, especially for object instance detection.

Object object-detection +1

DSSD : Deconvolutional Single Shot Detector

2 code implementations 23 Jan 2017 Cheng-Yang Fu, Wei Liu, Ananth Ranga, Ambrish Tyagi, Alexander C. Berg

The main contribution of this paper is an approach for introducing additional context into state-of-the-art general object detection.

object-detection Object Detection

Combining Multiple Cues for Visual Madlibs Question Answering

no code implementations 1 Nov 2016 Tatiana Tommasi, Arun Mallya, Bryan Plummer, Svetlana Lazebnik, Alexander C. Berg, Tamara L. Berg

This paper presents an approach for answering fill-in-the-blank multiple choice questions from the Visual Madlibs dataset.

Attribute General Classification +3

Fast Single Shot Detection and Pose Estimation

no code implementations 19 Sep 2016 Patrick Poirson, Phil Ammirato, Cheng-Yang Fu, Wei Liu, Jana Kosecka, Alexander C. Berg

For applications in navigation and robotics, estimating the 3D pose of objects is as important as detection.

Object Tracking Pose Estimation

When was that made?

no code implementations 12 Aug 2016 Sirion Vittayakorn, Alexander C. Berg, Tamara L. Berg

Toward this goal, we utilize features from existing deep networks and also fine-tune new networks for temporal estimation.

Retrieval

Solving Visual Madlibs with Multiple Cues

no code implementations 11 Aug 2016 Tatiana Tommasi, Arun Mallya, Bryan Plummer, Svetlana Lazebnik, Alexander C. Berg, Tamara L. Berg

This paper focuses on answering fill-in-the-blank style multiple choice questions from the Visual Madlibs dataset.

Activity Prediction Attribute +4

SSD: Single Shot MultiBox Detector

223 code implementations 8 Dec 2015 Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, Alexander C. Berg

Experimental results on the PASCAL VOC, MS COCO, and ILSVRC datasets confirm that SSD has comparable accuracy to methods that utilize an additional object proposal step and is much faster, while providing a unified framework for both training and inference.

LIDAR Semantic Segmentation Low-Light Image Enhancement +4

Where to Buy It: Matching Street Clothing Photos in Online Shops

no code implementations ICCV 2015 M. Hadi Kiapour, Xufeng Han, Svetlana Lazebnik, Alexander C. Berg, Tamara L. Berg

In this paper, we define a new task, Exact Street to Shop, where our goal is to match a real-world example of a garment item to the same item in an online shop.

Retrieval

Learning to decompose for object detection and instance segmentation

no code implementations 19 Nov 2015 Eunbyung Park, Alexander C. Berg

Although deep convolutional neural networks (CNNs) have achieved remarkable results on object detection and segmentation, pre- and post-processing steps such as region proposals and non-maximum suppression (NMS) have been required.

Instance Segmentation Object +4

Piecewise Linear Activation Functions For More Efficient Deep Networks

no code implementations 11 Nov 2015 Cheng-Yang Fu, Alexander C. Berg

This submission has been withdrawn by arXiv administrators because it is intentionally incomplete, which is in violation of our policies.

ParseNet: Looking Wider to See Better

4 code implementations 15 Jun 2015 Wei Liu, Andrew Rabinovich, Alexander C. Berg

When we add our proposed global feature, and a technique for learning normalization parameters, accuracy increases consistently even over our improved versions of the baselines.

Segmentation Semantic Segmentation

PAIGE: PAirwise Image Geometry Encoding for Improved Efficiency in Structure-From-Motion

no code implementations CVPR 2015 Johannes L. Schonberger, Alexander C. Berg, Jan-Michael Frahm

Based on the insights of this evaluation, we propose a learning-based approach, the PAirwise Image Geometry Encoding (PAIGE), to efficiently identify image pairs with scene overlap without the need to perform exhaustive putative matching and geometric verification.

MatchNet: Unifying Feature and Metric Learning for Patch-Based Matching

2 code implementations CVPR 2015 Xufeng Han, Thomas Leung, Yangqing Jia, Rahul Sukthankar, Alexander C. Berg

We perform a comprehensive set of experiments on standard datasets to carefully study the contributions of each aspect of MatchNet, with direct comparisons to established methods.

Computational Efficiency Metric Learning +1

Visual Madlibs: Fill in the blank Image Generation and Question Answering

no code implementations 31 May 2015 Licheng Yu, Eunbyung Park, Alexander C. Berg, Tamara L. Berg

In this paper, we introduce a new dataset consisting of 360,001 focused natural language descriptions for 10,738 images.

Image Generation Multiple-choice +1

ImageNet Large Scale Visual Recognition Challenge

12 code implementations 1 Sep 2014 Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C. Berg, Li Fei-Fei

The ImageNet Large Scale Visual Recognition Challenge is a benchmark in object category classification and detection on hundreds of object categories and millions of images.

General Classification Image Classification +4
