Search Results for author: Alexander C. Berg

Found 38 papers, 14 papers with code

Learning from Models and Data for Visual Grounding

no code implementations • 20 Mar 2024 • Ruozhen He, Paola Cascante-Bonilla, Ziyan Yang, Alexander C. Berg, Vicente Ordonez

We introduce SynGround, a novel framework that combines data-driven learning and knowledge transfer from various large-scale pretrained models to enhance the visual grounding capabilities of a pretrained vision-and-language model.

Language Modelling Large Language Model +2

Improved Visual Grounding through Self-Consistent Explanations

no code implementations • 7 Dec 2023 • Ruozhen He, Paola Cascante-Bonilla, Ziyan Yang, Alexander C. Berg, Vicente Ordonez

Vision-and-language models trained to match images with text can be combined with visual explanation methods to point to the locations of specific objects in an image.

Language Modelling Large Language Model +1

Joint Depth Prediction and Semantic Segmentation with Multi-View SAM

no code implementations • 31 Oct 2023 • Mykhailo Shvets, Dongxu Zhao, Marc Niethammer, Roni Sengupta, Alexander C. Berg

Multi-task approaches to joint depth and segmentation prediction are well-studied for monocular images.

Depth Estimation Depth Prediction +2

Segment Anything

18 code implementations • ICCV 2023 • Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alexander C. Berg, Wan-Yen Lo, Piotr Dollár, Ross Girshick

We introduce the Segment Anything (SA) project: a new task, model, and dataset for image segmentation.

Ranked #2 on Zero-Shot Instance Segmentation on LVIS v1.0 val

Event-based Object Segmentation Image Segmentation +3

Point-Level Region Contrast for Object Detection Pre-Training

1 code implementation • CVPR 2022 • Yutong Bai, Xinlei Chen, Alexander Kirillov, Alan Yuille, Alexander C. Berg

In this work we present point-level region contrast, a self-supervised pre-training approach for the task of object detection.

Contrastive Learning Knowledge Distillation +2

Neural Pseudo-Label Optimism for the Bank Loan Problem

no code implementations • NeurIPS 2021 • Aldo Pacchiano, Shaun Singh, Edward Chou, Alexander C. Berg, Jakob Foerster

The lender only observes whether a customer will repay a loan if the loan is issued to begin with, and thus modeled decisions affect what data is available to the lender for future decisions.

Decision Making Pseudo Label

Boundary IoU: Improving Object-Centric Image Segmentation Evaluation

2 code implementations • CVPR 2021 • Bowen Cheng, Ross Girshick, Piotr Dollár, Alexander C. Berg, Alexander Kirillov

We perform an extensive analysis across different error types and object sizes and show that Boundary IoU is significantly more sensitive than the standard Mask IoU measure to boundary errors for large objects and does not over-penalize errors on smaller objects.

Image Segmentation Object +2

Worldsheet: Wrapping the World in a 3D Sheet for View Synthesis from a Single Image

1 code implementation • ICCV 2021 • Ronghang Hu, Nikhila Ravi, Alexander C. Berg, Deepak Pathak

We present Worldsheet, a method for novel view synthesis using just a single RGB image as input.

Novel View Synthesis

Similarity Search for Efficient Active Learning and Search of Rare Concepts

1 code implementation • 30 Jun 2020 • Cody Coleman, Edward Chou, Julian Katz-Samuels, Sean Culatana, Peter Bailis, Alexander C. Berg, Robert Nowak, Roshan Sumbaly, Matei Zaharia, I. Zeki Yalniz

Many active learning and search approaches are intractable for large-scale industrial settings with billions of unlabeled examples.

Active Learning Computational Efficiency

A Mask-RCNN Baseline for Probabilistic Object Detection

no code implementations • 9 Aug 2019 • Phil Ammirato, Alexander C. Berg

The Probabilistic Object Detection Challenge evaluates object detection methods using a new evaluation measure, Probability-based Detection Quality (PDQ), on a new synthetic image dataset.

Object object-detection +1

IMP: Instance Mask Projection for High Accuracy Semantic Segmentation of Things

no code implementations • ICCV 2019 • Cheng-Yang Fu, Tamara L. Berg, Alexander C. Berg

In addition, the instance mask projection operator works well on other (non-clothing) datasets, providing an improvement of 3 points in mIOU on Thing classes of Cityscapes, a self-driving dataset, on top of a state-of-the-art approach.

Instance Segmentation Scene Segmentation +2

Low-Power Computer Vision: Status, Challenges, Opportunities

no code implementations • 15 Apr 2019 • Sergei Alyamkin, Matthew Ardi, Alexander C. Berg, Achille Brighton, Bo Chen, Yiran Chen, Hsin-Pai Cheng, Zichen Fan, Chen Feng, Bo Fu, Kent Gauen, Abhinav Goel, Alexander Goncharenko, Xuyang Guo, Soonhoi Ha, Andrew Howard, Xiao Hu, Yuanjun Huang, Donghyun Kang, Jaeyoun Kim, Jong Gook Ko, Alexander Kondratyev, Junhyeok Lee, Seungjae Lee, Suwoong Lee, Zichao Li, Zhiyu Liang, Juzheng Liu, Xin Liu, Yang Lu, Yung-Hsiang Lu, Deeptanshu Malik, Hong Hanh Nguyen, Eunbyung Park, Denis Repin, Liang Shen, Tao Sheng, Fei Sun, David Svitov, George K. Thiruvathukal, Baiwu Zhang, Jingchi Zhang, Xiaopeng Zhang, Shaojie Zhuo

In addition to mobile phones, many autonomous systems rely on visual data for making decisions, and some of these systems have limited energy (such as unmanned aerial vehicles, also called drones, and mobile robots).

Low Power Inference for On-Device Visual Recognition with a Quantization-Friendly Solution

no code implementations • 12 Mar 2019 • Chen Feng, Tao Sheng, Zhiyu Liang, Shaojie Zhuo, Xiaopeng Zhang, Liang Shen, Matthew Ardi, Alexander C. Berg, Yiran Chen, Bo Chen, Kent Gauen, Yung-Hsiang Lu

The IEEE Low-Power Image Recognition Challenge (LPIRC) is an annual competition started in 2015 that encourages joint hardware and software solutions for computer vision systems with low latency and power.

Quantization

RetinaMask: Learning to predict masks improves state-of-the-art single-shot detection for free

52 code implementations • 10 Jan 2019 • Cheng-Yang Fu, Mykhailo Shvets, Alexander C. Berg

COCO test-dev results are up to 41.4 mAP for RetinaMask-101 vs 39.1 mAP for RetinaNet-101, while the runtime is the same during evaluation.

Ranked #154 on Object Detection on COCO minival

Object Detection

2018 Low-Power Image Recognition Challenge

no code implementations • 3 Oct 2018 • Sergei Alyamkin, Matthew Ardi, Achille Brighton, Alexander C. Berg, Yiran Chen, Hsin-Pai Cheng, Bo Chen, Zichen Fan, Chen Feng, Bo Fu, Kent Gauen, Jongkook Go, Alexander Goncharenko, Xuyang Guo, Hong Hanh Nguyen, Andrew Howard, Yuanjun Huang, Donghyun Kang, Jaeyoun Kim, Alexander Kondratyev, Seungjae Lee, Suwoong Lee, Junhyeok Lee, Zhiyu Liang, Xin Liu, Juzheng Liu, Zichao Li, Yang Lu, Yung-Hsiang Lu, Deeptanshu Malik, Eunbyung Park, Denis Repin, Tao Sheng, Liang Shen, Fei Sun, David Svitov, George K. Thiruvathukal, Baiwu Zhang, Jingchi Zhang, Xiaopeng Zhang, Shaojie Zhuo

The Low-Power Image Recognition Challenge (LPIRC, https://rebootingcomputing.ieee.org/lpirc) is an annual competition started in 2015.

Target Driven Instance Detection

1 code implementation • 13 Mar 2018 • Phil Ammirato, Cheng-Yang Fu, Mykhailo Shvets, Jana Kosecka, Alexander C. Berg

While state-of-the-art general object detectors are getting better and better, there are not many systems specifically designed to take advantage of the instance detection problem.

Object

Meta-Tracker: Fast and Robust Online Adaptation for Visual Object Trackers

no code implementations • ECCV 2018 • Eunbyung Park, Alexander C. Berg

The meta learning is driven by the goal of deep networks that can quickly be adapted to robustly model a particular target in future frames.

Meta-Learning

Video Highlight Prediction Using Audience Chat Reactions

no code implementations • EMNLP 2017 • Cheng-Yang Fu, Joon Lee, Mohit Bansal, Alexander C. Berg

Sports channel video portals offer an exciting domain for research on multimodal, multilingual analysis.

Transformation-Grounded Image Generation Network for Novel 3D View Synthesis

2 code implementations • CVPR 2017 • Eunbyung Park, Jimei Yang, Ersin Yumer, Duygu Ceylan, Alexander C. Berg

Instead of taking a 'blank slate' approach, we first explicitly infer the parts of the geometry visible both in the input and novel views and then re-cast the remaining synthesis problem as image completion.

Image Generation Novel View Synthesis

A Dataset for Developing and Benchmarking Active Vision

no code implementations • 27 Feb 2017 • Phil Ammirato, Patrick Poirson, Eunbyung Park, Jana Kosecka, Alexander C. Berg

We present a new public dataset with a focus on simulating robotic vision tasks in everyday indoor environments using real imagery.

Benchmarking General Classification +5

Synthesizing Training Data for Object Detection in Indoor Scenes

no code implementations • 25 Feb 2017 • Georgios Georgakis, Arsalan Mousavian, Alexander C. Berg, Jana Kosecka

In this work we explore the ability of using synthetically generated composite images for training state-of-the-art object detectors, especially for object instance detection.

Object object-detection +1

DSSD : Deconvolutional Single Shot Detector

2 code implementations • 23 Jan 2017 • Cheng-Yang Fu, Wei Liu, Ananth Ranga, Ambrish Tyagi, Alexander C. Berg

The main contribution of this paper is an approach for introducing additional context into state-of-the-art general object detection.

object-detection Object Detection

Combining Multiple Cues for Visual Madlibs Question Answering

no code implementations • 1 Nov 2016 • Tatiana Tommasi, Arun Mallya, Bryan Plummer, Svetlana Lazebnik, Alexander C. Berg, Tamara L. Berg

This paper presents an approach for answering fill-in-the-blank multiple choice questions from the Visual Madlibs dataset.

Attribute General Classification +3

Fast Single Shot Detection and Pose Estimation

no code implementations • 19 Sep 2016 • Patrick Poirson, Phil Ammirato, Cheng-Yang Fu, Wei Liu, Jana Kosecka, Alexander C. Berg

For applications in navigation and robotics, estimating the 3D pose of objects is as important as detection.

Object Tracking Pose Estimation

When was that made?

no code implementations • 12 Aug 2016 • Sirion Vittayakorn, Alexander C. Berg, Tamara L. Berg

Toward this goal, we utilize features from existing deep networks and also fine-tune new networks for temporal estimation.

Retrieval

Solving Visual Madlibs with Multiple Cues

no code implementations • 11 Aug 2016 • Tatiana Tommasi, Arun Mallya, Bryan Plummer, Svetlana Lazebnik, Alexander C. Berg, Tamara L. Berg

This paper focuses on answering fill-in-the-blank style multiple choice questions from the Visual Madlibs dataset.

Activity Prediction Attribute +4

Modeling Context in Referring Expressions

4 code implementations • 31 Jul 2016 • Licheng Yu, Patrick Poirson, Shan Yang, Alexander C. Berg, Tamara L. Berg

Humans refer to objects in their environments all the time, especially in dialogue with other people.

Referring Expression Referring expression generation +1

SSD: Single Shot MultiBox Detector

223 code implementations • 8 Dec 2015 • Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, Alexander C. Berg

Experimental results on the PASCAL VOC, MS COCO, and ILSVRC datasets confirm that SSD has comparable accuracy to methods that utilize an additional object proposal step and is much faster, while providing a unified framework for both training and inference.

Ranked #3 on Object Detection on PASCAL VOC 2012

LIDAR Semantic Segmentation Low-Light Image Enhancement +4

Visual Madlibs: Fill in the Blank Description Generation and Question Answering

no code implementations • ICCV 2015 • Licheng Yu, Eunbyung Park, Alexander C. Berg, Tamara L. Berg

In this paper, we introduce a new dataset consisting of 360,001 focused natural language descriptions for 10,738 images.

Multiple-choice Question Answering

Where to Buy It: Matching Street Clothing Photos in Online Shops

no code implementations • ICCV 2015 • M. Hadi Kiapour, Xufeng Han, Svetlana Lazebnik, Alexander C. Berg, Tamara L. Berg

In this paper, we define a new task, Exact Street to Shop, where our goal is to match a real-world example of a garment item to the same item in an online shop.

Retrieval

Learning to decompose for object detection and instance segmentation

no code implementations • 19 Nov 2015 • Eunbyung Park, Alexander C. Berg

Although deep convolutional neural networks (CNNs) have achieved remarkable results on object detection and segmentation, pre- and post-processing steps such as region proposals and non-maximum suppression (NMS) have been required.

Instance Segmentation Object +4

Piecewise Linear Activation Functions For More Efficient Deep Networks

no code implementations • 11 Nov 2015 • Cheng-Yang Fu, Alexander C. Berg

This submission has been withdrawn by arXiv administrators because it is intentionally incomplete, which is in violation of our policies.

ParseNet: Looking Wider to See Better

4 code implementations • 15 Jun 2015 • Wei Liu, Andrew Rabinovich, Alexander C. Berg

When we add our proposed global feature, and a technique for learning normalization parameters, accuracy increases consistently even over our improved versions of the baselines.

Ranked #39 on Semantic Segmentation on PASCAL VOC 2012 test

Segmentation Semantic Segmentation

PAIGE: PAirwise Image Geometry Encoding for Improved Efficiency in Structure-From-Motion

no code implementations • CVPR 2015 • Johannes L. Schonberger, Alexander C. Berg, Jan-Michael Frahm

Based on the insights of this evaluation, we propose a learning-based approach, the PAirwise Image Geometry Encoding (PAIGE), to efficiently identify image pairs with scene overlap without the need to perform exhaustive putative matching and geometric verification.

MatchNet: Unifying Feature and Metric Learning for Patch-Based Matching

2 code implementations • CVPR 2015 • Xufeng Han, Thomas Leung, Yangqing Jia, Rahul Sukthankar, Alexander C. Berg

We perform a comprehensive set of experiments on standard datasets to carefully study the contributions of each aspect of MatchNet, with direct comparisons to established methods.

Computational Efficiency Metric Learning +1

Visual Madlibs: Fill in the blank Image Generation and Question Answering

no code implementations • 31 May 2015 • Licheng Yu, Eunbyung Park, Alexander C. Berg, Tamara L. Berg

In this paper, we introduce a new dataset consisting of 360,001 focused natural language descriptions for 10,738 images.

Image Generation Multiple-choice +1

ImageNet Large Scale Visual Recognition Challenge

12 code implementations • 1 Sep 2014 • Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C. Berg, Li Fei-Fei

The ImageNet Large Scale Visual Recognition Challenge is a benchmark in object category classification and detection on hundreds of object categories and millions of images.

General Classification Image Classification +4

Fast and Balanced: Efficient Label Tree Learning for Large Scale Object Recognition

no code implementations • NeurIPS 2011 • Jia Deng, Sanjeev Satheesh, Alexander C. Berg, Fei Li

We present a novel approach to efficiently learn a label tree for large scale classification with many classes.

Classification General Classification +2
