Search Results for author: Svetlana Lazebnik

Found 43 papers, 12 papers with code

JoIN: Joint GANs Inversion for Intrinsic Image Decomposition

no code implementations 18 May 2023 Viraj Shah, Svetlana Lazebnik, Julien Philip

In this work, we propose to solve ill-posed inverse imaging problems using a bank of Generative Adversarial Networks (GAN) as a prior and apply our method to the case of Intrinsic Image Decomposition for faces and materials.

Image Relighting Intrinsic Image Decomposition

One-Shot Stylization for Full-Body Human Images

no code implementations 14 Apr 2023 Aiyu Cui, Svetlana Lazebnik

Since body shape deformation is an essential component of an art character's style, we incorporate a novel skeleton deformation module to reshape the pose of the input person and modify the DiOr pose-guided person generator to be more robust to the rescaled poses falling outside the distribution of the realistic poses that the generator is originally trained on.

Robust Online Video Instance Segmentation with Track Queries

1 code implementation 16 Nov 2022 Zitong Zhan, Daniel McKee, Svetlana Lazebnik

We propose a fully online transformer-based video instance segmentation model that performs comparably to top offline methods on the YouTube-VIS 2019 benchmark and considerably outperforms them on UVO and OVIS.

Image Segmentation Instance Segmentation +5

MultiStyleGAN: Multiple One-shot Image Stylizations using a Single GAN

no code implementations 8 Oct 2022 Viraj Shah, Ayush Sarkar, Sudharsan Krishnakumar Anitha, Svetlana Lazebnik

Recent approaches for one-shot stylization such as JoJoGAN fine-tune a pre-trained StyleGAN2 generator on a single style reference image.

One-Shot Face Stylization

Transfer of Representations to Video Label Propagation: Implementation Factors Matter

no code implementations 10 Mar 2022 Daniel McKee, Zitong Zhan, Bing Shuai, Davide Modolo, Joseph Tighe, Svetlana Lazebnik

This work studies feature representations for dense label propagation in video, with a focus on recently proposed methods that learn video correspondence using self-supervised signals such as colorization or temporal cycle consistency.


Multi-Object Tracking with Hallucinated and Unlabeled Videos

no code implementations 19 Aug 2021 Daniel McKee, Bing Shuai, Andrew Berneshawi, Manchen Wang, Davide Modolo, Svetlana Lazebnik, Joseph Tighe

Next, to tackle harder tracking cases, we mine hard examples across an unlabeled pool of real videos with a tracker trained on our hallucinated video data.

Multi-Object Tracking

Dressing in Order: Recurrent Person Image Generation for Pose Transfer, Virtual Try-on and Outfit Editing

1 code implementation ICCV 2021 Aiyu Cui, Daniel McKee, Svetlana Lazebnik

We propose a flexible person generation framework called Dressing in Order (DiOr), which supports 2D pose transfer, virtual try-on, and several fashion editing tasks.

Fashion Synthesis Pose Transfer +1

GridToPix: Training Embodied Agents with Minimal Supervision

no code implementations ICCV 2021 Unnat Jain, Iou-Jen Liu, Svetlana Lazebnik, Aniruddha Kembhavi, Luca Weihs, Alexander Schwing

While deep reinforcement learning (RL) promises freedom from hand-labeled data, great successes, especially for Embodied AI, require significant work to create supervision via carefully shaped rewards.

PointGoal Navigation Reinforcement Learning (RL)

Bridging the Imitation Gap by Adaptive Insubordination

no code implementations NeurIPS 2021 Luca Weihs, Unnat Jain, Iou-Jen Liu, Jordi Salvador, Svetlana Lazebnik, Aniruddha Kembhavi, Alexander Schwing

However, we show that when the teaching agent makes decisions with access to privileged information that is unavailable to the student, this information is marginalized during imitation learning, resulting in an "imitation gap" and, potentially, poor results.

Imitation Learning Memorization +2

Memory-Efficient Incremental Learning Through Feature Adaptation

no code implementations ECCV 2020 Ahmet Iscen, Jeffrey Zhang, Svetlana Lazebnik, Cordelia Schmid

We assume that the model is updated incrementally for new classes as new data becomes available sequentially. This requires adapting the previously stored feature vectors to the updated feature space without having access to the corresponding original training images.

Incremental Learning

Contextual Translation Embedding for Visual Relationship Detection and Scene Graph Generation

no code implementations 28 May 2019 Zih-Siou Hung, Arun Mallya, Svetlana Lazebnik

The previous VTransE model maps entities and predicates into a low-dimensional embedding vector space where the predicate is interpreted as a translation vector between the embedded features of the bounding box regions of the subject and the object.

Graph Generation Relationship Detection +3

Revisiting Image-Language Networks for Open-ended Phrase Detection

3 code implementations 17 Nov 2018 Bryan A. Plummer, Kevin J. Shih, Yichen Li, Ke Xu, Svetlana Lazebnik, Stan Sclaroff, Kate Saenko

Most existing work that grounds natural language phrases in images starts with the assumption that the phrase in question is relevant to the image.

Object Detection +1

Out of the Box: Reasoning with Graph Convolution Nets for Factual Visual Question Answering

no code implementations NeurIPS 2018 Medhini Narasimhan, Svetlana Lazebnik, Alexander G. Schwing

Given a question-image pair, deep network techniques have been employed to successively reduce the large set of facts until one of the two entities of the final remaining fact is predicted as the answer.

Factual Visual Question Answering General Knowledge +2

Two can play this Game: Visual Dialog with Discriminative Question Generation and Answering

no code implementations CVPR 2018 Unnat Jain, Svetlana Lazebnik, Alexander Schwing

In addition, for the first time on the visual dialog dataset, we assess the performance of a system asking questions, and demonstrate how visual dialog can be generated from discriminative question generation and question answering.

Image Captioning Question Answering +4

Piggyback: Adapting a Single Network to Multiple Tasks by Learning to Mask Weights

1 code implementation ECCV 2018 Arun Mallya, Dillon Davis, Svetlana Lazebnik

This work presents a method for adapting a single, fixed deep neural network to multiple tasks without affecting performance on already learned tasks.

Continual Learning Quantization

Conditional Image-Text Embedding Networks

1 code implementation ECCV 2018 Bryan A. Plummer, Paige Kordas, M. Hadi Kiapour, Shuai Zheng, Robinson Piramuthu, Svetlana Lazebnik

This paper presents an approach for grounding phrases in images which jointly learns multiple text-conditioned embeddings in a single end-to-end model.

Phrase Grounding

Enhancing Video Summarization via Vision-Language Embedding

no code implementations CVPR 2017 Bryan A. Plummer, Matthew Brown, Svetlana Lazebnik

This paper addresses video summarization, or the problem of distilling a raw video into a shorter form while still capturing the original story.

Video Summarization

Learning Two-Branch Neural Networks for Image-Text Matching Tasks

1 code implementation 11 Apr 2017 Liwei Wang, Yin Li, Jing Huang, Svetlana Lazebnik

Image-language matching tasks have recently attracted a lot of attention in the computer vision field.

Image-text matching Retrieval +3

Recurrent Models for Situation Recognition

no code implementations ICCV 2017 Arun Mallya, Svetlana Lazebnik

This work proposes Recurrent Neural Network (RNN) models to predict structured 'image situations' -- actions and noun entities fulfilling semantic roles related to the action.

Grounded Situation Recognition Human-Object Interaction Detection +1

Solving Visual Madlibs with Multiple Cues

no code implementations 11 Aug 2016 Tatiana Tommasi, Arun Mallya, Bryan Plummer, Svetlana Lazebnik, Alexander C. Berg, Tamara L. Berg

This paper focuses on answering fill-in-the-blank style multiple choice questions from the Visual Madlibs dataset.

Activity Prediction Multiple-choice +3

Learning Models for Actions and Person-Object Interactions with Transfer to Question Answering

no code implementations 16 Apr 2016 Arun Mallya, Svetlana Lazebnik

This paper proposes deep convolutional network models that utilize local and global context to make human activity label predictions in still images, achieving state-of-the-art performance on two recent datasets with hundreds of labels each.

General Classification Human-Object Interaction Detection +4

Adaptive Object Detection Using Adjacency and Zoom Prediction

1 code implementation CVPR 2016 Yongxi Lu, Tara Javidi, Svetlana Lazebnik

Compared to methods based on fixed anchor locations, our approach naturally adapts to cases where object instances are sparse and small.

Object Detection

Learning Informative Edge Maps for Indoor Scene Layout Prediction

no code implementations ICCV 2015 Arun Mallya, Svetlana Lazebnik

We learn to predict 'informative edge' probability maps using two recent methods that exploit local and global context, respectively: structured edge detection forests, and a fully convolutional network for pixelwise labeling.

Edge Detection

Where to Buy It: Matching Street Clothing Photos in Online Shops

no code implementations ICCV 2015 M. Hadi Kiapour, Xufeng Han, Svetlana Lazebnik, Alexander C. Berg, Tamara L. Berg

In this paper, we define a new task, Exact Street to Shop, where our goal is to match a real-world example of a garment item to the same item in an online shop.


Learning Deep Structure-Preserving Image-Text Embeddings

no code implementations CVPR 2016 Liwei Wang, Yin Li, Svetlana Lazebnik

This paper proposes a method for learning joint embeddings of images and text using a two-branch neural network with multiple layers of linear projections followed by nonlinearities.

Image Retrieval Metric Learning +1

Training Deeper Convolutional Networks with Deep Supervision

1 code implementation 11 May 2015 Liwei Wang, Chen-Yu Lee, Zhuowen Tu, Svetlana Lazebnik

One of the most promising ways of improving the performance of deep convolutional neural networks is by increasing the number of convolutional layers.

General Classification

Scene Parsing with Object Instances and Occlusion Ordering

no code implementations CVPR 2014 Joseph Tighe, Marc Niethammer, Svetlana Lazebnik

This work proposes a method to interpret a scene by assigning a semantic label at every pixel and inferring the spatial extent of individual object instances together with their occlusion relationships.

Scene Parsing

Multi-scale Orderless Pooling of Deep Convolutional Activation Features

no code implementations 7 Mar 2014 Yunchao Gong, Li-Wei Wang, Ruiqi Guo, Svetlana Lazebnik

Deep convolutional neural networks (CNN) have shown their promise as a universal representation for recognition.

Classification General Classification +2

Learning Binary Codes for High-Dimensional Data Using Bilinear Projections

no code implementations CVPR 2013 Yunchao Gong, Sanjiv Kumar, Henry A. Rowley, Svetlana Lazebnik

Recent advances in visual recognition indicate that to achieve good retrieval and classification accuracy on large-scale datasets like ImageNet, extremely high-dimensional visual descriptors, e.g., Fisher Vectors, are needed.

Classification Code Generation +4

Finding Things: Image Parsing with Regions and Per-Exemplar Detectors

no code implementations CVPR 2013 Joseph Tighe, Svetlana Lazebnik

This paper presents a system for image parsing, or labeling each pixel in an image with its semantic category, aimed at achieving broad coverage across hundreds of object categories, many of them sparsely sampled.

A Multi-View Embedding Space for Modeling Internet Images, Tags, and their Semantics

no code implementations 18 Dec 2012 Yunchao Gong, Qifa Ke, Michael Isard, Svetlana Lazebnik

This paper investigates the problem of modeling Internet images and associated text or tags for tasks such as image-to-image search, tag-to-image search, and image-to-tag search (image annotation).

Clustering Image Retrieval +2

Angular Quantization-based Binary Codes for Fast Similarity Search

no code implementations NeurIPS 2012 Yunchao Gong, Sanjiv Kumar, Vishal Verma, Svetlana Lazebnik

Such data typically arises in a large number of vision and text applications where counts or frequencies are used as features.

Quantization Retrieval +1

Locality-sensitive binary codes from shift-invariant kernels

no code implementations NeurIPS 2009 Maxim Raginsky, Svetlana Lazebnik

This paper addresses the problem of designing binary codes for high-dimensional data such that vectors that are similar in the original space map to similar binary strings.

Near-minimax recursive density estimation on the binary hypercube

no code implementations NeurIPS 2008 Maxim Raginsky, Svetlana Lazebnik, Rebecca Willett, Jorge Silva

This paper describes a recursive estimation procedure for multivariate binary densities using orthogonal expansions.

Density Estimation
