This work studies feature representations for dense label propagation in video, with a focus on recently proposed methods that learn video correspondence using self-supervised signals such as colorization or temporal cycle consistency.
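As an illustration of how such learned correspondence features can drive dense label propagation (a minimal NumPy sketch, not the exact formulation of any of the surveyed methods; the temperature value and the shapes are illustrative assumptions):

```python
import numpy as np

def propagate_labels(feat_ref, labels_ref, feat_tgt, temperature=0.07):
    """Propagate per-pixel labels from a reference frame to a target frame.

    feat_ref:   (N, D) L2-normalized features of reference-frame pixels
    labels_ref: (N, C) one-hot or soft labels of reference-frame pixels
    feat_tgt:   (M, D) L2-normalized features of target-frame pixels
    Returns a (M, C) array of soft labels for the target frame.
    """
    sim = feat_tgt @ feat_ref.T / temperature        # (M, N) pairwise similarity
    sim -= sim.max(axis=1, keepdims=True)            # numerical stability
    affinity = np.exp(sim)
    affinity /= affinity.sum(axis=1, keepdims=True)  # softmax over reference pixels
    # Each target pixel's label is an affinity-weighted average of reference labels.
    return affinity @ labels_ref

# Toy usage: 4 reference pixels, 2 target pixels, 8-D features, 3 classes.
rng = np.random.default_rng(0)
f_ref = rng.normal(size=(4, 8)); f_ref /= np.linalg.norm(f_ref, axis=1, keepdims=True)
f_tgt = rng.normal(size=(2, 8)); f_tgt /= np.linalg.norm(f_tgt, axis=1, keepdims=True)
y_ref = np.eye(3)[[0, 1, 2, 0]]
print(propagate_labels(f_ref, y_ref, f_tgt))         # (2, 3) soft labels
```

The quality of the propagated labels then rests entirely on how well the features identify corresponding pixels across frames, which is what the self-supervised signals are meant to teach.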
We show that the emergent communication can be grounded to the agent observations and the spatial structure of the 3D environment.
Next, to tackle harder tracking cases, we mine hard examples from an unlabeled pool of real videos using a tracker trained on our hallucinated video data.
While deep reinforcement learning (RL) promises freedom from hand-labeled data, great successes, especially for Embodied AI, require significant work to create supervision via carefully shaped rewards.
We propose a flexible person generation framework called Dressing in Order (DiOr), which supports 2D pose transfer, virtual try-on, and several fashion editing tasks.
However, we show that when the teaching agent makes decisions with access to privileged information that is unavailable to the student, this information is marginalized during imitation learning, resulting in an "imitation gap" and, potentially, poor results.
We assume that the model is updated incrementally for new classes as new data becomes available sequentially. This requires adapting the previously stored feature vectors to the updated feature space without having access to the corresponding original training images.
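One plausible way to realize this adaptation, sketched below under stated assumptions (the two-layer mapping, the MSE objective, and all names here are illustrative, not necessarily the paper's method): fit a small network that maps old-space features to new-space features using images currently available to both extractors, then apply it to the stored vectors of past classes.

```python
import torch
import torch.nn as nn

# Hypothetical setup: old_feats / new_feats are features of the *currently
# available* images, extracted by the frozen old model and the updated model.
def fit_feature_adapter(old_feats, new_feats, dim, epochs=100, lr=1e-3):
    """Fit a small mapping g: old feature space -> new feature space."""
    g = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
    opt = torch.optim.Adam(g.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = nn.functional.mse_loss(g(old_feats), new_feats)
        loss.backward()
        opt.step()
    return g

dim = 64
old = torch.randn(512, dim)                          # old-model features
new = torch.randn(512, dim)                          # new-model features
adapter = fit_feature_adapter(old, new, dim)

# Stored vectors of past classes, whose original images are no longer kept,
# are migrated into the updated feature space by the learned mapping.
stored_old_class_feats = torch.randn(100, dim)
migrated = adapter(stored_old_class_feats).detach()
```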
The previous VTransE model maps entities and predicates into a low-dimensional embedding vector space where the predicate is interpreted as a translation vector between the embedded features of the bounding box regions of the subject and the object.
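In standard translation-embedding notation this reads as follows (the projection matrices and translation vector are schematic, following the description above):

```latex
% Subject/object region features x_s, x_o are projected into a shared
% low-dimensional space, where the predicate acts as a translation:
\[
  W_s \mathbf{x}_s + \mathbf{t}_p \;\approx\; W_o \mathbf{x}_o ,
\]
% so a triple (subject, predicate, object) is scored by how small
% \| W_s x_s + t_p - W_o x_o \| is.
```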
Collaboration is a necessary skill to perform tasks that are beyond one agent's capabilities.
Most existing work that grounds natural language phrases in images starts with the assumption that the phrase in question is relevant to the image.
Given a question-image pair, deep network techniques have been employed to successively reduce the large set of facts until one of the two entities of the final remaining fact is predicted as the answer.
In addition, for the first time on the visual dialog dataset, we assess the performance of a system that asks questions, and demonstrate how visual dialog can be generated by combining discriminative question generation and question answering.
This work presents a method for adapting a single, fixed deep neural network to multiple tasks without affecting performance on already learned tasks.
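A minimal sketch of one way to realize this with per-task binary masks over frozen weights (the straight-through estimator and the initialization below are illustrative assumptions, not necessarily the paper's exact scheme):

```python
import torch
import torch.nn as nn

class MaskedLinear(nn.Module):
    """A frozen linear layer gated by a learned per-task binary mask.

    The shared weight never changes, so previously learned tasks are
    unaffected; each new task trains only its own real-valued mask scores.
    """
    def __init__(self, linear: nn.Linear, threshold: float = 5e-3):
        super().__init__()
        self.weight = linear.weight.detach()    # frozen shared weights
        self.bias = linear.bias.detach() if linear.bias is not None else None
        # Initialized above threshold so the mask starts as all-ones (assumption).
        self.scores = nn.Parameter(1e-2 * torch.ones_like(self.weight))
        self.threshold = threshold

    def forward(self, x):
        hard = (self.scores > self.threshold).float()   # binary mask
        # Straight-through trick: the forward pass uses the hard mask, while
        # gradients flow to the real-valued scores as if the mask were identity.
        mask = hard + self.scores - self.scores.detach()
        return nn.functional.linear(x, self.weight * mask, self.bias)

layer = MaskedLinear(nn.Linear(16, 8))
out = layer(torch.randn(4, 16))   # (4, 8); only `scores` receives gradients
```

Storing one bit per weight per task is what keeps the per-task overhead small relative to training a separate network.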
This paper presents an approach for grounding phrases in images which jointly learns multiple text-conditioned embeddings in a single end-to-end model.
This paper explores image caption generation using conditional variational auto-encoders (CVAEs).
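For reference, the objective such a model maximizes is the conditional ELBO (standard CVAE notation; the choice of priors, encoders, and decoders is where the paper's specific design lives):

```latex
% The caption x is generated conditioned on the image c via a latent code z:
\[
  \log p_\theta(x \mid c) \;\ge\;
  \mathbb{E}_{q_\phi(z \mid x, c)}\bigl[\log p_\theta(x \mid z, c)\bigr]
  \;-\; \mathrm{KL}\bigl(q_\phi(z \mid x, c) \,\|\, p_\theta(z \mid c)\bigr).
\]
% Sampling different z at test time yields diverse captions for one image.
```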
This paper presents a method for adding multiple tasks to a single deep neural network while avoiding catastrophic forgetting.
This paper addresses video summarization, or the problem of distilling a raw video into a shorter form while still capturing the original story.
Image-language matching tasks have recently attracted a lot of attention in the computer vision field.
This work proposes Recurrent Neural Network (RNN) models to predict structured 'image situations' -- actions and noun entities fulfilling semantic roles related to the action.
This paper presents a framework for localization or grounding of phrases in images using a large collection of linguistic and visual cues.
This paper presents an approach for answering fill-in-the-blank multiple choice questions from the Visual Madlibs dataset.
This paper focuses on answering fill-in-the-blank style multiple choice questions from the Visual Madlibs dataset.
This paper proposes deep convolutional network models that utilize local and global context to make human activity label predictions in still images, achieving state-of-the-art performance on two recent datasets with hundreds of labels each.
In this paper, we define a new task, Exact Street to Shop, where our goal is to match a real-world example of a garment item to the same item in an online shop.
We learn to predict 'informative edge' probability maps using two recent methods that exploit local and global context, respectively: structured edge detection forests, and a fully convolutional network for pixelwise labeling.
This paper proposes a method for learning joint embeddings of images and text using a two-branch neural network with multiple layers of linear projections followed by nonlinearities.
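A minimal sketch of such a two-branch architecture (layer widths, input dimensions, and the retrieval-by-cosine usage are illustrative placeholders, not the paper's exact configuration):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoBranchEmbedding(nn.Module):
    """Maps image and text features into a shared space via stacked linear
    projections with nonlinearities, one branch per modality."""
    def __init__(self, img_dim=4096, txt_dim=300, embed_dim=512):
        super().__init__()
        self.img_branch = nn.Sequential(
            nn.Linear(img_dim, 1024), nn.ReLU(), nn.Linear(1024, embed_dim))
        self.txt_branch = nn.Sequential(
            nn.Linear(txt_dim, 1024), nn.ReLU(), nn.Linear(1024, embed_dim))

    def forward(self, img_feats, txt_feats):
        # L2-normalize so that dot products behave like cosine similarity.
        x = F.normalize(self.img_branch(img_feats), dim=-1)
        y = F.normalize(self.txt_branch(txt_feats), dim=-1)
        return x, y

model = TwoBranchEmbedding()
img, txt = model(torch.randn(32, 4096), torch.randn(32, 300))
sim = img @ txt.T   # (32, 32) similarity matrix usable for cross-modal retrieval
# A margin-based ranking loss over matched vs. mismatched pairs is one
# common way to train embeddings of this kind.
```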
The Flickr30k dataset has become a standard benchmark for sentence-based image description.
One of the most promising ways to improve the performance of deep convolutional neural networks is to increase the number of convolutional layers.
This work proposes a method to interpret a scene by assigning a semantic label at every pixel and inferring the spatial extent of individual object instances together with their occlusion relationships.
Deep convolutional neural networks (CNNs) have shown their promise as a universal representation for recognition.
Recent advances in visual recognition indicate that to achieve good retrieval and classification accuracy on large-scale datasets like ImageNet, extremely high-dimensional visual descriptors, e.g., Fisher Vectors, are needed.
This paper presents a system for image parsing, or labeling each pixel in an image with its semantic category, aimed at achieving broad coverage across hundreds of object categories, many of them sparsely sampled.
This paper investigates the problem of modeling Internet images and associated text or tags for tasks such as image-to-image search, tag-to-image search, and image-to-tag search (image annotation).
Count- and frequency-valued data typically arise in a large number of vision and text applications where such statistics are used as features.
This paper addresses the problem of designing binary codes for high-dimensional data such that vectors that are similar in the original space map to similar binary strings.
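As a point of reference for what "similar vectors map to similar strings" means operationally, here is a generic sign-of-random-projection (LSH) baseline; it illustrates the goal but is not presented as the paper's own learning procedure:

```python
import numpy as np

def random_hyperplane_codes(X, n_bits=32, seed=0):
    """Generic LSH baseline: each bit is the sign of a random projection,
    so vectors with small angles tend to agree on many bits."""
    rng = np.random.default_rng(seed)
    R = rng.normal(size=(X.shape[1], n_bits))   # random hyperplane normals
    return (X @ R > 0).astype(np.uint8)         # (n, n_bits) binary codes

def hamming(a, b):
    return int(np.count_nonzero(a != b))

X = np.random.default_rng(1).normal(size=(3, 128))
# Append a slightly perturbed copy of X[0]; its code should stay close to X[0]'s.
X = np.vstack([X, X[0] + 0.01 * np.random.default_rng(2).normal(size=128)])
C = random_hyperplane_codes(X)
print(hamming(C[0], C[3]), hamming(C[0], C[1]))  # near-duplicate vs. unrelated
```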
This paper describes a recursive estimation procedure for multivariate binary densities using orthogonal expansions.
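For context, the classical orthogonal expansion for multivariate binary densities is the Bahadur expansion; whether this is the exact basis used in the paper is an assumption:

```latex
% Let x = (x_1, ..., x_d) with x_i in {0,1}, marginals p_i = P(x_i = 1),
% and standardized variables z_i = (x_i - p_i) / sqrt(p_i (1 - p_i)).
% The joint density is written relative to the independence model as a
% series in orthogonal products of the z_i:
\[
  p(\mathbf{x}) \;=\; \prod_{i=1}^{d} p_i^{x_i} (1 - p_i)^{1 - x_i}
  \Bigl[\, 1 + \sum_{i < j} r_{ij}\, z_i z_j
        + \sum_{i < j < k} r_{ijk}\, z_i z_j z_k + \cdots \Bigr],
\]
% where the coefficients r_{ij...} = E[z_i z_j ...] can be estimated from
% data and the series truncated for a tractable approximation.
```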