1 code implementation • ICCV 2023 • Ho Kei Cheng, Seoung Wug Oh, Brian Price, Alexander Schwing, Joon-Young Lee
To 'track anything' without training on video data for every individual task, we develop a decoupled video segmentation approach (DEVA), composed of task-specific image-level segmentation and class/task-agnostic bi-directional temporal propagation.
Ranked #1 on Semi-Supervised Video Object Segmentation on MOSE
Open-Vocabulary Video Segmentation
Open-World Video Segmentation
1 code implementation • CVPR 2023 • Yen-Chi Cheng, Hsin-Ying Lee, Sergey Tulyakov, Alexander Schwing, LiangYan Gui
To enable interactive generation, our method supports a variety of input modalities that a human can easily provide, including images, text, partially observed shapes, and combinations of these, and further allows the strength of each input to be adjusted.
no code implementations • 21 Oct 2022 • Zhenggang Tang, Balakumar Sundaralingam, Jonathan Tremblay, Bowen Wen, Ye Yuan, Stephen Tyree, Charles Loop, Alexander Schwing, Stan Birchfield
We present a system for collision-free control of a robot manipulator that uses only RGB views of the world.
no code implementations • 12 Oct 2022 • Itai Gat, Yossi Adi, Alexander Schwing, Tamir Hazan
Generalization bounds, which assess the difference between the true risk and the empirical risk, have been studied extensively.
1 code implementation • CVPR 2022 • Colin Graber, Cyril Jazra, Wenjie Luo, LiangYan Gui, Alexander Schwing
For this, panoptic segmentations have been studied as a compelling representation in recent work.
2 code implementations • 20 Dec 2021 • Revanth Gangi Reddy, Xilin Rui, Manling Li, Xudong Lin, Haoyang Wen, Jaemin Cho, Lifu Huang, Mohit Bansal, Avirup Sil, Shih-Fu Chang, Alexander Schwing, Heng Ji
Specifically, the task involves multi-hop questions that require reasoning over image-caption pairs to identify the grounded visual object being referred to and then predict a span from the news body text to answer the question.
1 code implementation • NeurIPS 2021 • Itai Gat, Idan Schwartz, Alexander Schwing
To study and quantify this concern, we introduce the perceptual score, a metric that assesses the degree to which a model relies on the different subsets of the input features, i.e., modalities.
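A rough sketch of how a model's reliance on one modality might be measured, assuming a permutation-based comparison: accuracy on the intact data minus accuracy after that modality is shuffled across samples. The snippet above does not give the paper's definition, so this comparison is an assumption, and `accuracy`, `modality_reliance`, and the toy model are hypothetical names chosen for illustration.

```python
import random


def accuracy(model, samples, labels):
    """Fraction of samples for which the model predicts the label."""
    correct = sum(model(s) == y for s, y in zip(samples, labels))
    return correct / len(labels)


def modality_reliance(model, samples, labels, modality, seed=0):
    """Accuracy drop when one modality is permuted across samples.

    Assumption: a permutation-based score, not the paper's exact metric.
    """
    rng = random.Random(seed)
    values = [s[modality] for s in samples]
    rng.shuffle(values)
    # Rebuild each sample with the shuffled value for the chosen modality.
    permuted = [{**s, modality: v} for s, v in zip(samples, values)]
    return accuracy(model, samples, labels) - accuracy(model, permuted, labels)


# Toy data: the label equals the "image" feature; "text" is irrelevant.
samples = [{"image": i % 2, "text": (i // 2) % 2} for i in range(20)]
labels = [s["image"] for s in samples]


def image_only_model(sample):
    # Hypothetical model that ignores the text modality entirely.
    return sample["image"]


text_reliance = modality_reliance(image_only_model, samples, labels, "text")
image_reliance = modality_reliance(image_only_model, samples, labels, "image")
```

Because the toy model never reads the text feature, shuffling it leaves every prediction unchanged, so `text_reliance` is exactly zero, while shuffling the image feature can only lower accuracy.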
1 code implementation • ECNLP (ACL) 2022 • Anurendra Kumar, Keval Morabia, Jingjin Wang, Kevin Chen-Chuan Chang, Alexander Schwing
To address this challenge, we propose to reformulate webpage information extraction (WIE) as a context-aware Webpage Object Detection task.
Ranked #1 on Webpage Object Detection on CoVA (using extra training data)
no code implementations • ICCV 2021 • Shivansh Patel, Saim Wani, Unnat Jain, Alexander Schwing, Svetlana Lazebnik, Manolis Savva, Angel X. Chang
We show that the emergent communication can be grounded to the agent observations and the spatial structure of the 3D environment.
no code implementations • ICCV 2021 • Xiaoming Zhao, Harsh Agrawal, Dhruv Batra, Alexander Schwing
It is fundamental for personal robots to reliably navigate to a specified goal.
no code implementations • 4 Aug 2021 • Tom Braude, Idan Schwartz, Alexander Schwing, Ariel Shamir
OIA models interactions between the sentence-corresponding image and important regions in other images of the sequence.
no code implementations • ICCV 2021 • Unnat Jain, Iou-Jen Liu, Svetlana Lazebnik, Aniruddha Kembhavi, Luca Weihs, Alexander Schwing
While deep reinforcement learning (RL) promises freedom from hand-labeled data, great successes, especially for Embodied AI, require significant work to create supervision via carefully shaped rewards.
no code implementations • CVPR 2021 • Colin Graber, Grace Tsai, Michael Firman, Gabriel Brostow, Alexander Schwing
Following this decomposition, we introduce panoptic segmentation forecasting.
1 code implementation • NeurIPS 2020 • Itai Gat, Idan Schwartz, Alexander Schwing, Tamir Hazan
However, regularization with the functional entropy is challenging.
Ranked #3 on Visual Question Answering (VQA) on VQA-CP
no code implementations • NeurIPS 2021 • Jyoti Aneja, Alexander Schwing, Jan Kautz, Arash Vahdat
To tackle this issue, we propose an energy-based prior defined by the product of a base prior distribution and a reweighting factor, designed to bring the base closer to the aggregate posterior.
Ranked #6 on Image Generation on CelebA 256x256 (FID metric)
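The product form of the prior described in the entry above can be illustrated with a minimal sketch. Here the base prior is a standard normal, the reweighting factor is a hand-picked sigmoid, and samples are drawn by sampling-importance-resampling; all three are hypothetical choices for illustration, not details taken from the paper.

```python
import math
import random


def log_base_prior(z):
    """Log density of a standard normal base prior (assumed for the sketch)."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def log_reweight(z):
    """Log of a hypothetical reweighting factor r(z) = sigmoid(4z)."""
    return -math.log1p(math.exp(-4.0 * z))


def log_unnormalized_prior(z):
    """Log of base(z) * r(z); the normalizing constant is left out."""
    return log_base_prior(z) + log_reweight(z)


def sample_prior(num_samples, num_proposals=2000, seed=0):
    """Draw approximate samples via sampling-importance-resampling."""
    rng = random.Random(seed)
    # Proposals come from the base prior, so the importance weight of
    # each proposal is proportional to the reweighting factor alone.
    proposals = [rng.gauss(0.0, 1.0) for _ in range(num_proposals)]
    weights = [math.exp(log_reweight(z)) for z in proposals]
    return rng.choices(proposals, weights=weights, k=num_samples)


samples = sample_prior(500)
mean = sum(samples) / len(samples)
```

Since this particular reweighting factor favors positive values, the resampled mean shifts above the base prior's mean of zero.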
no code implementations • NeurIPS 2021 • Luca Weihs, Unnat Jain, Iou-Jen Liu, Jordi Salvador, Svetlana Lazebnik, Aniruddha Kembhavi, Alexander Schwing
However, we show that when the teaching agent makes decisions with access to privileged information that is unavailable to the student, this information is marginalized during imitation learning, resulting in an "imitation gap" and, potentially, poor results.
no code implementations • ECCV 2020 • Unnat Jain, Luca Weihs, Eric Kolve, Ali Farhadi, Svetlana Lazebnik, Aniruddha Kembhavi, Alexander Schwing
Autonomous agents must learn to collaborate.
1 code implementation • 21 Apr 2020 • Mang Tik Chiu, Xingqian Xu, Kai Wang, Jennifer Hobbs, Naira Hovakimyan, Thomas S. Huang, Honghui Shi, Yunchao Wei, Zilong Huang, Alexander Schwing, Robert Brunner, Ivan Dozier, Wyatt Dozier, Karen Ghandilyan, David Wilson, Hyunseong Park, Junhee Kim, Sungho Kim, Qinghui Liu, Michael C. Kampffmeyer, Robert Jenssen, Arnt B. Salberg, Alexandre Barbosa, Rodrigo Trevisan, Bingchen Zhao, Shaozuo Yu, Siwei Yang, Yin Wang, Hao Sheng, Xiao Chen, Jingyi Su, Ram Rajagopal, Andrew Ng, Van Thong Huynh, Soo-Hyung Kim, In-Seop Na, Ujjwal Baid, Shubham Innani, Prasad Dutande, Bhakti Baheti, Sanjay Talbar, Jianyu Tang
The first Agriculture-Vision Challenge aims to encourage research in developing novel and effective algorithms for agricultural pattern recognition from aerial images, especially for the semantic segmentation task associated with our challenge dataset.
no code implementations • 21 Feb 2020 • Yuanyi Zhong, Alexander Schwing, Jian Peng
In many vision-based reinforcement learning (RL) problems, the agent controls a movable object in its visual field, e.g., the player's avatar in video games and the robotic arm in visual grasping and manipulation.
2 code implementations • CVPR 2020 • Mang Tik Chiu, Xingqian Xu, Yunchao Wei, Zilong Huang, Alexander Schwing, Robert Brunner, Hrant Khachatrian, Hovnatan Karapetyan, Ivan Dozier, Greg Rose, David Wilson, Adrian Tudor, Naira Hovakimyan, Thomas S. Huang, Honghui Shi
To encourage research in computer vision for agriculture, we present Agriculture-Vision: a large-scale aerial farmland image dataset for semantic segmentation of agricultural patterns.
no code implementations • 14 Dec 2019 • Dawit Belayneh, Federico Carminati, Amir Farbin, Benjamin Hooberman, Gulrukh Khattak, Miaoyuan Liu, Junze Liu, Dominick Olivito, Vitória Barin Pacela, Maurizio Pierini, Alexander Schwing, Maria Spiropulu, Sofia Vallecorsa, Jean-Roch Vlimant, Wei Wei, Matt Zhang
These networks can serve as fast and computationally light methods for particle shower simulation and reconstruction for current and future experiments at particle colliders.
1 code implementation • NeurIPS 2019 • Jingxiang Lin, Unnat Jain, Alexander Schwing
Despite impressive recent progress that has been reported on tasks that necessitate reasoning, such as visual question answering and visual dialog, models often exploit biases in datasets.
1 code implementation • NeurIPS 2019 • Colin Graber, Alexander Schwing
For joint inference over multiple variables, a variety of structured prediction techniques have been developed to model correlations among variables and thereby improve predictions.
no code implementations • 25 Sep 2019 • Anwesa Choudhuri, Ashok Vardhan Makkuva, Ranvir Rana, Sewoong Oh, Girish Chowdhary, Alexander Schwing
In fact, contrastive disentanglement and unsupervised recovery are often combined in that we seek additional variations that exhibit salient factors/properties.
no code implementations • NeurIPS Workshop Neuro_AI 2019 • Colin Graber, Ryan Loh, Yurii Vlasov, Alexander Schwing
What can we learn about the functional organization of cortical microcircuits from large-scale recordings of neural activity?
1 code implementation • ICCV 2019 • Tanmay Gupta, Alexander Schwing, Derek Hoiem
Through unsupervised clustering, supervised partitioning, and a zero-shot-like generalization analysis we show that our word embeddings complement text-only embeddings like GloVe by better representing similarities and differences between visual concepts that are difficult to obtain from text corpora alone.
no code implementations • ICCV 2019 • Jyoti Aneja, Harsh Agrawal, Dhruv Batra, Alexander Schwing
We encourage this temporal latent space to capture the 'intention' about how to complete the sentence by mimicking a representation which summarizes the future.
no code implementations • CVPR 2019 • Ishan Deshpande, Yuan-Ting Hu, Ruoyu Sun, Ayis Pyrros, Nasir Siddiqui, Sanmi Koyejo, Zhizhen Zhao, David Forsyth, Alexander Schwing
Generative adversarial nets (GANs) and variational auto-encoders have significantly improved our distribution modeling capabilities, showing promise for dataset augmentation, image-to-image translation and feature learning.
no code implementations • CVPR 2019 • Unnat Jain, Luca Weihs, Eric Kolve, Mohammad Rastegari, Svetlana Lazebnik, Ali Farhadi, Alexander Schwing, Aniruddha Kembhavi
Collaboration is a necessary skill to perform tasks that are beyond one agent's capabilities.
1 code implementation • CVPR 2019 • Idan Schwartz, Seunghak Yu, Tamir Hazan, Alexander Schwing
We address this issue and develop a general attention mechanism for visual dialog which operates on any number of data utilities.
Ranked #1 on Visual Dialog on VisDial v0.9 val
3 code implementations • ICCV 2019 • Tanmay Gupta, Alexander Schwing, Derek Hoiem
We show that for human-object interaction detection a relatively simple factorized model with appearance and layout encodings constructed from pre-trained object detectors outperforms more sophisticated approaches.
no code implementations • NeurIPS 2018 • Mingchao Yu, Zhifeng Lin, Krishna Narra, Songze Li, Youjie Li, Nam Sung Kim, Alexander Schwing, Murali Annavaram, Salman Avestimehr
Data parallelism can boost the training speed of convolutional neural networks (CNNs), but can suffer from significant communication costs caused by gradient aggregation.
no code implementations • NeurIPS 2018 • Youjie Li, Mingchao Yu, Songze Li, Salman Avestimehr, Nam Sung Kim, Alexander Schwing
Distributed training of deep nets is an important technique to address some of the present-day computing challenges like memory consumption and computational demands.
1 code implementation • NeurIPS 2018 • Colin Graber, Ofer Meshi, Alexander Schwing
Deep structured models are widely used for tasks like semantic segmentation, where explicit correlations between variables provide important prior information which generally helps to reduce the data needs of deep nets.
no code implementations • CVPR 2019 • Aditya Deshpande, Jyoti Aneja, Li-Wei Wang, Alexander Schwing, D. A. Forsyth
We achieve the trifecta: (1) High accuracy for the diverse captions as evaluated by standard captioning metrics and user studies; (2) Faster computation of diverse captions compared to beam search and diverse beam search; and (3) High diversity as evaluated by counting novel sentences, distinct n-grams and mutual overlap (i.e., mBleu-4) scores.
1 code implementation • CVPR 2018 • Ishan Deshpande, Ziyu Zhang, Alexander Schwing
While this is particularly true for early GAN formulations, there has been significant empirically motivated and theoretically founded progress to improve stability, for instance, by using the Wasserstein distance rather than the Jensen-Shannon divergence.
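As a small illustration of the Wasserstein distance mentioned above: for two equal-sized one-dimensional samples, the Wasserstein-1 distance between their empirical distributions reduces to the mean absolute difference of the sorted values. The function name below is hypothetical, and the sketch is a generic textbook computation rather than the method of the paper.

```python
def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two equal-sized 1D empirical samples."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes samples of equal size")
    # In one dimension the optimal transport plan matches sorted values.
    xs_sorted = sorted(xs)
    ys_sorted = sorted(ys)
    return sum(abs(a - b) for a, b in zip(xs_sorted, ys_sorted)) / len(xs)


a = [0.0, 1.0, 2.0]
b = [3.0, 2.0, 1.0]  # the same points as `a`, shifted by 1
distance = wasserstein_1d(a, b)  # 1.0
```

The value grows with the size of the shift even when the two samples do not overlap at all, whereas the Jensen-Shannon divergence saturates at log 2 for distributions with disjoint supports.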
no code implementations • CVPR 2018 • Unnat Jain, Svetlana Lazebnik, Alexander Schwing
In addition, for the first time on the visual dialog dataset, we assess the performance of a system asking questions, and demonstrate how visual dialog can be generated from discriminative question generation and question answering.
Ranked #7 on Visual Dialog on VisDial v0.9 val
no code implementations • NeurIPS 2017 • Ofer Meshi, Alexander Schwing
Finding the maximum a posteriori (MAP) assignment is a central task in graphical models.
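To make the MAP task concrete, the following brute-force sketch scores every joint assignment of a tiny chain-structured model with binary variables and keeps the best one. The unary scores and the agreement bonus are made up for illustration, and exhaustive enumeration is used only because the model is tiny; it is not the algorithm studied in the paper.

```python
from itertools import product

# Hypothetical log-potentials: unary[i][x] scores variable i taking state x.
unary = [[0.0, 1.0], [0.5, 0.0], [0.0, 0.2]]
pairwise_bonus = 0.8  # added whenever two neighboring variables agree


def score(assignment):
    """Total score of a joint assignment on the chain."""
    total = sum(unary[i][x] for i, x in enumerate(assignment))
    total += sum(
        pairwise_bonus
        for left, right in zip(assignment, assignment[1:])
        if left == right
    )
    return total


def map_assignment(num_vars, num_states=2):
    """Enumerate all joint assignments and return the highest-scoring one."""
    return max(product(range(num_states), repeat=num_vars), key=score)


best = map_assignment(len(unary))  # (1, 1, 1)
```

Taken alone, the unary scores would pick state 0 for the middle variable, but the agreement bonus pulls the joint optimum to `(1, 1, 1)`. Enumeration costs grow exponentially with the number of variables, which is why dedicated MAP inference methods matter for larger models.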
4 code implementations • CVPR 2018 • Jyoti Aneja, Aditya Deshpande, Alexander Schwing
In recent years, significant progress has been made in image captioning using recurrent neural networks powered by long short-term memory (LSTM) units.
no code implementations • NeurIPS 2017 • Yujia Li, Alexander Schwing, Kuan-Chieh Wang, Richard Zemel
We start from linear discriminators, in which case conjugate duality provides a mechanism to reformulate the saddle point objective into a maximization problem, such that both the generator and the discriminator of this 'dualing GAN' act in concert.
no code implementations • CVPR 2017 • Unnat Jain, Ziyu Zhang, Alexander Schwing
Generating diverse questions for given images is an important task for computational education, entertainment and AI assistants.
no code implementations • 9 Sep 2015 • Beate Franke, Jean-François Plante, Ribana Roscher, Annie Lee, Cathal Smyth, Armin Hatefi, Fuqi Chen, Einat Gil, Alexander Schwing, Alessandro Selvitella, Michael M. Hoffman, Roger Grosse, Dieter Hendricks, Nancy Reid
The need for new methods to deal with big data is a common theme in most scientific fields, although its definition tends to vary with the context.
no code implementations • 8 Oct 2012 • Tamir Hazan, Alexander Schwing, David McAllester, Raquel Urtasun
In this paper we derive an efficient algorithm to learn the parameters of structured predictors in general graphical models.