1 code implementation • ECCV 2020 • Zhongzheng Ren, Zhiding Yu, Xiaodong Yang, Ming-Yu Liu, Alexander G. Schwing, Jan Kautz
Existing work on object detection often relies on a single form of annotation: the model is trained using either accurate yet costly bounding boxes or cheaper but less expressive image-level tags.
no code implementations • ECCV 2020 • Yuan-Ting Hu, Heng Wang, Nicolas Ballas, Kristen Grauman, Alexander G. Schwing
Video inpainting is an important technique for a wide variety of applications from video content editing to video restoration.
no code implementations • 13 Mar 2025 • Xiaoming Zhao, Alexander G. Schwing
Classifier-free guidance has become a staple for conditional generation with denoising diffusion models.
no code implementations • 13 Feb 2025 • Pengsheng Guo, Alexander G. Schwing
We study Variational Rectified Flow Matching, a framework that enhances classic rectified flow matching by modeling multi-modal velocity vector-fields.
no code implementations • 13 Feb 2025 • Jing Wen, Alexander G. Schwing, Shenlong Wang
Generalizable rendering of an animatable human avatar from sparse inputs relies on data priors and inductive biases extracted from training on large data to avoid scene-specific optimization and to enable fast reconstruction.
1 code implementation • 31 Oct 2024 • Kai Yan, Alexander G. Schwing, Yu-Xiong Wang
As our analysis suggests, our experiments show that simply adding TD3 gradients to the finetuning process of ODT effectively improves its online finetuning performance, especially if ODT is pretrained with low-reward offline data.
1 code implementation • CVPR 2024 • Jing Wen, Xiaoming Zhao, Zhongzheng Ren, Alexander G. Schwing, Shenlong Wang
We introduce GoMAvatar, a novel approach for real-time, memory-efficient, high-quality animatable human modeling.
no code implementations • 4 Apr 2024 • Anwesa Choudhuri, Girish Chowdhary, Alexander G. Schwing
Concretely, we connect a multi-scale visual feature extractor and a large language model (LLM) by developing an object abstractor and an object-to-text abstractor.
no code implementations • 2 Dec 2023 • Pengsheng Guo, Hans Hao, Adam Caccavale, Zhongzheng Ren, Edward Zhang, Qi Shan, Aditya Sankar, Alexander G. Schwing, Alex Colburn, Fangchang Ma
Our analysis identifies the core of these challenges as the interaction among noise levels in the 2D diffusion process, the architecture of the diffusion network, and the 3D model representation.
1 code implementation • 2 Nov 2023 • Kai Yan, Alexander G. Schwing, Yu-Xiong Wang
It minimizes the primal Wasserstein distance between the learner and expert state occupancies and leverages a contrastively learned distance metric.
no code implementations • 12 Oct 2023 • Xiaoming Zhao, Alex Colburn, Fangchang Ma, Miguel Angel Bautista, Joshua M. Susskind, Alexander G. Schwing
In contrast, for dynamic scenes, scene-specific optimization techniques exist, but, to the best of our knowledge, there is currently no generalized method for dynamic novel view synthesis from a given monocular video.
1 code implementation • 23 May 2023 • Saba Ghaffari, Ehsan Saleh, Alexander G. Schwing, Yu-Xiong Wang, Martin D. Burke, Saurabh Sinha
Protein design, a grand challenge of the day, involves optimization on a fitness landscape, and leading methods adopt a model-based approach where a model is trained on a training set (protein sequences and fitness) and proposes candidates to explore next.
1 code implementation • CVPR 2023 • Anwesa Choudhuri, Girish Chowdhary, Alexander G. Schwing
We evaluate the proposed approach across three challenging tasks: video instance segmentation, multi-object tracking and segmentation, and video panoptic segmentation.
1 code implementation • 18 Oct 2022 • Kai Yan, Alexander G. Schwing, Yu-Xiong Wang
To better benefit from available demonstrations, we develop a method to Combine Explicit and Implicit Priors (CEIP).
1 code implementation • 14 Oct 2022 • Renan A. Rojas-Gomez, Teck-Yian Lim, Alexander G. Schwing, Minh N. Do, Raymond A. Yeh
We propose learnable polyphase sampling (LPS), a pair of learnable down/upsampling layers that enable truly shift-invariant and equivariant convolutional networks.
no code implementations • 11 Oct 2022 • Peiye Zhuang, Liqian Ma, Oluwasanmi Koyejo, Alexander G. Schwing
Recent work on 3D-aware image synthesis has achieved compelling results using advances in neural rendering.
no code implementations • 9 Oct 2022 • Feng Wang, Manling Li, Xudong Lin, Hairong Lv, Alexander G. Schwing, Heng Ji
Recent advances in pre-training vision-language models like CLIP have shown great potential in learning transferable visual representations.
1 code implementation • 4 Aug 2022 • Xiaoming Zhao, Yuan-Ting Hu, Zhongzheng Ren, Alexander G. Schwing
Specifically, a set of 3D locations within the view-frustum of the camera are first projected independently onto the image and a corresponding feature is subsequently extracted for each 3D location.
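The projection-then-lookup step described above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: it assumes a pinhole camera with intrinsics `K`, points already expressed in the camera frame, and nearest-neighbour feature lookup. The function names `project_points` and `sample_features` are hypothetical.

```python
import numpy as np


def project_points(points_cam, K):
    """Project (N, 3) camera-frame points to (N, 2) pixel coordinates.

    Assumes a pinhole model with a 3x3 intrinsics matrix K (an assumption
    made for this sketch; the source does not specify the camera model).
    """
    uvw = points_cam @ K.T            # homogeneous pixel coordinates, (N, 3)
    return uvw[:, :2] / uvw[:, 2:3]   # perspective divide by depth


def sample_features(feature_map, pixels):
    """Look up one feature vector per pixel from an (H, W, C) feature map.

    Uses nearest-neighbour rounding and clamps to the image border.
    """
    h, w = feature_map.shape[:2]
    cols = np.clip(np.rint(pixels[:, 0]).astype(int), 0, w - 1)
    rows = np.clip(np.rint(pixels[:, 1]).astype(int), 0, h - 1)
    return feature_map[rows, cols]


# Toy usage: two 3D locations, a 48x64 feature map with 2 channels.
K = np.array([[100.0, 0.0, 32.0],
              [0.0, 100.0, 24.0],
              [0.0, 0.0, 1.0]])
points = np.array([[0.0, 0.0, 1.0],
                   [0.1, -0.04, 2.0]])
feature_map = np.arange(48 * 64 * 2, dtype=float).reshape(48, 64, 2)

pixels = project_points(points, K)
features = sample_features(feature_map, pixels)
```

Each 3D location is handled independently, matching the description above. A practical system would more likely interpolate bilinearly than round to the nearest pixel.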
no code implementations • 28 Jul 2022 • Xiaoming Zhao, Zhizhen Zhao, Alexander G. Schwing
While recovery of geometry from image and video data has received a lot of attention in computer vision, methods to capture the texture for a given geometry are less mature.
1 code implementation • 21 Jul 2022 • Xiaoming Zhao, Fangchang Ma, David Güera, Zhile Ren, Alexander G. Schwing, Alex Colburn
What is really needed to make an existing 2D GAN 3D-aware?
2 code implementations • 14 Jul 2022 • Ho Kei Cheng, Alexander G. Schwing
We present XMem, a video object segmentation architecture for long videos with unified feature memory stores inspired by the Atkinson-Shiffrin memory model.
Ranked #1 on Video Object Segmentation on YouTube-VOS 2019 (using extra training data)
no code implementations • CVPR 2022 • Zhongzheng Ren, Aseem Agarwala, Bryan Russell, Alexander G. Schwing, Oliver Wang
We introduce an approach for selecting objects in neural volumetric 3D representations, such as multi-plane images (MPI) and neural radiance fields (NeRF).
no code implementations • 12 May 2022 • Iou-Jen Liu, Xingdi Yuan, Marc-Alexandre Côté, Pierre-Yves Oudeyer, Alexander G. Schwing
In order to study how agents can be taught to query external knowledge via language, we first introduce two new environments: the grid-world-based Q-BabyAI and the text-based Q-TextWorld.
1 code implementation • 7 Apr 2022 • Raymond A. Yeh, Yuan-Ting Hu, Mark Hasegawa-Johnson, Alexander G. Schwing
Designing equivariance as an inductive bias into deep-nets has been a prominent approach to build effective models, e.g., a convolutional neural network incorporates translation equivariance.
1 code implementation • CVPR 2022 • Raymond A. Yeh, Yuan-Ting Hu, Zhongzheng Ren, Alexander G. Schwing
To study question (a), in this work, we propose total variation (TV) minimization as a layer for computer vision.
6 code implementations • 20 Dec 2021 • Bowen Cheng, Anwesa Choudhuri, Ishan Misra, Alexander Kirillov, Rohit Girdhar, Alexander G. Schwing
We find Mask2Former also achieves state-of-the-art performance on video instance segmentation without modifying the architecture, the loss or even the training pipeline.
no code implementations • NeurIPS 2021 • Zhongzheng Ren, Xiaoming Zhao, Alexander G. Schwing
We introduce REDO, a class-agnostic framework to REconstruct the Dynamic Objects from RGBD or calibrated videos.
7 code implementations • CVPR 2022 • Bowen Cheng, Ishan Misra, Alexander G. Schwing, Alexander Kirillov, Rohit Girdhar
While only the semantics of each task differ, current research focuses on designing specialized architectures for each task.
Ranked #3 on Semantic Segmentation on Mapillary val
no code implementations • 6 Aug 2021 • Iou-Jen Liu, Zhongzheng Ren, Raymond A. Yeh, Alexander G. Schwing
We evaluate 'semantic tracklets' on the visual multi-agent particle environment (VMPE) and on the challenging visual multi-agent GFootball environment.
no code implementations • 23 Jul 2021 • Iou-Jen Liu, Unnat Jain, Raymond A. Yeh, Alexander G. Schwing
To address this shortcoming, in this paper, we propose cooperative multi-agent exploration (CMAE): agents share a common goal while exploring.
3 code implementations • NeurIPS 2021 • Bowen Cheng, Alexander G. Schwing, Alexander Kirillov
Overall, the proposed mask classification-based method simplifies the landscape of effective approaches to semantic and panoptic segmentation tasks and shows excellent empirical results.
Ranked #4 on Semantic Segmentation on Mapillary val
1 code implementation • NeurIPS 2021 • Ameya D. Patil, Michael Tuttle, Alexander G. Schwing, Naresh R. Shanbhag
Classical adversarial training (AT) frameworks are designed to achieve high adversarial accuracy against a single attack type, typically $\ell_\infty$ norm-bounded perturbations.
no code implementations • CVPR 2021 • Yuan-Ting Hu, Jiahong Wang, Raymond A. Yeh, Alexander G. Schwing
Moreover, existing image-based datasets for mesh reconstruction do not permit the study of models that integrate temporal information.
1 code implementation • CVPR 2021 • Zhongzheng Ren, Ishan Misra, Alexander G. Schwing, Rohit Girdhar
We introduce WyPR, a Weakly-supervised framework for Point cloud Recognition, requiring only scene-level class tags as supervision.
no code implementations • 13 May 2021 • Safa Messaoud, Ismini Lourentzou, Assma Boughoula, Mona Zehni, Zhizhen Zhao, ChengXiang Zhai, Alexander G. Schwing
The recent growth of web video sharing platforms has increased the demand for systems that can efficiently browse, retrieve and summarize video content.
2 code implementations • ICLR 2021 • Peiye Zhuang, Oluwasanmi Koyejo, Alexander G. Schwing
Controllable semantic image editing enables a user to change entire image attributes with a few clicks, e.g., gradually making a summer scene look like it was taken in winter.
1 code implementation • ICCV 2021 • Anwesa Choudhuri, Girish Chowdhary, Alexander G. Schwing
In contrast, we formulate a global method for MOTS over the space of assignments rather than detections: First, we find all top-k assignments of objects detected and segmented between any two consecutive frames and develop a structured prediction formulation to score assignment sequences across any number of consecutive frames.
1 code implementation • NeurIPS 2020 • Iou-Jen Liu, Raymond A. Yeh, Alexander G. Schwing
In contrast, asynchronous methods achieve high throughput but suffer from stability issues and lower sample efficiency due to 'stale policies.'
no code implementations • 21 Oct 2020 • Zhongzheng Ren, Zhiding Yu, Xiaodong Yang, Ming-Yu Liu, Alexander G. Schwing, Jan Kautz
Existing work on object detection often relies on a single form of annotation: the model is trained using either accurate yet costly bounding boxes or cheaper but less expressive image-level tags.
no code implementations • NeurIPS 2020 • Zhongzheng Ren, Raymond A. Yeh, Alexander G. Schwing
Existing semi-supervised learning (SSL) algorithms use a single weight to balance the loss of labeled and unlabeled examples, i.e., all unlabeled examples are equally weighted.
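As a rough illustration of the single-weight setup described above, the objective can be written as follows. The notation is assumed for this sketch and does not come from the source: $\mathcal{D}_l$ and $\mathcal{D}_u$ denote the labeled and unlabeled sets, $\ell_s$ and $\ell_u$ the supervised and unsupervised per-example losses, and $\lambda$ the single balancing weight.

$$
\mathcal{L}(\theta) \;=\; \sum_{(x,y)\in\mathcal{D}_l} \ell_s(x, y; \theta) \;+\; \lambda \sum_{u\in\mathcal{D}_u} \ell_u(u; \theta)
$$

Because $\lambda$ sits outside the second sum, every unlabeled example contributes with the same weight. Weighting examples unequally would instead correspond to moving a per-example weight $\lambda_u$ inside that sum.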
no code implementations • CVPR 2020 • Safa Messaoud, Maghav Kumar, Alexander G. Schwing
In this paper, we show that we can learn program heuristics, i.e., policies, for solving inference in higher order CRFs for the task of semantic segmentation, using reinforcement learning.
2 code implementations • CVPR 2020 • Zhongzheng Ren, Zhiding Yu, Xiaodong Yang, Ming-Yu Liu, Yong Jae Lee, Alexander G. Schwing, Jan Kautz
Weakly supervised learning has emerged as a compelling tool for object detection by reducing the need for strong supervision during training.
Ranked #1 on Weakly Supervised Object Detection on COCO test-dev
1 code implementation • NeurIPS 2019 • Raymond A. Yeh, Yuan-Ting Hu, Alexander G. Schwing
We propose Chirality Nets, a family of deep nets that is equivariant to the "chirality transform," i.e., the transformation to create a chiral pair.
2 code implementations • 31 Oct 2019 • Iou-Jen Liu, Raymond A. Yeh, Alexander G. Schwing
Sample efficiency and scalability to a large number of agents are two important goals for multi-agent reinforcement learning systems.
1 code implementation • NeurIPS 2019 • Tiantian Fang, Alexander G. Schwing
Inferring the most likely configuration for a subset of variables of a joint distribution given the remaining ones, which we refer to as co-generation, is an important challenge that is computationally demanding for all but the simplest settings.
1 code implementation • NeurIPS 2019 • Jingxiang Lin, Unnat Jain, Alexander G. Schwing
Despite impressive recent progress that has been reported on tasks that necessitate reasoning, such as visual question answering and visual dialog, models often exploit biases in datasets.
no code implementations • 13 Jul 2019 • Peiye Zhuang, Alexander G. Schwing, Sanmi Koyejo
Thus, our results suggest that data augmentation via synthesis is a promising approach to address the limited availability of fMRI data, and to improve the quality of predictive fMRI models.
no code implementations • ICLR 2019 • Iou-Jen Liu, Jian Peng, Alexander G. Schwing
A zoo of deep nets is available these days for almost any given task, and it is increasingly unclear which net to start with when addressing a new task, or which net to use as an initialization for fine-tuning a new model.
no code implementations • NeurIPS 2018 • Medhini Narasimhan, Svetlana Lazebnik, Alexander G. Schwing
Given a question-image pair, deep network techniques have been employed to successively reduce the large set of facts until one of the two entities of the final remaining fact is predicted as the answer.
no code implementations • ECCV 2018 • Safa Messaoud, David Forsyth, Alexander G. Schwing
Colorizing a given gray-level image is an important task in the media and advertising industry.
no code implementations • ECCV 2018 • Medhini Narasimhan, Alexander G. Schwing
Question answering is an important task for autonomous agents and virtual assistants alike and was shown to support the disabled in efficiently navigating an overwhelming environment.
no code implementations • ECCV 2018 • Yuan-Ting Hu, Jia-Bin Huang, Alexander G. Schwing
Due to the formulation as a prediction task, most of these methods require fine-tuning during test time, such that the deep nets memorize the appearance of the objects of interest in the given video.
no code implementations • ECCV 2018 • Yuan-Ting Hu, Jia-Bin Huang, Alexander G. Schwing
We even demonstrate competitive results comparable to deep learning based methods in the semi-supervised setting on the DAVIS dataset.
Ranked #4 on Video Salient Object Detection on DAVSOD-Difficult20 (using extra training data)
no code implementations • ECCV 2018 • Moitreya Chatterjee, Alexander G. Schwing
Paragraph generation from images, which has gained popularity recently, is an important task for video summarization, editing, and support of the disabled.
no code implementations • CVPR 2018 • Raymond A. Yeh, Minh N. Do, Alexander G. Schwing
Textual grounding, i.e., linking words to objects in images, is a challenging but important task for robotics and human-computer interaction.
no code implementations • NeurIPS 2017 • Raymond A. Yeh, JinJun Xiong, Wen-mei W. Hwu, Minh N. Do, Alexander G. Schwing
Textual grounding is an important but challenging task for human-computer interaction, robotics and knowledge mining.
no code implementations • NeurIPS 2017 • Yuan-Ting Hu, Jia-Bin Huang, Alexander G. Schwing
Instance level video object segmentation is an important technique for video editing and compression.
no code implementations • ICLR 2018 • Keyi Yu, Yang Liu, Alexander G. Schwing, Jian Peng
Recent advances in recurrent neural nets (RNNs) have shown much promise in many applications in natural language processing.
no code implementations • ICLR 2018 • Peiye Zhuang, Alexander G. Schwing, Oluwasanmi Koyejo
Our classification results provide a quantitative evaluation of the quality of the generated images, and also serve as an additional contribution of this manuscript.
no code implementations • NeurIPS 2017 • Liwei Wang, Alexander G. Schwing, Svetlana Lazebnik
This paper explores image caption generation using conditional variational auto-encoders (CVAEs).
1 code implementation • NeurIPS 2017 • Idan Schwartz, Alexander G. Schwing, Tamir Hazan
The quest for algorithms that enable cognitive abilities is an important part of machine learning.
1 code implementation • 5 Nov 2016 • Frank S. He, Yang Liu, Alexander G. Schwing, Jian Peng
We propose a novel training algorithm for reinforcement learning which combines the strength of deep Q-learning with a constrained optimization approach to tighten optimality and encourage faster reward propagation.
7 code implementations • CVPR 2017 • Raymond A. Yeh, Chen Chen, Teck Yian Lim, Alexander G. Schwing, Mark Hasegawa-Johnson, Minh N. Do
In this paper, we propose a novel method for semantic image inpainting, which generates the missing content by conditioning on the available data.
4 code implementations • CVPR 2016 • Wenjie Luo, Alexander G. Schwing, Raquel Urtasun
In the past year, convolutional neural networks have been shown to perform extremely well for stereo estimation.
1 code implementation • 19 Nov 2015 • Yang Song, Alexander G. Schwing, Richard S. Zemel, Raquel Urtasun
Supervised training of deep neural nets typically relies on minimizing cross-entropy.
no code implementations • CVPR 2015 • Chenxi Liu, Alexander G. Schwing, Kaustav Kundu, Raquel Urtasun, Sanja Fidler
What sets us apart from past work in layout estimation is the use of floor plans as a source of prior knowledge, as well as localization of each image within a bigger space (apartment).
no code implementations • CVPR 2015 • Jia Xu, Alexander G. Schwing, Raquel Urtasun
Despite the promising performance of conventional fully supervised algorithms, semantic segmentation has remained an important, yet challenging task.
no code implementations • ICCV 2015 • Ziyu Zhang, Alexander G. Schwing, Sanja Fidler, Raquel Urtasun
In this paper we tackle the problem of instance-level segmentation and depth ordering from a single monocular image.
no code implementations • 9 Mar 2015 • Alexander G. Schwing, Raquel Urtasun
Convolutional neural networks with many layers have recently been shown to achieve excellent results on many high-level tasks such as image classification, object detection and more recently also semantic segmentation.
no code implementations • 9 Jul 2014 • Liang-Chieh Chen, Alexander G. Schwing, Alan L. Yuille, Raquel Urtasun
Towards this goal, we propose a training algorithm that is able to learn structured models jointly with deep features that form the MRF potentials.
no code implementations • CVPR 2014 • Andrea Cohen, Alexander G. Schwing, Marc Pollefeys
We propose a sequential optimization technique for segmenting a rectified image of a facade into semantic categories.
no code implementations • CVPR 2014 • Jia Xu, Alexander G. Schwing, Raquel Urtasun
We tackle the problem of weakly labeled semantic segmentation, where the only source of annotation is image tags encoding which classes are present in the scene.