no code implementations • 11 Dec 2024 • Andrew Szot, Bogdan Mazoure, Omar Attia, Aleksei Timofeev, Harsh Agrawal, Devon Hjelm, Zhe Gan, Zsolt Kira, Alexander Toshev
We examine the capability of Multimodal Large Language Models (MLLMs) to tackle diverse domains that extend beyond the traditional language and vision tasks these models are typically trained on.
1 code implementation • 5 Dec 2024 • Shaunak Halbe, Junjiao Tian, K J Joseph, James Seale Smith, Katherine Stevo, Vineeth N Balasubramanian, Zsolt Kira
In this paper, we propose GRAIN, a new pretraining strategy aimed at aligning representations at both fine and coarse levels simultaneously.
no code implementations • 14 Nov 2024 • Matthew Hull, Chao Zhang, Zsolt Kira, Duen Horng Chau
Differentiable rendering methods have emerged as a promising means for generating photo-realistic and physically plausible adversarial attacks by manipulating 3D objects and scenes that can deceive deep neural networks (DNNs).
1 code implementation • 3 Nov 2024 • Junjiao Tian, Chengyue Huang, Zsolt Kira
This exploration is beneficial for finding good loss basins when training from scratch.
1 code implementation • 26 Oct 2024 • Muhammad Zubair Irshad, Mauro Comi, Yen-Chen Lin, Nick Heppert, Abhinav Valada, Rares Ambrus, Zsolt Kira, Jonathan Tremblay
Neural Fields have emerged as a transformative approach for 3D scene representation in computer vision and robotics, enabling accurate inference of geometry, 3D semantics, and dynamics from posed 2D data.
1 code implementation • 3 Oct 2024 • Ahmad Elawady, Gunjan Chhablani, Ram Ramrakhya, Karmesh Yadav, Dhruv Batra, Zsolt Kira, Andrew Szot
Intelligent embodied agents need to quickly adapt to new scenarios by integrating long histories of experience into decision-making.
no code implementations • 9 Jul 2024 • Sriram Yenamandra, Arun Ramachandran, Mukul Khanna, Karmesh Yadav, Jay Vakil, Andrew Melnik, Michael Büttner, Leon Harz, Lyon Brown, Gora Chand Nandi, Arjun PS, Gaurav Kumar Yadav, Rahul Kala, Robert Haschke, Yang Luo, Jinxin Zhu, Yansen Han, Bingyi Lu, Xuan Gu, Qinyuan Liu, Yaping Zhao, Qiting Ye, Chenxiao Dou, Yansong Chua, Volodymyr Kuzma, Vladyslav Humennyy, Ruslan Partsey, Jonathan Francis, Devendra Singh Chaplot, Gunjan Chhablani, Alexander Clegg, Theophile Gervet, Vidhi Jain, Ram Ramrakhya, Andrew Szot, Austin Wang, Tsung-Yen Yang, Aaron Edsinger, Charlie Kemp, Binit Shah, Zsolt Kira, Dhruv Batra, Roozbeh Mottaghi, Yonatan Bisk, Chris Paxton
In order to develop robots that can effectively serve as versatile and capable home assistants, it is crucial for them to reliably perceive and interact with a wide variety of objects across diverse environments.
1 code implementation • 24 Jun 2024 • Abhinav Narayan Harish, Larry Heck, Josiah P. Hanna, Zsolt Kira, Andrew Szot
We present Reinforcement Learning via Auxiliary Task Distillation (AuxDistill), a new method that enables reinforcement learning (RL) to perform long-horizon robot control problems by distilling behaviors from auxiliary RL tasks.
no code implementations • 12 Jun 2024 • Vishnu Jaganathan, Hannah Hanyun Huang, Muhammad Zubair Irshad, Varun Jampani, Amit Raj, Zsolt Kira
Our framework enables a wide variety of editing tasks such as manual local edits, correspondence based style transfer from any example image, and a combination of different styles from multiple example images.
no code implementations • 12 Jun 2024 • Andrew Szot, Bogdan Mazoure, Harsh Agrawal, Devon Hjelm, Zsolt Kira, Alexander Toshev
For discrete actions, we demonstrate that semantically aligning these actions with the native output token space of the MLLM leads to the strongest performance.
1 code implementation • 9 May 2024 • Gunshi Gupta, Karmesh Yadav, Yarin Gal, Dhruv Batra, Zsolt Kira, Cong Lu, Tim G. J. Rudner
This has led to the emergence of pre-trained vision-language models as a tool for transferring representations learned from internet-scale data to downstream tasks and new domains.
no code implementations • 18 Apr 2024 • James Seale Smith, Lazar Valkov, Shaunak Halbe, Vyshnavi Gutta, Rogerio Feris, Zsolt Kira, Leonid Karlinsky
This continual learning (CL) phenomenon has been extensively studied, but primarily in a setting where only a small amount of past data can be stored.
1 code implementation • CVPR 2024 • Mukul Khanna, Ram Ramrakhya, Gunjan Chhablani, Sriram Yenamandra, Theophile Gervet, Matthew Chang, Zsolt Kira, Devendra Singh Chaplot, Dhruv Batra, Roozbeh Mottaghi
The Embodied AI community has made significant strides in visual navigation tasks, exploring targets from 3D coordinates, objects, language descriptions, and images.
1 code implementation • 1 Apr 2024 • Muhammad Zubair Irshad, Sergey Zakharov, Vitor Guizilini, Adrien Gaidon, Zsolt Kira, Rares Ambrus
Given the capabilities of neural fields in densely representing a 3D scene from 2D images, we ask the question: Can we scale their self-supervised pretraining, specifically using masked autoencoders, to generate effective 3D representations from posed RGB images?
no code implementations • CVPR 2024 • Ram Ramrakhya, Aniruddha Kembhavi, Dhruv Batra, Zsolt Kira, Kuo-Hao Zeng, Luca Weihs
Datasets for image description are typically constructed by curating relevant images and asking humans to annotate the contents of the image; neither of those two steps is straightforward for objects not present in the image.
no code implementations • CVPR 2024 • Junjiao Tian, Lavisha Aggarwal, Andrea Colaco, Zsolt Kira, Mar Gonzalez-Franco
The proposed method does not require any training or language dependency to extract quality segmentation for any images.
no code implementations • 14 Dec 2023 • Yafei Hu, Quanting Xie, Vidhi Jain, Jonathan Francis, Jay Patrikar, Nikhil Keetha, Seungchan Kim, Yaqi Xie, Tianyi Zhang, Hao-Shu Fang, Shibo Zhao, Shayegan Omidshafiei, Dong-Ki Kim, Ali-akbar Agha-mohammadi, Katia Sycara, Matthew Johnson-Roberson, Dhruv Batra, Xiaolong Wang, Sebastian Scherer, Chen Wang, Zsolt Kira, Fei Xia, Yonatan Bisk
Motivated by the impressive open-set performance and content generation capabilities of web-scale, large-capacity pre-trained models (i.e., foundation models) in research fields such as Natural Language Processing (NLP) and Computer Vision (CV), we devote this survey to exploring (i) how these existing foundation models from NLP and CV can be applied to the field of general-purpose robotics, and also exploring (ii) what a robotics-specific foundation model would look like.
no code implementations • 30 Nov 2023 • James Seale Smith, Yen-Chang Hsu, Zsolt Kira, Yilin Shen, Hongxia Jin
We show that STAMINA outperforms the prior SOTA for the setting of text-to-image continual customization on a 50-concept benchmark composed of landmarks and human faces, with no stored replay data.
1 code implementation • 26 Nov 2023 • Jann Goschenhofer, Bernd Bischl, Zsolt Kira
Constrained clustering allows the training of classification models using pairwise constraints only, which are weak and relatively easy to mine, while still yielding full-supervision-level model performance.
1 code implementation • NeurIPS 2023 • Yash Jain, Harkirat Behl, Zsolt Kira, Vibhav Vineet
Construction of a universal detector poses a crucial question: How can we most effectively train a model on a large mixture of datasets?
1 code implementation • 19 Oct 2023 • Mayank Lunayach, Sergey Zakharov, Dian Chen, Rares Ambrus, Zsolt Kira, Muhammad Zubair Irshad
In this work, we address the challenging task of 3D object recognition without the reliance on real-world 3D labeled data.
4 code implementations • 19 Oct 2023 • Xavier Puig, Eric Undersander, Andrew Szot, Mikael Dallaire Cote, Tsung-Yen Yang, Ruslan Partsey, Ruta Desai, Alexander William Clegg, Michal Hlavac, So Yeon Min, Vladimír Vondruš, Theophile Gervet, Vincent-Pierre Berges, John M. Turner, Oleksandr Maksymets, Zsolt Kira, Mrinal Kalakrishnan, Jitendra Malik, Devendra Singh Chaplot, Unnat Jain, Dhruv Batra, Akshara Rai, Roozbeh Mottaghi
We present Habitat 3.0: a simulation platform for studying collaborative human-robot tasks in home environments.
no code implementations • 28 Sep 2023 • Benjamin Hoover, Hendrik Strobelt, Dmitry Krotov, Judy Hoffman, Zsolt Kira, Duen Horng Chau
The generative process of Diffusion Models (DMs) has recently set the state of the art on many AI generation benchmarks.
1 code implementation • 28 Aug 2023 • Ran Liu, Sahil Khose, Jingyun Xiao, Lakshmi Sathidevi, Keerthan Ramnath, Zsolt Kira, Eva L. Dyer
To address this challenge, we propose a novel approach for distribution-aware latent augmentation that leverages the relationships across samples to guide the augmentation procedure.
2 code implementations • ICCV 2023 • Muhammad Zubair Irshad, Sergey Zakharov, Katherine Liu, Vitor Guizilini, Thomas Kollar, Adrien Gaidon, Zsolt Kira, Rares Ambrus
NeO 360's representation allows us to learn from a large collection of unbounded 3D scenes while offering generalizability to new views and novel scenes from as few as a single image during inference.
Ranked #1 on Generalizable Novel View Synthesis on NERDS 360
1 code implementation • 23 Aug 2023 • Junjiao Tian, Lavisha Aggarwal, Andrea Colaco, Zsolt Kira, Mar Gonzalez-Franco
The proposed method does not require any training or language dependency to extract quality segmentation for any images.
Ranked #1 on Semantic Segmentation on COCO-Stuff-27
no code implementations • 20 Jun 2023 • Sriram Yenamandra, Arun Ramachandran, Karmesh Yadav, Austin Wang, Mukul Khanna, Theophile Gervet, Tsung-Yen Yang, Vidhi Jain, Alexander William Clegg, John Turner, Zsolt Kira, Manolis Savva, Angel Chang, Devendra Singh Chaplot, Dhruv Batra, Roozbeh Mottaghi, Yonatan Bisk, Chris Paxton
HomeRobot (noun): An affordable compliant robot that navigates homes and manipulates a wide range of objects in order to complete everyday tasks.
1 code implementation • 16 Jun 2023 • Shaunak Halbe, James Seale Smith, Junjiao Tian, Zsolt Kira
In this paper, we attempt to tackle forgetting and heterogeneity while minimizing overhead costs and without requiring access to any stored data.
no code implementations • 31 May 2023 • Andrew Szot, Unnat Jain, Dhruv Batra, Zsolt Kira, Ruta Desai, Akshara Rai
We present the task of "Social Rearrangement", consisting of cooperative everyday tasks like setting up the dinner table, tidying a house or unpacking groceries in a simulated multi-agent environment.
no code implementations • CVPR 2023 • Chia-Wen Kuo, Zsolt Kira
The image captioning model encodes each view independently with a shared encoder efficiently, and a contrastive loss is incorporated across the encoded views in a novel way to improve their representation quality and the model's data efficiency.
1 code implementation • NeurIPS 2023 • Chen-Hao Chao, Wei-Fang Sun, Yen-Chang Hsu, Zsolt Kira, Chun-Yi Lee
In this paper, we establish a connection between the parameterization of flow-based and energy-based generative models, and present a new flow-based modeling approach called energy-based normalizing flow (EBFlow).
no code implementations • 17 May 2023 • Rabah Ouldnoughi, Chia-Wen Kuo, Zsolt Kira
Generalized Category Discovery (GCD) requires a model to both classify known categories and cluster unknown categories in unlabeled data.
1 code implementation • 21 Apr 2023 • Harsh Maheshwari, Yen-Cheng Liu, Zsolt Kira
We create the first benchmark for semi-supervised multi-modal semantic segmentation and also report the robustness to missing modalities.
Ranked #1 on Semantic Segmentation on SUN-RGBD (Mean IoU (test) metric, using extra training data)
no code implementations • 12 Apr 2023 • James Seale Smith, Yen-Chang Hsu, Lingyu Zhang, Ting Hua, Zsolt Kira, Yilin Shen, Hongxia Jin
We show that C-LoRA not only outperforms several baselines for our proposed setting of text-to-image continual customization, which we refer to as Continual Diffusion, but that we achieve a new state-of-the-art in the well-established rehearsal-free continual learning setting for image classification.
no code implementations • 28 Mar 2023 • Andrew Szot, Amy Zhang, Dhruv Batra, Zsolt Kira, Franziska Meier
How well do reward functions learned with inverse reinforcement learning (IRL) generalize?
2 code implementations • CVPR 2023 • Junjiao Tian, Xiaoliang Dai, Chih-Yao Ma, Zecheng He, Yen-Cheng Liu, Zsolt Kira
To solve this problem, we propose Trainable Projected Gradient Method (TPGM) to automatically learn the constraint imposed for each layer for a fine-grained fine-tuning regularization.
no code implementations • 14 Mar 2023 • Karmesh Yadav, Arjun Majumdar, Ram Ramrakhya, Naoki Yokoyama, Alexei Baevski, Zsolt Kira, Oleksandr Maksymets, Dhruv Batra
We present a single neural network architecture composed of task-agnostic components (ViTs, convolutions, and LSTMs) that achieves state-of-the-art results on both the ImageNav ("go to location in <this picture>") and ObjectNav ("find a chair") tasks without any task-specific modules like object detection, segmentation, mapping, or planning modules.
no code implementations • 10 Mar 2023 • Nathaniel Moore Glaser, Zsolt Kira
This paper addresses the task of joint multi-agent perception and planning, especially as it relates to the real-world challenge of collision-free navigation for connected self-driving vehicles.
no code implementations • 8 Dec 2022 • Indranil Sur, Zachary Daniels, Abrar Rahman, Kamil Faber, Gianmarco J. Gallardo, Tyler L. Hayes, Cameron E. Taylor, Mustafa Burak Gurbuz, James Smith, Sahana Joshi, Nathalie Japkowicz, Michael Baron, Zsolt Kira, Christopher Kanan, Roberto Corizzo, Ajay Divakaran, Michael Piacentino, Jesse Hostetler, Aswin Raghavan
In this paper, we introduce the Lifelong Reinforcement Learning Components Framework (L2RLCF), which standardizes L2RL systems and assimilates different continual learning components (each addressing different aspects of the lifelong learning problem) into a unified system.
2 code implementations • CVPR 2023 • James Seale Smith, Leonid Karlinsky, Vyshnavi Gutta, Paola Cascante-Bonilla, Donghyun Kim, Assaf Arbelle, Rameswar Panda, Rogerio Feris, Zsolt Kira
Our experiments show that we outperform the current SOTA method DualPrompt on established benchmarks by as much as 4.5% in average final accuracy.
no code implementations • 20 Nov 2022 • Chia-Wen Kuo, Chih-Yao Ma, Judy Hoffman, Zsolt Kira
In Vision-and-Language Navigation (VLN), researchers typically take an image encoder pre-trained on ImageNet without fine-tuning on the environments that the agent will be trained or tested on.
1 code implementation • CVPR 2023 • James Seale Smith, Paola Cascante-Bonilla, Assaf Arbelle, Donghyun Kim, Rameswar Panda, David Cox, Diyi Yang, Zsolt Kira, Rogerio Feris, Leonid Karlinsky
This leads to reasoning mistakes, which need to be corrected as they occur by teaching VL models the missing SVLC skills; often this must be done using private data where the issue was found, which naturally leads to a data-free continual (no task-id) VL learning setting.
no code implementations • 7 Oct 2022 • Yen-Cheng Liu, Chih-Yao Ma, Junjiao Tian, Zijian He, Zsolt Kira
Specifically, Polyhistor achieves competitive accuracy compared to the state-of-the-art while only using ~10% of their trainable parameters.
1 code implementation • 21 Sep 2022 • Junjiao Tian, James Seale Smith, Zsolt Kira
For the more typical applications of FL where the number of clients is large (e.g., edge-device and mobile applications), these methods cannot be applied, motivating the need for a stateless approach to heterogeneous FL which can be used for any number of clients.
no code implementations • 15 Sep 2022 • Farrukh Rahman, Ömer Mubarek, Zsolt Kira
Our work empirically explores the low data regime for video classification and discovers that, surprisingly, transformers perform extremely well in the low-labeled video setting compared to CNNs.
no code implementations • 29 Aug 2022 • Yen-Cheng Liu, Chih-Yao Ma, Xiaoliang Dai, Junjiao Tian, Peter Vajda, Zijian He, Zsolt Kira
To address this problem, we consider online and offline OOD detection modules, which are integrated with SSOD methods.
2 code implementations • 27 Jul 2022 • Muhammad Zubair Irshad, Sergey Zakharov, Rares Ambrus, Thomas Kollar, Zsolt Kira, Adrien Gaidon
A novel disentangled shape and appearance database of priors is first learned to embed objects in their respective shape and appearance space.
1 code implementation • CVPR 2022 • Yen-Cheng Liu, Chih-Yao Ma, Zsolt Kira
In this paper, we present Unbiased Teacher v2, which generalizes the SS-OD method to anchor-free detectors and also introduces the Listen2Student mechanism for the unsupervised regression loss.
no code implementations • 16 Jun 2022 • Mayank Lunayach, James Smith, Zsolt Kira
Online few-shot learning describes a setting where models are trained and evaluated on a stream of data while learning emerging classes.
1 code implementation • CVPR 2022 • Chia-Wen Kuo, Zsolt Kira
A key limitation of such methods, however, is that the output of the model is conditioned only on the object detector's outputs.
Ranked #12 on Image Captioning on COCO Captions
no code implementations • 31 Mar 2022 • James Seale Smith, Junjiao Tian, Shaunak Halbe, Yen-Chang Hsu, Zsolt Kira
Next, we explore how to leverage knowledge from a pre-trained model in rehearsal-free continual learning and find that vanilla L2 parameter regularization outperforms EWC parameter regularization and feature distillation.
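As a rough illustration of the vanilla L2 parameter regularization mentioned in the entry above, the sketch below penalizes the squared distance between the current weights and a frozen reference copy. The function names, the flat-list parameter representation, and the choice of reference (for example, a pre-trained checkpoint) are assumptions made for illustration and are not taken from the paper.

```python
def l2_parameter_penalty(params, reference_params, strength):
    """Return 0.5 * strength * ||params - reference_params||^2.

    `params` and `reference_params` are flat lists of floats (hypothetical
    representation); `strength` is the regularization coefficient.
    """
    if len(params) != len(reference_params):
        raise ValueError("parameter lists must have the same length")
    squared_distance = sum(
        (p - p_ref) ** 2 for p, p_ref in zip(params, reference_params)
    )
    return 0.5 * strength * squared_distance


def l2_parameter_penalty_gradient(params, reference_params, strength):
    """Gradient of the penalty with respect to `params`."""
    if len(params) != len(reference_params):
        raise ValueError("parameter lists must have the same length")
    # Each weight is pulled back toward its reference value with equal force.
    return [strength * (p - p_ref) for p, p_ref in zip(params, reference_params)]
```

In training, this penalty would be added to the task loss. EWC differs in that each squared term is scaled by a per-parameter importance estimate, whereas the vanilla form above weights all parameters uniformly.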
no code implementations • 18 Mar 2022 • Yen-Chang Hsu, James Smith, Yilin Shen, Zsolt Kira, Hongxia Jin
Knowledge distillation (KD) is a substantial strategy for transferring learned knowledge from one neural network model to another.
3 code implementations • 3 Mar 2022 • Muhammad Zubair Irshad, Thomas Kollar, Michael Laskey, Kevin Stone, Zsolt Kira
This paper studies the complex task of simultaneous multi-object 3D reconstruction, 6D pose and size estimation from a single-view RGB-D observation.
Ranked #1 on 6D Pose Estimation using RGBD on CAMERA25
no code implementations • 28 Oct 2021 • Junjiao Tian, Yen-Chang Hsu, Yilin Shen, Hongxia Jin, Zsolt Kira
We are the first to propose a method that works well across both OOD detection and calibration and under different types of shifts.
1 code implementation • NeurIPS 2021 • Junjiao Tian, Dylan Yung, Yen-Chang Hsu, Zsolt Kira
It is well known that vision classification models suffer from poor calibration in the face of data distribution shifts.
1 code implementation • 11 Oct 2021 • Yousef Emam, Gennaro Notomista, Paul Glotfelter, Zsolt Kira, Magnus Egerstedt
Reinforcement Learning (RL) has been shown to be effective in many scenarios.
no code implementations • 29 Sep 2021 • Junjiao Tian, Yen-Chang Hsu, Yilin Shen, Hongxia Jin, Zsolt Kira
To this end, we theoretically derive two score functions for OOD detection, the covariate shift score and concept shift score, based on the decomposition of KL-divergence for both scores, and propose a geometrically-inspired method (Geometric ODIN) to improve OOD detection under both shifts with only in-distribution data.
no code implementations • 29 Sep 2021 • Zhuoran Yu, Yen-Cheng Liu, Chih-Yao Ma, Zsolt Kira
Inspired by the fact that teacher/student pseudo-labeling approaches result in a weak and sparse gradient signal due to the difficulty of confidence-thresholding, CrossMatch leverages multi-scale feature extraction in object detection.
no code implementations • 1 Jul 2021 • Nathaniel Glaser, Yen-Cheng Liu, Junjiao Tian, Zsolt Kira
In this paper, we address bandwidth-limited and obstruction-prone collaborative perception, specifically in the context of multi-agent semantic segmentation.
no code implementations • 1 Jul 2021 • Nathaniel Glaser, Yen-Cheng Liu, Junjiao Tian, Zsolt Kira
In this paper, we address the multi-robot collaborative perception problem, specifically in the context of multi-view infilling for distributed semantic segmentation.
6 code implementations • NeurIPS 2021 • Andrew Szot, Alex Clegg, Eric Undersander, Erik Wijmans, Yili Zhao, John Turner, Noah Maestre, Mustafa Mukadam, Devendra Chaplot, Oleksandr Maksymets, Aaron Gokaslan, Vladimir Vondrus, Sameer Dharur, Franziska Meier, Wojciech Galuba, Angel Chang, Zsolt Kira, Vladlen Koltun, Jitendra Malik, Manolis Savva, Dhruv Batra
We introduce Habitat 2.0 (H2.0), a simulation platform for training virtual robots in interactive 3D environments and complex physics-enabled scenarios.
1 code implementation • 28 Jun 2021 • Junjiao Tian, Niluthpol Mithun, Zach Seymour, Han-Pang Chiu, Zsolt Kira
There are two major drawbacks to these methods: 1) constantly up-weighting minority classes can introduce excessive false positives in semantic segmentation; 2) a minority class is not necessarily a hard class.
2 code implementations • ICCV 2021 • James Smith, Yen-Chang Hsu, Jonathan Balloch, Yilin Shen, Hongxia Jin, Zsolt Kira
Modern computer vision applications suffer from catastrophic forgetting when incrementally learning new concepts over time.
Ranked #5 on Class Incremental Learning on cifar100
1 code implementation • 16 Mar 2021 • Jingdao Chen, Zsolt Kira, Yong K. Cho
3D point cloud segmentation is an important function that helps robots understand the layout of their surrounding environment and perform tasks such as grasping objects, avoiding obstacles, and finding landmarks.
4 code implementations • ICLR 2021 • Yen-Cheng Liu, Chih-Yao Ma, Zijian He, Chia-Wen Kuo, Kan Chen, Peizhao Zhang, Bichen Wu, Zsolt Kira, Peter Vajda
To address this, we introduce Unbiased Teacher, a simple yet effective approach that jointly trains a student and a gradually progressing teacher in a mutually-beneficial manner.
1 code implementation • 23 Jan 2021 • James Smith, Jonathan Balloch, Yen-Chang Hsu, Zsolt Kira
Our work investigates whether we can significantly reduce this memory budget by leveraging unlabeled data from an agent's environment in a realistic and challenging continual learning paradigm.
1 code implementation • 1 Jan 2021 • Junjiao Tian, Niluthpol Chowdhury Mithun, Zachary Seymour, Han-Pang Chiu, Zsolt Kira
Many works have proposed to weigh the standard cross entropy loss function with pre-computed weights based on class statistics such as the number of samples and class margins.
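The class-weighted cross entropy described in the entry above can be sketched as follows. This is a minimal example that assumes inverse-frequency weights computed once from per-class sample counts; the function names and the normalization are illustrative choices, not the scheme of any particular paper listed here.

```python
import math


def inverse_frequency_weights(class_counts):
    """Pre-compute one weight per class, proportional to 1 / sample count.

    Normalized so that a perfectly balanced dataset gives every class 1.0.
    """
    total = sum(class_counts)
    num_classes = len(class_counts)
    return [total / (num_classes * count) for count in class_counts]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    norm = sum(exps)
    return [e / norm for e in exps]


def weighted_cross_entropy(logits, target, weights):
    """Cross entropy for one sample, scaled by the weight of its true class."""
    probs = softmax(logits)
    return -weights[target] * math.log(probs[target])
```

With counts such as `[90, 10]`, the minority class receives a weight nine times larger than the majority class, so its errors contribute proportionally more to the loss.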
no code implementations • NeurIPS 2020 • Junjiao Tian, Yen-Cheng Liu, Nathan Glaser, Yen-Chang Hsu, Zsolt Kira
Neural Networks can perform poorly when the training label distribution is heavily imbalanced, as well as when the testing data differs from the training distribution.
Ranked #31 on Long-tail Learning on CIFAR-100-LT (ρ=10)
no code implementations • 24 Aug 2020 • Benjamin Wilson, Zsolt Kira, James Hays
In this work, we address the long-tail problem by leveraging both the large class-taxonomies of modern 2D datasets and the robustness of state-of-the-art 2D detection methods.
2 code implementations • ECCV 2020 • Chia-Wen Kuo, Chih-Yao Ma, Jia-Bin Huang, Zsolt Kira
Recent state-of-the-art semi-supervised learning (SSL) methods use a combination of image-based transformations and consistency regularization as core components.
2 code implementations • 19 Jun 2020 • Nathan Somavarapu, Chih-Yao Ma, Zsolt Kira
Convolutional Neural Networks (CNNs) show impressive performance in the standard classification setting where training and testing data are drawn i.i.d.
Ranked #57 on Domain Generalization on PACS
2 code implementations • CVPR 2020 • Yen-Cheng Liu, Junjiao Tian, Nathaniel Glaser, Zsolt Kira
While significant advances have been made for single-agent perception, many applications require multiple sensing agents and cross-agent communication due to benefits such as coverage and robustness.
1 code implementation • 21 Mar 2020 • Yen-Cheng Liu, Junjiao Tian, Chih-Yao Ma, Nathan Glaser, Chia-Wen Kuo, Zsolt Kira
In this paper, we propose the problem of collaborative perception, where robots can combine their local observations with those of neighboring agents in a learnable way to improve accuracy on a perception task.
1 code implementation • CVPR 2020 • Min-Hung Chen, Baopu Li, Yingze Bao, Ghassan AlRegib, Zsolt Kira
Despite the recent progress of fully-supervised action segmentation techniques, the performance is still not fully satisfactory.
Ranked #13 on Action Segmentation on GTEA
2 code implementations • CVPR 2020 • Yen-Chang Hsu, Yilin Shen, Hongxia Jin, Zsolt Kira
Deep neural networks have attained remarkable performance when applied to data that comes from the same distribution as that of the training set, but can significantly degrade otherwise.
no code implementations • 6 Nov 2019 • Junjiao Tian, Wesley Cheung, Nathan Glaser, Yen-Cheng Liu, Zsolt Kira
Specifically, we analyze a number of uncertainty measures, each of which captures a different aspect of uncertainty, and we propose a novel way to fuse degraded inputs by scaling modality-specific output softmax probabilities.
5 code implementations • ICCV 2019 • Min-Hung Chen, Zsolt Kira, Ghassan AlRegib, Jaekwon Yoo, Ruxin Chen, Jian Zheng
Finally, we propose Temporal Attentive Adversarial Adaptation Network (TA3N), which explicitly attends to the temporal dynamics using domain discrepancy for more effective domain alignment, achieving state-of-the-art performance on four video DA datasets (e.g., 7.9% accuracy gain over "Source only" from 73.9% to 81.8% on "HMDB --> UCF", and 10.3% gain on "Kinetics --> Gameplay").
no code implementations • 12 Jun 2019 • Chia-Wen Kuo, Chih-Yao Ma, Jia-Bin Huang, Zsolt Kira
We then show that when combined with these regularizers, the proposed method facilitates the propagation of information from generated prototypes to image data to further improve results.
2 code implementations • 1 Jun 2019 • Chih-Yao Ma, Yannis Kalantidis, Ghassan AlRegib, Peter Vajda, Marcus Rohrbach, Zsolt Kira
When automatically generating a sentence description for an image or video, it often remains unclear how well the generated caption is grounded, that is, whether the model uses the correct image regions to output particular words, or if the model is hallucinating based on priors in the dataset and/or the language model.
no code implementations • 29 May 2019 • Angel Daruna, Weiyu Liu, Zsolt Kira, Sonia Chernova
Service robots benefit from encoding information in semantically meaningful ways to enable more robust task execution.
1 code implementation • 26 May 2019 • Weiyu Liu, Angel Daruna, Zsolt Kira, Sonia Chernova
The objective of the knowledge base completion problem is to infer missing information from existing facts in a knowledge base.
5 code implementations • 26 May 2019 • Min-Hung Chen, Zsolt Kira, Ghassan AlRegib
Finally, we propose Temporal Attentive Adversarial Adaptation Network (TA3N), which explicitly attends to the temporal dynamics using domain discrepancy for more effective domain alignment, achieving state-of-the-art performance on three video DA datasets.
Ranked #1 on Domain Adaptation on UCF-to-Olympic
13 code implementations • ICLR 2019 • Wei-Yu Chen, Yen-Cheng Liu, Zsolt Kira, Yu-Chiang Frank Wang, Jia-Bin Huang
Few-shot classification aims to learn a classifier to recognize unseen classes during training with limited labeled examples.
no code implementations • ICLR Workshop LLD 2019 • Phillip Odom, Aaron Keech, Zsolt Kira
Hierarchical reinforcement learning captures sub-task information to learn modular policies that can be quickly adapted to new tasks.
no code implementations • ICLR Workshop LLD 2019 • James Smith, Seth Baer, Zsolt Kira, Constantine Dovrolis
We first pose the Unsupervised Continual Learning (UCL) problem: learning salient representations from a non-stationary stream of unlabeled data in which the number of object classes varies with time.
no code implementations • 24 Mar 2019 • Angel Daruna, Weiyu Liu, Zsolt Kira, Sonia Chernova
Autonomous service robots require computational frameworks that allow them to generalize knowledge to new situations in a manner that models uncertainty while scaling to real-world problem sizes.
3 code implementations • CVPR 2019 • Chih-Yao Ma, Zuxuan Wu, Ghassan AlRegib, Caiming Xiong, Zsolt Kira
As deep learning continues to make progress for challenging perception tasks, there is increased interest in combining vision, language, and decision-making.
Ranked #115 on Vision and Language Navigation on VLN Challenge
1 code implementation • CVPR 2019 (Oral) • Chih-Yao Ma, Zuxuan Wu, Ghassan AlRegib, Caiming Xiong, Zsolt Kira
As deep learning continues to make progress for challenging perception tasks, there is increased interest in combining vision, language, and decision-making.
1 code implementation • 18 Feb 2019 • Jingdao Chen, Yong K. Cho, Zsolt Kira
Mobile robots need to create high-definition 3D maps of the environment for applications such as remote surveillance and infrastructure mapping.
2 code implementations • ICLR 2019 • Chih-Yao Ma, Jiasen Lu, Zuxuan Wu, Ghassan AlRegib, Zsolt Kira, Richard Socher, Caiming Xiong
The Vision-and-Language Navigation (VLN) task entails an agent following navigational instruction in photo-realistic unknown environments.
Ranked #115 on Vision and Language Navigation on VLN Challenge
1 code implementation • ICLR 2019 • Yen-Chang Hsu, Zhaoyang Lv, Joel Schlosser, Phillip Odom, Zsolt Kira
This work presents a new strategy for multi-class classification that requires no class-specific labels, but instead leverages pairwise similarity between examples, which is a weaker form of annotation.
no code implementations • 16 Nov 2018 • Chia-Wen Kuo, Jacob Ashmore, David Huggins, Zsolt Kira
This paper presents a challenging computer vision task, namely the detection of generic components on a PCB, and a novel set of deep-learning methods that are able to jointly leverage the appearance of individual components and the propagation of information across the structure of the board to accurately detect and identify various types of components on a PCB.
3 code implementations • 30 Oct 2018 • Yen-Chang Hsu, Yen-Cheng Liu, Anita Ramasamy, Zsolt Kira
Continual learning has received a great deal of attention recently with several approaches being proposed.
no code implementations • 28 Jun 2018 • Yen-Chang Hsu, Zhaoyang Lv, Joel Schlosser, Phillip Odom, Zsolt Kira
The proposed objective directly minimizes the negative log-likelihood of cluster assignment with respect to the pairwise constraints, has no hyper-parameters, and demonstrates improved scalability and performance on both supervised learning and unsupervised transfer learning.
Ranked #1 on Ecg Risk Stratification on ngm
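One way to picture a negative log-likelihood over pairwise constraints, as described in the entry above, is the sketch below. It assumes that the probability of two samples sharing a cluster is the inner product of their predicted cluster distributions; that modeling choice, the function names, and the clamping constant are assumptions for illustration and may differ from the paper's exact formulation.

```python
import math

EPSILON = 1e-12  # hypothetical clamp to keep log() away from zero


def same_cluster_probability(probs_a, probs_b):
    """Probability that two samples fall in the same cluster.

    Assumes independent assignments, so the value is the inner product
    of the two predicted cluster distributions.
    """
    if len(probs_a) != len(probs_b):
        raise ValueError("distributions must have the same number of clusters")
    return sum(pa * pb for pa, pb in zip(probs_a, probs_b))


def pairwise_nll(probs_a, probs_b, must_link):
    """Negative log-likelihood of one pairwise constraint.

    `must_link` is True for a similar pair and False for a dissimilar pair.
    """
    p_same = same_cluster_probability(probs_a, probs_b)
    p_same = min(max(p_same, EPSILON), 1.0 - EPSILON)
    if must_link:
        return -math.log(p_same)
    return -math.log(1.0 - p_same)
```

Under this assumed form the loss has no weighting term or margin to tune: confident, matching assignments for a must-link pair drive the loss toward zero, while the same assignments for a cannot-link pair are penalized heavily.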
1 code implementation • 17 Mar 2018 • Yen-Chang Hsu, Zheng Xu, Zsolt Kira, Jiawei Huang
We utilize the most fundamental property of instance labeling -- the pairwise relationship between pixels -- as the supervision to formulate the learning objective, then apply it to train a fully convolutional network (FCN) for learning to perform pixel-wise clustering.
Ranked #14 on Lane Detection on TuSimple
1 code implementation • ICLR 2018 • Yen-Chang Hsu, Zhaoyang Lv, Zsolt Kira
The key insight is that, in addition to features, we can transfer similarity information and this is sufficient to learn a similarity function and clustering network to perform both domain adaptation and cross-task transfer learning.
no code implementations • CVPR 2018 • Chih-Yao Ma, Asim Kadav, Iain Melvin, Zsolt Kira, Ghassan AlRegib, Hans Peter Graf
Human actions often involve complex interactions across several inter-related objects in the scene.
no code implementations • 16 Nov 2017 • Chih-Yao Ma, Asim Kadav, Iain Melvin, Zsolt Kira, Ghassan AlRegib, Hans Peter Graf
We address the problem of video captioning by grounding language generation on object interactions in the video.
8 code implementations • ICLR 2018 • Naveen Kodali, Jacob Abernethy, James Hays, Zsolt Kira
We propose studying GAN training dynamics as regret minimization, which is in contrast to the popular view that there is consistent minimization of a divergence between real and generated distributions.
4 code implementations • 30 Mar 2017 • Chih-Yao Ma, Min-Hung Chen, Zsolt Kira, Ghassan AlRegib
We demonstrate that both RNNs (using LSTMs) and Temporal-ConvNets applied to spatiotemporal feature matrices are able to exploit spatiotemporal dynamics to improve the overall performance.
Ranked #56 on Action Recognition on HMDB-51
no code implementations • 5 Dec 2016 • Yen-Chang Hsu, Zhaoyang Lv, Zsolt Kira
We propose that this network can be learned with contrastive loss which is only based on weak binary pair-wise constraints.
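For readers unfamiliar with contrastive losses over binary pairwise constraints, the sketch below shows the common margin-based form: similar pairs are pulled together and dissimilar pairs are pushed apart until they are at least a margin away. The Euclidean distance, the default margin, and the function names are assumptions for illustration; the paper in the entry above may use a different distance or formulation.

```python
import math


def euclidean_distance(features_a, features_b):
    """Euclidean distance between two equal-length feature vectors."""
    if len(features_a) != len(features_b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(features_a, features_b)))


def contrastive_loss(features_a, features_b, is_similar, margin=1.0):
    """Margin-based contrastive loss for a single pair.

    Similar pairs pay the squared distance; dissimilar pairs pay only
    while they are closer than `margin`.
    """
    distance = euclidean_distance(features_a, features_b)
    if is_similar:
        return distance ** 2
    return max(0.0, margin - distance) ** 2
```

The only supervision consumed here is the binary `is_similar` flag for each pair, which is what makes this kind of objective usable when class labels are unavailable.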
no code implementations • 27 Jul 2016 • Zhaoyang Lv, Chris Beall, Pablo F. Alcantarilla, Fuxin Li, Zsolt Kira, Frank Dellaert
We propose a continuous optimization method for solving dense 3D scene flow problems from stereo imagery.
2 code implementations • 19 Nov 2015 • Yen-Chang Hsu, Zsolt Kira
Robustness analysis also shows that the method is largely insensitive to the number of clusters.