1 code implementation • ICCV 2023 • Liqiang He, Wei Wang, Albert Chen, Min Sun, Cheng-Hao Kuo, Sinisa Todorovic
We propose a Bidirectional Alignment for domain adaptive Detection with Transformers (BiADT) to improve cross-domain object detection performance.
no code implementations • ICCV 2023 • Nicolas Aziere, Sinisa Todorovic
Our key novelty is that we augment the original training videos in the deep feature space, not in the visual spatiotemporal domain as done by previous work.
1 code implementation • CVPR 2022 • Khoi Nguyen, Sinisa Todorovic
This paper addresses incremental few-shot instance segmentation, where a few examples of new object classes arrive when access to training examples of old classes is not available anymore, and the goal is to perform well on both old and new classes.
no code implementations • CVPR 2022 • Liqiang He, Sinisa Todorovic
Second, we use a mini-detector to initialize the content queries in the decoder with classification and regression embeddings of the respective heads in the mini-detector.
1 code implementation • 3 Sep 2021 • Arjun R. Akula, Keze Wang, Changsong Liu, Sari Saba-Sadiya, Hongjing Lu, Sinisa Todorovic, Joyce Chai, Song-Chun Zhu
More concretely, our CX-ToM framework generates a sequence of explanations in a dialog by mediating the differences between the minds of the machine and the human user.
1 code implementation • ICCV 2021 • Khoi Nguyen, Sinisa Todorovic
The resulting predictions on training images are taken as the pseudo-ground truth for the standard training of Mask-RCNN, which we use for amodal instance segmentation of test images.
no code implementations • CVPR 2021 • Jun Li, Sinisa Todorovic
This paper is about action segmentation under weak supervision in training, where the ground truth provides only a set of actions present, but neither their temporal ordering nor when they occur in a training video.
no code implementations • CVPR 2021 • Jun Li, Sinisa Todorovic
Our SSL trains an RNN to recognize positive and negative action sequences, and the RNN's hidden layer is taken as our new action-level feature embedding.
1 code implementation • CVPR 2021 • Khoi Nguyen, Sinisa Todorovic
This paper is about few-shot instance segmentation, where training and test image sets do not share the same object classes.
no code implementations • 16 Aug 2020 • Khoi Nguyen, Sinisa Todorovic
This paper addresses unsupervised few-shot object recognition, where all training images are unlabeled, and test images are divided into queries and a few labeled support images per object class of interest.
no code implementations • CVPR 2020 • Jun Li, Sinisa Todorovic
This paper is about weakly supervised action segmentation, where the ground truth specifies only a set of actions present in a training video, but not their true temporal ordering.
no code implementations • ICLR 2020 • Jun Li, Li Fuxin, Sinisa Todorovic
We specify two new optimization algorithms: Cayley SGD with momentum, and Cayley ADAM on the Stiefel manifold.
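As a rough illustration of what a Cayley-transform update on the Stiefel manifold looks like, the sketch below applies the textbook closed-form step X_new = (I + (lr/2) W)^(-1) (I - (lr/2) W) X with a skew-symmetric W. This is plain gradient descent without momentum, not the Cayley SGD or Cayley ADAM algorithms of the paper. The choice W = G X^T - X G^T, the helper names (`matmul`, `solve`, `cayley_step`), and the toy matrices are all assumptions made for the example.

```python
def matmul(A, B):
    """Multiply two matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def solve(A, B):
    """Solve A Z = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [v / pivot for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [row[n:] for row in M]

def cayley_step(X, G, lr):
    """One closed-form Cayley update of X (n x p, orthonormal columns).

    W = G X^T - X G^T is an assumed, commonly used skew-symmetric choice;
    because W is skew-symmetric, the update is an orthogonal transform of X.
    """
    n = len(X)
    GXt = matmul(G, transpose(X))
    W = [[GXt[i][j] - GXt[j][i] for j in range(n)] for i in range(n)]
    eye = lambda i, j: 1.0 if i == j else 0.0
    left = [[eye(i, j) + 0.5 * lr * W[i][j] for j in range(n)] for i in range(n)]
    right = [[eye(i, j) - 0.5 * lr * W[i][j] for j in range(n)] for i in range(n)]
    return solve(left, matmul(right, X))

# Toy example: a 3 x 2 point on the Stiefel manifold and an arbitrary gradient.
X = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
G = [[0.3, -0.2], [0.1, 0.5], [0.7, -0.4]]
X_new = cayley_step(X, G, lr=0.1)
gram = matmul(transpose(X_new), X_new)  # should stay close to the 2 x 2 identity
```

The point of the sketch is that `gram` remains the identity up to rounding error, so the iterate stays on the manifold without a separate re-orthogonalization step.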
1 code implementation • ICCV 2019 • Jun Li, Peng Lei, Sinisa Todorovic
This paper is about labeling video frames with action classes under weak supervision in training, where we have access to a temporal ordering of actions, but their start and end frames in training videos are unknown.
1 code implementation • ICCV 2019 • Khoi Nguyen, Sinisa Todorovic
Finally, the target object is segmented in the query image by using a cosine similarity between the class feature vector and the query's feature map.
Ranked #74 on Few-Shot Semantic Segmentation on COCO-20i (5-shot)
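The final segmentation step described above, scoring each location of the query's feature map by cosine similarity with a class feature vector, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the fixed threshold of 0.5, and the tiny 2 x 2 feature map are assumptions introduced for the example.

```python
import math

def cosine_similarity_map(class_vec, feature_map, eps=1e-8):
    """Cosine similarity between class_vec and every location of feature_map.

    feature_map is an H x W grid (nested lists) of C-dimensional vectors.
    """
    class_norm = math.sqrt(sum(v * v for v in class_vec))
    sim = []
    for row in feature_map:
        sim_row = []
        for feat in row:
            dot = sum(a * b for a, b in zip(class_vec, feat))
            feat_norm = math.sqrt(sum(v * v for v in feat))
            sim_row.append(dot / (class_norm * feat_norm + eps))
        sim.append(sim_row)
    return sim

def segment(class_vec, feature_map, threshold=0.5):
    """Binary mask: 1 where the similarity exceeds the (assumed) threshold."""
    sim = cosine_similarity_map(class_vec, feature_map)
    return [[1 if s > threshold else 0 for s in row] for row in sim]

# Toy example with 2-dimensional features on a 2 x 2 grid.
class_vec = [1.0, 0.0]
feature_map = [
    [[2.0, 0.0], [0.0, 3.0]],
    [[1.0, 1.0], [-1.0, 0.0]],
]
sim = cosine_similarity_map(class_vec, feature_map)
mask = segment(class_vec, feature_map)
```

Because cosine similarity ignores vector magnitude, the location with feature `[2.0, 0.0]` scores the same as one with `[1.0, 0.0]` would; only the direction relative to the class vector matters.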
no code implementations • 15 Sep 2019 • Arjun R. Akula, Changsong Liu, Sari Saba-Sadiya, Hongjing Lu, Sinisa Todorovic, Joyce Y. Chai, Song-Chun Zhu
We present a new explainable AI (XAI) framework aimed at increasing justified human trust and reliance in the AI machine through explanations.
no code implementations • 13 Mar 2019 • Arjun R. Akula, Sinisa Todorovic, Joyce Y. Chai, Song-Chun Zhu
This paper presents an explainable AI (XAI) system that provides explanations for its predictions.
no code implementations • CVPR 2018 • Peng Lei, Sinisa Todorovic
This paper is about temporal segmentation of human actions in videos.
Ranked #24 on Action Segmentation on GTEA
1 code implementation • CVPR 2017 • Behrooz Mahasseni, Michael Lam, Sinisa Todorovic
The summarizer is an autoencoder long short-term memory network (LSTM) that first selects video frames and then decodes the resulting summary to reconstruct the input video.
no code implementations • CVPR 2017 • Behrooz Mahasseni, Sinisa Todorovic, Alan Fern
In this work, we study a poorly understood trade-off between accuracy and runtime costs for deep semantic video segmentation.
no code implementations • CVPR 2017 • Anirban Roy, Sinisa Todorovic
This paper addresses the problem of weakly supervised semantic image segmentation.
no code implementations • CVPR 2017 • Michael Lam, Behrooz Mahasseni, Sinisa Todorovic
This motivates us to formulate our problem as a sequential search for informative parts over a deep feature map produced by a deep Convolutional Neural Network (CNN).
no code implementations • CVPR 2017 • Tianmin Shu, Sinisa Todorovic, Song-Chun Zhu
This work is about recognizing human activities occurring in videos at distinct semantic levels, including individual actions, interactions, and group activities.
Ranked #12 on Group Activity Recognition on Volleyball
no code implementations • CVPR 2018 • Peng Lei, Fuxin Li, Sinisa Todorovic
Using deep learning, this paper addresses the problem of joint object boundary detection and boundary motion estimation in videos, which we named boundary flow estimation.
1 code implementation • 14 Dec 2016 • Xu Xu, Sinisa Todorovic
Each state of the beam search corresponds to a candidate CNN.
no code implementations • 26 Jul 2016 • Behrooz Mahasseni, Sinisa Todorovic, Alan Fern
Our second contribution is the algorithm for learning a policy for the sparse selection of supervoxels and their descriptors for budgeted CRF inference.
no code implementations • 24 Jun 2016 • Dan Xie, Tianmin Shu, Sinisa Todorovic, Song-Chun Zhu
This paper is about detecting functional objects and inferring human intentions in surveillance videos of public spaces.
no code implementations • CVPR 2016 • Anirban Roy, Sinisa Todorovic
This paper presents a novel deep architecture, called neural regression forest (NRF), for depth estimation from a single image.
no code implementations • CVPR 2016 • Behrooz Mahasseni, Sinisa Todorovic
This paper argues that large-scale action recognition in video can be greatly improved by providing an additional modality in training data -- namely, 3D human-skeleton sequences -- aimed at complementing poorly represented or missing features of human actions in the training videos.
no code implementations • ICCV 2015 • Zhuo Deng, Sinisa Todorovic, Longin Jan Latecki
In this paper, we address the problem of semantic scene segmentation of RGB-D images of indoor scenes.
no code implementations • 11 Jun 2015 • Shell X. Hu, Christopher K. I. Williams, Sinisa Todorovic
This paper presents a new probabilistic generative model for image segmentation, i.e., the task of partitioning an image into homogeneous regions.
no code implementations • CVPR 2015 • Michael Lam, Janardhan Rao Doppa, Sinisa Todorovic, Thomas G. Dietterich
The mainstream approach to structured prediction problems in computer vision is to learn an energy function such that the solution minimizes that function.
no code implementations • CVPR 2015 • Sebastian Kaltwang, Sinisa Todorovic, Maja Pantic
Our model is a latent tree (LT) that represents input features of facial landmark points and FAU intensities as leaf nodes, and encodes their higher-order dependencies with latent nodes at tree levels closer to the root.
no code implementations • CVPR 2015 • Sheng Chen, Alan Fern, Sinisa Todorovic
This problem is a middle ground between frame-level person counting, which does not localize counts, and person detection aimed at perfectly localizing people with count-one detections.
no code implementations • CVPR 2015 • Tianmin Shu, Dan Xie, Brandon Rothrock, Sinisa Todorovic, Song-Chun Zhu
This paper addresses a new problem of parsing low-resolution aerial videos of large spatial areas, in terms of 1) grouping, 2) recognizing events and 3) assigning roles to people engaged in events.
no code implementations • CVPR 2014 • Anirban Roy, Sinisa Todorovic
This paper addresses the problem of assigning object class labels to image pixels.
no code implementations • CVPR 2014 • Sheng Chen, Alan Fern, Sinisa Todorovic
This paper presents a new approach to tracking people in crowded scenes, where people are subject to long-term (partial) occlusions and may assume varying postures and articulations.
no code implementations • NeurIPS 2010 • William Brendel, Sinisa Todorovic
The algorithm seeks a solution directly in the discrete domain, instead of relaxing MWIS to a continuous problem, as is common in previous work.
no code implementations • NeurIPS 2010 • Nadia Payet, Sinisa Todorovic
We use these class histograms for a non-parametric estimation of the distribution ratios.