no code implementations • ECCV 2020 • Hongyuan Du, Linjun Li, Bo Liu, Nuno Vasconcelos
The sparsity of point clouds limits the ability of deep learning models to capture long-range dependencies, which makes the features extracted by the models ambiguous.
no code implementations • 25 Jan 2024 • Zhen Wang, Yuelei Li, Jia Wan, Nuno Vasconcelos
Our proposed smoothed density map input for ControlNet significantly improves ControlNet's performance in generating crowds in the correct locations.
no code implementations • 1 Dec 2023 • Deepak Sridhar, Yunsheng Li, Nuno Vasconcelos
Vision Transformers have received significant attention due to their impressive performance in many vision tasks.
no code implementations • 14 Jun 2023 • Zhiyuan Hu, Jiancheng Lyu, Dashan Gao, Nuno Vasconcelos
We show that a foundation model equipped with POP learning is able to outperform classic CL methods by a significant margin.
no code implementations • 9 Jun 2023 • Alakh Desai, Tz-Ying Wu, Subarna Tripathi, Nuno Vasconcelos
Research in scene graph generation (SGG) usually considers two-stage models, that is, detecting a set of entities, followed by combining them and labeling all possible relationships.
1 code implementation • 4 Jun 2023 • Tz-Ying Wu, Chih-Hui Ho, Nuno Vasconcelos
A new Prompt Tuning for Hierarchical Consistency (ProTeCt) technique is then proposed to calibrate classification across label set granularities.
no code implementations • ICCV 2023 • Jiteng Mu, Shen Sang, Nuno Vasconcelos, Xiaolong Wang
While NeRF-based human representations have shown impressive novel view synthesis results, most methods still rely on a large number of images / views for training.
1 code implementation • CVPR 2023 • Yi Li, Kyle Min, Subarna Tripathi, Nuno Vasconcelos
Do video-text transformers learn to model temporal relationships across frames?
Ranked #4 on Video Question Answering on AGQA 2.0 balanced (Average Accuracy metric)
no code implementations • 12 Apr 2023 • Yuzhao Chen, Zonghuan Li, Zhiyuan Hu, Nuno Vasconcelos
In this work, we propose the Taxonomic Class Incremental Learning (TCIL) problem.
no code implementations • CVPR 2023 • Zhiyuan Hu, Yunsheng Li, Jiancheng Lyu, Dashan Gao, Nuno Vasconcelos
This is accomplished by the introduction of dense connections between the intermediate layers of the task expert networks, that enable the transfer of knowledge from old to new tasks via feature sharing and reusing.
no code implementations • ICCV 2023 • Yuwei Zhang, Chih-Hui Ho, Nuno Vasconcelos
To resolve the first drawback, we propose a new testing dataset, RGQA, which combines AQs from an existing VQA dataset with around 29K human-annotated UQs.
no code implementations • CVPR 2023 • Pei Wang, Nuno Vasconcelos
A new approach, based on semi-supervised learning (SSL) and denoted as SSL with human filtering (SSL-HF), is proposed.
1 code implementation • 11 Dec 2022 • Chih-Hui Ho, Nuno Vasconcelos
The problem of adversarial defenses for image classification, where the goal is to robustify a classifier against adversarial examples, is considered.
1 code implementation • 15 Nov 2022 • Chih-Hui Ho, Srikar Appalaraju, Bhavan Jasani, R. Manmatha, Nuno Vasconcelos
We present YORO - a multi-modal transformer encoder-only architecture for the Visual Grounding (VG) task.
1 code implementation • 7 Jul 2022 • Yunsheng Li, Yinpeng Chen, Xiyang Dai, Dongdong Chen, Mengchen Liu, Pei Yu, Jing Yin, Lu Yuan, Zicheng Liu, Nuno Vasconcelos
We formulate this as a learning problem where the goal is to assign operators to proposals, in the detection head, so that the total computational cost is constrained and the precision is maximized.
no code implementations • 29 Jun 2022 • Mark Tenzer, Zeeshan Rasheed, Khurram Shafique, Nuno Vasconcelos
A need to understand and predict vehicles' behavior underlies both public and private goals in the transportation domain, including urban planning and management, ride-sharing services, and intelligent transportation systems.
1 code implementation • CVPR 2022 • Yi Li, Rameswar Panda, Yoon Kim, Chun-Fu Chen, Rogerio Feris, David Cox, Nuno Vasconcelos
In particular, given a source sentence, an autoregressive hallucination transformer is used to predict a discrete visual representation from the input text, and the combined text and hallucinated representations are utilized to obtain the target translation.
1 code implementation • CVPR 2022 • Tz-Ying Wu, Gurumurthy Swaminathan, Zhizhong Li, Avinash Ravichandran, Nuno Vasconcelos, Rahul Bhotika, Stefano Soatto
We hypothesize that a strong base model can provide a good representation for novel classes and incremental learning can be done with small adaptations.
1 code implementation • CVPR 2022 • Jiteng Mu, Shalini De Mello, Zhiding Yu, Nuno Vasconcelos, Xiaolong Wang, Jan Kautz, Sifei Liu
We represent the correspondence maps of different images as warped coordinate frames transformed from a canonical coordinate frame, i.e., the correspondence map, which describes the structure (e.g., the shape of a face), is controlled via a transformation.
1 code implementation • CVPR 2022 • Pei Wang, Zhaowei Cai, Hao Yang, Gurumurthy Swaminathan, Nuno Vasconcelos, Bernt Schiele, Stefano Soatto
This is enabled by a unified architecture, Omni-DETR, based on recent progress in the student-teacher framework and end-to-end transformer-based object detection.
Ranked #14 on Semi-Supervised Object Detection on COCO 2% labeled data
no code implementations • CVPR 2022 • Jiacheng Cheng, Nuno Vasconcelos
This suggests the hypothesis that DNN calibration can be improved by providing calibration supervision to all such binary problems.
no code implementations • CVPR 2022 • Yi Li, Nuno Vasconcelos
DRL is then formulated as an adversarial learning problem between the video and spatial models, with the objective of maximizing the dynamic score of the learned spatiotemporal classifier.
1 code implementation • ICCV 2021 • Zhirui Dai, Yuepeng Jiang, Yi Li, Bo Liu, Antoni B. Chan, Nuno Vasconcelos
A dataset of crowd scenes with people annotations under a bird's eye view (BEV) and ground truth for metric distances is introduced, and several measures for the evaluation of social distance detection systems are proposed.
no code implementations • 24 Aug 2021 • Brandon Leung, Chih-Hui Ho, Amir Persekian, David Orozco, Yen Chang, Erik Sandstrom, Bo Liu, Nuno Vasconcelos
Second, it is used to show that the augmentation of in the wild datasets, such as ImageNet, with in the lab data, such as OOWL500, can significantly decrease these biases, leading to object recognizers of improved generalization.
no code implementations • 23 Aug 2021 • Brandon Leung, Chih-Hui Ho, Nuno Vasconcelos
Much recent progress has been made in reconstructing the 3D shape of an object from an image of it, i.e., single view 3D reconstruction.
no code implementations • ICCV 2021 • Alakh Desai, Tz-Ying Wu, Subarna Tripathi, Nuno Vasconcelos
Significant effort has been recently devoted to modeling visual relations.
1 code implementation • ICCV 2021 • Yunsheng Li, Yinpeng Chen, Xiyang Dai, Dongdong Chen, Mengchen Liu, Lu Yuan, Zicheng Liu, Lei Zhang, Nuno Vasconcelos
This paper aims at addressing the problem of substantial performance degradation at extremely low computational cost (e.g., 5M FLOPs on ImageNet classification).
no code implementations • CVPR 2021 • Jiacheng Cheng, Nuno Vasconcelos
The problem of novelty detection in fine-grained visual classification (FGVC) is considered.
no code implementations • CVPR 2021 • Pei Wang, Kabir Nagrecha, Nuno Vasconcelos
This is formulated as a problem of functional optimization where, at each teaching iteration, the teacher seeks to align the steepest descent directions of the risk of (1) the teaching set and (2) the entire example population.
no code implementations • ICCV 2021 • Bo Liu, Haoxiang Li, Hao Kang, Gang Hua, Nuno Vasconcelos
A new learning algorithm is then proposed for GeometrIc Structure Transfer (GIST), with resort to a combination of loss functions that combine class-balanced and random sampling to guarantee that, while overfitting to the popular classes is restricted to geometric parameters, it is leveraged to transfer class geometry from popular to few-shot classes.
no code implementations • 1 May 2021 • Bo Liu, Mandar Dixit, Roland Kwitt, Gang Hua, Nuno Vasconcelos
In the absence of dense pose sampling in image space, these latent space trajectories provide cross-modal guidance for learning.
no code implementations • 1 May 2021 • Bo Liu, Haoxiang Li, Hao Kang, Gang Hua, Nuno Vasconcelos
It is shown that, unlike class-balanced sampling, this is an adversarial augmentation strategy.
no code implementations • 1 May 2021 • Bo Liu, Haoxiang Li, Hao Kang, Nuno Vasconcelos, Gang Hua
A consistency loss has been introduced to limit the impact from unlabeled data while leveraging them to update the feature embedding.
1 code implementation • ICCV 2021 • Jiteng Mu, Weichao Qiu, Adam Kortylewski, Alan Yuille, Nuno Vasconcelos, Xiaolong Wang
To deal with the large shape variance, we introduce Articulated Signed Distance Functions (A-SDF) to represent articulated shapes with a disentangled latent space, where we have separate codes for encoding shape and articulation.
no code implementations • CVPR 2021 • Pei Wang, Yijun Li, Krishna Kumar Singh, Jingwan Lu, Nuno Vasconcelos
We introduce an inversion based method, denoted as IMAge-Guided model INvErsion (IMAGINE), to generate high-quality and diverse images from only a single training sample.
1 code implementation • CVPR 2021 • Pei Wang, Yijun Li, Nuno Vasconcelos
Extensive research in neural style transfer methods has shown that the correlation between features extracted by a pre-trained VGG network has a remarkable ability to capture the visual style of an image.
no code implementations • CVPR 2021 • Pedro Morgado, Ishan Misra, Nuno Vasconcelos
Second, since self-supervised contrastive learning relies on random sampling of negative instances, instances that are semantically similar to the base instance can be used as faulty negatives.
1 code implementation • CVPR 2021 • Yunsheng Li, Lu Yuan, Yinpeng Chen, Pei Wang, Nuno Vasconcelos
However, such a static model has difficulty handling conflicts across multiple domains, and suffers from performance degradation in both the source domains and the target domain.
1 code implementation • ICLR 2021 • Yunsheng Li, Yinpeng Chen, Xiyang Dai, Mengchen Liu, Dongdong Chen, Ye Yu, Lu Yuan, Zicheng Liu, Mei Chen, Nuno Vasconcelos
It has two limitations: (a) it increases the number of convolutional weights by K-times, and (b) the joint optimization of dynamic attention and static convolution kernels is challenging.
no code implementations • ICCV 2021 • Pei Wang, Nuno Vasconcelos
Preliminary studies show that the accuracy of classifiers trained on the final dataset is a function of the accuracy of the student annotators.
no code implementations • 24 Nov 2020 • Yunsheng Li, Yinpeng Chen, Xiyang Dai, Dongdong Chen, Mengchen Liu, Lu Yuan, Zicheng Liu, Lei Zhang, Nuno Vasconcelos
In this paper, we present MicroNet, which is an efficient convolutional neural network using extremely low computational cost (e.g., 6 MFLOPs on ImageNet classification).
no code implementations • NeurIPS 2020 • Pedro Morgado, Yi Li, Nuno Vasconcelos
To learn from these spatial cues, we tasked a network to perform contrastive audio-visual spatial alignment of 360° video and spatial audio.
no code implementations • NeurIPS 2020 • Chih-Hui Ho, Nuno Vasconcelos
This paper addresses the problem by introducing a new family of adversarial examples for contrastive learning and using these examples to define a new adversarial training algorithm for SSL, denoted as CLAE.
no code implementations • 27 Jul 2020 • Pedro Morgado, Yunsheng Li, Jose Costa Pereira, Mohammad Saberian, Nuno Vasconcelos
The use of a fixed set of proxies (weights of the CNN classification layer) is proposed to eliminate this ambiguity, and a procedure to design proxy sets that are nearly optimal for both classification and hashing is introduced.
1 code implementation • ECCV 2020 • Tz-Ying Wu, Pedro Morgado, Pei Wang, Chih-Hui Ho, Nuno Vasconcelos
Motivated by this, a deep realistic taxonomic classifier (Deep-RTC) is proposed as a new solution to the long-tail problem, combining realism with hierarchical predictions.
1 code implementation • CVPR 2020 • Bo Liu, Hao Kang, Haoxiang Li, Gang Hua, Nuno Vasconcelos
It is argued that the classic softmax classifier is a poor solution for open-set recognition, since it tends to overfit on the training classes.
1 code implementation • CVPR 2021 • Pedro Morgado, Nuno Vasconcelos, Ishan Misra
Our method uses contrastive learning for cross-modal discrimination of video from audio and vice-versa.
Ranked #3 on Self-Supervised Audio Classification on ESC-50
2 code implementations • CVPR 2020 • Pei Wang, Nuno Vasconcelos
It is argued that self-awareness, namely the ability to produce classification confidence scores, is important for the computation of discriminant explanations, which seek to identify regions where it is easy to discriminate between prediction and counter class.
2 code implementations • CVPR 2020 • Zhaowei Cai, Nuno Vasconcelos
Low-precision networks, with weights and activations quantized to low bit-width, are widely used to accelerate inference on edge devices.
no code implementations • 4 Apr 2020 • Xudong Wang, Shizhong Han, Yunqiang Chen, Dashan Gao, Nuno Vasconcelos
A volumetric attention (VA) module for 3D medical image segmentation and detection is proposed.
1 code implementation • CVPR 2020 • Chih-Hui Ho, Bo Liu, Tz-Ying Wu, Nuno Vasconcelos
Multiview recognition has been well studied in the literature and achieves decent performance in object recognition and retrieval tasks.
1 code implementation • CVPR 2020 • Yiran Xu, Xiaoyin Yang, Lihang Gong, Hsuan-Chu Lin, Tz-Ying Wu, Yunsheng Li, Nuno Vasconcelos
The new paradigm lies between the end-to-end and pipelined approaches, and is inspired by how humans solve the problem.
no code implementations • 1 Aug 2019 • Qihang Peng, Andrew Gilman, Nuno Vasconcelos, Pamela C. Cosman, Laurence B. Milstein
We propose a robust spectrum sensing framework based on deep learning.
1 code implementation • CVPR 2019 • Pedro Morgado, Nuno Vasconcelos
Under the standard paradigm of network fine-tuning, an entirely new CNN is learned per task, and the final network size is independent of task complexity.
no code implementations • 24 Jun 2019 • Yunsheng Li, Nuno Vasconcelos
The problem of multi-domain learning of deep networks is considered.
4 code implementations • 24 Jun 2019 • Zhaowei Cai, Nuno Vasconcelos
In object detection, the intersection over union (IoU) threshold is frequently used to define positives/negatives.
Ranked #4 on Instance Segmentation on BDD100K val
no code implementations • 27 May 2019 • Mandar Dixit, Yunsheng Li, Nuno Vasconcelos
Somewhat surprisingly, the scene classification results are superior to those of a CNN explicitly trained for scene classification, using a large scene dataset (Places).
3 code implementations • CVPR 2019 • Yunsheng Li, Lu Yuan, Nuno Vasconcelos
In this paper, we propose a novel bidirectional learning framework for domain adaptation of segmentation.
Ranked #7 on Semantic Segmentation on DADA-seg
1 code implementation • CVPR 2019 • Yi Li, Nuno Vasconcelos
An experimental set-up is also introduced to measure the bias of any dataset for a given representation, and the impact of this bias on the performance of recognition models.
1 code implementation • CVPR 2019 • Xudong Wang, Zhaowei Cai, Dashan Gao, Nuno Vasconcelos
Experiments, on a newly established universal object detection benchmark of 11 diverse datasets, show that the proposed detector outperforms a bank of individual detectors, a multi-domain detector, and a baseline universal detector, with a 1.3x parameter increase over a single-domain baseline detector.
no code implementations • 22 Nov 2018 • Peng Jiang, Zhiyi Pan, Nuno Vasconcelos, Baoquan Chen, Jingliang Peng
Following this analysis, we propose super diffusion, a novel inclusive learning-based framework for salient object detection, which achieves optimal and robust performance by integrating a large pool of feature spaces, scales and even features originally computed for non-diffusion-based salient object detection.
1 code implementation • 7 Sep 2018 • Pedro Morgado, Nuno Vasconcelos, Timothy Langlois, Oliver Wang
Using our approach, we show that it is possible to infer the spatial location of sound sources based only on 360° video and a mono audio track.
no code implementations • ECCV 2018 • Yingwei Li, Yi Li, Nuno Vasconcelos
The notion of the representation bias of a dataset is proposed to combat this problem.
no code implementations • ECCV 2018 • Pei Wang, Nuno Vasconcelos
It is argued that this should be a predictor independent of the classifier itself, but tuned to it, and learned without explicit supervision, so as to learn from its mistakes.
no code implementations • CVPR 2018 • Bo Liu, Xudong Wang, Mandar Dixit, Roland Kwitt, Nuno Vasconcelos
A new architecture, denoted the FeATure TransfEr Network (FATTEN), is proposed for the modeling of feature trajectories induced by variations of object pose.
8 code implementations • CVPR 2018 • Zhaowei Cai, Nuno Vasconcelos
In object detection, an intersection over union (IoU) threshold is required to define positives and negatives.
Ranked #4 on 2D Object Detection on SARDet-100K
no code implementations • ICCV 2017 • Yunsheng Li, Mandar Dixit, Nuno Vasconcelos
This enables the design of a network architecture, the MFAFVNet, that can be trained in an end to end manner.
1 code implementation • CVPR 2017 • Mandar Dixit, Roland Kwitt, Marc Niethammer, Nuno Vasconcelos
We implement our approach as a deep encoder-decoder architecture that learns the synthesis function in an end-to-end manner.
1 code implementation • CVPR 2017 • Pedro Morgado, Nuno Vasconcelos
The role of semantics in zero-shot learning is considered.
1 code implementation • CVPR 2017 • Zhaowei Cai, Xiaodong He, Jian Sun, Nuno Vasconcelos
The problem of quantizing the activations of a deep neural network is considered.
1 code implementation • 8 Dec 2016 • Mandar Dixit, Roland Kwitt, Marc Niethammer, Nuno Vasconcelos
We implement our approach as a deep encoder-decoder architecture that learns the synthesis function in an end-to-end manner.
no code implementations • NeurIPS 2016 • Mandar D. Dixit, Nuno Vasconcelos
While this problem is currently addressed with Fisher vector representations, these are now shown to be ineffective for the high-dimensional and highly non-linear features extracted by modern CNNs.
no code implementations • 26 Jul 2016 • Marian George, Mandar Dixit, Gábor Zogg, Nuno Vasconcelos
In this work, we propose a novel domain generalization approach for fine-grained scene recognition.
1 code implementation • 25 Jul 2016 • Zhaowei Cai, Quanfu Fan, Rogerio S. Feris, Nuno Vasconcelos
A unified deep neural network, denoted the multi-scale CNN (MS-CNN), is proposed for fast multi-scale object detection.
Ranked #24 on Pedestrian Detection on Caltech
no code implementations • 24 Jul 2016 • Xiangyun Zhao, Xiaodan Liang, Luoqi Liu, Teng Li, Yugang Han, Nuno Vasconcelos, Shuicheng Yan
Objective functions for training of deep networks for face-related recognition tasks, such as facial expression recognition (FER), usually consider each sample independently.
Ranked #2 on Facial Expression Recognition (FER) on Oulu-CASIA
no code implementations • CVPR 2016 • Yingwei Li, Weixin Li, Vijay Mahadevan, Nuno Vasconcelos
To account for long-range inhomogeneous dynamics, a VLAD descriptor is derived for the LDS and pooled over the whole video, to arrive at the final VLAD^3 representation.
no code implementations • ICCV 2015 • Bo Liu, Nuno Vasconcelos
A large video dataset for the evaluation of adaptation approaches to crowd counting is also introduced.
no code implementations • ICCV 2015 • Peng Jiang, Nuno Vasconcelos, Jingliang Peng
In this work, we propose a generic scheme to improve any diffusion-based salient object detection algorithm through original ways of re-synthesizing the diffusion matrix and constructing the seed vector.
no code implementations • ICCV 2015 • Zhaowei Cai, Mohammad Saberian, Nuno Vasconcelos
CompACT cascades are shown to seek an optimal trade-off between accuracy and complexity by pushing features of higher complexity to the later cascade stages, where only a few difficult candidate patches remain to be classified.
Ranked #26 on Pedestrian Detection on Caltech
no code implementations • CVPR 2015 • Sayed Hossein Khatoonabadi, Nuno Vasconcelos, Ivan V. Bajic, Yufeng Shan
Visual saliency has been shown to depend on the unpredictability of the visual stimulus given its surround.
no code implementations • CVPR 2015 • Mandar Dixit, Si Chen, Dashan Gao, Nikhil Rasiwasia, Nuno Vasconcelos
A semantic FV is then computed as a Gaussian Mixture FV in the space of these natural parameters.
no code implementations • CVPR 2015 • Weixin Li, Nuno Vasconcelos
Under this formulation, both positive and negative bags are soft, in the sense that negative bags can also contain positive instances.
no code implementations • 14 May 2015 • Yi Hong, Nikhil Singh, Roland Kwitt, Nuno Vasconcelos, Marc Niethammer
We then specialize this idea to the Grassmann manifold and demonstrate that it yields a simple, extensible and easy-to-implement solution to the parametric regression problem.
no code implementations • NeurIPS 2014 • Mohammad Saberian, Nuno Vasconcelos
SBBoost is a boosting algorithm for maximization of this margin.
no code implementations • CVPR 2014 • Can Xu, Nuno Vasconcelos
A new method for learning pooling receptive fields for recognition is presented.
no code implementations • CVPR 2014 • Song Lu, Vijay Mahadevan, Nuno Vasconcelos
The propagation of the resulting saliency seeds, using a diffusion process, is finally shown to outperform the state of the art on a number of salient object detection datasets.
no code implementations • CVPR 2013 • Weixin Li, Qian Yu, Harpreet Sawhney, Nuno Vasconcelos
A video sequence is decomposed into short-term segments, which are characterized by the dynamics of their attributes.
1 code implementation • 5 Dec 2012 • Hamed Masnadi-Shirazi, Nuno Vasconcelos, Arya Iranmehr
Minimization of the new hinge loss is shown to be a generalization of the classic SVM optimization problem, and can be solved by identical procedures.
no code implementations • NeurIPS 2012 • Weixin Li, Nuno Vasconcelos
The proposed method is shown to outperform similar classifiers derived from the kernel dynamic system (KDS) and state-of-the-art approaches for dynamics-based or attribute-based action recognition.
no code implementations • NeurIPS 2012 • Vijay Mahadevan, Nuno Vasconcelos
A model connecting visual tracking and saliency has recently been proposed.
no code implementations • NeurIPS 2011 • Vijay Mahadevan, Chi W. Wong, Jose C. Pereira, Tom Liu, Nuno Vasconcelos, Lawrence K. Saul
To perform this visualization, we augment MCU with an additional step for metric learning in the high dimensional voxel space.
no code implementations • NeurIPS 2011 • Mohammad J. Saberian, Nuno Vasconcelos
Two algorithms are proposed: 1) CD-MCBoost, based on coordinate descent, updates one predictor component at a time, 2) GD-MCBoost, based on gradient descent, updates all components jointly.
no code implementations • NeurIPS 2010 • Hamed Masnadi-Shirazi, Nuno Vasconcelos
It is shown that, when the risk is in canonical form and the link is inverse sigmoidal, the margin properties of the loss are determined by a single parameter.
no code implementations • NeurIPS 2010 • Kritika Muralidharan, Nuno Vasconcelos
This leads to a novel measure for the dominance of a given orientation $\theta$, which is similar to that used by SIFT.
no code implementations • NeurIPS 2010 • Nuno Vasconcelos, Mohammad J. Saberian
The problem of optimal and automatic design of a detector cascade is considered.
no code implementations • NeurIPS 2008 • Hamed Masnadi-Shirazi, Nuno Vasconcelos
This shows that the standard approach of proceeding from the specification of a loss, to the minimization of conditional risk is overly restrictive.
no code implementations • NeurIPS 2007 • Dashan Gao, Vijay Mahadevan, Nuno Vasconcelos
The classical hypothesis, that bottom-up saliency is a center-surround process, is combined with a more recent hypothesis that all saliency decisions are optimal in a decision-theoretic sense.