no code implementations • CVPR 2022 • Samarth Mishra, Rameswar Panda, Cheng Perng Phoo, Chun-Fu (Richard) Chen, Leonid Karlinsky, Kate Saenko, Venkatesh Saligrama, Rogerio S. Feris
It is thus better to tailor synthetic pre-training data to a specific downstream task, for best performance.
1 code implementation • CVPR 2022 • Nina Shvetsova, Brian Chen, Andrew Rouditchenko, Samuel Thomas, Brian Kingsbury, Rogerio S. Feris, David Harwath, James Glass, Hilde Kuehne
In this work, we present a multi-modal, modality-agnostic fusion transformer that learns to exchange information between multiple modalities, such as video, audio, and text, and integrate them into a fused representation in a joint multi-modal embedding space.
no code implementations • 30 Nov 2021 • Samarth Mishra, Rameswar Panda, Cheng Perng Phoo, Chun-Fu Chen, Leonid Karlinsky, Kate Saenko, Venkatesh Saligrama, Rogerio S. Feris
It is thus better to tailor synthetic pre-training data to a specific downstream task, for best performance.
no code implementations • 29 Jan 2019 • Michele Merler, Nalini Ratha, Rogerio S. Feris, John R. Smith
We expect face recognition to work equally accurately for every face.
3 code implementations • ICCV 2019 • Khoi-Nguyen C. Mac, Dhiraj Joshi, Raymond A. Yeh, JinJun Xiong, Rogerio S. Feris, Minh N. Do
Fine-grained action detection is an important task with numerous applications in robotics and human-computer interaction.
no code implementations • 17 Jan 2018 • Xi Peng, Rogerio S. Feris, Xiaoyu Wang, Dimitris N. Metaxas
We propose a novel method for real-time face alignment in videos based on a recurrent encoder-decoder network model.
no code implementations • 22 Jul 2017 • Michele Merler, Dhiraj Joshi, Quoc-Bao Nguyen, Stephen Hammer, John Kent, John R. Smith, Rogerio S. Feris
The production of sports highlight packages summarizing a game's most exciting moments is an essential task for broadcast media.
no code implementations • 19 Aug 2016 • Xi Peng, Rogerio S. Feris, Xiaoyu Wang, Dimitris N. Metaxas
We propose a novel recurrent encoder-decoder network model for real-time video-based face alignment.
1 code implementation • 25 Jul 2016 • Zhaowei Cai, Quanfu Fan, Rogerio S. Feris, Nuno Vasconcelos
A unified deep neural network, denoted the multi-scale CNN (MS-CNN), is proposed for fast multi-scale object detection.
Ranked #24 on Pedestrian Detection on Caltech
no code implementations • ICCV 2015 • Junshi Huang, Rogerio S. Feris, Qiang Chen, Shuicheng Yan
To address this problem, we propose a Dual Attribute-aware Ranking Network (DARN) for retrieval feature learning.
no code implementations • ICCV 2015 • Yu Cheng, Felix X. Yu, Rogerio S. Feris, Sanjiv Kumar, Alok Choudhary, Shih-Fu Chang
We explore the redundancy of parameters in deep neural networks by replacing the conventional linear projection in fully-connected layers with the circulant projection.
no code implementations • CVPR 2013 • Felix X. Yu, Liangliang Cao, Rogerio S. Feris, John R. Smith, Shih-Fu Chang
In this paper, we propose a novel formulation to automatically design discriminative "category-level attributes", which can be efficiently encoded by a compact category-attribute matrix.