Search Results for author: Rogerio S. Feris

Found 12 papers, 3 papers with code

Designing Category-Level Attributes for Discriminative Visual Recognition

no code implementations • CVPR 2013 • Felix X. Yu, Liangliang Cao, Rogerio S. Feris, John R. Smith, Shih-Fu Chang

In this paper, we propose a novel formulation to automatically design discriminative "category-level attributes", which can be efficiently encoded by a compact category-attribute matrix.

Tasks: Attribute, Transfer Learning, +1
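To make the encoding concrete, here is a minimal, hypothetical NumPy sketch (not the authors' learned design): each category is one row of a compact category-attribute matrix A, and an image is classified by matching its predicted attribute scores against those rows.

```python
import numpy as np

rng = np.random.default_rng(0)
n_categories, n_attributes = 5, 16

# Hypothetical category-attribute matrix A: row i encodes category i as an
# n_attributes-dimensional signature. The paper designs A discriminatively;
# here it is random, purely for illustration.
A = np.sign(rng.standard_normal((n_categories, n_attributes)))

# Attribute scores predicted for one image by per-attribute classifiers
s = rng.standard_normal(n_attributes)

# Decode: pick the category whose attribute signature best matches s
pred = int(np.argmax(A @ s))
print("predicted category:", pred)
```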

An exploration of parameter redundancy in deep networks with circulant projections

no code implementations • ICCV 2015 • Yu Cheng, Felix X. Yu, Rogerio S. Feris, Sanjiv Kumar, Alok Choudhary, Shih-Fu Chang

We explore the redundancy of parameters in deep neural networks by replacing the conventional linear projection in fully-connected layers with the circulant projection.
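The efficiency argument rests on a standard fact: a circulant matrix is fully determined by a single d-dimensional vector and multiplies a vector in O(d log d) time via the FFT, versus O(d^2) for a dense projection. A minimal NumPy sketch of that equivalence (illustrative only; the paper learns the defining vector during training):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
r = rng.standard_normal(d)   # defining vector: d parameters instead of d*d
x = rng.standard_normal(d)   # input to the "fully-connected" layer

# Dense equivalent: circulant matrix with C[n, k] = r[(n - k) mod d]
idx = (np.arange(d)[:, None] - np.arange(d)[None, :]) % d
C = r[idx]

# Same projection in O(d log d) via FFT (circular convolution theorem)
y_fft = np.fft.ifft(np.fft.fft(r) * np.fft.fft(x)).real

assert np.allclose(C @ x, y_fft)
```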

A Recurrent Encoder-Decoder Network for Sequential Face Alignment

no code implementations • 19 Aug 2016 • Xi Peng, Rogerio S. Feris, Xiaoyu Wang, Dimitris N. Metaxas

We propose a novel recurrent encoder-decoder network model for real-time video-based face alignment.

Tasks: Decoder, Face Alignment
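As a rough illustration of this architecture class (a hypothetical PyTorch sketch with made-up layer sizes, not the authors' network): a convolutional encoder maps each frame to a latent code, a recurrent unit carries temporal context across frames, and a decoder emits per-frame landmark heatmaps.

```python
import torch
import torch.nn as nn

class RecurrentEncoderDecoder(nn.Module):
    """Illustrative recurrent encoder-decoder for per-frame landmark
    heatmaps; all dimensions are assumptions, not the paper's."""
    def __init__(self, n_landmarks=68, hidden=128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.rnn = nn.GRU(64 * 16 * 16, hidden, batch_first=True)
        self.to_feat = nn.Linear(hidden, 64 * 16 * 16)
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, n_landmarks, 4, stride=2, padding=1),
        )

    def forward(self, frames):                        # (B, T, 3, 64, 64)
        b, t = frames.shape[:2]
        z = self.encoder(frames.flatten(0, 1))        # (B*T, 64, 16, 16)
        h, _ = self.rnn(z.flatten(1).view(b, t, -1))  # temporal recurrence
        f = self.to_feat(h).view(b * t, 64, 16, 16)
        heatmaps = self.decoder(f)                    # (B*T, L, 64, 64)
        return heatmaps.view(b, t, -1, 64, 64)

model = RecurrentEncoderDecoder()
out = model(torch.randn(2, 4, 3, 64, 64))
print(out.shape)  # torch.Size([2, 4, 68, 64, 64])
```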

Automatic Curation of Golf Highlights using Multimodal Excitement Features

no code implementations • 22 Jul 2017 • Michele Merler, Dhiraj Joshi, Quoc-Bao Nguyen, Stephen Hammer, John Kent, John R. Smith, Rogerio S. Feris

The production of sports highlight packages summarizing a game's most exciting moments is an essential task for broadcast media.

Tasks: Action Recognition, Retrieval, +2

RED-Net: A Recurrent Encoder-Decoder Network for Video-based Face Alignment

no code implementations • 17 Jan 2018 • Xi Peng, Rogerio S. Feris, Xiaoyu Wang, Dimitris N. Metaxas

We propose a novel method for real-time face alignment in videos based on a recurrent encoder-decoder network model.

Tasks: Decoder, Face Alignment

Everything at Once - Multi-Modal Fusion Transformer for Video Retrieval

1 code implementation • CVPR 2022 • Nina Shvetsova, Brian Chen, Andrew Rouditchenko, Samuel Thomas, Brian Kingsbury, Rogerio S. Feris, David Harwath, James Glass, Hilde Kuehne

In this work, we present a multi-modal, modality-agnostic fusion transformer that learns to exchange information between multiple modalities, such as video, audio, and text, and integrate them into a fused representation in a joint multi-modal embedding space.

Tasks: Action Localization, Retrieval, +2
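The modality-agnostic idea can be sketched as follows (hypothetical PyTorch code with illustrative dimensions, not the released implementation): tokens from any subset of modalities are concatenated and processed by one shared transformer, then pooled into a single embedding in the joint space.

```python
import torch
import torch.nn as nn

class FusionTransformer(nn.Module):
    """Minimal modality-agnostic fusion sketch: tokens from any subset of
    modalities are fused by one shared transformer (sizes are assumptions)."""
    def __init__(self, dim=256, heads=4, layers=2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.fuse = nn.TransformerEncoder(layer, layers)
        self.proj = nn.Linear(dim, dim)     # into the joint embedding space

    def forward(self, *token_sets):         # each: (B, n_tokens_i, dim)
        tokens = torch.cat(token_sets, dim=1)   # modality-agnostic concat
        fused = self.fuse(tokens)
        return self.proj(fused.mean(dim=1))     # pooled fused embedding

model = FusionTransformer()
video = torch.randn(2, 8, 256)   # e.g. video tokens
audio = torch.randn(2, 6, 256)   # e.g. audio tokens
text  = torch.randn(2, 4, 256)   # e.g. text tokens
emb = model(video, audio, text)  # any subset of modalities also works
print(emb.shape)                 # torch.Size([2, 256])
```

Because the fusion step never conditions on which modality a token came from, the same weights can fuse video+audio, video+text, or all three at once, which is what makes the design modality-agnostic.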
