Search Results for author: Rogerio S. Feris

Found 12 papers, 3 papers with code

Designing Category-Level Attributes for Discriminative Visual Recognition

no code implementations • CVPR 2013 • Felix X. Yu, Liangliang Cao, Rogerio S. Feris, John R. Smith, Shih-Fu Chang

In this paper, we propose a novel formulation to automatically design discriminative "category-level attributes", which can be efficiently encoded by a compact category-attribute matrix.

Tasks: Attribute, Transfer Learning, +1
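To make the encoding concrete, here is a minimal, hypothetical NumPy sketch (not the authors' learned design): each category is one row of a compact category-attribute matrix A, and an image is classified by matching its predicted attribute scores against those rows.

```python
import numpy as np

rng = np.random.default_rng(0)
n_categories, n_attributes = 5, 16

# Hypothetical category-attribute matrix A: row i encodes category i as an
# n_attributes-dimensional signature. The paper designs A discriminatively;
# here it is random, purely for illustration.
A = np.sign(rng.standard_normal((n_categories, n_attributes)))

# Attribute scores predicted for one image by per-attribute classifiers
s = rng.standard_normal(n_attributes)

# Decode: pick the category whose attribute signature best matches s
pred = int(np.argmax(A @ s))
print("predicted category:", pred)
```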

An exploration of parameter redundancy in deep networks with circulant projections

no code implementations • ICCV 2015 • Yu Cheng, Felix X. Yu, Rogerio S. Feris, Sanjiv Kumar, Alok Choudhary, Shih-Fu Chang

We explore the redundancy of parameters in deep neural networks by replacing the conventional linear projection in fully-connected layers with the circulant projection.
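The efficiency argument rests on a standard fact: a circulant matrix is fully determined by a single d-dimensional vector and multiplies a vector in O(d log d) time via the FFT, versus O(d^2) for a dense projection. A minimal NumPy sketch of that equivalence (illustrative only; the paper learns the defining vector during training):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
r = rng.standard_normal(d)   # defining vector: d parameters instead of d*d
x = rng.standard_normal(d)   # input to the "fully-connected" layer

# Dense equivalent: circulant matrix with C[n, k] = r[(n - k) mod d]
idx = (np.arange(d)[:, None] - np.arange(d)[None, :]) % d
C = r[idx]

# Same projection in O(d log d) via FFT (circular convolution theorem)
y_fft = np.fft.ifft(np.fft.fft(r) * np.fft.fft(x)).real

assert np.allclose(C @ x, y_fft)
```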

A Recurrent Encoder-Decoder Network for Sequential Face Alignment

no code implementations • 19 Aug 2016 • Xi Peng, Rogerio S. Feris, Xiaoyu Wang, Dimitris N. Metaxas

We propose a novel recurrent encoder-decoder network model for real-time video-based face alignment.

Tasks: Decoder, Face Alignment
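As a rough illustration of this architecture class (a hypothetical PyTorch sketch with made-up layer sizes, not the authors' network): a convolutional encoder maps each frame to a latent code, a recurrent unit carries temporal context across frames, and a decoder emits per-frame landmark heatmaps.

```python
import torch
import torch.nn as nn

class RecurrentEncoderDecoder(nn.Module):
    """Illustrative recurrent encoder-decoder for per-frame landmark
    heatmaps; all dimensions are assumptions, not the paper's."""
    def __init__(self, n_landmarks=68, hidden=128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.rnn = nn.GRU(64 * 16 * 16, hidden, batch_first=True)
        self.to_feat = nn.Linear(hidden, 64 * 16 * 16)
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, n_landmarks, 4, stride=2, padding=1),
        )

    def forward(self, frames):                        # (B, T, 3, 64, 64)
        b, t = frames.shape[:2]
        z = self.encoder(frames.flatten(0, 1))        # (B*T, 64, 16, 16)
        h, _ = self.rnn(z.flatten(1).view(b, t, -1))  # temporal recurrence
        f = self.to_feat(h).view(b * t, 64, 16, 16)
        heatmaps = self.decoder(f)                    # (B*T, L, 64, 64)
        return heatmaps.view(b, t, -1, 64, 64)

model = RecurrentEncoderDecoder()
out = model(torch.randn(2, 4, 3, 64, 64))
print(out.shape)  # torch.Size([2, 4, 68, 64, 64])
```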

Automatic Curation of Golf Highlights using Multimodal Excitement Features

no code implementations • 22 Jul 2017 • Michele Merler, Dhiraj Joshi, Quoc-Bao Nguyen, Stephen Hammer, John Kent, John R. Smith, Rogerio S. Feris

The production of sports highlight packages summarizing a game's most exciting moments is an essential task for broadcast media.

Tasks: Action Recognition, Retrieval, +2

RED-Net: A Recurrent Encoder-Decoder Network for Video-based Face Alignment

no code implementations • 17 Jan 2018 • Xi Peng, Rogerio S. Feris, Xiaoyu Wang, Dimitris N. Metaxas

We propose a novel method for real-time face alignment in videos based on a recurrent encoder-decoder network model.

Tasks: Decoder, Face Alignment

Everything at Once - Multi-Modal Fusion Transformer for Video Retrieval

1 code implementation • CVPR 2022 • Nina Shvetsova, Brian Chen, Andrew Rouditchenko, Samuel Thomas, Brian Kingsbury, Rogerio S. Feris, David Harwath, James Glass, Hilde Kuehne

In this work, we present a multi-modal, modality-agnostic fusion transformer that learns to exchange information between multiple modalities, such as video, audio, and text, and integrate them into a fused representation in a joint multi-modal embedding space.

Tasks: Action Localization, Retrieval, +2
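The modality-agnostic idea can be sketched as follows (hypothetical PyTorch code with illustrative dimensions, not the released implementation): tokens from any subset of modalities are concatenated and processed by one shared transformer, then pooled into a single embedding in the joint space.

```python
import torch
import torch.nn as nn

class FusionTransformer(nn.Module):
    """Minimal modality-agnostic fusion sketch: tokens from any subset of
    modalities are fused by one shared transformer (sizes are assumptions)."""
    def __init__(self, dim=256, heads=4, layers=2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.fuse = nn.TransformerEncoder(layer, layers)
        self.proj = nn.Linear(dim, dim)     # into the joint embedding space

    def forward(self, *token_sets):         # each: (B, n_tokens_i, dim)
        tokens = torch.cat(token_sets, dim=1)   # modality-agnostic concat
        fused = self.fuse(tokens)
        return self.proj(fused.mean(dim=1))     # pooled fused embedding

model = FusionTransformer()
video = torch.randn(2, 8, 256)   # e.g. video tokens
audio = torch.randn(2, 6, 256)   # e.g. audio tokens
text  = torch.randn(2, 4, 256)   # e.g. text tokens
emb = model(video, audio, text)  # any subset of modalities also works
print(emb.shape)                 # torch.Size([2, 256])
```

Because the fusion step never conditions on which modality a token came from, the same weights can fuse video+audio, video+text, or all three at once, which is what makes the design modality-agnostic.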
