Search Results for author: SouYoung Jin

Found 10 papers, 2 papers with code

End-to-end Face Detection and Cast Grouping in Movies Using Erdős-Rényi Clustering

no code implementations 7 Sep 2017 SouYoung Jin, Hang Su, Chris Stauffer, Erik Learned-Miller

We introduce a novel verification method, rank-1 counts verification, that has this property, and use it in a link-based clustering scheme.

Clustering Face Detection

End-To-End Face Detection and Cast Grouping in Movies Using Erdos-Renyi Clustering

no code implementations ICCV 2017 SouYoung Jin, Hang Su, Chris Stauffer, Erik Learned-Miller

We introduce a novel verification method, rank-1 counts verification, that has this property, and use it in a link-based clustering scheme.

Clustering Face Detection

Unsupervised Hard Example Mining from Videos for Improved Object Detection

no code implementations ECCV 2018 SouYoung Jin, Aruni RoyChowdhury, Huaizu Jiang, Ashish Singh, Aditya Prasad, Deep Chakraborty, Erik Learned-Miller

In this work, we show how large numbers of hard negatives can be obtained automatically by analyzing the output of a trained detector on video sequences.

Face Detection object-detection +2

Automatic adaptation of object detectors to new domains using self-training

1 code implementation CVPR 2019 Aruni RoyChowdhury, Prithvijit Chakrabarty, Ashish Singh, SouYoung Jin, Huaizu Jiang, Liangliang Cao, Erik Learned-Miller

Our results demonstrate the usefulness of incorporating hard examples obtained from tracking, the advantage of using soft-labels via distillation loss versus hard-labels, and show promising performance as a simple method for unsupervised domain adaptation of object detectors, with minimal dependence on hyper-parameters.

Knowledge Distillation Pedestrian Detection +1

Spoken Moments: Learning Joint Audio-Visual Representations from Video Descriptions

no code implementations CVPR 2021 Mathew Monfort, SouYoung Jin, Alexander Liu, David Harwath, Rogerio Feris, James Glass, Aude Oliva

With this in mind, the descriptions people generate for videos of different dynamic events can greatly improve our understanding of the key information of interest in each video.

Contrastive Learning Retrieval +1

Cross-Modal Discrete Representation Learning

no code implementations ACL 2022 Alexander H. Liu, SouYoung Jin, Cheng-I Jeff Lai, Andrew Rouditchenko, Aude Oliva, James Glass

Recent advances in representation learning have demonstrated an ability to represent information from different modalities such as video, text, and audio in a single high-level embedding vector.

Cross-Modal Retrieval Quantization +4

Leveraging Temporal Context in Low Representational Power Regimes

no code implementations CVPR 2023 Camilo L. Fosco, SouYoung Jin, Emilie Josephs, Aude Oliva

We show that including information from the ETM during training improves action recognition and anticipation performance on various egocentric video datasets.

Action Anticipation Action Recognition

LangNav: Language as a Perceptual Representation for Navigation

no code implementations 11 Oct 2023 Bowen Pan, Rameswar Panda, SouYoung Jin, Rogerio Feris, Aude Oliva, Phillip Isola, Yoon Kim

We explore the use of language as a perceptual representation for vision-and-language navigation (VLN), with a focus on low-data settings.

Image Captioning Language Modelling +4

Learning Human Action Recognition Representations Without Real Humans

1 code implementation NeurIPS 2023 Howard Zhong, Samarth Mishra, Donghyun Kim, SouYoung Jin, Rameswar Panda, Hilde Kuehne, Leonid Karlinsky, Venkatesh Saligrama, Aude Oliva, Rogerio Feris

To this end, we present, for the first time, a benchmark that leverages real-world videos with humans removed and synthetic data containing virtual humans to pre-train a model.

Action Recognition Ethics +2

FT2TF: First-Person Statement Text-To-Talking Face Generation

no code implementations 9 Dec 2023 Xingjian Diao, Ming Cheng, Wayner Barrios, SouYoung Jin

This achievement highlights our model's capability to bridge first-person statements and dynamic face generation, providing insightful guidance for future work.

Talking Face Generation
