Search Results for author: Yin Cui

Found 25 papers, 17 papers with code

Revisiting 3D ResNets for Video Recognition

1 code implementation3 Sep 2021 Xianzhi Du, Yeqing Li, Yin Cui, Rui Qian, Jing Li, Irwan Bello

A recent work from Bello shows that training and scaling strategies may be more significant than model architectures for visual recognition.

Action Classification Contrastive Learning +1

Federated Multi-Target Domain Adaptation

no code implementations17 Aug 2021 Chun-Han Yao, Boqing Gong, Yin Cui, Hang Qi, Yukun Zhu, Ming-Hsuan Yang

We further take the server-client and inter-client domain shifts into account and pose a domain adaptation problem with one source (centralized server data) and multiple targets (distributed client data).

Domain Adaptation Federated Learning +3

Single Image Texture Translation for Data Augmentation

1 code implementation25 Jun 2021 Boyi Li, Yin Cui, Tsung-Yi Lin, Serge Belongie

In this paper, we explore the use of Single Image Texture Translation (SITT) for data augmentation.

Data Augmentation Few-Shot Image Classification +2

Bridging the Gap Between Object Detection and User Intent via Query-Modulation

no code implementations18 Jun 2021 Marco Fornoni, Chaochao Yan, Liangchen Luo, Kimberly Wilber, Alex Stark, Yin Cui, Boqing Gong, Andrew Howard

This often leads to incorrect results, such as lack of a high-confidence detection on the object of interest, or detection with a wrong class label.

Object Detection

VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text

2 code implementations22 Apr 2021 Hassan Akbari, Liangzhe Yuan, Rui Qian, Wei-Hong Chuang, Shih-Fu Chang, Yin Cui, Boqing Gong

We train VATT end-to-end from scratch using multimodal contrastive losses and evaluate its performance by the downstream tasks of video action recognition, audio event classification, image classification, and text-to-video retrieval.

 Ranked #1 on Action Classification on Moments in Time (using extra training data)

Action Classification Action Recognition +7

Spatiotemporal Contrastive Video Representation Learning

3 code implementations CVPR 2021 Rui Qian, Tianjian Meng, Boqing Gong, Ming-Hsuan Yang, Huisheng Wang, Serge Belongie, Yin Cui

Our representations are learned using a contrastive loss, where two augmented clips from the same short video are pulled together in the embedding space, while clips from different videos are pushed away.

Contrastive Learning Data Augmentation +4

Rethinking Pre-training and Self-training

2 code implementations NeurIPS 2020 Barret Zoph, Golnaz Ghiasi, Tsung-Yi Lin, Yin Cui, Hanxiao Liu, Ekin D. Cubuk, Quoc V. Le

For example, on the COCO object detection dataset, pre-training benefits when we use one fifth of the labeled data, and hurts accuracy when we use all labeled data.

 Ranked #1 on Semantic Segmentation on PASCAL VOC 2012 test (using extra training data)

Data Augmentation Object Detection +1

Fashionpedia: Ontology, Segmentation, and an Attribute Localization Dataset

3 code implementations ECCV 2020 Menglin Jia, Mengyun Shi, Mikhail Sirotenko, Yin Cui, Claire Cardie, Bharath Hariharan, Hartwig Adam, Serge Belongie

In this work we explore the task of instance segmentation with attribute localization, which unifies instance segmentation (detect and segment each object instance) and fine-grained visual attribute categorization (recognize one or multiple attributes).

Fine-Grained Visual Categorization Fine-Grained Visual Recognition +3

Measuring Dataset Granularity

1 code implementation21 Dec 2019 Yin Cui, Zeqi Gu, Dhruv Mahajan, Laurens van der Maaten, Serge Belongie, Ser-Nam Lim

We also investigate the interplay between dataset granularity with a variety of factors and find that fine-grained datasets are more difficult to learn from, more difficult to transfer to, more difficult to perform few-shot learning with, and more vulnerable to adversarial attacks.

Few-Shot Learning

SpineNet: Learning Scale-Permuted Backbone for Recognition and Localization

5 code implementations CVPR 2020 Xianzhi Du, Tsung-Yi Lin, Pengchong Jin, Golnaz Ghiasi, Mingxing Tan, Yin Cui, Quoc V. Le, Xiaodan Song

We propose SpineNet, a backbone with scale-permuted intermediate features and cross-scale connections that is learned on an object detection task by Neural Architecture Search.

Classification General Classification +5

The iMaterialist Fashion Attribute Dataset

1 code implementation13 Jun 2019 Sheng Guo, Weilin Huang, Xiao Zhang, Prasanna Srikhanta, Yin Cui, Yuan Li, Matthew R. Scott, Hartwig Adam, Serge Belongie

The dataset is constructed from over one million fashion images with a label space that includes 8 groups of 228 fine-grained attributes in total.

General Classification Image Classification +1

Class-Balanced Loss Based on Effective Number of Samples

6 code implementations CVPR 2019 Yin Cui, Menglin Jia, Tsung-Yi Lin, Yang song, Serge Belongie

We design a re-weighting scheme that uses the effective number of samples for each class to re-balance the loss, thereby yielding a class-balanced loss.

Image Classification Long-tail Learning

Learning to Evaluate Image Captioning

1 code implementation CVPR 2018 Yin Cui, Guandao Yang, Andreas Veit, Xun Huang, Serge Belongie

To address these two challenges, we propose a novel learning based discriminative evaluation metric that is directly trained to distinguish between human and machine-generated captions.

Data Augmentation Image Captioning

Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning

1 code implementation CVPR 2018 Yin Cui, Yang song, Chen Sun, Andrew Howard, Serge Belongie

We propose a measure to estimate domain similarity via Earth Mover's Distance and demonstrate that transfer learning benefits from pre-training on a source domain that is similar to the target domain by this measure.

Fine-Grained Image Classification Fine-Grained Visual Categorization +1

The iNaturalist Species Classification and Detection Dataset

3 code implementations CVPR 2018 Grant Van Horn, Oisin Mac Aodha, Yang song, Yin Cui, Chen Sun, Alex Shepard, Hartwig Adam, Pietro Perona, Serge Belongie

Existing image classification datasets used in computer vision tend to have a uniform distribution of images across object categories.

Classification General Classification +1

Kernel Pooling for Convolutional Neural Networks

no code implementations CVPR 2017 Yin Cui, Feng Zhou, Jiang Wang, Xiao Liu, Yuanqing Lin, Serge Belongie

We demonstrate how to approximate kernels such as Gaussian RBF up to a given order using compact explicit feature maps in a parameter-free manner.

Face Recognition Fine-Grained Visual Categorization +2

Learning Deep Representations for Ground-to-Aerial Geolocalization

no code implementations CVPR 2015 Tsung-Yi Lin, Yin Cui, Serge Belongie, James Hays

Most approaches predict the location of a query image by matching to ground-level images with known locations (e. g., street-view data).

Face Verification

Building A Large Concept Bank for Representing Events in Video

no code implementations29 Mar 2014 Yin Cui, Dong Liu, Jiawei Chen, Shih-Fu Chang

In this paper, we propose to build Concept Bank, the largest concept library consisting of 4, 876 concepts specifically designed to cover 631 real-world events.

Event Detection

Cannot find the paper you are looking for? You can Submit a new open access paper.