Search Results for author: Yanghao Li

Found 19 papers, 11 papers with code

Improved Multiscale Vision Transformers for Classification and Detection

no code implementations · 2 Dec 2021 · Yanghao Li, Chao-yuan Wu, Haoqi Fan, Karttikeya Mangalam, Bo Xiong, Jitendra Malik, Christoph Feichtenhofer

In this paper, we study Multiscale Vision Transformers (MViT) as a unified architecture for image and video classification, as well as object detection.

Classification, Object Detection, +2

Benchmarking Detection Transfer Learning with Vision Transformers

no code implementations · 22 Nov 2021 · Yanghao Li, Saining Xie, Xinlei Chen, Piotr Dollar, Kaiming He, Ross Girshick

The complexity of object detection methods can make this benchmarking non-trivial when new architectures, such as Vision Transformer (ViT) models, arrive.

Object Detection, Self-Supervised Learning, +1

PyTorchVideo: A Deep Learning Library for Video Understanding

1 code implementation · 18 Nov 2021 · Haoqi Fan, Tullie Murrell, Heng Wang, Kalyan Vasudev Alwala, Yanghao Li, Yilei Li, Bo Xiong, Nikhila Ravi, Meng Li, Haichuan Yang, Jitendra Malik, Ross Girshick, Matt Feiszli, Aaron Adcock, Wan-Yen Lo, Christoph Feichtenhofer

We introduce PyTorchVideo, an open-source deep-learning library that provides a rich set of modular, efficient, and reproducible components for a variety of video understanding tasks, including classification, detection, self-supervised learning, and low-level processing.

Self-Supervised Learning, Video Understanding

Multiscale Vision Transformers

2 code implementations ICCV 2021 Haoqi Fan, Bo Xiong, Karttikeya Mangalam, Yanghao Li, Zhicheng Yan, Jitendra Malik, Christoph Feichtenhofer

We evaluate this fundamental architectural prior for modeling the dense nature of visual signals on a variety of video recognition tasks, where it outperforms concurrent vision transformers that rely on large-scale external pre-training and are 5-10x more costly in computation and parameters.

Action Classification, Action Recognition, +2
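The cost reduction described in the snippet above comes from MViT's pooling attention, where keys and values are spatially downsampled before the attention matrix is formed. The following is an illustrative NumPy sketch of that single idea, not the authors' implementation: the linear projections, multiple heads, and the channel-expansion schedule of the real model are omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def pooled_attention(x, stride=2):
    """Single-head attention where K and V are average-pooled along
    the sequence axis, so the attention matrix is (L, L//stride)
    instead of (L, L). Projections are identity for brevity."""
    L, d = x.shape
    pooled = x[: L - L % stride].reshape(-1, stride, d).mean(axis=1)
    q, k, v = x, pooled, pooled
    attn = softmax(q @ k.T / np.sqrt(d))  # (L, L//stride)
    return attn @ v                        # back to (L, d)

x = np.random.default_rng(0).normal(size=(8, 4))
y = pooled_attention(x, stride=2)
print(y.shape)  # (8, 4)
```

With `stride=2` the attention matrix has half as many columns, which is where the quadratic cost savings at high resolutions come from.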

Learning Model-Blind Temporal Denoisers without Ground Truths

no code implementations · 7 Jul 2020 · Yanghao Li, Bichuan Guo, Jiangtao Wen, Zhen Xia, Shan Liu, Yuxing Han

Denoisers trained with synthetic data often fail to cope with the diversity of unknown noises, motivating methods that can adapt to existing noise without knowing its ground truth.

Denoising, Optical Flow Estimation, +1

Modality Compensation Network: Cross-Modal Adaptation for Action Recognition

no code implementations · 31 Jan 2020 · Sijie Song, Jiaying Liu, Yanghao Li, Zongming Guo

In this work, we propose a Modality Compensation Network (MCN) to explore the relationships among different modalities and boost the representations for human action recognition.

Action Recognition, Optical Flow Estimation, +1

EGO-TOPO: Environment Affordances from Egocentric Video

1 code implementation CVPR 2020 Tushar Nagarajan, Yanghao Li, Christoph Feichtenhofer, Kristen Grauman

We introduce a model for environment affordances that is learned directly from egocentric video.

Scale-Aware Trident Networks for Object Detection

4 code implementations ICCV 2019 Yanghao Li, Yuntao Chen, Naiyan Wang, Zhao-Xiang Zhang

In this work, we first present a controlled experiment to investigate the effect of receptive fields for scale variation in object detection.

Object Detection
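The receptive-field experiment mentioned in the snippet above can be made concrete with the standard recurrence for the receptive field of stacked convolutions. The sketch below is illustrative (the dilation rates 1, 2, 3 match the multi-branch design TridentNet is known for, but the function itself is generic textbook arithmetic, not code from the paper):

```python
def receptive_field(layers):
    """Receptive field of stacked conv layers, each given as a
    (kernel, stride, dilation) tuple. Standard recurrence:
    r_out = r_in + (k - 1) * d * jump, with jump *= stride."""
    r, jump = 1, 1
    for k, s, d in layers:
        r += (k - 1) * d * jump
        jump *= s
    return r

# Three stacked 3x3 stride-1 convs at three dilation rates, as in a
# TridentNet-style multi-branch block sharing weights across branches:
for dilation in (1, 2, 3):
    print(dilation, receptive_field([(3, 1, dilation)] * 3))
```

Dilation alone changes the receptive field (7, 13, and 19 pixels here) without adding parameters, which is what lets the branches specialize to different object scales while sharing weights.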

Temporal Bilinear Networks for Video Action Recognition

no code implementations · 25 Nov 2018 · Yanghao Li, Sijie Song, Yuqi Li, Jiaying Liu

Temporal modeling in videos is a fundamental yet challenging problem in computer vision.

Action Recognition

PKU-MMD: A Large Scale Benchmark for Continuous Multi-Modal Human Action Understanding

no code implementations · 22 Mar 2017 · Chunhui Liu, Yueyu Hu, Yanghao Li, Sijie Song, Jiaying Liu

Although many 3D human activity benchmarks have been proposed, most existing action datasets focus on action recognition for segmented videos.

Action Detection, Action Recognition, +1

Demystifying Neural Style Transfer

2 code implementations · 4 Jan 2017 · Yanghao Li, Naiyan Wang, Jiaying Liu, Xiaodi Hou

Neural Style Transfer has recently demonstrated very exciting results which have caught attention in both academia and industry.

Domain Adaptation, Style Transfer
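The "Domain Adaptation" tag on this style-transfer paper reflects its central observation: matching Gram matrices of feature maps is equivalent to minimizing a Maximum Mean Discrepancy with a second-order polynomial kernel, treating each spatial position's feature vector as a sample. A small NumPy check of that identity (synthetic features, not the paper's code):

```python
import numpy as np

rng = np.random.default_rng(0)
# Feature maps flattened to (num_positions, channels): each spatial
# position is one sample from the feature distribution.
fx = rng.normal(size=(50, 8))
fy = rng.normal(size=(60, 8))

def gram(f):
    """Position-averaged Gram matrix: mean_i x_i x_i^T, shape (C, C)."""
    return f.T @ f / len(f)

def mmd2_poly2(x, y):
    """Squared MMD with the second-order polynomial kernel (a.b)^2."""
    k = lambda a, b: (a @ b.T) ** 2
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

gram_loss = np.sum((gram(fx) - gram(fy)) ** 2)
print(np.isclose(gram_loss, mmd2_poly2(fx, fy)))  # True
```

The identity follows from ||mean_i phi(x_i) - mean_j phi(y_j)||^2 with the feature map phi(a) = vec(a a^T), whose inner product is exactly (a.b)^2.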

Factorized Bilinear Models for Image Recognition

1 code implementation ICCV 2017 Yanghao Li, Naiyan Wang, Jiaying Liu, Xiaodi Hou

Although Deep Convolutional Neural Networks (CNNs) have demonstrated their power in various computer vision tasks, their most important components, convolutional layers and fully connected layers, are still limited to linear transformations.
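The factorized bilinear idea that addresses this limitation can be sketched for a single output unit: a pairwise interaction term x^T (F^T F) x is added to the usual linear response, with F low-rank so the extra cost is O(kn) rather than O(n^2). This is an illustrative sketch, not the paper's layer (which has per-output factors and a dropout variant):

```python
import numpy as np

def factorized_bilinear(x, w, b, F):
    """One output unit of a factorized bilinear response:
    y = b + w.x + x^T (F^T F) x, with F of shape (k, n), k << n."""
    return b + w @ x + x @ (F.T @ F) @ x

rng = np.random.default_rng(0)
n, k = 6, 2                               # input dim, factorization rank
x = rng.normal(size=n)
w, b, F = rng.normal(size=n), 0.5, rng.normal(size=(k, n))

y = factorized_bilinear(x, w, b, F)
# The interaction term equals ||F x||^2, the cheap way to evaluate it:
print(np.isclose(y, b + w @ x + np.sum((F @ x) ** 2)))  # True
```

Evaluating the interaction as ||Fx||^2 never materializes the n-by-n matrix F^T F, which is the point of the factorization.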

Co-occurrence Feature Learning for Skeleton based Action Recognition using Regularized Deep LSTM Networks

no code implementations · 24 Mar 2016 · Wentao Zhu, Cuiling Lan, Junliang Xing, Wen-Jun Zeng, Yanghao Li, Li Shen, Xiaohui Xie

Skeleton based action recognition distinguishes human actions using the trajectories of skeleton joints, which provide a very good representation for describing actions.

Action Recognition, Skeleton Based Action Recognition

Revisiting Batch Normalization For Practical Domain Adaptation

1 code implementation · 15 Mar 2016 · Yanghao Li, Naiyan Wang, Jianping Shi, Jiaying Liu, Xiaodi Hou

However, it remains a common annoyance that, during training, one has to prepare at least thousands of labeled images to fine-tune a network to a specific domain.

Domain Adaptation, Fine-tuning, +2
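This paper's adaptation recipe (often called AdaBN) needs no labeled target images at all: at test time, the batch-normalization statistics are recomputed on the target domain while the learned affine parameters are kept. A minimal NumPy sketch of that inference step, assuming per-feature normalization (the real method does this per BN layer inside the network):

```python
import numpy as np

def adabn(features_target, gamma, beta, eps=1e-5):
    """AdaBN-style inference: standardize target-domain features with
    statistics estimated on the target domain itself (rather than the
    source-domain running statistics), keeping learned gamma/beta."""
    mu = features_target.mean(axis=0)
    var = features_target.var(axis=0)
    return gamma * (features_target - mu) / np.sqrt(var + eps) + beta

rng = np.random.default_rng(0)
target = rng.normal(loc=3.0, scale=2.0, size=(1000, 4))  # shifted domain
out = adabn(target, gamma=np.ones(4), beta=np.zeros(4))
print(out.mean(axis=0).round(3), out.std(axis=0).round(3))
```

After the swap, the target features are re-centered and re-scaled, so the network sees activations in the same range it was trained on, without any target labels or fine-tuning.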
