Search Results for author: Chao-yuan Wu

Found 20 papers, 14 papers with code

Reversible Vision Transformers

4 code implementations CVPR 2022 Karttikeya Mangalam, Haoqi Fan, Yanghao Li, Chao-yuan Wu, Bo Xiong, Christoph Feichtenhofer, Jitendra Malik

Reversible Vision Transformers achieve a reduced memory footprint of up to 15.5x at roughly identical model complexity, parameters and accuracy, demonstrating the promise of reversible vision transformers as an efficient backbone for hardware resource limited training regimes.

Image Classification object-detection +2

Multiview Compressive Coding for 3D Reconstruction

1 code implementation CVPR 2023 Chao-yuan Wu, Justin Johnson, Jitendra Malik, Christoph Feichtenhofer, Georgia Gkioxari

We introduce a simple framework that operates on 3D points of single objects or whole scenes coupled with category-agnostic large-scale training from diverse RGB-D videos.

3D Reconstruction Self-Supervised Learning +1

MeMViT: Memory-Augmented Multiscale Vision Transformer for Efficient Long-Term Video Recognition

1 code implementation CVPR 2022 Chao-yuan Wu, Yanghao Li, Karttikeya Mangalam, Haoqi Fan, Bo Xiong, Jitendra Malik, Christoph Feichtenhofer

Instead of trying to process more frames at once like most existing methods, we propose to process videos in an online fashion and cache "memory" at each iteration.

Ranked #3 on Action Anticipation on EPIC-KITCHENS-100 (using extra training data)

Action Anticipation Action Classification +2

A ConvNet for the 2020s

45 code implementations CVPR 2022 Zhuang Liu, Hanzi Mao, Chao-yuan Wu, Christoph Feichtenhofer, Trevor Darrell, Saining Xie

The "Roaring 20s" of visual recognition began with the introduction of Vision Transformers (ViTs), which quickly superseded ConvNets as the state-of-the-art image classification model.

Classification Domain Generalization +3

MViTv2: Improved Multiscale Vision Transformers for Classification and Detection

7 code implementations CVPR 2022 Yanghao Li, Chao-yuan Wu, Haoqi Fan, Karttikeya Mangalam, Bo Xiong, Jitendra Malik, Christoph Feichtenhofer

In this paper, we study Multiscale Vision Transformers (MViTv2) as a unified architecture for image and video classification, as well as object detection.

Ranked #1 on Action Classification on Kinetics-600 (GFLOPs metric)

Action Classification Action Recognition +6

Towards Long-Form Video Understanding

2 code implementations CVPR 2021 Chao-yuan Wu, Philipp Krähenbühl

Our world offers a never-ending stream of visual stimuli, yet today's vision systems only accurately recognize patterns within a few seconds.

Action Recognition Video Recognition +1

A Multigrid Method for Efficiently Training Video Models

3 code implementations CVPR 2020 Chao-yuan Wu, Ross Girshick, Kaiming He, Christoph Feichtenhofer, Philipp Krähenbühl

We empirically demonstrate a general and robust grid schedule that yields a significant out-of-the-box training speedup without a loss in accuracy for different models (I3D, non-local, SlowFast), datasets (Kinetics, Something-Something, Charades), and training settings (with and without pre-training, 128 GPUs or 1 GPU).

Action Detection Action Recognition +2

Fashion++: Minimal Edits for Outfit Improvement

no code implementations ICCV 2019 Wei-Lin Hsiao, Isay Katsman, Chao-yuan Wu, Devi Parikh, Kristen Grauman

We introduce Fashion++, an approach that proposes minimal adjustments to a full-body clothing outfit that will have maximal impact on its fashionability.

Image Generation

Video Compression through Image Interpolation

1 code implementation ECCV 2018 Chao-yuan Wu, Nayan Singhal, Philipp Krähenbühl

An ever increasing amount of our digital communication, media consumption, and content creation revolves around videos.

Video Compression

Doubly Greedy Primal-Dual Coordinate Descent for Sparse Empirical Risk Minimization

no code implementations ICML 2017 Qi Lei, Ian En-Hsu Yen, Chao-yuan Wu, Inderjit S. Dhillon, Pradeep Ravikumar

We consider the popular problem of sparse empirical risk minimization with linear predictors and a large number of both features and observations.

Spectral Methods for Nonparametric Models

no code implementations 31 Mar 2017 Hsiao-Yu Fish Tung, Chao-yuan Wu, Manzil Zaheer, Alexander J. Smola

Nonparametric models are a versatile, albeit computationally expensive, tool for modeling mixture models.

Recurrent Recommender Networks

no code implementations WSDM 2017 Chao-yuan Wu, Amr Ahmed, Alex Beutel, Alexander J. Smola, How Jing

Recommender systems traditionally assume that user profiles and movie attributes are static.

Recommendation Systems

Explaining reviews and ratings with PACO: Poisson Additive Co-Clustering

no code implementations 6 Dec 2015 Chao-yuan Wu, Alex Beutel, Amr Ahmed, Alexander J. Smola

With this novel technique we propose a new Bayesian model for joint collaborative filtering of ratings and text reviews through a sum of simple co-clusterings.

Clustering Collaborative Filtering
