Search Results for author: Minh Vo

Found 17 papers, 8 papers with code

Ego4D: Around the World in 3,000 Hours of Egocentric Video

5 code implementations CVPR 2022 Kristen Grauman, Andrew Westbury, Eugene Byrne, Zachary Chavis, Antonino Furnari, Rohit Girdhar, Jackson Hamburger, Hao Jiang, Miao Liu, Xingyu Liu, Miguel Martin, Tushar Nagarajan, Ilija Radosavovic, Santhosh Kumar Ramakrishnan, Fiona Ryan, Jayant Sharma, Michael Wray, Mengmeng Xu, Eric Zhongcong Xu, Chen Zhao, Siddhant Bansal, Dhruv Batra, Vincent Cartillier, Sean Crane, Tien Do, Morrie Doulaty, Akshay Erapalli, Christoph Feichtenhofer, Adriano Fragomeni, Qichen Fu, Abrham Gebreselasie, Cristina Gonzalez, James Hillis, Xuhua Huang, Yifei HUANG, Wenqi Jia, Weslie Khoo, Jachym Kolar, Satwik Kottur, Anurag Kumar, Federico Landini, Chao Li, Yanghao Li, Zhenqiang Li, Karttikeya Mangalam, Raghava Modhugu, Jonathan Munro, Tullie Murrell, Takumi Nishiyasu, Will Price, Paola Ruiz Puentes, Merey Ramazanova, Leda Sari, Kiran Somasundaram, Audrey Southerland, Yusuke Sugano, Ruijie Tao, Minh Vo, Yuchen Wang, Xindi Wu, Takuma Yagi, Ziwei Zhao, Yunyi Zhu, Pablo Arbelaez, David Crandall, Dima Damen, Giovanni Maria Farinella, Christian Fuegen, Bernard Ghanem, Vamsi Krishna Ithapu, C. V. Jawahar, Hanbyul Joo, Kris Kitani, Haizhou Li, Richard Newcombe, Aude Oliva, Hyun Soo Park, James M. Rehg, Yoichi Sato, Jianbo Shi, Mike Zheng Shou, Antonio Torralba, Lorenzo Torresani, Mingfei Yan, Jitendra Malik

We introduce Ego4D, a massive-scale egocentric video dataset and benchmark suite.

De-identification Ethics

BANMo: Building Animatable 3D Neural Models from Many Casual Videos

1 code implementation CVPR 2022 Gengshan Yang, Minh Vo, Natalia Neverova, Deva Ramanan, Andrea Vedaldi, Hanbyul Joo

Our key insight is to merge three schools of thought; (1) classic deformable shape models that make use of articulated bones and blend skinning, (2) volumetric neural radiance fields (NeRFs) that are amenable to gradient-based optimization, and (3) canonical embeddings that generate correspondences between pixels and an articulated model.

3D Shape Reconstruction from Videos Dynamic Reconstruction

TAVA: Template-free Animatable Volumetric Actors

1 code implementation17 Jun 2022 RuiLong Li, Julian Tanke, Minh Vo, Michael Zollhofer, Jurgen Gall, Angjoo Kanazawa, Christoph Lassner

Since TAVA does not require a body template, it is applicable to humans as well as other creatures such as animals.

ContactOpt: Optimizing Contact to Improve Grasps

1 code implementation CVPR 2021 Patrick Grady, Chengcheng Tang, Christopher D. Twigg, Minh Vo, Samarth Brahmbhatt, Charles C. Kemp

Given a hand mesh and an object mesh, a deep model trained on ground truth contact data infers desirable contact across the surfaces of the meshes.

Snipper: A Spatiotemporal Transformer for Simultaneous Multi-Person 3D Pose Estimation Tracking and Forecasting on a Video Snippet

1 code implementation9 Jul 2022 Shihao Zou, Yuanlu Xu, Chao Li, Lingni Ma, Li Cheng, Minh Vo

In this paper, we propose Snipper, a unified framework to perform multi-person 3D pose estimation, tracking, and motion forecasting simultaneously in a single stage.

3D Pose Estimation Motion Forecasting +1

CarFusion: Combining Point Tracking and Part Detection for Dynamic 3D Reconstruction of Vehicles

1 code implementation CVPR 2018 N. Dinesh Reddy, Minh Vo, Srinivasa G. Narasimhan

In this work, we develop a framework to fuse both the single-view feature tracks and multi-view detected part locations to significantly improve the detection, localization and reconstruction of moving vehicles, even in the presence of strong occlusions.

3D Reconstruction Point Tracking

ODAM: Object Detection, Association, and Mapping using Posed RGB Video

1 code implementation ICCV 2021 Kejie Li, Daniel DeTone, Steven Chen, Minh Vo, Ian Reid, Hamid Rezatofighi, Chris Sweeney, Julian Straub, Richard Newcombe

Localizing objects and estimating their extent in 3D is an important step towards high-level 3D scene understanding, which has many applications in Augmented Reality and Robotics.

3D Object Detection Object +2

Self-supervised Multi-view Person Association and Its Applications

no code implementations22 May 2018 Minh Vo, Ersin Yumer, Kalyan Sunkavalli, Sunil Hadap, Yaser Sheikh, Srinivasa Narasimhan

Reliable markerless motion tracking of people participating in a complex group activity from multiple moving cameras is challenging due to frequent occlusions, strong viewpoint and appearance variations, and asynchronous video streams.

Clustering

Spatiotemporal Bundle Adjustment for Dynamic 3D Reconstruction

no code implementations CVPR 2016 Minh Vo, Srinivasa G. Narasimhan, Yaser Sheikh

In this paper, we present a spatiotemporal bundle adjustment approach that jointly optimizes four coupled sub-problems: estimating camera intrinsics and extrinsics, triangulating 3D static points, as well as subframe temporal alignment between cameras and estimating 3D trajectories of dynamic points.

3D Reconstruction Dynamic Reconstruction

4D Visualization of Dynamic Events from Unconstrained Multi-View Videos

no code implementations CVPR 2020 Aayush Bansal, Minh Vo, Yaser Sheikh, Deva Ramanan, Srinivasa Narasimhan

We present a data-driven approach for 4D space-time visualization of dynamic events from videos captured by hand-held multiple cameras.

Spatiotemporal Bundle Adjustment for Dynamic 3D Human Reconstruction in the Wild

no code implementations24 Jul 2020 Minh Vo, Yaser Sheikh, Srinivasa G. Narasimhan

The triangulation constraint, however, is invalid for moving points captured in multiple unsynchronized videos and bundle adjustment is not designed to estimate the temporal alignment between cameras.

3D Human Reconstruction

TexMesh: Reconstructing Detailed Human Texture and Geometry from RGB-D Video

no code implementations ECCV 2020 Tiancheng Zhi, Christoph Lassner, Tony Tung, Carsten Stoll, Srinivasa G. Narasimhan, Minh Vo

We present TexMesh, a novel approach to reconstruct detailed human meshes with high-resolution full-body texture from RGB-D video.

ANR: Articulated Neural Rendering for Virtual Avatars

no code implementations CVPR 2021 Amit Raj, Julian Tanke, James Hays, Minh Vo, Carsten Stoll, Christoph Lassner

The combination of traditional rendering with neural networks in Deferred Neural Rendering (DNR) provides a compelling balance between computational complexity and realism of the resulting images.

Neural Rendering

EgoHumans: An Egocentric 3D Multi-Human Benchmark

no code implementations25 May 2023 Rawal Khirodkar, Aayush Bansal, Lingni Ma, Richard Newcombe, Minh Vo, Kris Kitani

We present EgoHumans, a new multi-view multi-human video benchmark to advance the state-of-the-art of egocentric human 3D pose estimation and tracking.

3D Pose Estimation Human Detection

Ego-Humans: An Ego-Centric 3D Multi-Human Benchmark

no code implementations ICCV 2023 Rawal Khirodkar, Aayush Bansal, Lingni Ma, Richard Newcombe, Minh Vo, Kris Kitani

We present EgoHumans, a new multi-view multi-human video benchmark to advance the state-of-the-art of egocentric human 3D pose estimation and tracking.

3D Pose Estimation Human Detection

Cannot find the paper you are looking for? You can Submit a new open access paper.