1 code implementation • 14 Jun 2024 • Julian Straub, Daniel DeTone, Tianwei Shen, Nan Yang, Chris Sweeney, Richard Newcombe
The advent of wearable computers enables a new source of context for AI that is embedded in egocentric sensor data.
Ranked #1 on
3D Reconstruction
on Aria Synthetic Environments
no code implementations • 30 Apr 2024 • Longlong Jing, Ruichi Yu, Xu Chen, Zhengli Zhao, Shiwei Sheng, Colin Graber, Qi Chen, Qinru Li, Shangxuan Wu, Han Deng, Sangjin Lee, Chris Sweeney, Qiurui He, Wei-Chih Hung, Tong He, Xingyi Zhou, Farshid Moussavi, Zijian Guo, Yin Zhou, Mingxing Tan, Weilong Yang, CongCong Li
In this paper, we propose STT, a Stateful Tracking model built with Transformers, that can consistently track objects in the scenes while also predicting their states accurately.
no code implementations • 26 Mar 2024 • Qiao Gu, Zhaoyang Lv, Duncan Frost, Simon Green, Julian Straub, Chris Sweeney
In this paper we present EgoLifter, a novel system that can automatically segment scenes captured from egocentric sensors into a complete decomposition of individual 3D objects.
1 code implementation • 20 Feb 2024 • Zhaoyang Lv, Nicholas Charron, Pierre Moulon, Alexander Gamino, Cheng Peng, Chris Sweeney, Edward Miller, Huixuan Tang, Jeff Meissner, Jing Dong, Kiran Somasundaram, Luis Pesqueira, Mark Schwesinger, Omkar Parkhi, Qiao Gu, Renzo De Nardi, Shangyi Cheng, Steve Saarinen, Vijay Baiyya, Yuyang Zou, Richard Newcombe, Jakob Julian Engel, Xiaqing Pan, Carl Ren
We present Aria Everyday Activities (AEA) Dataset, an egocentric multimodal open dataset recorded using Project Aria glasses.
no code implementations • 24 Aug 2023 • Jakob Engel, Kiran Somasundaram, Michael Goesele, Albert Sun, Alexander Gamino, Andrew Turner, Arjang Talattof, Arnie Yuan, Bilal Souti, Brighid Meredith, Cheng Peng, Chris Sweeney, Cole Wilson, Dan Barnes, Daniel DeTone, David Caruso, Derek Valleroy, Dinesh Ginjupalli, Duncan Frost, Edward Miller, Elias Mueggler, Evgeniy Oleinik, Fan Zhang, Guruprasad Somasundaram, Gustavo Solaira, Harry Lanaras, Henry Howard-Jenkins, Huixuan Tang, Hyo Jin Kim, Jaime Rivera, Ji Luo, Jing Dong, Julian Straub, Kevin Bailey, Kevin Eckenhoff, Lingni Ma, Luis Pesqueira, Mark Schwesinger, Maurizio Monge, Nan Yang, Nick Charron, Nikhil Raina, Omkar Parkhi, Peter Borschowa, Pierre Moulon, Prince Gupta, Raul Mur-Artal, Robbie Pennington, Sachin Kulkarni, Sagar Miglani, Santosh Gondi, Saransh Solanki, Sean Diener, Shangyi Cheng, Simon Green, Steve Saarinen, Suvam Patra, Tassos Mourikis, Thomas Whelan, Tripti Singh, Vasileios Balntas, Vijay Baiyya, Wilson Dreewes, Xiaqing Pan, Yang Lou, Yipu Zhao, Yusuf Mansour, Yuyang Zou, Zhaoyang Lv, Zijian Wang, Mingfei Yan, Carl Ren, Renzo De Nardi, Richard Newcombe
Egocentric, multi-modal data as available on future augmented reality (AR) devices provides unique challenges and opportunities for machine perception.
no code implementations • 4 Jun 2022 • Gil Avraham, Julian Straub, Tianwei Shen, Tsun-Yi Yang, Hugo Germain, Chris Sweeney, Vasileios Balntas, David Novotny, Daniel DeTone, Richard Newcombe
This paper presents a framework that combines traditional keypoint-based camera pose optimization with an invertible neural rendering mechanism.
no code implementations • CVPR 2022 • Fangyin Wei, Rohan Chabra, Lingni Ma, Christoph Lassner, Michael Zollhöfer, Szymon Rusinkiewicz, Chris Sweeney, Richard Newcombe, Mira Slavcheva
In addition, our representation enables a large variety of applications, such as few-shot reconstruction, the generation of novel articulations, and novel view-synthesis.
no code implementations • CVPR 2022 • Enric Corona, Tomas Hodan, Minh Vo, Francesc Moreno-Noguer, Chris Sweeney, Richard Newcombe, Lingni Ma
This paper proposes a do-it-all neural model of human hands, named LISA.
no code implementations • CVPR 2022 • Tony Ng, Hyo Jin Kim, Vincent Lee, Daniel DeTone, Tsun-Yi Yang, Tianwei Shen, Eddy Ilg, Vassileios Balntas, Krystian Mikolajczyk, Chris Sweeney
We let a feature encoding network and image reconstruction network compete with each other, such that the feature encoder tries to impede the image reconstruction with its generated descriptors, while the reconstructor tries to recover the input image from the descriptors.
1 code implementation • ICCV 2021 • Kejie Li, Daniel DeTone, Steven Chen, Minh Vo, Ian Reid, Hamid Rezatofighi, Chris Sweeney, Julian Straub, Richard Newcombe
Localizing objects and estimating their extent in 3D is an important step towards high-level 3D scene understanding, which has many applications in Augmented Reality and Robotics.
4 code implementations • 1 Mar 2021 • Philipp Jund, Chris Sweeney, Nichola Abdo, Zhifeng Chen, Jonathon Shlens
In this work, we introduce a new large-scale dataset for scene flow estimation derived from corresponding tracked 3D objects, which is $\sim$1, 000$\times$ larger than previous real-world datasets in terms of the number of annotated frames.
Ranked #6 on
Scene Flow Estimation
on Argoverse 2
no code implementations • 27 Aug 2020 • Aleksander Holynski, David Geraghty, Jan-Michael Frahm, Chris Sweeney, Richard Szeliski
Low-frequency long-range errors (drift) are an endemic problem in 3D structure from motion, and can often hamper reasonable reconstructions of the scene.
no code implementations • 21 Aug 2020 • Sungyong Baik, Hyo Jin Kim, Tianwei Shen, Eddy Ilg, Kyoung Mu Lee, Chris Sweeney
We tackle the problem of visual localization under changing conditions, such as time of day, weather, and seasons.
no code implementations • ACL 2019 • Chris Sweeney, Maryam Najafian
Word embedding models have gained a lot of traction in the Natural Language Processing community, however, they suffer from unintended demographic biases.
no code implementations • 8 Jun 2019 • Chris Sweeney, Aleksander Holynski, Brian Curless, Steve M Seitz
We present a novel Structure from Motion pipeline that is capable of reconstructing accurate camera poses for panorama-style video capture without prior camera intrinsic calibration.
no code implementations • 3 Apr 2019 • Rohan Chabra, Julian Straub, Chris Sweeney, Richard Newcombe, Henry Fuchs
We propose a system that uses a convolution neural network (CNN) to estimate depth from a stereo pair followed by volumetric fusion of the predicted depth maps to produce a 3D reconstruction of a scene.
no code implementations • 4 Oct 2017 • Qiaodong Cui, Victor Fragoso, Chris Sweeney, Pradeep Sen
We present GraphMatch, an approximate yet efficient method for building the matching graph for large-scale structure-from-motion (SfM) pipelines.
no code implementations • 27 Sep 2017 • Victor Fragoso, Chris Sweeney, Pradeep Sen, Matthew Turk
While RANSAC-based methods are robust to incorrect image correspondences (outliers), their hypothesis generators are not robust to correct image correspondences (inliers) with positional error (noise).
no code implementations • 13 Jul 2016 • Chris Sweeney, Victor Fragoso, Tobias Hollerer, Matthew Turk
We introduce the distributed camera model, a novel model for Structure-from-Motion (SfM).
no code implementations • ICCV 2015 • Chris Sweeney, Torsten Sattler, Tobias Hollerer, Matthew Turk, Marc Pollefeys
The viewing graph represents a set of views that are related by pairwise relative geometries.
no code implementations • CVPR 2015 • Chris Sweeney, Laurent Kneip, Tobias Hollerer, Matthew Turk
We propose a novel solution for computing the relative pose between two generalized cameras that includes reconciling the internal scale of the generalized cameras.