no code implementations • 16 Jun 2025 • Nick Yiwen Huang, Akin Caliskan, Berkay Kicanaoglu, James Tompkin, Hyeongwoo Kim
We consider the problem of disentangling 3D from large vision-language models, which we demonstrate on generative 3D portraits.
no code implementations • 30 May 2025 • Yiqing Liang, JieLin Qiu, Wenhao Ding, Zuxin Liu, James Tompkin, Mengdi Xu, Mengzhou Xia, Zhengzhong Tu, Laixi Shi, Jiacheng Zhu
Specifically, (1) we developed a multimodal RLVR framework for multi-dataset post-training by curating a dataset of diverse verifiable vision-language problems and enabling multi-domain online RL training with different verifiable rewards; (2) we proposed a data mixture strategy that learns to predict the RL fine-tuning outcome from the data mixture distribution, and consequently optimizes the mixture.
no code implementations • CVPR 2025 • Runfeng Li, Mikhail Okunev, Zixuan Guo, Anh Ha Duong, Christian Richardt, Matthew O'Toole, James Tompkin
Quickly achieving high-fidelity dynamic 3D reconstruction from a single viewpoint is a significant challenge in computer vision.
no code implementations • CVPR 2025 • Yiqing Liang, Abhishek Badki, Hang Su, James Tompkin, Orazio Gallo
Large models have shown generalization across datasets for many low-level vision tasks, like depth estimation, but no such general models exist for scene flow.
1 code implementation • 9 Jan 2025 • Yiwen Huang, Aaron Gokaslan, Volodymyr Kuleshov, James Tompkin
There is a widespread claim that GANs are difficult to train, and GAN architectures in the literature are littered with empirical tricks.
Ranked #3 on Image Generation on ImageNet 32x32
no code implementations • 5 Dec 2024 • Yiqing Liang, Mikhail Okunev, Mikaela Angelina Uy, Runfeng Li, Leonidas Guibas, James Tompkin, Adam W. Harley
Gaussian splatting methods are emerging as a popular approach for converting multi-view image data into scene representations that allow view synthesis.
no code implementations • CVPR 2024 • Hakyeong Kim, Andreas Meuleman, Hyeonjoong Jang, James Tompkin, Min H. Kim
We present a method to reconstruct indoor and outdoor static scene geometry and appearance from an omnidirectional video captured along a small circular sweep.
no code implementations • 18 Dec 2023 • Yiqing Liang, Numair Khan, Zhengqin Li, Thu Nguyen-Phuoc, Douglas Lanman, James Tompkin, Lei Xiao
Specifically, we propose a template set of 3D Gaussians residing in a canonical space, and a time-dependent forward-warping deformation field to model dynamic objects.
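The canonical-plus-deformation idea in the excerpt can be sketched generically: keep one template set of Gaussian centers and warp them forward to each timestamp. This is a hypothetical illustration, not the paper's code; `warp_gaussians` and the toy deformation are made-up names, and the real method learns the deformation field.

```python
import numpy as np

def warp_gaussians(canonical_means, t, deform):
    """Forward-warp canonical Gaussian centers to time t.

    canonical_means: (N, 3) Gaussian centers in the canonical space.
    deform: callable (means, t) -> (N, 3) offsets; a stand-in for the
    learned time-dependent forward-warping deformation field.
    """
    return canonical_means + deform(canonical_means, t)

# Toy deformation: a rigid translation that grows linearly with time.
toy_deform = lambda means, t: np.tile([0.1 * t, 0.0, 0.0], (means.shape[0], 1))

means = np.zeros((4, 3))                      # template set of 4 Gaussians
warped = warp_gaussians(means, t=2.0, deform=toy_deform)
```

In the actual pipeline the covariances and appearance attributes would be warped or re-predicted as well; only the mean-warping skeleton is shown here.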
no code implementations • 22 May 2023 • Yiwen Huang, Zhiqiu Yu, Xinjie Yi, Yue Wang, James Tompkin
This results in a new model that effectively removes the quality tax between 3DMM-conditioned face GANs and the unconditional StyleGAN.
no code implementations • ICCV 2023 • Yiqing Liang, Eliot Laidlaw, Alexander Meyerowitz, Srinath Sridhar, James Tompkin
From video, we reconstruct a neural volume that captures time-varying color, density, scene flow, semantics, and attention information.
no code implementations • ICCV 2023 • Aarrushi Shandilya, Benjamin Attal, Christian Richardt, James Tompkin, Matthew O'Toole
We present an image formation model and optimization procedure that combines the advantages of neural radiance fields and structured light imaging.
no code implementations • 6 Oct 2022 • Andreas Meuleman, Hakyeong Kim, James Tompkin, Min H. Kim
Fusing RGB stereo and ToF information is a promising direction to overcome these issues, but a key problem remains: to provide high-quality 2D RGB images, the main color sensor's lens is optically stabilized, resulting in an unknown pose for the floating lens that breaks the geometric relationships between the multimodal image sensors.
no code implementations • CVPR 2022 • Donglai Wei, Siddhant Kharbanda, Sarthak Arora, Roshan Roy, Nishant Jain, Akash Palrecha, Tanav Shah, Shray Mathur, Ritik Mathur, Abhijay Kemkar, Anirudh Chakravarthy, Zudi Lin, Won-Dong Jang, Yansong Tang, Song Bai, James Tompkin, Philip H.S. Torr, Hanspeter Pfister
Many video understanding tasks require analyzing multi-shot videos, but existing datasets for video object segmentation (VOS) only consider single-shot videos.
1 code implementation • 22 Nov 2021 • Yiheng Xie, Towaki Takikawa, Shunsuke Saito, Or Litany, Shiqin Yan, Numair Khan, Federico Tombari, James Tompkin, Vincent Sitzmann, Srinath Sridhar
Recent advances in machine learning have created increasing interest in solving visual computing problems using a class of coordinate-based neural networks that parametrize physical properties of scenes or objects across space and time.
no code implementations • ICCV 2021 • Kwang In Kim, James Tompkin
Then, we empirically estimate and strengthen the statistical dependence between the initial noisy predictor and the additional features via manifold denoising.
1 code implementation • NeurIPS 2021 • Benjamin Attal, Eliot Laidlaw, Aaron Gokaslan, Changil Kim, Christian Richardt, James Tompkin, Matthew O'Toole
Neural networks can represent and accurately reconstruct radiance fields for static 3D scenes (e.g., NeRF).
no code implementations • 2 Sep 2021 • Beatrix-Emőke Fülöp-Balogh, Eleanor Tursman, James Tompkin, Julie Digne, Nicolas Bonneel
Structure from motion (SfM) enables us to reconstruct a scene via casual capture from cameras at different viewpoints, and novel view synthesis (NVS) allows us to render a captured scene from a new viewpoint.
1 code implementation • 7 Jul 2021 • Numair Khan, Min H. Kim, James Tompkin
We present an algorithm to quickly estimate accurate depth maps from light fields via a sparse set of depth edges and gradients.
no code implementations • 24 Jun 2021 • Youssef A. Mejjati, Isa Milefchik, Aaron Gokaslan, Oliver Wang, Kwang In Kim, James Tompkin
We present an algorithm that learns a coarse 3D representation of objects from unposed multi-view 2D mask supervision, then uses it to generate detailed mask and image texture.
1 code implementation • CVPR 2021 • Numair Khan, Min H. Kim, James Tompkin
We present a method to estimate dense depth by optimizing a sparse set of points such that their diffusion into a depth map minimizes a multi-view reprojection error from RGB supervision.
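The diffusion step described in the excerpt can be sketched as filling a dense map from sparse seeds by isotropic smoothing. This is a minimal stand-in, not the authors' implementation: the paper additionally optimizes the seed values against a multi-view reprojection loss, which is omitted here, and `diffuse_sparse_depth` is a hypothetical name.

```python
import numpy as np

def diffuse_sparse_depth(sparse, mask, iters=500):
    """Fill a dense depth map by diffusing values from sparse seeds.

    sparse: (H, W) depth values, valid only where mask is True.
    mask:   (H, W) boolean seed locations, held fixed (Dirichlet).
    A Jacobi iteration of the Laplace equation; np.roll wraps at the
    borders for brevity (a real implementation would pad instead).
    """
    depth = sparse.copy()
    for _ in range(iters):
        nb = (np.roll(depth, 1, 0) + np.roll(depth, -1, 0) +
              np.roll(depth, 1, 1) + np.roll(depth, -1, 1)) / 4.0
        depth = np.where(mask, sparse, nb)   # keep seeds clamped
    return depth
```

With constant seed values, the iteration converges to a constant map; with varying seeds it produces the smooth interpolation that the reprojection error would then supervise.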
1 code implementation • 6 Dec 2020 • Purvi Goel, Loudon Cohen, James Guesman, Vikas Thamizharasan, James Tompkin, Daniel Ritchie
In this paper, we explore how to use differentiable ray tracing to refine an initial coarse mesh and per-mesh-facet material representation.
no code implementations • 2 Dec 2020 • Won-Dong Jang, Donglai Wei, Xingxuan Zhang, Brian Leahy, Helen Yang, James Tompkin, Dalit Ben-Yosef, Daniel Needleman, Hanspeter Pfister
To alleviate the problem, we propose to classify input features into intermediate shape codes and recover complete object shapes from them.
1 code implementation • 9 Sep 2020 • Numair Khan, Min H. Kim, James Tompkin
Previous light field depth estimation methods typically estimate a depth map only for the central sub-aperture view, and struggle with view-consistent estimation.
1 code implementation • ECCV 2020 • Atsunobu Kotani, Stefanie Tellex, James Tompkin
Instead, we introduce the Decoupled Style Descriptor (DSD) model for handwriting, which factors both character- and writer-level styles and allows our model to represent an overall greater space of styles.
1 code implementation • ECCV 2020 • Benjamin Attal, Selena Ling, Aaron Gokaslan, Christian Richardt, James Tompkin
Our approach is to simultaneously learn depth and disocclusions via a multi-sphere image representation, which can be rendered with correct 6DoF disparity and motion parallax in VR.
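The multi-sphere image representation mentioned in the excerpt is rendered by alpha-compositing concentric layers. Below is a generic back-to-front "over" compositing sketch under the assumption of RGBA layers ordered far to near; the per-layer disparity and 6DoF reprojection that give VR parallax are not shown, and `composite_msi` is a hypothetical name.

```python
import numpy as np

def composite_msi(layers):
    """Alpha-composite multi-sphere image layers from far to near.

    layers: list of (H, W, 4) RGBA arrays, alpha in [0, 1], ordered
    far -> near. Returns the (H, W, 3) composited RGB image.
    """
    rgb = np.zeros_like(layers[0][..., :3])
    for layer in layers:
        a = layer[..., 3:4]
        rgb = layer[..., :3] * a + rgb * (1.0 - a)  # "over" operator
    return rgb

far = np.zeros((2, 2, 4)); far[..., 0] = 1.0; far[..., 3] = 1.0   # opaque red
near = np.zeros((2, 2, 4)); near[..., 1] = 1.0; near[..., 3] = 0.5  # half-transparent green
image = composite_msi([far, near])
```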
1 code implementation • 1 Jan 2020 • Youssef Alami Mejjati, Zejiang Shen, Michael Snower, Aaron Gokaslan, Oliver Wang, James Tompkin, Kwang In Kim
We present an algorithm to generate diverse foreground objects and composite them into background images using a GAN architecture.
1 code implementation • NeurIPS 2018 • Youssef Alami Mejjati, Christian Richardt, James Tompkin, Darren Cosker, Kwang In Kim
Current unsupervised image-to-image translation techniques struggle to focus their attention on individual objects without altering the background or the way multiple objects interact within a scene.
4 code implementations • ECCV 2018 • Aaron Gokaslan, Vivek Ramanujan, Daniel Ritchie, Kwang In Kim, James Tompkin
Unsupervised image-to-image translation techniques are able to map local texture between two domains, but they are typically unsuccessful when the domains require larger shape change.
2 code implementations • 6 Jun 2018 • Youssef A. Mejjati, Christian Richardt, James Tompkin, Darren Cosker, Kwang In Kim
Current unsupervised image-to-image translation techniques struggle to focus their attention on individual objects without altering the background or the way multiple objects interact within a scene.
no code implementations • CVPR 2018 • Kwang In Kim, Juhyun Park, James Tompkin
When learning functions on manifolds, we can improve performance by regularizing with respect to the intrinsic manifold geometry rather than the ambient space.
no code implementations • ICCV 2017 • Kwang In Kim, James Tompkin, Christian Richardt
We present an algorithm for test-time combination of a set of reference predictors with unknown parametric forms.
no code implementations • 12 Jun 2017 • James Tompkin, Kwang In Kim, Hanspeter Pfister, Christian Theobalt
Large databases are often organized by hand-labeled metadata, or criteria, which are expensive to collect.
no code implementations • CVPR 2018 • Daniel Haehn, Verena Kaynig, James Tompkin, Jeff W. Lichtman, Hanspeter Pfister
Automatic cell image segmentation methods in connectomics produce merge and split errors, which require correction through proofreading.
no code implementations • ICCV 2015 • Kwang In Kim, James Tompkin, Hanspeter Pfister, Christian Theobalt
Existing approaches for diffusion on graphs, e.g., for label propagation, are mainly focused on isotropic diffusion, which is induced by the commonly-used graph Laplacian regularizer.
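The isotropic diffusion the excerpt contrasts against can be sketched as standard normalized-Laplacian label propagation (Zhou et al.-style smoothing). This is the baseline the paper improves on, not its anisotropic method; `propagate_labels` is a made-up name for illustration.

```python
import numpy as np

def propagate_labels(W, labels, alpha=0.9, iters=100):
    """Isotropic label propagation via normalized-Laplacian smoothing.

    W:      (N, N) symmetric non-negative affinity matrix.
    labels: (N, C) rows are one-hot for labeled nodes, zero otherwise.
    Iterates F <- alpha * S @ F + (1 - alpha) * labels with
    S = D^{-1/2} W D^{-1/2}, then predicts by argmax per node.
    """
    d = W.sum(axis=1)
    S = W / np.sqrt(np.outer(d, d))          # symmetric normalization
    F = labels.astype(float).copy()
    for _ in range(iters):
        F = alpha * (S @ F) + (1.0 - alpha) * labels
    return F.argmax(axis=1)

# Two tight pairs of nodes with weak cross links; one seed label per pair.
W = np.array([[0, 1, .01, .01],
              [1, 0, .01, .01],
              [.01, .01, 0, 1],
              [.01, .01, 1, 0]])
Y = np.zeros((4, 2)); Y[0, 0] = 1; Y[2, 1] = 1
pred = propagate_labels(W, Y)
```

Because the diffusion weights depend only on the affinities, not on direction, labels spread uniformly along edges, which is exactly the isotropy the paper's anisotropic regularizer relaxes.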
no code implementations • CVPR 2015 • Kwang In Kim, James Tompkin, Hanspeter Pfister, Christian Theobalt
In many learning tasks, the structure of the target space of a function holds rich information about the relationships between evaluations of functions on different data points.
no code implementations • CVPR 2015 • Kwang In Kim, James Tompkin, Hanspeter Pfister, Christian Theobalt
The iterated graph Laplacian enables high-order regularization, but it has a high computational complexity and so cannot be applied to large problems.