Search Results for author: Zhaoyang Lv

Found 18 papers, 9 papers with code

VideoLLM-online: Online Video Large Language Model for Streaming Video

no code implementations • CVPR 2024 • Joya Chen, Zhaoyang Lv, Shiwei Wu, Kevin Qinghong Lin, Chenan Song, Difei Gao, Jia-Wei Liu, Ziteng Gao, Dongxing Mao, Mike Zheng Shou

Recent Large Language Models have been enhanced with vision capabilities, enabling them to comprehend images, videos, and interleaved vision-language content.

Language Modelling, Large Language Model

EgoLifter: Open-world 3D Segmentation for Egocentric Perception

no code implementations • 26 Mar 2024 • Qiao Gu, Zhaoyang Lv, Duncan Frost, Simon Green, Julian Straub, Chris Sweeney

In this paper we present EgoLifter, a novel system that can automatically segment scenes captured from egocentric sensors into a complete decomposition of individual 3D objects.

3D Reconstruction, Object

LAVE: LLM-Powered Agent Assistance and Language Augmentation for Video Editing

no code implementations • 15 Feb 2024 • Bryan Wang, Yuliang Li, Zhaoyang Lv, Haijun Xia, Yan Xu, Raj Sodhi

Based on these findings, we propose design implications to inform the future development of agent-assisted content editing.

Video Editing

AdaNeRF: Adaptive Sampling for Real-time Rendering of Neural Radiance Fields

1 code implementation • 21 Jul 2022 • Andreas Kurz, Thomas Neff, Zhaoyang Lv, Michael Zollhöfer, Markus Steinberger

However, rendering images with this new paradigm is slow due to the fact that an accurate quadrature of the volume rendering equation requires a large number of samples for each ray.

Novel View Synthesis

LiveView: Dynamic Target-Centered MPI for View Synthesis

no code implementations • 11 Jul 2021 • Sushobhan Ghosh, Zhaoyang Lv, Nathan Matsuda, Lei Xiao, Andrew Berkovich, Oliver Cossairt

Existing Multi-Plane Image (MPI) based view-synthesis methods generate an MPI aligned with the input view using a fixed number of planes in one forward pass.

Novel View Synthesis

Neural 3D Video Synthesis from Multi-view Video

1 code implementation • CVPR 2022 • Tianye Li, Mira Slavcheva, Michael Zollhoefer, Simon Green, Christoph Lassner, Changil Kim, Tanner Schmidt, Steven Lovegrove, Michael Goesele, Richard Newcombe, Zhaoyang Lv

We propose a novel approach for 3D video synthesis that is able to represent multi-view video recordings of a dynamic real-world scene in a compact, yet expressive representation that enables high-quality view synthesis and motion interpolation.

Motion Interpolation

STaR: Self-supervised Tracking and Reconstruction of Rigid Objects in Motion with Neural Rendering

no code implementations • CVPR 2021 • Wentao Yuan, Zhaoyang Lv, Tanner Schmidt, Steven Lovegrove

We achieve this by jointly optimizing the parameters of two neural radiance fields and a set of rigid poses which align the two fields at each frame.

Neural Rendering, Object

SENSE: a Shared Encoder Network for Scene-flow Estimation

1 code implementation • ICCV 2019 • Huaizu Jiang, Deqing Sun, Varun Jampani, Zhaoyang Lv, Erik Learned-Miller, Jan Kautz

We introduce a compact network for holistic scene flow estimation, called SENSE, which shares common encoder features among four closely-related tasks: optical flow estimation, disparity estimation from stereo, occlusion estimation, and semantic segmentation.

Disparity Estimation, Occlusion Estimation, +3

miniSAM: A Flexible Factor Graph Non-linear Least Squares Optimization Framework

1 code implementation • 3 Sep 2019 • Jing Dong, Zhaoyang Lv

Many problems in computer vision and robotics can be phrased as non-linear least squares optimization problems represented by factor graphs, for example, simultaneous localization and mapping (SLAM), structure from motion (SfM), motion planning, and control.

Benchmarking, Motion Planning, +1

Multi-class Classification without Multi-class Labels

1 code implementation • ICLR 2019 • Yen-Chang Hsu, Zhaoyang Lv, Joel Schlosser, Phillip Odom, Zsolt Kira

This work presents a new strategy for multi-class classification that requires no class-specific labels, but instead leverages pairwise similarity between examples, which is a weaker form of annotation.

Classification, General Classification, +1

Taking a Deeper Look at the Inverse Compositional Algorithm

1 code implementation • CVPR 2019 • Zhaoyang Lv, Frank Dellaert, James M. Rehg, Andreas Geiger

In this paper, we provide a modern synthesis of the classic inverse compositional algorithm for dense image alignment.

Motion Estimation, regression

A probabilistic constrained clustering for transfer learning and image category discovery

no code implementations • 28 Jun 2018 • Yen-Chang Hsu, Zhaoyang Lv, Joel Schlosser, Phillip Odom, Zsolt Kira

The proposed objective directly minimizes the negative log-likelihood of cluster assignment with respect to the pairwise constraints, has no hyper-parameters, and demonstrates improved scalability and performance on both supervised learning and unsupervised transfer learning.

Constrained Clustering, Deep Clustering, +2

Learning to cluster in order to transfer across domains and tasks

1 code implementation • ICLR 2018 • Yen-Chang Hsu, Zhaoyang Lv, Zsolt Kira

The key insight is that, in addition to features, we can transfer similarity information and this is sufficient to learn a similarity function and clustering network to perform both domain adaptation and cross-task transfer learning.

Constrained Clustering, Transfer Learning, +1

Deep Image Category Discovery using a Transferred Similarity Function

no code implementations • 5 Dec 2016 • Yen-Chang Hsu, Zhaoyang Lv, Zsolt Kira

We propose that this network can be learned with contrastive loss which is only based on weak binary pair-wise constraints.

Clustering, Transfer Learning

A Continuous Optimization Approach for Efficient and Accurate Scene Flow

no code implementations • 27 Jul 2016 • Zhaoyang Lv, Chris Beall, Pablo F. Alcantarilla, Fuxin Li, Zsolt Kira, Frank Dellaert

We propose a continuous optimization method for solving dense 3D scene flow problems from stereo imagery.
