Search Results for author: Yijin Li

Found 21 papers, 4 papers with code

CoProSketch: Controllable and Progressive Sketch Generation with Diffusion Model

no code implementations 11 Apr 2025 Ruohao Zhan, Yijin Li, Yisheng He, Shuo Chen, Yichen Shen, Xinyu Chen, Zilong Dong, Zhaoyang Huang, Guofeng Zhang

We propose a novel framework CoProSketch, providing prominent controllability and details for sketch generation with diffusion models.

Image Generation

OpticFusion: Multi-Modal Neural Implicit 3D Reconstruction of Microstructures by Fusing White Light Interferometry and Optical Microscopy

1 code implementation 16 Jan 2025 Shuo Chen, Yijin Li, Guofeng Zhang

However, conventional WLI cannot capture the natural color of a sample's surface, which is essential for many microscale research applications that require both 3D geometry and color information.

3D geometry 3D Reconstruction

GS-DiT: Advancing Video Generation with Pseudo 4D Gaussian Fields through Efficient Dense 3D Point Tracking

no code implementations 5 Jan 2025 Weikang Bian, Zhaoyang Huang, Xiaoyu Shi, Yijin Li, Fu-Yun Wang, Hongsheng Li

Specifically, we propose a novel framework that constructs a pseudo 4D Gaussian field with dense 3D point tracking and renders the Gaussian field for all video frames.

Novel View Synthesis Point Tracking +1

GS-DiT: Advancing Video Generation with Dynamic 3D Gaussian Fields through Efficient Dense 3D Point Tracking

no code implementations CVPR 2025 Weikang Bian, Zhaoyang Huang, Xiaoyu Shi, Yijin Li, Fu-Yun Wang, Hongsheng Li

Inspired by Monocular Dynamic novel View Synthesis (MDVS) that optimizes a 4D representation and renders videos according to different 4D elements, such as camera pose and object motion editing, we bring dynamic 3D Gaussian fields to video generation.

Novel View Synthesis Point Tracking +1

A Global Depth-Range-Free Multi-View Stereo Transformer Network with Pose Embedding

no code implementations 4 Nov 2024 Yitong Dong, Yijin Li, Zhaoyang Huang, Weikang Bian, Jingbo Liu, Hujun Bao, Zhaopeng Cui, Hongsheng Li, Guofeng Zhang

We integrate pose embedding to encapsulate information such as multi-view camera poses, providing implicit geometric constraints for multi-view disparity feature fusion dominated by attention.

ETO: Efficient Transformer-based Local Feature Matching by Organizing Multiple Homography Hypotheses

no code implementations 30 Oct 2024 Junjie Ni, Guofeng Zhang, Guanglin Li, Yijin Li, Xinyang Liu, Zhaoyang Huang, Hujun Bao

On the YFCC100M dataset, our matching accuracy is competitive with LoFTR, a state-of-the-art transformer-based architecture, while the inference speed is boosted to 4 times, even outperforming the CNN-based methods.

BlinkVision: A Benchmark for Optical Flow, Scene Flow and Point Tracking Estimation using RGB Frames and Events

no code implementations 27 Oct 2024 Yijin Li, Yichen Shen, Zhaoyang Huang, Shuo Chen, Weikang Bian, Xiaoyu Shi, Fu-Yun Wang, Keqiang Sun, Hujun Bao, Zhaopeng Cui, Guofeng Zhang, Hongsheng Li

BlinkVision enables extensive benchmarks on three types of correspondence tasks (optical flow, point tracking, and scene flow estimation) for both image-based and event-based methods, offering new observations, practices, and insights for future research.

Event-based vision Optical Flow Estimation +2

BlinkTrack: Feature Tracking over 100 FPS via Events and Images

no code implementations 26 Sep 2024 Yichen Shen, Yijin Li, Shuo Chen, Guanglin Li, Zhaoyang Huang, Hujun Bao, Zhaopeng Cui, Guofeng Zhang

Feature tracking is crucial for structure from motion (SFM), simultaneous localization and mapping (SLAM), object tracking, and various other computer vision tasks.

Object Tracking Simultaneous Localization and Mapping

Multi-View Neural 3D Reconstruction of Micro-/Nanostructures with Atomic Force Microscopy

1 code implementation 21 Jan 2024 Shuo Chen, Mao Peng, Yijin Li, Bing-Feng Ju, Hujun Bao, Yuan-Liu Chen, Guofeng Zhang

However, conventional AFM scanning struggles to reconstruct complex 3D micro-/nanostructures precisely due to limitations such as incomplete sample topography capturing and tip-sample convolution artifacts.

3D Reconstruction Surface Reconstruction

Multi-Modal Neural Radiance Field for Monocular Dense SLAM with a Light-Weight ToF Sensor

no code implementations ICCV 2023 Xinyang Liu, Yijin Li, Yanbin Teng, Hujun Bao, Guofeng Zhang, Yinda Zhang, Zhaopeng Cui

Specifically, we propose a multi-modal implicit scene representation that supports rendering the signals from both the RGB camera and the light-weight ToF sensor, which drives the optimization by comparing with the raw sensor inputs.

Pose Tracking

Graph-based Asynchronous Event Processing for Rapid Object Recognition

no code implementations ICCV 2021 Yijin Li, Han Zhou, Bangbang Yang, Ye Zhang, Zhaopeng Cui, Hujun Bao, Guofeng Zhang

Different from traditional video cameras, event cameras capture an asynchronous event stream in which each event encodes the pixel location, trigger time, and polarity of the brightness change.

Graph Construction Object Recognition

FlowFormer: A Transformer Architecture and Its Masked Cost Volume Autoencoding for Optical Flow

no code implementations 8 Jun 2023 Zhaoyang Huang, Xiaoyu Shi, Chao Zhang, Qiang Wang, Yijin Li, Hongwei Qin, Jifeng Dai, Xiaogang Wang, Hongsheng Li

This paper introduces a novel transformer-based network architecture, FlowFormer, along with the Masked Cost Volume AutoEncoding (MCVA) for pretraining it to tackle the problem of optical flow estimation.

Decoder Optical Flow Estimation

Context-PIPs: Persistent Independent Particles Demands Spatial Context Features

no code implementations 3 Jun 2023 Weikang Bian, Zhaoyang Huang, Xiaoyu Shi, Yitong Dong, Yijin Li, Hongsheng Li

We tackle the problem of Persistent Independent Particles (PIPs), also called Tracking Any Point (TAP), in videos, which specifically aims at estimating persistent long-term trajectories of query points in videos.

Point Tracking

DiffInDScene: Diffusion-based High-Quality 3D Indoor Scene Generation

1 code implementation CVPR 2024 Xiaoliang Ju, Zhaoyang Huang, Yijin Li, Guofeng Zhang, Yu Qiao, Hongsheng Li

In addition to the scene generation, the final part of DiffInDScene can be used as a post-processing module to refine the 3D reconstruction results from multi-view stereo.

3D Generation 3D Reconstruction +1

PATS: Patch Area Transportation with Subdivision for Local Feature Matching

no code implementations CVPR 2023 Junjie Ni, Yijin Li, Zhaoyang Huang, Hongsheng Li, Hujun Bao, Zhaopeng Cui, Guofeng Zhang

However, estimating scale differences between these patches is non-trivial since the scale differences are determined by both relative camera poses and scene structures, and thus spatially varying over image pairs.

Graph Matching Optical Flow Estimation +2

BlinkFlow: A Dataset to Push the Limits of Event-based Optical Flow Estimation

no code implementations 14 Mar 2023 Yijin Li, Zhaoyang Huang, Shuo Chen, Xiaoyu Shi, Hongsheng Li, Hujun Bao, Zhaopeng Cui, Guofeng Zhang

Experiments show that BlinkFlow improves the generalization performance of state-of-the-art methods by more than 40% on average and up to 90%.

Event-based Optical Flow Optical Flow Estimation

DELTAR: Depth Estimation from a Light-weight ToF Sensor and RGB Image

no code implementations 27 Sep 2022 Yijin Li, Xinyang Liu, Wenqi Dong, Han Zhou, Hujun Bao, Guofeng Zhang, Yinda Zhang, Zhaopeng Cui

Light-weight time-of-flight (ToF) depth sensors are small, cheap, and low-energy, and have been massively deployed on mobile devices for purposes such as autofocus and obstacle detection.

3D Reconstruction Depth Completion +2

Neural Rendering in a Room: Amodal 3D Understanding and Free-Viewpoint Rendering for the Closed Scene Composed of Pre-Captured Objects

no code implementations 5 May 2022 Bangbang Yang, Yinda Zhang, Yijin Li, Zhaopeng Cui, Sean Fanello, Hujun Bao, Guofeng Zhang

We, as human beings, can understand and picture a familiar scene from arbitrary viewpoints given a single image, whereas this is still a grand challenge for computers.

Data Augmentation Neural Rendering +1

Learning Object-Compositional Neural Radiance Field for Editable Scene Rendering

no code implementations ICCV 2021 Bangbang Yang, Yinda Zhang, Yinghao Xu, Yijin Li, Han Zhou, Hujun Bao, Guofeng Zhang, Zhaopeng Cui

In this paper, we present a novel neural scene rendering system, which learns an object-compositional neural radiance field and produces realistic rendering with editing capability for a clustered and real-world scene.

Neural Rendering Novel View Synthesis +1

VS-Net: Voting with Segmentation for Visual Localization

1 code implementation CVPR 2021 Zhaoyang Huang, Han Zhou, Yijin Li, Bangbang Yang, Yan Xu, Xiaowei Zhou, Hujun Bao, Guofeng Zhang, Hongsheng Li

To address this problem, we propose a novel visual localization framework that establishes 2D-to-3D correspondences between the query image and the 3D map with a series of learnable scene-specific landmarks.

Segmentation Semantic Segmentation +2
