Search Results for author: Zhihao LI

Found 38 papers, 26 papers with code

Harnessing Scale and Physics: A Multi-Graph Neural Operator Framework for PDEs on Arbitrary Geometries

1 code implementation 18 Nov 2024 Zhihao LI, Haoze Song, Di Xiao, Zhilu Lai, Wei Wang

Partial Differential Equations (PDEs) underpin many scientific phenomena, yet traditional computational approaches often struggle with complex, nonlinear systems and irregular geometries.

Management

Temporal As a Plugin: Unsupervised Video Denoising with Pre-Trained Image Denoisers

1 code implementation 17 Sep 2024 Zixuan Fu, Lanqing Guo, Chong Wang, YuFei Wang, Zhihao LI, Bihan Wen

Recent advancements in deep learning have shown impressive results in image and video denoising, leveraging extensive pairs of noisy and noise-free data for supervision.

Image Denoising Video Denoising

SoccerNet 2024 Challenges Results

1 code implementation 16 Sep 2024 Anthony Cioppa, Silvio Giancola, Vladimir Somers, Victor Joos, Floriane Magera, Jan Held, Seyed Abolfazl Ghasemzadeh, Xin Zhou, Karolina Seweryn, Mateusz Kowalczyk, Zuzanna Mróz, Szymon Łukasik, Michał Hałoń, Hassan Mkhallati, Adrien Deliège, Carlos Hinojosa, Karen Sanchez, Amir M. Mansourian, Pierre Miralles, Olivier Barnich, Christophe De Vleeschouwer, Alexandre Alahi, Bernard Ghanem, Marc Van Droogenbroeck, Adam Gorski, Albert Clapés, Andrei Boiarov, Anton Afanasiev, Artur Xarles, Atom Scott, Byoungkwon Lim, Calvin Yeung, Cristian Gonzalez, Dominic Rüfenacht, Enzo Pacilio, Fabian Deuser, Faisal Sami Altawijri, Francisco Cachón, Hankyul Kim, Haobo Wang, Hyeonmin Choe, Hyunwoo J Kim, Il-Min Kim, Jae-Mo Kang, Jamshid Tursunboev, Jian Yang, Jihwan Hong, JiMin Lee, Jing Zhang, Junseok Lee, Kexin Zhang, Konrad Habel, Licheng Jiao, Linyi Li, Marc Gutiérrez-Pérez, Marcelo Ortega, Menglong Li, Milosz Lopatto, Nikita Kasatkin, Nikolay Nemtsev, Norbert Oswald, Oleg Udin, Pavel Kononov, Pei Geng, Saad Ghazai Alotaibi, Sehyung Kim, Sergei Ulasen, Sergio Escalera, Shanshan Zhang, Shuyuan Yang, Sunghwan Moon, Thomas B. Moeslund, Vasyl Shandyba, Vladimir Golovkin, Wei Dai, WonTaek Chung, Xinyu Liu, Yongqiang Zhu, Youngseo Kim, Yuan Li, Yuting Yang, Yuxuan Xiao, Zehua Cheng, Zhihao LI

The SoccerNet 2024 challenges represent the fourth annual video understanding challenges organized by the SoccerNet team.

Action Spotting Dense Video Captioning +2

EAGLE: Elevating Geometric Reasoning through LLM-empowered Visual Instruction Tuning

no code implementations 21 Aug 2024 Zhihao LI, Yao Du, Yang Liu, Yan Zhang, Yufang Liu, Mengdi Zhang, Xunliang Cai

To address these limitations, we propose EAGLE, a novel two-stage end-to-end visual enhancement MLLM framework designed to ElevAte Geometric reasoning through LLM-Empowered visual instruction tuning.

Masked Angle-Aware Autoencoder for Remote Sensing Images

1 code implementation 4 Aug 2024 Zhihao LI, Biao Hou, Siteng Ma, Zitong Wu, Xianpeng Guo, Bo Ren, Licheng Jiao

We design a scaling center crop operation to create the rotated crop with random orientation on each original image, introducing the explicit angle variation.

Representation Learning

From Chaos to Clarity: 3DGS in the Dark

no code implementations 12 Jun 2024 Zhihao LI, YuFei Wang, Alex Kot, Bihan Wen

Our study reveals that 3D Gaussian Splatting (3DGS) is particularly susceptible to this noise, leading to numerous elongated Gaussian shapes that overfit the noise, thereby significantly degrading reconstruction quality and reducing inference speed, especially in scenarios with limited views.

Novel View Synthesis Self-Supervised Learning

M2NO: Multiresolution Operator Learning with Multiwavelet-based Algebraic Multigrid Method

no code implementations 7 Jun 2024 Zhihao LI, Zhilu Lai, Xiaobo Zhang, Wei Wang

Solving partial differential equations (PDEs) effectively necessitates a multi-scale approach, particularly critical in high-dimensional scenarios characterized by increasing grid points or resolution.

Operator learning Super-Resolution

ContextGS: Compact 3D Gaussian Splatting with Anchor Level Context Model

1 code implementation 31 May 2024 YuFei Wang, Zhihao LI, Lanqing Guo, Wenhan Yang, Alex C. Kot, Bihan Wen

Recently, 3D Gaussian Splatting (3DGS) has become a promising framework for novel view synthesis, offering fast rendering speeds and high fidelity.

Image Compression Novel View Synthesis

MagicDrive3D: Controllable 3D Generation for Any-View Rendering in Street Scenes

no code implementations 23 May 2024 Ruiyuan Gao, Kai Chen, Zhihao LI, Lanqing Hong, Zhenguo Li, Qiang Xu

While controllable generative models for images and videos have achieved remarkable success, high-quality models for 3D scenes, particularly in unbounded scenarios like autonomous driving, remain underdeveloped due to high data acquisition costs.

3D Generation Autonomous Driving +2

MirrorGaussian: Reflecting 3D Gaussians for Reconstructing Mirror Reflections

no code implementations 20 May 2024 Jiayue Liu, Xiao Tang, Freeman Cheng, Roy Yang, Zhihao LI, Jianzhuang Liu, Yi Huang, Jiaqi Lin, Shiyong Liu, Xiaofei Wu, Songcen Xu, Chun Yuan

To tackle this problem, we present MirrorGaussian, the first method for mirror scene reconstruction with real-time rendering based on 3D Gaussian Splatting.

Novel View Synthesis

VastGaussian: Vast 3D Gaussians for Large Scene Reconstruction

1 code implementation CVPR 2024 Jiaqi Lin, Zhihao LI, Xiao Tang, Jianzhuang Liu, Shiyong Liu, Jiayue Liu, Yangdi Lu, Xiaofei Wu, Songcen Xu, Youliang Yan, Wenming Yang

Existing NeRF-based methods for large scene reconstruction often have limitations in visual quality and rendering speed.

Deformable 3D Gaussian Splatting for Animatable Human Avatars

no code implementations 22 Dec 2023 HyunJun Jung, Nikolas Brasch, Jifei Song, Eduardo Perez-Pellitero, Yiren Zhou, Zhihao LI, Nassir Navab, Benjamin Busam

ParDy-Human introduces parameter-driven dynamics into 3D Gaussian Splatting where 3D Gaussians are deformed by a human pose model to animate the avatar.

Human Animation Novel View Synthesis

Physics-Guided Human Motion Capture with Pose Probability Modeling

1 code implementation 19 Aug 2023 Jingyi Ju, Buzhen Huang, Chen Zhu, Zhihao LI, Yangang Wang

To address the obstacles, our key-idea is to employ physics as denoising guidance in the reverse diffusion process to reconstruct physically plausible human motion from a modeled pose probability distribution.

Denoising

PointMBF: A Multi-scale Bidirectional Fusion Network for Unsupervised RGB-D Point Cloud Registration

1 code implementation ICCV 2023 Mingzhi Yuan, Kexue Fu, Zhihao LI, Yucong Meng, Manning Wang

Point cloud registration is a task to estimate the rigid transformation between two unaligned scans, which plays an important role in many computer vision applications.

Point Cloud Registration

Scale-aware Test-time Click Adaptation for Pulmonary Nodule and Mass Segmentation

1 code implementation 28 Jul 2023 Zhihao LI, Jiancheng Yang, Yongchao Xu, Li Zhang, Wenhui Dong, Bo Du

Extensive experiments on both open-source and in-house datasets consistently demonstrate the effectiveness of the proposed method over some CNN and Transformer-based segmentation methods.

Image Segmentation Management +4

Hierarchical Matching and Reasoning for Multi-Query Image Retrieval

1 code implementation 26 Jun 2023 Zhong Ji, Zhihao LI, Yan Zhang, Haoran Wang, Yanwei Pang, Xuelong Li

Afterwards, the VR module is developed to excavate the potential semantic correlations among multiple region-query pairs, which further explores the high-level reasoning similarity.

Image Retrieval Retrieval

SoftGPT: Learn Goal-oriented Soft Object Manipulation Skills by Generative Pre-trained Heterogeneous Graph Transformer

1 code implementation 22 Jun 2023 Junjia Liu, Zhihao LI, WanYu Lin, Sylvain Calinon, Kay Chen Tan, Fei Chen

Soft object manipulation tasks in domestic scenes pose a significant challenge for existing robotic skill learning techniques due to their complex dynamics and variable shape characteristics.

Object

Efficient Visual Computing with Camera RAW Snapshots

1 code implementation 15 Dec 2022 Zhihao LI, Ming Lu, Xu Zhang, Xin Feng, M. Salman Asif, Zhan Ma

Conventional cameras capture image irradiance on a sensor and convert it to RGB images using an image signal processor (ISP).

Autonomous Driving Image Compression +2

PointCLM: A Contrastive Learning-based Framework for Multi-instance Point Cloud Registration

1 code implementation 1 Sep 2022 Mingzhi Yuan, Zhihao LI, Qiuye Jin, Xinrong Chen, Manning Wang

Multi-instance point cloud registration is the problem of estimating multiple poses of source point cloud instances within a target point cloud.

Contrastive Learning Point Cloud Registration

CLIFF: Carrying Location Information in Full Frames into Human Pose and Shape Estimation

6 code implementations 1 Aug 2022 Zhihao LI, Jianzhuang Liu, Zhensong Zhang, Songcen Xu, Youliang Yan

Top-down methods dominate the field of 3D human pose and shape estimation, because they are decoupled from human detection and allow researchers to focus on the core problem.

3D human pose and shape estimation Human Detection +2

HICF: Hyperbolic Informative Collaborative Filtering

1 code implementation 19 Jul 2022 Menglin Yang, Zhihao LI, Min Zhou, Jiahong Liu, Irwin King

The results reveal that (1) tail items get more emphasis in hyperbolic space than that in Euclidean space, but there is still ample room for improvement; (2) head items receive modest attention in hyperbolic space, which could be considerably improved; (3) and nonetheless, the hyperbolic models show more competitive performance than Euclidean models.

Collaborative Filtering Recommendation Systems

Rendering Nighttime Image Via Cascaded Color and Brightness Compensation

1 code implementation 19 Apr 2022 Zhihao LI, Si Yi, Zhan Ma

Image signal processing (ISP) is crucial for camera imaging, and neural networks (NN) solutions are extensively deployed for daytime scenes.

Tone Mapping

Event Transformer

1 code implementation 11 Apr 2022 Bin Jiang, Zhihao LI, M. Salman Asif, Xun Cao, Zhan Ma

The event camera's low power consumption and ability to capture microsecond brightness changes make it attractive for various computer vision tasks.

Event-based vision Optical Flow Estimation

Hyperbolic Graph Neural Networks: A Review of Methods and Applications

1 code implementation 28 Feb 2022 Menglin Yang, Min Zhou, Zhihao LI, Jiahong Liu, Lujia Pan, Hui Xiong, Irwin King

Graph neural networks generalize conventional neural networks to graph-structured data and have received widespread attention due to their impressive representation ability.

Anatomy Graph Learning

Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks

1 code implementation ICCV 2021 Zhihao Liang, Zhihao LI, Songcen Xu, Mingkui Tan, Kui Jia

State-of-the-art methods largely rely on a general pipeline that first learns point-wise features discriminative at semantic and instance levels, followed by a separate step of point grouping for proposing object instances.

3D Instance Segmentation Scene Understanding +1

DualPoseNet: Category-level 6D Object Pose and Size Estimation Using Dual Pose Network with Refined Learning of Pose Consistency

1 code implementation ICCV 2021 Jiehong Lin, Zewei Wei, Zhihao LI, Songcen Xu, Kui Jia, Yuanqing Li

DualPoseNet stacks two parallel pose decoders on top of a shared pose encoder, where the implicit decoder predicts object poses with a working mechanism different from that of the explicit one; they thus impose complementary supervision on the training of pose encoder.

6D Pose Estimation using RGBD Decoder +2

Quality-Aware Network for Human Parsing

1 code implementation 10 Mar 2021 Lu Yang, Qing Song, Zhihui Wang, Zhiwei Liu, Songcen Xu, Zhihao LI

How to estimate the quality of the network output is an important issue, and currently there is no effective solution in the field of human parsing.

Human Parsing Instance Segmentation +1

Illumination Estimation Challenge: experience of past two years

no code implementations 31 Dec 2020 Egor Ershov, Alex Savchik, Ilya Semenkov, Nikola Banić, Karlo Koscević, Marko Subašić, Alexander Belokopytov, Zhihao LI, Arseniy Terekhin, Daria Senshina, Artem Nikonorov, Yanlin Qian, Marco Buzzelli, Riccardo Riva, Simone Bianco, Raimondo Schettini, Sven Lončarić, Dmitry Nikolaev

The main advantage of testing a method on a challenge over testing it on some of the known datasets is the fact that the ground-truth illuminations for the challenge test images are unknown up until the results have been submitted, which prevents any potential hyperparameter tuning that may be biased.

Color Constancy Vocal Bursts Valence Prediction

An LSTM-Based Autonomous Driving Model Using Waymo Open Dataset

2 code implementations 14 Feb 2020 Zhicheng Gu, Zhihao LI, Xuan Di, Rongye Shi

The Waymo Open Dataset has been released recently, providing a platform to crowdsource some fundamental challenges for automated vehicles (AVs), such as 3D detection and tracking.

Autonomous Driving Self-Driving Cars

RETHINKING SELF-DRIVING: MULTI-TASK KNOWLEDGE FOR BETTER GENERALIZATION AND ACCIDENT EXPLANATION ABILITY

no code implementations ICLR 2019 Zhihao LI, Toshiyuki MOTOYOSHI, Kazuma Sasaki, Tetsuya OGATA, Shigeki SUGANO

Current end-to-end deep learning driving models have two problems: (1) Poor generalization ability of unobserved driving environment when diversity of training driving dataset is limited (2) Lack of accident explanation ability when driving models don’t work as expected.

Diversity

Rethinking Self-driving: Multi-task Knowledge for Better Generalization and Accident Explanation Ability

1 code implementation 28 Sep 2018 Zhihao Li, Toshiyuki Motoyoshi, Kazuma Sasaki, Tetsuya OGATA, Shigeki SUGANO

Current end-to-end deep learning driving models have two problems: (1) Poor generalization ability of unobserved driving environment when diversity of training driving dataset is limited (2) Lack of accident explanation ability when driving models don't work as expected.

Diversity

Geometry-Contrastive GAN for Facial Expression Transfer

1 code implementation 6 Feb 2018 Fengchun Qiao, Naiming Yao, Zirui Jiao, Zhihao LI, Hui Chen, Hongan Wang

Geometry information is introduced into cGANs as continuous conditions to guide the generation of facial expressions.

Contrastive Learning Generative Adversarial Network

Transfer of View-manifold Learning to Similarity Perception of Novel Objects

no code implementations 31 Mar 2017 Xingyu Lin, Hao Wang, Zhihao LI, Yimeng Zhang, Alan Yuille, Tai Sing Lee

We develop a model of perceptual similarity judgment based on re-training a deep convolution neural network (DCNN) that learns to associate different views of each 3D object to capture the notion of object persistence and continuity in our visual experience.

Metric Learning Object

Searching Action Proposals via Spatial Actionness Estimation and Temporal Path Inference and Tracking

no code implementations 23 Aug 2016 Nannan Li, Dan Xu, Zhenqiang Ying, Zhihao LI, Ge Li

In this paper, we address the problem of searching action proposals in unconstrained video clips.
