Search Results for author: Liefeng Bo

Found 59 papers, 26 papers with code

Gaussian-Informed Continuum for Physical Property Identification and Simulation

no code implementations 21 Jun 2024 Junhao Cai, Yuji Yang, Weihao Yuan, Yisheng He, Zilong Dong, Liefeng Bo, Hui Cheng, Qifeng Chen

In addition to the extracted object surfaces, the Gaussian-informed continuum also enables the rendering of object masks during simulations, serving as implicit shape guidance for physical property estimation.

An Optimization Framework to Enforce Multi-View Consistency for Texturing 3D Meshes Using Pre-Trained Text-to-Image Models

no code implementations 22 Mar 2024 Zhengyi Zhao, Chen Song, Xiaodong Gu, Yuan Dong, Qi Zuo, Weihao Yuan, Zilong Dong, Liefeng Bo, QiXing Huang

In particular, the third and fourth stages are iterated, with the cuts obtained in the fourth stage encouraging non-rigid alignment in the third stage to focus on regions close to the cuts.

OV9D: Open-Vocabulary Category-Level 9D Object Pose and Size Estimation

no code implementations 19 Mar 2024 Junhao Cai, Yisheng He, Weihao Yuan, Siyu Zhu, Zilong Dong, Liefeng Bo, Qifeng Chen

Derived from OmniObject3D, OO3D-9D is the largest and most diverse dataset in the field of category-level object pose and size estimation.

Object

VideoMV: Consistent Multi-View Generation Based on Large Video Generative Model

no code implementations 18 Mar 2024 Qi Zuo, Xiaodong Gu, Lingteng Qiu, Yuan Dong, Zhengyi Zhao, Weihao Yuan, Rui Peng, Siyu Zhu, Zilong Dong, Liefeng Bo, QiXing Huang

Images from video generative models are more suitable for multi-view generation because the underlying network architecture that generates them employs a temporal module to enforce frame consistency.

Denoising

EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions

no code implementations 27 Feb 2024 Linrui Tian, Qi Wang, Bang Zhang, Liefeng Bo

In this work, we tackle the challenge of enhancing the realism and expressiveness in talking head video generation by focusing on the dynamic and nuanced relationship between audio cues and facial movements.

Video Generation

GPLD3D: Latent Diffusion of 3D Shape Generative Models by Enforcing Geometric and Physical Priors

no code implementations CVPR 2024 Yuan Dong, Qi Zuo, Xiaodong Gu, Weihao Yuan, Zhengyi Zhao, Zilong Dong, Liefeng Bo, QiXing Huang

The key to our approach is a new diffusion procedure that combines the discrete empirical data distribution and a continuous distribution induced by the quality checker.

Denoising

IPoD: Implicit Field Learning with Point Diffusion for Generalizable 3D Object Reconstruction from Single RGB-D Images

no code implementations CVPR 2024 Yushuang Wu, Luyue Shi, Junhao Cai, Weihao Yuan, Lingteng Qiu, Zilong Dong, Liefeng Bo, Shuguang Cui, Xiaoguang Han

This approach treats the query points for implicit field learning as a noisy point cloud for iterative denoising, allowing for their dynamic adaptation to the target object shape.

3D Object Reconstruction Denoising

Make-A-Character: High Quality Text-to-3D Character Generation within Minutes

no code implementations 24 Dec 2023 Jianqiang Ren, Chao He, Lin Liu, Jiahao Chen, Yutong Wang, Yafei Song, Jianfang Li, Tangli Xue, Siqi Hu, Tao Chen, Kunkun Zheng, Jianjing Xiang, Liefeng Bo

There is a growing demand for customized and expressive 3D characters with the emergence of AI agents and Metaverse, but creating 3D characters using traditional computer graphics tools is a complex and time-consuming task.

3D Generation Text to 3D

MaTe3D: Mask-guided Text-based 3D-aware Portrait Editing

no code implementations 12 Dec 2023 Kangneng Zhou, Daiheng Gao, Xuan Wang, Jie Zhang, Peng Zhang, Xusen Sun, Longhao Zhang, Shiqi Yang, Bang Zhang, Liefeng Bo, Yaxing Wang, Ming-Ming Cheng

This enhances masked-based editing in local areas; second, we present a novel distillation strategy: Conditional Distillation on Geometry and Texture (CDGT).

VividTalk: One-Shot Audio-Driven Talking Head Generation Based on 3D Hybrid Prior

no code implementations 4 Dec 2023 Xusen Sun, Longhao Zhang, Hao Zhu, Peng Zhang, Bang Zhang, Xinya Ji, Kangneng Zhou, Daiheng Gao, Liefeng Bo, Xun Cao

Audio-driven talking head generation has drawn much attention in recent years, and many efforts have been made in lip-sync, expressive facial expressions, natural head pose generation, and high video quality.

Talking Head Generation

RichDreamer: A Generalizable Normal-Depth Diffusion Model for Detail Richness in Text-to-3D

no code implementations CVPR 2024 Lingteng Qiu, GuanYing Chen, Xiaodong Gu, Qi Zuo, Mutian Xu, Yushuang Wu, Weihao Yuan, Zilong Dong, Liefeng Bo, Xiaoguang Han

Lifting 2D diffusion for 3D generation is a challenging problem due to the lack of geometric prior and the complex entanglement of materials and lighting in natural images.

3D Generation Text to 3D

Cloth2Tex: A Customized Cloth Texture Generation Pipeline for 3D Virtual Try-On

no code implementations 8 Aug 2023 Daiheng Gao, Xu Chen, Xindi Zhang, Qi Wang, Ke Sun, Bang Zhang, Liefeng Bo, QiXing Huang

Traditional warping-based texture generation methods require a significant number of control points to be manually selected for each type of garment, which can be a time-consuming and tedious process.

Texture Synthesis Virtual Try-on

DiffHand: End-to-End Hand Mesh Reconstruction via Diffusion Models

no code implementations 23 May 2023 Lijun Li, Li'an Zhuo, Bang Zhang, Liefeng Bo, Chen Chen

Hand mesh reconstruction from a monocular image is a challenging task: because of depth ambiguity and severe occlusion, there remains a non-unique mapping between the monocular image and the hand mesh.

Decoder Denoising

PanoContext-Former: Panoramic Total Scene Understanding with a Transformer

no code implementations CVPR 2024 Yuan Dong, Chuan Fang, Liefeng Bo, Zilong Dong, Ping Tan

Panoramic image enables deeper understanding and more holistic perception of $360^\circ$ surrounding environment, which can naturally encode enriched scene context information compared to standard perspective image.

3D Object Detection object-detection +1

Evaluate Geometry of Radiance Fields with Low-frequency Color Prior

1 code implementation 10 Apr 2023 Qihang Fang, Yafei Song, Keqiang Li, Li Shen, Huaiyu Wu, Gang Xiong, Liefeng Bo

From this insight, given a reconstructed density field and observation images, we design a closed-form method to approximate the color field with low-frequency spherical harmonics, and compute the inverse mean residual color.

3D Reconstruction Novel View Synthesis

Multi-Behavior Graph Neural Networks for Recommender System

no code implementations 17 Feb 2023 Lianghao Xia, Chao Huang, Yong Xu, Peng Dai, Liefeng Bo

Recent years have witnessed the emerging success of many deep learning-based recommendation models for augmenting collaborative filtering architectures with various neural network architectures, such as multi-layer perceptron and autoencoder.

Collaborative Filtering Graph Neural Network +2

Towards Stable Human Pose Estimation via Cross-View Fusion and Foot Stabilization

no code implementations CVPR 2023 Li’an Zhuo, Jian Cao, Qi Wang, Bang Zhang, Liefeng Bo

Then the optimization-based method is introduced to reconstruct the foot pose and foot-ground contact for the general multi-view datasets including AIST++ and Human3.6M.

Pose Estimation

DG3D: Generating High Quality 3D Textured Shapes by Learning to Discriminate Multi-Modal Diffusion-Renderings

no code implementations ICCV 2023 Qi Zuo, Yafei Song, Jianfang Li, Lin Liu, Liefeng Bo

Many virtual reality applications require massive 3D content, which impels the need for low-cost and efficient modeling tools in terms of quality and quantity.

4K-NeRF: High Fidelity Neural Radiance Fields at Ultra High Resolutions

1 code implementation 9 Dec 2022 Zhongshu Wang, Lingzhi Li, Zhen Shen, Li Shen, Liefeng Bo

In this paper, we present a novel and effective framework, named 4K-NeRF, to pursue high fidelity view synthesis on the challenging scenarios of ultra high resolutions, building on the methodology of neural radiance fields (NeRF).

4k Decoder +1

Compressing Volumetric Radiance Fields to 1 MB

1 code implementation CVPR 2023 Lingzhi Li, Zhen Shen, Zhongshu Wang, Li Shen, Liefeng Bo

Approximating radiance fields with volumetric grids is one of the promising directions for improving NeRF, represented by methods like Plenoxels and DVGO, which achieve super-fast training convergence and real-time rendering.

Model Compression Neural Rendering +1

A-ACT: Action Anticipation through Cycle Transformations

no code implementations 2 Apr 2022 Akash Gupta, Jingen Liu, Liefeng Bo, Amit K. Roy-Chowdhury, Tao Mei

To incorporate this ability in intelligent systems a question worth pondering upon is how exactly do we anticipate?

Action Anticipation

An Efficient and Robust System for Vertically Federated Random Forest

no code implementations 26 Jan 2022 Houpu Yao, Jiazhou Wang, Peng Dai, Liefeng Bo, Yanqing Chen

As there is a growing interest in utilizing data across multiple resources to build better machine learning models, many vertically federated learning algorithms have been proposed to preserve the data privacy of the participating organizations.

Federated Learning

Multi-Behavior Enhanced Recommendation with Cross-Interaction Collaborative Relation Modeling

1 code implementation 7 Jan 2022 Lianghao Xia, Chao Huang, Yong Xu, Peng Dai, Mengyin Lu, Liefeng Bo

Due to the overlook of user's multi-behavioral patterns over different items, existing recommendation methods are insufficient to capture heterogeneous collaborative signals from user multi-behavior data.

Collaborative Filtering Recommendation Systems +1

Spatial-Temporal Sequential Hypergraph Network for Crime Prediction with Dynamic Multiplex Relation Learning

1 code implementation IJCAI 2021 Lianghao Xia, Chao Huang, Yong Xu, Peng Dai, Liefeng Bo, Xiyue Zhang, Tianyi Chen

Crime prediction is crucial for public safety and resource optimization, yet is very challenging due to two aspects: i) the dynamics of criminal patterns across time and space, crime events are distributed unevenly on both spatial and temporal domains; ii) time-evolving dependencies between different types of crimes (e.g., Theft, Robbery, Assault, Damage) which reveal fine-grained semantics of crimes.

Crime Prediction Relation

Social Recommendation with Self-Supervised Metagraph Informax Network

1 code implementation 8 Oct 2021 Xiaoling Long, Chao Huang, Yong Xu, Huance Xu, Peng Dai, Lianghao Xia, Liefeng Bo

To model relation heterogeneity, we design a metapath-guided heterogeneous graph neural network to aggregate feature embeddings from different types of meta-relations across users and items, empowering SMIN to maintain dedicated representations for multi-faceted user- and item-wise dependencies.

Collaborative Filtering Graph Neural Network +1

Graph Meta Network for Multi-Behavior Recommendation

1 code implementation 8 Oct 2021 Lianghao Xia, Yong Xu, Chao Huang, Peng Dai, Liefeng Bo

Modern recommender systems often embed users and items into low-dimensional latent representations, based on their observed interactions.

Meta-Learning Recommendation Systems +1

Knowledge-aware Coupled Graph Neural Network for Social Recommendation

1 code implementation 8 Oct 2021 Chao Huang, Huance Xu, Yong Xu, Peng Dai, Lianghao Xia, Mengyin Lu, Liefeng Bo, Hao Xing, Xiaoping Lai, Yanfang Ye

While many recent efforts show the effectiveness of neural network-based social recommender systems, several important challenges have not been well addressed yet: (i) The majority of models only consider users' social connections, while ignoring the inter-dependent knowledge across items; (ii) Most of existing solutions are designed for singular type of user-item interactions, making them infeasible to capture the interaction heterogeneity; (iii) The dynamic nature of user-item interactions has been less explored in many social-aware recommendation techniques.

Collaborative Filtering Graph Neural Network +1

Traffic Flow Forecasting with Spatial-Temporal Graph Diffusion Network

1 code implementation 8 Oct 2021 Xiyue Zhang, Chao Huang, Yong Xu, Lianghao Xia, Peng Dai, Liefeng Bo, Junbo Zhang, Yu Zheng

Accurate forecasting of citywide traffic flow has been playing a critical role in a variety of spatial-temporal mining applications, such as intelligent traffic control and public risk assessment.

Traffic Prediction

Multiplex Behavioral Relation Learning for Recommendation via Memory Augmented Transformer Network

1 code implementation 8 Oct 2021 Lianghao Xia, Chao Huang, Yong Xu, Peng Dai, Bo Zhang, Liefeng Bo

The overlook of multiplex behavior relations can hardly recognize the multi-modal contextual signals across different types of interactions, which limit the feasibility of current recommendation methods.

Recommendation Systems Relation +1

Knowledge-Enhanced Hierarchical Graph Transformer Network for Multi-Behavior Recommendation

1 code implementation 8 Oct 2021 Lianghao Xia, Chao Huang, Yong Xu, Peng Dai, Xiyue Zhang, Hongsheng Yang, Jian Pei, Liefeng Bo

In particular: i) complex inter-dependencies across different types of user behaviors; ii) the incorporation of knowledge-aware item relations into the multi-behavior recommendation framework; iii) dynamic characteristics of multi-typed user-item interactions.

Graph Attention Recommendation Systems

Graph-Enhanced Multi-Task Learning of Multi-Level Transition Dynamics for Session-based Recommendation

1 code implementation 8 Oct 2021 Chao Huang, Jiahui Chen, Lianghao Xia, Yong Xu, Peng Dai, Yanqing Chen, Liefeng Bo, Jiashu Zhao, Jimmy Xiangji Huang

The learning process of intra- and inter-session transition dynamics are integrated, to preserve the underlying low- and high-level item relationships in a common latent space.

Graph Neural Network Multi-Task Learning +2

AsySQN: Faster Vertical Federated Learning Algorithms with Better Computation Resource Utilization

no code implementations 26 Sep 2021 Qingsong Zhang, Bin Gu, Cheng Deng, Songxiang Gu, Liefeng Bo, Jian Pei, Heng Huang

To address the challenges of communication and computation resource utilization, we propose an asynchronous stochastic quasi-Newton (AsySQN) framework for VFL, under which three algorithms, i.e., AsySQN-SGD, -SVRG and -SAGA, are proposed.

Privacy Preserving Vertical Federated Learning

Memory-Augmented Non-Local Attention for Video Super-Resolution

1 code implementation CVPR 2022 Jiyang Yu, Jingen Liu, Liefeng Bo, Tao Mei

Those methods achieve limited performance as they suffer from the challenge in spatial frame alignment and the lack of useful information from similar LR neighbor frames.

Analog Video Restoration Video Super-Resolution

Detection, Tracking, and Counting Meets Drones in Crowds: A Benchmark

1 code implementation CVPR 2021 Longyin Wen, Dawei Du, Pengfei Zhu, QinGhua Hu, Qilong Wang, Liefeng Bo, Siwei Lyu

To promote the development of object detection, tracking and counting algorithms in drone-captured videos, we construct a benchmark with a new drone-captured large-scale dataset, named DroneCrowd, formed by 112 video clips with 33,600 HD frames in various scenarios.

object-detection Object Detection +1

Data Augmentation for Object Detection via Differentiable Neural Rendering

1 code implementation 4 Mar 2021 Guanghan Ning, Guang Chen, Chaowei Tan, Si Luo, Liefeng Bo, Heng Huang

We propose a new offline data augmentation method for object detection, which semantically interpolates the training data with novel views.

Data Augmentation Neural Rendering +4

Outline to Story: Fine-grained Controllable Story Generation from Cascaded Events

1 code implementation 4 Jan 2021 Le Fang, Tao Zeng, Chaochun Liu, Liefeng Bo, Wen Dong, Changyou Chen

To our knowledge, our paper is among the first to propose a model and to create datasets for the task of "outline to story".

Keyword Extraction Language Modelling +1

Transformer-based Conditional Variational Autoencoder for Controllable Story Generation

2 code implementations 4 Jan 2021 Le Fang, Tao Zeng, Chaochun Liu, Liefeng Bo, Wen Dong, Changyou Chen

In this paper, we advocate to revive latent variable modeling, essentially the power of representation learning, in the era of Transformers to enhance controllability without hurting state-of-the-art generation effectiveness.

Decoder Representation Learning +1

Efficient Pig Counting in Crowds with Keypoints Tracking and Spatial-aware Temporal Response Filtering

no code implementations 27 May 2020 Guang Chen, Shiwen Shen, Longyin Wen, Si Luo, Liefeng Bo

Existing methods only focused on pig counting using single image, and its accuracy is challenged by several factors, including pig movements, occlusion and overlapping.

Edge-computing

EnsemFDet: An Ensemble Approach to Fraud Detection based on Bipartite Graph

no code implementations 23 Dec 2019 Yuxiang Ren, Hao Zhu, Jiawei Zhang, Peng Dai, Liefeng Bo

Existing fraud detection methods try to identify unexpected dense subgraphs and treat related nodes as suspicious.

Fraud Detection

Drone-based Joint Density Map Estimation, Localization and Tracking with Space-Time Multi-Scale Attention Network

1 code implementation 4 Dec 2019 Longyin Wen, Dawei Du, Pengfei Zhu, QinGhua Hu, Qilong Wang, Liefeng Bo, Siwei Lyu

This paper proposes a space-time multi-scale attention network (STANet) to solve density map estimation, localization and tracking in dense crowds of video clips captured by drones with arbitrary crowd density, perspective, and flight altitude.

Crowd Counting

Heterogeneous Deep Graph Infomax

1 code implementation 19 Nov 2019 Yuxiang Ren, Bo Liu, Chao Huang, Peng Dai, Liefeng Bo, Jiawei Zhang

The derived node representations can be used to serve various downstream tasks, such as node classification and node clustering.

Classification Clustering +5

Adaptive Wing Loss for Robust Face Alignment via Heatmap Regression

7 code implementations ICCV 2019 Xinyao Wang, Liefeng Bo, Li Fuxin

Then we propose a novel loss function, named Adaptive Wing loss, that is able to adapt its shape to different types of ground truth heatmap pixels.

Face Alignment regression +1

Spatiotemporal CNN for Video Object Segmentation

1 code implementation CVPR 2019 Kai Xu, Longyin Wen, Guorong Li, Liefeng Bo, Qingming Huang

Specifically, the temporal coherence branch pretrained in an adversarial fashion from unlabeled video data, is designed to capture the dynamic appearance and motion cues of video sequences to guide object segmentation.

Object Segmentation +5

ScratchDet: Training Single-Shot Object Detectors from Scratch

1 code implementation CVPR 2019 Rui Zhu, Shifeng Zhang, Xiaobo Wang, Longyin Wen, Hailin Shi, Liefeng Bo, Tao Mei

Taking this advantage, we are able to explore various types of networks for object detection, without suffering from the poor convergence.

General Classification Object +2

Multipath Sparse Coding Using Hierarchical Matching Pursuit

no code implementations CVPR 2013 Liefeng Bo, Xiaofeng Ren, Dieter Fox

Complex real-world signals, such as images, contain discriminative structures that differ in many aspects including scale, invariance, and data channel.

Image Classification

Discriminatively Trained Sparse Code Gradients for Contour Detection

no code implementations NeurIPS 2012 Xiaofeng Ren, Liefeng Bo

Finding contours in natural images is a fundamental problem that serves as the basis of many tasks such as image segmentation and object recognition.

Contour Detection Dictionary Learning +3

Unsupervised Template Learning for Fine-Grained Object Recognition

no code implementations NeurIPS 2012 Shulin Yang, Liefeng Bo, Jue Wang, Linda G. Shapiro

It differs from recognition of basic categories, such as humans, tables, and computers, in that there are global similarities in shape or structure shared within a category, and the differences are in the details of the object parts.

Object Object Recognition

A Joint Model of Language and Perception for Grounded Attribute Learning

no code implementations 27 Jun 2012 Cynthia Matuszek, Nicholas FitzGerald, Luke Zettlemoyer, Liefeng Bo, Dieter Fox

As robots become more ubiquitous and capable, it becomes ever more important to enable untrained users to easily interact with them.

Attribute Language Modelling

Kernel Descriptors for Visual Recognition

no code implementations NeurIPS 2010 Liefeng Bo, Xiaofeng Ren, Dieter Fox

We highlight the kernel view of orientation histograms, and show that they are equivalent to a certain type of match kernels over image patches.

Attribute Image Classification +1

Twin Gaussian Processes for Structured Prediction

no code implementations International Journal of Computer Vision 2010 Liefeng Bo, Cristian Sminchisescu

We describe twin Gaussian processes (TGP), a generic structured prediction method that uses Gaussian process (GP) priors on both covariates and responses, both multivariate, and estimates outputs by minimizing the Kullback-Leibler divergence between two GP modeled as normal distributions over finite index sets of training and testing examples, emphasizing the goal that similar inputs should produce similar percepts and this should hold, on average, between their marginal distributions.

3D Human Pose Estimation Camera Calibration +2

Efficient Match Kernel between Sets of Features for Visual Recognition

no code implementations NeurIPS 2009 Liefeng Bo, Cristian Sminchisescu

To address this problem, we propose an efficient match kernel (EMK), which maps local features to a low dimensional feature space, averages the resulting feature vectors to form a set-level feature, then applies a linear classifier.

Quantization

Conditional Neural Fields

no code implementations NeurIPS 2009 Jian Peng, Liefeng Bo, Jinbo Xu

To model the nonlinear relationship between input features and outputs we propose Conditional Neural Fields (CNF), a new conditional probabilistic graphical model for sequence labeling.

Handwriting Recognition Hyperparameter Optimization +1
