Search Results for author: Peng Dai

Found 42 papers, 24 papers with code

Towards Efficient Coarse-to-Fine Networks for Action and Gesture Recognition

no code implementations ECCV 2020 Niamul Quader, Juwei Lu, Peng Dai, Wei Li

State-of-the-art approaches to video-based action and gesture recognition often employ two key concepts: First, they employ multistream processing; second, they use an ensemble of convolutional networks.

3D Action Recognition Action Classification +3

Weight Excitation: Built-in Attention Mechanisms in Convolutional Neural Networks

1 code implementation ECCV 2020 Niamul Quader, Md Mafijul Islam Bhuiyan, Juwei Lu, Peng Dai, Wei Li

We propose novel approaches for simultaneously identifying important weights of a convolutional neural network (ConvNet) and providing more attention to the important weights during training.

3D Action Recognition 3D Object Classification +7

GSD: View-Guided Gaussian Splatting Diffusion for 3D Reconstruction

no code implementations 5 Jul 2024 Yuxuan Mu, Xinxin Zuo, Chuan Guo, Yilin Wang, Juwei Lu, Xiaofeng Wu, Songcen Xu, Peng Dai, Youliang Yan, Li Cheng

We present GSD, a diffusion model approach based on Gaussian Splatting (GS) representation for 3D object reconstruction from a single view.

3D Object Reconstruction 3D Reconstruction +1

SVG: 3D Stereoscopic Video Generation via Denoising Frame Matrix

no code implementations 29 Jun 2024 Peng Dai, Feitong Tan, Qiangeng Xu, David Futschik, Ruofei Du, Sean Fanello, Xiaojuan Qi, Yinda Zhang

We propose a pose-free and training-free approach for generating 3D stereoscopic videos using an off-the-shelf monocular video generation model.

Denoising Video Generation +1

EIVEN: Efficient Implicit Attribute Value Extraction using Multimodal LLM

no code implementations 13 Apr 2024 Henry Peng Zou, Gavin Heqing Yu, Ziwei Fan, Dan Bu, Han Liu, Peng Dai, Dongmei Jia, Cornelia Caragea

To address these issues, we introduce EIVEN, a data- and parameter-efficient generative framework that pioneers the use of multimodal LLM for implicit attribute value extraction.

Attribute Attribute Value Extraction

HMD-Poser: On-Device Real-time Human Motion Tracking from Scalable Sparse Observations

no code implementations CVPR 2024 Peng Dai, Yang Zhang, Tao Liu, Zhen Fan, Tianyuan Du, Zhuo Su, Xiaozheng Zheng, Zeming Li

It is especially challenging to achieve real-time human motion tracking on a standalone VR Head-Mounted Display (HMD) such as Meta Quest and PICO.


Generative Human Motion Stylization in Latent Space

no code implementations 24 Jan 2024 Chuan Guo, Yuxuan Mu, Xinxin Zuo, Peng Dai, Youliang Yan, Juwei Lu, Li Cheng

Building upon this, we present a novel generative model that produces diverse stylization results of a single motion (latent) code.

GO-NeRF: Generating Virtual Objects in Neural Radiance Fields

no code implementations 11 Jan 2024 Peng Dai, Feitong Tan, Xin Yu, Yinda Zhang, Xiaojuan Qi

To this end, we propose a new method, GO-NeRF, capable of utilizing scene context for high-quality and harmonious 3D object generation within an existing NeRF.

3D Generation Object

Random resistive memory-based deep extreme point learning machine for unified visual processing

no code implementations 14 Dec 2023 Shaocong Wang, Yizhao Gao, Yi Li, Woyu Zhang, Yifei Yu, Bo Wang, Ning Lin, Hegan Chen, Yue Zhang, Yang Jiang, Dingchen Wang, Jia Chen, Peng Dai, Hao Jiang, Peng Lin, Xumeng Zhang, Xiaojuan Qi, Xiaoxin Xu, Hayden So, Zhongrui Wang, Dashan Shang, Qi Liu, Kwang-Ting Cheng, Ming Liu

Our random resistive memory-based deep extreme point learning machine may pave the way for energy-efficient and training-friendly edge AI across various data modalities and tasks.

DreamStone: Image as Stepping Stone for Text-Guided 3D Shape Generation

2 code implementations 24 Mar 2023 Zhengzhe Liu, Peng Dai, Ruihui Li, Xiaojuan Qi, Chi-Wing Fu

The core of our approach is a two-stage feature-space alignment strategy that leverages a pre-trained single-view reconstruction (SVR) model to map CLIP features to shapes: to begin with, map the CLIP image feature to the detail-rich 3D shape space of the SVR model, then map the CLIP text feature to the 3D shape space through encouraging the CLIP-consistency between rendered images and the input text.

3D Shape Generation

Learning a Room with the Occ-SDF Hybrid: Signed Distance Function Mingled with Occupancy Aids Scene Representation

1 code implementation ICCV 2023 Xiaoyang Lyu, Peng Dai, Zizhang Li, Dongyu Yan, Yi Lin, Yifan Peng, Xiaojuan Qi

We found that the color rendering loss results in optimization bias against low-intensity areas, causing gradient vanishing and leaving these areas unoptimized.

Neural Rendering Surface Reconstruction

Multi-Behavior Graph Neural Networks for Recommender System

no code implementations 17 Feb 2023 Lianghao Xia, Chao Huang, Yong Xu, Peng Dai, Liefeng Bo

Recent years have witnessed the emerging success of many deep learning-based recommendation models for augmenting collaborative filtering architectures with various neural network architectures, such as multi-layer perceptron and autoencoder.

Collaborative Filtering Graph Neural Network +2

ISS: Image as Stepping Stone for Text-Guided 3D Shape Generation

2 code implementations 9 Sep 2022 Zhengzhe Liu, Peng Dai, Ruihui Li, Xiaojuan Qi, Chi-Wing Fu

Text-guided 3D shape generation remains challenging due to the absence of large paired text-shape data, the substantial semantic gap between these two modalities, and the structural complexity of 3D shapes.

3D Shape Generation

Towards Efficient and Scale-Robust Ultra-High-Definition Image Demoireing

1 code implementation 20 Jul 2022 Xin Yu, Peng Dai, Wenbo Li, Lan Ma, Jiajun Shen, Jia Li, Xiaojuan Qi

With the rapid development of mobile devices, modern widely-used mobile phones typically allow users to capture 4K resolution (i.e., ultra-high-definition) images.

4k Image Enhancement +2

Joint Spatial-Temporal and Appearance Modeling with Transformer for Multiple Object Tracking

1 code implementation 31 May 2022 Peng Dai, Yiqiang Feng, Renliang Weng, ChangShui Zhang

The recent trend in multiple object tracking (MOT) is heading towards leveraging deep learning to boost the tracking performance.

Decoder Multiple Object Tracking

Video Demoireing with Relation-Based Temporal Consistency

1 code implementation CVPR 2022 Peng Dai, Xin Yu, Lan Ma, Baoheng Zhang, Jia Li, Wenbo Li, Jiajun Shen, Xiaojuan Qi

Moire patterns, appearing as color distortions, severely degrade image and video qualities when filming a screen with digital cameras.


An Efficient and Robust System for Vertically Federated Random Forest

no code implementations 26 Jan 2022 Houpu Yao, Jiazhou Wang, Peng Dai, Liefeng Bo, Yanqing Chen

As there is a growing interest in utilizing data across multiple resources to build better machine learning models, many vertically federated learning algorithms have been proposed to preserve the data privacy of the participating organizations.

Federated Learning

Multi-Behavior Enhanced Recommendation with Cross-Interaction Collaborative Relation Modeling

1 code implementation 7 Jan 2022 Lianghao Xia, Chao Huang, Yong Xu, Peng Dai, Mengyin Lu, Liefeng Bo

Due to the overlook of user's multi-behavioral patterns over different items, existing recommendation methods are insufficient to capture heterogeneous collaborative signals from user multi-behavior data.

Collaborative Filtering Recommendation Systems +1

Spatial-Temporal Sequential Hypergraph Network for Crime Prediction with Dynamic Multiplex Relation Learning

1 code implementation IJCAI 2021 Lianghao Xia, Chao Huang, Yong Xu, Peng Dai, Liefeng Bo, Xiyue Zhang, Tianyi Chen

Crime prediction is crucial for public safety and resource optimization, yet is very challenging due to two aspects: i) the dynamics of criminal patterns across time and space, crime events are distributed unevenly on both spatial and temporal domains; ii) time-evolving dependencies between different types of crimes (e.g., Theft, Robbery, Assault, Damage) which reveal fine-grained semantics of crimes.

Crime Prediction Relation

Decompose the Sounds and Pixels, Recompose the Events

no code implementations 21 Dec 2021 Varshanth R. Rao, Md Ibrahim Khalil, Haoda Li, Peng Dai, Juwei Lu

In this paper, we propose a framework centering around a novel architecture called the Event Decomposition Recomposition Network (EDRNet) to tackle the Audio-Visual Event (AVE) localization problem in the supervised and weakly supervised settings.

Self-supervised Spatiotemporal Representation Learning by Exploiting Video Continuity

no code implementations 11 Dec 2021 Hanwen Liang, Niamul Quader, Zhixiang Chi, Lizhe Chen, Peng Dai, Juwei Lu, Yang Wang

Recent self-supervised video representation learning methods have found significant success by exploring essential properties of videos, e.g., speed, temporal order, etc.

Action Localization Action Recognition +3

Social Recommendation with Self-Supervised Metagraph Informax Network

1 code implementation 8 Oct 2021 Xiaoling Long, Chao Huang, Yong Xu, Huance Xu, Peng Dai, Lianghao Xia, Liefeng Bo

To model relation heterogeneity, we design a metapath-guided heterogeneous graph neural network to aggregate feature embeddings from different types of meta-relations across users and items, empowering SMIN to maintain dedicated representations for multi-faceted user- and item-wise dependencies.

Collaborative Filtering Graph Neural Network +1

Graph Meta Network for Multi-Behavior Recommendation

1 code implementation 8 Oct 2021 Lianghao Xia, Yong Xu, Chao Huang, Peng Dai, Liefeng Bo

Modern recommender systems often embed users and items into low-dimensional latent representations, based on their observed interactions.

Diversity Meta-Learning +2

Graph-Enhanced Multi-Task Learning of Multi-Level Transition Dynamics for Session-based Recommendation

1 code implementation 8 Oct 2021 Chao Huang, Jiahui Chen, Lianghao Xia, Yong Xu, Peng Dai, Yanqing Chen, Liefeng Bo, Jiashu Zhao, Jimmy Xiangji Huang

The learning processes of intra- and inter-session transition dynamics are integrated to preserve the underlying low- and high-level item relationships in a common latent space.

Graph Neural Network Multi-Task Learning +2

Traffic Flow Forecasting with Spatial-Temporal Graph Diffusion Network

1 code implementation 8 Oct 2021 Xiyue Zhang, Chao Huang, Yong Xu, Lianghao Xia, Peng Dai, Liefeng Bo, Junbo Zhang, Yu Zheng

Accurate forecasting of citywide traffic flow has been playing a critical role in a variety of spatial-temporal mining applications, such as intelligent traffic control and public risk assessment.

Traffic Prediction

Multiplex Behavioral Relation Learning for Recommendation via Memory Augmented Transformer Network

1 code implementation 8 Oct 2021 Lianghao Xia, Chao Huang, Yong Xu, Peng Dai, Bo Zhang, Liefeng Bo

The overlook of multiplex behavior relations can hardly recognize the multi-modal contextual signals across different types of interactions, which limit the feasibility of current recommendation methods.

Recommendation Systems Relation +1

Knowledge-Enhanced Hierarchical Graph Transformer Network for Multi-Behavior Recommendation

1 code implementation 8 Oct 2021 Lianghao Xia, Chao Huang, Yong Xu, Peng Dai, Xiyue Zhang, Hongsheng Yang, Jian Pei, Liefeng Bo

In particular: i) complex inter-dependencies across different types of user behaviors; ii) the incorporation of knowledge-aware item relations into the multi-behavior recommendation framework; iii) dynamic characteristics of multi-typed user-item interactions.

Graph Attention Recommendation Systems

Knowledge-aware Coupled Graph Neural Network for Social Recommendation

1 code implementation 8 Oct 2021 Chao Huang, Huance Xu, Yong Xu, Peng Dai, Lianghao Xia, Mengyin Lu, Liefeng Bo, Hao Xing, Xiaoping Lai, Yanfang Ye

While many recent efforts show the effectiveness of neural network-based social recommender systems, several important challenges have not been well addressed yet: (i) The majority of models only consider users' social connections, while ignoring the inter-dependent knowledge across items; (ii) Most of existing solutions are designed for singular type of user-item interactions, making them infeasible to capture the interaction heterogeneity; (iii) The dynamic nature of user-item interactions has been less explored in many social-aware recommendation techniques.

Collaborative Filtering Graph Neural Network +1

Class Semantics-based Attention for Action Detection

no code implementations ICCV 2021 Deepak Sridhar, Niamul Quader, Srikanth Muralidharan, Yaoxin Li, Peng Dai, Juwei Lu

Our attention mechanism outperforms prior self-attention modules such as the squeeze-and-excitation in action detection task.

Action Detection Action Localization

Boosting the Generalization Capability in Cross-Domain Few-shot Learning via Noise-enhanced Supervised Autoencoder

1 code implementation ICCV 2021 Hanwen Liang, Qiong Zhang, Peng Dai, Juwei Lu

State of the art (SOTA) few-shot learning (FSL) methods suffer significant performance drop in the presence of domain differences between source and target datasets.

cross-domain few-shot learning

Learning a Proposal Classifier for Multiple Object Tracking

1 code implementation CVPR 2021 Peng Dai, Renliang Weng, Wongun Choi, ChangShui Zhang, Zhangping He, Wei Ding

In this paper, we propose a novel proposal-based learnable framework, which models MOT as a proposal generation, proposal scoring and trajectory inference paradigm on an affinity graph.

Clustering Graph Clustering +2

EnsemFDet: An Ensemble Approach to Fraud Detection based on Bipartite Graph

no code implementations 23 Dec 2019 Yuxiang Ren, Hao Zhu, Jiawei Zhang, Peng Dai, Liefeng Bo

Existing fraud detection methods try to identify unexpected dense subgraphs and treat related nodes as suspicious.

Fraud Detection

Neural Point Cloud Rendering via Multi-Plane Projection

1 code implementation CVPR 2020 Peng Dai, Yinda Zhang, Zhuwen Li, Shuaicheng Liu, Bing Zeng

The input to the network is the raw point cloud of a scene, and the output is an image or image sequence from a novel view or along a novel camera trajectory.

Heterogeneous Deep Graph Infomax

1 code implementation 19 Nov 2019 Yuxiang Ren, Bo Liu, Chao Huang, Peng Dai, Liefeng Bo, Jiawei Zhang

The derived node representations can be used to serve various downstream tasks, such as node classification and node clustering.

Classification Clustering +5

An Adaptive Psychoacoustic Model for Automatic Speech Recognition

no code implementations 14 Sep 2016 Peng Dai, Xue Teng, Frank Rudzicz, Ing Yann Soon

Experiments are carried out on the AURORA2 database and show that the word recognition rate using our proposed feature extraction method is significantly increased over the baseline.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Topological Value Iteration Algorithms

no code implementations 16 Jan 2014 Peng Dai, Mausam, Daniel Sabby Weld, Judy Goldsmith

Value iteration is a powerful yet inefficient algorithm for Markov decision processes (MDPs) because it puts the majority of its effort into backing up the entire state space, which turns out to be unnecessary in many cases.
