Search Results for author: Xiaodong Chen

Found 16 papers, 6 papers with code

Scaling Law for Post-training after Model Pruning

no code implementations 15 Nov 2024 Xiaodong Chen, Yuxuan Hu, Jing Zhang, Xiaokang Zhang, Cuiping Li, Hong Chen

Despite this, post-training after pruning is crucial for performance recovery and can be resource-intensive.

ChatVTG: Video Temporal Grounding via Chat with Video Dialogue Large Language Models

no code implementations 1 Oct 2024 Mengxue Qu, Xiaodong Chen, Wu Liu, Alicia Li, Yao Zhao

Video Temporal Grounding (VTG) aims to ground specific segments within an untrimmed video corresponding to the given natural language query.

Motion Capture from Inertial and Vision Sensors

no code implementations 23 Jul 2024 Xiaodong Chen, Wu Liu, Qian Bao, Xinchen Liu, Quanwei Yang, Ruoli Dai, Tao Mei

With the proposed MINIONS, we conduct experiments on multi-modal motion capture and explore the possibilities of consumer-affordable motion capture using a monocular camera and very few IMUs.

Streamlining Redundant Layers to Compress Large Language Models

no code implementations 28 Mar 2024 Xiaodong Chen, Yuxuan Hu, Jing Zhang, Yanling Wang, Cuiping Li, Hong Chen

This paper introduces LLM-Streamline, a pioneering work on layer pruning for large language models (LLMs).

Model Compression

Direct-a-Video: Customized Video Generation with User-Directed Camera Movement and Object Motion

no code implementations 5 Feb 2024 Shiyuan Yang, Liang Hou, Haibin Huang, Chongyang Ma, Pengfei Wan, Di Zhang, Xiaodong Chen, Jing Liao

In practice, users often desire the ability to control object motion and camera movement independently for customized video creation.

Object Video Generation

Measuring the Discrepancy between 3D Geometric Models using Directional Distance Fields

no code implementations 18 Jan 2024 Siyu Ren, Junhui Hou, Xiaodong Chen, Hongkai Xiong, Wenping Wang

We then transfer the discrepancy between two 3D geometric models as the discrepancy between their DDFs defined on an identical domain, naturally establishing model correspondence.

3D geometry, Scene Flow Estimation

Uni-paint: A Unified Framework for Multimodal Image Inpainting with Pretrained Diffusion Model

1 code implementation 11 Oct 2023 Shiyuan Yang, Xiaodong Chen, Jing Liao

Recently, text-to-image denoising diffusion probabilistic models (DDPMs) have demonstrated impressive image generation capabilities and have also been successfully applied to image inpainting.

Image Denoising, Image Inpainting

$\rm SP^3$: Enhancing Structured Pruning via PCA Projection

1 code implementation 31 Aug 2023 Yuxuan Hu, Jing Zhang, Zhe Zhao, Chen Zhao, Xiaodong Chen, Cuiping Li, Hong Chen

Structured pruning is a widely used technique for reducing the size of pre-trained language models (PLMs), but current methods often overlook the potential of compressing the hidden dimension (d) in PLMs, a dimension critical to model size and efficiency.

GeoUDF: Surface Reconstruction from 3D Point Clouds via Geometry-guided Distance Representation

1 code implementation ICCV 2023 Siyu Ren, Junhui Hou, Xiaodong Chen, Ying He, Wenping Wang

We present a learning-based method, namely GeoUDF, to tackle the long-standing and challenging problem of reconstructing a discrete surface from a sparse point cloud. To be specific, we propose a geometry-guided learning method for UDF and its gradient estimation that explicitly formulates the unsigned distance of a query point as the learnable affine averaging of its distances to the tangent planes of neighboring points on the surface.

Surface Reconstruction

MAPLE: Masked Pseudo-Labeling autoEncoder for Semi-supervised Point Cloud Action Recognition

no code implementations 1 Sep 2022 Xiaodong Chen, Wu Liu, Xinchen Liu, Yongdong Zhang, Jungong Han, Tao Mei

In DestFormer, the spatial and temporal dimensions of the 4D point cloud videos are decoupled to achieve efficient self-attention for learning both long-term and short-term features.

Action Recognition

CorrI2P: Deep Image-to-Point Cloud Registration via Dense Correspondence

1 code implementation 12 Jul 2022 Siyu Ren, Yiming Zeng, Junhui Hou, Xiaodong Chen

Motivated by the intuition that the critical step of localizing a 2D image in the corresponding 3D point cloud is establishing 2D-3D correspondence between them, we propose the first feature-based dense correspondence framework for addressing the image-to-point cloud registration problem, dubbed CorrI2P, which consists of three modules, i.e., feature embedding, symmetric overlapping region detection, and pose estimation through the established correspondence.

Image to Point Cloud Registration, Pose Estimation

Part-level Action Parsing via a Pose-guided Coarse-to-Fine Framework

no code implementations 9 Mar 2022 Xiaodong Chen, Xinchen Liu, Wu Liu, Kun Liu, Dong Wu, Yongdong Zhang, Tao Mei

Therefore, researchers start to focus on a new task, Part-level Action Parsing (PAP), which aims to not only predict the video-level action but also recognize the frame-level fine-grained actions or interactions of body parts for each person in the video.

Action Parsing, Action Recognition

A Baseline Framework for Part-level Action Parsing and Action Recognition

no code implementations 7 Oct 2021 Xiaodong Chen, Xinchen Liu, Kun Liu, Wu Liu, Tao Mei

This technical report introduces our 2nd place solution to Kinetics-TPS Track on Part-level Action Parsing in ICCV DeeperAction Workshop 2021.

Action Parsing, Action Recognition +1
