Search Results for author: Mingze Xu

Found 22 papers, 11 papers with code

SkeleTR: Towards Skeleton-based Action Recognition in the Wild

no code implementations 20 Sep 2023 Haodong Duan, Mingze Xu, Bing Shuai, Davide Modolo, Zhuowen Tu, Joseph Tighe, Alessandro Bergamo

It first models the intra-person skeleton dynamics for each skeleton sequence with graph convolutions, and then uses stacked Transformer encoders to capture person interactions that are important for action recognition in general scenarios.

Action Classification Action Recognition +2

SkeleTR: Towards Skeleton-based Action Recognition in the Wild

no code implementations ICCV 2023 Haodong Duan, Mingze Xu, Bing Shuai, Davide Modolo, Zhuowen Tu, Joseph Tighe, Alessandro Bergamo

It first models the intra-person skeleton dynamics for each skeleton sequence with graph convolutions, and then uses stacked Transformer encoders to capture person interactions that are important for action recognition in the wild.

Action Classification Action Recognition +3

An In-depth Study of Stochastic Backpropagation

1 code implementation 30 Sep 2022 Jun Fang, Mingze Xu, Hao Chen, Bing Shuai, Zhuowen Tu, Joseph Tighe

In this paper, we provide an in-depth study of Stochastic Backpropagation (SBP) when training deep neural networks for standard image classification and object detection tasks.

Image Classification object-detection +1

MeMOT: Multi-Object Tracking with Memory

no code implementations CVPR 2022 Jiarui Cai, Mingze Xu, Wei Li, Yuanjun Xiong, Wei Xia, Zhuowen Tu, Stefano Soatto

We propose an online tracking algorithm that performs the object detection and data association under a common framework, capable of linking objects after a long time span.

Multi-Object Tracking Object +2

Long Short-Term Transformer for Online Action Detection

2 code implementations NeurIPS 2021 Mingze Xu, Yuanjun Xiong, Hao Chen, Xinyu Li, Wei Xia, Zhuowen Tu, Stefano Soatto

We present Long Short-term TRansformer (LSTR), a temporal modeling algorithm for online action detection, which employs a long- and short-term memory mechanism to model prolonged sequence data.

Online Action Detection Playing the Game of 2048

Semi-TCL: Semi-Supervised Track Contrastive Representation Learning

no code implementations 6 Jul 2021 Wei Li, Yuanjun Xiong, Shuo Yang, Mingze Xu, Yongxin Wang, Wei Xia

We design a new instance-to-track matching objective to learn appearance embedding that compares a candidate detection to the embedding of the tracks persisted in the tracker.

Multiple Object Tracking Object +1

Stepwise Goal-Driven Networks for Trajectory Prediction

1 code implementation 25 Mar 2021 Chuhua Wang, Yuchen Wang, Mingze Xu, David J. Crandall

We propose to predict the future trajectories of observed agents (e.g., pedestrians or vehicles) by estimating and using their goals at multiple time scales.

Multi-future Trajectory Prediction Trajectory Prediction

Learning Self-Consistency for Deepfake Detection

1 code implementation ICCV 2021 Tianchen Zhao, Xiang Xu, Mingze Xu, Hui Ding, Yuanjun Xiong, Wei Xia

We propose a new method to detect deepfake images using the cue of the source feature inconsistency within the forged images.

DeepFake Detection Face Swapping +2

Deep Tiered Image Segmentation For Detecting Internal Ice Layers in Radar Imagery

no code implementations 8 Oct 2020 Yuchen Wang, Mingze Xu, John Paden, Lora Koenig, Geoffrey Fox, David Crandall

Understanding the structure of Earth's polar ice sheets is important for modeling how global warming will impact polar ice and, in turn, the Earth's climate.

Image Segmentation Semantic Segmentation

Embodied Visual Recognition

no code implementations 9 Apr 2019 Jianwei Yang, Zhile Ren, Mingze Xu, Xinlei Chen, David Crandall, Devi Parikh, Dhruv Batra

Passive visual systems typically fail to recognize objects in the amodal setting where they are heavily occluded.

Object Object Localization +1

StartNet: Online Detection of Action Start in Untrimmed Videos

no code implementations ICCV 2019 Mingfei Gao, Mingze Xu, Larry S. Davis, Richard Socher, Caiming Xiong

We propose StartNet to address Online Detection of Action Start (ODAS) where action starts and their associated categories are detected in untrimmed, streaming videos.

Action Classification Policy Gradient Methods

Unsupervised Traffic Accident Detection in First-Person Videos

3 code implementations 2 Mar 2019 Yu Yao, Mingze Xu, Yuchen Wang, David J. Crandall, Ella M. Atkins

Recognizing abnormal events such as traffic violations and accidents in natural driving scenes is essential for successful autonomous driving and advanced driver assistance systems.

Autonomous Driving Object Localization +4

Temporal Recurrent Networks for Online Action Detection

2 code implementations ICCV 2019 Mingze Xu, Mingfei Gao, Yi-Ting Chen, Larry S. Davis, David J. Crandall

Most work on temporal action detection is formulated as an offline problem, in which the start and end times of actions are determined after the entire video is fully observed.

Online Action Detection

Egocentric Vision-based Future Vehicle Localization for Intelligent Driving Assistance Systems

2 code implementations 19 Sep 2018 Yu Yao, Mingze Xu, Chiho Choi, David J. Crandall, Ella M. Atkins, Behzad Dariush

Predicting the future location of vehicles is essential for safety-critical applications such as advanced driver assistance systems (ADAS) and autonomous driving.

Autonomous Driving Motion Planning +1

Joint Person Segmentation and Identification in Synchronized First- and Third-person Videos

no code implementations ECCV 2018 Mingze Xu, Chenyou Fan, Yuchen Wang, Michael S. Ryoo, David J. Crandall

In this paper, we wish to solve two specific problems: (1) given two or more synchronized third-person videos of a scene, produce a pixel-level segmentation of each visible person and identify corresponding people across different views (i.e., determine who in camera A corresponds with whom in camera B), and (2) given one or more synchronized third-person videos as well as a first-person video taken by a mobile or wearable camera, segment and identify the camera wearer in the third-person videos.

Segmentation

Multi-Task Spatiotemporal Neural Networks for Structured Surface Reconstruction

1 code implementation 11 Jan 2018 Mingze Xu, Chenyou Fan, John D Paden, Geoffrey C. Fox, David J. Crandall

Deep learning methods have surpassed the performance of traditional techniques on a wide range of problems in computer vision, but nearly all of this work has studied consumer photos, where precisely correct output is often not critical.

Structured Prediction Surface Reconstruction

Fully-Coupled Two-Stream Spatiotemporal Networks for Extremely Low Resolution Action Recognition

no code implementations 11 Jan 2018 Mingze Xu, Aidean Sharghi, Xin Chen, David J. Crandall

A major emerging challenge is how to protect people's privacy as cameras and computer vision are increasingly integrated into our daily lives, including in smart devices inside homes.

Action Recognition Temporal Action Localization

Automatic Estimation of Ice Bottom Surfaces from Radar Imagery

no code implementations 21 Dec 2017 Mingze Xu, David J. Crandall, Geoffrey C. Fox, John D Paden

Ground-penetrating radar on planes and satellites now makes it practical to collect 3D observations of the subsurface structure of the polar ice sheets, providing crucial data for understanding and tracking global climate change.

Identifying First-person Camera Wearers in Third-person Videos

no code implementations CVPR 2017 Chenyou Fan, Jang-Won Lee, Mingze Xu, Krishna Kumar Singh, Yong Jae Lee, David J. Crandall, Michael S. Ryoo

We consider scenarios in which we wish to perform joint scene understanding, object tracking, activity recognition, and other tasks in environments in which multiple people are wearing body-worn cameras while a third-person static camera also captures the scene.

Activity Recognition Object Tracking +1
