Search Results for author: Bing Shuai

Found 33 papers, 8 papers with code

Self-Supervised Multi-Object Tracking with Path Consistency

1 code implementation 8 Apr 2024 Zijia Lu, Bing Shuai, Yanbei Chen, Zhenlin Xu, Davide Modolo

In this paper, we propose a novel concept of path consistency to learn robust object matching without using manual object identity supervision.

Multi-Object Tracking Object

SkeleTR: Towards Skeleton-based Action Recognition in the Wild

no code implementations 20 Sep 2023 Haodong Duan, Mingze Xu, Bing Shuai, Davide Modolo, Zhuowen Tu, Joseph Tighe, Alessandro Bergamo

It first models the intra-person skeleton dynamics for each skeleton sequence with graph convolutions, and then uses stacked Transformer encoders to capture person interactions that are important for action recognition in general scenarios.

Action Classification Action Recognition +2

Object-Centric Multiple Object Tracking

1 code implementation ICCV 2023 Zixu Zhao, Jiaze Wang, Max Horn, Yizhuo Ding, Tong He, Zechen Bai, Dominik Zietlow, Carl-Johann Simon-Gabriel, Bing Shuai, Zhuowen Tu, Thomas Brox, Bernt Schiele, Yanwei Fu, Francesco Locatello, Zheng Zhang, Tianjun Xiao

Unsupervised object-centric learning methods allow the partitioning of scenes into entities without additional localization information and are excellent candidates for reducing the annotation burden of multiple-object tracking (MOT) pipelines.

Multiple Object Tracking Object +3

SkeleTR: Towards Skeleton-based Action Recognition in the Wild

no code implementations ICCV 2023 Haodong Duan, Mingze Xu, Bing Shuai, Davide Modolo, Zhuowen Tu, Joseph Tighe, Alessandro Bergamo

It first models the intra-person skeleton dynamics for each skeleton sequence with graph convolutions, and then uses stacked Transformer encoders to capture person interactions that are important for action recognition in the wild.

Action Classification Action Recognition +3

Large Scale Real-World Multi-Person Tracking

1 code implementation ECCV 2022 Bing Shuai, Alessandro Bergamo, Uta Buechler, Andrew Berneshawi, Alyssa Boden, Joseph Tighe

This paper presents a new large-scale multi-person tracking dataset, PersonPath22, which is over an order of magnitude larger than currently available high-quality multi-object tracking datasets such as MOT17, HiEve, and MOT20.

Multi-Object Tracking

An In-depth Study of Stochastic Backpropagation

1 code implementation 30 Sep 2022 Jun Fang, Mingze Xu, Hao Chen, Bing Shuai, Zhuowen Tu, Joseph Tighe

In this paper, we provide an in-depth study of Stochastic Backpropagation (SBP) when training deep neural networks for standard image classification and object detection tasks.

Image Classification object-detection +1

Transfer of Representations to Video Label Propagation: Implementation Factors Matter

no code implementations 10 Mar 2022 Daniel McKee, Zitong Zhan, Bing Shuai, Davide Modolo, Joseph Tighe, Svetlana Lazebnik

This work studies feature representations for dense label propagation in video, with a focus on recently proposed methods that learn video correspondence using self-supervised signals such as colorization or temporal cycle consistency.

Colorization

Id-Free Person Similarity Learning

no code implementations CVPR 2022 Bing Shuai, Xinyu Li, Kaustav Kundu, Joseph Tighe

In this work, we explore training such a model by only using person box annotations, thus removing the necessity of manually labeling a training dataset with additional person identity annotation as these are expensive to collect.

Contrastive Learning Human Detection +2

Multi-Object Tracking with Hallucinated and Unlabeled Videos

no code implementations 19 Aug 2021 Daniel McKee, Bing Shuai, Andrew Berneshawi, Manchen Wang, Davide Modolo, Svetlana Lazebnik, Joseph Tighe

Next, to tackle harder tracking cases, we mine hard examples across an unlabeled pool of real videos with a tracker trained on our hallucinated video data.

Multi-Object Tracking Object

VidTr: Video Transformer Without Convolutions

no code implementations ICCV 2021 Yanyi Zhang, Xinyu Li, Chunhui Liu, Bing Shuai, Yi Zhu, Biagio Brattoli, Hao Chen, Ivan Marsic, Joseph Tighe

We first introduce the vanilla video transformer and show that the transformer module is able to perform spatio-temporal modeling from raw pixels, but with heavy memory usage.

Action Classification Action Recognition +1

NUTA: Non-uniform Temporal Aggregation for Action Recognition

no code implementations 15 Dec 2020 Xinyu Li, Chunhui Liu, Bing Shuai, Yi Zhu, Hao Chen, Joseph Tighe

In the world of action recognition research, one primary focus has been on how to construct and train networks to model the spatial-temporal volume of an input video.

Action Recognition

Directional Temporal Modeling for Action Recognition

no code implementations ECCV 2020 Xinyu Li, Bing Shuai, Joseph Tighe

Many current activity recognition models use 3D convolutional neural networks (e.g., I3D, I3D-NL) to generate local spatial-temporal features.

Action Recognition

Multi-Object Tracking with Siamese Track-RCNN

no code implementations 16 Apr 2020 Bing Shuai, Andrew G. Berneshawi, Davide Modolo, Joseph Tighe

Multi-object tracking systems often consist of a combination of a detector, a short term linker, a re-identification feature extractor and a solver that takes the output from these separate components and makes a final prediction.

Multi-Object Tracking Object

Understanding the impact of mistakes on background regions in crowd counting

no code implementations 30 Mar 2020 Davide Modolo, Bing Shuai, Rahul Rama Varior, Joseph Tighe

Our results show that (i) mistakes on background are substantial and they are responsible for 18-49% of the total error, (ii) models do not generalize well to different kinds of backgrounds and perform poorly on completely background images, and (iii) models make many more mistakes than those captured by the standard Mean Absolute Error (MAE) metric, as counting on background compensates considerably for misses on foreground.

Crowd Counting

Semantic Correlation Promoted Shape-Variant Context for Segmentation

1 code implementation CVPR 2019 Henghui Ding, Xudong Jiang, Bing Shuai, Ai Qun Liu, Gang Wang

In this way, the proposed network aggregates the context information of a pixel from its semantic-correlated region instead of a predefined fixed region.

Denoising Segmentation +1

Toward Achieving Robust Low-Level and High-Level Scene Parsing

1 code implementation journal 2019 Bing Shuai, Henghui Ding, Ting Liu, Gang Wang, Xudong Jiang

Furthermore, we introduce a “dense skip” architecture to retain a rich set of low-level information from the pre-trained CNN, which is essential to improve the low-level parsing performance.

Scene Parsing Scene Segmentation +2

Multi-Scale Attention Network for Crowd Counting

no code implementations 17 Jan 2019 Rahul Rama Varior, Bing Shuai, Joseph Tighe, Davide Modolo

In crowd counting datasets, people appear at different scales, depending on their distance from the camera.

Crowd Counting

Improving Fast Segmentation With Teacher-student Learning

no code implementations 19 Oct 2018 Jiafeng Xie, Bing Shuai, Jian-Fang Hu, Jingyang Lin, Wei-Shi Zheng

Recently, segmentation neural networks have been significantly improved by demonstrating very promising accuracies on public benchmarks.

Segmentation

Context Contrasted Feature and Gated Multi-Scale Aggregation for Scene Segmentation

1 code implementation CVPR 2018 Henghui Ding, Xudong Jiang, Bing Shuai, Ai Qun Liu, Gang Wang

In this paper, we first propose a novel context contrasted local feature that not only leverages the informative context but also spotlights the local information in contrast to the context.

Scene Segmentation Segmentation

Deep Level Sets for Salient Object Detection

no code implementations CVPR 2017 Ping Hu, Bing Shuai, Jun Liu, Gang Wang

Our method drives the network to learn a Level Set function for salient objects so it can output more accurate boundaries and compact saliency.

Object object-detection +3

Episodic CAMN: Contextual Attention-Based Memory Networks With Iterative Feedback for Scene Labeling

no code implementations CVPR 2017 Abrar H. Abdulnabi, Bing Shuai, Stefan Winkler, Gang Wang

Scene labeling can be seen as a sequence-sequence prediction task (pixels-labels), and it is quite important to leverage relevant context to enhance the performance of pixel classification.

General Classification Scene Labeling

Improving Fully Convolution Network for Semantic Segmentation

no code implementations 28 Nov 2016 Bing Shuai, Ting Liu, Gang Wang

In addition, dense skip connections are added so that the context network can be effectively optimized.

Scene Parsing Segmentation +1

A Siamese Long Short-Term Memory Architecture for Human Re-Identification

no code implementations ECCV 2016 Rahul Rama Varior, Bing Shuai, Jiwen Lu, Dong Xu, Gang Wang

Matching pedestrians across multiple camera views, known as human re-identification, is a challenging problem in visual surveillance.

Person Re-Identification

Joint Learning of Siamese CNNs and Temporally Constrained Metrics for Tracklet Association

no code implementations 15 May 2016 Bing Wang, Li Wang, Bing Shuai, Zhen Zuo, Ting Liu, Kap Luk Chan, Gang Wang

Then the Siamese CNN and temporally constrained metrics are jointly learned online to construct the appearance-based tracklet affinity models.

Multi-Object Tracking Multi-Task Learning

Recent Advances in Convolutional Neural Networks

no code implementations 22 Dec 2015 Jiuxiang Gu, Zhenhua Wang, Jason Kuen, Lianyang Ma, Amir Shahroudy, Bing Shuai, Ting Liu, Xingxing Wang, Li Wang, Gang Wang, Jianfei Cai, Tsuhan Chen

In the last few years, deep learning has led to very good performance on a variety of problems, such as visual recognition, speech recognition and natural language processing.

Speech Recognition

Learning Contextual Dependencies with Convolutional Hierarchical Recurrent Neural Networks

no code implementations 13 Sep 2015 Zhen Zuo, Bing Shuai, Gang Wang, Xiao Liu, Xingxing Wang, Bing Wang

In this manuscript, we integrate CNNs with HRNNs, and develop end-to-end convolutional hierarchical recurrent neural networks (C-HRNNs).

General Classification Image Classification

DAG-Recurrent Neural Networks For Scene Labeling

no code implementations CVPR 2016 Bing Shuai, Zhen Zuo, Gang Wang, Bing Wang

In image labeling, local representations for image units are usually generated from their surrounding image patches, thus long-range contextual information is not effectively encoded.

General Classification Scene Labeling +1

Exemplar Based Deep Discriminative and Shareable Feature Learning for Scene Image Classification

no code implementations 21 Aug 2015 Zhen Zuo, Gang Wang, Bing Shuai, Lifan Zhao, Qingxiong Yang

In order to encode the class correlation and class specific information in image representation, we propose a new local feature learning approach named Deep Discriminative and Shareable Feature Learning (DDSFL).

General Classification Image Classification

Integrating Parametric and Non-Parametric Models For Scene Labeling

no code implementations CVPR 2015 Bing Shuai, Gang Wang, Zhen Zuo, Bing Wang, Lifan Zhao

We adopt Convolutional Neural Networks (CNN) as our parametric model to learn discriminative features and classifiers for local patch classification.

General Classification Metric Learning +1
