Search Results for author: Ivan Marsic

Found 16 papers, 3 papers with code

Concurrent Activity Recognition with Multimodal CNN-LSTM Structure

no code implementations 6 Feb 2017 Xinyu Li, Yanyi Zhang, Jianyu Zhang, Shuhong Chen, Ivan Marsic, Richard A. Farneth, Randall S. Burd

Our system is the first to address concurrent activity recognition with multisensory data using a single model that is scalable, simple to train, and easy to deploy.

Concurrent Activity Recognition Decision Making

Online People Tracking and Identification with RFID and Kinect

no code implementations 10 Feb 2017 Xinyu Li, Yanyi Zhang, Ivan Marsic, Randall S. Burd

We introduce a novel, accurate and practical system for real-time people tracking and identification.

Position TAG

Progress Estimation and Phase Detection for Sequential Processes

no code implementations 28 Feb 2017 Xinyu Li, Yanyi Zhang, Jianyu Zhang, Yueyang Chen, Shuhong Chen, Yue Gu, Moliang Zhou, Richard A. Farneth, Ivan Marsic, Randall S. Burd

For the Olympic swimming dataset, our system achieved an accuracy of 88%, an F1-score of 0.58, a completeness estimation error of 6.3% and a remaining-time estimation error of 2.9 minutes.

Activity Recognition Multimodal Deep Learning

Process-oriented Iterative Multiple Alignment for Medical Process Mining

no code implementations 16 Sep 2017 Shuhong Chen, Sen Yang, Moliang Zhou, Randall S. Burd, Ivan Marsic

We applied PIMA to analyze medical workflow data, showing how iterative alignment can better represent the data and facilitate the extraction of insights from data visualization.

Data Visualization

Deep Multimodal Learning for Emotion Recognition in Spoken Language

no code implementations 22 Feb 2018 Yue Gu, Shuhong Chen, Ivan Marsic

In this paper, we present a novel deep multimodal framework to predict human emotions based on sentence-level spoken language.

Emotion Recognition Sentence

Multimodal Affective Analysis Using Hierarchical Attention Strategy with Word-Level Alignment

no code implementations ACL 2018 Yue Gu, Kangning Yang, Shiyu Fu, Shuhong Chen, Xinyu Li, Ivan Marsic

Multimodal affective computing, learning to recognize and interpret human affects and subjective information from multiple data sources, is still challenging because: (i) it is hard to extract informative features to represent human affects from heterogeneous inputs; (ii) current fusion strategies only fuse different modalities at an abstract level, ignoring time-dependent interactions between modalities.

Hybrid Attention based Multimodal Network for Spoken Language Classification

no code implementations COLING 2018 Yue Gu, Kangning Yang, Shiyu Fu, Shuhong Chen, Xinyu Li, Ivan Marsic

The proposed hybrid attention architecture helps the system focus on learning informative representations for both modality-specific feature extraction and model fusion.

Classification Emotion Recognition +4

RHR-Net: A Residual Hourglass Recurrent Neural Network for Speech Enhancement

2 code implementations 15 Apr 2019 Jalal Abdulbaqi, Yue Gu, Ivan Marsic

Most current speech enhancement models use spectrogram features that require an expensive transformation and result in phase information loss.

Speech Enhancement

Multi-Label Activity Recognition using Activity-specific Features and Activity Correlations

no code implementations CVPR 2021 Yanyi Zhang, Xinyu Li, Ivan Marsic

Multi-label activity recognition is designed for recognizing multiple activities that are performed simultaneously or sequentially in each video.

Activity Recognition Video Classification

VidTr: Video Transformer Without Convolutions

no code implementations ICCV 2021 Yanyi Zhang, Xinyu Li, Chunhui Liu, Bing Shuai, Yi Zhu, Biagio Brattoli, Hao Chen, Ivan Marsic, Joseph Tighe

We first introduce the vanilla video transformer and show that the transformer module is able to perform spatio-temporal modeling from raw pixels, but with heavy memory usage.

Action Classification Action Recognition +1

Progressive Learning for Stabilizing Label Selection in Speech Separation with Mapping-based Method

no code implementations 20 Oct 2021 Chenyang Gao, Yue Gu, Ivan Marsic

We investigate the use of the mapping-based method in the time domain and show that it can perform better on a large training set than the masking-based method.

Speech Recognition Speech Separation

Generating Privacy-Preserving Process Data with Deep Generative Models

1 code implementation 15 Mar 2022 Keyi Li, Sen Yang, Travis M. Sullivan, Randall S. Burd, Ivan Marsic

We experimented with different models of representation learning and used the learned model to generate synthetic process data.

Privacy Preserving Representation Learning

Improving Label Assignments Learning by Dynamic Sample Dropout Combined with Layer-wise Optimization in Speech Separation

no code implementations 20 Nov 2023 Chenyang Gao, Yue Gu, Ivan Marsic

Despite its success, previous studies showed that PIT is plagued by excessive label assignment switching in adjacent epochs, which prevents the model from learning better label assignments.

Speech Separation
