FullTransNet: Full Transformer with Local-Global Attention for Video Summarization

no code implementations • 1 Jan 2025 • Libin Lan, Lu Jiang, Tianshu Yu, Xiaojuan Liu, Zhongshi He

Based on this, we propose a transformer-like architecture, named FullTransNet, which has a full encoder-decoder structure with local-global sparse attention for video summarization.

Decoder, Supervised Video Summarization

An Unbiased Risk Estimator for Partial Label Learning with Augmented Classes

1 code implementation • 29 Sep 2024 • Jiayu Hu, Senlin Shu, Beibei Li, Tao Xiang, Zhongshi He

To address this issue, in this paper, we focus on the problem of Partial Label Learning with Augmented Class (PLLAC), where one or more augmented classes are not visible in the training stage but appear in the inference stage.

Partial Label Learning, Weakly-supervised Learning

A Hierarchical Interactive Network for Joint Span-based Aspect-Sentiment Analysis

1 code implementation • COLING 2022 • Wei Chen, Jinglong Du, Zhao Zhang, Fuzhen Zhuang, Zhongshi He

Recently, some span-based methods have achieved encouraging performance on joint aspect-sentiment analysis; they first extract aspects (aspect extraction) by detecting aspect boundaries and then classify the span-level sentiments (sentiment classification).

Aspect Extraction, Sentiment Analysis

FHEDN: A based on context modeling Feature Hierarchy Encoder-Decoder Network for face detection

no code implementations • 11 Dec 2017 • Zexun Zhou, Zhongshi He, Ziyu Chen, Yuanyuan Jia, HaiYan Wang, Jinglong Du, Dingding Chen

The proposed network consists of multiple context modeling and prediction modules, which are designed to detect small, blurred, occluded, and diversely posed faces.

Decoder, Face Detection

A breakthrough in Speech emotion recognition using Deep Retinal Convolution Neural Networks

no code implementations • 12 Jul 2017 • Yafeng Niu, Dongsheng Zou, Yadong Niu, Zhongshi He, Hua Tan

Speech emotion recognition (SER) studies the formation and change of a speaker's emotional state from the speech signal, with the aim of making human-computer interaction more intelligent.

Data Augmentation, Speech Emotion Recognition

