TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Micro Expression Recognition	CASME3	HTNet	UF1	57.67	# 1
Micro Expression Recognition	CASME3	HTNet	UAR	54.15	# 1
Micro-Expression Recognition	CASME II	HTNet	UF1	95.32	# 1
Micro-Expression Recognition	CASME II	HTNet	UAR	95.16	# 1
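
The UF1 and UAR columns above are class-balanced metrics: UF1 (unweighted F1) is the per-class F1 score averaged over classes, and UAR (unweighted average recall) is the per-class recall averaged over classes. A minimal sketch of how such scores could be computed is shown below. The function name `uf1_uar` and the toy labels are illustrative assumptions, not taken from the HTNet code, and the sketch returns fractions in [0, 1], whereas the table appears to report percentages.

```python
from collections import Counter


def uf1_uar(y_true, y_pred):
    """Return (UF1, UAR) as fractions in [0, 1] for two equal-length label lists.

    Hypothetical helper for illustration; not part of the HTNet repository.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    # Average over the classes that occur in the ground truth,
    # so every class has non-zero support.
    classes = sorted(set(y_true))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fn[t] += 1  # missed a sample of class t
            fp[p] += 1  # wrongly predicted class p

    f1_scores, recalls = [], []
    for c in classes:
        # Per-class F1 = 2TP / (2TP + FP + FN)
        f1_scores.append(2 * tp[c] / (2 * tp[c] + fp[c] + fn[c]))
        # Per-class recall = TP / (TP + FN)
        recalls.append(tp[c] / (tp[c] + fn[c]))

    n = len(classes)
    return sum(f1_scores) / n, sum(recalls) / n


# Toy example with three emotion classes (made-up labels).
y_true = ["negative", "negative", "positive", "surprise"]
y_pred = ["negative", "positive", "positive", "surprise"]
uf1, uar = uf1_uar(y_true, y_pred)
print(f"UF1={uf1:.4f} UAR={uar:.4f}")
```

Because both scores weight every class equally, a rare emotion class affects the result as much as a frequent one, which is why they are commonly preferred over plain accuracy on imbalanced micro-expression datasets.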

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/htnet-for-micro-expression-recognition/micro-expression-recognition-on-casme3)](https://paperswithcode.com/sota/micro-expression-recognition-on-casme3?p=htnet-for-micro-expression-recognition)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/htnet-for-micro-expression-recognition/micro-expression-recognition-on-casme-ii-1)](https://paperswithcode.com/sota/micro-expression-recognition-on-casme-ii-1?p=htnet-for-micro-expression-recognition)`

HTNet for micro-expression recognition

27 Jul 2023 · Zhifeng Wang, Kaihao Zhang, Wenhan Luo, Ramesh Sankaranarayana

Facial expression is related to facial muscle contractions, and different muscle movements correspond to different emotional states. For micro-expression recognition, the muscle movements are usually subtle, which has a negative impact on the performance of current facial emotion recognition algorithms. Most existing methods use self-attention mechanisms to capture relationships between tokens in a sequence, but they do not take into account the inherent spatial relationships between facial landmarks. This can result in sub-optimal performance on micro-expression recognition tasks. Therefore, learning to recognize facial muscle movements is a key challenge in the area of micro-expression recognition. In this paper, we propose a Hierarchical Transformer Network (HTNet) to identify critical areas of facial muscle movement. HTNet includes two major components: a transformer layer that leverages the local temporal features and an aggregation layer that extracts local and global semantic facial features. Specifically, HTNet divides the face into four different facial areas: left lip area, left eye area, right eye area and right lip area. The transformer layer is used to focus on representing local minor muscle movement with local self-attention in each area. The aggregation layer is used to learn the interactions between eye areas and lip areas. The experiments on four publicly available micro-expression datasets show that the proposed approach outperforms previous methods by a large margin. The code and models are available at: https://github.com/wangzhifengharrison/HTNet
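
The two-stage idea in the abstract (local self-attention inside each of four facial areas, followed by aggregation across areas) can be illustrated with a small pure-Python sketch. This is not the authors' implementation: the quadrant-based split, the identity query/key/value projections, the mean pooling, and all function names (`split_into_areas`, `self_attention`, `mean_pool`, `hierarchical_features`) are assumptions made only to show the data flow. "Left" and "right" here refer to image coordinates.

```python
import math


def split_into_areas(grid):
    """Split an H x W grid of feature vectors into four quadrants.

    Assumption: eyes lie in the top half and lips in the bottom half.
    """
    h, w = len(grid), len(grid[0])
    mh, mw = h // 2, w // 2

    def crop(r0, r1, c0, c1):
        return [grid[r][c] for r in range(r0, r1) for c in range(c0, c1)]

    return {
        "left_eye": crop(0, mh, 0, mw),
        "right_eye": crop(0, mh, mw, w),
        "left_lip": crop(mh, h, 0, mw),
        "right_lip": crop(mh, h, mw, w),
    }


def self_attention(tokens):
    """Scaled dot-product self-attention with identity Q/K/V projections."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        m = max(scores)  # subtract the max for a numerically stable softmax
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        weights = [e / z for e in exps]
        out.append([sum(wt * v[i] for wt, v in zip(weights, tokens)) for i in range(d)])
    return out


def mean_pool(tokens):
    """Average a list of equal-length vectors into one vector."""
    d = len(tokens[0])
    return [sum(t[i] for t in tokens) / len(tokens) for i in range(d)]


def hierarchical_features(grid):
    """Local attention per area, then attention across the area summaries."""
    areas = split_into_areas(grid)
    # Local stage: tokens attend only to tokens in the same facial area.
    summaries = [mean_pool(self_attention(tokens)) for tokens in areas.values()]
    # Aggregation stage: the four area summaries attend to each other,
    # letting eye and lip areas interact.
    return mean_pool(self_attention(summaries))


# Toy 4 x 4 grid of 2-dimensional features (made-up values).
grid = [[[float(r), float(c)] for c in range(4)] for r in range(4)]
print(hierarchical_features(grid))
```

Restricting the first attention stage to one area keeps subtle local movements from being diluted by the rest of the face, and the second stage then models how the areas relate. A real model would add learned projections, multiple heads, and a classifier on top.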

Code

wangzhifengharrison/htnet official

Tasks

Emotion Recognition

Facial Emotion Recognition

Micro Expression Recognition

Micro-Expression Recognition

Datasets

SAMM Long Videos

CASME II

Results from the Paper

Ranked #1 on Micro-Expression Recognition on CASME II

Methods

Absolute Position Encodings • Adam • BPE • Dense Connections • Dropout • Focus • Label Smoothing • Layer Normalization • Linear Layer • Multi-Head Attention • Position-Wise Feed-Forward Layer • Residual Connection • Scaled Dot-Product Attention • Softmax • Transformer
