Neurophysiological time-series recordings of brain activity, such as the electroencephalogram (EEG) or local field potentials, can be decoded by machine learning models either to control an application, e.g., for communication or for rehabilitation after stroke, or to passively monitor the ongoing brain state of the subject, e.g., in a demanding work environment. A typical decoding challenge faced by a brain-computer interface (BCI) is the small dataset size compared to other domains of machine learning such as computer vision or natural language processing. Classification or regression problems in BCI can be tackled either by training a regular model on the available small training dataset or through transfer learning, which utilizes data from other sessions, subjects, or even datasets to train a model. Transfer learning is non-trivial because of the non-stationarity of EEG signals between subjects, but also within subjects. This variability calls for explicit calibration phases at the start of every session before BCI applications can be used online. In this study, we present arguments to BCI researchers to encourage the use of embeddings for EEG decoding. In particular, we introduce a simple domain adaptation technique involving both deep learning (for learning the embeddings from the source data) and classical machine learning (for fast calibration on the target data). This technique allows us to learn embeddings across subjects, which deliver a generalized data representation. These embeddings can then be fed into subject-specific classifiers, minimizing the amount of calibration data they require. We conducted offline experiments on the 14 subjects of the High Gamma EEG-BCI Dataset [1]. Embedding functions were obtained by training EEGNet [2] under a leave-one-subject-out (LOSO) protocol, and the embedding vectors were classified with logistic regression. Our pipeline was compared to two baselines: EEGNet without subject-specific calibration and the standard FBCSP pipeline trained within-subject. We observed that the representations learned by the embedding functions were indeed non-stationary across subjects, justifying the need for an additional subject-specific calibration, and that this calibration indeed improved the score. Finally, our data suggest that building upon embeddings requires less individual calibration data than the FBCSP baseline to reach satisfactory scores.
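
The following is a minimal sketch of the calibration step described above, not the authors' code: a frozen EEGNet-like network (here a hypothetical stand-in class, SourceEEGNet, assumed to be pretrained on the source subjects under the LOSO protocol) supplies the embedding function, and a scikit-learn logistic regression is fitted on a small set of target-subject trials. All tensor shapes, layer sizes, and the synthetic data are illustrative assumptions.

```python
# Sketch only: stand-in for the embedding + calibration pipeline, not the authors' implementation.
import numpy as np
import torch
import torch.nn as nn
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


class SourceEEGNet(nn.Module):
    """Hypothetical EEGNet-like network assumed to be pretrained on the 13 source subjects."""

    def __init__(self, n_channels=128, n_classes=4, emb_dim=64):
        super().__init__()
        self.features = nn.Sequential(            # the embedding function f(x)
            nn.Conv2d(1, 8, (1, 64), padding=(0, 32)),
            nn.BatchNorm2d(8),
            nn.Conv2d(8, 16, (n_channels, 1), groups=8),
            nn.BatchNorm2d(16),
            nn.ELU(),
            nn.AdaptiveAvgPool2d((1, 4)),
            nn.Flatten(),
            nn.Linear(16 * 4, emb_dim),
        )
        self.classifier = nn.Linear(emb_dim, n_classes)  # source classifier, discarded after pretraining

    def embed(self, x):
        # x: (n_trials, 1, n_channels, n_samples) -> (n_trials, emb_dim)
        with torch.no_grad():
            return self.features(x)


def calibrate_on_target(pretrained, X_calib, y_calib):
    """Fit a subject-specific logistic regression on top of frozen embeddings."""
    pretrained.eval()
    Z = pretrained.embed(torch.as_tensor(X_calib, dtype=torch.float32)).numpy()
    clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    clf.fit(Z, y_calib)
    return clf


if __name__ == "__main__":
    # Synthetic stand-in data: 40 calibration trials, 128 channels, 500 samples per trial.
    rng = np.random.default_rng(0)
    X_calib = rng.standard_normal((40, 1, 128, 500)).astype(np.float32)
    y_calib = rng.integers(0, 4, size=40)

    model = SourceEEGNet()                 # in practice, load the LOSO-pretrained weights here
    clf = calibrate_on_target(model, X_calib, y_calib)
    Z = model.embed(torch.as_tensor(X_calib)).numpy()
    print(clf.predict(Z[:5]))              # subject-specific predictions from the calibrated read-out
```

Keeping the embedding network frozen and fitting only the linear read-out is what makes the target-subject calibration fast and data-efficient relative to retraining a full within-subject pipeline such as FBCSP.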
