Scalable semi-supervised dimensionality reduction with GPU-accelerated EmbedSOM

3 Jan 2022 · Adam Šmelko, Soňa Molnárová, Miroslav Kratochvíl, Abhishek Koladiya, Jan Musil, Martin Kruliš, Jiří Vondrášek ·

Dimensionality reduction methods have found vast application as visualization tools in diverse areas of science. Although many different methods exist, their performance is often insufficient for providing quick insight into many contemporary datasets, and the unsupervised mode of use prevents the users from utilizing the methods for dataset exploration and fine-tuning the details for improved visualization quality. We present BlosSOM, a high-performance semi-supervised dimensionality reduction software for interactive user-steerable visualization of high-dimensional datasets with millions of individual data points. BlosSOM builds on a GPU-accelerated implementation of the EmbedSOM algorithm, complemented by several landmark-based algorithms for interfacing the unsupervised model learning algorithms with the user supervision. We show the application of BlosSOM on realistic datasets, where it helps to produce high-quality visualizations that incorporate user-specified layout and focus on certain features. We believe the semi-supervised dimensionality reduction will improve the data visualization possibilities for science areas such as single-cell cytometry, and provide a fast and efficient base methodology for new directions in dataset exploration and annotation.

PDF Abstract