Using Multiple Instance Learning for Explainable Solar Flare Prediction

25 Mar 2022 · Cédric Huwyler, Martin Melchior ·

In this work we leverage a weakly-labeled dataset of spectral data from NASAs IRIS satellite for the prediction of solar flares using the Multiple Instance Learning (MIL) paradigm. While standard supervised learning models expect a label for every instance, MIL relaxes this and only considers bags of instances to be labeled. This is ideally suited for flare prediction with IRIS data that consists of time series of bags of UV spectra measured along the instrument slit. In particular, we consider the readout window around the Mg II h&k lines that encodes information on the dynamics of the solar chromosphere. Our MIL models are not only able to predict whether flares occur within the next $\sim$25 minutes with accuracies of around 90%, but are also able to explain which spectral profiles were particularly important for their bag-level prediction. This information can be used to highlight regions of interest in ongoing IRIS observations in real-time and to identify candidates for typical flare precursor spectral profiles. We use k-means clustering to extract groups of spectral profiles that appear relevant for flare prediction. The recovered groups show high intensity, triplet red wing emission and single-peaked h and k lines, as found by previous works. They seem to be related to small-scale explosive events that have been reported to occur tens of minutes before a flare.

PDF Abstract