Based on these findings, we propose a principle method to improve the robustness of Transformer models by automatically searching for an optimal fusion strategy regarding input data.
A common assumption in multimodal learning is the completeness of training data, i. e., full modalities are available in all training examples.
By employing hierarchical feature selection, we can compress the scale and dimension of global dictionary, which directly contributes to the decrease of computational cost in sparse representation that our approach is strongly rooted in.
It becomes a potential framework to solve robustness issue of ELM for high-dimensional blended data in the future.
Extreme learning machine (ELM) as a neural network algorithm has shown its good performance, such as fast speed, simple structure etc, but also, weak robustness is an unavoidable defect in original ELM for blended data.