Semi-supervised Learning with Missing Values Imputation

3 Jun 2021  ·  Buliao Huang, Yunhui Zhu, Muhammad Usman, Huanhuan Chen

Incomplete instances with various missing attributes arise in many real-world applications and pose challenges for classification tasks. Missing-value imputation methods are often employed to replace the missing values with substitute values. However, imputation is usually performed separately from classification, which may lead to inferior performance because label information is often ignored during imputation. Moreover, traditional methods may rely on improper assumptions to initialize the missing values, and the unreliability of such initialization can further degrade performance. To address these problems, this paper proposes a novel semi-supervised conditional normalizing flow (SSCFlow). SSCFlow explicitly uses label information to support imputation and classification simultaneously by estimating the conditional distribution of incomplete instances with a novel semi-supervised normalizing flow. In addition, SSCFlow treats the initialized missing values as a corrupted initial imputation and iteratively reconstructs their latent representations with an overcomplete denoising autoencoder to approximate their true conditional distribution. Experiments on real-world datasets demonstrate the robustness and effectiveness of the proposed algorithm.
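
The idea of treating an initial imputation as corrupted and refining it iteratively can be illustrated with a small sketch. This is not the SSCFlow algorithm: it uses no labels and no normalizing flow, and a truncated SVD reconstruction stands in for the overcomplete denoising autoencoder. The function name `iterative_impute` and the parameters `rank` and `n_iters` are hypothetical, chosen only for this illustration.

```python
import numpy as np


def iterative_impute(X, mask, rank=1, n_iters=50):
    """Refine a mean-initialized imputation by repeated reconstruction.

    X    : (n, d) array; entries where mask is False are ignored (may be NaN).
    mask : (n, d) boolean array, True where the value is observed.
    NOTE: truncated SVD is an assumed stand-in for the denoising
    autoencoder described in the abstract, not the authors' method.
    """
    # Zero out unobserved entries so NaNs cannot propagate.
    X = np.where(mask, X, 0.0)

    # Initial (possibly unreliable) imputation: per-column observed mean.
    counts = np.maximum(mask.sum(axis=0), 1)
    col_means = (X * mask).sum(axis=0) / counts
    X_hat = np.where(mask, X, col_means)

    for _ in range(n_iters):
        # Reconstruct the current estimate from a low-rank representation.
        mu = X_hat.mean(axis=0)
        U, s, Vt = np.linalg.svd(X_hat - mu, full_matrices=False)
        recon = (U[:, :rank] * s[:rank]) @ Vt[:rank] + mu

        # Keep observed values fixed; overwrite only the missing entries.
        X_hat = np.where(mask, X, recon)

    return X_hat
```

Observed entries are never modified; only the positions marked missing are updated in each pass, which mirrors the abstract's description of refining the initialized values rather than trusting them.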
