ExcelFormer: A Neural Network Surpassing GBDTs on Tabular Data

7 Jan 2023 · Jintai Chen, Jiahuan Yan, Danny Ziyi Chen, Jian Wu

Though deep neural networks have achieved enormous success in various fields (e.g., computer vision) with supervised learning, they still trail the performance of GBDTs on tabular data. Delving into this task, we find that judicious handling of feature interactions and feature representation is crucial to the effectiveness of neural networks on tabular data. We develop a novel neural network called ExcelFormer, which alternates between two attention modules that manipulate feature interactions and feature representation updates, respectively. A bespoke training methodology is introduced jointly to improve model performance. Specifically, by initializing parameters with minuscule values, these attention modules are attenuated when training begins, and the effects of feature interactions and representation updates grow progressively to suitable levels as training proceeds, guided by our proposed regularization schemes, Feat-Mix and Hidden-Mix. Experiments on 28 public tabular datasets show that ExcelFormer is superior to extensively tuned GBDTs, an unprecedented advance for deep neural networks on supervised tabular learning.
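The abstract names two concrete mechanisms: attention modules attenuated at initialization via minuscule parameter values, and Mixup-style regularizers (Feat-Mix, Hidden-Mix) operating on features and hidden representations. Below is a minimal PyTorch sketch of one plausible reading of both ideas; the names `AttenuatedAttention` and `feat_mix`, along with all hyperparameter values, are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn


class AttenuatedAttention(nn.Module):
    """Self-attention over per-feature tokens whose output projection is
    initialized with minuscule weights, so the residual branch starts out
    nearly as an identity map and the module's influence grows in training."""

    def __init__(self, d_model: int, n_heads: int = 4, init_scale: float = 1e-4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.out = nn.Linear(d_model, d_model)
        # Minuscule initialization: at step 0 the attention branch contributes
        # almost nothing; optimization scales its effect up progressively.
        nn.init.uniform_(self.out.weight, -init_scale, init_scale)
        nn.init.zeros_(self.out.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n_features, d_model) -- one token per tabular feature.
        h, _ = self.attn(x, x, x, need_weights=False)
        return x + self.out(h)  # residual keeps the early network ~identity


def feat_mix(x: torch.Tensor, y: torch.Tensor, alpha: float = 0.5):
    """Swap a random subset of *features* between paired samples and weight
    the two labels by the fraction of features each sample retains
    (a hedged, Mixup-style reading of Feat-Mix).
    x: (batch, n_features) raw features; y: (batch,) integer labels."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0), device=x.device)  # pairing permutation
    n_feat = x.size(1)
    n_keep = max(1, round(lam * n_feat))  # how many features come from x
    keep = torch.zeros(n_feat, dtype=torch.bool, device=x.device)
    keep[torch.randperm(n_feat, device=x.device)[:n_keep]] = True
    x_mixed = torch.where(keep, x, x[perm])  # mask broadcasts over the batch
    lam_eff = n_keep / n_feat                # realized mixing fraction
    return x_mixed, y, y[perm], lam_eff
```

A training step would use the mixed batch with a convexly weighted loss, e.g. `lam_eff * criterion(logits, y_a) + (1 - lam_eff) * criterion(logits, y_b)`, as in standard Mixup; Hidden-Mix would apply the same kind of mixing along the hidden dimension of intermediate representations rather than across raw features.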
