22 Sep 2020 • Weijian Luo, Yongxian Long
A vital step in solving a classification or regression problem is to apply feature engineering and variable selection to the data before it is fed into a model. One of the most popular feature engineering methods is to discretize a continuous variable at a set of cutting points, which is referred to as binning. Good cutting points are important for improving a model's performance, because a good binning can discard noisy variation within a continuous variable's range while keeping useful level information in a well-ordered encoding. However, to the best of our knowledge, most cutting point selection is done using researchers' domain knowledge or naive methods such as equal-width or equal-frequency cutting. In this paper we propose an end-to-end supervised cutting point selection method based on group and fused lasso, which also has an automatic variable selection effect. We name our method \textbf{ABM} (automatic binning machine).
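For concreteness, the two naive baselines named above can be sketched as follows. This is a minimal illustration of equal-width and equal-frequency cutting only, not of ABM; the function names (`equal_width_cuts`, `equal_frequency_cuts`, `encode`) and the toy data are assumptions made for the example and do not come from the paper.

```python
import numpy as np

def equal_width_cuts(x, n_bins):
    """Interior cut points splitting [min(x), max(x)] into n_bins equal-width intervals."""
    x = np.asarray(x, dtype=float)
    edges = np.linspace(x.min(), x.max(), n_bins + 1)
    return edges[1:-1]  # drop the two outer edges, keep the interior cuts

def equal_frequency_cuts(x, n_bins):
    """Interior cut points placed at evenly spaced quantiles of x."""
    x = np.asarray(x, dtype=float)
    probs = np.linspace(0.0, 1.0, n_bins + 1)[1:-1]
    return np.quantile(x, probs)

def encode(x, cuts):
    """Ordered integer bin index (0 .. len(cuts)) for each value in x."""
    return np.digitize(x, cuts)

# Hypothetical toy variable with one large outlier.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 100.0])

ew = equal_width_cuts(x, 4)
ef = equal_frequency_cuts(x, 4)
ew_codes = encode(x, ew)
ef_codes = encode(x, ef)

print("equal-width cuts:    ", ew, "codes:", ew_codes)
print("equal-frequency cuts:", ef, "codes:", ef_codes)
```

On this toy input the equal-width cuts are stretched by the outlier, so almost every value falls into the first bin, while the equal-frequency cuts spread the values evenly. Neither rule looks at the target variable, which is the gap a supervised cutting point selection method aims to address.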