Sampling is a critical operation in Graph Neural Network (GNN) training that helps reduce the cost. Previous literature has explored improving sampling algorithms via mathematical and statistical methods. However, there is a gap between sampling algorithms and hardware. Without consideration of hardware, algorithm designers merely optimize sampling at the algorithm level, missing the great potential of promoting the efficiency of existing sampling algorithms by leveraging hardware features. In this paper, we pioneer to propose a unified programming model for mainstream sampling algorithms, termed GNNSampler, covering the critical processes of sampling algorithms in various categories. Second, to leverage the hardware feature, we choose the data locality as a case study, and explore the data locality among nodes and their neighbors in a graph to alleviate irregular memory access in sampling. Third, we implement locality-aware optimizations in GNNSampler for various sampling algorithms to optimize the general sampling process. Finally, we emphatically conduct experiments on large graph datasets to analyze the relevance among training time, accuracy, and hardware-level metrics. Extensive experiments show that our method is universal to mainstream sampling algorithms and helps significantly reduce the training time, especially in large-scale graphs.