no code implementations • 10 Nov 2021 • Seung Won Min, Kun Wu, Mert Hidayetoğlu, JinJun Xiong, Xiang Song, Wen-mei Hwu
With our data tiering method, we additionally provide a new data placement and access strategy to further minimize the CPU-GPU communication overhead.
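As a rough illustration of what such a placement can look like, the CUDA sketch below keeps a "hot" slice of the node-feature table in GPU memory and leaves the "cold" remainder in pinned, mapped host memory that GPU threads read through zero-copy accesses, with a per-node tier map steering each lookup. The kernel, buffer names, hot fraction, and the id-order "hotness" heuristic are assumptions for illustration, not the paper's actual implementation.

```cuda
// Sketch of a two-tier feature store for mini-batch GNN training:
// "hot" rows live in GPU memory, "cold" rows stay in pinned, mapped host
// memory that GPU threads read directly (zero-copy). Names are illustrative.
#include <cuda_runtime.h>
#include <cstdint>
#include <cstdio>
#include <vector>

__global__ void gather_tiered(const float *hot, const float *cold,
                              const uint8_t *tier, const int64_t *loc,
                              const int64_t *ids, float *out,
                              int batch, int dim) {
    for (int i = blockIdx.x; i < batch; i += gridDim.x) {
        int64_t v = ids[i];
        // Choose the source buffer per node; cold reads cross the interconnect.
        const float *src = (tier[v] == 0 ? hot : cold) + loc[v] * dim;
        for (int j = threadIdx.x; j < dim; j += blockDim.x)
            out[i * dim + j] = src[j];
    }
}

int main() {
    const int64_t num_nodes = 1 << 18;        // illustrative sizes
    const int     dim       = 128;
    const int64_t hot_n     = num_nodes / 8;  // fraction that fits in GPU memory
    const int     batch     = 1024;

    cudaSetDeviceFlags(cudaDeviceMapHost);    // allow mapped (zero-copy) host memory

    // Cold tier: the full feature table in pinned, mapped host memory.
    float *h_feat, *d_cold;
    cudaHostAlloc((void **)&h_feat, num_nodes * dim * sizeof(float), cudaHostAllocMapped);
    cudaHostGetDevicePointer((void **)&d_cold, h_feat, 0);

    // Hot tier: copy the most frequently accessed rows into GPU memory.
    // Here "hotness" is just node-id order; a real policy would rank nodes
    // by sampling frequency, degree, or profiled access counts.
    float *d_hot;
    cudaMalloc((void **)&d_hot, hot_n * dim * sizeof(float));
    cudaMemcpy(d_hot, h_feat, hot_n * dim * sizeof(float), cudaMemcpyHostToDevice);

    // tier[v] / loc[v] tell the kernel where row v lives and at which offset.
    // Hot rows were copied in id order, so their offset equals their id; cold
    // rows keep their original index in the full host table.
    std::vector<uint8_t> tier(num_nodes);
    std::vector<int64_t> loc(num_nodes), ids(batch);
    for (int64_t v = 0; v < num_nodes; ++v) { tier[v] = v < hot_n ? 0 : 1; loc[v] = v; }
    for (int i = 0; i < batch; ++i) ids[i] = (i * 7919) % num_nodes;  // dummy mini-batch

    uint8_t *d_tier; int64_t *d_loc, *d_ids; float *d_out;
    cudaMalloc((void **)&d_tier, num_nodes * sizeof(uint8_t));
    cudaMalloc((void **)&d_loc,  num_nodes * sizeof(int64_t));
    cudaMalloc((void **)&d_ids,  batch * sizeof(int64_t));
    cudaMalloc((void **)&d_out,  batch * dim * sizeof(float));
    cudaMemcpy(d_tier, tier.data(), num_nodes * sizeof(uint8_t), cudaMemcpyHostToDevice);
    cudaMemcpy(d_loc,  loc.data(),  num_nodes * sizeof(int64_t), cudaMemcpyHostToDevice);
    cudaMemcpy(d_ids,  ids.data(),  batch * sizeof(int64_t),     cudaMemcpyHostToDevice);

    gather_tiered<<<batch, 128>>>(d_hot, d_cold, d_tier, d_loc, d_ids, d_out, batch, dim);
    cudaDeviceSynchronize();
    printf("tiered gather: %s\n", cudaGetErrorString(cudaGetLastError()));

    cudaFree(d_hot); cudaFree(d_tier); cudaFree(d_loc); cudaFree(d_ids);
    cudaFree(d_out); cudaFreeHost(h_feat);
    return 0;
}
```

In a real system the hot fraction would be sized to the available GPU memory and the tier map built from measured access statistics rather than node ids.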
1 code implementation • 4 Mar 2021 • Seung Won Min, Kun Wu, Sitao Huang, Mert Hidayetoğlu, JinJun Xiong, Eiman Ebrahimi, Deming Chen, Wen-mei Hwu
In this work, we propose a novel GPU-oriented data communication approach for GCN training, where GPU threads directly access sparse features in host memory through zero-copy accesses, with minimal CPU involvement.
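A minimal CUDA sketch of this access pattern is shown below: the feature table is allocated in pinned, mapped host memory and the gather kernel dereferences it directly, so no bulk cudaMemcpy of the feature table is issued and only the rows a mini-batch touches travel over PCIe/NVLink. Kernel and variable names, sizes, and the block-per-row layout are illustrative assumptions rather than the paper's code.

```cuda
// Sketch of zero-copy feature gathering: sparse feature rows stay in pinned
// host memory and GPU threads dereference them directly. Names are illustrative.
#include <cuda_runtime.h>
#include <cstdint>
#include <cstdio>

// Each block copies one feature row; consecutive threads read consecutive
// floats so the zero-copy traffic stays coalesced, which matters far more
// for host-memory reads than for regular device-memory reads.
__global__ void zero_copy_gather(const float *host_feat, const int64_t *ids,
                                 float *out, int batch, int dim) {
    for (int i = blockIdx.x; i < batch; i += gridDim.x) {
        const float *src = host_feat + ids[i] * dim;
        for (int j = threadIdx.x; j < dim; j += blockDim.x)
            out[i * dim + j] = src[j];
    }
}

int main() {
    const int64_t num_nodes = 1 << 18;   // illustrative sizes
    const int dim = 128, batch = 1024;

    cudaSetDeviceFlags(cudaDeviceMapHost);   // allow mapped (zero-copy) host memory

    // Feature table in pinned host memory, mapped into the GPU address space.
    float *h_feat, *d_feat;
    cudaHostAlloc((void **)&h_feat, num_nodes * dim * sizeof(float), cudaHostAllocMapped);
    cudaHostGetDevicePointer((void **)&d_feat, h_feat, 0);

    int64_t *d_ids; float *d_out;
    cudaMalloc((void **)&d_ids, batch * sizeof(int64_t));
    cudaMalloc((void **)&d_out, batch * dim * sizeof(float));
    cudaMemset(d_ids, 0, batch * sizeof(int64_t));   // stand-in for sampled node ids

    // No bulk copy of the feature table: the kernel pulls rows on demand.
    zero_copy_gather<<<batch, 128>>>(d_feat, d_ids, d_out, batch, dim);
    cudaDeviceSynchronize();
    printf("zero-copy gather: %s\n", cudaGetErrorString(cudaGetLastError()));

    cudaFree(d_ids); cudaFree(d_out); cudaFreeHost(h_feat);
    return 0;
}
```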
1 code implementation • 20 Jan 2021 • Seung Won Min, Kun Wu, Sitao Huang, Mert Hidayetoğlu, JinJun Xiong, Eiman Ebrahimi, Deming Chen, Wen-mei Hwu
While this process accounts for a significant portion of the training time, we find that existing GNN implementations using popular deep neural network (DNN) libraries such as PyTorch are limited to a CPU-centric approach for the entire data preparation step.
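For contrast, the sketch below reduces that CPU-centric path to its essential data movement: the CPU gathers the sampled feature rows into a staging buffer, and the whole block is then copied to the GPU, which is roughly what a framework-default pipeline does before the zero-copy approach above removes the CPU-side gather. All names and sizes are illustrative assumptions.

```cuda
// Sketch of the CPU-centric baseline described above: the CPU gathers the
// mini-batch's feature rows into a staging buffer, then the whole block is
// copied to the GPU with cudaMemcpy. Names and sizes are illustrative.
#include <cuda_runtime.h>
#include <cstdint>
#include <cstring>
#include <cstdio>
#include <vector>

int main() {
    const int64_t num_nodes = 1 << 18;
    const int dim = 128, batch = 1024;

    // Full feature table in ordinary (pageable) host memory.
    std::vector<float> feat(num_nodes * dim);
    std::vector<int64_t> ids(batch);
    for (int i = 0; i < batch; ++i) ids[i] = (i * 7919) % num_nodes;  // dummy mini-batch

    // Step 1 (CPU): gather the sampled rows into a pinned staging buffer.
    float *h_stage;
    cudaHostAlloc((void **)&h_stage, batch * dim * sizeof(float), cudaHostAllocDefault);
    for (int i = 0; i < batch; ++i)
        std::memcpy(h_stage + (size_t)i * dim, feat.data() + ids[i] * dim,
                    dim * sizeof(float));

    // Step 2 (DMA): ship the contiguous block to the GPU.
    float *d_batch;
    cudaMalloc((void **)&d_batch, batch * dim * sizeof(float));
    cudaMemcpy(d_batch, h_stage, batch * dim * sizeof(float), cudaMemcpyHostToDevice);
    cudaDeviceSynchronize();
    printf("CPU-centric gather+copy: %s\n", cudaGetErrorString(cudaGetLastError()));

    cudaFree(d_batch); cudaFreeHost(h_stage);
    return 0;
}
```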