no code implementations • 19 Aug 2024 • Kun Wu, Jeongmin Brian Park, Xiaofan Zhang, Mert Hidayetoğlu, Vikram Sharma Mailthody, Sitao Huang, Steven Sam Lumetta, Wen-mei Hwu
Results demonstrate that TBA reduces activation peak memory usage by 47%.
no code implementations • 16 Jan 2023 • Kun Wu, Mert Hidayetoğlu, Xiang Song, Sitao Huang, Da Zheng, Israt Nisa, Wen-mei Hwu
Relational graph neural networks (RGNNs) are graph neural networks with dedicated structures for modeling the different types of nodes and edges in heterogeneous graphs.
no code implementations • 10 Nov 2021 • Seung Won Min, Kun Wu, Mert Hidayetoğlu, JinJun Xiong, Xiang Song, Wen-mei Hwu
With our data tiering method, we additionally provide a new data placement and access strategy to further minimize the CPU-GPU communication overhead.
1 code implementation • 4 Mar 2021 • Seung Won Min, Kun Wu, Sitao Huang, Mert Hidayetoğlu, JinJun Xiong, Eiman Ebrahimi, Deming Chen, Wen-mei Hwu
In this work, we propose a novel GPU-oriented data communication approach for GCN training, where GPU threads directly access sparse features in host memory through zero-copy accesses with minimal CPU involvement.
1 code implementation • 20 Jan 2021 • Seung Won Min, Kun Wu, Sitao Huang, Mert Hidayetoğlu, JinJun Xiong, Eiman Ebrahimi, Deming Chen, Wen-mei Hwu
While this process accounts for a significant portion of the training time, we find that existing GNN implementations using popular deep neural network (DNN) libraries such as PyTorch are limited to a CPU-centric approach for the entire data preparation step.