no code implementations • 17 Apr 2019 • Jared Roesch, Steven Lyubomirsky, Marisa Kirisame, Logan Weber, Josh Pollock, Luis Vega, Ziheng Jiang, Tianqi Chen, Thierry Moreau, Zachary Tatlock
Using these extension mechanisms, Relay supports a unified compiler that can target a variety of hardware platforms.
no code implementations • 25 Oct 2018 • Meghan Cowan, Thierry Moreau, Tianqi Chen, Luis Ceze
To date, none of the popular deep learning directly support low precision operators, partly due to a lack of optimized low precision libraries.
no code implementations • 11 Jul 2018 • Thierry Moreau, Tianqi Chen, Luis Vega, Jared Roesch, Eddie Yan, Lianmin Zheng, Josh Fromm, Ziheng Jiang, Luis Ceze, Carlos Guestrin, Arvind Krishnamurthy
Specialized Deep Learning (DL) acceleration stacks, designed for a specific set of frameworks, model architectures, operators, and data types, offer the allure of high performance while sacrificing flexibility.
no code implementations • NeurIPS 2018 • Tianqi Chen, Lianmin Zheng, Eddie Yan, Ziheng Jiang, Thierry Moreau, Luis Ceze, Carlos Guestrin, Arvind Krishnamurthy
Efficient implementations of tensor operators, such as matrix multiplication and high dimensional convolution, are key enablers of effective deep learning systems.
1 code implementation • 12 Feb 2018 • Tianqi Chen, Thierry Moreau, Ziheng Jiang, Lianmin Zheng, Eddie Yan, Meghan Cowan, Haichen Shen, Leyuan Wang, Yuwei Hu, Luis Ceze, Carlos Guestrin, Arvind Krishnamurthy
Experimental results show that TVM delivers performance across hardware back-ends that are competitive with state-of-the-art, hand-tuned libraries for low-power CPU, mobile GPU, and server-class GPUs.
7 code implementations • 19 Jan 2018 • Thierry Moreau, Anton Lokhmotov, Grigori Fursin
Co-designing efficient machine learning based systems across the whole hardware/software stack to trade off speed, accuracy, energy and costs is becoming extremely complex and time consuming.
no code implementations • 14 Jun 2017 • Sung Kim, Patrick Howe, Thierry Moreau, Armin Alaghi, Luis Ceze, Visvesh Sathe
As a result of the increasing demand for deep neural network (DNN)-based services, efforts to develop dedicated hardware accelerators for DNNs are growing rapidly.