Search Results for author: Chris Cummins

Found 14 papers, 7 papers with code

LoopTune: Optimizing Tensor Computations with Reinforcement Learning

no code implementations · 4 Sep 2023 · Dejan Grubisic, Bram Wasti, Chris Cummins, John Mellor-Crummey, Aleksandar Zlateski

Advanced compiler technology is crucial for enabling machine learning applications to run on novel hardware, but traditional compilers fail to deliver performance, popular auto-tuners have long search times, and expert-optimized libraries introduce unsustainable costs.


SLaDe: A Portable Small Language Model Decompiler for Optimized Assembly

no code implementations · 21 May 2023 · Jordi Armengol-Estapé, Jackson Woodruff, Chris Cummins, Michael F. P. O'Boyle

SLaDe is up to 6 times more accurate than Ghidra, a state-of-the-art, industrial-strength decompiler, and up to 4 times more accurate than the large language model ChatGPT, and it generates significantly more readable code than both.

Language Modelling, Large Language Model

BenchDirect: A Directed Language Model for Compiler Benchmarks

no code implementations · 2 Mar 2023 · Foivos Tsimpourlas, Pavlos Petoumenos, Min Xu, Chris Cummins, Kim Hazelwood, Ajitha Rajan, Hugh Leather

We improve this with BenchDirect, which uses a directed LM that infills programs by jointly observing the source code context and the targeted compiler features.

Active Learning, Language Modelling

BenchPress: A Deep Active Benchmark Generator

1 code implementation · 13 Aug 2022 · Foivos Tsimpourlas, Pavlos Petoumenos, Min Xu, Chris Cummins, Kim Hazelwood, Ajitha Rajan, Hugh Leather

We develop BenchPress, the first ML benchmark generator for compilers that is steerable within feature space representations of source code.

Active Learning

Profile Guided Optimization without Profiles: A Machine Learning Approach

1 code implementation · 24 Dec 2021 · Nadav Rotem, Chris Cummins

We perform offline training using information collected from a large corpus of binaries that carry branch probability information.

BIG-bench Machine Learning

CompilerGym: Robust, Performant Compiler Optimization Environments for AI Research

1 code implementation · 17 Sep 2021 · Chris Cummins, Bram Wasti, Jiadong Guo, Brandon Cui, Jason Ansel, Sahir Gomez, Somya Jain, Jia Liu, Olivier Teytaud, Benoit Steiner, Yuandong Tian, Hugh Leather

What is needed is an easy, reusable experimental infrastructure for real-world compiler optimization tasks that can serve as a common benchmark for comparing techniques and as a platform to accelerate progress in the field.

Compiler Optimization, OpenAI Gym

Learning Space Partitions for Path Planning

2 code implementations · NeurIPS 2021 · Kevin Yang, Tianjun Zhang, Chris Cummins, Brandon Cui, Benoit Steiner, Linnan Wang, Joseph E. Gonzalez, Dan Klein, Yuandong Tian

Path planning, the problem of efficiently discovering high-reward trajectories, often requires optimizing a high-dimensional and multimodal reward function.

Value Function Based Performance Optimization of Deep Learning Workloads

no code implementations · 30 Nov 2020 · Benoit Steiner, Chris Cummins, Horace He, Hugh Leather

As machine learning techniques become ubiquitous, the efficiency of neural network implementations is becoming correspondingly paramount.


Deep Data Flow Analysis

no code implementations · 21 Nov 2020 · Chris Cummins, Hugh Leather, Zacharias Fisches, Tal Ben-Nun, Torsten Hoefler, Michael O'Boyle

Compiler architects increasingly look to machine learning when building heuristics for compiler optimization.

BIG-bench Machine Learning, Compiler Optimization

ProGraML: Graph-based Deep Learning for Program Optimization and Analysis

2 code implementations · 23 Mar 2020 · Chris Cummins, Zacharias V. Fisches, Tal Ben-Nun, Torsten Hoefler, Hugh Leather

We introduce ProGraML (Program Graphs for Machine Learning), a novel graph-based program representation using a low-level, language-agnostic, and portable format, together with machine learning models capable of performing complex downstream tasks over these graphs.

BIG-bench Machine Learning

Autotuning OpenCL Workgroup Size for Stencil Patterns

1 code implementation · 8 Nov 2015 · Chris Cummins, Pavlos Petoumenos, Michel Steuwer, Hugh Leather

Selecting an appropriate workgroup size is critical for the performance of OpenCL kernels, and requires knowledge of the underlying hardware, the data being operated on, and the implementation of the kernel.

Distributed, Parallel, and Cluster Computing
