Search Results for author: Xiangxi Mo

Found 5 papers, 2 papers with code

Optimizing LLM Queries in Relational Workloads

no code implementations • 9 Mar 2024 • Shu Liu, Asim Biswal, Audrey Cheng, Xiangxi Mo, Shiyi Cao, Joseph E. Gonzalez, Ion Stoica, Matei Zaharia

In this paper, we explore how to optimize LLM inference for analytical workloads that invoke LLMs within relational queries.

Pay Attention to Convolution Filters: Towards Fast and Accurate Fine-Grained Transfer Learning

no code implementations • 12 Jun 2019 • Xiangxi Mo, Ruizhe Cheng, Tianyi Fang

We propose an efficient transfer learning method for adapting an ImageNet pre-trained Convolutional Neural Network (CNN) to fine-grained image classification tasks.

Fine-Grained Image Classification • General Classification • +1

The OoO VLIW JIT Compiler for GPU Inference

no code implementations • 28 Jan 2019 • Paras Jain, Xiangxi Mo, Ajay Jain, Alexey Tumanov, Joseph E. Gonzalez, Ion Stoica

Current trends in Machine Learning (ML) inference on hardware-accelerated devices (e.g., GPUs, TPUs) point to alarmingly low utilization.

InferLine: ML Inference Pipeline Composition Framework

1 code implementation • 5 Dec 2018 • Daniel Crankshaw, Gur-Eyal Sela, Corey Zumar, Xiangxi Mo, Joseph E. Gonzalez, Ion Stoica, Alexey Tumanov

The dominant cost in production machine learning workloads is not training individual models but serving predictions from increasingly complex prediction pipelines spanning multiple models, machine learning frameworks, and parallel hardware accelerators.

Distributed, Parallel, and Cluster Computing
