Search Results for author: Peter Pietzuch

Found 7 papers, 3 papers with code

TimeRL: Efficient Deep Reinforcement Learning with Polyhedral Dependence Graphs

no code implementations • 9 Jan 2025 • Pedro F. Silvestre, Peter Pietzuch

We describe TimeRL, a system for executing dynamic DRL programs that combines the dynamism of eager execution with the whole-program optimizations and scheduling of graph-based execution.

Deep Reinforcement Learning • reinforcement-learning • +1
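To make the dependence-graph idea concrete, here is a small self-contained Python sketch that schedules a loop-structured program from affine dependence offsets on the timestep index. The names (DepGraph, add_op, schedule) and the toy actor-critic example are illustrative assumptions, not TimeRL's API or its polyhedral representation.

```python
# Hypothetical sketch: a tiny dependence-graph scheduler for a loop-structured
# program, where each edge carries an affine offset on the loop index t.
from collections import defaultdict

class DepGraph:
    def __init__(self, horizon):
        self.horizon = horizon          # number of timesteps T
        self.deps = defaultdict(list)   # op -> [(producer_op, offset)]

    def add_op(self, name, deps=()):
        # deps: iterable of (producer_name, offset), meaning
        # name[t] reads producer_name[t + offset]
        self.deps[name].extend(deps)
        return name

    def schedule(self):
        """Topologically order all (op, t) instances respecting the affine deps."""
        order, seen = [], set()

        def visit(op, t):
            if t < 0 or t >= self.horizon or (op, t) in seen:
                return
            seen.add((op, t))
            for producer, offset in self.deps[op]:
                visit(producer, t + offset)
            order.append((op, t))

        for t in range(self.horizon):
            for op in list(self.deps):
                visit(op, t)
        return order

# Toy example: reward[t] depends on action[t]; return[t] depends on reward[t]
# and on return[t+1] (a backward recurrence across timesteps).
g = DepGraph(horizon=3)
g.add_op("action")
g.add_op("reward", deps=[("action", 0)])
g.add_op("return", deps=[("reward", 0), ("return", +1)])
print(g.schedule())
```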

ExclaveFL: Providing Transparency to Federated Learning using Exclaves

no code implementations • 13 Dec 2024 • Jinnan Guo, Kapil Vaswani, Andrew Paverd, Peter Pietzuch

While current solutions have explored the use of trusted execution environments (TEEs) to combat such attacks, there is a mismatch with the security needs of FL: TEEs offer confidentiality guarantees, which are unnecessary for FL and make TEEs vulnerable to side-channel attacks, and they focus on coarse-grained attestation, which does not capture the execution of FL training.

Federated Learning
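As a toy illustration of recording the execution of FL training rather than attesting only a static binary, here is a hash-chained per-round transparency log in Python. The record format and helper names (digest, append_round, verify) are assumptions for illustration and do not reflect ExclaveFL's attestation mechanism or exclave interface.

```python
import hashlib, json

def digest(obj) -> str:
    # Stable SHA-256 digest of a JSON-serialisable object.
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode()).hexdigest()

def append_round(log, global_model, client_updates, new_model):
    # Append one training round, chaining it to the previous record's hash.
    prev = log[-1]["hash"] if log else "0" * 64
    record = {
        "round": len(log),
        "model_in": digest(global_model),
        "updates": [digest(u) for u in client_updates],
        "model_out": digest(new_model),
        "prev": prev,
    }
    record["hash"] = digest(record)
    log.append(record)
    return record

def verify(log):
    # Recompute every hash and check the chain links.
    prev = "0" * 64
    for rec in log:
        body = {k: v for k, v in rec.items() if k != "hash"}
        assert rec["prev"] == prev and rec["hash"] == digest(body)
        prev = rec["hash"]
    return True

# Usage: log two rounds of (toy) weight vectors and verify the chain.
log = []
append_round(log, [0.0, 0.0], [[0.1, 0.2], [0.3, 0.1]], [0.2, 0.15])
append_round(log, [0.2, 0.15], [[0.0, 0.1]], [0.2, 0.2])
print(verify(log))
```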

Tenplex: Dynamic Parallelism for Deep Learning using Parallelizable Tensor Collections

1 code implementation • 8 Dec 2023 • Marcel Wagenländer, Guo Li, Bo Zhao, Luo Mai, Peter Pietzuch

After a GPU change, Scalai uses the PTC to transform the job state: the PTC repartitions the dataset state under data parallelism and exposes it to DL workers through a virtual file system; and the PTC obtains the model state as partitioned checkpoints and transforms them to reflect the new parallelization configuration.

Deep Learning
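A minimal numpy sketch of the re-sharding step described above: model state written as per-worker checkpoint shards is re-partitioned for a different worker count after a GPU change. The function names (save_shards, reshard) are illustrative assumptions and do not reflect the PTC's actual interface or its handling of model parallelism.

```python
import numpy as np

def save_shards(params: np.ndarray, num_workers: int):
    """Partition a flat parameter vector into per-worker checkpoint shards."""
    return np.array_split(params, num_workers)

def reshard(shards, new_num_workers: int):
    """Reassemble the full state, then partition it for the new configuration."""
    full = np.concatenate(shards)
    return np.array_split(full, new_num_workers)

params = np.arange(10, dtype=np.float32)
old = save_shards(params, 4)        # checkpoint written by a 4-worker job
new = reshard(old, 3)               # job continues on 3 workers after a GPU change
assert np.allclose(np.concatenate(new), params)
print([s.shape[0] for s in new])    # shard sizes under the new parallelism
```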

Quiver: Supporting GPUs for Low-Latency, High-Throughput GNN Serving with Workload Awareness

1 code implementation • 18 May 2023 • Zeyuan Tan, Xiulong Yuan, Congjie He, Man-Kit Sit, Guo Li, Xiaoze Liu, Baole Ai, Kai Zeng, Peter Pietzuch, Luo Mai

Quiver's key idea is to exploit workload metrics to predict the irregular computation of GNN requests and to govern the use of GPUs for graph sampling and feature aggregation: (1) for graph sampling, Quiver calculates the probabilistic sampled graph size, a metric that predicts the degree of parallelism in graph sampling.

Graph Sampling
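A rough Python sketch of the idea of using a predicted sampled-subgraph size to decide whether a sampling request exposes enough parallelism to benefit from the GPU. The estimate, the threshold, and the function names are illustrative assumptions, not Quiver's actual metric or scheduling policy.

```python
def expected_sampled_size(seed_degrees, fanouts):
    """Rough expected number of sampled nodes across all hops."""
    frontier = float(len(seed_degrees))
    avg_degree = sum(seed_degrees) / max(len(seed_degrees), 1)
    total = frontier
    for fanout in fanouts:
        # Each frontier node contributes at most `fanout` neighbours,
        # and no more than its (average) degree.
        frontier *= min(fanout, avg_degree)
        total += frontier
    return total

def choose_device(seed_degrees, fanouts, gpu_threshold=10_000):
    # Route large, highly parallel sampling requests to the GPU.
    size = expected_sampled_size(seed_degrees, fanouts)
    return "gpu" if size >= gpu_threshold else "cpu"

# A request with 512 seed nodes of average degree ~15 and fanouts [15, 10]:
print(choose_device([15] * 512, [15, 10]))
```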

MSRL: Distributed Reinforcement Learning with Dataflow Fragments

no code implementations • 3 Oct 2022 • Huanzhou Zhu, Bo Zhao, Gang Chen, Weifeng Chen, Yijie Chen, Liang Shi, Yaodong Yang, Peter Pietzuch, Lei Chen

Yet, current distributed RL systems tie the definition of RL algorithms to their distributed execution: they hard-code particular distribution strategies and only accelerate specific parts of the computation (e.g. policy network updates) on GPU workers.

reinforcement-learning • Reinforcement Learning • +1
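To illustrate the decoupling the abstract argues for, here is a small Python sketch in which an RL loop only names its fragments while a separate placement table decides where they run. All names are hypothetical, the "workers" are strings, and no actual distributed execution happens; this is not MSRL's API.

```python
def act_fragment(policy, num_steps=8):
    # Collect a toy "trajectory" under a scalar policy parameter.
    return [policy * step for step in range(num_steps)]

def learn_fragment(policy, trajectory, lr=0.01):
    # Toy update: nudge the scalar policy towards the trajectory mean.
    target = sum(trajectory) / len(trajectory)
    return policy + lr * (target - policy)

FRAGMENTS = {"act": act_fragment, "learn": learn_fragment}

# Distribution strategy, kept outside the algorithm definition: changing it
# (e.g. running "act" on many CPU workers) does not touch the fragments.
PLACEMENT = {"act": "cpu-worker-0", "learn": "gpu-worker-0"}

def dispatch(name, *args, **kwargs):
    worker = PLACEMENT[name]            # in a real system: a remote call to `worker`
    print(f"running {name} on {worker}")
    return FRAGMENTS[name](*args, **kwargs)

def train(policy=1.0, iterations=2):
    for _ in range(iterations):
        trajectory = dispatch("act", policy)
        policy = dispatch("learn", policy, trajectory)
    return policy

print(train())
```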

CROSSBOW: Scaling Deep Learning with Small Batch Sizes on Multi-GPU Servers

1 code implementation • 8 Jan 2019 • Alexandros Koliousis, Pijika Watcharapichat, Matthias Weidlich, Luo Mai, Paolo Costa, Peter Pietzuch

Systems such as TensorFlow and Caffe2 train models with parallel synchronous stochastic gradient descent: they process a batch of training data at a time, partitioned across GPUs, and average the resulting partial gradients to obtain an updated global model.

Deep Learning
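The synchronous data-parallel SGD scheme described in the abstract (partition the batch across GPUs, average the partial gradients, update one global model) can be sketched in a few lines of numpy. This is a toy of the baseline the paper discusses, with simulated "GPUs" and a linear model, not of Crossbow's small-batch training approach.

```python
import numpy as np

def grad_linear_mse(w, x, y):
    """Gradient of mean squared error for a linear model y ≈ x @ w."""
    return 2.0 * x.T @ (x @ w - y) / len(y)

def sync_sgd_step(w, batch_x, batch_y, num_gpus, lr=0.1):
    # Partition the batch across workers, compute partial gradients,
    # and apply their average to the single global model.
    xs = np.array_split(batch_x, num_gpus)
    ys = np.array_split(batch_y, num_gpus)
    partial_grads = [grad_linear_mse(w, x, y) for x, y in zip(xs, ys)]
    return w - lr * np.mean(partial_grads, axis=0)

rng = np.random.default_rng(0)
x = rng.normal(size=(64, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = x @ true_w
w = np.zeros(3)
for _ in range(200):
    w = sync_sgd_step(w, x, y, num_gpus=4)
print(np.round(w, 2))   # approaches the true weights
```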

Teechan: Payment Channels Using Trusted Execution Environments

no code implementations • 22 Dec 2016 • Joshua Lind, Ittay Eyal, Peter Pietzuch, Emin Gün Sirer

We present Teechan, a full-duplex payment channel framework that exploits trusted execution environments.

Cryptography and Security
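As a toy model of the payment-channel abstraction itself (not of Teechan's TEE-based protocol or its secret handling), the sketch below keeps a two-party balance sheet that both sides update off-chain in either direction and settles only the final state. The class and field names are illustrative.

```python
from dataclasses import dataclass

@dataclass
class Channel:
    balance_a: int           # funds held by party A
    balance_b: int           # funds held by party B
    seq: int = 0             # number of off-chain payments so far

    def pay(self, sender: str, amount: int):
        # Move funds between the two off-chain balances.
        if sender == "A":
            if amount > self.balance_a:
                raise ValueError("insufficient funds for A")
            self.balance_a -= amount
            self.balance_b += amount
        else:
            if amount > self.balance_b:
                raise ValueError("insufficient funds for B")
            self.balance_b -= amount
            self.balance_a += amount
        self.seq += 1

    def settle(self):
        """Return the single closing state to publish on-chain."""
        return {"A": self.balance_a, "B": self.balance_b, "seq": self.seq}

ch = Channel(balance_a=100_000, balance_b=100_000)
ch.pay("A", 25_000)     # payments can flow in both directions ("full duplex")
ch.pay("B", 10_000)
print(ch.settle())      # only this final state hits the blockchain
```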
