Search Results for author: Altan Haan

Found 2 papers, 1 paper with code

SlimFit: Memory-Efficient Fine-Tuning of Transformer-based Models Using Training Dynamics

no code implementations • 29 May 2023 • Arash Ardakani, Altan Haan, Shangyin Tan, Doru Thom Popovici, Alvin Cheung, Costin Iancu, Koushik Sen

This allows SlimFit to freeze up to 95% of layers and reduce the overall on-device GPU memory usage of transformer-based models such as ViT and BERT by an average of 2.2x, across different NLP and CV benchmarks/datasets such as GLUE, SQuAD 2.0, CIFAR-10, CIFAR-100 and ImageNet, with an average degradation of 0.2% in accuracy.

Quantization • Scheduling
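To make the freezing idea concrete, here is a minimal PyTorch sketch, not the authors' SlimFit implementation: layers whose parameters have barely moved since a snapshot are frozen, so they drop out of subsequent updates. The `snapshot` and `freeze_static_layers` helpers and the `threshold` criterion are hypothetical stand-ins for the paper's training-dynamics scheduling.

```python
import torch
import torch.nn as nn

# A toy stack of encoder layers standing in for BERT/ViT blocks.
layers = nn.ModuleList(
    [nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True)
     for _ in range(6)]
)

def snapshot(layers):
    """Detached copies of every layer's parameters."""
    return [[p.detach().clone() for p in l.parameters()] for l in layers]

def freeze_static_layers(layers, before, threshold=1e-4):
    """Freeze layers whose mean parameter change since `before` is below
    `threshold` (an illustrative proxy for SlimFit's criterion)."""
    for layer, old in zip(layers, before):
        deltas = [(p.detach() - q).abs().mean()
                  for p, q in zip(layer.parameters(), old)]
        if torch.stack(deltas).mean() < threshold:
            for p in layer.parameters():
                p.requires_grad = False  # excluded from future updates

before = snapshot(layers)
# ... a few fine-tuning steps would update `layers` here ...
freeze_static_layers(layers, before)
print(sum(all(not p.requires_grad for p in l.parameters()) for l in layers),
      "of", len(layers), "layers frozen")
```

In the paper's setting, frozen layers not only skip optimizer updates but also no longer need their activations cached for the backward pass, which is where the on-device memory savings come from.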

Dynamic Tensor Rematerialization

1 code implementation • ICLR 2021 • Marisa Kirisame, Steven Lyubomirsky, Altan Haan, Jennifer Brennan, Mike He, Jared Roesch, Tianqi Chen, Zachary Tatlock

Checkpointing enables the training of deep learning models under restricted memory budgets by freeing intermediate activations from memory and recomputing them on demand.
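The recompute-on-demand trade-off can be illustrated with PyTorch's built-in static checkpointing utility, `torch.utils.checkpoint.checkpoint_sequential`. This is only a sketch of the general technique: DTR itself decides evictions and rematerializations dynamically at runtime rather than from a fixed segmentation.

```python
import torch
from torch.utils.checkpoint import checkpoint_sequential

# A small stack of layers standing in for a deep model.
model = torch.nn.Sequential(*[torch.nn.Linear(512, 512) for _ in range(8)])
x = torch.randn(16, 512, requires_grad=True)

# Split the stack into 4 segments: only activations at segment boundaries
# are kept during the forward pass; everything inside a segment is freed.
out = checkpoint_sequential(model, 4, x, use_reentrant=False)

# During backward, each segment's forward is re-run to rematerialize the
# activations its gradients need, trading extra compute for memory.
out.sum().backward()
```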
