no code implementations • 25 Mar 2023 • Zirui Fu, Aleksandre Avaliani, Marco Donato
Executing machine learning inference tasks on resource-constrained edge devices requires careful hardware-software co-design optimizations.
no code implementations • 28 Nov 2020 • Thierry Tambe, Coleman Hooper, Lillian Pentecost, Tianyu Jia, En-Yu Yang, Marco Donato, Victor Sanh, Paul N. Whatmough, Alexander M. Rush, David Brooks, Gu-Yeon Wei
Transformer-based language models such as BERT provide significant accuracy improvement for a multitude of natural language processing (NLP) tasks.
2 code implementations • 13 Jan 2020 • Paul Whatmough, Marco Donato, Glenn Ko, Sae-Kyu Lee, David Brooks, Gu-Yeon Wei
The current trend for domain-specific architectures (DSAs) has led to renewed interest in research test chips to demonstrate new specialized hardware.
Hardware Architecture
no code implementations • 23 Aug 2019 • Udit Gupta, Brandon Reagen, Lillian Pentecost, Marco Donato, Thierry Tambe, Alexander M. Rush, Gu-Yeon Wei, David Brooks
The architecture is enhanced by a series of dynamic activation optimizations that enable compact storage, ensure no energy is wasted computing null operations, and maintain high MAC utilization for highly parallel accelerator designs.