Silent Data Corruptions at Scale

22 Feb 2021 · Harish Dattatraya Dixit, Sneha Pendharkar, Matt Beadon, Chris Mason, Tejasvi Chakravarthy, Bharath Muthiah, Sriram Sankar

This has led to the detection of hundreds of CPUs with these errors, showing that silent data corruptions (SDCs) are a systemic issue across generations.

Hardware Architecture; Distributed, Parallel, and Cluster Computing

