no code implementations • 11 Jan 2024 • Qian Gong, Jieyang Chen, Ben Whitney, Xin Liang, Viktor Reshniak, Tania Banerjee, Jaemoon Lee, Anand Rangarajan, Lipeng Wan, Nicolas Vidal, Qing Liu, Ana Gainaru, Norbert Podhorszki, Richard Archibald, Sanjay Ranka, Scott Klasky
We describe MGARD, a software package that provides MultiGrid Adaptive Reduction for floating-point scientific data on structured and unstructured grids.
no code implementations • 6 Jan 2024 • Qian Gong, Chengzhu Zhang, Xin Liang, Viktor Reshniak, Jieyang Chen, Anand Rangarajan, Sanjay Ranka, Nicolas Vidal, Lipeng Wan, Paul Ullrich, Norbert Podhorszki, Robert Jacob, Scott Klasky
Additionally, we integrate spatiotemporal feature detection with data compression and demonstrate that performing adaptive error-bounded compression in higher dimensional space enables greater compression ratios, leveraging the error propagation theory of a transformation-based compressor.
1 code implementation • 21 Dec 2022 • Tania Banerjee, Jong Choi, Jaemoon Lee, Qian Gong, Jieyang Chen, Scott Klasky, Anand Rangarajan, Sanjay Ranka
Data compression is becoming critical for storing scientific data because many scientific applications need to store large amounts of data and post-process it for scientific discovery.
no code implementations • 16 Jul 2020 • Bingbing Li, Santosh Pandey, Haowen Fang, Yanjun Lyv, Ji Li, Jieyang Chen, Mimi Xie, Lipeng Wan, Hang Liu, Caiwen Ding
In natural language processing (NLP), the "Transformer" architecture was proposed as the first transduction model relying entirely on self-attention mechanisms without using sequence-aligned recurrent neural networks (RNNs) or convolution, and it achieved significant improvements for sequence-to-sequence tasks.
no code implementations • 27 Mar 2020 • Kai Zhao, Sheng Di, Sihuan Li, Xin Liang, Yujia Zhai, Jieyang Chen, Kaiming Ouyang, Franck Cappello, Zizhong Chen
(1) We propose several systematic ABFT schemes based on checksum techniques and thoroughly analyze their fault protection ability and runtime. Unlike traditional ABFT based on matrix-matrix multiplication, our schemes support any convolution implementation.
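As a rough illustration of how a checksum can protect a convolution regardless of how that convolution is implemented, the sketch below relies only on the linearity of convolution in its input: convolving the sum of a batch of inputs should match the sum of the individual outputs. This is not one of the schemes from the paper; the names `conv1d` and `checked_conv_batch`, the 1D setting, and the tolerance are assumptions chosen for brevity.

```python
def conv1d(signal, kernel):
    """Valid-mode 1D cross-correlation; a stand-in for any convolution routine."""
    n, k = len(signal), len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(n - k + 1)]


def checked_conv_batch(batch, kernel, conv=conv1d, tol=1e-9):
    """Convolve every input in `batch` and verify the results with a checksum.

    Hypothetical helper: `conv` is treated as a black box, so the check does
    not depend on how the convolution is implemented internally.
    Returns (outputs, ok), where ok is False if a mismatch is detected.
    """
    outputs = [conv(x, kernel) for x in batch]

    # Input checksum: element-wise sum of all inputs in the batch.
    input_checksum = [sum(col) for col in zip(*batch)]
    # By linearity, convolving the checksum predicts the sum of all outputs.
    expected = conv(input_checksum, kernel)
    # Output checksum: element-wise sum of the outputs actually produced.
    observed = [sum(col) for col in zip(*outputs)]

    ok = all(abs(e - o) <= tol * max(1.0, abs(e))
             for e, o in zip(expected, observed))
    return outputs, ok


if __name__ == "__main__":
    batch = [[1, 2, 3, 4], [0, 1, 0, 1]]
    outputs, ok = checked_conv_batch(batch, [1, -1])
    print(outputs, ok)
```

The check costs one extra convolution per batch. It detects an error in the outputs but does not by itself locate or correct it.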
2 code implementations • 9 Feb 2020 • Cody Rivera, Jieyang Chen, Nan Xiong, Shuaiwen Leon Song, Dingwen Tao
Much work has been done on optimizing linear algebra operations on GPUs with regular-shaped inputs.
Distributed, Parallel, and Cluster Computing
1 code implementation • 11 Mar 2019 • Ang Li, Shuaiwen Leon Song, Jieyang Chen, Jiajia Li, Xu Liu, Nathan Tallent, Kevin Barker
High-performance multi-GPU computing has become an inevitable trend due to the ever-increasing demand for computation capability in emerging domains such as deep learning, big data, and planet-scale simulations.
Hardware Architecture • Distributed, Parallel, and Cluster Computing • Networking and Internet Architecture • Performance