The transfer learning method can be applied to other problems that require transferring knowledge from simulations to the domain of real observations.
Large-scale numerical simulations are used across many scientific disciplines to facilitate experimental development and provide insights into underlying physical processes, but they come with a significant computational cost.
A crucial aspect of managing a public health crisis is to effectively balance prevention and mitigation strategies, while taking their socio-economic impact into account.
Here we present a new approach to calibrate an agent-based model -- EpiCast -- using a large set of simulation ensembles for different major metropolitan areas of the United States.
With increasing reliance on the outcomes of black-box models in critical applications, post-hoc explainability tools that do not require access to the model internals are often used to help humans understand and trust these models.
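Such tools are model-agnostic: they probe the black box only through its predictions. As a minimal sketch of the idea, here is permutation feature importance, a generic baseline rather than any specific tool from this work; `predict_fn`, `X`, and `y` are placeholder names:

```python
import numpy as np

def permutation_importance(predict_fn, X, y, n_repeats=10, seed=0):
    """Score each feature by how much shuffling it degrades accuracy.
    Only the prediction function is needed, never the model internals."""
    rng = np.random.default_rng(seed)
    baseline = np.mean(predict_fn(X) == y)        # accuracy on intact data
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            Xp = X.copy()
            rng.shuffle(Xp[:, j])                 # destroy feature j only
            drops.append(baseline - np.mean(predict_fn(Xp) == y))
        scores[j] = np.mean(drops)
    return scores                                 # larger = more important
```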
Optimizing the performance of large-scale parallel codes is critical for efficient utilization of computing resources.
Distributed, Parallel, and Cluster Computing • Performance
Predictive models that accurately emulate complex scientific processes can achieve exponential speed-ups over numerical simulators or experiments, and at the same time provide surrogates for improving the subsequent analysis.
Neural networks have become very popular in surrogate modeling because of their ability to characterize arbitrary, high-dimensional functions in a data-driven fashion.
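To make that concrete, here is a minimal sketch of a neural-network surrogate in PyTorch; the toy inputs `x` and the `sin`-based target are stand-ins for real simulation inputs and outputs, not data from any paper listed here:

```python
import torch
import torch.nn as nn

# Toy stand-ins for expensive simulator inputs/outputs (assumed shapes).
x = torch.rand(4096, 5)
y = torch.sin(x.sum(dim=1, keepdim=True))

surrogate = nn.Sequential(            # small MLP emulating the simulator
    nn.Linear(5, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 1),
)
opt = torch.optim.Adam(surrogate.parameters(), lr=1e-3)

for step in range(2000):              # fit the surrogate to simulation data
    opt.zero_grad()
    loss = nn.functional.mse_loss(surrogate(x), y)
    loss.backward()
    opt.step()

# Once trained, surrogate(x_new) replaces a costly simulator call.
```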
no code implementations • 5 Dec 2019 • J. Luc Peterson, Ben Bay, Joe Koning, Peter Robinson, Jessica Semler, Jeremy White, Rushil Anirudh, Kevin Athey, Peer-Timo Bremer, Francesco Di Natale, David Fox, Jim A. Gaffney, Sam A. Jacobs, Bhavya Kailkhura, Bogdan Kustowski, Steven Langer, Brian Spears, Jayaraman Thiagarajan, Brian Van Essen, Jae-Seung Yeom
With the growing complexity of computational and experimental facilities, many scientific researchers are turning to machine learning (ML) techniques to analyze large scale ensemble data.
2 code implementations • 5 Oct 2019 • Sam Ade Jacobs, Brian Van Essen, David Hysom, Jae-Seung Yeom, Tim Moon, Rushil Anirudh, Jayaraman J. Thiagarajan, Shusen Liu, Peer-Timo Bremer, Jim Gaffney, Tom Benson, Peter Robinson, Luc Peterson, Brian Spears
Training deep neural networks on large scientific data is a challenging task that requires enormous compute power, especially if no pre-trained models exist to initialize the process.
There is significant interest in using modern neural networks for scientific applications due to their effectiveness in modeling highly complex, non-linear problems in a data-driven fashion.
We present function preserving projections (FPP), a scalable linear projection technique for discovering interpretable relationships in high-dimensional data.
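Method details aside, the core idea can be sketched as learning a linear map into 2-D such that a simple model can still recover the function of interest from the projected coordinates. Below is a hedged PyTorch sketch under that reading (random toy data, no orthogonality constraint, not the authors' implementation):

```python
import torch
import torch.nn as nn

X = torch.randn(2048, 20)                     # hypothetical high-dim samples
f = (X[:, 0] * X[:, 1]).unsqueeze(1)          # stand-in function values

proj = nn.Linear(20, 2, bias=False)           # linear projection to learn
head = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(list(proj.parameters()) + list(head.parameters()),
                       lr=1e-3)

for step in range(3000):
    opt.zero_grad()
    # The projection is "function preserving" to the extent that f
    # remains predictable from the 2-D projected coordinates alone.
    loss = nn.functional.mse_loss(head(proj(X)), f)
    loss.backward()
    opt.step()

# A 2-D scatter of proj(X) colored by f now exposes the relationship.
```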
With rapid adoption of deep learning in critical applications, the question of when and how much to trust these models often arises, which drives the need to quantify the inherent uncertainties.
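One widely used recipe for such uncertainty estimates (a generic baseline, not necessarily the method proposed here) is Monte Carlo dropout, sketched below:

```python
import torch
import torch.nn as nn

model = nn.Sequential(                        # any dropout-equipped network
    nn.Linear(5, 64), nn.ReLU(), nn.Dropout(p=0.1),
    nn.Linear(64, 1),
)

def mc_dropout_predict(model, x, n_samples=100):
    """Keep dropout active at inference; the spread across repeated
    stochastic forward passes serves as a predictive uncertainty."""
    model.train()                             # leaves dropout switched on
    with torch.no_grad():
        preds = torch.stack([model(x) for _ in range(n_samples)])
    return preds.mean(dim=0), preds.std(dim=0)
```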
2 code implementations • 19 Jul 2019 • Shusen Liu, Di Wang, Dan Maljovec, Rushil Anirudh, Jayaraman J. Thiagarajan, Sam Ade Jacobs, Brian C. Van Essen, David Hysom, Jae-Seung Yeom, Jim Gaffney, Luc Peterson, Peter B. Robinson, Harsh Bhatia, Valerio Pascucci, Brian K. Spears, Peer-Timo Bremer
With the rapid adoption of machine learning techniques for large-scale applications in science and engineering comes the convergence of two grand challenges in visualization.
This paper provides a general framework to study the effect of sampling properties of training data on the generalization error of the learned machine learning (ML) models.
Neural network models have gained unprecedented popularity in natural language processing due to their state-of-the-art performance and the flexible end-to-end training scheme.
Techniques for understanding the functioning of complex machine learning models are becoming increasingly popular, not only to improve the validation process, but also to extract new insights about the data via exploratory analysis.
Unsupervised dimension selection is an important problem that seeks to reduce the dimensionality of data while preserving its most useful characteristics.
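A common baseline for this problem (illustrative only, not the method studied in the paper) scores each original dimension by its leverage in a rank-$k$ PCA and keeps the top scorers:

```python
import numpy as np

def select_dimensions(X, k):
    """Return indices of the k original dimensions that carry the most
    of the data's principal structure (rank-k PCA leverage scores)."""
    Xc = X - X.mean(axis=0)                   # center the data
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    leverage = (Vt[:k] ** 2).sum(axis=0)      # per-dimension leverage
    return np.argsort(leverage)[::-1][:k]
```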
Sampling one or more effective solutions from large search spaces is a recurring idea in machine learning, and sequential optimization has become a popular solution.
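A representative sequential-optimization loop (a generic Bayesian-optimization sketch with a toy `objective`, not this paper's algorithm) looks like:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def objective(x):                             # stand-in for an expensive
    return -np.sum((x - 0.3) ** 2, axis=-1)   # black-box evaluation

rng = np.random.default_rng(0)
X = rng.random((5, 2))                        # small initial random design
y = objective(X)

for _ in range(20):                           # sequential optimization loop
    gp = GaussianProcessRegressor().fit(X, y)
    cand = rng.random((500, 2))               # random candidate pool
    mu, sigma = gp.predict(cand, return_std=True)
    x_next = cand[np.argmax(mu + 1.96 * sigma)]   # UCB acquisition
    X = np.vstack([X, x_next])
    y = np.append(y, objective(x_next))

print("best point found:", X[np.argmax(y)])
```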
Furthermore, we introduce a new approach to discover a diverse set of high-quality linear projections and show that, in practice, the information in $k$ linear projections is often jointly encoded in $\sim k$ axis-aligned plots.
Third, we propose an efficient estimator to evaluate the space-filling properties of sample designs in arbitrary dimensions and use it to develop an optimization framework to generate high-quality space-filling designs.
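As a hedged illustration of what such an estimator measures, the sketch below uses the standard maximin criterion, which may differ from the estimator actually proposed:

```python
import numpy as np
from scipy.spatial.distance import pdist

def maximin_score(design):
    """Space-filling quality as the minimum pairwise distance between
    points; larger values indicate a more evenly spread design."""
    return pdist(design).min()

rng = np.random.default_rng(0)

# Crude "optimization": keep the best-scoring of many random designs.
best = max((rng.random((50, 3)) for _ in range(200)), key=maximin_score)
print("random:", maximin_score(rng.random((50, 3))))
print("best-of-200:", maximin_score(best))
```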
Interpretability has emerged as a crucial aspect of building trust in machine learning systems, aimed at providing insights into the working of complex neural networks that are otherwise opaque to a user.
Tuning the models and parameters of common segmentation approaches is challenging, especially in the presence of noise and artifacts.