Search Results for author: Ian Foster

Found 50 papers, 27 papers with code

Combining Language and Graph Models for Semi-structured Information Extraction on the Web

no code implementations • 21 Feb 2024 • Zhi Hong, Kyle Chard, Ian Foster

Relation extraction is an efficient way of mining the extraordinary wealth of human knowledge on the Web.

Language Modelling Relation +1

Comprehensive Exploration of Synthetic Data Generation: A Survey

no code implementations • 4 Jan 2024 • André Bauer, Simon Trapp, Michael Stenger, Robert Leppich, Samuel Kounev, Mark Leznik, Kyle Chard, Ian Foster

This work surveys 417 Synthetic Data Generation (SDG) models over the last decade, providing a comprehensive overview of model types, functionality, and improvements.

Decision Making Model Selection +2

Accelerating Electronic Stopping Power Predictions by 10 Million Times with a Combination of Time-Dependent Density Functional Theory and Machine Learning

1 code implementation • 1 Nov 2023 • Logan Ward, Ben Blaiszik, Cheng-Wei Lee, Troy Martin, Ian Foster, André Schleife

Knowing the rate at which particle radiation releases energy in a material, the stopping power, is key to designing nuclear reactors, medical treatments, semiconductor and quantum materials, and many other technologies.

DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies

no code implementations • 6 Oct 2023 • Shuaiwen Leon Song, Bonnie Kruft, Minjia Zhang, Conglong Li, Shiyang Chen, Chengming Zhang, Masahiro Tanaka, Xiaoxia Wu, Jeff Rasley, Ammar Ahmad Awan, Connor Holmes, Martin Cai, Adam Ghanem, Zhongzhu Zhou, Yuxiong He, Pete Luferenko, Divya Kumar, Jonathan Weyn, Ruixiong Zhang, Sylwester Klocek, Volodymyr Vragov, Mohammed AlQuraishi, Gustaf Ahdritz, Christina Floristean, Cristina Negri, Rao Kotamarthi, Venkatram Vishwanath, Arvind Ramanathan, Sam Foreman, Kyle Hippe, Troy Arcomano, Romit Maulik, Maxim Zvyagin, Alexander Brace, Bin Zhang, Cindy Orozco Bohorquez, Austin Clyde, Bharat Kale, Danilo Perez-Rivera, Heng Ma, Carla M. Mann, Michael Irvin, J. Gregory Pauloski, Logan Ward, Valerie Hayot, Murali Emani, Zhen Xie, Diangen Lin, Maulik Shukla, Ian Foster, James J. Davis, Michael E. Papka, Thomas Brettin, Prasanna Balaprakash, Gina Tourassi, John Gounley, Heidi Hanson, Thomas E Potok, Massimiliano Lupo Pasini, Kate Evans, Dan Lu, Dalton Lunga, Junqi Yin, Sajal Dash, Feiyi Wang, Mallikarjun Shankar, Isaac Lyngaas, Xiao Wang, Guojing Cong, Pei Zhang, Ming Fan, Siyan Liu, Adolfy Hoisie, Shinjae Yoo, Yihui Ren, William Tang, Kyle Felker, Alexey Svyatkovskiy, Hang Liu, Ashwin Aji, Angela Dalton, Michael Schulte, Karl Schulz, Yuntian Deng, Weili Nie, Josh Romero, Christian Dallago, Arash Vahdat, Chaowei Xiao, Thomas Gibbs, Anima Anandkumar, Rick Stevens

In the upcoming decade, deep learning may revolutionize the natural sciences, enhancing our capacity to model and predict natural occurrences.

Linking the Dynamic PicoProbe Analytical Electron-Optical Beam Line / Microscope to Supercomputers

no code implementations • 25 Aug 2023 • Alexander Brace, Rafael Vescovi, Ryan Chard, Nickolaus D. Saint, Arvind Ramanathan, Nestor J. Zaluzec, Ian Foster

The Dynamic PicoProbe at Argonne National Laboratory is undergoing upgrades that will enable it to produce up to 100s of GB of data per day.

Hierarchical and Decentralised Federated Learning

no code implementations • 28 Apr 2023 • Omer Rana, Theodoros Spyridopoulos, Nathaniel Hudson, Matt Baughman, Kyle Chard, Ian Foster, Aftab Khan

Hierarchical Federated Learning is likely to be a key enabler for a wide range of applications, such as smart farming and smart energy management, as it can improve performance and reduce costs, whilst also enabling FL workflows to be deployed in environments that are not well-suited to traditional FL.

energy management Federated Learning

Cloud Services Enable Efficient AI-Guided Simulation Workflows across Heterogeneous Resources

2 code implementations • 15 Mar 2023 • Logan Ward, J. Gregory Pauloski, Valerie Hayot-Sasson, Ryan Chard, Yadu Babuji, Ganesh Sivaraman, Sutanay Choudhury, Kyle Chard, Rajeev Thakur, Ian Foster

Applications that fuse machine learning and simulation can benefit from the use of multiple computing resources, with, for example, simulation codes running on highly parallel supercomputers and AI training and inference tasks on specialized accelerators.

Management

OpenHLS: High-Level Synthesis for Low-Latency Deep Neural Networks for Experimental Science

no code implementations • 13 Feb 2023 • Maksim Levental, Arham Khan, Ryan Chard, Kazutomo Yoshii, Kyle Chard, Ian Foster

In many experiment-driven scientific domains, such as high-energy physics, material science, and cosmology, high data rate experiments impose hard constraints on data acquisition systems: collected data must either be indiscriminately stored for post-processing and analysis, thereby necessitating large storage capacity, or accurately filtered in real-time, thereby necessitating low-latency processing.

Low-latency processing

Insight into cloud processes from unsupervised classification with a rotationally invariant autoencoder

1 code implementation • 2 Nov 2022 • Takuya Kurihana, James Franke, Ian Foster, Ziwei Wang, Elisabeth Moyer

Clouds play a critical role in the Earth's energy budget and their potential changes are one of the largest uncertainties in future climate projections.

AICCA: AI-driven Cloud Classification Atlas

1 code implementation • 29 Sep 2022 • Takuya Kurihana, Elisabeth Moyer, Ian Foster

Clouds play an important role in the Earth's energy budget and their behavior is one of the largest uncertainties in future climate projections.

Classification

Masked Sinogram Model with Transformer for ill-Posed Computed Tomography Reconstruction: a Preliminary Study

1 code implementation • 3 Sep 2022 • Zhengchun Liu, Rajkumar Kettimuthu, Ian Foster

Inspired by the success of transformer for natural language processing, the core idea of this preliminary study is to consider a projection of tomography as a word token, and the whole scan of the cross-section (A.K.A. ...

Computed Tomography (CT) Sentence

Globus Automation Services: Research process automation across the space-time continuum

no code implementations • 19 Aug 2022 • Ryan Chard, Jim Pruyne, Kurt McKee, Josh Bryan, Brigitte Raumann, Rachana Ananthakrishnan, Kyle Chard, Ian Foster

We report here on new services within the Globus research data management platform that enable the specification of diverse research processes as reusable sets of actions, flows, and the execution of such flows in heterogeneous research environments.

Management

FAIR principles for AI models with a practical application for accelerated high energy diffraction microscopy

1 code implementation • 1 Jul 2022 • Nikil Ravi, Pranshu Chaturvedi, E. A. Huerta, Zhengchun Liu, Ryan Chard, Aristana Scourtas, K. J. Schmidt, Kyle Chard, Ben Blaiszik, Ian Foster

A concise and measurable set of FAIR (Findable, Accessible, Interoperable and Reusable) principles for scientific data is transforming the state-of-practice for data management and stewardship, supporting and enabling discovery and innovation.

Management

The Diminishing Returns of Masked Language Models to Science

no code implementations • 23 May 2022 • Zhi Hong, Aswathy Ajith, Gregory Pauloski, Eamon Duede, Kyle Chard, Ian Foster

Transformer-based masked language models such as BERT, trained on general corpora, have shown impressive performance on downstream tasks.

Language Modelling

3D Convolutional Neural Networks for Dendrite Segmentation Using Fine-Tuning and Hyperparameter Optimization

no code implementations • 2 May 2022 • Jim James, Nathan Pruyne, Tiberiu Stan, Marcus Schwarting, Jiwon Yeom, Seungbum Hong, Peter Voorhees, Ben Blaiszik, Ian Foster

The trained 3D CNNs are able to segment entire 852 x 852 x 250 voxel 3D volumes in only ~60 seconds, thus hastening the progress towards a deeper understanding of phase transformation phenomena such as dendritic solidification.

Hyperparameter Optimization

fairDMS: Rapid Model Training by Data and Model Reuse

1 code implementation • 20 Apr 2022 • Ahsan Ali, Hemant Sharma, Rajkumar Kettimuthu, Peter Kenesei, Dennis Trujillo, Antonino Miceli, Ian Foster, Ryan Coffee, Jana Thayer, Zhengchun Liu

Extracting actionable information rapidly from data produced by instruments such as the Linac Coherent Light Source (LCLS-II) and Advanced Photon Source Upgrade (APS-U) is becoming ever more challenging due to high (up to TB/s) data rates.

Information Retrieval Retrieval

Ultrafast Focus Detection for Automated Microscopy

no code implementations • 26 Aug 2021 • Maksim Levental, Ryan Chard, Kyle Chard, Ian Foster, Gregg A. Wildenberg

Technological advancements in modern scientific instruments, such as scanning electron microscopes (SEMs), have significantly increased data acquisition rates and image resolutions enabling new questions to be explored; however, the resulting data volumes and velocities, combined with automated experiments, are quickly overwhelming scientists as there remain crucial steps that require human intervention, for example reviewing image focus.

Semantic Segmentation

KAISA: An Adaptive Second-Order Optimizer Framework for Deep Neural Networks

3 code implementations • 4 Jul 2021 • J. Gregory Pauloski, Qi Huang, Lei Huang, Shivaram Venkataraman, Kyle Chard, Ian Foster, Zhao Zhang

Kronecker-factored Approximate Curvature (K-FAC) has recently been shown to converge faster in deep neural network (DNN) training than stochastic gradient descent (SGD); however, K-FAC's larger memory footprint hinders its applicability to large models.

BFTrainer: Low-Cost Training of Neural Networks on Unfillable Supercomputer Nodes

3 code implementations • 22 Jun 2021 • Zhengchun Liu, Rajkumar Kettimuthu, Michael E. Papka, Ian Foster

We describe how the task of rescaling suitable DNN training tasks to fit dynamically changing holes can be formulated as a deterministic mixed integer linear programming (MILP)-based resource allocation algorithm, and show that this MILP problem can be solved efficiently at run time.

Scheduling

Bridging Data Center AI Systems with Edge Computing for Actionable Information Retrieval

2 code implementations • 28 May 2021 • Zhengchun Liu, Ahsan Ali, Peter Kenesei, Antonino Miceli, Hemant Sharma, Nicholas Schwarz, Dennis Trujillo, Hyunseung Yoo, Ryan Coffee, Naoufal Layad, Jana Thayer, Ryan Herbst, ChunHong Yoon, Ian Foster

Extremely high data rates at modern synchrotron and X-ray free-electron laser light source beamlines motivate the use of machine learning methods for data reduction, feature detection, and other purposes.

BIG-bench Machine Learning Edge-computing +2

Coupling streaming AI and HPC ensembles to achieve 100-1000x faster biomolecular simulations

no code implementations • 10 Apr 2021 • Alexander Brace, Igor Yakushin, Heng Ma, Anda Trifan, Todd Munson, Ian Foster, Arvind Ramanathan, Hyungro Lee, Matteo Turilli, Shantenu Jha

The results establish DeepDriveMD as a high-performance framework for ML-driven HPC simulation scenarios that supports diverse MD simulation and ML back-ends and enables new scientific insights by improving the length and time scales accessible with current computing capacity.

Protein Folding

Data-driven Cloud Clustering via a Rotationally Invariant Autoencoder

no code implementations • 8 Mar 2021 • Takuya Kurihana, Elisabeth Moyer, Rebecca Willett, Davis Gilton, Ian Foster

Advanced satellite-borne remote sensing instruments produce high-resolution multi-spectral data for much of the globe at a daily cadence.

Clustering

Fast and accurate learned multiresolution dynamical downscaling for precipitation

1 code implementation • 18 Jan 2021 • Jiali Wang, Zhengchun Liu, Ian Foster, Won Chang, Rajkumar Kettimuthu, Rao Kotamarthi

We compare the four new CNN-derived high-resolution precipitation results with precipitation generated from original high-resolution simulations, a bilinear interpolator, and the state-of-the-art CNN-based super-resolution (SR) technique.

Generative Adversarial Network Super-Resolution

AI- and HPC-enabled Lead Generation for SARS-CoV-2: Models and Processes to Extract Druglike Molecules Contained in Natural Language Text

1 code implementation • 12 Jan 2021 • Zhi Hong, J. Gregory Pauloski, Logan Ward, Kyle Chard, Ben Blaiszik, Ian Foster

Researchers worldwide are seeking to repurpose existing drugs or discover new drugs to counter the disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2).

named-entity-recognition Named Entity Recognition +1

Infrastructure for Artificial Intelligence, Quantum and High Performance Computing

no code implementations • 16 Dec 2020 • William Gropp, Sujata Banerjee, Ian Foster

High Performance Computing (HPC), Artificial Intelligence (AI)/Machine Learning (ML), and Quantum Computing (QC) and communications offer immense opportunities for innovation and impact on society.

Vocal Bursts Intensity Prediction

Accelerated, Scalable and Reproducible AI-driven Gravitational Wave Detection

no code implementations • 15 Dec 2020 • E. A. Huerta, Asad Khan, Xiaobo Huang, Minyang Tian, Maksim Levental, Ryan Chard, Wei Wei, Maeve Heflin, Daniel S. Katz, Volodymyr Kindratenko, Dawei Mu, Ben Blaiszik, Ian Foster

The development of reusable artificial intelligence (AI) models for wider use and rigorous validation by the community promises to unlock new opportunities in multi-messenger astrophysics.

Distributed Computing Gravitational Wave Detection

The Rise of AI-Driven Simulators: Building a New Crystal Ball

no code implementations • 11 Dec 2020 • Ian Foster, David Parkes, Stephan Zheng

These advances may lead to a new era in computational simulation, in which sensors of many kinds are used to produce vast quantities of data, AI methods identify patterns in those data, and new AI-driven simulators combine machine-learned and mathematical rules to make accurate and actionable predictions.

HydroNet: Benchmark Tasks for Preserving Intermolecular Interactions and Structural Motifs in Predictive and Generative Models for Molecular Data

no code implementations • 30 Nov 2020 • Sutanay Choudhury, Jenna A. Bilbrey, Logan Ward, Sotiris S. Xantheas, Ian Foster, Joseph P. Heindel, Ben Blaiszik, Marcus E. Schwarting

Intermolecular and long-range interactions are central to phenomena as diverse as gene regulation, topological states of quantum materials, electrolyte transport in batteries, and the universal solvation properties of water.

BIG-bench Machine Learning

Towards Online Steering of Flame Spray Pyrolysis Nanoparticle Synthesis

1 code implementation • 16 Oct 2020 • Maksim Levental, Ryan Chard, Joseph A. Libera, Kyle Chard, Aarthi Koripelly, Jakob R. Elias, Marcus Schwarting, Ben Blaiszik, Marius Stan, Santanu Chaudhuri, Ian Foster

Flame Spray Pyrolysis (FSP) is a manufacturing technique to mass produce engineered nanoparticles for applications in catalysis, energy materials, composites, and more.

BraggNN: Fast X-ray Bragg Peak Analysis Using Deep Learning

2 code implementations • 18 Aug 2020 • Zhengchun Liu, Hemant Sharma, Jun-Sang Park, Peter Kenesei, Antonino Miceli, Jonathan Almer, Rajkumar Kettimuthu, Ian Foster

When applied to a real experimental dataset, a 3D reconstruction that used peak positions computed by BraggNN yields 15% better results on average as compared to a reconstruction obtained using peak positions determined using conventional 2D pseudo-Voigt fitting.

3D Reconstruction

Targeting SARS-CoV-2 with AI- and HPC-enabled Lead Generation: A First Data Release

1 code implementation • 28 May 2020 • Yadu Babuji, Ben Blaiszik, Tom Brettin, Kyle Chard, Ryan Chard, Austin Clyde, Ian Foster, Zhi Hong, Shantenu Jha, Zhuozhao Li, Xuefeng Liu, Arvind Ramanathan, Yi Ren, Nicholaus Saint, Marcus Schwarting, Rick Stevens, Hubertus van Dam, Rick Wagner

Researchers across the globe are seeking to rapidly repurpose existing drugs or discover new drugs to counter the novel coronavirus disease (COVID-19) caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2).

BIG-bench Machine Learning

funcX: A Federated Function Serving Fabric for Science

no code implementations • 7 May 2020 • Ryan Chard, Yadu Babuji, Zhuozhao Li, Tyler Skluzacek, Anna Woodard, Ben Blaiszik, Ian Foster, Kyle Chard

These new approaches must enable computation to be mobile, so that, for example, it can occur near data, be triggered by events (e.g., arrival of new data), be offloaded to specialized accelerators, or run remotely where resources are available.

Distributed, Parallel, and Cluster Computing

Scientific Image Restoration Anywhere

2 code implementations • 12 Nov 2019 • Vibhatha Abeykoon, Zhengchun Liu, Rajkumar Kettimuthu, Geoffrey Fox, Ian Foster

We explore this question by evaluating the performance and accuracy of a scientific image restoration model, for which both model input and output are images, on edge computing devices.

Edge-computing Image Denoising +2

Deep Learning Accelerated Light Source Experiments

2 code implementations • 9 Oct 2019 • Zhengchun Liu, Tekin Bicer, Rajkumar Kettimuthu, Ian Foster

Experimental protocols at synchrotron light sources typically process and validate data only after an experiment has completed, which can lead to undetected errors and cannot enable online steering.

IRNet: A General Purpose Deep Residual Regression Framework for Materials Discovery

2 code implementations • 7 Jul 2019 • Dipendra Jha, Logan Ward, Zijiang Yang, Christopher Wolverton, Ian Foster, Wei-keng Liao, Alok Choudhary, Ankit Agrawal

We use the problem of learning properties of inorganic materials from numerical attributes derived from material composition and/or crystal structure to compare IRNet's performance against that of other machine learning techniques.

BIG-bench Machine Learning regression

Machine Learning Prediction of Accurate Atomization Energies of Organic Molecules from Low-Fidelity Quantum Chemical Calculations

1 code implementation • 7 Jun 2019 • Logan Ward, Ben Blaiszik, Ian Foster, Rajeev S. Assary, Badri Narayanan, Larry Curtiss

Recent studies illustrate how machine learning (ML) can be used to bypass a core challenge of molecular modeling: the tradeoff between accuracy and computational cost.

BIG-bench Machine Learning

TomoGAN: Low-Dose Synchrotron X-Ray Tomography with Generative Adversarial Networks

3 code implementations • 20 Feb 2019 • Zhengchun Liu, Tekin Bicer, Rajkumar Kettimuthu, Doga Gursoy, Francesco De Carlo, Ian Foster

We present TomoGAN, a denoising technique based on generative adversarial networks, for improving the quality of reconstructed images for low-dose imaging conditions.

Denoising

DLHub: Model and Data Serving for Science

no code implementations • 27 Nov 2018 • Ryan Chard, Zhuozhao Li, Kyle Chard, Logan Ward, Yadu Babuji, Anna Woodard, Steve Tuecke, Ben Blaiszik, Michael J. Franklin, Ian Foster

Here we present the Data and Learning Hub for science (DLHub), a multi-tenant system that provides both model repository and serving capabilities with a focus on science applications.

Distributed Computing

BioWorkbench: A High-Performance Framework for Managing and Analyzing Bioinformatics Experiments

1 code implementation • 11 Jan 2018 • Maria Luiza Mondelli, Thiago Magalhães, Guilherme Loss, Michael Wilde, Ian Foster, Marta Mattoso, Daniel S. Katz, Helio J. C. Barbosa, Ana Tereza R. Vasconcelos, Kary Ocaña, Luiz M. R. Gadelha Jr

This framework automatically collects provenance data, including both performance data from workflow execution and data from the scientific domain of the workflow application.

Distributed, Parallel, and Cluster Computing Databases
