Ian T. Foster
University of Chicago
H-index: 141
North America-United States
Description
Ian T. Foster is a distinguished researcher at the University of Chicago with an exceptional h-index of 141 and a recent h-index of 53 (since 2020). He specializes in computer science, computational science, distributed computing, and data science.
His recent articles reflect a diverse array of research interests and contributions to the field:
UniFaaS: Programming across Distributed Cyberinfrastructure with Federated Function Serving
XaaS: Acceleration as a Service to Enable Productive High-Performance Cloud Computing
De Bello Homomorphico: Investigation of the extensibility of the OpenFHE library with basic mathematical functions by means of common approaches using the example of the CKKS …
Deep Learning for Molecular Orbitals
Combining Language and Graph Models for Semi-structured Information Extraction on the Web
Steering a Fleet: Adaptation for Large-Scale, Workflow-Based Experiments
Comprehensive exploration of synthetic data generation: A survey
MalleTrain: Deep Neural Network Training on Unfillable Supercomputer Nodes
Professor Information
| University | University of Chicago |
| --- | --- |
| Position | and Argonne National Laboratory |
| Citations (all) | 140,568 |
| Citations (since 2020) | 19,238 |
| Cited by | 128,683 |
| h-index (all) | 141 |
| h-index (since 2020) | 53 |
| i10-index (all) | 648 |
| i10-index (since 2020) | 276 |
| University profile page | University of Chicago |
Research & Interests List
Computer science
computational science
distributed computing
data science
Top articles of Ian T. Foster
UniFaaS: Programming across Distributed Cyberinfrastructure with Federated Function Serving
Modern scientific applications are increasingly decomposable into individual functions that may be deployed across distributed and diverse cyberinfrastructure such as supercomputers, clouds, and accelerators. Such applications call for new approaches to programming, distributed execution, and function-level management. We present UniFaaS, a parallel programming framework that relies on a federated function-as-a-service (FaaS) model to enable composition of distributed, scalable, and high-performance scientific workflows, and to support fine-grained function-level management. UniFaaS provides a unified programming interface to compose dynamic task graphs with transparent wide-area data management. UniFaaS exploits an observe-predict-decide approach to efficiently map workflow tasks to target heterogeneous and dynamic resources. We propose a dynamic heterogeneity-aware scheduling algorithm that employs a delay mechanism and a re-scheduling mechanism to accommodate dynamic resource capacity. Our experiments show that UniFaaS can efficiently execute workflows across computing resources with minimal scheduling overhead. We show that UniFaaS can improve the performance of a real-world drug screening workflow by as much as 22.99% when employing an additional 19.48% of resources and a Montage workflow by 54.41% when employing an additional 47.83% of resources across multiple distributed clusters, in contrast to using a single cluster.
Authors
Yifei Li,Ryan Chard,Yadu Babuji,Kyle Chard,Ian Foster,Zhuozhao Li
Journal
arXiv preprint arXiv:2403.19257
Published Date
2024/3/28
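The observe-predict-decide loop and delay mechanism described in the abstract can be illustrated with a toy scheduler. This is a hedged sketch, not the UniFaaS implementation: the endpoint names, the linear cost model, and the fixed delay threshold are all illustrative assumptions.

```python
# Toy sketch of a heterogeneity-aware observe-predict-decide scheduling
# loop with a delay mechanism. Not the UniFaaS code; cost model and
# threshold are invented for illustration.

from dataclasses import dataclass

@dataclass
class Endpoint:
    name: str
    rate: float      # observed throughput (tasks/sec), i.e. dynamic capacity
    queued: int = 0  # tasks already assigned to this endpoint

    def predicted_finish(self, task_cost: float) -> float:
        # Predict completion time if one more task of this cost is queued.
        return (self.queued + 1) * task_cost / self.rate

def schedule(tasks, endpoints, delay_threshold=5.0):
    """Assign each task to the endpoint with the best predicted finish
    time; delay tasks whose best prediction exceeds the threshold so
    they can be re-scheduled once observed capacity improves."""
    delayed = []
    for cost in tasks:
        best = min(endpoints, key=lambda e: e.predicted_finish(cost))
        if best.predicted_finish(cost) > delay_threshold:
            delayed.append(cost)   # held back for re-scheduling
        else:
            best.queued += 1
    return delayed

eps = [Endpoint("cluster-a", rate=4.0), Endpoint("cluster-b", rate=1.0)]
leftover = schedule([1.0] * 10, eps)
print({e.name: e.queued for e in eps}, "delayed:", len(leftover))
```

The faster endpoint absorbs most of the work until its predicted finish time exceeds the slower endpoint's, at which point tasks spill over; tasks are delayed only when no endpoint looks favorable.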
XaaS: Acceleration as a Service to Enable Productive High-Performance Cloud Computing
HPC and Cloud have evolved independently, specializing their innovations into performance or productivity. Acceleration as a Service (XaaS) is a recipe to empower both fields with a shared execution platform that provides transparent access to computing resources, regardless of the underlying cloud or HPC service provider. Bridging HPC and cloud advancements, XaaS presents a unified architecture built on performance-portable containers. Our converged model concentrates on low-overhead, high-performance communication and computing, targeting resource-intensive workloads from climate simulations to machine learning. XaaS lifts the restricted allocation model of Function-as-a-Service (FaaS), allowing users to benefit from the flexibility and efficient resource utilization of serverless while supporting long-running and performance-sensitive workloads from HPC.
Authors
Torsten Hoefler,Marcin Copik,Pete Beckman,Andrew Jones,Ian Foster,Manish Parashar,Daniel Reed,Matthias Troyer,Thomas Schulthess,Dan Ernst,Jack Dongarra
Journal
arXiv preprint arXiv:2401.04552
Published Date
2024/1/9
De Bello Homomorphico: Investigation of the extensibility of the OpenFHE library with basic mathematical functions by means of common approaches using the example of the CKKS …
Cloud computing has become increasingly popular due to its scalability, cost-effectiveness, and ability to handle large volumes of data. However, entrusting (sensitive) data to a third party raises concerns about data security and privacy. Homomorphic encryption is one solution that allows users to store and process data in a public cloud without the cloud provider having access to it. Currently, homomorphic encryption libraries only support addition and multiplication; other mathematical functions must be implemented by the user. To this end, we discuss and implement the division, exponential, square root, logarithm, minimum, and maximum functions, using the CKKS cryptosystem of the OpenFHE library. To demonstrate that complex applications can be realized with this extended function set, we have used it to homomorphically realize the Box–Cox transform, which is used in many real-world applications, e.g., time …
Authors
Thomas Prantl,Lukas Horn,Simon Engel,Lukas Iffländer,Lukas Beierlieb,Christian Krupitzer,André Bauer,Mansi Sakarvadia,Ian Foster,Samuel Kounev
Journal
International Journal of Information Security
Published Date
2023/11/27
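To see how a function such as division can be built from addition and multiplication alone, consider the standard Newton-Raphson reciprocal iteration, a common approach of the kind the abstract refers to. The sketch below runs on plaintext floats only; a real homomorphic version over CKKS ciphertexts would additionally have to manage scale and multiplicative depth, and the initial-guess rule and iteration count here are illustrative assumptions.

```python
# Plaintext sketch of computing a/b using only addition/subtraction and
# multiplication, the operations natively available in schemes like CKKS:
# approximate 1/b via Newton-Raphson, x_{k+1} = x_k * (2 - b * x_k),
# then multiply by a.

def homomorphic_style_divide(a: float, b: float, iters: int = 8) -> float:
    # Assumes 0 < b < 2 so the initial guess x0 = 2 - b converges;
    # inputs outside this range would first need to be normalized.
    x = 2.0 - b                  # initial guess for 1/b
    for _ in range(iters):
        x = x * (2.0 - b * x)    # only mult and add/sub used
    return a * x

print(homomorphic_style_divide(3.0, 1.5))  # ≈ 2.0
```

Because the iteration converges quadratically, a handful of steps suffices for b well inside (0, 2); convergence slows as b approaches the interval's endpoints.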
Deep Learning for Molecular Orbitals
The advancement of deep learning in chemistry has resulted in state-of-the-art models that incorporate an increasing number of concepts from standard quantum chemistry, such as orbitals and Hamiltonians. With an eye towards the future development of these deep learning approaches, we present here what we believe to be the first work focused on assigning labels to orbitals, namely energies and characterizations, given the real-space descriptions of these orbitals from standard electronic structure theories such as Hartree-Fock. In addition to providing a foundation for future development, we expect these models to have immediate impact in automating and interpreting the results of advanced electronic structure approaches for chemical reactivity and spectroscopy.
Authors
Daniel King,Daniel Grzenda,Ray Zhu,Nathaniel Hudson,Ian Foster,Laura Gagliardi
Published Date
2024/4/30
Combining Language and Graph Models for Semi-structured Information Extraction on the Web
Relation extraction is an efficient way of mining the extraordinary wealth of human knowledge on the Web. Existing methods rely on domain-specific training data or produce noisy outputs. We focus here on extracting targeted relations from semi-structured web pages given only a short description of the relation. We present GraphScholarBERT, an open-domain information extraction method based on a joint graph and language model structure. GraphScholarBERT can generalize to previously unseen domains without additional data or training and produces only clean extraction results matched to the search keyword. Experiments show that GraphScholarBERT can improve extraction F1 scores by as much as 34.8% compared to previous work in a zero-shot domain and zero-shot website setting.
Authors
Zhi Hong,Kyle Chard,Ian Foster
Journal
arXiv preprint arXiv:2402.14129
Published Date
2024/2/21
Steering a Fleet: Adaptation for Large-Scale, Workflow-Based Experiments
Experimental science is increasingly driven by instruments that produce vast volumes of data and thus a need to manage, compute, describe, and index this data. High performance and distributed computing provide the means of addressing the computing needs; however, in practice, the variety of actions required and the distributed set of resources involved require sophisticated "flows" defining the steps to be performed on data. As each scan or measurement is performed by an instrument, a new instance of the flow is initiated, resulting in a "fleet" of concurrently running flows, with the overall goal of processing all the data collected during a potentially long-running experiment. During the course of the experiment, each flow may need to adapt its execution due to changes in the environment, such as computational or storage resource availability, or based on the progress of the fleet as a whole, such as completion or discovery of an intermediate result leading to a change in subsequent flows' behavior. We introduce a cloud-based decision engine, Braid, which flows consult during execution to query their run-time environment and coordinate with other flows within their fleet. Braid accepts streams of measurements taken from the run-time environment or from within flow runs, which can then be statistically aggregated and compared to other streams to determine a strategy to guide flow execution. For example, queue lengths in execution environments can be used to direct a flow to run computations in one environment or another, or experiment progress as measured by individual flows can be aggregated to determine the progress and subsequent …
Authors
Jim Pruyne,Valerie Hayot-Sasson,Weijian Zheng,Ryan Chard,Justin M Wozniak,Tekin Bicer,Kyle Chard,Ian T Foster
Journal
arXiv preprint arXiv:2403.06077
Published Date
2024/3/10
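The queue-length example in the abstract can be sketched as a small decision engine. This is an illustrative pattern, not the Braid API: the class, method names, and stream labels are hypothetical.

```python
# Illustrative sketch of the pattern the abstract describes: flows and the
# runtime environment publish measurement streams, the engine aggregates
# them statistically, and a decision compares streams to steer execution.
# Names are hypothetical, not the Braid interface.

from collections import defaultdict, deque
from statistics import mean

class DecisionEngine:
    def __init__(self, window: int = 10):
        # Keep a sliding window of recent measurements per stream.
        self.streams = defaultdict(lambda: deque(maxlen=window))

    def report(self, stream: str, value: float) -> None:
        # A flow (or the runtime environment) pushes a measurement.
        self.streams[stream].append(value)

    def choose_site(self, candidates):
        # Strategy: send the next computation to the site whose recent
        # queue-length stream has the lowest mean.
        return min(candidates, key=lambda s: mean(self.streams[s]))

engine = DecisionEngine()
for q in [12, 15, 11]:
    engine.report("queue:hpc", q)
for q in [3, 4, 2]:
    engine.report("queue:cloud", q)
print(engine.choose_site(["queue:hpc", "queue:cloud"]))  # queue:cloud
```

The same aggregate-and-compare shape applies to the abstract's other example: per-flow progress measurements aggregated into fleet-level progress that conditions later flows' behavior.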
Comprehensive exploration of synthetic data generation: A survey
Recent years have witnessed a surge in the popularity of Machine Learning (ML), applied across diverse domains. However, progress is impeded by the scarcity of training data due to expensive acquisition and privacy legislation. Synthetic data emerges as a solution, but the abundance of released models and limited overview literature pose challenges for decision-making. This work surveys 417 Synthetic Data Generation (SDG) models over the last decade, providing a comprehensive overview of model types, functionality, and improvements. Common attributes are identified, leading to a classification and trend analysis. The findings reveal increased model performance and complexity, with neural network-based approaches prevailing, except for privacy-preserving data generation. Computer vision dominates, with GANs as primary generative models, while diffusion models, transformers, and RNNs compete. Implications from our performance evaluation highlight the scarcity of common metrics and datasets, making comparisons challenging. Additionally, the neglect of training and computational costs in literature necessitates attention in future research. This work serves as a guide for SDG model selection and identifies crucial areas for future exploration.
Authors
André Bauer,Simon Trapp,Michael Stenger,Robert Leppich,Samuel Kounev,Mark Leznik,Kyle Chard,Ian Foster
Published Date
2024/1/4
MalleTrain: Deep Neural Network Training on Unfillable Supercomputer Nodes
First-come, first-served scheduling can leave a substantial fraction (up to 10%) of supercomputer nodes transiently idle. Recognizing that such unfilled nodes are well suited for deep neural network (DNN) training, due to the flexible nature of DNN training tasks, Liu et al. proposed formulating the re-scaling of DNN training tasks to fit gaps in schedules as a mixed-integer linear programming (MILP) problem, and demonstrated via simulation the potential benefits of the approach. Here, we introduce MalleTrain, a system that provides the first practical implementation of this approach and furthermore generalizes it by allowing its use even for DNN training applications for which model information is not known before runtime. Key to this latter innovation is the use of a lightweight online job profiling advisor (JPA) to collect critical scalability information for DNN jobs -- information that it then employs to optimize resource allocations dynamically, in real time. We describe the MalleTrain architecture and present the results of a detailed experimental evaluation on a supercomputer GPU cluster and several representative DNN training workloads, including neural architecture search and hyperparameter optimization. Our results not only confirm the practical feasibility of leveraging idle supercomputer nodes for DNN training but also improve significantly on prior results, increasing training throughput by up to 22.3% without requiring users to provide job scalability information.
Authors
Xiaolong Ma,Feng Yan,Lei Yang,Ian Foster,Michael E Papka,Zhengchun Liu,Rajkumar Kettimuthu
Journal
arXiv preprint arXiv:2404.15668
Published Date
2024/4/24
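The core scheduling idea, picking a node count for each malleable DNN job so that the jobs fit an idle-node gap while maximizing predicted throughput, can be sketched in miniature. The paper formulates this as a MILP; for clarity this sketch brute-forces a tiny search space instead, and the scalability profiles (the kind of data a job profiling advisor would measure) are invented.

```python
# Toy sketch of the gap-filling idea behind MalleTrain: choose per-job
# node allocations so total nodes fit the gap and predicted throughput
# is maximized. Brute force stands in for the paper's MILP formulation;
# profiles below are fabricated for illustration.

from itertools import product

def fill_gap(jobs, gap_nodes):
    """jobs: {name: {node_count: throughput}} scalability profiles.
    Returns (best_allocation, best_total_throughput)."""
    names = list(jobs)
    best, best_tp = None, -1.0
    # Each job may also be left out of the gap entirely (0 nodes).
    for alloc in product(*[[0, *jobs[n]] for n in names]):
        if sum(alloc) <= gap_nodes:
            tp = sum(jobs[n][a] for n, a in zip(names, alloc) if a)
            if tp > best_tp:
                best, best_tp = dict(zip(names, alloc)), tp
    return best, best_tp

profiles = {
    "resnet":  {1: 1.0, 2: 1.8, 4: 3.0},   # throughput vs node count
    "bert":    {1: 0.9, 2: 1.6},
    "nas-job": {1: 1.0, 2: 2.0, 4: 3.9},
}
alloc, tp = fill_gap(profiles, gap_nodes=6)
print(alloc, tp)
```

Note how sublinear scaling matters: giving the well-scaling job its largest allocation and filling the remainder with single-node slices of the others beats splitting nodes evenly.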
Professor FAQs
What is Ian T. Foster's h-index at University of Chicago?
Ian T. Foster has an h-index of 141 overall and 53 since 2020.
What are Ian T. Foster's top articles?
The top articles of Ian T. Foster at University of Chicago include:
UniFaaS: Programming across Distributed Cyberinfrastructure with Federated Function Serving
XaaS: Acceleration as a Service to Enable Productive High-Performance Cloud Computing
De Bello Homomorphico: Investigation of the extensibility of the OpenFHE library with basic mathematical functions by means of common approaches using the example of the CKKS …
Deep Learning for Molecular Orbitals
Combining Language and Graph Models for Semi-structured Information Extraction on the Web
Steering a Fleet: Adaptation for Large-Scale, Workflow-Based Experiments
Comprehensive exploration of synthetic data generation: A survey
MalleTrain: Deep Neural Network Training on Unfillable Supercomputer Nodes
...
What are Ian T. Foster's research interests?
The research interests of Ian T. Foster are computer science, computational science, distributed computing, and data science.
What is Ian T. Foster's total number of citations?
Ian T. Foster has 140,568 citations in total.
Who are the co-authors of Ian T. Foster?
The co-authors of Ian T. Foster include Carl Kesselman, Steven Tuecke, Joshua W. Elliott, Gregor von Laszewski, Ioan Raicu, and Daniel S. Katz.