About
I'm available and eager to help with any of the following:
Software performance…
Services
Activity
-
Hey everyone! Salil Pradhan and my team at X, The Moonshot Factory is looking for an ML Engineer extraordinaire to help us decarbonize the world by…
Liked by Phil Miller
-
How do we, as a community, check citations for correctness to prevent hallucinations from being published? Some papers have >100 citations, reviewers…
Liked by Phil Miller
Experience
Education
Publications
-
Position Paper: A Multi-resolution Emulation + Simulation Methodology
MODSIM 2013
As we design exascale applications and machines, it becomes important to be able to analyze and experiment with alternate designs of both machines and applications. These experiments have to be done before the machines are built since it will be too expensive to build a large number of alternate designs. One of the challenges in this process is how to represent application behavior in such machines. For analyzing network performance via simulations, for example, one can use pre-designed injection patterns, but they do not capture the feedback that occurs naturally in applications: if an incoming message is late, the ordering of events may change, and outgoing message injection will also change. To achieve a high fidelity simulation is therefore challenging. One method that has shown promise is that of emulation-followed-by-simulation: one carries out a full-scale emulation of the application with the correct number of nodes and control threads, facilitated by some overdecomposition-based system such as Charm++ [1], FG-MPI [2], or AMPI [3]. The emulation captures dependencies between sequential computations and remote data in traces. The traces generated by emulation can then be fed to a multi-component simulator, where a variable resolution simulation can be carried out to predict performance and other attributes. We advocate this methodology and elaborate on research challenges involved in following it in exascale design. At exascale, we expect the components, which are pluggable entities similar to those used in existing frameworks such as BigSim [4, 5], SST [6], to simulate network, resilience support, power management, thermal constraints, operating system and file system. In addition, the adaptive runtime system, essential for scalable execution at exascale, needs to be (and can be) simulated in detail, with realistic code and strategies, in order to attain high fidelity.
-
'Cool' Load Balancing for High Performance Computing Data Centers
IEEE Transactions on Computers
As we move to exascale machines, both peak power demand and total energy consumption have become prominent challenges. A significant portion of that power and energy consumption is devoted to cooling, which we strive to minimize in this work. We propose a scheme based on a combination of limiting processor temperatures using Dynamic Voltage and Frequency Scaling (DVFS) and frequency-aware load balancing that reduces cooling energy consumption and prevents hot spot formation. Our approach is particularly designed for parallel applications, which are typically tightly coupled, and tries to minimize the timing penalty associated with temperature control. This paper describes results from experiments using five different CHARM++ and MPI applications with a range of power and utilization profiles. They were run on a 32-node (128-core) cluster with a dedicated air conditioning unit. The scheme is assessed based on three metrics: the ability to control processors’ temperature and hence avoid hot spots, minimization of timing penalty, and cooling energy savings. Our results show cooling energy savings of up to 63%, with a timing penalty of only 2–23%.
-
Using Shared Arrays in Message-Driven Parallel Programs
International Workshop on High-Level Parallel Programming Models and Supportive Environments at IPDPS (HIPS) 2011
Superseded by journal version in Parallel Computing
This paper describes a safe and efficient combination of the object-based message-driven execution and shared array parallel programming models. In particular, we demonstrate how this combination engenders the composition of loosely coupled parallel modules safely accessing a common shared array. That loose coupling enables both better flexibility in parallel execution and greater ease of implementing multi-physics simulations. As a case study, we describe how the parallelization of a new method for molecular dynamics simulation benefits from both of these advantages. We also describe a system of typed handle objects that embed some of the determinacy constraints of the Multiphase Shared Array programming model in the C++ type system, to catch some violations at compile time. The combined programming model communicates in terms of these handles as a natural means of detecting and preventing errors.
-
PGAS in the message-driven execution model
1st Workshop on Asynchrony in the PGAS Programming Model APGAS
Asynchrony is increasingly important for high performance on modern parallel machines. A common approach to providing asynchrony in PGAS languages is to add additional language constructs to support asynchronous execution. In this paper we describe Multiphase Shared Arrays (MSA), a restricted PGAS programming model that takes the opposite approach, layering PGAS semantics over a fundamentally asynchronous runtime environment. We sidestep many of the difficulties of asynchronous programming through a discipline that offers desirable safety properties while exposing opportunities for optimization at multiple levels. We retain generality by offering composability with general purpose parallel programming models.
Projects
-
National Water Model NextGen Framework
-
WarpX
-
Contribute to simulation and analysis capabilities desired by my client, Modern Electron. Focus on C++/Python interfacing, extending physics capabilities, and enhancing performance for their use cases, which diverged substantially from the upstream team's focus.
-
CHIME
-
Contribute to accuracy, testing and validation, performance, and overall capabilities in a rapid-response project to support hospitals and public health officials in forecasting the magnitude of short-term impacts they could expect in the early stages of the COVID pandemic.
Languages
-
English
-
Organizations
-
ACM, IEEE, SIAM
-
More activity by Phil
-
MPICH 5.0.0 was released this week. This is the first implementation of the much-awaited MPI-5 ABI in a release version. MPI-5 ABI support has been…
Liked by Phil Miller
-
Datacenters in Orbit: 🚀 A Sustainable Promise or Hype? I wanted to share some thoughts about in orbit datacenters. In case you missed it, there’s…
Liked by Phil Miller
-
I'm excited to announce that this morning the University of Maryland Senate PCC approved our proposal to initiate a minor program of study: Climate…
Liked by Phil Miller
-
We're looking for a senior scientist to join our hydrology team in Portland! We have a rare opening for a Hydrologist embedded at the Northwest…
Liked by Phil Miller
-
Woohooo - just saw the latest PRL cover shows a #WarpX simulation! Very creative work to generate coherent photons from laser-plasma interaction in a…
Liked by Phil Miller
-
13 years. In Silicon Valley, that's a lifetime. My chapter at LinkedIn has come to a close. As I like to say, it was my best job ever, and my only…
Liked by Phil Miller
-
Last week, David Gross, Katie Shepherd, and I kicked off our second cohort of the Zigzag Project with the Harvey Mudd College alumni! To be honest…
Liked by Phil Miller
-
High-resolution land cover data has been a game changer for communities assessing flood risks, stormwater runoff, and more. Lynker team staff…
Liked by Phil Miller
-
So grateful that I got to spend the afternoon talking with the fantastic people at the Techstars AI Health Baltimore powered by Johns Hopkins…
Liked by Phil Miller
-
Looking forward to this new adventure! #HPC
Liked by Phil Miller