-
Notifications
You must be signed in to change notification settings - Fork 0
Expand file tree
/
Copy patharchitecture.src
More file actions
232 lines (184 loc) · 9.5 KB
/
architecture.src
File metadata and controls
232 lines (184 loc) · 9.5 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
%include('html_intro.inc')
<meta name="keywords" content="projects">
<meta name="description" content="NCCS TechInt group thrust area">
<title>Technology Integration Group: System Architecture and Resilience</title>
%include('hdbody.inc')
%include('hdrnav.inc')
<!-- ========================================================== -->
%include('content_intro.inc')
<div class="overview">
These projects relate to innovations in system architecture that aim
to improve performance, reduce variability, and enhance
reliability/resilience. They may also involve evaluation of new
processor technologies as applicable to the OLCF roadmap.
</div>
<div class="plinks">
<a href="#Spectral">Spectral</a>
<br><a href="#Many-Core">Functional Partitioning Runtime for Many-Core Nodes</a>
<br><a href="#TBSchedule">Top and Bottom Scheduling</a>
<br><a href="#SCFailure">Supercomputer Failure Analysis</a>
<br><a href="#Congestion">Interconnect Congestion Analysis</a>
</div>
%include('content_close.inc')
<!-- ========================================================== -->
%include('content_intro.inc')
<h2 id="Spectral">Spectral</h2>
<p>Spectral is a software library for enabling the use of the Summit
Burst-Buffer I/O system without need for code modification. Using function call
intercept techniques, Spectral hooks into an application and detects when files
are written to specially configured directories. Upon detection, spectral
schedules and manages the asynchronous drains from the burst-buffer to the
parallel file system. Spectral has successfully been tested with Fortran GTC and
Lammps on OLCF Power clusters. The use of Spectral in both applications
significantly increased checkpointing performance while requiring no
modifications to the application source code.
<p>Contributors: Christopher Zimmer and Scott Atchley
<p>Status: On-going
<p> <a class="right" href="#">Top</a>
%include('content_close.inc')
<!-- ========================================================== -->
%include('content_intro.inc')
<h2 id="Many-Core">Functional Partitioning Runtime for Many-Core Nodes</h2>
<p>With the leveling off of processor clockspeeds, chip manufacturers have
increased the number of cores to consume the additional transitors promised by
Moore's Law. As we move towards exascale systems, we may see higher core counts
per node including more cores per socket, more sockets, and add-in boards such
as GP-GPUs and many-core coprocessors connected via PCI Express. As users
initially port their applications to Titan's GP-GPUs, they may not fully utilize
the CPUs leaving them available for functional partitioning.</p>
<p>This effort will explore providing runtime services to applications
using a small subset of cores on a many-core systems by developing a
Functional Partitioning (FP) runtime environment. This environment
will partition a many-core node such that an end-to-end application
(simulation + data analysis tasks) can be scheduled in-situ, on the
same node, alongside the application’s simulation job for better
end-to-end performance.
<p> Services provided may include I/O buffering, which would allow the
application to resume computation more quickly, or various forms of
fine-grain analysis or transformation.</p>
<p>Contributors: Scott Atchley, Saurabh Gupta, Ross Miller, and
Sudharshan Vazhkudai
<p>Status: On-going
<P>Selected publications:
<ul><li>
M. Li, S. S. Vazhkudai, A.R. Butt, F. Meng, X. Ma, Y. Kim, C.
Engelmann, G. Shipman, "Functional Partitioning to Optimize End-to-End
Performance on Many-core Architectures", <i>Proceedings of Supercomputing
2010 (SC10): 23rd IEEE/ACM Int'l Conference on High Performance
Computing, Networking, Storage and Analysis</i>, New Orleans, Louisiana,
November 2010.
<a href="http://users.nccs.gov/~vazhkuda/FP.pdf">pdf</a>
</ul>
<p> <a class="right" href="#">Top</a>
%include('content_close.inc')
<!-- ========================================================== -->
%include('content_intro.inc')
<h2 id="TBSchedule">Top and Bottom Scheduling</h2>
<p>
To minimize the runtime variability of jobs and reduce node
allocation fragmentation, we developed a dual-ended scheduling
policy for the Moab scheduler on Titan as opposed to the first-fit
policy. Our algorithm schedules large jobs top down and small jobs
bottom up, thereby minimizing the fragmentation for large,
long-running jobs that is caused by small, short-lived jobs.
Further, we also modified the scheduling of nodes to a job to
prioritize the z dimension (nodes within a rack), followed by the
x dimension (nodes in the racks that are in the same row offer
better communication bandwidth) and then the y dimension (columns).
<p>
Thousands of jobs are benefitting from this technique. This
project received a Significant Event Award, a prestigious lab-wide
recognition.
<p>Contributors: Chris Zimmer, Scott Atchley, Sudharshan Vazhkudai
<p>Status: Complete
<p>Selected publications:
<ul>
<li>
Christopher Zimmer, Saurabh Gupta, Scott Atchley, Sudharshan S.
Vazhkudai, and Carl Albing, "A Multi-faceted Approach to Job
Placement for Improved Performance on Extreme-Scale Systems,"
<i>Proceedings of Supercomputing 2016 (SC16): 29th Int'l
Conference on High Performance Computing, Networking, Storage and
Analysis</i>, Salt Lake City, UT, November 2016.
<a href="http://users.nccs.gov/~vazhkuda/Titan-Scheduling.pdf">pdf</a>
</ul>
<p> <a class="right" href="#">Top</a>
%include('content_close.inc')
<!-- ========================================================== -->
%include('content_intro.inc')
<h2 id="SCFailure">Supercomputer Failure Analysis</h2>
<p>
This project analyzes temporal failure characteristics of Titan’s
299,008 CPUs and 18,688 GPUs to understand trends in machine
failure, MTBF, single bit errors, double bit errors, off the bus
errors and temperature correlation to failure
[TitanGPUReliability:HPCA15]. This study was the first of its kind
for a large-scale GPU deployment. Based on the insights, devised
checkpointing advisory tools [LazyChkpt:DSN14] that are saving
millions of core hours for production jobs, devised an
energy-based scheduling algorithm to schedule large node-count GPU
jobs that stress the GPUs more to be scheduled at the bottom of
the rack where it is much cooler than the top of the rack, or
cordoned off nodes with frequent failures
<p>TechInt contributors: Devesh Tiwari, Sudharshan Vazhkudai
<p>Status: Complete
<p>Selected publications:
<ul>
<li>
Devesh Tiwari, Saurabh Gupta, Jim Rogers, Don Maxwell, Paolo Rech,
Sudharshan Vazhkudai, Daniel Oliveira, Dave Londo, Nathan
Debardeleben, Philippe Navaux, Luigi Carro, Arthur Buddy Bland,
"Understanding GPU Errors on Large-scale HPC Systems and the
Implications for System Design and Operation", <i>21st IEEE Symp.
on High Performance Computer Architecture (HPCA)</i>, San
Francisco, California, February 2015.
<a href="http://users.nccs.gov/~vazhkuda/hpca.pdf">pdf</a>
<li>
Devesh Tiwari, Saurabh Gupta, Sudharshan S. Vazhkudai, "Lazy
Checkpointing: Exploiting Temporal Locality in Failures to
Mitigate Checkpointing Overheads on Extreme-Scale Systems",
<i>Proceedings of the 44th Annual IEEE/IFIP Int'l Conference on
Dependable Systems and Networks (DSN 2014)</i>, Atlanta, Georgia,
June 2014.
<a href="http://users.nccs.gov/~vazhkuda/DSN2014.pdf">pdf</a>
</ul>
<p> <a class="right" href="#">Top</a>
%include('content_close.inc')
<!-- ========================================================== -->
%include('content_intro.inc')
<h2 id="Congestion">Interconnect Congestion Analysis</h2>
<p>High-bandwidth file-systems are critical to today’s
super-computing applications. To achieve the level of performance
necessary for leadership class of applications, the underlying
network must facilitate high aggregate-bandwidth demands.
Unfortunately, in such a large-scale network, congestion at
routers leads to limited overall I/O performance and high
variability.
<p>In this effort, our goal was to identify and understand
bottlenecks in the interconnection-network pertaining to the file
system I/O traffic. This work involved analyzing the impact of job
placement, router placement on performance, and studying how these
configurations play a role in reducing congestion in the
interconnection-network. During the course of this investigation
we sought to:
<ol class="numbered">
<li class="numbered">Understand the dynamic behavior of the system via
careful experimentation and investigation, and identify where
bottlenecks occur,</li>
<li>Develop simulated models of the network for an in-depth study
of network dynamics in presence of real-life workload scenarios,</li>
<li>Identify heuristics or layouts that improve upon the existing
topology, and
<li>Come up with optimal I/O router layouts for current and
several other network topologies for future systems.
</ol>
<p>
As a result of this research, we developed and deployed a
lightweight, scalable mechanism to monitor the router traffic on
Titan’s interconnect fabric (9600 routers). The tool is used to
analyze interference among jobs.
<p>Contributors: Saurabh Gupta, Chris Zimmer, and Sudharshan Vazhkudai
<p>Status: On-going
<p> <a class="right" href="#">Top</a>
%include('content_close.inc')
%include('html_close.inc')