Intersect360 Research White Paper: NEW AMD CPUs and GPUs CONTINUE
←
→
Page content transcription
If your browser does not render page correctly, please read the page content below
Intersect360 Research White Paper: NEW AMD CPUs and GPUs CONTINUE MOMENTUM INTO HPC FUTURE EXECUTIVE SUMMARY The HPC industry is in the midst of an era of expansion. Intersect360 Research studies have shown the challenge that organizations face is the need to serve not only their ravenous technical computing applications, but also the adopted appetites of data science and machine learning. As this trend has progressed, it has led to a steady, corresponding shift in computational architecture for HPC. Today, HPC relies on new concepts of scalability. Most new HPC deployments are heterogeneous; in addition to the CPUs that run the system, they are powered by complementary co-processors, usually in the form of GPUs. The fields of machine learning and data science, combined with the attendant computational power of GPUs, offer tremendous upside for the application of HPC to an ever-increasing array of endeavors. Among all the HPC processing options to emerge in the past few years—and there have been many—the ones with the most momentum come from Advanced Micro Devices (AMD). AMD has a nearly monomaniacal focus on performance, and the company has impressively maintained consistent messaging company-wide, across generations of development. AMD has performed benchmark testing in which its EPYC™ processors outperform comparable Intel Xeon® parts on commonly-used HPC applications. As a result of this dedicated effort, AMD perception and adoption are on the rise among HPC users. The future of supercomputing is solidly heterogeneous, incorporating both CPUs and GPUs. AMD serves HPC not only with AMD EPYC CPUs, but also with AMD Instinct™ accelerators, and these are predestined to work well together. Upcoming generations of EPYC and Instinct will be integrated with AMD’s high-speed Infinity architecture, boosting inter-processor communication between CPU and GPU. AMD is addressing heterogeneous programming in two ways. First, with coherent memory, programming is more straightforward, with fewer lines of code or custom calls. Second, AMD supports an open “ROCm™” (pronounced “rock ’em”) programming and software environment, making applications more portable to future generations of processors, regardless of where they come from. By bringing together new technologies and new workloads, AMD provides a compelling vision of scalability into the future of HPC. © 2021 Intersect360 Research. White paper sponsored by AMD. Full neutrality statement at https://www.intersect360.com/features/neutrality-statement. P.O. Box 60296 | Sunnyvale, CA 94088 | Tel. [888] 256-0124 www.Intersect360.com | info@Intersect360.com
MARKET DYNAMICS: THE NEW SCALABILITY
One basic characteristic of High Performance Computing (HPC) is that it doesn’t hold still. No
amount of scientific discovery, product enhancement, or engineering achievement is final; it Data science and
is merely the next evolutionary step in ongoing advancement. Every solution unveils the next machine
question, and the tools of HPC must also improve and evolve to solve each new generation of learning
challenges. techniques can
augment
Due to the nature of this ongoing expansion, HPC has always included notions of scalability.
preexisting
Invariably, scalability was tied to notions of “more”: more processors, more computations,
computational
more bandwidth. But thanks to added complexity due to a confluence of trends, scalability in
HPC is more complicated than ever, and new insights and discoveries are increasingly linked
methods,
to combinations of factors. unlocking new
computing
The New Scalability: Applications capabilities.
The HPC industry is in the midst of an era of expansion beyond its everyday, relentless Applications in
pursuit of knowledge. The last decade has seen the introduction of successive, related uber- financial
trends: the first, big data, which established data science and analytics as high-performance
services,
enterprise workloads; the second, artificial intelligence (AI), which popularized machine
manufacturing,
learning as an alternate, experiential approach to computing. Both are enabled by the
oil exploration,
creation and accessibility of vast amounts of data to complement raw computation.
and bio-sciences
As they evolve, data science and machine learning need not be independent from traditional are all data-
HPC. These techniques can augment preexisting computational methods, unlocking new intensive and
computing capabilities. Applications in financial services, manufacturing, oil exploration, and well-suited to
bio-sciences are all data-intensive and well-suited to the incorporation of machine learning. machine
Intersect360 Research studies have shown the strong majority of HPC-using organizations learning.
have incorporated data science or machine learning workloads into their environments. Note
that this does not imply a corresponding tripling of budgets; far from it, although there has
been an increase. The corresponding challenge that organizations face is the need to serve
not only their ravenous technical computing applications, but also the adopted appetites of
data science and machine learning. As this trend has progressed, it has led to a steady,
corresponding shift in computational architecture for HPC.
The New Scalability: Architectures
As HPC applications have continued to evolve, the systems that power them have had to
continue to evolve as well. Just as no application is complete for all time, no supercomputer
has ever proved powerful enough that it could not eventually be saturated. Over time, HPC
architectures have processed through their own eras to achieve greater scalability for
expanding application sets. Vector processors were replaced by RISC scalar. RISC gave way to
x86 as single-system deployments were replaced by clusters.
© 2021 Intersect360 Research.
White paper sponsored by AMD. Full neutrality statement at https://www.intersect360.com/features/neutrality-statement.
P.O. Box 60296 | Sunnyvale, CA 94088 | Tel. [888] 256-0124
www.Intersect360.com | info@Intersect360.comFor years, cluster deployments were largely homogeneous. They ran x86 processors in dual-
socket, industry-standard servers, over some type of networking fabric. One of the
advantages this hegemony presented was portability. Applications could be migrated from
one cluster to the next without much care to vendor. Components like CPUs, memory, and
network could be upgraded, but the model remained the same.
But innovation isn’t innovation if someone else gets there first. Furthermore, new application
requirements—such as those posed by AI—mean that one single processor or configuration
isn’t always best for every workload. This set of circumstances has conspired to help establish
a new computational norm in HPC: accelerated computing. Today, most new HPC
deployments are heterogeneous; in addition to the CPUs that run the system, they are
powered by complementary co-processors, usually in the form of GPUs.
High-power GPUs—graphics processing units—have their roots in gaming and
entertainment. The processing elements that power these high-end visual effects are strong
engines for floating-point performance, and with the right programming tools, they can be
harnessed to boost mathematical performance for scientific computing. As GPUs have
evolved as accelerators, differences have begun to emerge in the needs between HPC and
graphics, demanding further specialization.
The tide of GPU computing was already rising in HPC before AI burst onto the scene. As it
happened, GPUs were well suited to machine learning, for both training and inference. AI
accelerated the adoption of GPUs for HPC, particularly for mixed-workload environments.
And there has been a budgetary effect as well: Over 60% of HPC users are running machine
learning as part of their environments, leading to frequent increases in budgets for high-
performance workloads—in some cases more than doubling (see chart below).1
The New Scalability: Challenges
The fields of machine learning and data science, combined with the attendant computational
power of GPUs, offer tremendous upside for the application of HPC to an ever-increasing
array of endeavors. All that’s left to worry about is the software.
Application development and optimization are perpetual challenges in HPC. Not only does
the developer need to create algorithms for simulating complex phenomena, but the
resulting programs have to be run by vast arrays of independent processing elements. Single,
massive problems must be decomposed into digestible computations that can be run quickly,
with interim calculations compared and reassigned in step after step.
Heterogeneous computing—the use of multiple types of processors, such as CPUs together
with GPUs—makes programming more complicated. The programmer needs to specify not
1
Intersect360 Research, HPC User Budget Map survey data, 2020.
© 2021 Intersect360 Research.
White paper sponsored by AMD. Full neutrality statement at https://www.intersect360.com/features/neutrality-statement.
P.O. Box 60296 | Sunnyvale, CA 94088 | Tel. [888] 256-0124
www.Intersect360.com | info@Intersect360.comonly how to decompose a problem across multiple nodes, but also which portions of work to
assign to each type of element.
In the past, proprietary programming languages have been a hindrance for custom co-
processors. Recently, programming for GPUs has become more widespread, with more
accessible languages and libraries; however, the most commonly used tools are still
proprietary to particular brands of GPUs.
The net result is that HPC has, for the time being, returned to an era of specialization. Great
innovations are possible, but this has come at the expense of portability; application
advancements have been tied to particular vendors’ product lines.
When the HPC landscape was dominated by Intel x86 processors with few options, this
wasn’t a particular concern. Today, with a diversity of options between CPUs, GPUs, and In a 2016
other components, users are less certain in committing to one vendor only. Portability is still Intersect360
an issue, as software investment protection remains important in the continual updating of Research study,
capabilities. Furthermore, as machine learning and data science beget new applications, 36% of HPC
there is an interest in securing a bridge to the future. users said they
had a favorable
Effect on High-Performance Workload Budget Related to Incorporation of Machine Learning forward-looking
Intersect360 Research, 2021
impression of
AMD CPUs. In a
similar study in
2020, that
percentage had
risen to 78%.
INTERSECT360 RESEARCH ANALYSIS
An “EPYC” Comeback for AMD
Among all the HPC processing options to emerge in the past few years—and there have
been many—the ones with the most momentum come from Advanced Micro Devices
(AMD), a company on a major upward trend. AMD gained attention in HPC when it
announced its first generation of EPYC x86 processors. With the second generation, AMD
© 2021 Intersect360 Research.
White paper sponsored by AMD. Full neutrality statement at https://www.intersect360.com/features/neutrality-statement.
P.O. Box 60296 | Sunnyvale, CA 94088 | Tel. [888] 256-0124
www.Intersect360.com | info@Intersect360.comgained notoriety, becoming the first microprocessor vendor going to market with a 7-
nanometer (7nm) manufacturing process. Now, with the launch of third-generation AMD
EPYC processors, AMD has continued its monomaniacal focus on winning the performance
race.
The AMD “Zen 3” core architecture at the heart of the EPYC 7003 CPU delivers a range of
optimizations, resulting in up to 19% improvement in operations per cycle, according to AMD.
This performance boost comes from a combination of enhancements over the “Zen 2” core
architecture, such as increased bandwidth for both loads and store, and improved latency for
certain calculations. (See chart.)
AMD “Zen 3” Architecture Enhancements vs. “Zen 2”2
Source: AMD, 2021
The AMD EPYC 7003 enhancements go beyond the architecture performance. The complete
SOC (system on chip) has enhanced memory and cache functionality and additional security
features, while maintaining socket compatibility with previous AMD EPYC versions. In
particular, the L3 cache is integrated into a single, large 32GB reservoir, rather than two
smaller ones, allowing full cache allocation to any single core that may need it, thereby
benefiting applications with databases that may fit into the single, larger cache.
In one other subtle improvement, the AMD Infinity Fabric™ clock is now synchronous with
DRAM memory. This helps reduce latency in waiting for data, resulting in an improvement for
memory-sensitive applications, which are common in HPC.
These enhancements add up to real-world results on applications at the heart of HPC. AMD
has performed benchmark testing in which its EPYC processors outperform comparable Intel
2
AMD claim, based on AMD internal testing as of February 1, 2021, average performance improvement at ISO-frequency on an AMD
EPYC™ 72F3 (8C/8T, 3.7 GHz), compared to an AMD EPYC™ 7F32 (8C/8T, 3.7 GHz), per-core, single thread, using a select set of
workloads including estimated SPECrate®2017_int_base, SPECrate®2017_fp_base, and representative server workloads.
© 2021 Intersect360 Research.
White paper sponsored by AMD. Full neutrality statement at https://www.intersect360.com/features/neutrality-statement.
P.O. Box 60296 | Sunnyvale, CA 94088 | Tel. [888] 256-0124
www.Intersect360.com | info@Intersect360.comXeon processors on commonly-used HPC applications, with average improvements ranging
from 43% on crash-test simulations up to 99% for computational fluid dynamics. (See chart.)
AMD EPYC 7003 Comparative Benchmark Results on HPC Applications3
2x AMD EPYC™ 75F3 (32 core) vs. 2x Intel® Xeon® Gold 6258R (28 core),
Average Performance Across Representative Workloads
Source: AMD, 2021
As a result of this dedicated effort, AMD perception is on the rise among HPC users. In a 2016
Intersect360 Research study, 36% of HPC users said they had a favorable forward-looking
impression of AMD CPUs. In a similar study in 2020, that percentage had risen to 78%— AMD CPUs have
higher even than the percentage of users with a favorable future impression of Intel CPUs a presence today
(69%). (See chart below.) in 70% of HPC
sites, an
And with AMD now shipping its third-generation of EPYC CPUs, this positive sentiment is
astonishing
beginning to show up in real customer deployments. AMD was named as the processor
transition from
vendor in only 5% of surveyed HPC systems in 2017 and 2018 combined.4 Today, 23% of HPC
three years ago.
users say they have AMD EPYC processors in widespread use. An additional 47% are testing
3
AMD internal testing. All tests compare 2x EPYC™ 75F3 (32C) to 2x Intel® Xeon® Gold 6258R (28C) processors. WRF version 4.1.5
comparison based on testing completed on 2/17/2021 on an AMD reference platform compared to an Intel server on a production
system. ANSYS® CFX® 2021.1 comparison based on testing as of 2/5/2021 measuring the time to run the Release 14.0 test case
simulations (converted to jobs/day – higher is better). ANSYS® LS-DYNA® version 2021.1 comparison based on testing as of 2/5/2021
measuring the time to run neon, 3cars, PPT-short, odb10m-short, and car2car test case simulations (converted to jobs/day – higher is
better); results were AMD 17,555 total seconds versus Intel 28,774 total seconds, for ~81.0% more per node or ~59% more per core
performance advantage; the 3cars test case gain individually was ~126% more per node or ~98% more per core jobs/day performance.
ESI Virtual Performance Solution (VPS better known as PAM-CRASH®) version 2020.0 comparison based on testing as of 2/5/2021
measuring the neon test case simulation (converted to jobs/day – higher is better) for ~43% more per node or ~25% more per core
jobs/day performance. Star-CCM+ 2020.3 comparison based on testing as of 2/5/2021 measuring the average seconds to complete 11
test cases and converted to jobs/day (higher is better); the KCS Marine Hull with No Rudder in Fine Waves test case individually was
~79% more per node or ~57% per core performance. Results may vary.
4
Intersect360 Research HPC User Site Census surveys across 2017 and 2018, total proportion of systems for which AMD was identified as
CPU provider, including half-system credit for systems in which AMD was a shared CPU provider.
© 2021 Intersect360 Research.
White paper sponsored by AMD. Full neutrality statement at https://www.intersect360.com/features/neutrality-statement.
P.O. Box 60296 | Sunnyvale, CA 94088 | Tel. [888] 256-0124
www.Intersect360.com | info@Intersect360.comor using AMD EPYC at some level, giving AMD CPUs a presence in 70% of HPC sites, an
astonishing transition from three years ago. (See charts below.)
Percent of HPC Users with Favorable Forward-Looking Impressions of CPUs 5
Source: Intersect360 Research, 2021
Current Penetration of AMD CPUs Among Surveyed HPC Sites 6
Source: Intersect360 Research, 2021
23% of HPC sites
report “broad usage”
of AMD CPUs
70% of HPC sites have at
least some AMD CPUs
5
Intersect360 Research data from multiple studies. 2016: Special study: “Processing Elements for HPC”; question, “Overall, how favorable
is your forward-looking impression of each of the following, with respect to your HPC workloads? (1 = Completely unfavorable; 5 =
completely favorable)”; scores are combined percentage 4 and 5 for AMD Opteron versus Intel Xeon. 2020: “Vendor Satisfaction and
Loyalty in HPC”; question, “What is your impression of each of the following vendors' future prospects for HPC?” (five-point scale); scores
are combined top-two responses (Very Impressed; Impressed) for AMD EPYC CPUs versus Intel Xeon CPUs.
6
Intersect360 Research HPC Technology Survey, 2021.
© 2021 Intersect360 Research.
White paper sponsored by AMD. Full neutrality statement at https://www.intersect360.com/features/neutrality-statement.
P.O. Box 60296 | Sunnyvale, CA 94088 | Tel. [888] 256-0124
www.Intersect360.com | info@Intersect360.comAMD’s momentum shows up in more than user surveys. The U.S. Department of Energy
(DOE) has selected AMD as the processor vendor for the pending Frontier and El Capitan
supercomputers. These will each be among the world’s earliest “Exascale”-class systems, with
peak rated speeds of over an Exaflop: one quintillion calculations per second7 for scientific,
64-bit calculations. Frontier, when it arrives, is expected to be the first Exascale
supercomputer in the U.S.,8 perhaps in the world, and El Capitan is projected to be the
world’s first supercomputer at 2 Exaflops peak performance.9
Marrying CPU and GPU
It wasn’t the EPYC CPU alone that attracted the DOE to AMD for Frontier and El Capitan. As
GPU-accelerated applications have become more common, few HPC users want to give up
their acceleration. The future of supercomputing is solidly heterogeneous, incorporating both
CPUs and GPUs for HPC, data science, and machine learning.
For the past ten years, Intel has been the dominant provider of CPUs, and the dominant AMD Instinct
provider of computational GPUs has been NVIDIA. While this situation has worked well MI100 is the
enough thus far, the polarization it presents is worrisome. NVIDIA and Intel compete more first GPU with
than they cooperate, both in processing and in networking, and programming for one over 10
company’s solutions does not translate to the other’s. Teraflops of
Enter AMD, again. AMD serves HPC not only with its EPYC CPUs, but also with AMD Instinct performance:
GPUs, and these are predestined to work well together. In fact, the upcoming generations of 11.5 Teraflops
EPYC and Instinct will be integrated with AMD’s high-speed Infinity architecture, boosting of peak 64-bit
inter-processor communication between CPU and GPU. performance.
Since they work in concert over a high-bandwidth, low-latency connection, AMD will be first
to market with a feature not previously seen for heterogeneous architectures: coherent
memory. With a single, coherent memory space across an integrated chipset, programmers
can assign individual calculations, subroutines, or loops without moving data. “Pass the
pointer, not the data” has been a mantra for advocates of coherent memory; now this
concept is applicable to heterogeneous computing as well.
AMD Instinct MI100 Accelerator
Reaching Exascale levels of performance doesn’t happen merely by ganging more elements
together; the individual elements need to get faster as well. AMD has advanced the CPU
components with its EPYC processor line, and in November 2020, in conjunction with the
annual SC conference10 for the worldwide supercomputing community, AMD announced its
latest GPU offering for HPC, the AMD Instinct MI100.
7
One quintillion = one billion billion = one million million million = 1018 = 1,000,000,000,000,000,000.
8
https://www.hpcwire.com/2020/10/01/auroras-troubles-move-frontier-into-pole-exascale-position/.
9
https://www.amd.com/en/products/exascale-era.
10
http://supercomputing.org/index.php.
© 2021 Intersect360 Research.
White paper sponsored by AMD. Full neutrality statement at https://www.intersect360.com/features/neutrality-statement.
P.O. Box 60296 | Sunnyvale, CA 94088 | Tel. [888] 256-0124
www.Intersect360.com | info@Intersect360.comThe MI100 brings AMD’s focus on HPC performance to its GPU line. AMD is promoting MI100 “Pass the
as the first GPU with over 10 Teraflops of performance, with 11.5 Teraflops of peak 64-bit pointer, not the
performance.11 data” has been a
The MI100 is the first GPU based on the new AMD CDNA™ (Compute DNA) architecture, mantra for
which separates GPU designs between those focused on computation (AMD CDNA) and advocates of
those focused on graphics (AMD RDNA, for Radeon™ DNA). The AMD CDNA architecture coherent
offers a new core design with double the computational efficiency of previous AMD GPUs,12 memory; now
along with “Matrix Core Technology” that targets acceleration for HPC and AI applications. The this concept is
MI100 with AMD CDNA has 32GB of HBM2 memory, with over 1.2 Terabytes per second of applicable to
memory throughput,13 and the second-generation Infinity architecture, with up to a 37% heterogeneous
speed-up of GPU-to-GPU communication over the first generation.14 computing as
As much as it targets HPC as a primary market, MI100 does not ignore AI, which is an well.
integrated part of HPC environments. The AMD CDNA architecture supports mixed-precision
workloads, with FP32 and FP16 matrix, bfloat16, INT8, and INT4 for machine learning. AMD
states MI100 is nearly seven times faster than its previous Radeon GPUs for FP16 workloads
for AI.15
Programming for the Future
All the computing power in the world doesn’t help if you can’t program for it. AMD is
addressing this in two ways. First, with coherent memory, programming is more
straightforward, with fewer lines of code or custom calls. [See diagrams below.] This is useful
for the ongoing development of new applications that take advantage of heterogeneous
computing. Second, AMD supports programming models that are open, rather than
11
AMD claim: Calculations conducted by AMD Performance Labs as of September 18, 2020 for the AMD Instinct™ MI100 (32GB HBM2
PCIe® card) accelerator at 1,502 MHz peak boost engine clock resulted in 11.54 TFLOPS peak double precision (FP64), 46.1 TFLOPS
peak single precision matrix (FP32), 23.1 TFLOPS peak single precision (FP32), 184.6 TFLOPS peak half precision (FP16) peak theoretical,
floating-point performance. Published results on the NVidia Ampere A100 (40GB) GPU accelerator resulted in 9.7 TFLOPS peak double
precision (FP64). 19.5 TFLOPS peak single precision (FP32), 78 TFLOPS peak half precision (FP16) theoretical, floating-point
performance. Server manufacturers may vary configuration offerings yielding different results. MI100-03.
12
AMD claim: AMD Instinct™ MI100 accelerators provide 120 compute units and 7,680 stream cores in a 300W accelerator card. Radeon
Instinct™ MI50 accelerators provide 60 compute units (CUs) and 3,840 stream cores in a 300W accelerator card. MI100-09.
13
AMD claim: Calculations by AMD Performance Labs as of Oct 5th, 2020 for the AMD Instinct™ MI100 accelerator designed with AMD
AMD CDNA 7nm FinFET process technology at 1,200 MHz peak memory clock resulted in 1.2288 TFLOPS peak theoretical memory
bandwidth performance. The results calculated for Radeon Instinct™ MI50 GPU designed with “Vega” 7nm FinFET process technology
with 1,000 MHz peak memory clock resulted in 1.024 TFLOPS peak theoretical memory bandwidth performance. AMD CDNA-04.
14
AMD claim: Calculations as of SEP 18th, 2020. AMD Instinct™ MI100 accelerators support PCIe® Gen4 providing up to 64 GB/s peak
theoretical transport data bandwidth from CPU to GPU per card. AMD Instinct™ MI100 accelerators include three Infinity Fabric™ links
providing up to 276 GB/s peak theoretical GPU to GPU or Peer-to-Peer (P2P) transport rate bandwidth performance per GPU card.
Combined with PCIe Gen4 support, this provides an aggregate GPU card I/O peak bandwidth of up to 340 GB/s. Server manufacturers
may vary configuration offerings yielding different results. MI100-06.
15
AMD claim: Calculations performed by AMD Performance Labs as of September 18, 2020 for the AMD Instinct™ MI100 accelerator at
1,502 MHz peak boost engine clock resulted in 184.57 TFLOPS peak theoretical half precision (FP16) and 46.14 TFLOPS peak theoretical
single precision (FP32 Matrix) floating-point performance. The results calculated for Radeon Instinct™ MI50 GPU at 1,725 MHz peak
engine clock resulted in 26.5 TFLOPS peak theoretical half precision (FP16) and 13.25 TFLOPS peak theoretical single precision (FP32
Matrix) floating-point performance. Server manufacturers may vary configuration offerings yielding different results. MI100-04.
© 2021 Intersect360 Research.
White paper sponsored by AMD. Full neutrality statement at https://www.intersect360.com/features/neutrality-statement.
P.O. Box 60296 | Sunnyvale, CA 94088 | Tel. [888] 256-0124
www.Intersect360.com | info@Intersect360.comproprietary, making them more portable to future generations of processors, regardless of
where they come from.
Programming Heterogeneous Computing with Coherent Memory
Source: AMD, 2020
AMD Commitment to Open-Source Development for Heterogeneous Computing
Source: AMD, 2020
To achieve this, AMD is putting forward its own open-development environment, AMD ROCm
(pronounced “Rock ’em”). ROCm is the critical piece that will determine software
performance and scalability in HPC environments. (See chart below.)
© 2021 Intersect360 Research.
White paper sponsored by AMD. Full neutrality statement at https://www.intersect360.com/features/neutrality-statement.
P.O. Box 60296 | Sunnyvale, CA 94088 | Tel. [888] 256-0124
www.Intersect360.com | info@Intersect360.comAnd what about applications that already exist? Many users have already spent years porting
and optimizing applications with NVIDIA’s CUDA tools. Here AMD has a potentially critical
solution: HIP (Heterogeneous-Compute Interface for Portability), which is a programming
model that enables applications to run on both AMD and NVIDIA hardware with the same
code base. HIP provides an abstraction layer to call the optimized code, compiled for each
architecture. Through a pair of “HIPify” porting tools, AMD offers automated conversion of
applications written to run only on NVIDIA CUDA to be compatible with open AMD HIP. AMD
cites examples of over 90% to 95% of code converting automatically, with no user
intervention.16
Features of the AMD ROCm Software Environment
Source: AMD, 2021
Recognition
Recommend
ation Engine
Data Security
Processing
Language
Govt Labs
Academic
& Detection
Research
Oil & Gas
Sciences
Image
Life
◢ Complete set of libraries, tools and
HPC Industry Solutions management APIs
ROCmTM Software Stack ◢ Open-source compiler for OpenMP and HIP
DC Tools, Dev Tools, Comm & Math Libraries, Compilers
Validated, Optimized Systems ◢ Easy, automated tools to convert CUDA code
to HIP with virtually no performance loss
◢ Scales from Workstation to Cloud to Exascale
◢ Open ecosystem for 3rd party development
Here the partnership with DOE will pay dividends for AMD as well. In a 2019 vendor profile
of AMD, Intersect360 Research wrote: “AMD need not wait for the installation of Frontier to
begin reaping the benefits of its affiliation with the DOE labs. … DOE researchers now have a
vested interest in seeing their scientific research applications run well, at scale, on AMD
architectures. Furthermore, the DOE is committed to open science; researchers will have an
incentive to push any porting or optimization to the broader scientific community.” 17
Bringing It All Together
HPC is continuing to expand and diversify in ways that are both exciting and frightening. New
workloads, new architectures, and new frontiers of scalability will lead to innovations and
discoveries beyond yesterday’s conception. But with expanding possibility comes the added
challenge of harnessing all that power.
AMD is back in the HPC game, and in many ways, AMD is back in the lead. Winning on
price/performance involves innovation on two fronts: maximizing raw performance and
16
https://www.admin-magazine.com/HPC/Articles/Porting-CUDA-to-HIP.
17
Intersect360 Research, Vendor Overview and Outlook: AMD in HPC, November 2019.
© 2021 Intersect360 Research.
White paper sponsored by AMD. Full neutrality statement at https://www.intersect360.com/features/neutrality-statement.
P.O. Box 60296 | Sunnyvale, CA 94088 | Tel. [888] 256-0124
www.Intersect360.com | info@Intersect360.comoptimizing efficiency. AMD is targeting both. AMD has both CPU and GPU, each aiming for HPC supremacy, over its own integrated fabric. Additionally, AMD has an open software environment that not only protects existing investments with automated conversion tools but also promotes open software development in the future. And AMD has a critical relationship with the DOE that will provide support and stability for a new generation of supercomputing. Incorporating CPU, GPU, and software into a consolidated, open-community environment is something no other company has done yet. By bringing together new technologies and new workloads, AMD provides a compelling vision of scalability into the future of HPC. For more information about AMD solutions for HPC, visit www.AMD.com/HPC. AMD, the AMD logo, EPYC, Infinity, AMD Instinct, ROCm, Radeon, AMD CDNA, AMD RDNA and combinations thereof are trademarks of Advanced Micro Devices, Inc. © 2021 Intersect360 Research. White paper sponsored by AMD. Full neutrality statement at https://www.intersect360.com/features/neutrality-statement. P.O. Box 60296 | Sunnyvale, CA 94088 | Tel. [888] 256-0124 www.Intersect360.com | info@Intersect360.com
You can also read