The next BriefingsDirect Voice of the Customer discussion explores a program to expand the variety of CPUs that support supercomputer- and artificial intelligence (AI)-intensive workloads.
The Catalyst program in the UK is seeding the advancement of the ARM CPU architecture for high performance computing (HPC) as well as establishing a vibrant software ecosystem around it.
Stay with us to learn about unlocking new choices and innovation for the next generations of
supercomputing with Dr.
Eng Lim Goh, Vice President and Chief Technology Officer for HPC and AI at Hewlett Packard Enterprise (HPE), and Professor
Mark Parsons, Director of the Edinburgh
Parallel Computing Centre (EPCC) at the University
of Edinburgh. The discussion is moderated by Dana Gardner, Principal Analyst at Interarbor Solutions.
Here are some excerpts:
Gardner: Mark,
why is there a need now for more variety of CPU architectures for
such use cases as HPC, AI, and supercomputing?
Parsons: In
some ways this discussion is a bit odd because we have had huge variety over
the years in supercomputing with regard to processors. It’s really only the last
five to eight years that we’ve ended up with the majority of supercomputers
being built from the Intel x86
architecture.
It’s always good in
supercomputing to be on the leading edge of technology and getting more variety
in the processor is really important. It is interesting to seek different processor
designs for better performance for AI or supercomputing workloads. We want the
best type of processors for what we want to do today.
Gardner: What
is the Catalyst program? Why did it come about? And how does it help address
those issues?
Parsons: The
Catalyst UK program is jointly funded by a number of large companies and three
universities: The University of Bristol,
the University of Leicester, and the University
of Edinburgh. It is UK-focused because Arm Holdings is based in
the UK, and there is a long history in the UK of exploring new processor
technologies.
Through Catalyst,
each of the three universities hosts a 4,000-core ARM processor-based system. We
are running them as services. At my university, for example, we now have a
number of my staff using this system. But we also have external academics using
it, and we are gradually opening it up to other users.
Catalyst for change in processors
We want as many people as
possible to understand how difficult it will be to port their code to ARM. Or, rather
-- as we will explore in this podcast -- how easy it is.
You only learn by breaking
stuff, right? And so, we are going to learn which bits of the software tool
chain, for example, need some work. [Such porting is necessary] because ARM predominantly
sat in the mobile phone world until recently. The supercomputing and AI world
is a different space for the ARM processor to be operating in.
Gardner: Eng Lim, why is this program
of interest
to HPE? How will it help create new opportunity and performance benchmarks
for such uses as AI?
Goh: Mark
makes a number of very strong points. First and foremost, we are very keen as a
company to broaden the reach of HPC among our customers. If you look at our customer base, a large portion of it comes from commercial HPC sites: retailers, banks, and others across the financial industry. Helping them reach new types of HPC is important, and a variety of offerings makes that easier.
The second thing is the recent
reemergence of more AI applications, which also broadens the user base. There is
also a need for greater specialization in certain areas of processor capabilities.
We believe in this case, the ARM processor -- given the fact that it enables different companies to build innovative variations of the processor -- will provide a rich set of new options in the area of AI.
Gardner: What
is it, Mark, about the ARM architecture and specifically the Marvell ThunderX2 ARM
processor that is so attractive for these types of AI workloads?
Expanding memory for the future
Parsons: It’s absolutely
the case that all numerical computing -- AI, supercomputing, and desktop
technical computing -- is controlled by memory bandwidth. This is about getting
data to the processor so the processor core can act on it.
What we see in the ThunderX2
now, as well as in future iterations of this processor, is the strong memory bandwidth
capabilities. What people don’t realize is that, a vast amount of the time, processor cores are just waiting for data. The faster you get the data to the processor, the more compute you are going to get out of that processor. That’s one particular area where the ARM architecture is very strong.
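To put the bandwidth point in concrete terms, here is a minimal roofline-style sketch: attainable throughput is the lesser of peak compute and memory bandwidth multiplied by a kernel's arithmetic intensity. The peak and bandwidth numbers below are illustrative assumptions, not ThunderX2 specifications.

```python
# Minimal roofline-style estimate: attainable performance is capped by
# min(peak compute, memory bandwidth x arithmetic intensity).
# The peak and bandwidth figures are illustrative assumptions, not
# ThunderX2 (or any vendor's) specifications.

def attainable_gflops(peak_gflops, bandwidth_gbs, flops_per_byte):
    """Roofline bound for a kernel with the given arithmetic intensity."""
    return min(peak_gflops, bandwidth_gbs * flops_per_byte)

PEAK_GFLOPS = 1000.0    # assumed per-socket peak, GFLOP/s
BANDWIDTH_GBS = 250.0   # assumed sustained memory bandwidth, GB/s

# A stream-like vector update does ~0.1 FLOP per byte moved, so it is
# bandwidth-bound; a dense matrix multiply at ~10 FLOP/byte is compute-bound.
for label, intensity in [("vector update", 0.1), ("dense matmul", 10.0)]:
    bound = attainable_gflops(PEAK_GFLOPS, BANDWIDTH_GBS, intensity)
    print(f"{label:>13}: {bound:7.1f} GFLOP/s attainable")
```

With these assumed figures, the low-intensity kernel is limited to a small fraction of peak by the memory system, which is the "cores waiting for data" effect Parsons describes.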
Goh: Indeed,
memory bandwidth is the key. Not only in supercomputing applications, but especially
in machine learning
(ML) where the machine is in the early phases of learning, before it does a
prediction or makes an inference.
It has to go through the process of learning, and this learning is a highly data-intensive process. The machine has to consume massive amounts of historical data and examples in order to tune itself into a model that can make good predictions. So, memory bandwidth is of utmost importance in the training phase of ML systems.
And related to this is the fact
that the ARM processor’s core intellectual property is available to many
companies to innovate around. More companies therefore recognize they can leverage
that intellectual property and build high-memory bandwidth innovations around
it. They can come up with a new processor. Such an ability to allow different
companies to innovate is very valuable.
Gardner: Eng
Lim, does this fit in with the larger HPE drive toward memory-intensive
computing in general? Does the ARM processor fit into a larger HPE strategy?
Goh: Absolutely. The ARM processor, together with the other processors, provides choice and options for HPE’s strategy of being edge-centric, cloud-enabled, and data-driven.
Across that strategy, the commonality is data movement. And as such, the ARM model of allowing different companies to come in and innovate will produce processors that meet the needs of all these various sectors. We see that as highly valuable, and it supports our strategy.
Gardner: Mark,
Arm Holdings controls the intellectual property, but there is a budding ecosystem
both around the processor design and the software that can take advantage of
it. Tell us about that ecosystem and why the Catalyst UK program is
facilitating a more vibrant ecosystem.
The design-to-build ecosystem
Parsons: The
whole Arm story is very, very interesting. This company grew out of home
computing about 30 to 40 years ago. The interesting thing is the way that they
are an intellectual property company, at the end of the day. Arm Holdings
itself doesn’t make processors. It designs processors and sells those designs
to other people to make.
So, we’ve had this wonderful ecosystem of different companies making their own ARM processors or making them for other people. With the wide variety of different ARM processors in mobile phones, for example, there is no surprise that it’s the most common processor in the world today.
Now, people think that x86 processors rule the roost, but
actually they don’t. The most common processor you will find is an ARM
processor. As a result, there is a whole load of development tools that come
both from ARM and also within the developer community that support people who
want to develop code for the processors.
In the context of Catalyst UK,
in talking to Arm, it’s quite clear that many of their tools are designed to
meet their predominant market today, the mobile phone market. As they move into
the higher-end computing space, it’s clear we may find things in the programs
where the compiler isn’t optimized. Certain libraries may be difficult to
compile, and things like that. And this is what excites me about the Catalyst program.
We are getting to play with leading-edge technology and show that it is easy to
use all sorts of interesting stuff with it.
Gardner: And
while the ARM CPU is being purpose-focused for high-intensity workloads, we are
seeing more applications being brought in, too. How does the porting process of
moving apps from x86 to ARM work? How easy or difficult is it? How does the
Catalyst UK program help?
Parsons: All three
of the universities are porting various applications that they commonly use. At
the EPCC,
we run the national HPC service for the UK called ARCHER. As part of that we have run national
[supercomputing] services since 1994, but as part of the ARCHER service, we
decided for the first time to offer many of the common scientific applications
as modules.
You can just ask for the
module that you want to use. Because we saw users compiling their own copies of
code, we had multiple copies, some of them identically compiled, others not
compiled particularly well.
So, we have a model of offering about 40 codes on ARCHER as precompiled modules, which we try to keep up to date and patched, etc. We have 100 staff at EPCC who look after code.
I have asked those staff to get an account on the Catalyst system, take that code across, and spend an afternoon trying to compile it. We already know that some of it just compiles and runs. Other code may have some problems, and it’s those cases that we’re passing on to Arm and HPE, saying, “Look, this is what we found out.”
The important thing is that we
found there are very few programs [with such problems]. Most code is simply
recompiling very, very smoothly.
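A porting exercise like the one Parsons describes can be scripted as a simple smoke test that tries to build each package and records which ones need attention. This is only a sketch; the package names and build commands below are hypothetical placeholders, not the actual ARCHER module list.

```python
# Hypothetical porting smoke test: attempt to build each code and record
# which ones need attention. Package names, directories, and commands are
# placeholders, not the real ARCHER module set.
import subprocess

codes = {
    "examplecode-a": ["make", "-C", "examplecode-a"],
    "examplecode-b": ["make", "-C", "examplecode-b"],
}

results = {}
for name, build_cmd in codes.items():
    try:
        subprocess.run(build_cmd, check=True, capture_output=True, timeout=1800)
        results[name] = "built cleanly"
    except (OSError, subprocess.CalledProcessError, subprocess.TimeoutExpired) as err:
        # The failures are the interesting cases: these are what get reported
        # back to Arm and HPE for compiler or library fixes.
        results[name] = f"needs attention ({type(err).__name__})"

for name, status in results.items():
    print(f"{name}: {status}")
```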
Gardner: How
does HPE support that effort, both in terms of its corporate support but also
with the IT systems themselves?
ARM’s reach
Goh: We
are very keen about the work that Mark and the Catalyst program are doing. As Mark
mentioned, the ARM processor came more from the edge-centric side of our
strategy. In mobile phones, for example.
Now we are very keen to see how far these ARM systems can go. Already we have shipped a large ARM processor-based supercomputer, called Astra, to the US Department of Energy’s Sandia National Laboratories. These efforts are ongoing in the area of HPC applications. We are very keen to see how this processor and the compilers for it work with various HPC applications in the UK and the US.
Gardner: And
as we look to the larger addressable market, with the edge and AI being such high-growth
markets, it strikes me that supercomputing -- something that has been around
for decades -- is not fully mature. We are entering a whole new era of
innovation.
Mark, do you see supercomputing
as in its heyday, sunset years, or perhaps even in its infancy?
Parsons: I
absolutely think that supercomputing is still in its infancy. There are so many
bits in the world around us that we have never even considered trying to model,
simulate, or understand on supercomputers. It’s strange because quite often people
think that supercomputing has solved everything -- and it really hasn’t. I will
give you a direct example of that.
A few years ago, a European project I was running won an award for the highest-accuracy simulation of water flowing through a piece of porous rock. It took over a day on the whole of the national service [to run the simulation]. We won a prize for this, and we only simulated 1 cubic centimeter of rock.
People think supercomputers
can solve massive problems -- and they can, but the universe and the world are
complex. We’ve only scratched the surface of modeling and simulation.
This is an interesting moment
in time for AI and supercomputing. For a lot of data analytics, we have at our fingertips
for the very first time very, very large amounts of data. It’s very rich data
from multiple sources, and supercomputers are getting much better at handling
these large data sources.
The reason the whole AI story
is really hot now, and lots of people are involved, is not actually about the AI
itself. It’s about our ability to move data around and use our data to train AI
algorithms. The link directly into supercomputing is because in our world we
are good at moving large amounts of data around. The synergy now between
supercomputing and AI is not to do with supercomputing or AI -- it is to do with
the data.
Gardner: Eng
Lim, how do you see the evolution of supercomputing? Do you agree with Mark
that we are only scratching the surface?
Top-down and bottom-up data crunching
Goh: Yes, absolutely,
and it’s an early scratch. It’s still very early. I will give you an example.
Solving games is important for developing methods and strategies for cyber defense. Take the most recent game in which machines are beating the best human players: Go. It is much more complex than chess in terms of the number of potential combinations. The number of combinations is actually 10^171, if you comprehensively went through all the different combinations of that game.
You know how big that number is? Well, okay, if we took all the computers in the world together -- all the supercomputers, all of the computers in the data centers of the Internet companies -- put them all together and ran them for 100 years, all you could do is 10^30, which is so very far from 10^171. So, you can see just by this one game example alone that we are very early in that scratch.
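The scale of that gap is easy to verify with back-of-the-envelope arithmetic. The aggregate rate assumed below -- roughly 10^21 operations per second for every computer on Earth combined -- is only an illustrative guess, yet even that generous figure leaves an exhaustive search of Go hopelessly out of reach.

```python
# Back-of-the-envelope check of the Go example. The aggregate throughput of
# "all the world's computers combined" is an illustrative assumption.
import math

AGGREGATE_OPS_PER_SEC = 1e21              # assumed combined rate, ops/s
SECONDS_IN_100_YEARS = 100 * 365.25 * 24 * 3600

ops_in_100_years = AGGREGATE_OPS_PER_SEC * SECONDS_IN_100_YEARS
go_combinations_log10 = 171               # rough size of Go's game space

print(f"Operations in 100 years : ~10^{math.log10(ops_in_100_years):.0f}")
print(f"Go combinations         : ~10^{go_combinations_log10}")
print(f"Shortfall               : ~10^{go_combinations_log10 - math.log10(ops_in_100_years):.0f} fold")
```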
A second group of examples
relates to new ways that supercomputers are being used. From ML to AI, there is
now a new class of applications changing how supercomputers are used. Traditionally,
most supercomputers have been used for simulation. That’s what I call top-down
modeling. You create your model out of physics equations or formulas and then
you run that model on a supercomputer to try and make predictions.
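As a toy illustration of that top-down approach, the following sketch starts from a physics formula -- an object falling under constant gravity -- and steps it forward in time to predict when it hits the ground. Nothing here is learned from data; the model is the equation.

```python
# Toy "top-down" model: the prediction comes from a physics equation
# (constant gravitational acceleration), not from data.
dt, g = 0.01, 9.81             # time step (s) and gravity (m/s^2)
height, velocity = 100.0, 0.0  # dropped from 100 m, starting at rest

t = 0.0
while height > 0.0:
    velocity -= g * dt         # integrate the equation of motion
    height += velocity * dt
    t += dt

print(f"Predicted time to impact: {t:.2f} s")  # analytic answer is ~4.52 s
```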
The new way of making predictions
uses the ML approach. You do not begin with physics. You begin with a blank
model and you keep feeding it data, the outcomes of history and past examples.
You keep feeding data into the model, which is written in such a way that for each
new piece of data that is fed, a new prediction is made. If the accuracy is not
high, you keep tuning the model. Over time -- with thousands, hundreds of thousands, and even millions of examples -- the model gets tuned to make good predictions.
I call this the bottom-up approach.
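By contrast, here is a minimal sketch of the bottom-up approach Goh describes: start with a blank model, feed it example data, and nudge the parameters whenever the prediction is wrong. The data and learning rate are made up purely for illustration.

```python
# Toy "bottom-up" model: a blank linear model is tuned by repeatedly feeding
# it examples and correcting its parameters (plain gradient descent).
# The data and learning rate are made up for illustration only.
examples = [(x, 3.0 * x + 1.0) for x in range(20)]  # hidden rule: y = 3x + 1

w, b, lr = 0.0, 0.0, 0.001       # start from a blank model
for epoch in range(2000):
    for x, y_true in examples:
        y_pred = w * x + b       # make a prediction
        error = y_pred - y_true  # compare against the known outcome
        w -= lr * error * x      # tune the model a little toward the example
        b -= lr * error

print(f"Learned model: y = {w:.2f}x + {b:.2f}")  # converges toward y = 3x + 1
```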
Now we have people applying
both approaches. Supercomputers used traditionally in a top-down simulation are
also employing the bottom-up ML approach. They can work in tandem to make
better and faster predictions.
Supercomputers are therefore now
being employed for a new class of applications in combination with the traditional
or gold-standard simulations.
Gardner: Mark,
are we also seeing a democratization of supercomputing? Can we extend these
applications and uses? Is what’s happening now decreasing the cost, increasing
the value, and therefore opening these systems up to more types of uses and
more problem-solving?
Cloud clears the way for easy access
Parsons: Cloud
computing is having a big impact on everything that we do, to be quite honest. We
have all of our photos in the cloud, our music in the cloud, et cetera. That’s
why EPCC last year got rid of its file server. All of the data for running the actual organization is in the cloud.
The cloud model is great
inasmuch as it allows people who don’t want to operate and run a large system
100 percent of the time the ability to access these technologies in ways they have
never been able to do before.
The other side of that is that there are fantastic software frameworks now that didn’t exist even five years ago for doing AI. There is so much open source for doing simulations.
It doesn’t mean that an
organization like EPCC, which is a supercomputing center, will stop hosting
large systems. We are still great aggregators of demand. We will still have the
largest computers. But it does mean that, for the first time through the
various cloud providers, any company, any small research group and university,
has access to the right level of resources that they need in a cost-effective
way.
Gardner: Eng
Lim, do you have anything more to offer on the value and economics of HPC? Does
paying based on use rather than a capital expenditure change the game?
More choices, more innovation
Goh: Oh,
great question. There are some applications and institutions with processes that work very well with a cloud, and there are some applications and processes that don’t. That’s part of the reason why you embrace both. And, in fact, we at HPE embrace the cloud, and we also build on-premises solutions for our customers, like the one at the Catalyst UK program.
We also have something that is
a mix of the two. We call that HPE GreenLake,
which is the ability for us to acquire the system the customer needs, while the customer pays per use. This is a software-defined experience with consumption-based economics.
These are some of the options
we put together to allow choice for our customers, because there is a variation
of needs and processes. Some are more CAPEX-oriented in the way they acquire resources, and others are more OPEX-oriented.
Gardner: Do
you have examples of where some of the fruits of Catalyst, and some of the
benefits of the ecosystem approach, have led to applications, use cases, and
demonstrated innovation?
Parsons: What
we are trying to do is show how easy ARM is to use. We have taken some really powerful, important codes that run every day on our big national services and have simply moved them across to ARM. Users don’t really understand, or don’t need to understand, that they are running on a different system. It’s that boring.
We have picked up one or two problems with code that probably exist in the x86 version, but because you are running a new processor, they are exposed more readily, and we are fixing them. But in
general -- and this is absolutely the wrong message for an interview -- we are
proceeding in a very boring way. The reason I say that is, it’s really
important that this is boring, because if we don’t show this is easy, people
won’t put ARM on their next procurement list. They will think that it’s too
difficult, that it’s going to be too much trouble to move codes across.
One of the aims of Catalyst,
and I am joking, is definitely to be boring. And I think at this point in time
we are succeeding.
More interestingly, though, another
aim of Catalyst is about storage. The ARM systems around the world today still tend
to do storage on x86. The storage will be running on Lustre or BeeGFS servers, all sitting on x86 boxes.
We have made a decision to do
everything on ARM, if we can. At the moment, we are looking at different storage software on ARM servers. We are looking at Ceph, at Lustre, and at BeeGFS, because unless you have that ecosystem running on ARM as well, people won’t think it’s as pervasive a solution as x86, or Power, or whatever.
The benefit of being boring
Goh: Yes,
in this case boring is good. Seamless movement of code across different platforms is the key. For an ecosystem to be successful, it needs to be easy to develop code for, and it needs to be easy to port code to. And those things are just as important for our commercial HPC systems and the broader HPC customer base.
In addition to customers
writing their own code and compiling it well and easily to ARM, we also want to
make it easy for the independent software vendors (ISVs) to join and strengthen
this ecosystem.
Parsons: That
is one of the key things we intend to do over the next six months. We have good
relationships, as does HPE, with many of the big and small ISVs. We want to get
them on a new kind of system, let them compile their code, and get some help to
do it. It’s really important that we end up with ISV code on ARM, all running
successfully.
Gardner: If we
are in a necessary, boring period, what will happen when we get to a more
exciting stage? Where do you see this potentially going? What are some of the
use cases using supercomputers to impact business, commerce, public services, and
public health?
Goh: It’s
not necessarily boring, but it is brilliantly done. There will be richer
choices coming to supercomputing. That’s the key. Supercomputing and HPC need
to reach a broader customer base. That’s the goal of our HPC team within HPE.
Over the years, we have
increased our reach to the commercial side, such as the financial industry and retailers.
Now there is a new opportunity coming with the bottom-up approach of using HPC.
Instead of building models out of physics, we train the models with example
data. This is a new way of using HPC. We will reach out to even more users.
So, the success of our
supercomputing industry is getting more users, with high diversity, to come on
board.
Gardner: Mark,
what are some of the exciting outcomes you anticipate?
Parsons: As we
get more experience with ARM it will become a serious player. If you look
around the world today, in Japan, for example, they have a big new ARM-based supercomputer that’s going to be similar to the ThunderX2 when it’s launched.
I predict in the next three or four years we are going to see some very significant supercomputers up at the X2 level, built from ARM processors. Based on what I hear, the next generations of these processors will produce a really exciting time.
Listen to the podcast. Find it on iTunes. Read a full transcript or download a copy. Sponsor: Hewlett Packard Enterprise.