Category Archives for "The Next Platform"

The Skylake Calm Before The Compute Storm

It looks like the push to true cloud computing that many of us have been projecting for such a long time is actually coming to pass, despite the misgivings that so many have expressed about giving up control of our own datacenters and the applications that run there.

That chip giant Intel is making money as it rides the ramp of its 14 nanometer manufacturing process is not really a surprise. During the second quarter of this year, rival AMD had not yet gotten its “Naples” Epyc X86 server processors into the field, and IBM has pushed

The Skylake Calm Before The Compute Storm was written by Timothy Prickett Morgan at The Next Platform.

The Supercomputing Slump Hits HPC

Supercomputing, by definition, is an esoteric, exotic, and relatively small slice of the overall IT landscape, but it is, also by definition, a vital driver of innovation within IT and in all of the segments of the market where simulation, modeling, and now machine learning are used to provide goods and services.

As we have pointed out many times, however, the supercomputing business is not an easy one in which to participate and generate a regular stream of revenues and predictable profits, and it is most certainly one where the vendors and their customers have to, by necessity, take the

The Supercomputing Slump Hits HPC was written by Timothy Prickett Morgan at The Next Platform.

Texas Advanced Computing Center Taps Latest HPC Tech

Building on the successes of the Stampede1 supercomputer, the Texas Advanced Computing Center (TACC) has rolled out its next-generation HPC system, Stampede2. Over the course of 2017, Stampede2 will undergo further optimization phases with the support of a $30 million grant from the National Science Foundation (NSF). With the latest Xeon Phi and Skylake Xeon processors, and enhanced networking provided by the Omni-Path architecture, the new flagship system is expected to deliver approximately 18 petaflops, nearly doubling Stampede1’s performance.

Stampede2 continues Stampede1’s mission: enabling thousands of scientists and researchers across the United States to deliver breakthrough discoveries in science, engineering, artificial

Texas Advanced Computing Center Taps Latest HPC Tech was written by Nicole Hemsoth at The Next Platform.

Oil And Gas Upstart Has No Reserves About GPUs

The oil and gas industry has been on the cutting edge of many waves of computing over the several decades that supercomputers have been used to model oil reservoirs, both in planning the development of an oil field and in quantifying its stored reserves, and therefore the future possible revenue stream of the company.

Oil companies can’t see through the earth’s crust to the domes where oil has been trapped, and it is the job of reservoir engineers to eliminate as much risk as possible from the field so the oil company can be prosperous

Oil And Gas Upstart Has No Reserves About GPUs was written by Timothy Prickett Morgan at The Next Platform.

Engineering Code Scales Across 200,000 Cores on Cray Super

Teams at Saudi Aramco using the Shaheen II supercomputer at King Abdullah University of Science and Technology (KAUST) have managed to scale ANSYS Fluent across 200,000 cores, marking top-end scaling for the commercial engineering code.
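
For a sense of what scaling to 200,000 cores demands of a code, a quick Amdahl’s law sketch (illustrative only; actual Fluent scaling behavior depends on the solver, mesh, and interconnect) shows how vanishingly small the serial fraction of the work must be:

```python
def amdahl_speedup(serial_fraction, cores):
    """Ideal speedup under Amdahl's law for a given serial fraction."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)

# The smaller the serial fraction, the closer a 200,000-core run gets
# to linear speedup.
for f in (1e-3, 1e-5, 1e-7):
    s = amdahl_speedup(f, 200_000)
    print(f"serial fraction {f:.0e}: speedup {s:,.0f}x "
          f"({100 * s / 200_000:.1f}% parallel efficiency)")
```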

The news last year of a code scalability effort that topped out at 36,000 cores on the Blue Waters machine at the National Center for Supercomputing Applications (NCSA) was impressive. That was big news for ANSYS and NCSA, but also a major milestone for Cray. Just as Blue Waters is a Cray system, albeit one at the outer reaches of its lifespan (it was installed

Engineering Code Scales Across 200,000 Cores on Cray Super was written by Nicole Hemsoth at The Next Platform.

The Trials And Tribulations Of IBM Systems

IBM is a bit of an enigma these days. It has the art – some would say black magic – of financial engineering down pat, and its system engineering is still quite good. Big Blue talks about all of the right things for modern computing platforms, although it speaks a slightly different dialect because the company still thinks that it is the one setting the pace, and therefore coining the terms, rather than chasing markets where others are blazing the trail. And it just can’t seem to grow revenues, even after tens of billions of dollars in acquisitions and internal investments over

The Trials And Tribulations Of IBM Systems was written by Timothy Prickett Morgan at The Next Platform.

How Google Wants To Rewire The Internet

When all of your business is driven by end users coming to use your applications over the Internet, the network is arguably the most critical part of the infrastructure. That is why search engine and ad serving giant Google, which has expanded out to media serving, hosted enterprise applications, and cloud computing, has put a tremendous amount of investment into creating its own network stack.

But running a fast, efficient, hyperscale network for internal datacenters is not sufficient for a good user experience, and that is why Google has created a software defined networking stack to do routing over the

How Google Wants To Rewire The Internet was written by Timothy Prickett Morgan at The Next Platform.

The Golden Grail: Automatic Distributed Hyperparameter Tuning

While it might not be a problem that is front and center in AI conversations, efficient hyperparameter tuning for neural network training is a tough issue. There are some options that aim to automate the process, but for most users this remains a cumbersome area, and one that can lead to bad performance when not done properly.

The problem with coming up with automatic tools for tuning is that many machine learning workloads are dependent on the dataset and the conditions of the problem being solved. For instance, some users might be willing to trade some accuracy for a speedup or efficiency
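
For a concrete picture of the loop such tools try to automate, here is a minimal random-search sketch; the search space, model, and scoring function are hypothetical stand-ins rather than the API of any particular tuner:

```python
import random

# Hypothetical search space; these names and distributions are stand-ins.
SEARCH_SPACE = {
    "learning_rate": lambda: 10 ** random.uniform(-5, -1),
    "batch_size":    lambda: random.choice([32, 64, 128, 256]),
    "hidden_units":  lambda: random.choice([128, 256, 512]),
}

def train_and_score(params):
    # Stand-in for a full training run that returns validation accuracy.
    # In practice each call costs hours of compute, which is why
    # automating and distributing this loop matters.
    return random.random()

def random_search(trials=20):
    best, best_score = None, float("-inf")
    for _ in range(trials):
        params = {name: sample() for name, sample in SEARCH_SPACE.items()}
        score = train_and_score(params)
        if score > best_score:
            best, best_score = params, score
    return best, best_score

print(random_search())
```

A distributed tuner runs such trials in parallel across workers and typically prunes unpromising configurations early rather than training each one to completion.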

The Golden Grail: Automatic Distributed Hyperparameter Tuning was written by Nicole Hemsoth at The Next Platform.

The System Bottleneck Shifts To PCI-Express

No matter what, system architects are always going to have to contend with one – and possibly more – bottlenecks when they design the machines that store and crunch the data that makes the world go around. These days, there is plenty of compute at their disposal, a reasonable amount of main memory to hang off of it, and both Ethernet and InfiniBand are on the cusp of 200 Gb/sec of performance and not too far away from 400 Gb/sec and even higher bandwidths.
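
The arithmetic behind that squeeze is simple enough to check on the back of an envelope, using the standard PCI-Express 3.0 figures of 8 GT/s per lane with 128b/130b encoding:

```python
# Why a 200 Gb/sec port saturates a PCIe 3.0 x16 slot.
GT_PER_LANE = 8e9      # PCIe 3.0: 8 gigatransfers/sec, one bit per lane
ENCODING = 128 / 130   # 128b/130b line encoding overhead
LANES = 16

bits_per_sec = GT_PER_LANE * ENCODING * LANES
print(f"PCIe 3.0 x16: {bits_per_sec / 8e9:.2f} GB/s "
      f"({bits_per_sec / 1e9:.0f} Gb/s) per direction")
print(f"a 200 Gb/s port needs {200e9 / bits_per_sec:.2f}x that")
```

At roughly 126 Gb/sec usable per direction, an x16 slot cannot keep a single 200 Gb/sec adapter fed, let alone a dual-port one.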

Now, it looks like the peripheral bus based on the PCI-Express protocol is becoming the bottleneck,

The System Bottleneck Shifts To PCI-Express was written by Timothy Prickett Morgan at The Next Platform.

Technology Requirements for Deep and Machine Learning

Having been at the forefront of machine learning since the 1980s, when I was a staff scientist in the Theoretical Division at Los Alamos performing basic research on machine learning (and later applying it in many areas, including co-founding a machine learning-based drug discovery company), I was lucky enough to participate in the creation of the field and subsequently to observe first-hand the process by which machine learning grew to become a ‘bandwagon’ that eventually imploded due to misconceptions about the technology and what it could accomplish.

Fueled by across-the-board technology advances including algorithmic developments, machine learning has again become a

Technology Requirements for Deep and Machine Learning was written by Nicole Hemsoth at The Next Platform.

The New Server Economies Of Scale For AMD

In the first story of this series, we discussed the Infinity fabric that is at the heart of the new “Naples” Epyc processor from AMD, and how this modified and extended HyperTransport interconnect glues together the cores, dies, and sockets based on Epyc processors into a unified system.

In this follow-on story, we will expand out from the Epyc processor design to the basic feeds and speeds of the system components based on this chip and then take a look at some of the systems that AMD and its partners were showing off at the Epyc launch a few

The New Server Economies Of Scale For AMD was written by Timothy Prickett Morgan at The Next Platform.

The Convergence Or Divergence Of HPC And AI Iron

Based on datacenter practices of the past two decades, it is a matter of faith that it is always better to run a large number of applications on a given set of generic infrastructure than it is to have highly tuned machines running specific workloads. Siloed applications on separate machines are a thing of the past. However, depending on how Moore’s Law progresses (or doesn’t) and how the software stacks shake out for various workloads, organizations might be running applications on systems with very different architectures, either in a siloed, standalone fashion or across a complex workflow that links the

The Convergence Or Divergence Of HPC And AI Iron was written by Timothy Prickett Morgan at The Next Platform.

The Heart Of AMD’s Epyc Comeback Is Infinity Fabric

At AMD’s Epyc launch a few weeks ago, Lisa Su, Mark Papermaster, and the rest of the AMD Epyc team hammered home that AMD designed its new Zen processor core for servers first. This server-first approach has implications for performance, performance per watt, and cost in both datacenter and consumer markets.

AMD designed Epyc as a modular architecture around its “Zeppelin” processor die with its eight “Zen” architecture cores. To allow multi-die scalability, AMD first reworked its HyperTransport socket-to-socket system I/O architecture for use on a chip, across a multi-chip module (MCM), and for inter-socket connectivity. AMD has named this
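
Putting the launch figures together gives a quick sketch of how the package composes; core and die counts are per the Naples launch materials, and the thread count assumes Zen’s two-way simultaneous multithreading:

```python
# Naples package arithmetic: 8 "Zen" cores per "Zeppelin" die, four dies
# per multi-chip module, up to two sockets linked by Infinity Fabric.
CORES_PER_DIE = 8
DIES_PER_PACKAGE = 4
THREADS_PER_CORE = 2   # Zen's two-way simultaneous multithreading

for sockets in (1, 2):
    cores = CORES_PER_DIE * DIES_PER_PACKAGE * sockets
    print(f"{sockets}-socket: {cores} cores, {cores * THREADS_PER_CORE} threads")
```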

The Heart Of AMD’s Epyc Comeback Is Infinity Fabric was written by Timothy Prickett Morgan at The Next Platform.

Securing The HPC Infrastructure

In the world of high performance computing (HPC), the most popular buzzwords include speed, performance, durability, and scalability. Security is one aspect of HPC that is not often discussed, or else seems to be relatively low on the list of priorities when organizations begin building out their infrastructures to support demanding new applications, whether for oil and gas exploration, machine learning, simulations, or visualization of complex datasets.

While IT security is paramount for businesses in the digital age, HPC systems typically do not encounter the same risks as public-facing infrastructures – in the same way as a web server

Securing The HPC Infrastructure was written by Timothy Prickett Morgan at The Next Platform.

High Expectations for Low Precision at CERN

The last couple of years have seen a steady drumbeat for the use of low precision in growing numbers of workloads, driven in large part by the rise of machine learning and deep learning applications and the ongoing desire to cut back on the amount of power consumed.

The interest in low precision is rippling through the high-performance computing (HPC) field, spanning from the companies that are running application sets to the tech vendors that are creating the systems and components on which the work is done.
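
A tiny NumPy illustration of the trade-off driving that interest: dropping from single precision to half precision halves the memory footprint (and raises throughput on hardware with native FP16 support), at the cost of visible rounding error:

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.standard_normal(1_000_000)

x32 = x.astype(np.float32)   # single precision: 4 bytes per value
x16 = x.astype(np.float16)   # half precision: 2 bytes per value

# Same data, half the memory, noticeably less accurate arithmetic.
print(f"float32: {x32.nbytes / 1e6:.1f} MB, sum = {x32.sum():.4f}")
print(f"float16: {x16.nbytes / 1e6:.1f} MB, sum = {x16.sum():.4f}")
```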

The Next Platform has kept a steady eye on the developments in the deep-learning and machine-learning

High Expectations for Low Precision at CERN was written by Nicole Hemsoth at The Next Platform.

The X86 Battle Lines Drawn With Intel’s Skylake Launch

At long last, Intel’s “Skylake” converged Xeon server processors are entering the field, the competition with AMD’s “Naples” Epyc X86 alternatives can begin, and the ARM server chips from Applied Micro, Cavium, and Qualcomm as well as the Power9 chip from IBM know exactly what they are aiming at.

It is a good time to be negotiating with a chip maker for compute power.

The Skylake chips, which are formally known as the Xeon Scalable Processor family, are the result of the convergence of the workhorse Xeon E5 family of chips for two-socket and four-socket servers with the higher-end Xeon E7

The X86 Battle Lines Drawn With Intel’s Skylake Launch was written by Timothy Prickett Morgan at The Next Platform.

China Tunes Neural Networks for Custom Supercomputer Chip

Supercomputing centers around the world are preparing their next generation architectural approaches for the insertion of AI into scientific workflows. For some, this means retooling around an existing architecture to make it capable of doing double duty for both HPC and AI.

Teams in China working on the top performing supercomputer in the world, the Sunway TaihuLight machine with its custom processor, have shown that their optimizations for the SW26010 architecture on deep learning models have yielded a 1.91X to 9.75X speedup over a GPU-accelerated model using the Nvidia Tesla K40m in a test convolutional neural network run with over 100 parameter configurations.

Efforts on

China Tunes Neural Networks for Custom Supercomputer Chip was written by Nicole Hemsoth at The Next Platform.

OpenPower, Efficiency Tweaks Define Europe’s DAVIDE Supercomputer

When talking about the future of supercomputers and high-performance computing, the focus tends to fall on the ongoing and high-profile competition between the United States, whose place as the kingpin of the industry is slowly eroding, and China, whose government has invested tens of billions of dollars in recent years to rapidly expand the reach of the country’s tech community and the use of home-grown technologies in massive new systems.

Both trends were on display at the recent International Supercomputing Conference in Frankfurt, Germany, where China not only continued to hold the top two spots on the

OpenPower, Efficiency Tweaks Define Europe’s DAVIDE Supercomputer was written by Nicole Hemsoth at The Next Platform.

Ethernet Getting Back On The Moore’s Law Track

It would be ideal if we lived in a universe where it was possible to increase the capacity of compute, storage, and networking at the same pace so as to keep all three elements expanding in balance. The irony is that over the past two decades, when the industry needed networking to advance the most, Ethernet got a little stuck in the mud.

But Ethernet has pulled its feet out of its boots, left them in the swamp, and is back to being barefoot again on much more solid ground where it can run faster. The move from 10 Gb/sec

Ethernet Getting Back On The Moore’s Law Track was written by Timothy Prickett Morgan at The Next Platform.

Parameter Encoding on FPGAs Boosts Neural Network Efficiency

The key to creating more efficient neural network models is rooted in trimming and refining the many parameters in deep learning models without losing accuracy. Much of this work is happening on the software side, but devices like FPGAs that can be tuned for trimmed parameters are offering promising early results for implementation.

A team from UC San Diego has created a reconfigurable clustering approach to deep neural networks that encodes the parameters of the network according to the accuracy requirements and limitations of the platform, which are often bound by memory access bandwidth. Encoding the trimmed parameters in an FPGA resulted in
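
In the same spirit as the approach described (a generic sketch, not the UC San Diego team’s implementation), weight clustering replaces each full-precision parameter with a short index into a small codebook of centroids:

```python
import numpy as np

def cluster_weights(weights, k=16, iters=20):
    """Plain k-means over a flat weight array; returns (codebook, indices)."""
    flat = weights.ravel()
    # Initialize centroids evenly across the weight range.
    codebook = np.linspace(flat.min(), flat.max(), k)
    for _ in range(iters):
        # Assign each weight to its nearest centroid, then re-center.
        indices = np.abs(flat[:, None] - codebook[None, :]).argmin(axis=1)
        for j in range(k):
            members = flat[indices == j]
            if members.size:
                codebook[j] = members.mean()
    return codebook, indices

weights = np.random.randn(512, 512).astype(np.float32)
codebook, idx = cluster_weights(weights, k=16)

# 16 clusters -> 4-bit indices instead of 32-bit floats: an 8x shrink in
# parameter storage that the accuracy budget must absorb.
print(f"compression: {32 / np.log2(len(codebook)):.0f}x")
```

Choosing k per layer is where the platform’s accuracy requirements and memory bandwidth limits come in: fewer clusters mean smaller indices and less data to move, but more quantization error.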

Parameter Encoding on FPGAs Boosts Neural Network Efficiency was written by Nicole Hemsoth at The Next Platform.