Category Archives for "The Next Platform"

The Ironic – And Fleeting – Volatility In NVM Storage

There is no question any longer that flash memory has found its place – in fact, many places – in the datacenter, even though the debate is still raging about when or if solid state memory will eventually replace disk drives in all datacenters of the world.

Sometime between tomorrow and never is a good guess.

Flash is still a hot commodity, so much so that the slower-than-expected transition to 3D NAND has caused a shortage in supply that is driving up the price of enterprise-grade flash – unfortunately at the same time that memory makers are having trouble cranking

The Ironic – And Fleeting – Volatility In NVM Storage was written by Timothy Prickett Morgan at The Next Platform.

When Agility Outweighs Cost for Big Cloud Operations

If anything has become clear over the last several years of watching infrastructure and application trends among SaaS businesses, it is that nothing is as simple as it seems. Even relatively straightforward services, like transactional email processing, have hidden layers of complexity, and complexity tends to translate into cost.

For most businesses providing web-based services, the answer to that complexity has been to offload infrastructure concerns to the public cloud. This provides geographic availability, pricing flexibility, and development agility, but not all web companies went the cloud route out of the gate. Consider SendGrid, which pushes out over 30 billion emails per month.

When Agility Outweighs Cost for Big Cloud Operations was written by Nicole Hemsoth at The Next Platform.

Performance Portability on the Road to Exascale

The software ecosystem in high performance computing is set to become more complex with the leaps in capability coming with next-generation exascale systems. Among the several challenges is making sure that applications retain their performance as they scale to higher core counts and accelerator-rich systems.

Software development and performance profiling company Allinea, which has been around for almost two decades in HPC, was recently acquired by ARM to add to the company’s software ecosystem story. We talked with one of Allinea’s early employees, VP of Product Development Mark O’Connor, about what has come before—and what the software performance

Performance Portability on the Road to Exascale was written by Nicole Hemsoth at The Next Platform.

Nvidia Is A Textbook Case Of Sowing And Reaping Markets

In a properly working capitalist economy, innovative companies make big bets, help create new markets, vanquish competition or at least hold it at bay, and profit from all of the hard work, cleverness, luck, and deal making that comes with supplying a good or service to demanding customers.

There is no question that Nvidia has become a textbook example of this as it helped create and is now benefitting from the wave of accelerated computing that is crashing into the datacenters of the world. The company is on a roll, and is on the very laser-sharp cutting edge of its

Nvidia Is A Textbook Case Of Sowing And Reaping Markets was written by Timothy Prickett Morgan at The Next Platform.

One Small Step Toward Supercomputers in Space

While it is not likely we will see large supercomputers on the International Space Station (ISS) anytime soon, HPE is getting a head start on providing more advanced on-board computing capabilities via a pair of its aptly-named “Apollo” water-cooled servers in orbit.

The two-socket machines, connected with InfiniBand, will put Broadwell computing capabilities on the ISS, mostly running benchmarks, including High Performance Linpack (HPL), the metric that determines the Top 500 supercomputer rankings. These tests, in addition to the more data movement-centric HPCG benchmark and NASA’s own NAS Parallel Benchmarks, will determine what performance changes, if any, are to be

One Small Step Toward Supercomputers in Space was written by Nicole Hemsoth at The Next Platform.
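As a rough guide to what an HPL result means in this context, the sketch below compares a measured number against a two-socket node’s theoretical double-precision peak. The core count, clock, and measured figure are assumptions made for the arithmetic, not the specs of HPE’s spaceborne Apollo nodes.

```python
# Back-of-the-envelope: theoretical peak vs. a measured HPL (Rmax) figure for
# a two-socket node. Core count, clock, and the measured number are
# illustrative assumptions only.

CORES_PER_SOCKET = 14          # assumed Broadwell-EP core count
SOCKETS = 2
CLOCK_GHZ = 2.4                # assumed sustained clock under load
FLOPS_PER_CYCLE = 16           # Broadwell: two 256-bit FMA units, double precision

peak_gflops = CORES_PER_SOCKET * SOCKETS * CLOCK_GHZ * FLOPS_PER_CYCLE

def hpl_efficiency(rmax_gflops: float) -> float:
    """Fraction of theoretical peak achieved by an HPL run."""
    return rmax_gflops / peak_gflops

if __name__ == "__main__":
    print(f"Theoretical peak: {peak_gflops:.0f} GFLOPS")
    # A hypothetical measured Rmax, just to show the calculation.
    print(f"Efficiency at 900 GFLOPS measured: {hpl_efficiency(900):.1%}")
```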

A Look Inside U.S. Nuclear Security’s Commodity Technology Systems

In the following interview, Dr. Matt Leininger, Deputy for Advanced Technology Projects at Lawrence Livermore National Laboratory (LLNL), one of the National Nuclear Security Administration’s (NNSA) Tri Labs, describes how scientists at the Tri Labs—LLNL, Los Alamos National Laboratory (LANL), and Sandia National Laboratories (SNL)—carry out the work of certifying America’s nuclear stockpile through computational science and focused above-ground experiments.

We spoke with Dr. Leininger about some of the workflow that Tri Labs scientists follow, how the Commodity Technology Systems clusters are used in their research, and how machine learning is helping them.

The overall goal is to demonstrate a

A Look Inside U.S. Nuclear Security’s Commodity Technology Systems was written by Nicole Hemsoth at The Next Platform.

Fujitsu Bets On Deep Learning And HPC Divergence

One of the luckiest coincidences in the past decade has been that the hybrid machines designed for traditional HPC simulation and modeling workloads, which combine the serial processing performance of CPUs with the parallel processing and massive memory bandwidth of GPUs, were also well suited to run machine learning training applications.

If the HPC community had not made the investments in hybrid architectures, the hyperscalers and their massive machine learning operations, which drive just about all aspects of their businesses these days, would not have seen such stellar results. (And had that not happened, many of us would have had

Fujitsu Bets On Deep Learning And HPC Divergence was written by Timothy Prickett Morgan at The Next Platform.

IBM Highlights PowerAI, OpenPower System Scalability

The golden grail of deep learning has two handles. On one hand, there is the challenge of developing and scaling systems that can train ever-growing models. On the other, there is the need to cut inference latencies while preserving the accuracy of trained models.

Being able to do both on the same system represents its own host of challenges, but for one group at IBM Research, focusing on the compute-intensive training element will have a performance and efficiency trickle-down effect that speeds the entire deep learning workflow—from training to inference. This work, which is being led at the T.J. Watson

IBM Highlights PowerAI, OpenPower System Scalability was written by Nicole Hemsoth at The Next Platform.

The Shape Of AMD HPC And AI Iron To Come

In the IT business, just like any other business, you have to try to sell what is on the truck, not what is planned to be coming out of the factories in the coming months and years. AMD has put a very good X86 server processor into the market for the first time in nine years, and it also has a matching GPU that gives its OEM and ODM partners a credible alternative for HPC and AI workloads to the combination of Intel Xeons and Nvidia Teslas that dominate hybrid computing these days.

There are some pretty important caveats to

The Shape Of AMD HPC And AI Iron To Come was written by Timothy Prickett Morgan at The Next Platform.

Google Research Pushing Neural Networks Out of the Datacenter

Google has been at the bleeding edge of AI hardware development with the arrival of its TPU and other system-scale modifications to make large-scale neural network processing efficient and fast.

But just as these developments come to fruition, advances in trimmed-down deep learning could move many more machine learning training and inference operations out of the datacenter and into your palm.

Although it might be natural to think the reason that neural networks cannot be processed on devices like smartphones is because of limited CPU power, the real challenge lies in the vastness of the model sizes and hardware memory

Google Research Pushing Neural Networks Out of the Datacenter was written by Nicole Hemsoth at The Next Platform.
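A quick back-of-the-envelope shows why model size, rather than raw CPU speed, is the sticking point for on-device inference. The parameter count below is an assumed, illustrative figure, not one from Google’s research.

```python
# Rough sketch of model memory footprint at different weight precisions.
# The parameter count is an assumption chosen only to illustrate the scale.

BYTES_FP32 = 4
BYTES_INT8 = 1

def model_size_mb(num_params: int, bytes_per_param: int) -> float:
    """Weight storage in megabytes for a given precision."""
    return num_params * bytes_per_param / 1e6

params = 25_000_000  # assumed parameter count for a mid-sized vision network

print(f"fp32 weights: {model_size_mb(params, BYTES_FP32):.0f} MB")
print(f"int8 weights: {model_size_mb(params, BYTES_INT8):.0f} MB")
# A 4x reduction from quantization can be the difference between a model that
# fits in a phone's memory budget and one that does not.
```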

A MapReduce Accelerator to Tackle Molecular Dynamics

Novel architectures are born out of necessity, and for some applications, including molecular dynamics, there have been endless attempts to push parallel performance.

In this area, there are already numerous approaches to acceleration. At the highest end is the custom ASIC-driven Anton machine from D.E. Shaw, which is the fastest system, but certainly not the cheapest. On the more accessible side are Tesla GPU accelerators for the highly parallel parts of the workload—and increasingly, FPGAs are being considered for boosting the performance of major molecular dynamics applications, most notably GROMACS, as well as general-purpose, high-end CPUs (Knights Landing

A MapReduce Accelerator to Tackle Molecular Dynamics was written by Nicole Hemsoth at The Next Platform.
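As a generic illustration of how a pairwise force calculation decomposes into map and reduce steps, the sketch below emits per-pair partial forces in a map phase and sums them per particle in a reduce phase. The toy force function and structure are invented for illustration and are not the accelerator design described above.

```python
# Minimal map/reduce decomposition of a pairwise force calculation.
# Generic sketch with a toy 1D repulsive force, not the article's accelerator.
from itertools import combinations
from collections import defaultdict

def pair_force(xi: float, xj: float) -> float:
    """Toy repulsive force magnitude between two particles on a line."""
    r = xj - xi
    return 1.0 / (r * abs(r))       # falls off as 1/r^2, signed by direction

def map_phase(positions):
    """Map: emit (particle, partial force) pairs for every interacting pair."""
    for i, j in combinations(range(len(positions)), 2):
        f = pair_force(positions[i], positions[j])
        yield (i, -f)               # Newton's third law: equal and opposite
        yield (j, +f)

def reduce_phase(emitted):
    """Reduce: sum the partial forces per particle."""
    totals = defaultdict(float)
    for particle, f in emitted:
        totals[particle] += f
    return dict(totals)

positions = [0.0, 1.0, 2.5]
print(reduce_phase(map_phase(positions)))
```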

Wrenching Efficiency Out of Custom Deep Learning Accelerators

Custom accelerators for neural network training have garnered plenty of attention in the last couple of years, but without significant software footwork, many are still difficult to program and could leave efficiencies on the table. This can be addressed through various model optimizations, but as some argue, the efficiency and utilization gaps can also be addressed with a tailored compiler.

Eugenio Culurciello, an electrical engineer at Purdue University, argues that getting full computational efficiency out of custom deep learning accelerators is difficult. This prompted his team at Purdue to build an FPGA-based accelerator that could be agnostic to CNN

Wrenching Efficiency Out of Custom Deep Learning Accelerators was written by Nicole Hemsoth at The Next Platform.
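To make the utilization gap concrete, the sketch below compares an accelerator’s theoretical peak against a sustained throughput figure. The MAC count, clock, and measured number are invented for illustration and do not describe the Purdue FPGA design.

```python
# Sketch of the utilization gap: advertised peak is only reached if every
# multiply-accumulate (MAC) unit is busy every cycle. All numbers are made up.

def peak_gops(num_macs: int, clock_mhz: float) -> float:
    """Peak throughput in GOPS (one MAC counts as two ops: multiply + add)."""
    return num_macs * 2 * clock_mhz / 1e3

def utilization(measured_gops: float, num_macs: int, clock_mhz: float) -> float:
    """Fraction of peak actually sustained."""
    return measured_gops / peak_gops(num_macs, clock_mhz)

# Hypothetical FPGA design: 256 MAC units at 250 MHz -> 128 GOPS peak.
print(f"Peak: {peak_gops(256, 250):.0f} GOPS")
# A layer whose dimensions do not tile evenly onto the MAC array might only
# sustain 70 GOPS; that is the gap a tailored compiler tries to claw back.
print(f"Utilization at 70 GOPS: {utilization(70, 256, 250):.0%}")
```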

Drilling Down Into The Xeon Skylake Architecture

The “Skylake” Xeon SP processors from Intel have been in the market for nearly a month now, and we thought it would be a good time to drill down into the architecture of the new processor. We also want to see what the new Xeon SP has to offer for HPC, AI, and enterprise customers as well as compare the new X86 server motor to prior generations of Xeons and alternative processors in the market that are vying for a piece of the datacenter action.

That’s a lot, and we relish it. So let’s get started with a deep dive

Drilling Down Into The Xeon Skylake Architecture was written by Timothy Prickett Morgan at The Next Platform.

Making Mainstream Ethernet Switches More Malleable

While the hyperscalers of the world are pushing the bandwidth envelope and are rolling out 100 Gb/sec gear in their Ethernet switch fabrics and looking ahead to the not-too-distant future when 200 Gb/sec and even 400 Gb/sec will be available, enterprise customers who make up the majority of switch revenues are still using much slower networks, usually 10 Gb/sec and sometimes even 1 Gb/sec, and 100 Gb/sec seems like a pretty big leap.

That is why Broadcom, which still has the lion’s share of switch ASIC sales in the datacenter, has revved its long-running Trident family of chips, which lead

Making Mainstream Ethernet Switches More Malleable was written by Timothy Prickett Morgan at The Next Platform.

Accelerating Deep Learning Insights With GPU-Based Systems

Explosive data growth and a rising demand for real-time analytics are making high performance computing (HPC) technologies increasingly vital to success. Organizations across all industries are seeking the next generation of IT solutions to facilitate scientific research, enhance national security, ensure economic stability, and empower innovation to face the challenges of today and tomorrow.

HPC solutions are key to quickly answering some of the world’s most daunting questions. From Tesla’s self-driving car to quantum computing, artificial intelligence (AI) is enabling unparalleled compute capabilities and outmatching humans at many cognitive tasks. Deep learning, an advanced AI technique, is growing in popularity

Accelerating Deep Learning Insights With GPU-Based Systems was written by Timothy Prickett Morgan at The Next Platform.

An OS for Neuromorphic Computing on Von Neumann Devices

Ziyang Xu from Peking University in Beijing sees several similarities between the human brain and Von Neumann computing devices.

While he believes there is value in neuromorphic, or brain-inspired, chips, with the right operating system, standard processors can mimic some of the efficiencies of the brain and achieve similar performance for certain tasks.

In short, even though our brains do not have the same high-speed, high-frequency capacity as modern chips, the way information is routed and addressed is the key. At the core of this efficiency is a concept similar to a policy engine governing information compression, storage, and retrieval.
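As a loose illustration of that policy-engine idea, the toy sketch below routes data between a fast uncompressed store and a compressed one based on access frequency. It is a thought experiment in Python, not Xu’s actual system.

```python
# Toy "policy engine" sketch: decide how information is compressed, stored,
# and retrieved based on how often it is used. Illustrative only.
import zlib

class PolicyEngine:
    def __init__(self, hot_threshold: int = 3):
        self.hot_threshold = hot_threshold
        self.hot = {}            # frequently used: uncompressed, fast to read
        self.cold = {}           # rarely used: compressed, cheap to keep
        self.access_counts = {}

    def store(self, key: str, data: bytes) -> None:
        count = self.access_counts.get(key, 0)
        if count >= self.hot_threshold:
            self.cold.pop(key, None)
            self.hot[key] = data
        else:
            self.hot.pop(key, None)
            self.cold[key] = zlib.compress(data)

    def retrieve(self, key: str) -> bytes:
        self.access_counts[key] = self.access_counts.get(key, 0) + 1
        if key in self.hot:
            return self.hot[key]
        data = zlib.decompress(self.cold[key])
        # Promote to the hot tier once the item proves popular.
        if self.access_counts[key] >= self.hot_threshold:
            self.cold.pop(key)
            self.hot[key] = data
        return data
```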

An OS for Neuromorphic Computing on Von Neumann Devices was written by Nicole Hemsoth at The Next Platform.

Managing Deep Learning Development Complexity

For developers, deep learning systems are becoming more interactive and complex. From the building of more malleable datasets that can be iteratively augmented, to more dynamic models, to more continuous learning being built into neural networks, there is a greater need to manage the process from start to finish with lightweight tools.

“New training samples, human insights, and operation experiences can consistently emerge even after deployment. The ability of updating a model and tracking its changes thus becomes necessary,” says a team from Imperial College London that has developed a library to manage the iterations deep learning developers make across

Managing Deep Learning Development Complexity was written by Nicole Hemsoth at The Next Platform.
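A minimal sketch of what such iteration tracking can look like is below: each model update is recorded as a snapshot with enough metadata to compare or roll back later. The class and file layout are invented for illustration and are not the Imperial College library’s API.

```python
# Minimal model-iteration tracker: snapshot weights plus metadata per update.
# Generic illustration, not the library described in the article.
import json, time, hashlib
from pathlib import Path

class ModelTracker:
    def __init__(self, root: str = "model_history"):
        self.root = Path(root)
        self.root.mkdir(exist_ok=True)

    def record(self, weights: bytes, note: str, metrics: dict) -> str:
        """Save a model snapshot and return its version id."""
        version = hashlib.sha1(weights).hexdigest()[:10]
        (self.root / f"{version}.bin").write_bytes(weights)
        meta = {"version": version, "time": time.time(),
                "note": note, "metrics": metrics}
        (self.root / f"{version}.json").write_text(json.dumps(meta, indent=2))
        return version

    def history(self) -> list:
        """Return all recorded snapshots, oldest first."""
        metas = [json.loads(p.read_text()) for p in self.root.glob("*.json")]
        return sorted(metas, key=lambda m: m["time"])
```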

Fresh Thinking on Programmable Storage

The difficult part about storage these days is far less about capability than about adapting to change. Accordingly, the concept of programmable storage is getting more traction.

With such an approach, the internal services and abstractions of the storage stack can be treated as building blocks for higher-level services, and while this may not be simple to implement, it can work to eliminate the duplication of complex, unreliable software that is commonly used as a workaround for storage system deficiencies.

A team from the University of California Santa Cruz has developed a programmable storage platform to counter these issues called
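To give a flavor of the building-block idea, the sketch below composes two toy storage primitives, an object interface and an append-only log, into a higher-level versioned-object service. The interfaces are invented for illustration and are not the API of the UCSC platform.

```python
# Sketch of programmable storage: compose internal primitives into a
# higher-level service instead of layering external software on top.
# All interfaces here are invented for illustration.

class ObjectStore:
    """Stand-in for a storage system's low-level object interface."""
    def __init__(self):
        self._objects = {}
    def write(self, name: str, data: bytes) -> None:
        self._objects[name] = data
    def read(self, name: str) -> bytes:
        return self._objects[name]

class SharedLog:
    """Stand-in for an internal append-only log service."""
    def __init__(self):
        self._entries = []
    def append(self, entry: bytes) -> int:
        self._entries.append(entry)
        return len(self._entries) - 1

class VersionedObjects:
    """Higher-level service built from the two primitives above: every write
    is logged, so any prior version can be read back by its log position."""
    def __init__(self, store: ObjectStore, log: SharedLog):
        self.store, self.log = store, log
    def put(self, name: str, data: bytes) -> int:
        pos = self.log.append(name.encode())
        self.store.write(f"{name}@{pos}", data)
        return pos
    def get(self, name: str, version: int) -> bytes:
        return self.store.read(f"{name}@{version}")
```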

Fresh Thinking on Programmable Storage was written by Nicole Hemsoth at The Next Platform.

The Skylake Calm Before The Compute Storm

It looks like the push to true cloud computing that many of us have been projecting for such a long time is actually coming to pass, despite the misgivings that many of us have expressed about giving up control of our own datacenters and the applications that run there.

That chip giant Intel is making money as it rolls up its 14 nanometer manufacturing process ramp is not really a surprise. During the second quarter of this year, rival AMD had not yet gotten its “Naples” Epyc X86 server processors into the field, and IBM has pushed

The Skylake Calm Before The Compute Storm was written by Timothy Prickett Morgan at The Next Platform.

The Supercomputing Slump Hits HPC

Supercomputing, by definition, is an esoteric, exotic, and relatively small slice of the overall IT landscape, but it is, also by definition, a vital driver of innovation within IT and in all of the segments of the market where simulation, modeling, and now machine learning are used to provide goods and services.

As we have pointed out many times, the supercomputing business is not, however, one that is easy to participate in and generate a regular stream of revenues and predictable profits and it is most certainly one where the vendors and their customers have to, by necessity, take the

The Supercomputing Slump Hits HPC was written by Timothy Prickett Morgan at The Next Platform.