Just two years ago, supercomputing was thrust into a larger spotlight because of the surge of interest in deep learning. As we have discussed here, the hardware similarities, particularly GPU-accelerated machines for training, and shared HPC development approaches, including MPI for scaling across massive numbers of nodes, brought new attention to the world of scientific and technical computing.
What wasn’t clear then was how traditional supercomputing could benefit from all the framework developments in deep learning. After all, they had many of the same hardware environments and problems that could benefit from prediction, but what they lacked …
Supercomputing Gets Neural Network Boost in Quantum Chemistry was written by Nicole Hemsoth at The Next Platform.
Google created quite a stir when it released architectural details and performance metrics for its homegrown Tensor Processing Unit (TPU) accelerator for machine learning algorithms last week. But as we (and many of you reading) pointed out, comparing the TPU to earlier “Kepler” generation GPUs from Nvidia was not exactly a fair comparison. Nvidia has done much in the “Maxwell” and “Pascal” GPU generations specifically to boost machine learning performance.
To set the record straight, Nvidia took some time and ran some benchmarks of its own to put the performance of its latest Pascal accelerators, particularly the ones it aims …
Does Google’s TPU Investment Make Sense Going Forward? was written by Timothy Prickett Morgan at The Next Platform.
There has been much discussion about the “black box” problem of neural networks. Sophisticated models can perform well on predictive workloads, but when it comes to tracing how the system reached its end result, there is no clear way to understand what went right or wrong, or how the model itself arrived at a conclusion.
For old-school machine learning models, this was not quite the problem it is now with non-linear, hidden data structures and countless parameters. For researchers deploying neural networks for scientific applications, this lack of reproducibility from the black box presents validation hurdles, but for …
A Look at Facebook’s Interactive Neural Network Visualization System was written by Nicole Hemsoth at The Next Platform.
If you can’t beat the largest cloud players at economies of scale, the only option is to try to outrun them in performance, capabilities, or price.
While going head to head with Amazon, Google, Microsoft, or IBM on cloud infrastructure prices is a challenge, one way to gain an edge is by being the first to deliver bleeding-edge hardware to those users with emerging, high-value workloads. The trick is to be at the front of the wave, often with some of the most expensive iron, which is risky with AWS and others nipping at their heels and quick to follow. It …
Risk or Reward: First Nvidia DGX-1 Boxes Hit the Cloud was written by Nicole Hemsoth at The Next Platform.
Containers provide an extremely mobile, safe, and reproducible computing infrastructure that is now ready for production HPC computing. In particular, the freely available Singularity container framework has been designed specifically for HPC computing. The barrier to entry is low and the software is free.
At the recent Intel HPC Developer Conference, Gregory Kurtzer (Singularity project lead and LBNL staff member) and Krishna Muriki (Computer Systems Engineer at LBNL) presented beginner and advanced tutorials on Singularity. One of Kurtzer’s key takeaways: “setting up workflows in under a day is commonplace with Singularity”.
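As a rough sketch of how little ceremony a containerized workflow step needs, the snippet below drives Singularity from Python through its command line. The singularity binary is assumed to be installed and on PATH, and the image reference and command are placeholders for illustration, not anything from the tutorial.

```python
# Minimal sketch of scripting a containerized HPC workflow step with
# Singularity from Python. Assumes the "singularity" binary is installed and
# on PATH; the image reference and command are hypothetical placeholders.
import subprocess

def run_in_container(image, command):
    """Run a command inside a Singularity container and fail loudly on error."""
    subprocess.run(["singularity", "exec", image, *command], check=True)

if __name__ == "__main__":
    # Pull-and-run directly from a Docker Hub reference; Singularity converts
    # the layers into its own image format before executing the command.
    run_in_container("docker://ubuntu:16.04", ["cat", "/etc/os-release"])
```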
Many people have heard about code modernization and …
Singularity Containers for HPC, Reproducibility, and Mobility was written by Nicole Hemsoth at The Next Platform.
Here at The Next Platform, we tend to focus on deep learning as it relates to hardware and systems versus algorithmic innovation, but at times, it is useful to look at the co-evolution of both code and machines over time to see what might be around the next corner.
One segment of the deep learning applications area that has generated a great deal of work is speech recognition and translation—something we have described in detail via efforts from Baidu, Google, and Tencent, among others. While the application itself is interesting, what is most notable is how codes …
From Mainframes to Deep Learning Clusters: IBM’s Speech Journey was written by Nicole Hemsoth at The Next Platform.
If you want an object lesson in the interplay between Moore’s Law, Dennard scaling, and the desire to make money from selling chips, you need look no further than the past several years of Intel’s Xeon E3 server chip product lines.
The Xeon E3 chips are particularly illustrative because Intel has kept the core count constant for these processors, which are used in a variety of gear: workstations (remote and local), entry servers, storage controllers, microservers employed at hyperscalers, and even machines for certain HPC workloads (like Intel’s own massive EDA chip design and validation farms). …
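For readers who want the Dennard scaling argument spelled out, the textbook version fits in a few lines; this is generic CMOS scaling math, not anything specific to the Xeon E3 parts.

```latex
% Dynamic power of a CMOS circuit, and ideal Dennard scaling for a linear
% shrink factor k > 1 (capacitance and voltage fall, frequency rises):
\[
  P = C V^{2} f
\]
\[
  C \to \tfrac{C}{k}, \quad V \to \tfrac{V}{k}, \quad f \to k f
  \;\Longrightarrow\;
  P \to \frac{C}{k} \cdot \frac{V^{2}}{k^{2}} \cdot k f = \frac{P}{k^{2}}
\]
% Die area also shrinks by 1/k^2, so power density P/A stays constant under
% ideal scaling. Once supply voltage can no longer drop with each node (due to
% leakage), power density climbs instead, which is the bind described above.
```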
Xeon E3: A Lesson In Moore’s Law And Dennard Scaling was written by Timothy Prickett Morgan at The Next Platform.
While a lot of the applications in the world run on clusters of systems with a relatively modest amount of compute and memory compared to NUMA shared memory systems, big iron persists and large enterprises want to buy it. That is why IBM, Fujitsu, Oracle, Hewlett Packard Enterprise, Inspur, NEC, Unisys, and a few others are still in the big iron racket.
Fujitsu and its reseller partner – server maker, database giant, and application powerhouse Oracle – have made a big splash at the high end of the systems space with a very high performance processor, the Sparc64-XII, and a …
Fujitsu Takes On IBM Power9 With Sparc64-XII was written by Timothy Prickett Morgan at The Next Platform.
Four years ago, Google started to see the real potential for deploying neural networks to support a large number of new services. During that time it was also clear that, given the existing hardware, if people did voice searches for three minutes per day or dictated to their phone for short periods, Google would have to double the number of datacenters just to run machine learning models.
The need for a new architectural approach was clear, Google distinguished hardware engineer Norman Jouppi tells The Next Platform, but it required some radical thinking. As it turns out, that’s exactly …
First In-Depth Look at Google’s TPU Architecture was written by Nicole Hemsoth at The Next Platform.
Spark has grown rapidly over the past several years to become a significant tool in the big data world. Since emerging from the AMPLab at the University of California at Berkeley, Spark adoption has increased quickly as the open-source in-memory processing platform has become a key framework for handling workloads for machine learning, graph processing and other emerging technologies.
Developers continue to add more capabilities to Spark, including a SQL front-end for SQL processing and APIs for relational query optimization to build upon the basic Spark RDD API. The addition of the Spark SQL module promises greater performance and opens …
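To make the code path concrete, here is a minimal PySpark sketch of the SQL front-end layered over a DataFrame. The view name, columns, and rows are invented for the example, and this is the generic Spark API rather than anything Flare-specific.

```python
# Minimal PySpark sketch of the Spark SQL front-end over a DataFrame.
# The view name, columns, and rows are made-up illustration data.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("spark-sql-example").getOrCreate()

# Build a small DataFrame and register it as a SQL-queryable view.
df = spark.createDataFrame(
    [("gpu", 2016, 4.5), ("cpu", 2016, 2.1), ("gpu", 2017, 9.3)],
    ["device", "year", "tflops"],
)
df.createOrReplaceTempView("accelerators")

# Relational queries like this are what the query optimizer (and, underneath
# it, a native code generator such as Flare's) can rewrite and compile.
result = spark.sql(
    "SELECT device, AVG(tflops) AS avg_tflops "
    "FROM accelerators GROUP BY device ORDER BY avg_tflops DESC"
)
result.show()

spark.stop()
```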
Flare Gives Spark SQL a Performance Boost was written by Nicole Hemsoth at The Next Platform.
Exascale computing promises to bring significant changes to both the high-performance computing space and eventually enterprise datacenter infrastructures.
The systems, which are being developed in multiple countries around the globe, promise 50 times the performance of the 20-petaflops systems that are now among the fastest in the world (roughly an exaflops, hence the name), along with corresponding improvements in such areas as energy efficiency and physical footprint. The systems need to be powerful enough to run the increasingly complex applications being used by engineers and scientists, but they can’t be so expensive to acquire or run that only a handful of organizations can use them.
At …
Machine Learning, Analytics Play Growing Role in US Exascale Efforts was written by Nicole Hemsoth at The Next Platform.
The tick-tock-clock three-step dance that Intel will be using to progress its Core client and Xeon server processors in the coming years is on full display now that the Xeon E3-1200 v6 processors based on the “Kaby Lake” design have been unveiled.
The Kaby Lake chips are Intel’s third generation of Xeon processors that are based on its 14 nanometer technologies, and as our naming convention for Intel’s new way of rolling out chips suggests, it is a refinement of both the architecture and the manufacturing process that, by and large, enables Intel to ramp up the clock speed on …
Intel “Kaby Lake” Xeon E3 Sets The Server Cadence was written by Timothy Prickett Morgan at The Next Platform.
The IT industry has gotten good at developing computer systems that can easily work at the nanosecond and millisecond scales.
Chip makers have developed multiple techniques that have helped drive the creation of nanosecond-scale devices, while primarily software-based solutions have been rolled out for slower millisecond-scale devices. For a long time, that has been enough to address the various needs of high-performance computing environments, where performance is a key metric and issues such as the simplicity of the code and the level of programmer productivity are not as great a concern. Given that, programming at the microsecond level has not …
Google Researchers Measure the Future in Microseconds was written by Nicole Hemsoth at The Next Platform.
It’s easy when talking about the ongoing push toward exascale computing to focus on the hardware architecture that will form the foundation of the upcoming supercomputers. Big systems packed with the latest chips and server nodes and storage units still hold a lot of fascination, and the names of those vendors involved – like Intel, IBM and Nvidia – still resonate broadly across the population. And that interest will continue to hold as exascale systems move from being objects of discussion now to deployed machines over the next several years.
However, the development and planning of these systems is a …
Argonne National Lab Lead Details Exascale Balancing Act was written by Nicole Hemsoth at The Next Platform.
Data security has always been a key concern as organizations look to leverage the operational and cost efficiencies that come with cloud computing. Huge volumes of critical and sensitive data often are in transit and distributed among multiple systems, and increasingly are being collected and analyzed in cloud-based big data platforms, putting them at higher risk of being hacked and compromised.
Even as encryption methods and security procedures have improved, the data is still at risk of being attacked through such vulnerabilities as access pattern leakage through memory or the network. It’s the threat of an attack via access pattern …
An Opaque Alternative to Oblivious Cloud Analytics was written by Nicole Hemsoth at The Next Platform.
It is almost a foregone conclusion that when it comes to infrastructure, the industry will follow the lead of the big hyperscalers and cloud builders, building a foundation of standardized hardware for serving, storing, and switching, and implementing as much functionality and intelligence as possible in the software on top of that, allowing it to scale up and have costs come down as it does.
The reason this works is that these companies have complete control of their environments, from the processors and memory in the supply chain to the Linux kernel and software stack maintained by hundreds to …
Weaving Together Flash For Nearly Unlimited Scale was written by Timothy Prickett Morgan at The Next Platform.
It is difficult to shed a tear for Moore’s Law when there are so many interesting architectural distractions on the systems horizon.
While the steady tick-tock of the tried and true is still audible, the last two years have ushered in a fresh wave of new architectures targeting deep learning and other specialized workloads, as well as a bevy of forthcoming hybrids with FPGAs, zippier GPUs, and swiftly emerging open architectures. None of this has been lost on system architects at the bleeding edge, where the rush is on to build systems that can efficiently chew through ever-growing datasets with …
Neuromorphic, Quantum, Supercomputing Mesh for Deep Learning was written by Nicole Hemsoth at The Next Platform.
With absolute dominance in datacenter and desktop compute, considerable sway in datacenter storage, a growing presence in networking, and profit margins that are the envy of the manufacturing and tech sectors alike, it is not a surprise that companies are gunning for Intel. They all talk about how Moore’s Law is dead and how that removes a significant advantage for the world’s largest – and most profitable – chip maker.
After years of this, the top brass in Intel’s Technology and Manufacturing Group as well as its former chief financial officer, who is now in charge of its manufacturing, operations, …
Intel Vigorously Defends Chip Innovation Progress was written by Timothy Prickett Morgan at The Next Platform.
It is one thing to scale a neural network on a single GPU or even a single system with four or eight GPUs. But it is another thing entirely to push it across thousands of nodes. Most centers doing deep learning have relatively small GPU clusters for training and certainly nothing on the order of the Titan supercomputer at Oak Ridge National Laboratory.
The emphasis on machine learning scalability has often been focused on node counts in the past for single-model runs. This is useful for some applications, but as neural networks become more integrated into existing workflows, including those …
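As a concrete, if highly simplified, picture of the data-parallel pattern behind such multi-node runs, the sketch below averages gradients across MPI ranks with an allreduce. It uses mpi4py and NumPy on a toy linear model and is only an illustration of the general technique, not the code run on Titan.

```python
# Minimal data-parallel training sketch with mpi4py: each rank computes
# gradients on its own shard of data, and gradients are averaged across ranks
# with MPI_Allreduce so every replica takes the same update step.
# Launch with, e.g.: mpirun -np 4 python train.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

rng = np.random.default_rng(seed=rank)            # each rank gets its own shard
x = rng.standard_normal((1024, 8))                # local inputs
y = x @ np.arange(1.0, 9.0) + 0.1 * rng.standard_normal(1024)  # local targets

w = np.zeros(8)                                   # replicated model parameters
lr = 0.1

for step in range(100):
    # Local gradient of mean squared error on this rank's shard.
    grad = 2.0 * x.T @ (x @ w - y) / len(y)

    # Sum gradients across all ranks, then divide to get the global average.
    global_grad = np.empty_like(grad)
    comm.Allreduce(grad, global_grad, op=MPI.SUM)
    global_grad /= size

    w -= lr * global_grad

if rank == 0:
    print("learned weights (should be close to 1..8):", np.round(w, 2))
```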
Scaling Deep Learning on an 18,000 GPU Supercomputer was written by Nicole Hemsoth at The Next Platform.
The ramp for Intel’s Optane 3D XPoint memory, which sits between DDR4 main memory and flash or disk storage, or beside main memory, in the storage hierarchy, is going to shake up the server market. And maybe not in the ways that Intel and its partner, Micron Technology, anticipate.
Last week, Intel unveiled its first Optane 3D XPoint solid state cards and drives, which are now being previewed by selected hyperscalers and which will be rolling out in various capacities and form factors in the coming quarters. As we anticipated, and as Intel previewed last fall, the company is …
Use Optane Memory Like A Hyperscaler was written by Timothy Prickett Morgan at The Next Platform.