Category Archives for "The Next Platform"

The Hyperscale Effect: Tracking the Newest High-Growth IT Segment

Don’t just call it “the cloud.” Even if you think you know what cloud means, the word is fraught with too many different interpretations for too many people. Nevertheless, cloud computing, the web, and their assorted massive datacenters have had a profound impact on enterprise computing, creating new application segments and consolidating IT resources into a smaller number of mega-players with tremendous buying power and influence.

Welcome to the hyperscale market.

At the top end of the market, ten companies – behemoths like Google, Amazon, eBay, and Alibaba – each spend over $1 billion per year on

The Hyperscale Effect: Tracking the Newest High-Growth IT Segment was written by Nicole Hemsoth at The Next Platform.

Large-Scale Weather Prediction at the Edge of Moore’s Law

Having access to fairly reliable 10-day forecasts is a luxury, but it comes with high computational costs for centers in the business of providing predictability. This ability to accurately predict weather patterns, dangerous and seasonal alike, has tremendous economic value and, accordingly, significant investment goes into powering ever more extended and on-target forecasts.

What is interesting on the computational front is that the future of weather prediction accuracy, timeliness, efficiency, and scalability seems to be riding a curve not so dissimilar to that of Moore’s Law. Big leaps, followed by steady progress up the trend line, and a moderately predictable sense

Large-Scale Weather Prediction at the Edge of Moore’s Law was written by Nicole Hemsoth at The Next Platform.

Driving Compute And Storage Scale Independently

While legacy monolithic applications will linger in virtual machines for an incredibly long time in the datacenter, new scale-out applications run best on new architectures. And that means the underlying hardware will look a lot more like what the hyperscalers have built than traditional siloed enterprise systems.

But most enterprises can’t design their own systems and interconnects, as Google, Facebook, and others have done, and as such, they will rely on others to forge their machines. A group of hot-shot system engineers who were instrumental in creating systems at Sun Microsystems and Cisco Systems in the past two decades have

Driving Compute And Storage Scale Independently was written by Timothy Prickett Morgan at The Next Platform.

Cray Sharpens Approach to Large-Scale Graph Analytics

For those in enterprise circles who still conjure black and white images of hulking supercomputers when they hear the name “Cray,” it is worth noting that the long-standing company has done a rather successful job of shifting a critical side of its business to graph analytics and large-scale data processing.

In addition to the data-driven capabilities cooked into its XC line of supercomputers, and now with its DataWarp burst buffer adding to the I/O bottom line on supercomputers including Cori, among others, Cray has managed to take supercomputing to the enterprise big data set by blending high performance hardware with

Cray Sharpens Approach to Large-Scale Graph Analytics was written by Nicole Hemsoth at The Next Platform.

Samsung Experts Put Kubernetes Through The Paces

No one expects setting up management tools for complex distributed computing frameworks to be an easy thing, but there is always room for improvement – and always a chance to take out unnecessary steps and improve the automated deployment of such tools.

The hassle of setting up frameworks such as Hadoop for data analytics, OpenStack for virtualized infrastructure, or Kubernetes or Mesos for software container management is an inhibitor to the adoption of these new technologies. Working with raw open source software and weaving it together into a successful management control plane is not something all enterprises have the skills
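To make that kind of automation concrete, here is a minimal sketch using the official Kubernetes Python client – an illustration of the general idea, not a tool the article describes. The deployment name, image, and replica count are hypothetical placeholders.

```python
# Hypothetical sketch: scripting one step of cluster setup with the
# official Kubernetes Python client (pip install kubernetes).
# Name, image, and replica count are placeholders, not from the article.
from kubernetes import client, config

def create_demo_deployment(namespace="default"):
    config.load_kube_config()  # read cluster credentials from ~/.kube/config
    deployment = client.V1Deployment(
        api_version="apps/v1",
        kind="Deployment",
        metadata=client.V1ObjectMeta(name="demo"),
        spec=client.V1DeploymentSpec(
            replicas=3,
            selector=client.V1LabelSelector(match_labels={"app": "demo"}),
            template=client.V1PodTemplateSpec(
                metadata=client.V1ObjectMeta(labels={"app": "demo"}),
                spec=client.V1PodSpec(
                    containers=[client.V1Container(name="demo", image="nginx:1.25")]
                ),
            ),
        ),
    )
    client.AppsV1Api().create_namespaced_deployment(namespace=namespace, body=deployment)

if __name__ == "__main__":
    create_demo_deployment()
```

Even a script this small hints at why turnkey deployment tooling matters: every object above is one of dozens a real control plane has to create, wire together, and keep upgraded.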

Samsung Experts Put Kubernetes Through The Paces was written by Timothy Prickett Morgan at The Next Platform.

Chip Upstarts Get Coherent With Hybrid Compute

Accelerators and coprocessors are proliferating in the datacenter, and this has been a boon for speeding up certain kinds of workloads and, in many cases, making machine learning or simulation jobs possible at scale for the first time. But ultimately, in a hybrid system, the processors and the accelerators have to share data, and moving it about is a pain in the neck.

Having the memory across these devices operate in a coherent manner – meaning that all devices can address all memory attached to those devices in a single, consistent way – is one of the holy grails of

Chip Upstarts Get Coherent With Hybrid Compute was written by Timothy Prickett Morgan at The Next Platform.

Lustre to DAOS: Machine Learning on Intel’s Platform

Training a machine learning algorithm to accurately solve complex problems requires large amounts of data. The previous article discussed how scalable distributed parallel computing using a high-performance communications fabric like Intel Omni-Path Architecture (Intel OPA) is an essential part of what makes the training of deep learning on large complex datasets tractable in both the datacenter and the cloud. Preparing large unstructured datasets for machine learning can be as intensive a task as the training process – especially for the file system and storage subsystem(s). Starting (and restarting) big data training jobs using tens of thousands of clients
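One concrete source of that storage pressure is job start and restart, when every client hits the namespace at once. A hedged sketch of a common mitigation (not from the article): deterministically shard the file manifest so each client opens only its own slice. Paths and counts are hypothetical.

```python
# Illustrative sketch: hash-shard a large training corpus across clients so
# each rank opens only its slice, instead of every client scanning all files
# at (re)start. File paths and client counts are hypothetical.
import hashlib

def shard_for_rank(all_files, rank, num_ranks):
    """Return the subset of files a given client should read."""
    return [path for path in all_files
            if int(hashlib.md5(path.encode()).hexdigest(), 16) % num_ranks == rank]

files = [f"/lustre/corpus/part-{i:05d}.dat" for i in range(100_000)]
mine = shard_for_rank(files, rank=42, num_ranks=10_000)
print(f"rank 42 reads {len(mine)} of {len(files)} files")
```

Because the assignment is a pure function of the path, a restarted job recomputes the same shards with no coordination traffic at all.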

Lustre to DAOS: Machine Learning on Intel’s Platform was written by Nicole Hemsoth at The Next Platform.

Google Takes Unconventional Route with Homegrown Machine Learning Chips

At the tail end of Google’s keynote speech at its developer conference Wednesday, Sundar Pichai, Google’s CEO, mentioned that Google had built its own chip for machine learning jobs that it calls a Tensor Processing Unit, or TPU.

The boast was that the TPU offered “an order of magnitude” improvement in the performance per watt for machine learning. Any company building a custom chip for a dedicated workload is worth noting, because building a new processor is a multimillion-dollar effort when you consider the cost of hiring a design team, getting a chip to production, and building the hardware and
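For a sense of what that metric means, here is a quick back-of-the-envelope calculation. The throughput and power figures are hypothetical placeholders, not published TPU or GPU numbers.

```python
# Hypothetical arithmetic: what "an order of magnitude" better
# performance per watt implies. These numbers are placeholders.
baseline_ops_per_sec = 6.0e12   # assumed accelerator throughput (ops/s)
baseline_watts = 250.0          # assumed board power (W)

baseline_ppw = baseline_ops_per_sec / baseline_watts
claimed_ppw = 10 * baseline_ppw  # "an order of magnitude" ~ 10x

print(f"baseline: {baseline_ppw:.2e} ops/s per watt")
print(f"claimed:  {claimed_ppw:.2e} ops/s per watt")
```

At a fixed datacenter power budget, a 10X gain in ops per second per watt translates directly into 10X more machine learning throughput per rack, which is why the metric matters so much to hyperscalers.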

Google Takes Unconventional Route with Homegrown Machine Learning Chips was written by Nicole Hemsoth at The Next Platform.

IBM Extends GPU Cloud Capabilities, Targets Machine Learning

As we have noted over the last year in particular, GPUs are set for another tsunami of use cases for server workloads in high performance computing and, most recently, machine learning.

As GPU maker Nvidia’s CEO stressed at this year’s GPU Technology Conference, deep learning is a target market, fed in part by a new range of its GPUs for training and executing deep neural networks, including the Tesla M40 and M4, the existing supercomputing-focused K80, and now the P100 (Nvidia’s latest Pascal processor, which is at the heart of a new appliance specifically designed for deep learning workloads).

While

IBM Extends GPU Cloud Capabilities, Targets Machine Learning was written by Nicole Hemsoth at The Next Platform.

Climate Research Pulls Deep Learning Onto Traditional Supercomputers

Over the last year, stories pointing to a bright future for deep neural networks and deep learning in general have proliferated. However, most of what we have seen has been centered on the use of deep learning to power consumer services. Speech and image recognition, video analysis, and other features have spun out of deep learning developments, but from the mainstream view, it would seem that scientific computing use cases are still limited.

Deep neural networks present an entirely different way of thinking about a problem set and the data that feeds it. While there are established approaches for images and

Climate Research Pulls Deep Learning Onto Traditional Supercomputers was written by Nicole Hemsoth at The Next Platform.

In-Memory Breathes New Life Into NUMA

Hyperscalers and the academics who often work with them have invented a slew of distributed computing methods and frameworks to get around the problem of scaling up shared memory systems based on symmetric multiprocessing (SMP) or non-uniform memory access (NUMA) techniques that have been in the systems market for decades. SMP and NUMA systems are expensive, and they do not scale to hundreds or thousands of nodes, much less the tens of thousands of nodes that hyperscalers require to support their data processing needs.

It sure would be convenient if they did. But for those who are not hyperscalers,

In-Memory Breathes New Life Into NUMA was written by Timothy Prickett Morgan at The Next Platform.

IBM Throws Weight Behind Phase Change Memory

There is no question that the memory hierarchy in systems is being busted wide open and that new persistent memory technologies that can be byte addressable like DRAM or block addressable like storage are going to radically change the architecture of machines and the software that runs on them. Picking what memory might go mainstream is another story.
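The byte- versus block-addressable distinction can be made concrete with a small sketch. This illustrates the two access styles in general, on a plain file with POSIX calls – it is not IBM’s interface, and the path and sizes are arbitrary.

```python
# Illustrative contrast: byte-addressable updates via a memory mapping
# versus block-granularity I/O. Unix-only (os.pread); path is arbitrary.
import mmap
import os

path = "/tmp/pcm_demo.bin"
with open(path, "wb") as f:
    f.write(b"\x00" * 4096)          # one 4 KB "block" of zeroes

# Byte-addressable style: update a single byte in place through a mapping.
with open(path, "r+b") as f:
    with mmap.mmap(f.fileno(), 4096) as m:
        m[123] = 0xFF                # one byte changes; no explicit I/O call

# Block-addressable style: transfer a whole 4 KB block at a time.
fd = os.open(path, os.O_RDONLY)
block = os.pread(fd, 4096, 0)        # read the full block at offset 0
os.close(fd)
print(block[123])                    # 255
```

Persistent memory that supports the first style promises storage-class capacity with load/store semantics, which is precisely what breaks the traditional DRAM-then-disk hierarchy wide open.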

It has been decades since IBM made its own DRAM, but the company still has a keen interest in doing research and development on core processing and storage technologies and in integrating new devices with its Power-based systems.

To that end, IBM

IBM Throws Weight Behind Phase Change Memory was written by Timothy Prickett Morgan at The Next Platform.

Scaling All Flash Arrays Up And Out

The ubiquity of the Xeon server has been a boon for datacenters and makers of IT products alike, creating an ever more powerful substrate on which to build compute, storage, and now networking, or a mix of the three all in the same box. But that universal hardware substrate cuts both ways, and IT vendors have to be clever indeed if they hope to differentiate from their competitors.

So it is with the “Wolfcreek” storage platform from DataDirect Networks, which specializes in high-end storage arrays aimed at HPC, webscale, and high-end enterprise workloads. DDN started unveiling the Wolfcreek system last June

Scaling All Flash Arrays Up And Out was written by Timothy Prickett Morgan at The Next Platform.

First Burst Buffer Use at Scale Bolsters Application Performance

Over the last year, we have focused on the role burst buffer technology might play in bolstering the I/O capabilities of some of the world’s largest machines, examining use cases ranging from the initial target to more application-centric goals.

As we have described in discussions with the initial creator of the concept, Los Alamos National Lab’s Gary Grider, the starting point for the technology was moving checkpoint and restart capabilities forward faster (a detailed description of how this works is here). However, as the concept developed over the years, some large supercomputing sites, including the National
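The checkpoint-drain pattern at the heart of the burst buffer idea can be sketched in a few lines. This is a generic illustration, not Los Alamos code; temporary directories stand in for the burst buffer and the parallel file system.

```python
# Hedged sketch of the burst buffer pattern: checkpoints land on a fast
# near-node tier so compute resumes quickly, then drain to the slower
# parallel file system in the background. Tiers are stand-in temp dirs.
import os
import pickle
import shutil
import tempfile
import threading

fast_tier = tempfile.mkdtemp(prefix="burst_buffer_")  # stands in for e.g. NVM
slow_tier = tempfile.mkdtemp(prefix="parallel_fs_")   # stands in for e.g. Lustre

def checkpoint(state, step):
    fast_path = os.path.join(fast_tier, f"ckpt_{step}.pkl")
    slow_path = os.path.join(slow_tier, f"ckpt_{step}.pkl")
    with open(fast_path, "wb") as f:
        pickle.dump(state, f)        # fast write; compute can resume now
    drain = threading.Thread(target=shutil.copyfile, args=(fast_path, slow_path))
    drain.start()                    # durable copy runs off the critical path
    return drain

drain = checkpoint({"step": 10_000, "weights": [0.1, 0.2, 0.3]}, step=10_000)
drain.join()  # a real application keeps computing while the drain finishes
print(os.listdir(slow_tier))
```

The win is that application stall time is set by the fast tier’s write bandwidth, while the parallel file system only has to absorb the drain at its own pace.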

First Burst Buffer Use at Scale Bolsters Application Performance was written by Nicole Hemsoth at The Next Platform.

Tesla Pushes Nvidia Deeper Into The Datacenter

If you are trying to figure out what impact the new “Pascal” family of GPUs is going to have on the business at Nvidia, just take a gander at the recent financial results for the datacenter division of the company. If Nvidia had not spent the better part of a decade building its Tesla compute business, it would be a little smaller and quite a bit less profitable.

In the company’s first quarter of fiscal 2017, which ended on May 1, Nvidia posted sales of $1.31 billion, up 13 percent from the year ago period, and net income hit $196

Tesla Pushes Nvidia Deeper Into The Datacenter was written by Timothy Prickett Morgan at The Next Platform.

Can Open Source Hardware Crack Semiconductor Industry Economics?

The running joke is that when a headline begs a question, the answer is, quite simply, “No.” However, when the question is multi-layered, fraught with dependencies that stretch across an entire supply chain, user bases, and device range, and across companies in the throes of their own economic and production uncertainties, a much more nuanced answer is required.

Although Moore’s Law is not technically dead yet, organizations from the IEEE to individual device makers are already thinking their way out of a box that has held the semiconductor industry neatly for decades. However, it turns out, that thought process is

Can Open Source Hardware Crack Semiconductor Industry Economics? was written by Nicole Hemsoth at The Next Platform.

IBM Research Lead Charts Scope of Watson AI Effort

Over the past few years, IBM has been devoting a great deal of corporate energy to developing Watson, the company’s Jeopardy-beating supercomputing platform. Watson represents a larger focus at IBM that integrates machine learning and data analytics technologies to bring cognitive computing capabilities to its customers.

To find out how the company perceives its own invention, we asked IBM Fellow Dr. Alessandro Curioni to characterize Watson and how it has evolved into new application domains. Curioni will be speaking on the subject at the upcoming ISC High Performance conference. He is an IBM Fellow, Vice President Europe and

IBM Research Lead Charts Scope of Watson AI Effort was written by Nicole Hemsoth at The Next Platform.

Shared Memory Pushes Wheat Genomics To Boost Crop Yields

Wheat has been an important part of the human diet for the past 9,000 years or so and, depending on the geography, can comprise 40 percent to 50 percent of the diet within certain regions today.

But there is a problem. Pathogens and a changing climate are adversely affecting wheat yields just as Earth’s population is growing, and The Genome Analysis Centre (TGAC) is front and center in sequencing and assembling the wheat genome, a multi-year effort that is going to be substantially accelerated by some new hardware and updated software.

With the world’s population expected to hit 10 billion

Shared Memory Pushes Wheat Genomics To Boost Crop Yields was written by Timothy Prickett Morgan at The Next Platform.

Facebook Flow Is An AI Factory Of The Future

We have been convinced for many years that machine learning, the kind of artificial intelligence that actually works in practice, not in theory, would be a key element of the next platform. In fact, it might be the most important part of the stack. And therefore, those who control how we deploy machine learning will, to a large extent, control the nature of future applications and the systems that run them.

Machine learning is the killer app for the hyperscalers, just like modeling and simulation were for supercomputing centers decades ago, and we believe we are only seeing the tip

Facebook Flow Is An AI Factory Of The Future was written by Timothy Prickett Morgan at The Next Platform.

Intel Stretches Deep Learning on Scalable System Framework

The strong interest in deep learning neural networks lies in their ability to solve complex pattern recognition tasks – sometimes better than humans. Once trained, these machine learning solutions can run very quickly – even in real time – and very efficiently on low-power mobile devices and in the datacenter.

However, training a machine learning algorithm to accurately solve complex problems requires large amounts of data, which greatly increases the computational workload. Scalable distributed parallel computing using a high-performance communications fabric is an essential part of what makes the training of deep learning on large complex datasets
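The data-parallel pattern behind that statement can be sketched with MPI: each rank computes gradients on its own shard of data, and an allreduce over the fabric averages them each step. This is a generic illustration using mpi4py with a toy random “gradient,” not Intel’s framework.

```python
# Generic data-parallel training sketch (mpi4py + NumPy). Each rank's
# "gradient" is a random placeholder for a real backpropagation result.
# Run with e.g.: mpirun -np 4 python train_sketch.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

weights = np.zeros(1000)
rng = np.random.default_rng(seed=rank)   # each rank sees different "data"

for step in range(10):
    local_grad = rng.standard_normal(1000)               # stand-in gradient
    global_grad = np.empty_like(local_grad)
    comm.Allreduce(local_grad, global_grad, op=MPI.SUM)  # crosses the fabric
    weights -= 0.01 * (global_grad / size)               # average, then update

if rank == 0:
    print("final weight norm:", np.linalg.norm(weights))
```

The allreduce is the step that stresses the interconnect, which is why fabrics like Intel OPA figure so centrally in scaling this kind of training.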

Intel Stretches Deep Learning on Scalable System Framework was written by Nicole Hemsoth at The Next Platform.