Category Archives for "The Next Platform"

Generalizing a Hardware, Software Platform for Industrial AI

Industrial companies have replaced people with machines, systems analysts with simulations, and now the simulations themselves could be outpaced by machine learning—albeit with a human in the loop, at the beginning at least.

The new holy grail of machine learning and deep learning, as with almost any other emerging technology set, is to mask enough of the complexity to make it broadly applicable without losing the performance and other features that can be retained by taking a low-level approach. If this kind of deep generalization can happen, a new mode of considering how data is used in research and enterprise

Generalizing a Hardware, Software Platform for Industrial AI was written by Nicole Hemsoth at The Next Platform.

HPC System Delays Stall InfiniBand

Enterprise spending on servers was a bit soft in the first quarter, as evidenced by the financial results posted by Intel and by its sometime rival IBM, and the hyperscale and HPC markets, at least when it comes to networking, were similarly soft, according to high-end network chip and equipment maker Mellanox Technologies.

In the first quarter ended March 31, Mellanox had a 4.1 percent revenue decline, to $188.7 million, and because of higher research and development costs, presumably associated with the rollout of 200 Gb/sec Quantum InfiniBand technology (which the company has talked about) and

HPC System Delays Stall InfiniBand was written by Timothy Prickett Morgan at The Next Platform.

Rambus, Microsoft Put DRAM Into Deep Freeze To Boost Performance

Energy efficiency and operating costs for systems are as important as raw performance in today’s datacenters. Everyone from the largest hyperscalers and high performance computing centers to the large enterprises that sometimes look like them is trying to squeeze as much performance as they can from their infrastructure while reining in power consumption and the costs associated with keeping it all from overheating.

Throw in the slowing down of Moore’s Law and new emerging workloads like data analytics and machine learning, and the challenge to these organizations becomes apparent.

In response, organizations on the cutting edge have embraced accelerators like GPUs and

Rambus, Microsoft Put DRAM Into Deep Freeze To Boost Performance was written by Timothy Prickett Morgan at The Next Platform.

Red Hat Is The Gatekeeper For ARM In The Datacenter

If any new hardware technology is going to get traction in the datacenter, it has to have the software behind it. And as the dominant supplier of commercial Linux, Red Hat’s support of ARM-based servers gives the upstart chip makers like Applied Micro, Cavium, and Qualcomm the leverage to help pry the glasshouse doors open and get a slice of the server and storage business that is so utterly dominated by Intel’s Xeon processors today.

It is now or never for ARM in the datacenter, and that means Red Hat has to go all the way and not just support

Red Hat Is The Gatekeeper For ARM In The Datacenter was written by Jeffrey Burt at The Next Platform.

Intel Melds Xeon E5 And E7 With Skylake

We have been saying for the past two years that the impending “Skylake” Xeon processors represented the biggest platform architectural change in the Xeon processor business at Intel since the transformational “Nehalem” Xeon 5500s that debuted back in March 2009 into the gaping maw of the Great Recession.

There is no global recession breathing down the IT sector’s neck like a hungry wolf here in 2017, eight years and seven chip generations later. But Intel is facing competitive pressures from AMD’s Naples Opterons, IBM’s Power9, and the ARM collective (mainly Cavium and Qualcomm at this point, but Applied Micro is

Intel Melds Xeon E5 And E7 With Skylake was written by Timothy Prickett Morgan at The Next Platform.

An Inside Look at One Major Media Outlet’s Cloud Transition

When it comes to large media in the U.S. with a broad reach into television and digital, the Scripps Networks Interactive brand might not come to mind first, but many of the channels and sources are household names, including HGTV, Food Network, and The Travel Channel, among others.

Delivering television and web-based content and services is a data and computationally intensive task, which just over five years ago was handled by on-premises machines in the company’s two local datacenters. In order to keep up with peaks in demand during popular events or programs, Scripps Interactive had to overprovision with those

An Inside Look at One Major Media Outlet’s Cloud Transition was written by Nicole Hemsoth at The Next Platform.

OpenMP: From Parallel Loops To Exaflops

This fall will mark twenty years since the publication of the v1.0 specification of OpenMP Fortran. From early loop parallelism to a heterogeneous, exascale future, OpenMP has apparently weathered well the vicissitudes and tumultuous changes of the computer industry over the past two decades and appears to be positioned to address the needs of our exascale future.

In the 1990s when the OpenMP specification was first created, memory was faster than the processors that performed the computation. This is the exact opposite of today’s systems where memory is the key bottleneck and the HPC community is rapidly adopting faster memory
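
That early loop parallelism is easy to picture. Here is a minimal sketch in C (the C/C++ binding followed the original Fortran spec); the array names and sizes are our own illustration, not drawn from any particular spec example:

```c
#include <stdio.h>

#define N 1000000

int main(void) {
    static double a[N], b[N], c[N];

    for (int i = 0; i < N; i++) {
        a[i] = (double)i;
        b[i] = 2.0 * (double)i;
    }

    /* The classic OpenMP worksharing construct: the runtime forks a
       team of threads and splits the iteration space among them. */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        c[i] = a[i] + b[i];

    printf("c[%d] = %f\n", N - 1, c[N - 1]);
    return 0;
}
```

Compiled with OpenMP support (for instance, gcc -fopenmp), the loop runs across all available cores; compiled without it, the pragma is simply ignored and the code runs serially, which is a big part of why the directive approach has weathered those two decades so well.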

OpenMP: From Parallel Loops To Exaflops was written by Timothy Prickett Morgan at The Next Platform.

Red Hat Gears Up OpenShift For Developers

During the five years that Red Hat has been building out its OpenShift cloud applications platform, much of the focus has been on making it easier to use by customers looking to adapt to an increasingly cloud-centric world for both new and legacy applications. Just as it did with the Linux operating system through Red Hat Enterprise Linux and related middleware and tools, the vendor has worked to make it easier for enterprises to embrace OpenShift.

That has included a major reworking of the platform with the release of version 3.0 last year, which ditched Red Hat’s in-house technologies for

Red Hat Gears Up OpenShift For Developers was written by Jeffrey Burt at The Next Platform.

Lessons Learned From Facebook’s Split Network Backbone

Distributed applications, whether they are containerized or not, have a lot of benefits when it comes to modularity and scale. But in a world of feature creep on all applications, whether they are internally facing ones running a business or hyperscale consumer applications like Google’s search engine or Facebook’s social media network, these distributed applications put a huge strain on the network.

This, more than any other factor, is why network costs are rising faster than any other aspect of the datacenter. Gone are the days when everything was done in three or four tiers, with a Web server like

Lessons Learned From Facebook’s Split Network Backbone was written by Timothy Prickett Morgan at The Next Platform.

The Datacenter Does Not Revolve Around AWS, Despite Its Gravity

If the public cloud computing market were our solar system, then Amazon Web Services would be Jupiter and Saturn together, and the remaining five fast-growing big clouds would be the inner planets – Mercury, Venus, Earth, Mars, and that pile of rocks that used to be a planet – mixed up with those clouds that are finding growth a bit more challenging – think Uranus and Neptune, and maybe even Pluto if you still want to count it as a planet.

This analogy came to us in the wake of Amazon’s reporting of its financial results for the first quarter of

The Datacenter Does Not Revolve Around AWS, Despite Its Gravity was written by Timothy Prickett Morgan at The Next Platform.

Cluster Management for Distributed Machine Learning at Scale

Over the last couple of decades, those looking for a cluster management platform faced no shortage of choices. However, large-scale clusters are being asked to operate in different ways, namely by chewing on large-scale deep learning workloads—and this requires a specialized approach to get high utilization, efficiency, and performance.

Nearly all of the cluster management tools from the high performance computing community are being bent in the machine learning direction, but for production deep learning shops, there appears to be a DIY tendency. This is not as complicated as it might sound, given the range of container-based open source tools,

Cluster Management for Distributed Machine Learning at Scale was written by Nicole Hemsoth at The Next Platform.

Swiss Army Knife File System Cuts Through Petabytes

Petabytes are in the future of every company, and luckily, the future is always being invented by the IT ecosystem to handle it.

Those wrestling with tens to hundreds of petabytes of data today are constantly challenged to find the best ways to store, search, and manage it all. Qumulo was founded in 2012 and came out of the chute two years ago with the idea of a software-based file system with built-in analytics that enable the system to increase capacity as the amount of data grows. QSFS, now called Qumulo Core, also does it all: fast with big

Swiss Army Knife File System Cuts Through Petabytes was written by Jeffrey Burt at The Next Platform.

Intel Moves Xeons To The Moore’s Law Leading Edge

In the wake of the Technology and Manufacturing Day event that Intel hosted last month, we were pondering this week what effect the tick-tock-clock method of advancing chip designs and manufacturing processes might have on the Xeon server chip line from Intel, and we suggested that it might close the gaps between the Core client chips and the Xeons. It turns out that Intel is not only going to close the gaps, but reverse them and put the Xeons on the leading edge.

To be precise, Brian Krzanich, Intel’s chief executive officer, and Robert Swan, the company’s chief financial

Intel Moves Xeons To The Moore’s Law Leading Edge was written by Timothy Prickett Morgan at The Next Platform.

The Next Battleground for Deep Learning Performance

The frameworks are in place, the hardware infrastructure is robust, but what has been holding machine learning performance back has far less to do with system-level capabilities and more to do with intense model optimization.

While it might not be the sexy story that generates the unending wave of headlines around deep learning, hyperparameter tuning is a big barrier when it comes to new leaps in deep learning performance. In more traditional machine learning, there are plenty of open source tools for this, but where it is needed most is in deep learning—an area that does appear to
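
To make the problem concrete, here is a minimal random-search sketch in C, the simplest of those traditional approaches; the evaluate() function is a hypothetical stand-in for a full training run, which in deep learning is exactly the step that makes each trial so expensive:

```c
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

/* Hypothetical stand-in for training a model with the given
   hyperparameters and returning a validation loss; in deep learning
   this single call can take hours, which is the whole problem. */
static double evaluate(double learning_rate, int batch_size) {
    return fabs(log10(learning_rate) + 3.0) + fabs(batch_size - 64) / 64.0;
}

int main(void) {
    srand(42);
    double best_loss = 1e9, best_lr = 0.0;
    int best_bs = 0;

    for (int trial = 0; trial < 20; trial++) {
        /* Sample the learning rate log-uniformly from 1e-5 to 1e-1. */
        double lr = pow(10.0, -5.0 + 4.0 * rand() / (double)RAND_MAX);
        /* Sample the batch size from {16, 32, 64, 128, 256}. */
        int bs = 16 << (rand() % 5);

        double loss = evaluate(lr, bs);
        if (loss < best_loss) {
            best_loss = loss;
            best_lr = lr;
            best_bs = bs;
        }
    }
    printf("best: lr=%g batch=%d loss=%g\n", best_lr, best_bs, best_loss);
    return 0;
}
```

Smarter methods such as Bayesian optimization spend those trials more carefully, and that economy of trials is precisely where the deep learning tooling gap sits.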

The Next Battleground for Deep Learning Performance was written by Nicole Hemsoth at The Next Platform.

Mapping Intel’s Tick Tock Clock Onto Xeon Processors

Chip maker Intel takes Moore’s Law very seriously, and not just because one of its founders observed the consistent rate at which the price of a transistor scales down with each tweak in manufacturing. Moore’s Law is not just personal with Intel. It is business because Intel is a chip maker first and a chip designer second, and that is how it has been able to take over the desktops and datacenters of the world.

Last month, the top brass in Intel’s chip manufacturing operations vigorously defended Moore’s Law, contending that not only was the two year cadence of

Mapping Intel’s Tick Tock Clock Onto Xeon Processors was written by Timothy Prickett Morgan at The Next Platform.

A Trillion Edge Graph on a Single Commodity Node

Efficiently and quickly chewing through one trillion edges of a complex graph is no longer in itself a standalone achievement, but doing so on a single node, albeit with some acceleration and ultra-fast storage, is definitely worth noting.

There are many paths to processing trillions of edges efficiently and with high performance, as demonstrated by companies like Facebook, with its distributed trillion-edge scaling effort across 200 nodes in 2015, and Microsoft, with a similar feat.

However, these approaches all required larger clusters, something that comes with obvious cost, but over the course of scaling across nodes, latency as

A Trillion Edge Graph on a Single Commodity Node was written by Nicole Hemsoth at The Next Platform.

Pushing A Trillion Row Database With GPU Acceleration

There is an arms race in the nascent market for GPU-accelerated databases, and the winner will be the one that can scale to the largest datasets while also providing the most compatibility with industry-standard SQL.

MapD and Kinetica are the leaders in this market, but BlazingDB, Blazegraph, and PG-Strom are also in the field, and we think it won’t be long before the commercial relational database makers start adding GPU acceleration to their products, much as they have followed SAP HANA with in-memory processing.

MapD is newer than Kinetica, and up until now it has been content to allow clustering

Pushing A Trillion Row Database With GPU Acceleration was written by Timothy Prickett Morgan at The Next Platform.

Escher Erases Batching Lines for Efficient FPGA Deep Learning

Aside from the massive parallelism available in modern FPGAs, there are two other key reasons why reconfigurable hardware is finding a fit in neural network processing in both training and inference.

First is the energy efficiency of these devices relative to their performance, and second is the flexibility of an architecture that can be recast to the framework at hand. In the past we’ve described how FPGAs can win out over GPUs, and even custom ASICs, in some cases, and what the future might hold for novel architectures based on reconfigurable hardware for these workloads. But there is still

Escher Erases Batching Lines for Efficient FPGA Deep Learning was written by Nicole Hemsoth at The Next Platform.

Taking the Heavy Lifting Out of TensorFlow at Extreme Scale

There is no real middle ground when it comes to TensorFlow use cases. Most implementations take place either on a single node or at drastic Google scale, with few scalability stories in between.

This is starting to change, however, as more users find an increasing array of open source tools based on MPI and other approaches to hop to multi-GPU scalability for training, but it is still not simple to scale Google’s own framework across larger machines. Code modifications get hairy beyond a single node, and for the MPI uninitiated, there is a steep curve to scalable deep learning.
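
The core pattern underneath those MPI-based tools is data-parallel gradient averaging: each rank trains on its own shard of the data, and the gradients are combined with an allreduce after every step. A minimal sketch in C (the four-element gradient array is a placeholder for a real model’s millions of parameters):

```c
#include <mpi.h>
#include <stdio.h>

#define NGRAD 4

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Placeholder local gradients; a real framework would compute
       these from this rank's shard of the training data. */
    float local[NGRAD], global[NGRAD];
    for (int i = 0; i < NGRAD; i++)
        local[i] = 0.1f * (float)(rank + i);

    /* Sum the gradients across all ranks, then divide by the rank
       count so every worker applies the same averaged update. */
    MPI_Allreduce(local, global, NGRAD, MPI_FLOAT, MPI_SUM, MPI_COMM_WORLD);
    for (int i = 0; i < NGRAD; i++)
        global[i] /= (float)size;

    if (rank == 0)
        printf("averaged gradient[0] = %f\n", global[0]);

    MPI_Finalize();
    return 0;
}
```

Built with mpicc and launched with something like mpirun -np 4, every process ends up with the same averaged gradient, and hence the same updated weights, without any parameter server in the middle.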

Although high performance

Taking the Heavy Lifting Out of TensorFlow at Extreme Scale was written by Nicole Hemsoth at The Next Platform.

Riding The Virtual SAN Gravy Train

Being the first mover in establishing a new technology in the enterprise is important, but it is not more important than having a vast installed base and a sales force peddling an existing and adjacent product set into which to sell a competing and usually lagging technology.

VMware can’t be said to have initially been particularly enthusiastic about server-SAN hybrids like those created by upstart Nutanix, with its Acropolis platform, or pioneer Hewlett Packard Enterprise, which bought into the virtual SAN market with its LeftHand Networks acquisition in October 2008 for $360 million and went back to the hyperconverged well

Riding The Virtual SAN Gravy Train was written by Timothy Prickett Morgan at The Next Platform.