If any new hardware technology is going to get traction in the datacenter, it has to have the software behind it. And as the dominant supplier of commercial Linux, Red Hat’s support of ARM-based servers gives upstart chip makers like Applied Micro, Cavium, and Qualcomm the leverage to help pry the glasshouse doors open and get a slice of the server and storage business that is so utterly dominated by Intel’s Xeon processors today.
It is now or never for ARM in the datacenter, and that means Red Hat has to go all the way and not just support …
Red Hat Is The Gatekeeper For ARM In The Datacenter was written by Jeffrey Burt at The Next Platform.
We have been saying for the past two years that the impending “Skylake” Xeon processors represented the biggest platform architectural change in the Xeon processor business at Intel since the transformational “Nehalem” Xeon 5500s that debuted back in March 2009 into the gaping maw of the Great Recession.
There is no global recession breathing down the IT sector’s neck like a hungry wolf here in 2017, eight years and seven chip generations later. But Intel is facing competitive pressures from AMD’s Naples Opterons, IBM’s Power9, and the ARM collective (mainly Cavium and Qualcomm at this point, but Applied Micro is …
Intel Melds Xeon E5 And E7 With Skylake was written by Timothy Prickett Morgan at The Next Platform.
When it comes to large media companies in the U.S. with a broad reach into television and digital, the Scripps Networks Interactive brand might not come to mind first, but many of its channels and properties are household names, including HGTV, Food Network, and The Travel Channel.
Delivering television and web-based content and services is a data and computationally intensive task, which just over five years ago was handled by on-premises machines in the company’s two local datacenters. In order to keep up with peaks in demand during popular events or programs, Scripps Interactive had to overprovision with those …
An Inside Look at One Major Media Outlet’s Cloud Transition was written by Nicole Hemsoth at The Next Platform.
This fall will mark twenty years since the publication of the v1.0 specification of OpenMP Fortran. From early loop parallelism to a heterogeneous, exascale future, OpenMP has weathered the vicissitudes and tumultuous changes of the computer industry over the past two decades and appears well positioned to address the needs of that exascale future.
In the 1990s, when the OpenMP specification was first created, memory was faster than the processors that performed the computation. This is the exact opposite of today’s systems, where memory is the key bottleneck and the HPC community is rapidly adopting faster memory …
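For readers who have never touched it, the loop-level parallelism that the early specification standardized looks like this in C (a minimal sketch; the original v1.0 document covered Fortran, with C/C++ following a year later):

    #include <omp.h>
    #include <stdio.h>

    #define N 1000000

    int main(void) {
        static double a[N], b[N], c[N];
        double sum = 0.0;

        /* Classic OpenMP worksharing: iterations of the loop are
           divided among the threads in the team at the fork, and the
           threads join again at the end of the loop. */
        #pragma omp parallel for
        for (int i = 0; i < N; i++) {
            a[i] = 2.0 * i;
            b[i] = 3.0 * i;
            c[i] = a[i] + b[i];
        }

        /* Reductions have likewise been there since the beginning. */
        #pragma omp parallel for reduction(+:sum)
        for (int i = 0; i < N; i++)
            sum += c[i];

        printf("sum = %f with up to %d threads\n", sum, omp_get_max_threads());
        return 0;
    }

Compiled with a flag like -fopenmp, the pragmas are the entire porting effort for these loops, which is the productivity argument that has carried OpenMP through two decades.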
OpenMP: From Parallel Loops To Exaflops was written by Timothy Prickett Morgan at The Next Platform.
During the five years that Red Hat has been building out its OpenShift cloud applications platform, much of the focus has been on making it easier to use for customers looking to adapt to an increasingly cloud-centric world for both new and legacy applications. Just as it did with the Linux operating system through Red Hat Enterprise Linux and related middleware and tools, the vendor has worked to make it easier for enterprises to embrace OpenShift.
That has included a major reworking of the platform with the release of version 3.0 last year, which ditched Red Hat’s in-house technologies for …
Red Hat Gears Up OpenShift For Developers was written by Jeffrey Burt at The Next Platform.
Distributed applications, whether they are containerized or not, have a lot of benefits when it comes to modularity and scale. But in a world of feature creep on all applications, whether they are internally facing ones running a business or hyperscale consumer applications like Google’s search engine or Facebook’s social media network, these distributed applications put a huge strain on the network.
This, more than any other factor, is why network costs are rising faster than any other aspect of the datacenter. Gone are the days when everything was done in three or four tiers, with a Web server like …
Lessons Learned From Facebook’s Split Network Backbone was written by Timothy Prickett Morgan at The Next Platform.
If the public cloud computing market were our solar system, then Amazon Web Services would be Jupiter and Saturn together, and the remaining five fast-growing big clouds would be the inner planets – Mercury, Venus, Earth, Mars, and that pile of rocks that used to be a planet – while the clouds that are finding growth a bit more challenging would be the outer bodies: think Uranus and Neptune, and maybe even Pluto if you still want to count it as a planet.
This analogy came to us in the wake of Amazon’s reporting of its financial results for the first quarter of …
The Datacenter Does Not Revolve Around AWS, Despite Its Gravity was written by Timothy Prickett Morgan at The Next Platform.
Over the last couple of decades, those looking for a cluster management platform have faced no shortage of choices. However, large-scale clusters are now being asked to operate in different ways, namely by chewing on deep learning workloads at scale—and this requires a specialized approach to get high utilization, efficiency, and performance.
Nearly all of the cluster management tools from the high performance computing community are being bent in the machine learning direction, but for production deep learning shops, there appears to be a DIY tendency. This is not as complicated as it might sound, given the range of container-based open source tools, …
Cluster Management for Distributed Machine Learning at Scale was written by Nicole Hemsoth at The Next Platform.
Petabytes are in the future of every company, and luckily, the IT ecosystem is always inventing the future needed to handle them.
Those wrestling with tens to hundreds of petabytes of data today are constantly challenged to find the best ways to store, search, and manage it all. Qumulo was founded in 2012 and came out of the chute two years ago with the idea of a software-based file system with built-in analytics that enable the system to increase capacity as the amount of data grows. QSFS, now called Qumulo Core, also does it all: fast with big …
Swiss Army Knife File System Cuts Through Petabytes was written by Jeffrey Burt at The Next Platform.
In the wake of the Technology and Manufacturing Day event that Intel hosted last month, we pondered this week what effect the tick-tock-clock method of advancing chip designs and manufacturing processes might have on the Xeon server chip line from Intel, and we suggested that it might close the gaps between the Core client chips and the Xeons. It turns out that Intel is not only going to close the gaps, but reverse them and put the Xeons on the leading edge.
To be precise, Brian Krzanich, Intel’s chief executive officer, and Robert Swan, the company’s chief financial …
Intel Moves Xeons To The Moore’s Law Leading Edge was written by Timothy Prickett Morgan at The Next Platform.
The frameworks are in place and the hardware infrastructure is robust, but what has been holding machine learning performance back has far less to do with system-level capabilities and far more to do with intense model optimization.
While it might not be the sexy story that generates the unending wave of headlines around deep learning, hyperparameter tuning is a big barrier when it comes to new leaps in deep learning performance. In more traditional machine learning, there are plenty of open source tools for this, but where it is needed most is in deep learning—an area that does appear to …
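To make the term concrete, here is a toy sketch of random search, one of the simplest tuning strategies, in C. The validation_loss function is a synthetic stand-in of our own invention; in a real workflow each call would be a full train-and-validate cycle, which is exactly what makes tuning so expensive for deep learning:

    #include <stdio.h>
    #include <stdlib.h>
    #include <math.h>

    /* Synthetic stand-in for an expensive train-then-validate run; a
       real objective would train a model with these hyperparameters
       and return the measured validation loss. */
    static double validation_loss(double learning_rate, int batch_size) {
        double lr_term = pow(log10(learning_rate) + 3.0, 2.0); /* best near 1e-3 */
        double bs_term = pow((batch_size - 64) / 64.0, 2.0);   /* best near 64 */
        return lr_term + bs_term;
    }

    int main(void) {
        srand(42);
        double best_loss = INFINITY, best_lr = 0.0;
        int best_bs = 0;

        /* Random search: sample hyperparameters, keep the best trial.
           Every iteration would be a full training run in practice. */
        for (int trial = 0; trial < 50; trial++) {
            /* Learning rate sampled log-uniformly from 1e-5 to 1e-1. */
            double lr = pow(10.0, -5.0 + 4.0 * rand() / (double)RAND_MAX);
            int bs = 8 << (rand() % 6);  /* 8, 16, 32, 64, 128, 256 */

            double loss = validation_loss(lr, bs);
            if (loss < best_loss) {
                best_loss = loss;
                best_lr = lr;
                best_bs = bs;
            }
        }
        printf("best trial: lr=%.5f batch=%d loss=%.4f\n",
               best_lr, best_bs, best_loss);
        return 0;
    }

The smarter tools in this space replace the blind sampling loop with Bayesian or bandit-style strategies, but the costly inner call is the same.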
The Next Battleground for Deep Learning Performance was written by Nicole Hemsoth at The Next Platform.
Chip maker Intel takes Moore’s Law very seriously, and not just because one of its founders observed the consistent rate at which the price of a transistor scales down with each tweak in manufacturing. Moore’s Law is not just personal with Intel. It is business because Intel is a chip maker first and a chip designer second, and that is how it has been able to take over the desktops and datacenters of the world.
Last month, the top brass in Intel’s chip manufacturing operations vigorously defended Moore’s Law, contending that not only was the two-year cadence of …
Mapping Intel’s Tick Tock Clock Onto Xeon Processors was written by Timothy Prickett Morgan at The Next Platform.
Efficiently and quickly chewing through one trillion edges of a complex graph is no longer in itself a standalone achievement, but doing so on a single node, albeit with some acceleration and ultra-fast storage, is definitely worth noting.
There are many paths to processing trillions of edges efficiently and with high performance, as demonstrated by companies like Facebook, with its distributed trillion-edge scaling effort across 200 nodes in 2015, and Microsoft, with a similar feat.
However, these approaches all required larger clusters, something that comes with obvious cost, but over the course of scaling across nodes, latency as …
A Trillion Edge Graph on a Single Commodity Node was written by Nicole Hemsoth at The Next Platform.
There is an arms race in the nascent market for GPU-accelerated databases, and the winner will be the one that can scale to the largest datasets while also providing the most compatibility with industry-standard SQL.
MapD and Kinetica are the leaders in this market, but BlazingDB, Blazegraph, and PG-Strom are also in the field, and we think it won’t be long before the commercial relational database makers start adding GPU acceleration to their products, much as they have followed SAP HANA with in-memory processing.
MapD is newer than Kinetica, and up until now it has been content to allow clustering …
Pushing A Trillion Row Database With GPU Acceleration was written by Timothy Prickett Morgan at The Next Platform.
Aside from the massive parallelism available in modern FPGAs, there are two other key reasons why reconfigurable hardware is finding a fit in neural network processing in both training and inference.
First is the energy efficiency of these devices relative to their performance, and second is the flexibility of an architecture that can be recast to the framework at hand. In the past we’ve described how FPGAs can win out over GPUs as well as custom ASICs in some cases, and what the future might hold for novel architectures based on reconfigurable hardware for these workloads. But there is still …
Escher Erases Batching Lines for Efficient FPGA Deep Learning was written by Nicole Hemsoth at The Next Platform.
There is no real middle ground when it comes to TensorFlow use cases. Most implementations run either on a single node or at drastic Google scale, with few scalability stories in between.
This is starting to change, however, as more users find an increasing array of open source tools based on MPI and other approaches to hop to multi-GPU scalability for training, but it is still not simple to scale Google’s own framework across larger machines. Code modifications get hairy beyond a single node, and for the MPI uninitiated, there is a steep curve to scalable deep learning.
Although high performance …
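The pattern those MPI-based tools implement is data-parallel training: each rank computes gradients on its own shard of the training data, and a collective operation averages them before every weight update. Here is a minimal sketch of that collective in plain C with MPI; the gradient values are faked from the rank id, where a real job would pull them out of the framework:

    #include <stdio.h>
    #include <mpi.h>

    #define NPARAMS 4  /* stand-in for a real model's parameter count */

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);
        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* In a real job, each rank computes gradients from its own
           shard of the training data; these are faked from the rank. */
        float local_grads[NPARAMS], avg_grads[NPARAMS];
        for (int i = 0; i < NPARAMS; i++)
            local_grads[i] = (float)(rank + 1) * (i + 1);

        /* The heart of data-parallel training: sum the gradients
           across all ranks, then divide to get the average used in
           the next weight update. */
        MPI_Allreduce(local_grads, avg_grads, NPARAMS, MPI_FLOAT,
                      MPI_SUM, MPI_COMM_WORLD);
        for (int i = 0; i < NPARAMS; i++)
            avg_grads[i] /= (float)size;

        if (rank == 0)
            printf("averaged gradient[0] = %f across %d ranks\n",
                   avg_grads[0], size);

        MPI_Finalize();
        return 0;
    }

Launched with something like mpirun -np 4, the allreduce is the same collective the scalable TensorFlow efforts lean on, just buried under friendlier APIs.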
Taking the Heavy Lifting Out of TensorFlow at Extreme Scale was written by Nicole Hemsoth at The Next Platform.
Being the first mover in establishing a new technology in the enterprise is important, but it is not more important than having a vast installed base and a sales force peddling an existing and adjacent product set into which to sell a competing, and usually lagging, technology.
VMware can’t be said to have initially been particularly enthusiastic about server-SAN hybrids like those created by upstart Nutanix, with its Acropolis platform, or pioneer Hewlett Packard Enterprise, which bought into the virtual SAN market with its LeftHand Networks acquisition in October 2008 for $360 million and went back to the hyperconverged well …
Riding The Virtual SAN Gravy Train was written by Timothy Prickett Morgan at The Next Platform.
In high performance computing, machine learning, and a growing set of other application areas, accelerated, heterogeneous systems are becoming the norm.
With that shift come several parallel programming approaches: OpenMP, OpenACC, OpenCL, CUDA, and others. The trick is choosing the right framework for maximum performance and efficiency—but also productivity.
There have been several studies comparing relative performance between the various frameworks over the last several years, but many pit just two against each other on a single benchmark or application. A team from Linnaeus University in Sweden took these comparisons a step further by developing a custom tool …
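For a flavor of what such comparisons have to control for, here is the same SAXPY-style kernel under two of the directive-based dialects, a minimal sketch that assumes host CPU threads for the OpenMP version and an attached accelerator for the OpenACC one:

    #include <stdio.h>
    #include <stdlib.h>

    #define N (1 << 20)

    void saxpy_omp(int n, float a, const float *x, float *y) {
        /* OpenMP: loop iterations shared across host CPU threads. */
        #pragma omp parallel for
        for (int i = 0; i < n; i++)
            y[i] = a * x[i] + y[i];
    }

    void saxpy_acc(int n, float a, const float *x, float *y) {
        /* OpenACC: the loop is offloaded to an accelerator, with the
           runtime copying x in and moving y both ways. */
        #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
        for (int i = 0; i < n; i++)
            y[i] = a * x[i] + y[i];
    }

    int main(void) {
        float *x = malloc(N * sizeof(float));
        float *y = malloc(N * sizeof(float));
        for (int i = 0; i < N; i++) { x[i] = 1.0f; y[i] = 2.0f; }

        saxpy_omp(N, 3.0f, x, y);  /* y becomes 5.0 everywhere */
        saxpy_acc(N, 3.0f, x, y);  /* y becomes 8.0 everywhere */

        printf("y[0] = %f\n", y[0]);
        free(x);
        free(y);
        return 0;
    }

The source is nearly identical; build flags (for example, -fopenmp versus -fopenacc with GCC) select the backend, which is exactly why apples-to-apples measurement takes dedicated tooling.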
Parallel Programming Approaches for Accelerated Systems Compared was written by Nicole Hemsoth at The Next Platform.
International Business Machines has gone through so many changes in its eleven decades of existence, and some days it is important to remember that. If IBM’s recent changes are a bit bewildering, as they were in the late 1980s, the middle 1990s, and the early 2010s in particular, they are perhaps nothing compared to the changes that were wrought to transform a maker of meat slicers, time clocks, and tabulating equipment derived from looms into a computing giant.
Yeah, and you thought turning GPUs into compute engines was a stretch.
Herman Hollerith, who graduated from Columbia University in 1879 when its engineering school was still …
International Cognitive And Cloud Business Machines was written by Timothy Prickett Morgan at The Next Platform.
Chip maker Intel is getting out of the business of trying to make money with a commercially supported release of the high-end Lustre parallel file system. Lustre is commonly used at HPC centers and is increasingly deployed by enterprises to take on their biggest file system jobs.
But don’t jump too far to any other conclusions. The core development and support team, minus a few key people who have already left, remains at Intel and will be working on Lustre for the foreseeable future.
Intel quietly announced its plans to shutter its Lustre commercialization efforts in a posting earlier this …
Intel Shuts Down Lustre File System Business was written by Timothy Prickett Morgan at The Next Platform.