Chip maker Intel takes Moore’s Law very seriously, and not just because one of its founders observed the consistent rate at which the price of a transistor scales down with each tweak in manufacturing. Moore’s Law is not just personal with Intel. It is business because Intel is a chip maker first and a chip designer second, and that is how it has been able to take over the desktops and datacenters of the world.
Last month, the top brass in Intel’s chip manufacturing operations vigorously defended Moore’s Law, contending that not only was the two year cadence of …
Mapping Intel’s Tick Tock Clock Onto Xeon Processors was written by Timothy Prickett Morgan at The Next Platform.
Efficiently and quickly chewing through one trillion edges of a complex graph is no longer in itself a standalone achievement, but doing so on a single node, albeit with some acceleration and ultra-fast storage, is definitely worth noting.
There are many paths to processing trillions of edges efficiently and with high performance as demonstrated by companies like Facebook with its distributed trillion-edge scaling effort across 200 nodes in 2015 and Microsoft with a similar feat as well.
However, these approaches all required larger clusters; something that comes with obvious cost but over the course of scaling across nodes, latency as …
A Trillion Edge Graph on a Single Commodity Node was written by Nicole Hemsoth at The Next Platform.
There is an arms race in the nascent market for GPU-accelerated databases, and the winner will be the one that can scale to the largest datasets while also providing the most compatibility with industry-standard SQL.
MapD and Kinetica are the leaders in this market, but BlazingDB, Blazegraph, and PG-Strom are also in the field, and we think it won’t be long before the commercial relational database makers start adding GPU acceleration to their products, much as they have followed SAP HANA with in-memory processing.
MapD is newer than Kinetica, and up until now, it has been content to allow clustering …
Pushing A Trillion Row Database With GPU Acceleration was written by Timothy Prickett Morgan at The Next Platform.
Aside from the massive parallelism available in modern FPGAs, there are two other key reasons why reconfigurable hardware is finding a fit in neural network processing in both training and inference.
First is the energy efficiency of these devices relative to performance, and second is the flexibility of an architecture that can be recast to the framework at hand. In the past we’ve described how FPGAs can fit over GPUs as well as custom ASICs in some cases, and what the future might hold for novel architectures based on reconfigurable hardware for these workloads. But there is still …
Escher Erases Batching Lines for Efficient FPGA Deep Learning was written by Nicole Hemsoth at The Next Platform.
There is no real middle ground when it comes to TensorFlow use cases. Most implementations take place either in a single node or at the drastic Google-scale, with few scalability stories in between.
This is starting to change, however, as more users find an increasing array of open source tools based on MPI and other approaches to hop to multi-GPU scalability for training, but it is still not simple to scale Google’s own framework across larger machines. Code modifications get hairy beyond a single node, and for the MPI uninitiated, there is a steep curve to scalable deep learning.
Although high performance …
Taking the Heavy Lifting Out of TensorFlow at Extreme Scale was written by Nicole Hemsoth at The Next Platform.
Being the first mover in establishing a new technology in the enterprise is important, but it is not more important than having a vast installed base and sales force peddling an existing and adjacent product set in which to sell a competing and usually lagging technology.
VMware can’t be said to have initially been particularly enthusiastic about server-SAN hybrids like those created by upstart Nutanix, with its Acropolis platform, or pioneer Hewlett Packard Enterprise, which bought into the virtual SAN market with its LeftHand Networks acquisition in October 2008 for $360 million and went back to the hyperconverged well …
Riding The Virtual SAN Gravy Train was written by Timothy Prickett Morgan at The Next Platform.
In high performance computing, machine learning, and a growing set of other application areas, accelerated, heterogeneous systems are becoming the norm.
With that state come several parallel programming approaches, including OpenMP, OpenACC, OpenCL, CUDA, and others. The trick is choosing the right framework for maximum performance and efficiency, but also productivity.
There have been several studies comparing relative performance between the various frameworks over the last several years, but many pit just two frameworks head to head on a single benchmark or application. A team from Linnaeus University in Sweden took these comparisons a step further by developing a custom tool …
Parallel Programming Approaches for Accelerated Systems Compared was written by Nicole Hemsoth at The Next Platform.
International Business Machines has gone through so many changes in its eleven decades of existence, and it is important to remember that some days. If IBM’s recent changes are a bit bewildering, as they were in the late 1980s, the middle 1990s, and the early 2010s in particular, they are perhaps nothing compared to the changes that were wrought to transform a maker of meat slicers, time clocks, and tabulating equipment derived from looms.
Yeah, and you thought turning GPUs into compute engines was a stretch.
Herman Hollerith, who graduated from Columbia University in 1879 when its engineering school was still …
International Cognitive And Cloud Business Machines was written by Timothy Prickett Morgan at The Next Platform.
Chip maker Intel is getting out of the business of trying to make money with a commercially supported release of the high-end Lustre parallel file system. Lustre is commonly used at HPC centers and is increasingly deployed by enterprises to take on their biggest file system jobs.
But don’t jump too far to any other conclusions. The core development and support team, minus a few key people who have already left, remains at Intel and will be working on Lustre for the foreseeable future.
Intel quietly announced its plans to shutter its Lustre commercialization efforts in a posting earlier this …
Intel Shuts Down Lustre File System Business was written by Timothy Prickett Morgan at The Next Platform.
It all started with a new twist on an old idea, that of a lightweight software container running inside Linux that would house applications and make them portable. And now Docker is coming full circle and completing its eponymous platform by opening up the tools to allow users to create their own minimalist Linux operating system that is containerized and modular above the kernel and that only gives applications precisely what they need to run.
The new LinuxKit is not so much a variant of Linux as a means of creating them. The toolkit for making Linuxes, which was unveiled …
Docker Completes Its Platform With DIY Linux was written by Timothy Prickett Morgan at The Next Platform.
Scaling the performance of machine learning frameworks so they can train larger neural networks – or do the same training a lot faster – has meant that the hyperscalers of the world who are essentially creating this technology have had to rely on increasingly beefy compute nodes, these days almost universally augmented with GPUs.
There is a healthy rivalry between the hyperscalers over who has the best machine learning framework and the co-designed iron to take the best advantage of its capabilities. At its F8 developer conference, Facebook not only rolled out a significantly tweaked variant of the open source …
Machine Learning Gets An InfiniBand Boost With Caffe2 was written by Timothy Prickett Morgan at The Next Platform.
So you are a system architect, and you want to make the databases behind your applications run a lot faster. There are a lot of different ways to accomplish this, and now, there is yet another — and perhaps more disruptive — one.
You can move the database storage from disk drives to flash memory. You can move from a row-based database to a columnar data store that segments data and speeds up accesses to it. And for even more of a performance boost, you can pull that columnar data into main memory to be read and manipulated at memory …
FPGAs To Shake Up Stodgy Relational Databases was written by Timothy Prickett Morgan at The Next Platform.
The fields where machine learning and neural networks can have positive impacts seem almost limitless. From healthcare and genomics to pharmaceutical development, oil and gas exploration, retail, smart cities and autonomous vehicles, the ability to rapidly and automatically find patterns in massive amounts of data promises to help solve increasingly complex problems and speed up discoveries that will improve lives, create a healthier world and make businesses more efficient.
Climate science is one of those fields that will see significant benefits from machine learning, and scientists in the field are pushing hard to see how the technology can help them …
Machine Learning Storms Into Climate Research was written by Jeffrey Burt at The Next Platform.
There is no question that plenty of companies are shifting their storage infrastructure from giant NAS and SAN appliances to more generic file, block, and object storage running on plain vanilla X86 servers equipped with flash and disk. And similarly, companies are looking to the widespread availability of dual-ported NVM-Express drives on servers to give them screaming flash performance on those storage servers.
But the fact remains that very few companies want to build and support their own storage servers, and moreover, there is still room for an appliance approach to these commodity components for enterprises that want to buy …
Hyperscaling With Consumer Flash And NVM-Express was written by Timothy Prickett Morgan at The Next Platform.
There is increasing interplay between the worlds of machine learning and high performance computing (HPC). This began with a shared hardware and software story since many supercomputing tricks of the trade play well into deep learning, but as we look to next generation machines, the bond keeps tightening.
Many supercomputing sites are figuring out how to work deep learning into their existing workflows, either as a pre- or post-processing step, while some research areas might do away with traditional supercomputing simulations altogether eventually. While these massive machines were designed with simulations in mind, the strongest supers have architectures that parallel …
China Pushes Breadth-First Search Across Ten Million Cores was written by Nicole Hemsoth at The Next Platform.
When Red Hat began building out its OpenShift cloud application platform more than five years ago, the open source software vendor found itself in a similar situation as others in the growing platform-as-a-service (PaaS) space: they were all using technologies developed in-house because there were no real standards in the industry that could be used to guide them.
That changed about three years ago, when Google officials decided to open source the technology – called Borg – they were using internally to manage the search giant’s clusters and make it available to the wider community. Thus was born Kubernetes, …
Red Hat Tunes Up OpenShift For Legacy Code In Kubernetes was written by Jeffrey Burt at The Next Platform.
Intel might have its own thoughts about the trajectory of Moore’s Law, but many leaders in the industry have views that vary slightly from the tick-tock we keep hearing about.
Sophie Wilson, designer of the original Acorn Micro-Computer in the 1970s and later developer of the instruction set for ARM’s low-power processors that have come to dominate the mobile device world, has such thoughts. And when Wilson talks about processors and the processor industry, people listen.
Wilson’s message is essentially that Moore’s Law, which has been the driving force behind chip development in particular and the computer industry …
ARM Pioneer Sophie Wilson Also Thinks Moore’s Law Coming to an End was written by Jeffrey Burt at The Next Platform.
Just two years ago, supercomputing was thrust into a larger spotlight because of the surge of interest in deep learning. As we talked about here, the hardware similarities, particularly for training on GPU-accelerated machines and key HPC development approaches, including MPI to scale across a massive number of nodes, brought new attention to the world of scientific and technical computing.
What wasn’t clear then was how traditional supercomputing could benefit from all the framework developments in deep learning. After all, they had many of the same hardware environments and problems that could benefit from prediction, but what they lacked …
Supercomputing Gets Neural Network Boost in Quantum Chemistry was written by Nicole Hemsoth at The Next Platform.
Google created quite a stir when it released architectural details and performance metrics for its homegrown Tensor Processing Unit (TPU) accelerator for machine learning algorithms last week. But as we (and many of you reading) pointed out, comparing the TPU to earlier “Kepler” generation GPUs from Nvidia was not exactly a fair comparison. Nvidia has done much in the “Maxwell” and “Pascal” GPU generations specifically to boost machine learning performance.
To set the record straight, Nvidia took some time and ran some benchmarks of its own to put the performance of its latest Pascal accelerators, particularly the ones it aims …
Does Google’s TPU Investment Make Sense Going Forward? was written by Timothy Prickett Morgan at The Next Platform.
There has been much discussion about the “black box” problem of neural networks. Sophisticated models can perform well on predictive workloads, but when it comes to backtracking how the system came to its end result, there is no clear way to understand what went right or wrong, or how the model turned on itself to arrive at a conclusion.
For old-school machine learning models, this was not quite the problem it is now with non-linear, hidden data structures and countless parameters. For researchers deploying neural networks for scientific applications, this lack of reproducibility from the black box presents validation hurdles, but for …
A Look at Facebook’s Interactive Neural Network Visualization System was written by Nicole Hemsoth at The Next Platform.