Author Archives: Timothy Prickett Morgan
Author Archives: Timothy Prickett Morgan
It all started with a new twist on an old idea, that of a lightweight software container running inside Linux that would house applications and make them portable. And now Docker is coming full circle and completing its eponymous platform by opening up the tools to allow users to create their own minimalist Linux operating system that is containerized and modular above the kernel and that only gives applications precisely what they need to run.
The new LinuxKit is not so much a variant of Linux as a means of creating them. The toolkit for making Linuxes, which was unveiled …
Docker Completes Its Platform With DIY Linux was written by Timothy Prickett Morgan at The Next Platform.
Scaling the performance of machine learning frameworks so they can train larger neural networks – or so the same training a lot faster – has meant that the hyperscalers of the world who are essentially creating this technology have had to rely on increasingly beefy compute nodes, these days almost universally augmented with GPUs.
There is a healthy rivalry between the hyperscalers over who has the best machine learning framework and the co-designed iron to take the best advantage of its capabilities. At its F8 developer conference, Facebook not only rolled out a significantly tweaked variant of the open source …
Machine Learning Gets An InfiniBand Boost With Caffe2 was written by Timothy Prickett Morgan at The Next Platform.
So you are a system architect, and you want to make the databases behind your applications run a lot faster. There are a lot of different ways to accomplish this, and now, there is yet another — and perhaps more disruptive — one.
You can move the database storage from disk drives to flash memory, You can move from a row-based database to a columnar data store that segments data and speeds up accesses to it. And for even more of a performance boost, you can pull that columnar data into main memory to be read and manipulated at memory …
FPGAs To Shake Up Stodgy Relational Databases was written by Timothy Prickett Morgan at The Next Platform.
There is no question that plenty of companies are shifting their storage infrastructure from giant NAS and SAN appliances to more generic file, block, and object storage running on plain vanilla X86 servers equipped with flash and disk. And similarly, companies are looking to the widespread availability of dual-ported NVM-Express drives on servers to give them screaming flash performance on those storage servers.
But the fact remains that very few companies want to build and support their own storage servers, and moreover, there is still room for an appliance approach to these commodity components for enterprises that want to buy …
Hyperscaling With Consumer Flash And NVM-Express was written by Timothy Prickett Morgan at The Next Platform.
Google created quite a stir when it released architectural details and performance metrics for its homegrown Tensor Processing Unit (TPU) accelerator for machine learning algorithms last week. But as we (and many of you reading) pointed out, comparing the TPU to earlier “Kepler” generation GPUs from Nvidia was not exactly a fair comparison. Nvidia has done much in the “Maxwell” and “Pascal” GPU generations specifically to boost machine learning performance.
To set the record straight, Nvidia took some time and ran some benchmarks of its own to put the performance of its latest Pascal accelerators, particularly the ones it aims …
Does Google’s TPU Investment Make Sense Going Forward? was written by Timothy Prickett Morgan at The Next Platform.
If you want an object lesson in the interplay between Moore’s Law, Dennard scaling, and the desire to make money from selling chips, you need look no further than the past several years of Intel’s Xeon E3 server chip product lines.
The Xeon E3 chips are illustrative particularly because Intel has kept the core count constant for these processors, which are used in a variety of gear, from workstations (remote and local), entry servers to storage controllers to microservers employed at hyperscalers and even for certain HPC workloads (like Intel’s own massive EDA chip design and validation farms). …
Xeon E3: A Lesson In Moore’s Law And Dennard Scaling was written by Timothy Prickett Morgan at The Next Platform.
While a lot of the applications in the world run on clusters of systems with a relatively modest amount of compute and memory compared to NUMA shared memory systems, big iron persists and large enterprises want to buy it. That is why IBM, Fujitsu, Oracle, Hewlett Packard Enterprise, Inspur, NEC, Unisys, and a few others are still in the big iron racket.
Fujitsu and its reseller partner – server maker, database giant, and application powerhouse Oracle – have made a big splash at the high end of the systems space with a very high performance processor, the Sparc64-XII, and a …
Fujitsu Takes On IBM Power9 With Sparc64-XII was written by Timothy Prickett Morgan at The Next Platform.
The tick-tock-clock three step dance that Intel will be using to progress its Core client and Xeon server processors in the coming years is on full display now that the Xeon E3-1200 v6 processors based on the “Kaby Lake” have been unveiled.
The Kaby Lake chips are Intel’s third generation of Xeon processors that are based on its 14 nanometer technologies, and as our naming convention for Intel’s new way of rolling out chips suggests, it is a refinement of both the architecture and the manufacturing process that, by and large, enables Intel to ramp up the clock speed on …
Intel “Kaby Lake” Xeon E3 Sets The Server Cadence was written by Timothy Prickett Morgan at The Next Platform.
It is almost a foregone conclusion that when it comes to infrastructure, the industry will follow the lead of the big hyperscalers and cloud builders, building a foundation of standardized hardware for serving, storing, and switching and implementing as much functionality and intelligence as possible in the software on top of that to allow it to scale up and have costs come down as it does.
The reason this works is that these companies have complete control of their environments, from the processors and memory in the supply chain to the Linux kernel and software stack maintained by hundreds to …
Weaving Together Flash For Nearly Unlimited Scale was written by Timothy Prickett Morgan at The Next Platform.
With absolute dominance in datacenter and desktop compute, considerable sway in datacenter storage, a growing presence in networking, and profit margins that are the envy of the manufacturing and tech sectors alike, it is not a surprise that companies are gunning for Intel. They all talk about how Moore’s Law is dead and how that removes a significant advantage for the world’s largest – and most profitable – chip maker.
After years of this, the top brass in Intel’s Technology and Manufacturing Group as well as its former chief financial officer, who is now in charge of its manufacturing, operations, …
Intel Vigorously Defends Chip Innovation Progress was written by Timothy Prickett Morgan at The Next Platform.
The ramp for Intel’s Optane 3D XPoint memory, which sits between DDR4 main memory and flash or disk storage, or beside main memory, in the storage hierarchy, is going to shake up the server market. And maybe not in the ways that Intel and its partner, Micron Technology, anticipate.
Last week, Intel unveiled its first Optane 3D XPoint solid state cards and drives, which are now being previewed by selected hyperscalers and which will be rolling out in various capacities and form factors in the coming quarters. As we anticipated, and as Intel previewed last fall, the company is …
Use Optane Memory Like A Hyperscaler was written by Timothy Prickett Morgan at The Next Platform.
An increasing amount of the world’s data is encapsulated in images and video and by its very nature it is difficult and extremely compute intensive to do any kind of index and search against this data compared to the relative ease with which we can do so with the textual information that heretofore has dominated both our corporate and consumer lives.
Initially, we had to index images by hand and it is with these datasets that the hyperscalers pushed the envelope with their image recognition algorithms, evolving neural network software on CPUs and radically improving it with a jump to …
Facebook Pushes The Search Envelope With GPUs was written by Timothy Prickett Morgan at The Next Platform.
Increasing parallelism is the only way to get more work out of a system. Architecting for that parallelism required requires a lot of rethinking of each and every component in a system to make everything hum along as efficiently as possible.
There are lots of ways to skin the parallelism cats and squeeze more performance and less energy out of the system, and for DRAM memory, just stacking things up helps, but according to some research done at Stanford University, the University of Texas, and GPU maker Nvidia, there is another way to boost performance and lower energy consumption. The …
Squeezing The Joules Out Of DRAM, Possibly Without Stacking was written by Timothy Prickett Morgan at The Next Platform.
Ethernet switching has its own kinds of Moore’s Law barriers. The transition from 10 Gb/sec to 100 Gb/sec devices over the past decade has been anything but smooth, and a lot of compromises had to be made to even get to the interim – and unusual – 40 Gb/sec stepping stone towards the 100 Gb/sec devices that are ramping today in the datacenter.
While 10 Gb/sec Ethernet switching is fine for a certain class of enterprise applications that are not bandwidth hungry, for the hyperscalers and cloud builders, 100 Gb/sec is nowhere near enough bandwidth, and 200 Gb/sec, which is …
Upstart Switch Chip Maker Tears Up The Ethernet Roadmap was written by Timothy Prickett Morgan at The Next Platform.
ARM has rearchitected its multi-core chips to they can better compete in a world where computing needs are becoming more specialized.
The new DynamIQ architecture will provide flexible compute, with up to eight different cores in a single cluster on a system on a chip. Each core can run at a different clock speed so a company making an ARM SoC can tailor the silicon to handle multiple workloads at varying power efficiencies. The DynamIQ architecture also adds faster access to accelerators for artificial intelligence or networking jobs, and a resiliency that allows it to be used in robotics, autonomous …
New ARM Architecture Offers A DynamIQ Response To Compute was written by Timothy Prickett Morgan at The Next Platform.
In the datacenter, flash memory took off first as a caching layer between processors and their cache memories and main memory and the ridiculously slow disk drives that hang off the PCI-Express bus on the systems. It wasn’t until the price of flash came way down and the capacities of flash card and drives came down that companies could think about going completely to flash for some, much less all of their workloads.
So it will be with Intel’s Optane 3D XPoint non-volatile memory, which Intel is starting to roll out in its initial datacenter-class SSDs and will eventually deliver …
Like Flash, 3D XPoint Enters The Datacenter As Cache was written by Timothy Prickett Morgan at The Next Platform.
The hyperscalers of the world are increasingly dependent on machine learning algorithms for providing a significant part of the user experience and operations of their massive applications, so it is not much of a surprise that they are also pushing the envelope on machine learning frameworks and systems that are used to deploy those frameworks. Facebook and Microsoft were showing off their latest hybrid CPU-GPU designs at the Open Compute Summit, and they provide some insight into how to best leverage Nvidia’s latest “Pascal” Tesla accelerators.
Not coincidentally, the specialized systems that have been created for supporting machine learning workloads …
Open Hardware Pushes GPU Computing Envelope was written by Timothy Prickett Morgan at The Next Platform.
With over 700 employees, Cineca is Italy’s largest and most advanced high performance computing (HPC) center, channeling their systems expertise to benefit organizations across the nation. Comprised of six Italian research institutions, 70 Italian universities, and the Italian Ministry of Education, Cineca is a privately held, non-profit consortium.
The team at Cineca dedicates itself to tackling the greatest computational challenges faced by public and private companies, and research institutions. With so many organizations depending on Italy’s HPC centers, Cineca relies on Intel® technologies to reliably and efficiently further the country’s innovations in scientific computing, web and networking-based services, big data …
Cineca’s HPC Systems Tackle Italy’s Biggest Computing Challenges was written by Timothy Prickett Morgan at The Next Platform.
The HPC community is trying to solve the critical compute challenges of next generation high performance computing and ARM considers itself well-positioned to act as a catalyst in this regard. Applications like machine learning and scientific computing are driving demands for orders of magnitude improvements in capacity, capability and efficiency to achieve exascale computing for next generation deployments.
ARM has been taking a co-design approach with the ecosystem from silicon to system design to application development to provide innovative solutions that address this challenge. The recent Allinea acquisition is one example of ARM’s commitment to HPC, but ARM has worked …
ARM Antes Up For An HPC Software Stack was written by Timothy Prickett Morgan at The Next Platform.
Having a proliferation of server makes and models over a span of years in the datacenter is not a huge deal for most enterprises. They cope with the diversity because they support a diversity of application and can kind of keep things isolated and, moreover, IT may be integral to their product or service, but it is usually not the actual product or service that they sell.
Not so with hyperscalers and cloud builders. For them, the IT is the product, and keeping things as monolithic and consistent as possible lowers the cost of goods purchased through higher volumes and …
A Peek Inside Facebook’s Server Fleet Upgrade was written by Timothy Prickett Morgan at The Next Platform.