There is an adage, not quite yet old, suggesting that compute is free but storage is not. Perhaps a more accurate and, as far as public clouds are concerned, apt adaptation of this saying might be that computing and storage are free, and so is inbound networking within a region, but moving data around in a public cloud is brutally expensive, and it is even more costly when spanning regions.
So much so that, at a certain scale, it makes sense to build your own datacenter and create your own infrastructure hardware and software stack that mimics the salient characteristics …
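To make the economics concrete, here is a minimal back-of-the-envelope sketch in Python; the per-GB rates are hypothetical placeholders for illustration, not any particular cloud's actual price list:

```python
# Rough, illustrative arithmetic only; the per-GB rates below are hypothetical
# placeholders, not any particular cloud's actual price list.
INBOUND_PER_GB = 0.00        # inbound traffic within a region: free
CROSS_REGION_PER_GB = 0.02   # assumed cross-region transfer rate, dollars per GB

def monthly_transfer_cost(terabytes_moved, rate_per_gb):
    """Cost of moving a given number of terabytes each month at a flat per-GB rate."""
    return terabytes_moved * 1024 * rate_per_gb

# Moving 500 TB between regions every month at the assumed rate:
print(monthly_transfer_cost(500, CROSS_REGION_PER_GB))   # 10240.0 dollars per month
print(monthly_transfer_cost(500, INBOUND_PER_GB))        # 0.0 for inbound traffic
```

At that kind of run rate, the appeal of amortizing your own infrastructure instead becomes obvious.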
Bouncing Back To Private Clouds With OpenStack was written by Timothy Prickett Morgan at The Next Platform.
Large enterprises are embracing NVM-Express flash as the storage technology of choice for their data intensive and often highly unpredictable workloads. NVM-Express devices bring with them high performance – up to 1 million I/O operations per second – and low latency – less than 100 microseconds. And flash storage now has high capacity, too, making it a natural fit for such datacenter applications.
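For a sense of what those headline numbers imply, a quick back-of-the-envelope conversion helps; the 4 KB I/O size below is an assumption for illustration, not a figure from the article:

```python
# Back-of-the-envelope conversion of the headline NVM-Express numbers.
# The 4 KB transfer size is an assumption for illustration.
iops = 1_000_000           # I/O operations per second
io_size_bytes = 4 * 1024   # assumed 4 KB per operation
latency_s = 100e-6         # less than 100 microseconds per operation

throughput_gb_s = iops * io_size_bytes / 1e9
print(f"{throughput_gb_s:.1f} GB/s")   # ~4.1 GB/s of sustained 4 KB traffic

# By Little's Law, sustaining 1 million IOPS at 100 microseconds per I/O
# requires roughly iops * latency outstanding operations at any moment:
print(iops * latency_s)                # ~100 I/Os in flight
```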
As we have discussed here before, all-flash arrays are quickly becoming mainstream, particularly within larger enterprises, as an alternative to disk drives in environments where tens or hundreds of petabytes of data – rather than the …
Making Remote NVM-Express Flash Look Local And Fast was written by Jeffrey Burt at The Next Platform.
Exascale computing, which has long been talked about, is now – if everything remains on track – only a few years away. Billions of dollars are being spent worldwide to develop systems capable of an exaflops of computation, which is 50 times the performance of the most capable systems in the current Top500 supercomputer rankings and will usher in the next generation of HPC workloads.
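Working the arithmetic backward from that 50X figure gives a sense of scale; the baseline below is simply what the multiplier implies:

```python
# "Exa-" denotes 10**18, so an exaflops is 1,000 petaflops.
exaflops = 1e18
petaflops = 1e15
print(exaflops / petaflops)        # 1,000 petaflops in an exaflops

# Working backward from the 50X multiplier, the implied baseline is:
print(exaflops / 50 / petaflops)   # ~20 petaflops for a leading current system
```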
As we have talked about at The Next Platform, China is pushing ahead with three projects aimed at delivering exascale systems to the market, with a prototype – dubbed the Tianhe-3 – being prepped for …
AMD Researchers Eye APUs For Exascale was written by Jeffrey Burt at The Next Platform.
With a new generation of Xeon processors coming out later this year from Intel and AMD trying to get back in the game with its own X86 server chips – they probably will not be called Opterons – it is not a surprise to us that server makers are having a bit of trouble making their numbers in recent months. But we are beginning to wonder if something more might be going on here than the usual pause before a big set of processor announcements.
In many ways, server spending is a leading indicator because when companies are willing to …
Mixed Signals From Server Land was written by Timothy Prickett Morgan at The Next Platform.
Computed tomography (CT) is a widely-used process in medicine and industry. Many X-ray images taken around a common axis of rotation are combined to create a three-dimensional view of an object, including the interior.
In medicine, this technique is commonly used for non-invasive diagnostic applications such as searching for cancerous masses. Industrial applications include examining metal components for stress fractures and comparing produced materials to the original computer-aided design (CAD) specifications. While this process provides invaluable insight, it also presents an analytical challenge.
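The reconstruction step described above – combining many one-dimensional X-ray projections acquired around a common axis into a cross-sectional image – can be sketched in a few lines of Python. This is a toy, unfiltered back projection under assumed array conventions; production scanners use filtered back projection or iterative solvers:

```python
import numpy as np
from scipy.ndimage import rotate

def backproject(sinogram, angles_deg):
    """Toy, unfiltered back projection of a single 2D slice.

    sinogram   : (num_angles, num_detector_bins) array, one row per X-ray projection
    angles_deg : acquisition angle in degrees for each row of the sinogram
    """
    n = sinogram.shape[1]
    recon = np.zeros((n, n))
    for proj, angle in zip(sinogram, angles_deg):
        smear = np.tile(proj, (n, 1))                          # smear the 1D projection across the plane
        recon += rotate(smear, angle, reshape=False, order=1)  # rotate it back to its acquisition angle
    return recon / len(angles_deg)
```

Stacking reconstructed slices along the rotation axis yields the three-dimensional view, which is where the analytical and compute demands come from.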
State-of-the-art CT scanners use synchrotron light, which enables very fine resolution in four dimensions. For example, the …
Scaling Compute to Meet Large-Scale CT Scan Demands was written by Nicole Hemsoth at The Next Platform.
Cloud computing in its various forms is often pitched as a panacea of sorts for organizations that are looking to increase the flexibility of their data and to drive down costs associated with their IT infrastructures. And for many, the benefits are real.
By offloading many of their IT tasks – from processing increasingly large amounts of data to storing all that data – to cloud providers, companies can take the money normally spent in building out and managing their internal IT infrastructures and put it toward other important business efforts. In addition, by having their data in an easily …
Financial Institutions Weigh Risks, Benefits of Cloud Migration was written by Jeffrey Burt at The Next Platform.
Moore’s Law has been the driving force behind computer evolution for more than five decades, fueling the relentless innovation that led to more transistors being added to increasingly smaller processors that rapidly increased the performance of computing while at the same time driving down the cost.
Fifty-plus years later, as the die continues to shrink, there are signs that Moore’s Law is getting more difficult to keep up with. For example, Intel – the keeper of the Moore’s Law flame – has pushed back the transition from 14-nanometers to 10nm by more than a year as it worked through issues …
A Glimmer of Light Against Dark Silicon was written by Jeffrey Burt at The Next Platform.
The idea of physically bringing the compute and memory functions of a system closer together to accelerate the processing of data is not a new one.
Some two decades ago, vendors and researchers began to explore processing-in-memory (PIM), the concept of placing compute units like CPUs and GPUs closer to memory to help reduce the latency and cost inherent in transferring data, and building prototypes with names like EXECUBE, IRAM, DIVA and FlexRAM. For HPC environments that relied on data-intensive applications, the idea made a lot of sense. Reduce the distance between where data was …
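A rough, illustrative calculation shows why shrinking that distance matters; the energy figures below are placeholder values of the kind commonly quoted in exascale studies, not numbers from the article:

```python
# Rough illustration of why moving data costs more than computing on it.
# The energy figures are placeholder values of the kind commonly quoted in
# exascale studies, not measurements from the article.
PJ_PER_DP_FLOP = 20        # one double-precision floating point op, picojoules
PJ_PER_DRAM_WORD = 1300    # fetching one 64-bit word from off-chip DRAM, picojoules

# A simple dot product performs one multiply-add (two flops) per element
# and streams two operands per element in from memory:
n = 1_000_000
compute_pj  = 2 * n * PJ_PER_DP_FLOP     # energy for the arithmetic itself
movement_pj = 2 * n * PJ_PER_DRAM_WORD   # energy to move both vectors from DRAM

print(movement_pj / compute_pj)          # data movement costs ~65x the arithmetic
```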
Promises, Challenges Ahead for Near-Memory, In-Memory Processing was written by Jeffrey Burt at The Next Platform.
One way to characterize the challenges of achieving exascale is to look at how advancing compute, memory/storage, software, and fabric will lead to a future-generation balanced system. Recently Al Gara of Intel, Jean-Philippe Nominé of the French Alternative Energies and Atomic Energy Commission (CEA), and Katherine Riley of Argonne National Lab were on a panel that weighed in on these and a host of other interrelated challenges.
Exascale will represent a watershed achievement in computer science. More than just a nice, round number (“exa-” denotes a billion billion), exascale computing is also supposed by the Human Brain Project and …
Exascale Leaders on Next Horizons in Supercomputing was written by Nicole Hemsoth at The Next Platform.
During the past decade, enterprises have begun using machine learning (ML) to collect and analyze large amounts of data to obtain a competitive advantage. Now some are looking to go even deeper – using a subset of machine learning techniques called deep learning (DL), they are seeking to delve into the more esoteric properties hidden in the data. The goal is to create predictive applications for such areas as fraud detection, demand forecasting, click prediction, and other data-intensive analyses.
The computer vision, speech recognition, natural language processing, and audio recognition applications being developed using DL techniques need large amounts of …
Current Trends in Tools for Large-Scale Machine Learning was written by Nicole Hemsoth at The Next Platform.
There is an old joke that in the post-apocalyptic world that comes about because of plague or nuclear war, only two things will be left alive: cockroaches and Keith Richards, the guitarist for the Rolling Stones. As it hails from New York City, you can understand why Cockroach Labs, the upstart software company that is cloning Google’s Spanner distributed relational database, chose that particular bug to epitomize a system that will stay alive no matter what. But, they could have just as easily called it RichardsDB.
When discussing Google’s cloud implementation of Spanner, which launched in beta earlier this …
Google Spanner Inspires CockroachDB To Outrun It was written by Timothy Prickett Morgan at The Next Platform.
When it comes to solving deep learning cluster and software stack problems at scale, few companies are riding the bleeding edge like Chinese search giant, Baidu. As we have detailed in the past, the company’s Silicon Valley AI Lab (SVAIL) has some unique hardware and framework implementations that put AI to the test at scale. As it turns out, scalability of the models they specialize in (beginning with speech recognition) is turning out to be one of the great challenges ahead on all fronts—hardware, compiler/runtime, and framework alike.
As we have described across multiple use cases, at Baidu and elsewhere …
Baidu Targets Deep Learning Scalability Challenges was written by Nicole Hemsoth at The Next Platform.
Whether being built for capacity or capability, the conventional wisdom about memory provisioning on the world’s fastest systems is changing quickly. The rise of 3D memory has thrown a curveball into the field as HPC centers consider the specific tradeoffs between traditional, stacked, and hybrid combinations of both on next-generation supercomputers. In short, allocating memory on these machines is always tricky—with stacked memory as a new entrant in the design process, it is useful to gauge where 3D devices might fit.
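A hedged sketch of the capacity-versus-bandwidth tradeoff that drives those provisioning decisions; the device figures and the per-node peak below are illustrative placeholders, not specifications from the article:

```python
# Illustrative capacity/bandwidth tradeoff between a small stacked memory pool
# and a large conventional DIMM pool; every figure here is a placeholder.
node_peak_gflops = 3_000   # assumed per-node peak: 3 double-precision teraflops

memories = {
    "stacked (on-package)": {"capacity_gb": 16,  "bandwidth_gbs": 450},
    "conventional DIMMs":   {"capacity_gb": 256, "bandwidth_gbs": 100},
}

for name, m in memories.items():
    bytes_per_flop = m["bandwidth_gbs"] / node_peak_gflops
    print(f'{name}: {m["capacity_gb"]} GB of capacity, '
          f'{bytes_per_flop:.2f} bytes/flop of streaming bandwidth')
```

The small stacked pool feeds the cores far better per flop, while the big DIMM pool holds the working set; the balancing act is deciding which data lives where.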
While stacked memory is getting a great deal of airplay, for some HPC application areas, it might fall just …
3D Memory Sparks New Thinking in HPC System Design was written by Nicole Hemsoth at The Next Platform.
Many oil and gas exploration shops have invested years of effort and many millions of dollars in homegrown codes, which are critical internally (for competitiveness, specialization, and the like) but leave gaps in the ability to quickly exploit new architectures that could deliver better performance and efficiency.
That tradeoff between architectural agility and continuing to scale a complex, in-house base of codes is one that many companies with big HPC investments weigh—and as one might imagine, oil and gas giant ExxonMobil is no different.
The company came to light last week with news that it scaled one of its mission-critical simulation codes on the …
Inside Exxon’s Effort to Scale Homegrown Codes, Keep Architectural Pace was written by Nicole Hemsoth at The Next Platform.
The Global Scientific Information and Computing Center at the Tokyo Institute of Technology has been at the forefront of accelerated computing since well before GPUs came along and made acceleration not only cool but affordable and normal. But with its latest system, Tsubame 3.0, being installed later this year, the Japanese supercomputing center is going to lay the hardware foundation for a new kind of HPC application that brings together simulation and modeling with machine learning workloads.
The hot new idea in HPC circles is not just being able to run machine learning workloads side by side with simulations, but to …
Japan Keeps Accelerating With Tsubame 3.0 AI Supercomputer was written by Timothy Prickett Morgan at The Next Platform.
Kubernetes, the software container management system born out of Google, has seen its popularity in the datacenter soar in recent years as datacenter admins look to gain greater control of highly distributed computing environments and to take advantage of the benefits that virtualization, containers, and other technologies offer.
Open sourced by Google three years ago, Kubernetes is derived from the Borg and Omega controllers that the search engine giant created for its own clusters and has become an important part of the management tool ecosystem that includes OpenStack, Mesos, and Docker Swarm. These all try to bring order to what …
Wrapping Kubernetes Around Applications Old And New was written by Jeffrey Burt at The Next Platform.
As the world’s dominant supplier of switches and routers into the datacenter and one of the big providers of servers (with a hope of transforming part of that server business into a sizeable hyperconverged storage business), Cisco Systems provides a kind of lens into the glass houses of the world. You can see what companies are doing – and what they are not doing – and watch how Cisco reacts to try to give them what they need while trying to extract the maximum profit out of its customers.
Say what you will, but Cisco has spent the last …
What Bellwether Cisco Reveals About Datacenter Spending was written by Timothy Prickett Morgan at The Next Platform.
Five years ago, many bleeding edge IT shops had either implemented a Hadoop cluster for production use or at least had a cluster set aside to explore the mysteries of MapReduce and the HDFS storage system.
While it is not clear all these years later how many ultra-scale production Hadoop deployments there are in earnest (something we are analyzing for a later in-depth piece), those same shops are likely out in front trying to exploit the next big thing in the datacenter—machine learning, or for the more intrepid, deep learning.
For those that were able to get large-scale Hadoop clusters into …
How Yahoo’s Internal Hadoop Cluster Does Double-Duty on Deep Learning was written by Nicole Hemsoth at The Next Platform.
What supercomputers will look like in the future, post-Moore’s Law, is still a bit hazy. As exascale computing comes into focus over the next several years, system vendors, universities and government agencies are all trying to get a gauge on what will come after that. Moore’s Law, which has driven the development of computing systems for more than five decades, is coming to an end as making smaller chips loaded with more and more features becomes increasingly difficult.
While the rise of accelerators, like GPUs, FPGAs and customized ASICs, silicon photonics and faster interconnects …
Large-Scale Quantum Computing Prototype on Horizon was written by Jeffrey Burt at The Next Platform.
Google has proven time and again it is on the extreme bleeding edge of invention when it comes to scale out architectures that make supercomputers look like toys. But what would the world look like if the search engine giant had started selling capacity on its vast infrastructure back in 2005, before Amazon Web Services launched, and then shortly thereafter started selling capacity on its high level platform services? And what if it had open sourced these technologies, as it has done with the Kubernetes container controller?
The world would surely be different, and the reason it is not is …
Why Google’s Spanner Database Won’t Do As Well As Its Clone was written by Timothy Prickett Morgan at The Next Platform.