One way to characterize the challenges of achieving exascale is to look at how advancing compute, memory/storage, software, and fabric will lead to a future-generation balanced system. Recently, Al Gara of Intel, Jean-Philippe Nominé of the French Alternative Energies and Atomic Energy Commission (CEA), and Katherine Riley of Argonne National Lab were on a panel that weighed in on these and a host of other interrelated challenges.
Exascale will represent a watershed achievement in computer science. More than just a nice, round number (“exa-” denotes a billion billion), exascale computing is also supported by the Human Brain Project and …
Exascale Leaders on Next Horizons in Supercomputing was written by Nicole Hemsoth at The Next Platform.
During the past decade, enterprises have begun using machine learning (ML) to collect and analyze large amounts of data to obtain a competitive advantage. Now some are looking to go even deeper – using a subset of machine learning techniques called deep learning (DL), they are seeking to delve into the more esoteric properties hidden in the data. The goal is to create predictive applications for such areas as fraud detection, demand forecasting, click prediction, and other data-intensive analyses.
The computer vision, speech recognition, natural language processing, and audio recognition applications being developed using DL techniques need large amounts of …
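To make the idea of a predictive application concrete, here is a deliberately simplified sketch: a fraud scorer built with plain logistic regression (a stand-in for the deep networks the article describes) on made-up transaction features. The feature names, thresholds, and data are all hypothetical, and only the Python standard library is used:

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(rows, labels, epochs=200, lr=0.1):
    """Fit weights by stochastic gradient descent on log loss."""
    w = [0.0] * len(rows[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            g = p - y  # gradient of log loss with respect to the logit
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def predict(w, b, x):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

# Toy transactions: [normalized amount, is_foreign flag, hour of day / 24].
random.seed(0)
legit = [[random.random() * 0.3, 0, random.random()] for _ in range(50)]
fraud = [[0.7 + random.random() * 0.3, 1, random.random()] for _ in range(50)]
rows, labels = legit + fraud, [0] * 50 + [1] * 50

w, b = train_logistic(rows, labels)
print(predict(w, b, [0.9, 1, 0.5]))  # a large foreign transaction scores as risky
print(predict(w, b, [0.1, 0, 0.5]))  # a small domestic one scores as safe
```

A production deep learning system differs mainly in scale: many more features, many more layers, and far more data, but the same train-then-score loop.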
Current Trends in Tools for Large-Scale Machine Learning was written by Nicole Hemsoth at The Next Platform.
There is an old joke that in the post-apocalyptic world that comes about because of plague or nuclear war, only two things will be left alive: cockroaches and Keith Richards, the guitarist for the Rolling Stones. As it hails from New York City, you can understand why Cockroach Labs, the upstart software company that is cloning Google’s Spanner distributed relational database, chose that particular bug to epitomize a system that will stay alive no matter what. But, they could have just as easily called it RichardsDB.
When discussing Google’s cloud implementation of Spanner, which launched in beta earlier this …
Google Spanner Inspires CockroachDB To Outrun It was written by Timothy Prickett Morgan at The Next Platform.
When it comes to solving deep learning cluster and software stack problems at scale, few companies are riding the bleeding edge like Chinese search giant, Baidu. As we have detailed in the past, the company’s Silicon Valley AI Lab (SVAIL) has some unique hardware and framework implementations that put AI to the test at scale. As it turns out, scalability of the models they specialize in (beginning with speech recognition) is turning out to be one of the great challenges ahead on all fronts—hardware, compiler/runtime, and framework alike.
As we have described across multiple use cases, at Baidu and elsewhere …
Baidu Targets Deep Learning Scalability Challenges was written by Nicole Hemsoth at The Next Platform.
Whether being built for capacity or capability, the conventional wisdom about memory provisioning on the world’s fastest systems is changing quickly. The rise of 3D memory has thrown a curveball into the field as HPC centers consider the specific tradeoffs between traditional, stacked, and hybrid combinations of both on next-generation supercomputers. In short, allocating memory on these machines is always tricky—with a new entrant like stacked memory in the design process, it is useful to gauge where 3D devices might fit.
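One quick way to gauge that fit is a back-of-envelope roofline comparison: a kernel only benefits from stacked memory’s extra bandwidth if it is bandwidth-bound in the first place. The figures below are illustrative placeholders (a 3 teraflops node, 100 GB/s DDR-class versus 400 GB/s stacked-memory-class bandwidth), not vendor specs:

```python
def bound(flops_per_byte, peak_flops, mem_bw):
    """Attainable performance (flops/s) under a simple roofline model."""
    return min(peak_flops, flops_per_byte * mem_bw)

# Illustrative numbers only.
peak = 3e12                      # 3 Tflop/s node
ddr_bw, stacked_bw = 100e9, 400e9  # bytes/s

# A stencil kernel at ~0.2 flops/byte is bandwidth-bound on either memory,
# so stacked memory buys nearly the full 4x bandwidth ratio as speedup.
print(bound(0.2, peak, stacked_bw) / bound(0.2, peak, ddr_bw))  # → 4.0

# A dense matrix multiply at ~30 flops/byte is compute-bound,
# so the extra bandwidth buys nothing.
print(bound(30, peak, stacked_bw) / bound(30, peak, ddr_bw))  # → 1.0
```

This is exactly the kind of arithmetic HPC centers run per application area when deciding how much stacked versus conventional memory to provision.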
While stacked memory is getting a great deal of airplay, for some HPC application areas, it might fall just …
3D Memory Sparks New Thinking in HPC System Design was written by Nicole Hemsoth at The Next Platform.
Many oil and gas exploration shops have invested years and many millions of dollars in homegrown codes, which are critical internally (competitiveness, specialization, etc.) but leave gaps in the ability to quickly exploit new architectures that could lead to better performance and efficiency.
That tradeoff between architectural agility and continuing to scale a complex, in-house base of codes is one that many companies with HPC weigh—and as one might imagine, oil and gas giant, ExxonMobil is no different.
The company came to light last week with news that it scaled one of its mission-critical simulation codes on the …
Inside Exxon’s Effort to Scale Homegrown Codes, Keep Architectural Pace was written by Nicole Hemsoth at The Next Platform.
The Global Scientific Information and Computing Center at the Tokyo Institute of Technology has been at the forefront of accelerated computing since well before GPUs came along and made acceleration not only cool but affordable and normal. But with its latest system, Tsubame 3.0, being installed later this year, the Japanese supercomputing center is going to lay the hardware foundation for a new kind of HPC application that brings together simulation and modeling and machine learning workloads.
The hot new idea in HPC circles is not just being able to run machine learning workloads side by side with simulations, but to …
Japan Keeps Accelerating With Tsubame 3.0 AI Supercomputer was written by Timothy Prickett Morgan at The Next Platform.
Kubernetes, the software container management system born out of Google, has seen its popularity in the datacenter soar in recent years as datacenter admins look to gain greater control of highly distributed computing environments and to capitalize on the benefits that virtualization, containers, and other technologies offer.
Open sourced by Google three years ago, Kubernetes is derived from the Borg and Omega controllers that the search engine giant created for its own clusters and has become an important part of the management tool ecosystem that includes OpenStack, Mesos, and Docker Swarm. These all try to bring order to what …
Wrapping Kubernetes Around Applications Old And New was written by Jeffrey Burt at The Next Platform.
As the world’s dominant supplier of switches and routers into the datacenter and one of the big providers of servers (with a hope of transforming part of that server business into a sizeable hyperconverged storage business), Cisco Systems provides a kind of lens into the glass houses of the world. You can see what companies are doing – and what they are not doing – and watch how Cisco reacts to try to give them what they need while trying to extract the maximum profit out of its customers.
Say what you will, but Cisco has spent the last …
What Bellwether Cisco Reveals About Datacenter Spending was written by Timothy Prickett Morgan at The Next Platform.
Five years ago, many bleeding edge IT shops had either implemented a Hadoop cluster for production use or at least had a cluster set aside to explore the mysteries of MapReduce and the HDFS storage system.
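For readers who never set aside that exploratory cluster, the heart of MapReduce is small enough to sketch in plain Python. This single-process word count imitates the map, shuffle, and reduce phases that Hadoop distributes across a cluster backed by HDFS; the function names are ours, not Hadoop’s API:

```python
from itertools import groupby
from operator import itemgetter

# The two user-supplied functions in the classic MapReduce word count.
def mapper(line):
    for word in line.split():
        yield (word.lower(), 1)

def reducer(word, counts):
    yield (word, sum(counts))

def map_reduce(lines):
    """Single-process sketch of the map -> shuffle -> reduce pipeline."""
    # Map phase: emit intermediate (key, value) pairs.
    pairs = [kv for line in lines for kv in mapper(line)]
    # Shuffle phase: group pairs by key (Hadoop sorts these across the cluster).
    pairs.sort(key=itemgetter(0))
    result = {}
    for word, group in groupby(pairs, key=itemgetter(0)):
        for k, v in reducer(word, (count for _, count in group)):
            result[k] = v
    return result

print(map_reduce(["the quick brown fox", "the lazy dog"]))  # word counts
```

What Hadoop adds on top of this shape is the hard part: partitioning the map output across machines, replicating inputs in HDFS, and rerunning failed tasks.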
While it is not clear all these years later how many ultra-scale production Hadoop deployments there are in earnest (something we are analyzing for a later in-depth piece), those same shops are likely out in front trying to exploit the next big thing in the datacenter—machine learning, or for the more intrepid, deep learning.
For those that were able to get large-scale Hadoop clusters into …
How Yahoo’s Internal Hadoop Cluster Does Double-Duty on Deep Learning was written by Nicole Hemsoth at The Next Platform.
What supercomputers will look like in the future, post-Moore’s Law, is still a bit hazy. As exascale computing comes into focus over the next several years, system vendors, universities and government agencies are all trying to get a gauge on what will come after that. Moore’s Law, which has driven the development of computing systems for more than five decades, is coming to an end as making smaller chips loaded with more and more features becomes increasingly difficult.
While the rise of accelerators, like GPUs, FPGAs and customized ASICs, silicon photonics and faster interconnects …
Large-Scale Quantum Computing Prototype on Horizon was written by Jeffrey Burt at The Next Platform.
Google has proven time and again it is on the extreme bleeding edge of invention when it comes to scale out architectures that make supercomputers look like toys. But what would the world look like if the search engine giant had started selling capacity on its vast infrastructure back in 2005, before Amazon Web Services launched, and then shortly thereafter started selling capacity on its high level platform services? And what if it had open sourced these technologies, as it has done with the Kubernetes container controller?
The world would be surely different, and the reason it is not is …
Why Google’s Spanner Database Won’t Do As Well As Its Clone was written by Timothy Prickett Morgan at The Next Platform.
Much of the talk around artificial intelligence these days focuses on software efforts – various algorithms and neural networks – and such hardware devices as custom ASICs for those neural networks and chips like GPUs and FPGAs that can help the development of reprogrammable systems. A vast array of well-known names in the industry – from Google and Facebook to Nvidia, Intel, IBM and Qualcomm – is pushing hard in this direction, and those and other organizations are making significant gains thanks to new AI methods such as deep learning.
All of this development is happening at a time when the …
Memristor Research Highlights Neuromorphic Device Future was written by Jeffrey Burt at The Next Platform.
Despite the emphasis on X86 clusters, large public clouds, accelerators for commodity systems, and the rise of open source analytics tools, there is a very large base of transactional processing and analysis that happens far from this landscape. This is the mainframe, and these fully integrated, optimized systems account for a large majority of the enterprise world’s most critical data processing for the largest companies in banking, insurance, retail, transportation, healthcare, and beyond.
With great memory bandwidth, I/O, powerful cores, and robust security, mainframes are still the supreme choice for business-critical operations at many Global 1000 companies, even if the …
IBM Wants to Make Mainframes Next Platform for Machine Learning was written by Nicole Hemsoth at The Next Platform.
Intel’s many-core “Knights Landing” Xeon Phi processor is just a glimpse of what can be expected of supercomputers in the not-so-distant future of high performance computing. As the industry continues its march to exascale computing, systems will become more complex, an evolution that will include processors that not only sport a rapidly increasing number of cores but also a broad array of on-chip resources ranging from memory to I/O. Workloads ranging from simulation and modeling applications to data analytics and deep learning algorithms are all expected to benefit from what these new systems will offer in terms of processing capabilities. …
Juggling Applications On Intel Knights Landing Xeon Phi Chips was written by Jeffrey Burt at The Next Platform.
If Nvidia’s Datacenter business unit was a startup and separate from the company, we would all be talking about the long investment it has made in GPU-based computing and how the company has moved from the blade of the hockey stick and rounded the bend and is moving rapidly up the handle with triple-digit revenue growth and an initial public offering on the horizon.
But the part of Nvidia’s business that is driven by its Tesla compute engines and GRID visualization engines is not a separate company and it is not going public. Still, that business is sure making things …
Nvidia Tesla Compute Business Quadruples In Q4 was written by Timothy Prickett Morgan at The Next Platform.
China represents a huge opportunity for chip designer ARM as it looks to extend its low-power system-on-a-chip (SoC) architecture beyond the mobile and embedded devices spaces and into new areas, such as the datacenter and emerging markets like autonomous vehicles, drones and the Internet of Things. China is a massive, fast-growing market with tech companies – including such giants as Baidu, Alibaba, and Tencent – looking to leverage such technologies as artificial intelligence to help expand their businesses deeper into the global market and turning to vendors like ARM that can help them fuel that growth.
ARM Holdings, which designs …
ARM Gains Stronger Foothold In China With AI And IoT was written by Jeffrey Burt at The Next Platform.
China’s massive Sunway TaihuLight supercomputer sent ripples through the computing world last year when it debuted in the number-one spot on the Top500 list of the world’s fastest supercomputers. Delivering 93 petaflops of sustained performance – with a peak of more than 125 petaflops – the system is nearly three times faster than the second supercomputer on the list (the Tianhe-2, also a Chinese system) and dwarfs the Titan system at Oak Ridge National Laboratory, a Cray-based machine that is the world’s third-fastest system, and the fastest in the United States.
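That peak figure is easy to sanity-check from the machine’s widely published specs: peak flops is just cores times clock times flops per cycle. The node count, core count, and clock below come from public Top500 disclosures; the flops-per-cycle figure is our approximation for the SW26010’s 256-bit vector fused multiply-add units:

```python
# Back-of-envelope peak performance for Sunway TaihuLight.
nodes = 40_960            # SW26010 processors in the system
cores_per_node = 260      # 4 management cores + 256 compute cores each
clock_hz = 1.45e9         # 1.45 GHz
flops_per_cycle = 8       # approx: 256-bit vector FMA on double precision

peak_flops = nodes * cores_per_node * clock_hz * flops_per_cycle
print(f"{peak_flops / 1e15:.1f} petaflops")  # lands close to the quoted peak
```

The arithmetic comes out around 123 petaflops, in the same ballpark as the official 125 petaflops peak, with the small gap down to how the management cores are counted.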
However, it wasn’t only the system’s performance that garnered a lot …
Top Chinese Supercomputer Blazes Real-World Application Trail was written by Jeffrey Burt at The Next Platform.
Like all hardware device makers eager to meet the newest market opportunity, Intel is placing multiple bets on the future of machine learning hardware. The chipmaker has already cast its Xeon Phi and future integrated Nervana Systems chips into the deep learning pool while touting regular Xeons to do the heavy lifting on the inference side.
However, a recent conversation we had with Intel turned up a surprising new addition to the machine learning conversation—an emphasis on neuromorphic devices and what Intel is openly calling “cognitive computing” (a term used primarily—and heavily—for IBM’s Watson-driven AI technologies). This is the first …
Intel Gets Serious About Neuromorphic, Cognitive Computing Future was written by Nicole Hemsoth at The Next Platform.
It happens time and time again with any new technology. Coders create this new thing, it gets deployed as an experiment and, if it is an open source project, shared with the world. As its utility is realized, adoption suddenly spikes with the do-it-yourself crowd that is eager to solve a particular problem. And then, as more mainstream enterprises take an interest, the talk turns to security.
It’s like being told to grow up by a grownup, to eat your vegetables. In fact, it isn’t like that at all. It is precisely that, and it is healthy for any technology …
Locking Down Docker To Open Up Enterprise Adoption was written by Timothy Prickett Morgan at The Next Platform.