Author Archives: Nicole Hemsoth

Chinese Researchers One Step Closer to Parallel Turing Machine

Parallel computing has become a bedrock of the HPC field, where applications are becoming increasingly complex and compute-intensive technologies such as data analytics, deep learning, and artificial intelligence (AI) are rapidly emerging. Nvidia and AMD have driven the adoption of GPU accelerators in supercomputers and other high-end systems; Intel is addressing the space with its many-core Xeon Phi processors and coprocessors; and, as we've talked about at The Next Platform, other acceleration technologies like field-programmable gate arrays (FPGAs) are pushing their way into the picture. Parallel computing is a booming field.

However, the future was not always so assured.

Chinese Researchers One Step Closer to Parallel Turing Machine was written by Nicole Hemsoth at The Next Platform.

Serving Up Serverless Science

The “serverless” trend has become the new hot topic in cloud computing. Instead of running Infrastructure-as-a-Service (IaaS) instances to provide a service, individual functions are executed on demand.

This has been a boon to the web development world, as it allows the creation of UI-driven workloads without the administrative overhead of provisioning, configuring, monitoring, and maintaining servers. Of course, the industry has not yet reached the point where computation can be done in thin air, so there are still servers involved somewhere. The point is that the customer is not concerned with mundane tasks such as operating system patching and
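To make the on-demand model concrete, here is a minimal sketch of a function-as-a-service handler, assuming an AWS Lambda-style Python convention; the event fields and the toy computation are hypothetical illustrations, not any particular provider's API.

```python
# Minimal sketch of a serverless-style function, following the AWS
# Lambda Python handler convention. The "values" field in the event
# payload is a hypothetical example, not a real platform API.

import json

def handler(event, context):
    """Invoked on demand by the platform; no server to provision.

    The platform supplies `event` (the request payload) and `context`
    (runtime metadata); the function expresses only the computation.
    """
    values = event.get("values", [])        # hypothetical input field
    result = sum(v * v for v in values)     # stand-in for the real work
    return {
        "statusCode": 200,
        "body": json.dumps({"sum_of_squares": result}),
    }
```

The provider spins up, patches, and scales whatever runs this code; the customer pays per invocation and sees only the function.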

Serving Up Serverless Science was written by Nicole Hemsoth at The Next Platform.

Peering Through Opaque HPC Benchmarks

If Xzibit worked in the HPC field, he might be heard to say “I heard you like computers, so we modeled a computer with your computer so you can simulate your simulations.”

But simulating the performance of HPC applications is more than just recursion for comedic effect; it provides a key mechanism for studying and predicting application behavior under different scenarios. While actually running the code on the system will yield a measure of the wallclock time, it does little to explain what factors impacted that wallclock time. And of course it requires the system
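One lightweight form such simulation can take is an analytic model that attributes predicted wallclock time to its contributing factors. The sketch below is a toy illustration under assumed machine parameters, not any specific simulator from the research discussed here.

```python
# Toy analytic performance model: attribute a kernel's predicted wallclock
# time to compute, memory traffic, and network messaging. The machine
# parameters are illustrative assumptions, not measured values.

def predicted_time(flops, bytes_moved, messages,
                   peak_flops=1e12,    # assumed 1 Tflop/s peak
                   mem_bw=1e11,        # assumed 100 GB/s memory bandwidth
                   net_latency=1e-6):  # assumed 1 microsecond per message
    parts = {
        "compute": flops / peak_flops,
        "memory":  bytes_moved / mem_bw,
        "network": messages * net_latency,
    }
    return sum(parts.values()), parts

total, parts = predicted_time(flops=2e12, bytes_moved=4e11, messages=1e5)
for name, t in parts.items():
    print(f"{name:>8}: {t:.3f} s ({t / total:.0%} of predicted time)")
```

Unlike a single wallclock measurement, the breakdown says which factor dominates and what a faster network or memory system would buy.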

Peering Through Opaque HPC Benchmarks was written by Nicole Hemsoth at The Next Platform.

Strong FBI Ties for Next Generation Quantum Computer

It is a good time to be the maker of a machine that excels at large-scale optimization problems for cybersecurity and defense. And it is even better to be the only maker of such a machine at a time when post-Moore's Law systems are in high demand.

We have already described the U.S. Department of Energy’s drive to place a novel architecture at the heart of one of the future exascale supercomputers, and we have also explored the range of options that might fall under that novel processing umbrella. From neuromorphic chips, deep learning PIM-based architectures,

Strong FBI Ties for Next Generation Quantum Computer was written by Nicole Hemsoth at The Next Platform.

Apache Kafka Gives Large-Scale Image Processing a Boost

The digital world is becoming ever more visual. From webcams and drones to closed-circuit television and high-resolution satellites, the number of images created on a daily basis is increasing, and in many cases these images need to be processed in real or near-real time.

This is a demanding task on two axes: computation and memory. Single-machine environments often lack sufficient memory for processing large, high-resolution streams in real time. Multi-machine environments add communication and coordination overhead. Essentially, the issue is that hardware configurations are often optimized on a single axis. This could be computation (enhanced with accelerators like GPGPUs or
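A common pattern for spreading that load is to put a message broker between the image sources and a pool of workers. Below is a minimal sketch using the kafka-python client; the topic name, broker address, and consumer group are hypothetical, and the resize stands in for real analysis.

```python
# Minimal sketch of a Kafka-fed image-processing worker (kafka-python
# client). Topic, broker, and group names are hypothetical placeholders.

import io
from kafka import KafkaConsumer
from PIL import Image

consumer = KafkaConsumer(
    "camera-frames",                  # hypothetical topic of raw frames
    bootstrap_servers="broker:9092",  # hypothetical broker address
    group_id="image-workers",         # one group; add consumers to scale out
)

for message in consumer:
    frame = Image.open(io.BytesIO(message.value))  # raw bytes -> image
    small = frame.resize((128, 128))               # stand-in "processing"
    # ...hand off to the real detection/analysis stage here...
```

Because Kafka balances a topic's partitions across the consumers in a group, adding machines to the group scales the processing without changing the producers.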

Apache Kafka Gives Large-Scale Image Processing a Boost was written by Nicole Hemsoth at The Next Platform.

An Early Look at Startup Graphcore’s Deep Learning Chip

As a thought exercise, let's consider neural networks as massive graphs and the CPU as a passive slave to some higher-order processor—one that can sling itself across multiple points on an ever-expanding network of connections feeding into itself, training, inferencing, and splitting off into multiple models on the same architecture.

Plenty of technical naysaying can be aimed at this concept, of course, and only a slice of it has to do with algorithmic complexity. For one, memory bandwidth is pushed to the limit even on specialized devices like GPUs and FPGAs—at least for a neural net problem. And second,

An Early Look at Startup Graphcore’s Deep Learning Chip was written by Nicole Hemsoth at The Next Platform.

High Times for Low-Precision Hardware

Processor makers are pushing down the precision for a range of new and forthcoming devices, driven by a need that balances accuracy with energy-efficient performance for an emerging set of workloads.

While there will always be plenty of room at the server table for double-precision requirements, especially in high performance computing (HPC), machine learning and deep learning are spurring a fresh take on processor architecture—a fact that will have a trickle-down (or up, depending on how you consider it) effect on the hardware ecosystem in the next few years.

In the last year alone, the emphasis on lowering precision has
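The trade-off is easy to see numerically: each halving of precision halves the storage and bandwidth per value while increasing rounding error. A quick illustration with standard NumPy types:

```python
# Illustration of the precision/accuracy trade: the same dot product at
# float64, float32, and float16. Narrower types cost less to store and
# move, at the price of larger rounding error.

import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(1_000_000)
y = rng.standard_normal(1_000_000)

exact = np.dot(x, y)  # float64 reference
for dtype in (np.float32, np.float16):
    approx = np.dot(x.astype(dtype), y.astype(dtype))
    rel_err = abs(float(approx) - exact) / abs(exact)
    print(f"{np.dtype(dtype).name}: relative error ~ {rel_err:.1e}")
```

For workloads like deep learning inference that tolerate such error, the bandwidth and energy savings are exactly what low-precision hardware is chasing.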

High Times for Low-Precision Hardware was written by Nicole Hemsoth at The Next Platform.

The Rise of Flash Native Cache

Burst buffers are growing up—and growing out of the traditional realm of large-scale supercomputers, where they were devised primarily to solve the problems of failure at scale.

As we described in an interview with the creator of the burst buffer concept, Los Alamos National Lab's Gary Grider, solving the "simple" problem of checkpointing and restarting a massive system after a crash with a fast caching layer would only become more important as system sizes expanded—but the same approach could also extend to application acceleration. As the notion of burst buffers expanded beyond HPC, companies like EMC/NetApp, Cray, and DataDirect Networks (DDN)
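The pattern itself is simple to sketch: dump state to a fast flash tier so compute can resume quickly, then drain the checkpoint to the parallel file system in the background. The paths and tiers below are hypothetical placeholders, not any vendor's interface.

```python
# Minimal sketch of the burst buffer checkpoint/drain pattern: write fast,
# drain to durable storage asynchronously. Paths are hypothetical.

import pickle
import shutil
import threading
from pathlib import Path

FAST_TIER = Path("/burst_buffer")  # hypothetical flash tier mount
SLOW_TIER = Path("/parallel_fs")   # hypothetical parallel file system

def checkpoint(state, step):
    fast_path = FAST_TIER / f"ckpt_{step:06d}.pkl"
    with open(fast_path, "wb") as f:   # fast write; compute resumes sooner
        pickle.dump(state, f)
    # Drain to the slower durable tier without blocking the application.
    threading.Thread(
        target=shutil.copy2,
        args=(fast_path, SLOW_TIER / fast_path.name),
        daemon=True,
    ).start()
```

On a crash, restart reads the newest checkpoint from whichever tier holds it; the application only ever waits on the fast write.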

The Rise of Flash Native Cache was written by Nicole Hemsoth at The Next Platform.

Stanford’s TETRIS Clears Blocks for 3D Memory Based Deep Learning

The need for speed to process neural networks is far less a matter of processor capabilities and much more a function of memory bandwidth. As compute capability rises, so too does the need to keep the chips fed with data—something that often requires going off chip to memory. That not only comes with a performance penalty, but an efficiency hit as well, which explains why so many efforts are being made either to speed that connection to off-chip memory or, more efficiently, to do as much in memory as possible.
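The bandwidth argument can be made concrete with arithmetic intensity: the flops a kernel performs per byte it pulls from off-chip memory, compared against the device's own flops-per-byte balance point. The figures below are illustrative assumptions, not measurements of any particular chip.

```python
# Back-of-envelope arithmetic intensity for a fully connected layer doing
# a matrix-vector product (batch size 1). All numbers are assumptions.

m, n = 4096, 4096                  # weight matrix dimensions
flops = 2 * m * n                  # one multiply-accumulate per weight
bytes_moved = 4 * (m * n + m + n)  # fp32 weights plus input/output vectors

intensity = flops / bytes_moved    # ~0.5 flop per byte
balance = 10e12 / 900e9            # e.g., 10 Tflop/s peak over 900 GB/s

print(f"arithmetic intensity: {intensity:.2f} flop/byte")
print(f"device balance point: {balance:.1f} flop/byte")
# intensity far below the balance point means the layer is bandwidth
# bound: the chip starves while weights stream in from off-chip memory.
```

Batching raises the flops done per weight byte fetched, which is one reason stacked memory and in-memory processing are so attractive for the cases that cannot batch.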

The advent of 3D or stacked memory opens new doors,

Stanford’s TETRIS Clears Blocks for 3D Memory Based Deep Learning was written by Nicole Hemsoth at The Next Platform.

Japan to Unveil Pascal GPU-Based AI Supercomputer

A shared appetite for high performance computing hardware and frameworks is pushing both supercomputing and deep learning into the same territory. This has been happening in earnest over the last year, and while most efforts have been confined to software and applications research, some supercomputer centers are spinning out new machines dedicated exclusively to deep learning.

When it comes to such supercomputing sites on the bleeding edge, Japan’s RIKEN Advanced Institute for Computational Science is at the top of the list. The center’s Fujitsu-built K Computer is the seventh fastest machine on the planet according to the Top 500 rankings

Japan to Unveil Pascal GPU-Based AI Supercomputer was written by Nicole Hemsoth at The Next Platform.

CPU, GPU Potential for Visualization and Irregular Code

Conventional wisdom says that choosing between a GPU and a CPU architecture for running scientific visualization workloads or irregular code is easy. GPUs have long been the go-to solution, although recent research shows how the status quo could be shifting.

At SC16 in Salt Lake City, in a talk called "CPUs versus GPUs," Dr. Aaron Knoll of the University of Utah and Professor Hiroshi Nakashima of Kyoto University presented comparisons of various CPU- and GPU-based architectures running visualizations and irregular code. Notably, both researchers have found that Intel Xeon Phi processor-based systems show stand-out performance compared to GPUs for

CPU, GPU Potential for Visualization and Irregular Code was written by Nicole Hemsoth at The Next Platform.

For Big Banks, Regulation is the Mother of GPU Invention

There is something to be said for being at the right place at the right time.

While there were plenty of folks who were in the exact wrong spot when the financial crisis hit in 2007-2008, some technologies were uniquely well timed to meet the unexpected demands of a new era.

In the aftermath of the crash, major investment banks and financial institutions had a tough task ahead to keep up with the wave of regulations instituted to keep them straight. This had some serious procedural impacts, and it also came with some heady new demands on compute infrastructure. Post-regulation, investment

For Big Banks, Regulation is the Mother of GPU Invention was written by Nicole Hemsoth at The Next Platform.

Solving HPC Conflicts with Containers

It’s an unavoidable truth of information technology that the operators and users are sometimes at odds with each other.

Countless stories, comics, and television shows have driven home two very unpleasant stereotypes: the angry, unhelpful system administrator who can’t wait to say “no!” to a user request, and the clueless, clumsy user always a keystroke away from taking down the entire infrastructure. There is a kernel of truth to them. While both resource providers and resource users may want the same end result — the successful completion of computational tasks — they have conflicting priorities when it comes to achieving

Solving HPC Conflicts with Containers was written by Nicole Hemsoth at The Next Platform.

Looking Down The Long Enterprise Road With Hadoop

Just five years ago, the infrastructure space was awash in stories about the capabilities cooked into the Hadoop platform—something that was, even then, only a few pieces of code cobbled onto the core HDFS distributed storage with MapReduce serving as the processing engine for analytics at scale.

At the center of many of the stories was Cloudera, the startup that took Hadoop to the enterprise with its commercial distribution of the open source framework. As we described in a conversation last year marking the ten-year anniversary of Hadoop with Doug Cutting, one of its creators at Yahoo, the platform

Looking Down The Long Enterprise Road With Hadoop was written by Nicole Hemsoth at The Next Platform.

Scaling Compute to Meet Large-Scale CT Scan Demands

Computed tomography (CT) is a widely used process in medicine and industry. Many X-ray images taken around a common axis of rotation are combined to create a three-dimensional view of an object, including its interior.

In medicine, this technique is commonly used for non-invasive diagnostic applications such as searching for cancerous masses. Industrial applications include examining metal components for stress fractures and comparing produced materials to the original computer-aided design (CAD) specifications. While this process provides invaluable insight, it also presents an analytical challenge.
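The reconstruction step at the heart of CT is straightforward to demonstrate at small scale. The sketch below simulates projections of a test phantom and inverts them with filtered back-projection using scikit-image; the library choice is illustrative, not the toolchain of the work described here.

```python
# Minimal sketch of CT reconstruction: simulate X-ray projections of a
# 2-D phantom over many angles, then invert with filtered back-projection.

import numpy as np
from skimage.data import shepp_logan_phantom
from skimage.transform import radon, iradon

image = shepp_logan_phantom()                   # standard 2-D test slice
theta = np.linspace(0.0, 180.0, 180, endpoint=False)

sinogram = radon(image, theta=theta)            # one projection per angle
reconstruction = iradon(sinogram, theta=theta)  # filtered back-projection

rms = np.sqrt(np.mean((reconstruction - image) ** 2))
print(f"RMS reconstruction error: {rms:.4f}")
```

A production scan repeats this for thousands of slices (or uses full 3-D cone-beam methods) at far higher resolution, which is where the compute-scaling challenge comes in.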

State-of-the-art CT scanners use synchrotron light, which enables very fine resolution in four dimensions. For example, the

Scaling Compute to Meet Large-Scale CT Scan Demands was written by Nicole Hemsoth at The Next Platform.

Exascale Leaders on Next Horizons in Supercomputing

One way to characterize the challenges of achieving exascale is to look at how advancing compute, memory/storage, software, and fabric will lead to a future-generation balanced system. Recently, Al Gara of Intel, Jean-Philippe Nominé of the French Alternative Energies and Atomic Energy Commission (CEA), and Katherine Riley of Argonne National Lab were on a panel that weighed in on these and a host of other interrelated challenges.

Exascale will represent a watershed achievement in computer science. More than just a nice, round number ("exa-" denotes a billion billion), exascale computing is also required by the Human Brain Project and

Exascale Leaders on Next Horizons in Supercomputing was written by Nicole Hemsoth at The Next Platform.

Current Trends in Tools for Large-Scale Machine Learning

During the past decade, enterprises have begun using machine learning (ML) to collect and analyze large amounts of data to obtain a competitive advantage. Now some are looking to go even deeper – using a subset of machine learning techniques called deep learning (DL), they are seeking to delve into the more esoteric properties hidden in the data. The goal is to create predictive applications for such areas as fraud detection, demand forecasting, click prediction, and other data-intensive analyses.

The computer vision, speech recognition, natural language processing, and audio recognition applications being developed using DL techniques need large amounts of

Current Trends in Tools for Large-Scale Machine Learning was written by Nicole Hemsoth at The Next Platform.

Baidu Targets Deep Learning Scalability Challenges

When it comes to solving deep learning cluster and software stack problems at scale, few companies are riding the bleeding edge like Chinese search giant Baidu. As we have detailed in the past, the company's Silicon Valley AI Lab (SVAIL) has some unique hardware and framework implementations that put AI to the test at scale. As it happens, scalability of the models they specialize in (beginning with speech recognition) is turning out to be one of the great challenges ahead on all fronts—hardware, compiler/runtime, and framework alike.

As we have described across multiple use cases, at Baidu and elsewhere

Baidu Targets Deep Learning Scalability Challenges was written by Nicole Hemsoth at The Next Platform.

3D Memory Sparks New Thinking in HPC System Design

Whether being built for capacity or capability, the conventional wisdom about memory provisioning on the world's fastest systems is changing quickly. The rise of 3D memory has thrown a curveball into the field as HPC centers consider the specific tradeoffs between traditional, stacked, and hybrid combinations of both on next-generation supercomputers. In short, allocating memory on these machines is always tricky—with a new entrant like stacked memory in the design process, it is useful to gauge where 3D devices might fit.

While stacked memory is getting a great deal of airplay, for some HPC application areas, it might fall just

3D Memory Sparks New Thinking in HPC System Design was written by Nicole Hemsoth at The Next Platform.

Inside Exxon’s Effort to Scale Homegrown Codes, Keep Architectural Pace

Many oil and gas exploration shops have invested years and many millions of dollars in homegrown codes, which are critical internally (competitiveness, specialization, etc.) but leave gaps in the ability to quickly exploit new architectures that could bring better performance and efficiency.

That tradeoff between architectural agility and continuing to scale a complex, in-house base of codes is one that many companies with HPC operations weigh—and, as one might imagine, oil and gas giant ExxonMobil is no different.

The company came to light last week with news that it scaled one of its mission-critical simulation codes on the

Inside Exxon’s Effort to Scale Homegrown Codes, Keep Architectural Pace was written by Nicole Hemsoth at The Next Platform.
