Author Archives: Nicole Hemsoth

Risk or Reward: First Nvidia DGX-1 Boxes Hit the Cloud

If you can’t beat the largest cloud players at economies of scale, the only option is to try to outrun them in performance, capabilities, or price.

While going head to head with Amazon, Google, Microsoft, or IBM on cloud infrastructure prices is a challenge, one way to gain an edge is by being the first to deliver bleeding-edge hardware to those users with emerging, high-value workloads. The trick is to be at the front of the wave, often with some of the most expensive iron, which is risky with AWS and others nipping at their heels and quick to follow. It

Risk or Reward: First Nvidia DGX-1 Boxes Hit the Cloud was written by Nicole Hemsoth at The Next Platform.

Singularity Containers for HPC, Reproducibility, and Mobility

Containers are an extremely mobile, safe and reproducible computing infrastructure that is now ready for production HPC computing. In particular, the freely available Singularity container framework has been designed specifically for HPC computing. The barrier to entry is low and the software is free.

At the recent Intel HPC Developer Conference, Gregory Kurtzer (Singularity project lead and LBNL staff member) and Krishna Muriki (Computer Systems Engineer at LBNL) provided a beginning and advanced tutorial on Singularity. One of Kurtzer’s key takeaways: “setting up workflows in under a day is commonplace with Singularity”.
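To make that workflow claim concrete, here is a minimal sketch of a single containerized workflow step, driven from Python purely for illustration. It assumes the pull and exec subcommands of a recent Singularity (or Apptainer) release; the image, output filename, and analysis script are hypothetical.

```python
# Minimal sketch (hypothetical image and script names) of a reproducible,
# portable workflow step: the software stack travels with the container
# rather than depending on the host environment.
import subprocess

# Pull a container image once; the resulting .sif file can be copied to any
# cluster where Singularity is installed.
subprocess.run(
    ["singularity", "pull", "analysis.sif", "docker://python:3-slim"],
    check=True,
)

# Run the analysis inside the container, so the same environment is used on
# a laptop, a workstation, or an HPC cluster.
subprocess.run(
    ["singularity", "exec", "analysis.sif", "python", "analysis.py"],
    check=True,
)
```

In practice the same two commands are typically run directly from the shell or a batch script; the point is that the container image, not the host, defines the software environment.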

Many people have heard about code modernization and

Singularity Containers for HPC, Reproducibility, and Mobility was written by Nicole Hemsoth at The Next Platform.

From Mainframes to Deep Learning Clusters: IBM’s Speech Journey

Here at The Next Platform, we tend to focus on deep learning as it relates to hardware and systems versus algorithmic innovation, but at times, it is useful to look at the co-evolution of both code and machines over time to see what might be around the next corner.

One segment of the deep learning applications area that has generated a great deal of work is in speech recognition and translation—something we’ve described in detail via efforts from Baidu, Google, and Tencent, among others. While the application itself is interesting, what is most notable is how codes

From Mainframes to Deep Learning Clusters: IBM’s Speech Journey was written by Nicole Hemsoth at The Next Platform.

First In-Depth Look at Google’s TPU Architecture

Four years ago, Google started to see the real potential for deploying neural networks to support a large number of new services. During that time it was also clear that, given the existing hardware, if people did voice searches for three minutes per day or dictated to their phone for short periods, Google would have to double the number of datacenters just to run machine learning models.

The need for a new architectural approach was clear, Google distinguished hardware engineer, Norman Jouppi, tells The Next Platform, but it required some radical thinking. As it turns out, that’s exactly

First In-Depth Look at Google’s TPU Architecture was written by Nicole Hemsoth at The Next Platform.

Flare Gives Spark SQL a Performance Boost

Spark has grown rapidly over the past several years to become a significant tool in the big data world. Since emerging from the AMPLab at the University of California at Berkeley, Spark adoption has increased quickly as the open-source in-memory processing platform has become a key framework for handling workloads for machine learning, graph processing and other emerging technologies.

Developers continue to add more capabilities to Spark, including a SQL front-end and APIs for relational query optimization that build on the basic Spark RDD API. The addition of the Spark SQL module promises greater performance and opens
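As a rough illustration of the difference between the two interfaces, the PySpark sketch below (file name and column names are hypothetical) expresses the same aggregation once with the low-level RDD API and once through Spark SQL, where the Catalyst optimizer can plan the relational query.

```python
# Minimal PySpark sketch contrasting the RDD API with the Spark SQL front-end.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("spark-sql-sketch").getOrCreate()

# RDD API: transformations are opaque functions, so Spark cannot optimize
# them relationally.
rdd = spark.sparkContext.textFile("events.csv")          # hypothetical input
counts = (rdd.map(lambda line: line.split(",")[0])       # extract a key column
             .map(lambda user: (user, 1))
             .reduceByKey(lambda a, b: a + b))

# Spark SQL: the same aggregation expressed declaratively; the optimizer can
# push down filters, prune columns, and generate efficient execution plans.
df = spark.read.csv("events.csv").toDF("user", "action", "ts")
df.createOrReplaceTempView("events")
top_users = spark.sql(
    "SELECT user, COUNT(*) AS n FROM events GROUP BY user ORDER BY n DESC"
)
top_users.show()
```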

Flare Gives Spark SQL a Performance Boost was written by Nicole Hemsoth at The Next Platform.

Machine Learning, Analytics Play Growing Role in US Exascale Efforts

Exascale computing promises to bring significant changes to both the high-performance computing space and eventually enterprise datacenter infrastructures.

The systems, which are being developed in multiple countries around the globe, promise 50 times the performance of the current 20-petaflops systems that are now among the fastest in the world (roughly 1,000 petaflops, or an exaflop), along with corresponding improvements in such areas as energy efficiency and physical footprint. The systems need to be powerful enough to run the increasingly complex applications being used by engineers and scientists, but they can’t be so expensive to acquire or run that only a handful of organizations can use them.

At

Machine Learning, Analytics Play Growing Role in US Exascale Efforts was written by Nicole Hemsoth at The Next Platform.

Google Researchers Measure the Future in Microseconds

The IT industry has gotten good at developing computer systems that can easily work at the nanosecond and millisecond scales.

Chip makers have developed multiple techniques that have helped drive the creation of nanosecond-scale devices, while primarily software-based solutions have been rolled out for slower millisecond-scale devices. For a long time, that has been enough to address the various needs of high-performance computing environments, where performance is a key metric and issues such as the simplicity of the code and the level of programmer productivity are not as great a concern. Given that, programming at the microsecond level has not

Google Researchers Measure the Future in Microseconds was written by Nicole Hemsoth at The Next Platform.

Argonne National Lab Lead Details Exascale Balancing Act

It’s easy when talking about the ongoing push toward exascale computing to focus on the hardware architecture that will form the foundation of the upcoming supercomputers. Big systems packed with the latest chips and server nodes and storage units still hold a lot of fascination, and the names of those vendors involved – like Intel, IBM and Nvidia – still resonate broadly across the population. And that interest will continue to hold as exascale systems move from being objects of discussion now to deployed machines over the next several years.

However, the development and planning of these systems is a

Argonne National Lab Lead Details Exascale Balancing Act was written by Nicole Hemsoth at The Next Platform.

An Opaque Alternative to Oblivious Cloud Analytics

Data security has always been a key concern as organizations look to leverage the operational and cost efficiencies that come with cloud computing. Huge volumes of critical and sensitive data often are in transit and distributed among multiple systems, and increasingly are being collected and analyzed in cloud-based big data platforms, putting them at higher risk of being hacked and compromised.

Even as encryption methods and security procedures have improved, the data is still at risk of being attacked through such vulnerabilities as access pattern leakage through memory or the network. It’s the threat of an attack via access pattern

An Opaque Alternative to Oblivious Cloud Analytics was written by Nicole Hemsoth at The Next Platform.

Neuromorphic, Quantum, Supercomputing Mesh for Deep Learning

It is difficult to shed a tear for Moore’s Law when there are so many interesting architectural distractions on the systems horizon.

While the steady tick-tock of the tried and true is still audible, the last two years have ushered in a fresh wave of new architectures targeting deep learning and other specialized workloads, as well as a bevy of forthcoming hybrids with FPGAs, zippier GPUs, and swiftly emerging open architectures. None of this has been lost on system architects at the bleeding edge, where the rush is on to build systems that can efficiently chew through ever-growing datasets with

Neuromorphic, Quantum, Supercomputing Mesh for Deep Learning was written by Nicole Hemsoth at The Next Platform.

Scaling Deep Learning on an 18,000 GPU Supercomputer

It is one thing to scale a neural network on a single GPU or even a single system with four or eight GPUs. But it is another thing entirely to push it across thousands of nodes. Most centers doing deep learning have relatively small GPU clusters for training and certainly nothing on the order of the Titan supercomputer at Oak Ridge National Laboratory.

The emphasis on machine learning scalability has often been focused on node counts in the past for single-model runs. This is useful for some applications, but as neural networks become more integrated into existing workflows, including those
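One common way to scale a single model across many nodes is data parallelism: each node computes gradients on its own shard of the data, and the gradients are averaged across nodes at every step. The sketch below is a minimal illustration of that pattern using mpi4py and NumPy; it is not the code used on Titan, and the model size is a placeholder.

```python
# Minimal data-parallel sketch: each MPI rank computes gradients on its own
# shard, then all ranks average them with an allreduce so every copy of the
# model stays in sync.
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

n_params = 1_000_000                      # hypothetical model size
local_grad = np.random.rand(n_params)     # stand-in for this rank's gradients
global_grad = np.empty_like(local_grad)

# Sum gradients across all ranks, then divide to get the average.
comm.Allreduce(local_grad, global_grad, op=MPI.SUM)
global_grad /= size

if rank == 0:
    print(f"Averaged gradients across {size} ranks")
```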

Scaling Deep Learning on an 18,000 GPU Supercomputer was written by Nicole Hemsoth at The Next Platform.

Stanford Brainstorm Chip Hints at Neuromorphic Computing Future

If the name Kwabena Boahen sounds familiar, you might remember silicon that emerged in the late 1990s that emulated the human retina.

This retinomorphic vision system, which Boahen developed while at Caltech under VLSI and neuromorphic computing pioneer Carver Mead, introduced ideas that have come back into full view over the last couple of years—computer vision, artificial intelligence, and, of course, brain-inspired architectures that route for efficiency and performance. The rest of his career has been focused on bringing bio-inspired engineering to a computing industry that is hitting a major wall in the coming years—and at a time

Stanford Brainstorm Chip Hints at Neuromorphic Computing Future was written by Nicole Hemsoth at The Next Platform.

China Making Swift, Competitive Quantum Computing Gains

Chinese officials have made no secret of their desire to become the world’s dominant player in the technology industry. As we’ve written about before at The Next Platform, China has accelerated its investments in IT R&D over the past several years, spending tens of billions of dollars to rapidly expand the capabilities of its own technology companies to better compete with their American counterparts, while at the same time forcing U.S. tech vendors to clear various hurdles in their efforts to access the fast-growing China market.

This is being driven by a combination of China’s desire to increase

China Making Swift, Competitive Quantum Computing Gains was written by Nicole Hemsoth at The Next Platform.

Rapid GPU Evolution at Chinese Web Giant Tencent

Like other major hyperscale web companies, China’s Tencent, which operates a massive network of ad, social, business, and media platforms, is increasingly reliant on two trends to keep pace.

The first is not surprising—efficient, scalable cloud computing to serve internal and user demand. The second is more recent and includes a wide breadth of deep learning applications, including the company’s own internally developed Mariana platform, which powers many user-facing services.

When the company introduced its deep learning platform back in 2014 (at a time when companies like Baidu, Google, and others were expanding their GPU counts for speech and

Rapid GPU Evolution at Chinese Web Giant Tencent was written by Nicole Hemsoth at The Next Platform.

Fujitsu Looks to 3D ICs, Silicon Photonics to Drive Future Systems

The rise of public and private clouds, the growth of the Internet of Things, the proliferation of mobile devices, and the massive amounts of data generated by these fast-growing trends that must be collected, stored, moved, and analyzed all promise to drive significant changes in both software and hardware development in the coming years.

Depending on who you’re talking to, there could be anywhere from 10 billion to 25 billion connected devices worldwide; self-driving cars are expected to grow rapidly in use over the next decade; and corporate data is no longer housed primarily in stationary

Fujitsu Looks to 3D ICs, Silicon Photonics to Drive Future Systems was written by Nicole Hemsoth at The Next Platform.

KAUST Hackathon Shows OpenACC Global Appeal

OpenACC’s global attraction can be seen in the recent February 2017 OpenACC mini-hackathon and GPU conference at KAUST (King Abdullah University of Science & Technology) in Saudi Arabia. OpenACC was created so programmers can insert pragmas to provide information to the compiler about parallelization opportunities and data movement operations to and from accelerators. Programmers use pragmas to work in concert with the compiler to create, tune and optimize parallel codes to achieve high performance.

Demand was so high to attend this mini-hackathon that the organizers had to scramble to find space for ten teams, even though the hackathon was originally

KAUST Hackathon Shows OpenACC Global Appeal was written by Nicole Hemsoth at The Next Platform.

Roadblocks, Fast Lanes for China’s Enterprise IT Spending

The 13th Five Year Plan and other programs to bolster cloud adoption among Chinese businesses, such as the Internet Plus effort, have lit a fire under China’s tech and industrial sectors to modernize IT operations.

However, the growth of China’s cloud and overall enterprise IT market is far slower than in other nations. While there is a robust hardware business in the country, the traditional view of enterprise-class software is still sinking in, leaving a gap between hardware and software spending. Further, the areas that truly drive tech spending, including CPUs and enterprise software and services, are the key areas

Roadblocks, Fast Lanes for China’s Enterprise IT Spending was written by Nicole Hemsoth at The Next Platform.

Can FPGAs Beat GPUs in Accelerating Next-Generation Deep Learning?

The continued exponential growth of digital data, including images, videos, and speech, from sources such as social media and the internet of things is driving the need for analytics to make that data understandable and actionable.

Data analytics often rely on machine learning (ML) algorithms. Among ML algorithms, deep convolutional neural networks (DNNs) offer state-of-the-art accuracies for important image classification tasks and are becoming widely adopted.

At the recent International Symposium on Field Programmable Gate Arrays (ISFPGA), Dr. Eriko Nurvitadhi of the Intel Accelerator Architecture Lab (AAL) presented research titled “Can FPGAs Beat GPUs in Accelerating Next-Generation Deep Neural Networks?” Their research

Can FPGAs Beat GPUs in Accelerating Next-Generation Deep Learning? was written by Nicole Hemsoth at The Next Platform.

Google Team Refines GPU Powered Neural Machine Translation

Despite the fact that Google has developed its own custom machine learning chips, the company is well-known as a user of GPUs internally, particularly for its deep learning efforts, in addition to offering GPUs in its cloud.

At last year’s Nvidia GPU Technology Conference, Jeff Dean, Senior Google Fellow, offered a vivid description of how the search giant has deployed GPUs for a large number of workloads, many centered around speech recognition and language-oriented research projects as well as various computer vision efforts. What was clear from Dean’s talk—and from watching other deep learning shops with large GPU cluster

Google Team Refines GPU Powered Neural Machine Translation was written by Nicole Hemsoth at The Next Platform.

Increasing HPC Utilization with Meta-Queues

Solving problems by the addition of abstractions is a tried and true approach in technology. The management of high-performance computing workflows is no exception.

The Pegasus workflow engine and HTCondor’s DAGman are used to manage workflow dependencies. GridWay and DRIVE route jobs to different resources based on suitability or available capacity. Both of these approaches are important, but they share a key potential drawback: jobs are still treated as distinct units of computation to be scheduled individually by the scheduler.
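One plausible way to sidestep that drawback, and a rough reading of the meta-queue idea, is a pilot-job pattern: request a single allocation from the HPC scheduler and drain a queue of many small tasks inside it, so the scheduler sees one job rather than thousands. The Python sketch below is an illustration of that general pattern under stated assumptions, not any particular meta-queue tool; the task function and counts are placeholders.

```python
# Minimal pilot-job sketch: one scheduler allocation hosts a worker pool that
# drains a queue of many small tasks, so the HPC scheduler schedules a single
# job instead of each task individually.
from multiprocessing import Pool

def run_task(task_id):
    # Placeholder for one small unit of work (e.g., a parameter-sweep point).
    return task_id * task_id

if __name__ == "__main__":
    tasks = range(10_000)                 # many small tasks, one allocation
    with Pool(processes=32) as pool:      # workers sized to the allocated cores
        results = pool.map(run_task, tasks)
    print(f"Completed {len(results)} tasks inside a single scheduler job")
```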

As we have written previously, the aims of HPC resource administrators and HPC resource users are sometimes at odds.

Increasing HPC Utilization with Meta-Queues was written by Nicole Hemsoth at The Next Platform.
