Taking the Heavy Lifting Out of TensorFlow at Extreme Scale
There is no real middle ground when it comes to TensorFlow use cases. Most implementations take place either in a single node or at the drastic Google-scale, with few scalability stories in between.
This is starting to change, however, as more users find an increasing array of open source tools based on MPI and other approaches to hop to multi-GPU scalability for training, but it still not simple to scale Google’s own framework across larger machines. Code modifications get hairy beyond single node and for the MPI uninitiated, there is a steep curve to scalable deep learning.
Although high performance …
Taking the Heavy Lifting Out of TensorFlow at Extreme Scale was written by Nicole Hemsoth at The Next Platform.