Applying computationally expensive deep learning applications at large scale can be an enormous challenge. Using TensorFlowOnSpark we can distribute these computationally expensive processes in the cluster, enabling us to perform computations at a larger scale. In this chapter, we will explore Yahoo's TensorFlowOnSpark framework for distributed deep learning on spark clusters. And we will apply TensorFlowOnSpark on a large scale dataset of images and train the network to detect objects. This chapter will cover:
- The need for distributed AI
- Introduction to Apache Spark platform for big data
- TensorFlowOnSpark a Python framework to run TensorFlow on Spark clusters
- Performing object detection using TensorFlowOnSpark and Sparkdl API