deniscast/Movie-processing
This is an "amazing" Scala project using Spark and Kafka!

(1): startKafka: starts the Kafka server
     initKafkaPython: adds new movie data to Kafka
(2): analyzeData: reads new movies from Kafka and writes the analyzed movies back to Kafka
(3): persist: persists analyzed movies from Kafka to the file system
(4): sparkNotebook: uses analyzed movies from Kafka to print cool graphics

More explanation about the pipeline is in Hadoop-spark.pdf.
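As a rough illustration of step (2), here is a minimal sketch of a Kafka-to-Kafka job written with Spark Structured Streaming. It is not the project's actual code. The README does not say which Spark streaming API is used, so Structured Streaming is an assumption. The broker address, topic names, checkpoint path, and the `analyze` placeholder (which only upper-cases each message) are all hypothetical. Running it needs the `spark-sql-kafka-0-10` connector on the classpath and a Kafka broker that is already running.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, upper}

object AnalyzeDataSketch {
  // Hypothetical values: the README gives no broker address or topic names.
  val Brokers = "localhost:9092"
  val InputTopic = "movies"
  val OutputTopic = "analyzed-movies"

  // Options understood by Spark's Kafka source.
  def sourceOptions(brokers: String, topic: String): Map[String, String] =
    Map("kafka.bootstrap.servers" -> brokers, "subscribe" -> topic)

  // Placeholder "analysis": decodes key and value as strings and upper-cases the value.
  // The real analysis is described in Hadoop-spark.pdf, not here.
  def analyze(raw: DataFrame): DataFrame =
    raw.select(
      col("key").cast("string").as("key"),
      upper(col("value").cast("string")).as("value")
    )

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("analyzeData-sketch")
      .master("local[*]")
      .getOrCreate()

    // Read new movies from Kafka as an unbounded stream.
    val raw = spark.readStream
      .format("kafka")
      .options(sourceOptions(Brokers, InputTopic))
      .load()

    // Write the analyzed movies to a second Kafka topic.
    val query = analyze(raw).writeStream
      .format("kafka")
      .option("kafka.bootstrap.servers", Brokers)
      .option("topic", OutputTopic)
      .option("checkpointLocation", "/tmp/analyzeData-checkpoint")
      .start()

    query.awaitTermination()
  }
}
```

Keeping the transformation in a separate `analyze` function means it can be applied to a static DataFrame in a test without a running broker. Step (3) could follow the same shape with a file sink in place of the Kafka sink.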