- Gen.py
- Stage 1:
  - MapperStg1.py
  - ReducerStg1.py
- Stage 2:
  - MapperStg2.py
  - ReducerStg2.py
A description of what each file does will be added at a later date.
- Run gen.py to create the data set in dataPoints.txt.
- To get a list of canopy centers, pipe the data set through the Stage 1 mapper and reducer:
  - "cat dataPoints.txt | ./mapperStg1.py | sort | ./reducerStg1.py"
  - The output is a list of canopy centers, stored in canopyCenters.txt.
  - Each output line has the format "1\tDataPoint".
- Pipe that output to Stage 2 to assign each data point to a canopy center.
  - Each output line has the format "CanopyCenter\tDataPoint".
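The two stages can be illustrated with a small in-memory sketch. This is not the code in the Stage 1 and Stage 2 scripts; it is a minimal example under stated assumptions: data points are comma-separated floats, distance is Euclidean, a point becomes a new canopy center when it lies farther than a threshold (here the hypothetical parameter t2) from every existing center, and each point is then assigned to its nearest center. All function names are made up for illustration.

```python
import math


def parse_point(text):
    """Parse a comma-separated line such as "1.0,2.5" into a tuple of floats."""
    return tuple(float(value) for value in text.strip().split(","))


def format_point(point):
    """Turn a tuple of floats back into a comma-separated string."""
    return ",".join(str(value) for value in point)


def distance(a, b):
    """Euclidean distance between two points of equal dimension (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pick_canopy_centers(points, t2):
    """Stage 1 idea: keep a point as a center if no existing center is within t2."""
    centers = []
    for point in points:
        if all(distance(point, center) > t2 for center in centers):
            centers.append(point)
    return centers


def assign_to_centers(points, centers):
    """Stage 2 idea: yield one "CanopyCenter\tDataPoint" line per data point."""
    for point in points:
        nearest = min(centers, key=lambda center: distance(center, point))
        yield "%s\t%s" % (format_point(nearest), format_point(point))


if __name__ == "__main__":
    # Tiny made-up data set, only to show the line format.
    raw = ["0.0,0.0", "1.0,1.0", "10.0,10.0", "9.0,9.5"]
    points = [parse_point(line) for line in raw]
    centers = pick_canopy_centers(points, t2=3.0)
    for line in assign_to_centers(points, centers):
        print(line)
```

Nearest-center assignment is a simplification chosen to keep the example short; a canopy implementation may instead place a point in every canopy whose center lies within a looser threshold.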
Note:
- If running on Windows cmd, you have to provide your own sort step to sort the mapper output before it reaches the reducer.
- Personally, I'd recommend just using a Linux OS to keep things simple.
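For anyone who does need their own sort step, a minimal stand-in for "sort" could look like the sketch below. The file name sortLines.py and the function name sort_lines are hypothetical, not part of this repository. It reads every line from standard input, sorts them as plain strings, and writes them back out, which is enough to place lines with the same key next to each other for the reducer.

```python
import sys


def sort_lines(lines):
    """Return the lines sorted as plain strings, each ending in one newline.

    Trailing line endings are normalised first so that a final line
    without a newline still sorts and prints correctly.
    """
    stripped = [line.rstrip("\r\n") for line in lines]
    return [line + "\n" for line in sorted(stripped)]


if __name__ == "__main__":
    # Read all mapper output, sort it, and pass it on to the reducer.
    sys.stdout.writelines(sort_lines(sys.stdin.readlines()))
```

Assuming the scripts are started through the Python interpreter, the Stage 1 pipeline on cmd might then read "type dataPoints.txt | python mapperStg1.py | python sortLines.py | python reducerStg1.py". The ordering is by character code and can differ from a locale-aware "sort", but lines with identical keys still end up adjacent.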