Spotifeature is a project for the Data Wrangling course @ Vrije Universiteit Amsterdam aiming to investigate possible relations between audio features of a playlist and playlist metadata (e.g., popularity measured in followers). The project report outlines the findings of the research.
The code for this project is distributed over multiple notebooks, as many parts can be treated as individual (isolated) steps:
- `visualization_initial.ipynb` is used to gain a first understanding of the dataset, which informs further decisions (e.g., the minimum followers threshold). The notebook offers tools for generating
- `acquisition_playlists.ipynb` is used for processing the initial dataset (1,000,000 public Spotify playlists). The notebook can be used to create two different outputs:
  - The original dataset, stripped of some (unneeded) playlist attributes, serialized as a Python pickle file.
  - A list of all unique track IDs (used for audio feature acquisition), also as a Python pickle file.
- `acquisition_features.ipynb` is used for building a dataset of audio features for the track IDs identified in the previous notebook. The resulting dataset is saved as a `csv` file.
- `processing.ipynb` (with `process_playlist.py`) is used for generating playlist metrics based on the audio features. The results are serialized and stored in a Python pickle file.
- `data_visualizatoin.ipynb` is used to generate all graphs used for feature trend discovery according to the research questions, as well as the individual graphs used for the report/presentation.
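The steps above can be illustrated with a minimal sketch. It is not the code from the notebooks: the attribute names (`name`, `num_followers`, `tracks`), the feature names, the helper functions, and the choice of the mean as the playlist metric are all assumptions made for illustration, and the toy data is made up.

```python
import pickle
from statistics import mean

# Hypothetical attribute names; the real dataset schema may differ.
KEEP = ("name", "num_followers", "tracks")


def strip_playlist(playlist, keep=KEEP):
    """Return a copy of the playlist with only the attributes in `keep`."""
    return {key: playlist[key] for key in keep if key in playlist}


def unique_track_ids(playlists):
    """Collect every track ID that occurs in any playlist, without duplicates."""
    return sorted({track for p in playlists for track in p["tracks"]})


def playlist_metrics(playlist, features):
    """Average each audio feature over the playlist's tracks (assumed metric)."""
    rows = [features[t] for t in playlist["tracks"] if t in features]
    if not rows:
        return {}
    return {name: mean(row[name] for row in rows) for name in rows[0]}


# Toy input standing in for the playlist dataset and the audio feature table.
playlists = [
    {"name": "Morning", "num_followers": 12, "tracks": ["t1", "t2"], "description": "x"},
    {"name": "Gym", "num_followers": 340, "tracks": ["t2", "t3"], "description": "y"},
]
features = {
    "t1": {"energy": 0.2, "valence": 0.6},
    "t2": {"energy": 0.4, "valence": 0.8},
    "t3": {"energy": 0.9, "valence": 0.5},
}

stripped = [strip_playlist(p) for p in playlists]
track_ids = unique_track_ids(stripped)

# Round trip through pickle, as an in-memory stand-in for the pickle files.
restored = pickle.loads(pickle.dumps(stripped))
metrics = [playlist_metrics(p, features) for p in restored]
```

Here `track_ids` would be the input for the audio feature acquisition step, and `metrics` holds one dictionary of averaged features per playlist. In practice the pickled bytes would be written to a file rather than kept in memory.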
As most of the generated (intermediate) data is substantial in size, the files are stored separately in a Google Drive Folder.
| Name | Profile |
|---|---|
| Lennart K.M. Schulz | GitHub, LinkedIn |
| Laura I.M. Stampf | GitHub, LinkedIn |
| Dovydas Vadišius | GitHub, LinkedIn |