mkrsteska/DMML2019_Team_Tesla

DMML2019_Team_Tesla

Repository for the DM ML course project
Master: Information Systems - HEC Lausanne - 2019

Video: https://www.youtube.com/watch?v=UdqfCPRODcw&t=180s

Structure of the project

code

    1. Download Spotify Data (don’t run this notebook)
    2. Download Lyrics (don’t run this notebook)
    3. Cleaning Lyrics (don’t run this notebook)
    4. Exploratory Data Analysis - Spotify Audio Features
    5. Exploratory Data Analysis - Lyrics
    6. Classification
    7. Clustering
    8. GloVe model

data

  • top_hits.json
  • songs.json
  • top_hits_merged_clean_lyrics_audio_features.json
  • not_hits_merged_clean_lyrics_audio_features.json

To run the notebooks, download the GloVe model from https://drive.google.com/open?id=126qGJC9o1da-_deqGwfN6iyuU2S0AtMa and place it in the data folder.
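As a rough illustration of how pretrained word vectors can be used once downloaded, the sketch below parses a file in the plain-text GloVe format (one token per line, followed by its vector components) and averages the vectors of a song's tokens. The file format of the download is an assumption, and the function names `load_glove_text` and `average_vector` are hypothetical; the notebooks may load the model differently.

```python
from typing import Dict, List, Sequence


def load_glove_text(path: str) -> Dict[str, List[float]]:
    """Parse a GloVe-style text file into a token -> vector mapping.

    Assumption: each line is "<token> <v1> <v2> ... <vn>", separated by spaces.
    """
    vectors: Dict[str, List[float]] = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            parts = line.rstrip().split(" ")
            if len(parts) < 2:
                continue  # skip empty or malformed lines
            vectors[parts[0]] = [float(value) for value in parts[1:]]
    return vectors


def average_vector(tokens: Sequence[str], vectors: Dict[str, List[float]]) -> List[float]:
    """Average the vectors of all known tokens; return [] if none are known."""
    known = [vectors[token] for token in tokens if token in vectors]
    if not known:
        return []
    dim = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]
```

Calling `load_glove_text` on the downloaded file (if it is in text format) and then `average_vector(lyrics.lower().split(), vectors)` would give one fixed-length representation per song.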

Data mining:

Our data come from Spotify and were collected through the Spotify Web API.

The API only lets us collect information about a song itself, such as its audio features.

More information about the data we collected:
https://developer.spotify.com/documentation/web-api/reference/tracks/get-audio-features/
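For orientation, a request for one track's audio features can be sketched as follows. This is a minimal sketch using only the standard library, not the code from the download notebook; it assumes you already hold a valid OAuth access token, and the helper names are hypothetical.

```python
import json
import urllib.request

API_ROOT = "https://api.spotify.com/v1"


def build_audio_features_request(track_id: str, access_token: str) -> urllib.request.Request:
    """Build a GET request for the audio features of a single track."""
    url = f"{API_ROOT}/audio-features/{track_id}"
    headers = {"Authorization": f"Bearer {access_token}"}
    return urllib.request.Request(url, headers=headers)


def fetch_audio_features(track_id: str, access_token: str) -> dict:
    """Send the request and decode the JSON body (requires network access)."""
    request = build_audio_features_request(track_id, access_token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

The returned dictionary contains fields such as `danceability`, `energy` and `tempo`, which are described in the documentation linked above.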

To get the lyrics of a song, we scraped the following websites:

  • genius.com
  • lyricsmode.com
  • songlyrics.com
  • metrolyrics.com
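One plausible way to combine several lyrics sources is a fallback chain: try each site in turn and keep the first non-empty result. The sketch below shows only that control flow. The per-site fetcher functions are hypothetical placeholders supplied by the caller, since the actual scraping logic lives in the download notebook.

```python
from typing import Callable, Iterable, Optional

# A fetcher takes (artist, title) and returns the lyrics, or None if not found.
Fetcher = Callable[[str, str], Optional[str]]


def first_available_lyrics(artist: str, title: str, fetchers: Iterable[Fetcher]) -> Optional[str]:
    """Return the lyrics from the first fetcher that yields non-empty text."""
    for fetch in fetchers:
        try:
            lyrics = fetch(artist, title)
        except Exception:
            # A failing source (network error, changed page layout) should
            # not stop the search; move on to the next one.
            continue
        if lyrics and lyrics.strip():
            return lyrics.strip()
    return None
```

Ordering the fetchers by reliability means the less consistent sites are queried only when the earlier ones fail.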

Machine learning:

We would like to predict whether a song will be a hit or not, based on lyrics and audio features.
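Framing this as binary classification starts with a labeled dataset built from the two merged files in the data folder. The sketch below is a minimal illustration, assuming each file holds a JSON array of song records. That structure and the `is_hit` field name are assumptions, not taken from the notebooks.

```python
import json
from typing import Dict, List


def load_labeled_songs(hits_path: str, not_hits_path: str) -> List[Dict]:
    """Combine hit and non-hit records into one list with a binary label.

    Assumption: both files contain a JSON array of objects (one per song).
    """
    songs: List[Dict] = []
    for path, label in ((hits_path, 1), (not_hits_path, 0)):
        with open(path, encoding="utf-8") as handle:
            records = json.load(handle)
        for record in records:
            labeled = dict(record)      # copy so the loaded data is not mutated
            labeled["is_hit"] = label   # hypothetical label field: 1 = hit, 0 = not a hit
            songs.append(labeled)
    return songs
```

With the file names listed above, the call would be `load_labeled_songs("data/top_hits_merged_clean_lyrics_audio_features.json", "data/not_hits_merged_clean_lyrics_audio_features.json")`. The resulting list can then be split into training and test sets for any classifier.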

The team:

mkrsteska: Marija Krsteska
thil5: Samuel Lew
Martynas2: Martynas Savickas
yassinhediger: Yassin Hediger
