Commit 4534f09

Dummy Variables + Changes
1 parent 9569e63 commit 4534f09

2 files changed

Lines changed: 30 additions & 4 deletions

1. Clustering/Clustering.py

Lines changed: 4 additions & 4 deletions
@@ -53,8 +53,8 @@
 # ===================== CREDITS =====================
-# Jake VanderPlas:
-# -> https://jakevdp.github.io/PythonDataScienceHandbook/05.11-k-means.html
+### Jake VanderPlas:
+# https://jakevdp.github.io/PythonDataScienceHandbook/05.11-k-means.html

-# Aletta Smits:
-# -> Big Data and Social Media / Data Learning Class (week 1 - Day 1)
+### Aletta Smits:
+# Big Data and Social Media / Data Learning Class (week 1 - Day 1)

1. Clustering/Dummy_Variables.py

Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+import pandas as pd
+
+# Read dataset
+absenteeism_df = pd.read_csv("Absenteeism_at_work.csv", sep = ";")
+
+# Transform a categorical variable to dummy variables (one hot)
+dummy_var = pd.get_dummies(absenteeism_df["Day of the week"])
+
+# Add dummy_var to the original dataset
+absenteeism_df = pd.concat([absenteeism_df, dummy_var], axis = 1)
+
+# Remove old column to prevent errors
+absenteeism_df = absenteeism_df.drop("Day of the week", axis = 1)
+
+# See the results
+print(absenteeism_df.head())
+
+# ===================== CREDITS =====================
+### Aletta Smits:
+# Big Data and Social Media / Data Learning Class (week 1 - Day 2)
+
+### Rowan Langford:
+# https://towardsdatascience.com/the-dummys-guide-to-creating-dummy-variables-f21faddb1d40
+
+### Shanelynn:
+# https://www.shanelynn.ie/using-pandas-dataframe-creating-editing-viewing-data-in-python/
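
Note on the encoding step: the script builds the dummy columns, concatenates them, and then drops the source column in three separate calls. As a rough sketch of an alternative (not what this commit does), `pd.get_dummies` can do all three at once through its `columns` argument, and `prefix` gives the new columns readable names. The small DataFrame below is a hypothetical stand-in for `Absenteeism_at_work.csv`, and it assumes the weekday is stored as an integer code.

```python
import pandas as pd

# Hypothetical stand-in for the CSV: a few rows with an integer-coded weekday.
df = pd.DataFrame({
    "ID": [11, 36, 3, 7],
    "Day of the week": [3, 3, 4, 5],
})

# columns= encodes the named column and removes it from the result,
# so no separate concat/drop is needed.
# prefix= yields names such as "Day_3" instead of bare integer labels.
encoded = pd.get_dummies(df, columns=["Day of the week"], prefix="Day")

print(encoded.columns.tolist())
```

Without a prefix, calling `pd.get_dummies` on a single Series (as in the script) names the new columns after the raw category values, which can be hard to tell apart from other columns once they are concatenated into the full dataset.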
