Tracking models

A simple framework for saving and tracking the performance of your model versions during the exploration and training phases.

How it works

The framework relies on just two components, the save_model function and the ClassModelResults class, to track your model versions.

save_model function

The save_model function locally saves:

A pickle file of your model
The hyperparameters of the algorithm you used
The performance metrics
Summary statistics that characterize your model data
The features and target variable
The mean X_train feature values for tracking purposes
Any other metric you want to add
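
The list above can be pictured with a minimal sketch. This is not the framework's implementation: the helper name save_artifacts_sketch, the .pkl and .json file names, and the JSON layout are assumptions made for illustration. Only the models and results_files directory names and the user-timestamp ID pattern come from the example further down.

```python
import json
import pickle
import tempfile
from datetime import datetime
from pathlib import Path


def save_artifacts_sketch(model, params, metrics, user, base_dir):
    """Hypothetical helper: pickle the model and write params/metrics as JSON."""
    # ID pattern modeled on the example output: <user>-<YYYYmmddHHMMSS>.
    model_id = f"{user}-{datetime.now():%Y%m%d%H%M%S}"
    models_dir = Path(base_dir) / "models"
    results_dir = Path(base_dir) / "results_files"
    models_dir.mkdir(parents=True, exist_ok=True)
    results_dir.mkdir(parents=True, exist_ok=True)

    # Assumed file layout: one pickle per model, one JSON file per result.
    with open(models_dir / f"{model_id}.pkl", "wb") as fh:
        pickle.dump(model, fh)
    with open(results_dir / f"{model_id}.json", "w") as fh:
        json.dump({"params": params, "metrics": metrics}, fh, indent=2)
    return model_id


# Stand-in "model": any picklable object works for the illustration.
demo_dir = tempfile.mkdtemp()
demo_id = save_artifacts_sketch(
    model={"weights": [0.1, 0.2]},
    params={"max_depth": 10},
    metrics={"AUC": 0.85},
    user="Your_user_name",
    base_dir=demo_dir,
)
saved_files = sorted(p.name for p in Path(demo_dir).rglob("*") if p.is_file())
print(saved_files)
```

Keeping the pickle separate from the plain-text results means the metrics can be compared across versions without unpickling every model.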

ClassModelResults class

This class loads all the results saved by the save_model function and summarizes them in a dictionary with the following dataframes:

params: The hyperparameters of the algorithm used to build each model
metrics: The performance metrics of each model
stats: Summary statistics that characterize each model's data
features_train_cols: The features and target variable of each model
features_train_mean: The mean X_train feature values of each model

NOTE: you can also load only the last n results by using ClassModelResults(last_results=n)
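
The idea of summarizing only the last n results can be sketched as follows. This is an illustration under stated assumptions, not the class's actual code: the in-memory records, the saved_at field, and the summarize_last helper are hypothetical, and plain dictionaries stand in for the dataframes the real class returns.

```python
from datetime import datetime

# Hypothetical in-memory records; the real class reads saved results from disk.
records = [
    {"model": "user-a", "saved_at": "2023-02-21 21:47:00",
     "params": {"C": 10}, "metrics": {"AUC": 0.785}},
    {"model": "user-b", "saved_at": "2023-02-26 15:26:00",
     "params": {"max_depth": 1}, "metrics": {"AUC": 0.874}},
    {"model": "user-c", "saved_at": "2023-02-26 15:30:00",
     "params": {"max_depth": 10}, "metrics": {"AUC": 0.847}},
]


def summarize_last(records, last_results=None):
    """Group results by kind, optionally keeping only the newest ones."""
    ordered = sorted(
        records,
        key=lambda r: datetime.strptime(r["saved_at"], "%Y-%m-%d %H:%M:%S"),
    )
    if last_results is not None:
        # Slice from an explicit start index so last_results=0 yields nothing.
        start = max(len(ordered) - last_results, 0)
        ordered = ordered[start:]
    return {
        "params": {r["model"]: r["params"] for r in ordered},
        "metrics": {r["model"]: r["metrics"] for r in ordered},
    }


summary = summarize_last(records, last_results=2)
print(list(summary["metrics"]))
```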

Try it yourself

Test the framework using the notebook "Tracking model results - Test notebook.ipynb".

Example

Build a basic model

import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv('Titanic_train.csv')

df = pd.get_dummies(df,columns=['Sex'])

target = 'Survived'
X = df[['Pclass','Age','Sex_female','Fare']]
X = X.fillna(0)
y = df[target]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)

n = 10

### Training
from sklearn import svm
import xgboost as xgb

#clf = RandomForestClassifier(n_estimators=50+(n*2), max_depth=(n*3), random_state=n)
#clf = svm.SVC(C=n,probability=True)
clf = xgb.XGBClassifier(max_depth=n)
clf.fit(X_train,y_train)

### PREDICTIONS
y_pred = clf.predict(X_test)
#y_probas = clf.predict_proba(X_test)
labels = y.unique()
feature_names = list(X_train.columns)

### Save model
save_model(clf,X_train,X_test,y_train,y_test,"Your_user_name",1,"",
           {},
           {},
           {},
           1,
           save = 1
          )

saved metrics 
MODEL ID:  Your_user_name-20230226153020
Tipo_metrica: Clasificacion
Tipo_modelo: 1
Algoritmo: XGBClassifier
umbral: 0.5
AUC: 0.8466666666666667
Gini: 0.6933333333333334
PRAUC: 0.8108876087759457
F1_score: 0.7467811158798284
Accuracy: 0.8
Recall: 0.725
Precision: 0.7699115044247787
Fecha: 2023-02-26 15:30:00
Comentario: 

Modelo, parametros, metricas y stats guardados exitosamente
(In English: model, parameters, metrics, and stats saved successfully.)
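
Some of the saved metrics are related by standard definitions, which gives a quick sanity check on the output. The sketch below recomputes Gini from AUC and F1_score from Precision and Recall using the values printed above. That save_model computes them in exactly this way is an assumption; the formulas are only the usual textbook ones, and they agree with the reported numbers.

```python
# Values copied from the saved metrics shown above.
auc = 0.8466666666666667
recall = 0.725
precision = 0.7699115044247787

# Gini coefficient as a rescaling of AUC to the range [-1, 1].
gini = 2 * auc - 1

# F1 score as the harmonic mean of precision and recall.
f1_score = 2 * precision * recall / (precision + recall)

print(f"Gini: {gini:.6f}")          # Gini: 0.693333
print(f"F1_score: {f1_score:.6f}")  # F1_score: 0.746781
```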

Instantiate the model results object

model_results = ClassModelResults()

Update all model results

dir_results_files = 'results_files' #Default path
dir_models = 'models' #Default path

df_results = model_results.get_model_results(dir_results_files)

Get metrics from all the stored models

df_results["metrics"]

Model	Tipo_metrica	Tipo_modelo	Algoritmo	umbral	AUC	Gini	PRAUC	F1_score	Accuracy	Recall	Precision	Fecha	Comentario
Your_user_name-20230226152641	Clasificacion	1	XGBClassifier	0.5	0.873905	0.747810	0.857136	0.737327	0.806780	0.666667	0.824742	2023-02-26 15:26:00	NaN
Your_user_name-20230226152648	Clasificacion	1	XGBClassifier	0.5	0.846667	0.693333	0.810888	0.746781	0.800000	0.725000	0.769912	2023-02-26 15:26:00	NaN
Your_user_name-21214708	Clasificacion	1	RandomForestClassifier	0.5	0.833333	0.666667	NaN	0.776812	0.786441	0.778095	0.780085	2023-02-21 21:47:00	NaN
Your_user_name-20230226153020	Clasificacion	1	XGBClassifier	0.5	0.846667	0.693333	0.810888	0.746781	0.800000	0.725000	0.769912	2023-02-26 15:30:00	NaN
Your_user_name-21214715	Clasificacion	1	SVC	0.5	0.785476	0.570952	NaN	0.648842	0.684746	0.653095	0.682529	2023-02-21 21:47:00	NaN
Your_user_name-21214718	Clasificacion	1	SVC	0.5	0.763810	0.527619	NaN	0.640441	0.688136	0.645476	0.691590	2023-02-21 21:47:00	NaN

Get columns

df_results["features_train_cols"]

Model	0	1	2	3	4
Your_user_name-20230226152648	Pclass	Age	Sex_female	Fare	Survived
Your_user_name-21214718	Pclass	Age	Sex_female	Fare	Survived
Your_user_name-20230226153020	Pclass	Age	Sex_female	Fare	Survived
Your_user_name-21214708	Pclass	Age	Sex_female	Fare	Survived
Your_user_name-21214715	Pclass	Age	Sex_female	Fare	Survived
Your_user_name-20230226152641	Pclass	Age	Sex_female	Fare	Survived

Load a specific model

load_model("Your_user_name-20230226152641", dir_models=dir_models)

{'chosen_model': XGBClassifier(base_score=0.5, booster='gbtree', callbacks=None,
               colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1,
               early_stopping_rounds=None, enable_categorical=False,
               eval_metric=None, gamma=0, gpu_id=-1, grow_policy='depthwise',
               importance_type=None, interaction_constraints='',
               learning_rate=0.300000012, max_bin=256, max_cat_to_onehot=4,
               max_delta_step=0, max_depth=1, max_leaves=0, min_child_weight=1,
               missing=nan, monotone_constraints='()', n_estimators=100,
               n_jobs=0, num_parallel_tree=1, predictor='auto', random_state=0,
               reg_alpha=0, reg_lambda=1, ...)}

Get the best model

model_results.load_best_model('AUC')

{'chosen_model': XGBClassifier(base_score=0.5, booster='gbtree', callbacks=None,
               colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1,
               early_stopping_rounds=None, enable_categorical=False,
               eval_metric=None, gamma=0, gpu_id=-1, grow_policy='depthwise',
               importance_type=None, interaction_constraints='',
               learning_rate=0.300000012, max_bin=256, max_cat_to_onehot=4,
               max_delta_step=0, max_depth=1, max_leaves=0, min_child_weight=1,
               missing=nan, monotone_constraints='()', n_estimators=100,
               n_jobs=0, num_parallel_tree=1, predictor='auto', random_state=0,
               reg_alpha=0, reg_lambda=1, ...)}
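
Selecting the best model by a metric can be pictured as a lookup over the metrics table. The sketch below uses the AUC values from the table above. The pick_best helper and the plain dictionary are hypothetical stand-ins, and how load_best_model actually ranks models (for example, how it treats ties or metrics where lower is better) is not stated in this README.

```python
# AUC values copied from the metrics table above.
auc_by_model = {
    "Your_user_name-20230226152641": 0.873905,
    "Your_user_name-20230226152648": 0.846667,
    "Your_user_name-21214708": 0.833333,
    "Your_user_name-20230226153020": 0.846667,
    "Your_user_name-21214715": 0.785476,
    "Your_user_name-21214718": 0.763810,
}


def pick_best(scores, higher_is_better=True):
    """Hypothetical helper: return the model ID with the best score."""
    chooser = max if higher_is_better else min
    return chooser(scores, key=scores.get)


best_model_id = pick_best(auc_by_model)
print(best_model_id)  # Your_user_name-20230226152641
```

The result matches the example: the model returned by load_best_model('AUC') has max_depth=1, the same as the model loaded by ID in the previous step.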
