Classifying mushrooms into poisonous and nonpoisonous

About the code

  • Language used: Python
  • Packages used: pandas, numpy, sklearn

About the data

Process

  1. Data cleaning:

    • Convert categorical features to dummy variables
    • Convert response to binary
    • Split data into training and test groups manually (not using a package)
  2. Train models:

    • Cross-validation was run within GridSearchCV
    • Models trained: Random Forest, Support Vector Machine, XGBoost, Neural Network
  3. Assess model performance on test data

    • Performance assessed with accuracy, F1 score, and ROC AUC
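
The three steps above can be sketched end to end as follows. This is a minimal illustration, not the repository's actual code: the toy table, its column names (`odor`, `cap_shape`, `class`), the `"p"`/`"e"` labels, the 25% test fraction, and the hyperparameter grid are all assumptions made for the example, and the helper functions are hypothetical. Only a Random Forest is shown; the other listed models could presumably be tuned with the same GridSearchCV pattern.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score
from sklearn.model_selection import GridSearchCV


def make_toy_mushrooms(n_rows=200, seed=0):
    """Build a small synthetic table (hypothetical columns, not the real data)."""
    rng = np.random.default_rng(seed)
    odor = rng.choice(["none", "foul", "almond"], size=n_rows)
    cap = rng.choice(["flat", "bell", "convex"], size=n_rows)
    # Toy rule: foul-smelling mushrooms are poisonous ("p"), the rest edible ("e").
    label = np.where(odor == "foul", "p", "e")
    return pd.DataFrame({"odor": odor, "cap_shape": cap, "class": label})


def clean(df, response="class", positive="p"):
    """One-hot encode the categorical features and binarize the response."""
    X = pd.get_dummies(df.drop(columns=[response]), dtype=int)
    y = (df[response] == positive).astype(int)
    return X, y


def manual_split(X, y, test_fraction=0.25, seed=0):
    """Shuffle row positions with NumPy and slice them into train and test sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    n_test = int(round(len(X) * test_fraction))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return X.iloc[train_idx], X.iloc[test_idx], y.iloc[train_idx], y.iloc[test_idx]


def fit_and_score(X_train, X_test, y_train, y_test):
    """Tune a random forest with cross-validated grid search, then score on test data."""
    grid = {"n_estimators": [10, 25], "max_depth": [2, None]}  # illustrative values
    search = GridSearchCV(RandomForestClassifier(random_state=0), grid, cv=3, scoring="f1")
    search.fit(X_train, y_train)
    predictions = search.predict(X_test)
    # ROC AUC needs scores rather than hard labels, so use the positive-class probability.
    probabilities = search.predict_proba(X_test)[:, 1]
    return {
        "accuracy": accuracy_score(y_test, predictions),
        "f1": f1_score(y_test, predictions),
        "roc_auc": roc_auc_score(y_test, probabilities),
    }


X, y = clean(make_toy_mushrooms())
X_train, X_test, y_train, y_test = manual_split(X, y)
scores = fit_and_score(X_train, X_test, y_train, y_test)
print(scores)
```

The split is done with a NumPy permutation rather than a scikit-learn helper to mirror the "manual" split described in step 1; the test rows are held out entirely from the grid search so the reported metrics reflect unseen data.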