nredick/mais-202

Stone Classifier

Final project created for McGill AI Society: Accelerated Introduction to Machine Learning (Winter 2020).

The training and test datasets were created specifically for this project via web scraping and are sourced from the Smithsonian NMNH Geology Collections Data Portal [https://geogallery.si.edu/portal] (educational use only).

Project Description

The stone classifier project is a webapp that classifies images of stones into four categories (rock, fossil, gemstone, mineral). The dataset was created by scraping and parsing the NMNH Geology Collections Data Portal with BeautifulSoup. Image preprocessing included removing duplicates, resizing, and applying a Gaussian blur to reduce noise.
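As an illustration of the duplicate-removal step, here is a minimal sketch that flags files whose bytes exactly match an earlier file. Hashing raw file contents is an assumption made for this example; the project's own preprocessing script may detect duplicates differently, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's raw bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def find_duplicates(image_dir):
    """Return paths whose bytes match an earlier file in the directory.

    Hypothetical helper: only exact byte-for-byte copies are detected,
    not resized or re-encoded versions of the same picture.
    """
    seen = {}
    duplicates = []
    for path in sorted(Path(image_dir).iterdir()):
        if not path.is_file():
            continue
        digest = file_digest(path)
        if digest in seen:
            duplicates.append(path)  # same content as seen[digest]
        else:
            seen[digest] = path
    return duplicates
```

The returned paths can then be reviewed or deleted before resizing and blurring the remaining images.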

I built the model, a convolutional neural network (CNN), with Keras/TensorFlow on Google Colab. The stone-classifier-webapp backend is built on Flask.

The final model had an accuracy of 93.8% and a loss of 0.148 on the test data.
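For readers unfamiliar with how an accuracy figure like this is obtained, the sketch below maps a four-way probability vector to a category name and computes the fraction of correct predictions. The class order in `CLASS_NAMES` and both function names are assumptions for illustration; the trained model's actual index order may differ.

```python
# Class order is an assumption; the trained model's index order may differ.
CLASS_NAMES = ["rock", "fossil", "gemstone", "mineral"]


def predict_label(probabilities):
    """Return the class name with the highest predicted probability."""
    best_index = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return CLASS_NAMES[best_index]


def accuracy(predictions, true_labels):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(predict_label(p) == t for p, t in zip(predictions, true_labels))
    return correct / len(true_labels)
```

Here `predictions` stands for a list of per-image probability vectors, such as the rows a softmax output layer would produce.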

Deploying the Webapp

The webapp runs locally from the terminal or command line.

To use the webapp, clone the repository to your local machine and install all packages from requirements.txt:

pip install -r requirements.txt

The app requires TensorFlow, which here needs Python 3.5-3.7. The simplest way to run it is inside a pipenv shell.
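Before installing anything, it can help to confirm that the active interpreter falls in that range. This is a small sketch and not part of the repository; `is_supported` is a hypothetical helper and the bounds are copied from the text above.

```python
import sys

# Bounds taken from the Python 3.5-3.7 range stated above.
MIN_VERSION = (3, 5)
MAX_VERSION = (3, 7)


def is_supported(version_info=sys.version_info):
    """Return True if the major.minor version falls within the stated range."""
    return MIN_VERSION <= tuple(version_info[:2]) <= MAX_VERSION


if __name__ == "__main__":
    if is_supported():
        print("Python %d.%d is within the 3.5-3.7 range" % sys.version_info[:2])
    else:
        print("Python %d.%d is outside the 3.5-3.7 range" % sys.version_info[:2])
```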

Install pipenv (shown here with Homebrew):

brew update
brew install pipenv

Within the main directory (stone-classifier-webapp), activate the shell:

pipenv shell

Then, from the 'stone-classifier-webapp' directory, start the app:

python app.py

The 'Stone Classifier' webapp is based on: [https://github.com/mtobeiyf/keras-flask-deploy-webapp/blob/master/app.py]

Repository Organization

This repository contains the scripts used to webscrape and create the datasets, preprocess and label the original images, train the model, and build the webapp.

  • deliverables/
    • Contains final project deliverables and proposals for the MAIS 202 course.
  • model/
    • Python script written on Google Colab to build the CNN.
  • stone-classifier-webapp/
    • models/
      • Final CNN model.
    • static/
      • CSS and JS scripts for the landing page.
    • templates/
      • HTML for the landing page.
  • webscraping/
    • DataCollection/
    • GatherData/
      • Methods used to send GET requests for image data, parse HTML responses using Beautiful Soup, and retrieve labels for each image.
    • Preprocess/
      • Python script to sort images into 4 distinct classes, resize images, and remove duplicate images.
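The GatherData step described above parses HTML responses to find image data. The following sketch shows the general idea using Python's standard-library `html.parser` as a stand-in for Beautiful Soup, so it runs without extra packages. The class and function names are hypothetical, and the portal's real page markup is not shown in this README, so the example only assumes that images appear in ordinary `<img src="...">` tags.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageLinkParser(HTMLParser):
    """Collect absolute URLs from the src attribute of every <img> tag."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src:
            # Resolve relative paths against the page URL.
            self.image_urls.append(urljoin(self.base_url, src))


def extract_image_urls(html, base_url):
    """Return all image URLs found in an HTML string, in document order."""
    parser = ImageLinkParser(base_url)
    parser.feed(html)
    return parser.image_urls
```

In a scraper, the `html` argument would be the body of a GET response, and each returned URL would be downloaded and stored alongside its label.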
