Jekyll2020-01-29T03:34:58+00:00https://wasdkhan.github.io/feed.xmlWaseem Khan blogThoughts and reflections by a technology enthusiast. Building a Car Model Classifier (Part 3)2020-01-28T06:30:00+00:002020-01-28T06:30:00+00:00https://wasdkhan.github.io/2020/01/28/building-car-recognition-part-3<center><img src="/assets/building-car-recognition-part-3/flask-and-tf-logos.png" width="100%" /></center> <p>(Images from <a href="https://becominghuman.ai/creating-restful-api-to-tensorflow-models-c5c57b692c10">becominghuman.ai</a>)</p> <p>This is the final post and is a continuation of the <a href="/2020/01/27/building-car-recognition-part-2.html">previous post</a> and the <a href="/2020/01/12/building-car-recognition.html">first post</a>.</p> <p>Now that we have a trained model that can recognize make and model, we can build a web server to host the Tensorflow models and run inference on the images passed to it. However, we don’t want to load all the models into a single process and run them sequentially. We want them running in parallel so that one request is not waiting for another to finish before it can proceed.</p> <h3 id="serving-libraries">Serving Libraries</h3> <center><img src="/assets/building-car-recognition-part-3/flask-and-tf-serving.png" width="50%" /></center> <p>Tensorflow was chosen for this project specifically because it has a library called <a href="https://www.tensorflow.org/tfx/guide/serving">Tensorflow Serving</a> that integrates directly with Tensorflow and makes it easy to deploy models.</p> <p>We will also use <a href="https://github.com/pallets/flask/">Flask</a>, a minimal Python web framework, which gives us just enough to get our server off the ground and connect everything together.</p> <h3 id="pipeline">Pipeline</h3> <p>Since our training set consists of cropped images, we need a pipeline that detects the vehicle in the image and crops it before passing it to our classifier model. 
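</p>

<p>To make the request/response shapes concrete, here is a minimal sketch of the two pieces of glue the server needs: building the JSON body that Tensorflow Serving’s REST API expects, and picking the top k classes out of its response. The model name, port, and class names are placeholders, not the project’s actual configuration.</p>

```python
import json

# Hypothetical TF Serving REST endpoint for the make-model classifier;
# the model name and port are placeholders, not the project's config.
SERVING_URL = "http://localhost:8501/v1/models/make_model:predict"

def build_predict_request(image_tensor):
    """Build the JSON body Tensorflow Serving's REST API expects:
    {"instances": [<input tensor>]}."""
    return json.dumps({"instances": [image_tensor]})

def top_k(probs, class_names, k=3):
    """Return the k highest-probability (class, probability) pairs."""
    ranked = sorted(zip(class_names, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# A fake 4-class response, shaped like TF Serving's {"predictions": [...]}:
classes = ["honda_civic", "toyota_camry", "tesla_model_3", "ford_f150"]
fake_response = {"predictions": [[0.05, 0.15, 0.70, 0.10]]}
print(top_k(fake_response["predictions"][0], classes, k=2))
# -> [('tesla_model_3', 0.7), ('toyota_camry', 0.15)]
```

<p>In the real server, Flask would receive the uploaded image, POST the cropped tensor to the serving endpoint, and format the top k results for display.</p>

<p>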
We can then display the top k results with their probabilities.</p> <p>Additionally, we would like to include other features of the car, such as the license plate and the color of the vehicle, so we will add those to the pipeline as well, since they can be computed in parallel.</p> <p>For car detection and cropping, we will use <a href="https://arxiv.org/abs/1804.02767">YOLOv3</a> again for its fast and accurate results.</p> <center><img src="/assets/building-car-recognition-part-3/yolo-realtime.png" width="50%" /></center> <p>For color recognition, we will use a simple Python script found at this repository: <a href="https://github.com/bmoyles0117/Vehicle-Color-Detector">Vehicle-Color-Detector</a>.</p> <p>For license plate recognition, we will use a library that can recognize plates from skewed and unconstrained scenarios: <a href="https://github.com/sergiomsilva/alpr-unconstrained">alpr-unconstrained</a>. This has its own pipeline, which includes a YOLO-like model for license plate detection. However, instead of using Darknet, the C library in which YOLO is implemented, we will use <a href="https://opencv.org/">OpenCV</a>’s support for YOLO models.</p> <p>We convert the YOLOv3, make-model classification, and LPR (License Plate Recognition) models into Tensorflow Serving’s model format, and store them separately. Each represents a served model that can be run on a different server and can process multiple inputs at a time as a REST API. However, for this project, they will all be served on the same machine.</p> <h3 id="results">Results</h3> <center> <img src="/assets/building-car-recognition-part-3/make-model-web.png" width="40%" /> <img src="/assets/building-car-recognition-part-3/make-model-mobile.jpg" width="30%" /> </center> <p>Here we see the server live, processing an image on both web and mobile. 
Both results are correct (the top make and model guess, the license plate, and the color).</p> <h3 id="improvements">Improvements</h3> <p>Although the results are correct for the two cases shown above, there are cases where the results are not good: 1) the image is blurry and the features of the car aren’t discernible, or 2) the lighting or shadows make the car look a different color than it is (black -&gt; grey).</p> <p>These issues can be addressed by taking multiple pictures and choosing the best one, and by training a neural network to detect car color despite changes in lighting. However, for the sake of this project, these results are satisfactory.</p>Putting everything together on a Web Server.Building a Car Model Classifier (Part 2)2020-01-27T06:30:00+00:002020-01-27T06:30:00+00:00https://wasdkhan.github.io/2020/01/27/building-car-recognition-part-2<p>This post is a continuation of the <a href="/2020/01/12/building-car-recognition.html">previous post</a>.</p> <center><img src="/assets/building-car-recognition-part-2/tesla-model-3.jpg" width="50%" /></center> <p>Using feature extraction and matching to nearest neighbors can only go so far with the pretrained model on VMMRdb. What if we want to generalize to newer classes of cars like the Tesla Model S, not present in the VMMRdb dataset? For that, we need to collect more data for a more updated and robust model.</p> <h3 id="data-collection">Data Collection</h3> <p>Since no existing collection covers the newest car models, we will need to create our own dataset. And where can we find images? 
Google Images!</p> <p>But searching, selecting N images, downloading them, and organizing them by car name would take too long, so we automate it using a web scraper.</p> <center><img src="/assets/building-car-recognition-part-2/web-scraping.jpg" width="60%" /></center> <p>We use the NHTSA (National Highway Traffic Safety Administration) <a href="https://vpic.nhtsa.dot.gov/api/">Vehicle API</a> to collect a list of all the car, truck, and mpv (multi-purpose vehicle) makes and models from the year 2000 to 2018 (when this project was first started). Then we use a web scraper to create a list of links from Google Images, and finally we download them with multiprocessing, since the downloads are independent tasks that can run in parallel.</p> <p>The scraper and the custom scripts written for this project can be found here: <a href="https://github.com/wasdkhan/scrapeCars">scrapeCars</a></p> <h3 id="data-cleaning-and-preprocsesing">Data Cleaning and Preprocessing</h3> <p>Data cleaning and organization is usually the most time-consuming part of a data science project, and it’s no different for this project.</p> <p>Once we have all the image data, we first delete any duplicates within the same class, since we want unique samples and need to know the true size of each class in case class weighting is needed later.</p> <p>But how can we automate duplicate removal? We can hash the images by their pixel data, store them by that hash in a database, and delete an image whenever its hash has already been seen. 
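</p>

<p>As a minimal illustration of the idea, de-duplication reduces to a dictionary lookup over hashes. This sketch uses an exact hash of raw pixel bytes; the linked library is more sophisticated (it uses perceptual hashing, which also catches near-duplicates).</p>

```python
import hashlib

def pixel_hash(pixel_bytes):
    """Hash an image's raw pixel data (not the encoded file bytes, so
    identical pixels hash the same even if the file was re-saved)."""
    return hashlib.md5(pixel_bytes).hexdigest()

def find_duplicates(images):
    """images: {filename: pixel_bytes}. Return filenames whose pixels
    were already seen under an earlier filename."""
    seen = {}
    duplicates = []
    for name, pixels in sorted(images.items()):
        digest = pixel_hash(pixels)
        if digest in seen:
            duplicates.append(name)  # same pixels as seen[digest] -> delete
        else:
            seen[digest] = name
    return duplicates

# Toy example: two identical "images" and one distinct one.
images = {"a.jpg": b"\x01\x02", "b.jpg": b"\x01\x02", "c.jpg": b"\x03"}
print(find_duplicates(images))  # -> ['b.jpg']
```

<p>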
There’s a nice library that does this, found here: <a href="https://github.com/philipbl/duplicate-images">duplicate-images</a></p> <center><img src="/assets/building-car-recognition-part-2/image-hashing-blueprint.png" width="40%" /><img src="/assets/building-car-recognition-part-2/image-hashing.png" width="40%" /></center> <p>(Images from <a href="https://www.pyimagesearch.com/2017/11/27/image-hashing-opencv-python/">pyimagesearch</a>)</p> <p>Next, since our focus is classification and not detection (the difference being that detection is classification + localization, i.e. finding if and where an object is present in the scene), we crop the images down to just the car, removing most of the background. We accomplish this using YOLO v3: detect the largest bounding box of a truck or car, and crop it out. We use this library for ease of use: <a href="https://github.com/OlafenwaMoses/ImageAI">ImageAI</a></p> <center> <img src="/assets/building-car-recognition-part-2/yolo-detection-result.jpg" width="40%" /> <img src="/assets/building-car-recognition-part-2/non-max-suppression.png" width="40%" /> </center> <p>Lastly, we want to remove images that show the interior of a car: pictures of steering wheels, seats, A/C vents, etc. I tried training a smaller model to detect the difference between an image of a car and its interior so irrelevant pictures could be removed automatically. However, this didn’t prove to be effective because of the large variation in these types of pictures. 
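</p>

<p>For reference, the crop-to-largest-box step described above boils down to a simple selection over detections. Here is a sketch using an assumed ImageAI-style dict format; the library’s actual output format may differ, so treat the field names as illustrative.</p>

```python
def largest_vehicle_box(detections):
    """Pick the largest car/truck bounding box from a list of detection
    dicts in an assumed ImageAI-like format:
    {"name": <label>, "box_points": [x1, y1, x2, y2]}."""
    vehicles = [d for d in detections if d["name"] in ("car", "truck")]
    if not vehicles:
        return None  # no vehicle found; skip or discard this image

    def area(detection):
        x1, y1, x2, y2 = detection["box_points"]
        return (x2 - x1) * (y2 - y1)

    return max(vehicles, key=area)["box_points"]

# Toy detections: a person, a small background car, and the main car.
detections = [
    {"name": "person", "box_points": [0, 0, 50, 100]},
    {"name": "car", "box_points": [10, 10, 60, 40]},
    {"name": "car", "box_points": [100, 50, 400, 250]},
]
print(largest_vehicle_box(detections))  # -> [100, 50, 400, 250]
# The image would then be cropped to this box, e.g. with
# PIL: Image.open(path).crop(tuple(box))
```

<p>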
So I spent two to three days going through 200k images and manually removing the bad ones.</p> <p>We apply the same cropping to VMMRdb and add it to our scraped dataset, making the combined dataset larger and more robust for training, since VMMRdb contains good examples of certain cars and bolsters the smaller classes.</p> <p>We finally have our complete dataset, and we are ready to <strong>train</strong>!</p> <h2 id="training">Training</h2> <h3 id="data-split">Data Split</h3> <p>The dataset needs to be split into training, validation, and testing sets. For this, we write a quick Python script to randomly divide each class 80/10/10 among the three sets.</p> <p>The training set is used for training the model, and the validation set is used to calculate the loss and accuracy after each epoch (cycle) of training. Finally, the test set is untouched and unseen by the model until all the training is complete and a final loss and accuracy need to be calculated.</p> <center><img src="/assets/building-car-recognition-part-2/train-val-test-split.png" width="40%" /></center> <h3 id="benchmarking">Benchmarking</h3> <p>As a benchmark, we try out <a href="https://arxiv.org/abs/1610.02357">Xception</a> and use Tensorflow to train our model. Tensorflow 2.0 has Keras built in, which makes it easy to run a bunch of quick experiments to see if the model is training properly.</p> <p>We freeze the top layers (those closer to the input), since they are usually simple edge and shape detectors, and unfreeze the last half of the layers, allowing them to be trained. 
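</p>

<p>In tf.keras, this freezing looks roughly like the following. The 50% cutoff and the small classification head are illustrative assumptions, not the project’s exact training code (and weights would be <code class="language-plaintext highlighter-rouge">"imagenet"</code> in practice).</p>

```python
import tensorflow as tf

num_classes = 10  # placeholder; the real dataset has thousands of classes

# Xception backbone without its classification head; weights=None here
# only to keep the sketch self-contained (use weights="imagenet" really).
base = tf.keras.applications.Xception(
    include_top=False, weights=None,
    input_shape=(448, 448, 3), pooling="avg")

# Freeze the first half of the layers (closer to the input, simple
# edge/shape detectors) and leave the second half trainable.
cutoff = len(base.layers) // 2
for layer in base.layers[:cutoff]:
    layer.trainable = False

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(num_classes, activation="softmax"),
])
```

<p>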
This is known as <strong>Transfer Learning</strong>, and is much faster than training a model from scratch.</p> <center><img src="/assets/building-car-recognition-part-2/transfer-learning-lecture-notes.jpeg" width="70%" /></center> <p>We choose an image size of 448x448 to preserve as much of the image as possible, and apply simple image augmentations (slight rotations, horizontal flips, color variations) to create more varied data and thus make the model more robust to perturbations. Additionally, we only use the scraped dataset and not the combined dataset, since it is smaller.</p> <p>We achieve a validation accuracy of 64.41% and <strong>test accuracy of 64.22%</strong>. Satisfied with this, we move on to a better model and better training methods.</p> <h3 id="better-model-better-results">Better Model, Better Results</h3> <p>Next, we choose <a href="https://arxiv.org/abs/1905.11946">EfficientNet</a>, a newer family of models ranging from B0 to B7, where the depth, width, and resolution of the models scale appropriately based on the computing resources available. It reached state of the art on ImageNet with a top-1 accuracy of 84.4%.</p> <center><img src="/assets/building-car-recognition-part-2/efficientnet-accuracies.png" width="50%" /></center> <p>Choosing EfficientNet B0, the smallest of the set, we unfreeze the last two layers and switch to a 224x224 image size, since training with 448x448 images takes much longer. We use the full dataset (VMMRdb + scraped dataset) this time, along with better image augmentations from this great library that has tons of them: <a href="https://github.com/aleju/imgaug">imgaug</a></p> <center><img src="/assets/building-car-recognition-part-2/imgaug.jpg" width="50%" /></center> <p>This time we train until the validation accuracy has not improved for 8 epochs. We use Adam with a starting learning rate of 1e-4, found by trial and error. 
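</p>

<p>This stopping rule maps directly onto Keras’s EarlyStopping callback. Here is a runnable sketch with a tiny stand-in model and random data; in the project, the model is EfficientNet B0 and the data is the image dataset.</p>

```python
import numpy as np
import tensorflow as tf

# Tiny stand-in model and random data so the sketch runs end to end.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(8,)),
    tf.keras.layers.Dense(3, activation="softmax"),
])

# Stop once validation accuracy has not improved for 8 epochs,
# keeping the best weights seen so far.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_accuracy", patience=8, restore_best_weights=True)

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

x = np.random.rand(64, 8).astype("float32")
y = np.random.randint(0, 3, size=64)
history = model.fit(x, y, validation_split=0.25, epochs=30,
                    callbacks=[early_stop], verbose=0)
```

<p>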
After about 18 epochs of training, we stop and get a final training accuracy of 86.66% and validation accuracy of 80.18%.</p> <p>On the test set, we get an accuracy of <strong>80.1%</strong>! Almost a 16% increase from using the combined dataset, a better model, and better image augmentations.</p> <p>This is part 2 of my progress towards this problem; stay tuned for my next and last post on this topic, where I build a final pipeline for the classification.</p>Web scraping to build an Image Dataset and Training a Model.Building a Car Model Classifier (Part 1)2020-01-12T15:50:21+00:002020-01-12T15:50:21+00:00https://wasdkhan.github.io/2020/01/12/building-car-recognition<p>The <a href="https://github.com/wasdkhan/car-reverse-image-search">code</a> for this project can be found on my <a href="https://github.com/wasdkhan/">github</a>.</p> <h3 id="problem-introduction">Problem Introduction</h3> <p>I was interested in the problem of recognizing the make (e.g. Toyota) and model (e.g. Camry) of a car from its profile. This has many applications, from Intelligent Transportation Systems (ITS) to surveillance. It is hard for a non-expert to effectively distinguish and remember the differences between makes and models, and so an automated system can supplement or improve current systems.</p> <p>The most popular car image classification dataset, the <a href="http://ai.stanford.edu/~jkrause/cars/car_dataset.html">Stanford Cars Dataset</a>, only has 16,185 images of 196 classes of cars and was released in 2013, which covers only a small subset of the actual number of different car models out there.</p> <p>A larger dataset, the Vehicle Make and Model Recognition Dataset, or <a href="http://vmmrdb.cecsresearch.org/">VMMRdb</a> for short, has a subset of 246,173 images of 3036 classes (minimum of 20 images per class) and was released in 2017. This serves as a better starting point, providing more data to train a classifier to identify subtle differences between models. 
This specific type of problem is referred to as <a href="https://paperswithcode.com/task/fine-grained-image-classification/latest">Fine-Grained Image Classification</a>, and a pre-trained ResNet-50 model is provided by VMMRdb.</p> <center><img src="/assets/building-car-recognition-part-1/vmmrAmbiguity.png" width="75%" /></center> <h3 id="feature-extraction--approximate-nearest-neighbors">Feature Extraction + Approximate Nearest Neighbors</h3> <p>The provided VMMRdb-trained model only classifies 3036 models. To have it generalize to the larger VMMRdb, called VMMRdb-9170 (which contains 291,752 images of 9,170 classes), we can remove the final layer to get a feature representation of an input image and then classify the image by finding its nearest neighbor in the larger dataset.</p> <p>However, in such a large feature space (2048 dimensions) and dataset (~300k points), exact nearest-neighbor search is too slow, so we opt for <a href="https://en.wikipedia.org/wiki/Nearest_neighbor_search#Approximate_nearest_neighbor">Approximate Nearest Neighbors</a>, or ANN for short:</p> <center><img src="/assets/building-car-recognition-part-1/approximate-nearest-neighbor.jpeg" width="50%" /></center> <p>ANN is how fast similarity search is done for anything that can have its features extracted (images, video, music). We use the Facebook library <a href="https://github.com/facebookresearch/faiss">faiss</a>, along with PyTorch, to extract features from an image and classify it based on its approximate nearest neighbor’s class.</p> <p>The neat part is that this also gives us the k closest-looking images to our input image, a form of reverse image search. 
For example, feeding in this picture of a 2011 Honda Pilot yields the following results:</p> <center><img src="/assets/building-car-recognition-part-1/honda-pilot-2011.jpg" width="50%" /><img src="/assets/building-car-recognition-part-1/faiss.png" height="100%" /></center> <p>As can be clearly seen, the retrieved images are all Honda Pilots from the year 2010 or 2011, viewed from the same angle (front). This is most likely due to the embedding (a 2048-dimensional feature vector) encoding this information.</p> <h3 id="results-and-further-work">Results and Further Work</h3> <p>However, this is just a single sample image. To measure the accuracy of this approach, we randomly divide VMMRdb-9170 into train (80%), dev (10%), and test (10%) splits. Using the best feature indexing method based on accuracy on the dev set, we achieved Top-1 and Top-5 accuracies of 46.4% and 73.9%, respectively, on the test set of VMMRdb-9170. This result is not bad compared to the model trained on VMMRdb-3036, which achieved Top-1 and Top-5 accuracies of 51.76% and 92.90%, respectively, according to the VMMRdb paper.</p> <p>This is only part 1 of my progress towards this problem; stay tuned for my next post, where I go into a better approach by collecting and cleaning data and training from scratch.</p> <h3 id="references">References</h3> <ol> <li> <p>Jonathan Krause, Michael Stark, Jia Deng, Li Fei-Fei. 3D Object Representations for Fine-Grained Categorization. 4th IEEE Workshop on 3D Representation and Recognition (3dRR-13), at ICCV 2013. Sydney, Australia. Dec. 
8, 2013.</p> </li> <li> <p>Faezeh Tafazzoli, Keishin Nishiyama, Hichem Frigui. A Large and Diverse Dataset for Improved Vehicle Make and Model Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2017.</p> </li> <li> <p>Jeff Johnson, Matthijs Douze, Hervé Jégou. Billion-scale similarity search with GPUs. arXiv preprint arXiv:1702.08734, 2017.</p> </li> </ol>Face Recognition but for Cars: finding similar-looking images of a car to determine its model.Starting a Blog with Jekyll2019-07-26T05:47:21+00:002019-07-26T05:47:21+00:00https://wasdkhan.github.io/2019/07/26/staring-jekyll-blog<p>In this short guide, I will show you how to set up your own blog with Jekyll. That way you have complete control of your content and formatting.</p> <p><a href="https://jekyllrb.com/">Jekyll</a> is a static site generator: it processes markup text (e.g. Markdown, the format used for Github README files) and creates a static website.</p> <p>Jekyll is built in to <a href="https://pages.github.com/">Github</a>, so it’s as simple as creating a repository (repo), editing and previewing entries locally, and then pushing your changes live to <code class="language-plaintext highlighter-rouge">https://username.github.io</code>, all <strong>hosted for free!</strong></p> <h4 id="requirements">Requirements</h4> <ol> <li>Since Jekyll is a <a href="https://www.ruby-lang.org/en/libraries/">Ruby Gem</a>, a packaged library for the Ruby programming language, Ruby needs to be installed. You can check whether you have it by opening a terminal and running: <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>ruby <span class="nt">-v</span> </code></pre></div> </div> <p>This displays the version number; make sure you have version 2.1.0 or higher. 
Follow this <a href="https://jekyllrb.com/docs/installation/">guide</a>, if it’s not installed.</p> </li> <li>Now we install Jekyll and <a href="https://rubygems.org/gems/bundler">Bundler</a>. Bundler allows you to manage dependencies with a Gemfile, a list of Gems required for the site: <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>gem <span class="nb">install </span>jekyll bundler </code></pre></div> </div> </li> </ol> <h4 id="github-repo">Github Repo</h4> <ol> <li>Create a public github repo with the name <code class="language-plaintext highlighter-rouge">username.github.io</code> where <code class="language-plaintext highlighter-rouge">username</code> is your github username. You can follow the first part of this <a href="https://help.github.com/en/articles/create-a-repo">guide</a> to do that. Clone the repo locally on your computer: <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>git clone https://github.com/username/username.github.io <span class="nv">$ </span><span class="nb">cd </span>username.github.io </code></pre></div> </div> </li> </ol> <h4 id="generating-template">Generating Template</h4> <ol> <li>Create a Gemfile <code class="language-plaintext highlighter-rouge">$ touch Gemfile</code> in the local repo and edit it in your favorite text editor to add the following two lines: <div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">source</span> <span class="s1">'https://rubygems.org'</span> <span class="n">gem</span> <span class="s1">'github-pages'</span><span class="p">,</span> <span class="ss">group: :jekyll_plugins</span> </code></pre></div> </div> <p>Save the Gemfile, and run <code class="language-plaintext highlighter-rouge">$ bundle install</code> to install all the dependencies for the Github Pages Gem.</p> </li> <li>Make a Jekyll template 
site in a temp folder in your local repo and move its contents out into the main folder and then delete it: <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>bundle <span class="nb">exec </span>jekyll new temp <span class="nv">$ </span><span class="nb">cp</span> <span class="nt">-r</span> temp/. <span class="nb">.</span> <span class="nv">$ </span><span class="nb">rm</span> <span class="nt">-rf</span> temp </code></pre></div> </div> <p>Now all the generated jekyll files should be in your local repo.</p> </li> <li>Open the Gemfile again in a text editor and comment out the line with Jekyll version with <code class="language-plaintext highlighter-rouge">#</code>, so it looks like this: <div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># gem "jekyll", "~&gt; 3.8.5"</span> </code></pre></div> </div> <p>And then uncomment the line with Github Pages Gem, by deleting the <code class="language-plaintext highlighter-rouge">#</code> from the front of the line, so it looks like this:</p> <div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">gem</span> <span class="s2">"github-pages"</span><span class="p">,</span> <span class="ss">group: :jekyll_plugins</span> </code></pre></div> </div> </li> </ol> <h4 id="personalization">Personalization</h4> <ol> <li>Edit the <code class="language-plaintext highlighter-rouge">_config.yml</code> with the appropriate title, description, email, url, etc.: <div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="ss">title: </span><span class="no">My</span> <span class="n">blog</span> <span class="o">...</span> <span class="ss">url: </span><span class="s2">"https://username.github.io"</span> </code></pre></div> </div> </li> <li>Now run your Jekyll site locally in your browser: <div class="language-bash 
highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>bundle <span class="nb">exec </span>jekyll serve </code></pre></div> </div> <p>Preview your site locally at <a href="localhost:4000"><code class="language-plaintext highlighter-rouge">localhost:4000</code></a></p> </li> <li>Enter the _posts folder, create a file with the naming scheme YYYY-MM-DD-title-with-dashes.markdown, and edit it in a text editor: <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span><span class="nb">cd </span>_posts <span class="nv">$ </span>nano 2019-07-26-first-post.markdown </code></pre></div> </div> <p>Now, we write a simple markdown blog post, as shown:</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="nt">---</span> layout: post title: <span class="s2">"First Post"</span> <span class="nb">date</span>: 2019-07-26 01:00:00 <span class="nt">---</span> Markdown is <span class="k">*</span>cool<span class="k">*</span> 8-<span class="o">)</span> </code></pre></div> </div> <p>Save the file and refresh <a href="localhost:4000"><code class="language-plaintext highlighter-rouge">localhost:4000</code></a> to see your new post appearing on the homepage!</p> </li> <li>Add all your changes to a commit and push to Github: <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>git add <span class="nb">.</span> <span class="nv">$ </span>git commit <span class="nt">-m</span> <span class="s2">"First blog commit"</span> <span class="nv">$ </span>git push </code></pre></div> </div> <p>Wait a few minutes and visit <code class="language-plaintext highlighter-rouge">username.github.io</code> from your web browser to see your page <strong>live!</strong></p> </li> </ol> <p>And with that comes the end of the tutorial, have fun and blog away!</p>Jekyll is a static site 
generator that can be used to host a blog on Github. A simple step-by-step guide on how to set it up.