Jekyll2020-01-29T03:34:58+00:00https://wasdkhan.github.io/feed.xmlWaseem Khan blogThoughts and reflections by a technology enthusiast. Building a Car Model Classifier (Part 3)2020-01-28T06:30:00+00:002020-01-28T06:30:00+00:00https://wasdkhan.github.io/2020/01/28/building-car-recognition-part-3<center><img src="/assets/building-car-recognition-part-3/flask-and-tf-logos.png" width="100%" /></center> <p>(Images from <a href="https://becominghuman.ai/creating-restful-api-to-tensorflow-models-c5c57b692c10">becominghuman.ai</a>)</p> <p>This is the final post and is a continuation of the <a href="/2020/01/27/building-car-recognition-part-2.html">previous post</a> and the <a href="/2020/01/12/building-car-recognition.html">first post</a>.</p> <p>Now that we have a trained model that can recognize make and model, we can build a web server to host the Tensorflow models and run inference on the images passed to it. However, we don’t want to load all the models into a single process and run them sequentially. We want them running in parallel so that one request is not waiting for another to finish before it can proceed.</p> <h3 id="serving-libraries">Serving Libraries</h3> <center><img src="/assets/building-car-recognition-part-3/flask-and-tf-serving.png" width="50%" /></center> <p>Tensorflow was chosen for this project specifically because it has a library called <a href="https://www.tensorflow.org/tfx/guide/serving">Tensorflow Serving</a> that integrates directly with Tensorflow and makes it easy to deploy models.</p> <p>We will also use <a href="https://github.com/pallets/flask/">Flask</a>, a minimal Python web framework, which gives us just enough to get our server off the ground and connect everything together.</p> <h3 id="pipeline">Pipeline</h3> <p>Since our training set consists of cropped images, we need a pipeline that detects the vehicle in the image and crops it before passing it to our classifier model. 
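</p>

<p>To make the request/response shapes concrete, here is a minimal sketch of the two pieces of glue the server needs: building the JSON body that Tensorflow Serving’s REST API expects, and picking the top k classes out of its response. The model name, port, and class names are placeholders, not the project’s actual configuration.</p>

```python
import json

# Hypothetical TF Serving REST endpoint for the make-model classifier;
# the model name and port are placeholders, not the project's config.
SERVING_URL = "http://localhost:8501/v1/models/make_model:predict"

def build_predict_request(image_tensor):
    """Build the JSON body Tensorflow Serving's REST API expects:
    {"instances": [<input tensor>]}."""
    return json.dumps({"instances": [image_tensor]})

def top_k(probs, class_names, k=3):
    """Return the k highest-probability (class, probability) pairs."""
    ranked = sorted(zip(class_names, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# A fake 4-class response, shaped like TF Serving's {"predictions": [...]}:
classes = ["honda_civic", "toyota_camry", "tesla_model_3", "ford_f150"]
fake_response = {"predictions": [[0.05, 0.15, 0.70, 0.10]]}
print(top_k(fake_response["predictions"][0], classes, k=2))
# -> [('tesla_model_3', 0.7), ('toyota_camry', 0.15)]
```

<p>In the real server, Flask would receive the uploaded image, POST the cropped tensor to the serving endpoint, and format the top k results for display.</p>

<p>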
We can then display the top k results with their probabilities.</p> <p>Additionally, we would like to include other features of the car, such as the license plate and the color of the vehicle, so we will add those to the pipeline as well, since they can be computed in parallel.</p> <p>For car detection and cropping, we will use <a href="https://arxiv.org/abs/1804.02767">YOLOv3</a> again for its fast and accurate results.</p> <center><img src="/assets/building-car-recognition-part-3/yolo-realtime.png" width="50%" /></center> <p>For color recognition, we will use a simple Python script found at this repository: <a href="https://github.com/bmoyles0117/Vehicle-Color-Detector">Vehicle-Color-Detector</a>.</p> <p>For license plate recognition, we will use a library that can recognize plates from skewed and unconstrained scenarios: <a href="https://github.com/sergiomsilva/alpr-unconstrained">alpr-unconstrained</a>. This has its own pipeline, which includes a YOLO-like model for license plate detection. However, instead of using Darknet, the C library in which YOLO is implemented, we will use <a href="https://opencv.org/">OpenCV</a>’s support for YOLO models.</p> <p>We convert the YOLOv3, make-model classification, and LPR (License Plate Recognition) models into Tensorflow Serving’s model format, and store them separately. Each represents a served model that can be run on a different server and can process multiple inputs at a time as a REST API. However, for this project, they will all be served on the same machine.</p> <h3 id="results">Results</h3> <center> <img src="/assets/building-car-recognition-part-3/make-model-web.png" width="40%" /> <img src="/assets/building-car-recognition-part-3/make-model-mobile.jpg" width="30%" /> </center> <p>Here we see the server live, processing an image on both web and mobile. 
Both results are correct (the top make and model guess, the license plate, and the color).</p> <h3 id="improvements">Improvements</h3> <p>Although the results are correct for the two cases shown above, there are cases where the results are not good: 1) the image is blurry and the features of the car aren’t discernible, or 2) the lighting or shadows make the car look a different color than it is (black -&gt; grey).</p> <p>These issues can be addressed by taking multiple pictures and choosing the best one, and by training a neural network to detect car color despite changes in lighting. However, for the sake of this project, these results are satisfactory.</p>Putting everything together on a Web Server.Building a Car Model Classifier (Part 2)2020-01-27T06:30:00+00:002020-01-27T06:30:00+00:00https://wasdkhan.github.io/2020/01/27/building-car-recognition-part-2<p>This post is a continuation of the <a href="/2020/01/12/building-car-recognition.html">previous post</a>.</p> <center><img src="/assets/building-car-recognition-part-2/tesla-model-3.jpg" width="50%" /></center> <p>Using feature extraction and matching to nearest neighbors can only go so far with the pretrained model on VMMRdb. What if we want to generalize to newer classes of cars like the Tesla Model S, not present in the VMMRdb dataset? For that, we need to collect more data for a more updated and robust model.</p> <h3 id="data-collection">Data Collection</h3> <p>Since no existing collection covers the newest car models, we will need to create our own dataset. And where can we find images? 
Google Images!</p> <p>But searching, selecting N images, downloading them, and organizing them by car name would take too long, so we automate it using a web scraper.</p> <center><img src="/assets/building-car-recognition-part-2/web-scraping.jpg" width="60%" /></center> <p>We use the NHTSA (National Highway Traffic Safety Administration) <a href="https://vpic.nhtsa.dot.gov/api/">Vehicle API</a> to collect a list of all the car, truck, and mpv (multi-purpose vehicle) makes and models from the year 2000 to 2018 (when this project was first started). Then we use a web scraper to create a list of links from Google Images, and finally we download them with multiprocessing, since the downloads are independent tasks that can run in parallel.</p> <p>The scraper and the custom scripts written for this project can be found here: <a href="https://github.com/wasdkhan/scrapeCars">scrapeCars</a></p> <h3 id="data-cleaning-and-preprocsesing">Data Cleaning and Preprocessing</h3> <p>Data cleaning and organization is usually the most time-consuming part of a data science project, and it’s no different for this project.</p> <p>Once we have all the image data, we first delete any duplicates within the same class, since we want unique samples and need to know the true size of each class in case class weighting is needed later.</p> <p>But how can we automate duplicate removal? We can hash the images by their pixel data, store them by that hash in a database, and delete an image whenever its hash has already been seen. 
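</p>

<p>As a minimal illustration of the idea, de-duplication reduces to a dictionary lookup over hashes. This sketch uses an exact hash of raw pixel bytes; the linked library is more sophisticated (it uses perceptual hashing, which also catches near-duplicates).</p>

```python
import hashlib

def pixel_hash(pixel_bytes):
    """Hash an image's raw pixel data (not the encoded file bytes, so
    identical pixels hash the same even if the file was re-saved)."""
    return hashlib.md5(pixel_bytes).hexdigest()

def find_duplicates(images):
    """images: {filename: pixel_bytes}. Return filenames whose pixels
    were already seen under an earlier filename."""
    seen = {}
    duplicates = []
    for name, pixels in sorted(images.items()):
        digest = pixel_hash(pixels)
        if digest in seen:
            duplicates.append(name)  # same pixels as seen[digest] -> delete
        else:
            seen[digest] = name
    return duplicates

# Toy example: two identical "images" and one distinct one.
images = {"a.jpg": b"\x01\x02", "b.jpg": b"\x01\x02", "c.jpg": b"\x03"}
print(find_duplicates(images))  # -> ['b.jpg']
```

<p>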
There’s a nice library that does this, found here: <a href="https://github.com/philipbl/duplicate-images">duplicate-images</a></p> <center><img src="/assets/building-car-recognition-part-2/image-hashing-blueprint.png" width="40%" /><img src="/assets/building-car-recognition-part-2/image-hashing.png" width="40%" /></center> <p>(Images from <a href="https://www.pyimagesearch.com/2017/11/27/image-hashing-opencv-python/">pyimagesearch</a>)</p> <p>Next, since our focus is classification and not detection (the difference being that detection is classification + localization, i.e. finding if and where an object is present in the scene), we crop the images down to just the car, removing most of the background. We accomplish this using YOLO v3: detect the largest bounding box of a truck or car, and crop it out. We use this library for ease of use: <a href="https://github.com/OlafenwaMoses/ImageAI">ImageAI</a></p> <center> <img src="/assets/building-car-recognition-part-2/yolo-detection-result.jpg" width="40%" /> <img src="/assets/building-car-recognition-part-2/non-max-suppression.png" width="40%" /> </center> <p>Lastly, we want to remove images that show the interior of a car: pictures of steering wheels, seats, A/C vents, etc. I tried training a smaller model to detect the difference between an image of a car and its interior so irrelevant pictures could be removed automatically. However, this didn’t prove to be effective because of the large variation in these types of pictures. 
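</p>

<p>For reference, the crop-to-largest-box step described above boils down to a simple selection over detections. Here is a sketch using an assumed ImageAI-style dict format; the library’s actual output format may differ, so treat the field names as illustrative.</p>

```python
def largest_vehicle_box(detections):
    """Pick the largest car/truck bounding box from a list of detection
    dicts in an assumed ImageAI-like format:
    {"name": <label>, "box_points": [x1, y1, x2, y2]}."""
    vehicles = [d for d in detections if d["name"] in ("car", "truck")]
    if not vehicles:
        return None  # no vehicle found; skip or discard this image

    def area(detection):
        x1, y1, x2, y2 = detection["box_points"]
        return (x2 - x1) * (y2 - y1)

    return max(vehicles, key=area)["box_points"]

# Toy detections: a person, a small background car, and the main car.
detections = [
    {"name": "person", "box_points": [0, 0, 50, 100]},
    {"name": "car", "box_points": [10, 10, 60, 40]},
    {"name": "car", "box_points": [100, 50, 400, 250]},
]
print(largest_vehicle_box(detections))  # -> [100, 50, 400, 250]
# The image would then be cropped to this box, e.g. with
# PIL: Image.open(path).crop(tuple(box))
```

<p>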
So I spent two to three days going through 200k images and manually removing the bad ones.</p> <p>We apply the same cropping to VMMRdb and add it to our scraped dataset, making the combined dataset larger and more robust for training, since VMMRdb contains good examples of certain cars and bolsters the smaller classes.</p> <p>We finally have our complete dataset, and we are ready to <strong>train</strong>!</p> <h2 id="training">Training</h2> <h3 id="data-split">Data Split</h3> <p>The dataset needs to be split into training, validation, and testing sets. For this, we write a quick Python script to randomly divide each class 80/10/10 among the three sets.</p> <p>The training set is used for training the model, and the validation set is used to calculate the loss and accuracy after each epoch (cycle) of training. Finally, the test set is untouched and unseen by the model until all the training is complete and a final loss and accuracy need to be calculated.</p> <center><img src="/assets/building-car-recognition-part-2/train-val-test-split.png" width="40%" /></center> <h3 id="benchmarking">Benchmarking</h3> <p>As a benchmark, we try out <a href="https://arxiv.org/abs/1610.02357">Xception</a> and use Tensorflow to train our model. Tensorflow 2.0 has Keras built in, which makes it easy to run a bunch of quick experiments to see if the model is training properly.</p> <p>We freeze the top layers (those closer to the input), since they are usually simple edge and shape detectors, and unfreeze the last half of the layers, allowing them to be trained. 
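</p>

<p>In tf.keras, this freezing looks roughly like the following. The 50% cutoff and the small classification head are illustrative assumptions, not the project’s exact training code (and weights would be <code class="language-plaintext highlighter-rouge">"imagenet"</code> in practice).</p>

```python
import tensorflow as tf

num_classes = 10  # placeholder; the real dataset has thousands of classes

# Xception backbone without its classification head; weights=None here
# only to keep the sketch self-contained (use weights="imagenet" really).
base = tf.keras.applications.Xception(
    include_top=False, weights=None,
    input_shape=(448, 448, 3), pooling="avg")

# Freeze the first half of the layers (closer to the input, simple
# edge/shape detectors) and leave the second half trainable.
cutoff = len(base.layers) // 2
for layer in base.layers[:cutoff]:
    layer.trainable = False

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(num_classes, activation="softmax"),
])
```

<p>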
This is known as <strong>Transfer Learning</strong>, and is much faster than training a model from scratch.</p> <center><img src="/assets/building-car-recognition-part-2/transfer-learning-lecture-notes.jpeg" width="70%" /></center> <p>We choose an image size of 448x448 to preserve as much of the image as possible, and apply simple image augmentations (slight rotations, horizontal flips, color variations) to create more varied data and thus make the model more robust to perturbations. Additionally, we only use the scraped dataset and not the combined dataset, since it is smaller.</p> <p>We achieve a validation accuracy of 64.41% and <strong>test accuracy of 64.22%</strong>. Satisfied with this, we move on to a better model and better training methods.</p> <h3 id="better-model-better-results">Better Model, Better Results</h3> <p>Next, we choose <a href="https://arxiv.org/abs/1905.11946">EfficientNet</a>, a newer family of models ranging from B0 to B7, where the depth, width, and resolution of the models scale appropriately based on the computing resources available. It reached state of the art on ImageNet with a top-1 accuracy of 84.4%.</p> <center><img src="/assets/building-car-recognition-part-2/efficientnet-accuracies.png" width="50%" /></center> <p>Choosing EfficientNet B0, the smallest of the set, we unfreeze the last two layers and switch to a 224x224 image size, since training with 448x448 images takes much longer. We use the full dataset (VMMRdb + scraped dataset) this time, along with better image augmentations from this great library that has tons of them: <a href="https://github.com/aleju/imgaug">imgaug</a></p> <center><img src="/assets/building-car-recognition-part-2/imgaug.jpg" width="50%" /></center> <p>This time we train until the validation accuracy has not improved for 8 epochs. We use Adam with a starting learning rate of 1e-4, found by trial and error. 
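</p>

<p>This stopping rule maps directly onto Keras’s EarlyStopping callback. Here is a runnable sketch with a tiny stand-in model and random data; in the project, the model is EfficientNet B0 and the data is the image dataset.</p>

```python
import numpy as np
import tensorflow as tf

# Tiny stand-in model and random data so the sketch runs end to end.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(8,)),
    tf.keras.layers.Dense(3, activation="softmax"),
])

# Stop once validation accuracy has not improved for 8 epochs,
# keeping the best weights seen so far.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_accuracy", patience=8, restore_best_weights=True)

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

x = np.random.rand(64, 8).astype("float32")
y = np.random.randint(0, 3, size=64)
history = model.fit(x, y, validation_split=0.25, epochs=30,
                    callbacks=[early_stop], verbose=0)
```

<p>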
After about 18 epochs of training, we stop and get a final training accuracy of 86.66% and validation accuracy of 80.18%.</p> <p>On the test set, we get an accuracy of <strong>80.1%</strong>! Almost a 16% increase from using the combined dataset, a better model, and better image augmentations.</p> <p>This is part 2 of my progress towards this problem; stay tuned for my next and last post on this topic, where I build a final pipeline for the classification.</p>Web scraping to build an Image Dataset and Training a Model.Building a Car Model Classifier (Part 1)2020-01-12T15:50:21+00:002020-01-12T15:50:21+00:00https://wasdkhan.github.io/2020/01/12/building-car-recognition<p>The <a href="https://github.com/wasdkhan/car-reverse-image-search">code</a> for this project can be found on my <a href="https://github.com/wasdkhan/">github</a>.</p> <h3 id="problem-introduction">Problem Introduction</h3> <p>I was interested in the problem of recognizing the make (e.g. Toyota) and model (e.g. Camry) of a car from its profile. This has many applications, from Intelligent Transportation Systems (ITS) to surveillance. It is hard for a non-expert to effectively distinguish and remember the differences between makes and models, and so an automated system can supplement or improve current systems.</p> <p>The most popular car image classification dataset, the <a href="http://ai.stanford.edu/~jkrause/cars/car_dataset.html">Stanford Cars Dataset</a>, only has 16,185 images of 196 classes of cars and was released in 2013, which covers only a small subset of the actual number of different car models out there.</p> <p>A larger dataset, the Vehicle Make and Model Recognition Dataset, or <a href="http://vmmrdb.cecsresearch.org/">VMMRdb</a> for short, has a subset of 246,173 images of 3036 classes (minimum of 20 images per class) and was released in 2017. This serves as a better starting point, providing more data to train a classifier to identify subtle differences between models. 
This specific type of problem is referred to as <a href="https://paperswithcode.com/task/fine-grained-image-classification/latest">Fine-Grained Image Classification</a>, and a pre-trained ResNet-50 model is provided by VMMRdb.</p> <center><img src="/assets/building-car-recognition-part-1/vmmrAmbiguity.png" width="75%" /></center> <h3 id="feature-extraction--approximate-nearest-neighbors">Feature Extraction + Approximate Nearest Neighbors</h3> <p>The provided VMMRdb-trained model only classifies 3036 models. To have it generalize to the larger VMMRdb, called VMMRdb-9170 (which contains 291,752 images of 9,170 classes), we can remove the final layer to get a feature representation of an input image and then classify the image by finding its nearest neighbor in the larger dataset.</p> <p>However, in such a large feature space (2048 dimensions) and dataset (~300k points), exact nearest-neighbor search is too slow, so we opt for <a href="https://en.wikipedia.org/wiki/Nearest_neighbor_search#Approximate_nearest_neighbor">Approximate Nearest Neighbors</a>, or ANN for short:</p> <center><img src="/assets/building-car-recognition-part-1/approximate-nearest-neighbor.jpeg" width="50%" /></center> <p>ANN is how fast similarity search is done for anything that can have its features extracted (images, video, music). We use the Facebook library <a href="https://github.com/facebookresearch/faiss">faiss</a>, along with PyTorch, to extract features from an image and classify it based on its approximate nearest neighbor’s class.</p> <p>The neat part is that this also gives us the k closest-looking images to our input image, a form of reverse image search. 
For example, feeding in this picture of a 2011 Honda Pilot yields the following results:</p> <center><img src="/assets/building-car-recognition-part-1/honda-pilot-2011.jpg" width="50%" /><img src="/assets/building-car-recognition-part-1/faiss.png" height="100%" /></center> <p>As can be clearly seen, the retrieved images are all Honda Pilots from the year 2010 or 2011, viewed from the same angle (front). This is most likely due to the embedding (a 2048-dimensional feature vector) encoding this information.</p> <h3 id="results-and-further-work">Results and Further Work</h3> <p>However, this is just a single sample image. To measure the accuracy of this approach, we randomly divide VMMRdb-9170 into train (80%), dev (10%), and test (10%) splits. Using the best feature indexing method based on accuracy on the dev set, we achieved Top-1 and Top-5 accuracies of 46.4% and 73.9%, respectively, on the test set of VMMRdb-9170. This result is not bad compared to the model trained on VMMRdb-3036, which achieved Top-1 and Top-5 accuracies of 51.76% and 92.90%, respectively, according to the VMMRdb paper.</p> <p>This is only part 1 of my progress towards this problem; stay tuned for my next post, where I go into a better approach by collecting and cleaning data and training from scratch.</p> <h3 id="references">References</h3> <ol> <li> <p>Jonathan Krause, Michael Stark, Jia Deng, Li Fei-Fei. 3D Object Representations for Fine-Grained Categorization. 4th IEEE Workshop on 3D Representation and Recognition (3dRR-13), at ICCV 2013. Sydney, Australia. Dec. 
8, 2013.</p> </li> <li> <p>Faezeh Tafazzoli, Keishin Nishiyama, Hichem Frigui. A Large and Diverse Dataset for Improved Vehicle Make and Model Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2017.</p> </li> <li> <p>Jeff Johnson, Matthijs Douze, Hervé Jégou. Billion-scale similarity search with GPUs. arXiv preprint arXiv:1702.08734, 2017.</p> </li> </ol>Face Recognition but for Cars: finding similar-looking images of a car to determine its model.Starting a Blog with Jekyll2019-07-26T05:47:21+00:002019-07-26T05:47:21+00:00https://wasdkhan.github.io/2019/07/26/staring-jekyll-blog<p>In this short guide, I will show you how to set up your own blog with Jekyll. That way you have complete control of your content and formatting.</p> <p><a href="https://jekyllrb.com/">Jekyll</a> is a static site generator: it processes markup text (e.g. Markdown, the format used for Github README files) and creates a static website.</p> <p>Jekyll is built in to <a href="https://pages.github.com/">Github</a>, so it’s as simple as creating a repository (repo), editing and previewing entries locally, and then pushing your changes live to <code class="language-plaintext highlighter-rouge">https://username.github.io</code>, all <strong>hosted for free!</strong></p> <h4 id="requirements">Requirements</h4> <ol> <li>Since Jekyll is a <a href="https://www.ruby-lang.org/en/libraries/">Ruby Gem</a>, a packaged library for the Ruby programming language, Ruby needs to be installed. You can check whether you have it by opening a terminal and running: <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>ruby <span class="nt">-v</span> </code></pre></div> </div> <p>This displays the version number; make sure you have version 2.1.0 or higher. 
Follow this <a href="https://jekyllrb.com/docs/installation/">guide</a>, if it’s not installed.</p> </li> <li>Now we install Jekyll and <a href="https://rubygems.org/gems/bundler">Bundler</a>. Bundler allows you to manage dependencies with a Gemfile, a list of Gems required for the site: <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>gem <span class="nb">install </span>jekyll bundler </code></pre></div> </div> </li> </ol> <h4 id="github-repo">Github Repo</h4> <ol> <li>Create a public github repo with the name <code class="language-plaintext highlighter-rouge">username.github.io</code> where <code class="language-plaintext highlighter-rouge">username</code> is your github username. You can follow the first part of this <a href="https://help.github.com/en/articles/create-a-repo">guide</a> to do that. Clone the repo locally on your computer: <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>git clone https://github.com/username/username.github.io <span class="nv">$ </span><span class="nb">cd </span>username.github.io </code></pre></div> </div> </li> </ol> <h4 id="generating-template">Generating Template</h4> <ol> <li>Create a Gemfile <code class="language-plaintext highlighter-rouge">$ touch Gemfile</code> in the local repo and edit it in your favorite text editor to add the following two lines: <div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">source</span> <span class="s1">'https://rubygems.org'</span> <span class="n">gem</span> <span class="s1">'github-pages'</span><span class="p">,</span> <span class="ss">group: :jekyll_plugins</span> </code></pre></div> </div> <p>Save the Gemfile, and run <code class="language-plaintext highlighter-rouge">$ bundle install</code> to install all the dependencies for the Github Pages Gem.</p> </li> <li>Make a Jekyll template 
site in a temp folder in your local repo and move its contents out into the main folder and then delete it: <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>bundle <span class="nb">exec </span>jekyll new temp <span class="nv">$ </span><span class="nb">cp</span> <span class="nt">-r</span> temp/. <span class="nb">.</span> <span class="nv">$ </span><span class="nb">rm</span> <span class="nt">-rf</span> temp </code></pre></div> </div> <p>Now all the generated jekyll files should be in your local repo.</p> </li> <li>Open the Gemfile again in a text editor and comment out the line with Jekyll version with <code class="language-plaintext highlighter-rouge">#</code>, so it looks like this: <div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># gem "jekyll", "~&gt; 3.8.5"</span> </code></pre></div> </div> <p>And then uncomment the line with Github Pages Gem, by deleting the <code class="language-plaintext highlighter-rouge">#</code> from the front of the line, so it looks like this:</p> <div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">gem</span> <span class="s2">"github-pages"</span><span class="p">,</span> <span class="ss">group: :jekyll_plugins</span> </code></pre></div> </div> </li> </ol> <h4 id="personalization">Personalization</h4> <ol> <li>Edit the <code class="language-plaintext highlighter-rouge">_config.yml</code> with the appropriate title, description, email, url, etc.: <div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="ss">title: </span><span class="no">My</span> <span class="n">blog</span> <span class="o">...</span> <span class="ss">url: </span><span class="s2">"https://username.github.io"</span> </code></pre></div> </div> </li> <li>Now run your Jekyll site locally in your browser: <div class="language-bash 
highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>bundle <span class="nb">exec </span>jekyll serve </code></pre></div> </div> <p>Preview your site locally at <a href="localhost:4000"><code class="language-plaintext highlighter-rouge">localhost:4000</code></a></p> </li> <li>Enter the _posts folder, create a file with the naming scheme YYYY-MM-DD-title-with-dashes.markdown, and edit it in a text editor: <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span><span class="nb">cd </span>_posts <span class="nv">$ </span>nano 2019-07-26-first-post.markdown </code></pre></div> </div> <p>Now, we write a simple markdown blog post, as shown:</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="nt">---</span> layout: post title: <span class="s2">"First Post"</span> <span class="nb">date</span>: 2019-07-26 01:00:00 <span class="nt">---</span> Markdown is <span class="k">*</span>cool<span class="k">*</span> 8-<span class="o">)</span> </code></pre></div> </div> <p>Save the file and refresh <a href="localhost:4000"><code class="language-plaintext highlighter-rouge">localhost:4000</code></a> to see your new post appearing on the homepage!</p> </li> <li>Add all your changes to a commit and push to Github: <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>git add <span class="nb">.</span> <span class="nv">$ </span>git commit <span class="nt">-m</span> <span class="s2">"First blog commit"</span> <span class="nv">$ </span>git push </code></pre></div> </div> <p>Wait a few minutes and visit <code class="language-plaintext highlighter-rouge">username.github.io</code> from your web browser to see your page <strong>live!</strong></p> </li> </ol> <p>And with that comes the end of the tutorial, have fun and blog away!</p>Jekyll is a static site 
generator that can be used to host a blog on Github. A simple step-by-step guide on how to set it up.