- The model correctly identifies a Sisley painting (1/2)
- The model correctly identifies a Sisley painting (2/2)
- The model correctly identifies a Klee painting (1/2)
- The model correctly identifies a Klee painting (2/2)
- A section of the landing page shows the artists that can be predicted (4/17 shown here)
- A brief discussion of the performance of this model
Inspiration
I changed gears about halfway through the bootcamp to create this project. I love art, particularly impressionism, and I love being nerdy about it. The idea here was to learn if every artist leaves some sort of systematic fingerprint in their work that can be used to identify the painter with high accuracy.
What it does
This classifier assigns each input image to one of 17 European and American artists. These are the artists with at least 150 paintings in the "Best Artworks of All Time" dataset.
How I built it
The model is composed of a feature extractor (VGG16, using Keras) with an SVM classifier on top, implemented with scikit-learn. Classes were selected as the artists with at least 150 identified paintings, ranging from 164 (Botticelli) to 877 (Van Gogh). Because the number of paintings varied so much from one artist to another, class weights were used to balance the dataset. Data augmentation was also used to compensate for the small number of training and testing images, a limit inherent to the data: even the most prolific artists have at most a few hundred paintings in their repertoire.
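To make the class-weighting idea concrete, here is a minimal sketch of the common "balanced" heuristic, where each class gets the weight n_samples / (n_classes * class_count). This is the same formula scikit-learn applies for `class_weight="balanced"`, but whether this project used exactly that setting is an assumption. The function name `balanced_class_weights` is hypothetical, and only the two counts quoted above (164 and 877) are real; the other 15 artists are left out of the toy data.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {class: n_samples / (n_classes * class_count)} for a list of labels."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Toy label list built from the two painting counts mentioned in the write-up.
# The remaining 15 artists are omitted, so these weights are illustrative only.
labels = ["Botticelli"] * 164 + ["Van Gogh"] * 877

weights = balanced_class_weights(labels)
for artist, weight in sorted(weights.items()):
    print(f"{artist}: {weight:.2f}")
```

The rarer artist ends up with the larger weight, so a misclassified Botticelli costs the classifier more than a misclassified Van Gogh. With this heuristic the weighted class sizes all come out equal, which is what "balancing" means here.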
Challenges I ran into
The model's accuracy is about 56.3%, meaning that it predicts the correct artist for roughly 56% of images. It would likely perform better if trained on more consistent data for each artist (i.e. works in a select style rather than a full catalogue). Unfortunately, the number of training samples is naturally limited for this sort of data.
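For context on that number, the sketch below shows how accuracy is computed and compares the reported figure with uniform random guessing across 17 classes. The helper names and the toy labels are hypothetical. The uniform baseline also ignores the class imbalance: always guessing the most common artist would score higher than 1/17, but the total number of paintings needed to compute that is not given here.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty list")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def uniform_chance(n_classes):
    """Expected accuracy when guessing uniformly at random among n_classes."""
    return 1 / n_classes


N_ARTISTS = 17             # number of classes, from the write-up
REPORTED_ACCURACY = 0.563  # accuracy reported in the write-up

baseline = uniform_chance(N_ARTISTS)
print(f"uniform chance: {baseline:.3f}")
print(f"reported accuracy is {REPORTED_ACCURACY / baseline:.1f}x the uniform baseline")

# Toy example (made-up labels): 3 of 4 predictions are correct.
y_true = ["Sisley", "Klee", "Sisley", "Klee"]
y_pred = ["Sisley", "Klee", "Klee", "Klee"]
toy_accuracy = accuracy(y_true, y_pred)
print(f"toy accuracy: {toy_accuracy:.2f}")
```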
Many artists' folders included sketches and early works that may not be representative of the artist's main style, and that a human would also have trouble identifying. In some cases, for example in Van Gogh's data, a significant portion of the artworks appeared as both the sketch and the finished painting (see below). Many paintings in the dataset had "doubles" like this, which pushes the model to rely on the content of a painting rather than its style.


Accomplishments that I'm proud of
The accent color of the second page changes depending on the predicted artist!
What I learned
This was my first complete AI project and my first time creating a web app, so it was a very enriching experience overall. I learned about image preprocessing, about deploying a model in a web app, and about building a web app in general.
What's next?
- Finding a more original name. Suggestions welcome!
- Selecting better-quality data and retraining
