Final Project Outline: For Check-In #2
UnRembrandt: Painting to Photo “decoder”
Daniel Wang dwang44, Harys Dalvi hdalvi, Sebastian Park spark265
Introduction: We are trying to convert portrait paintings into photos. This is interesting because there are many paintings of historical figures before photography existed. We arrived at this topic because we were interested in different types of applications of image-to-image models. This specific task was inspired by a video on Instagram that we saw where someone visualized what historical US presidents would look like now based on their paintings. We likened this task to the denoising of the MNIST dataset shown in lecture, as well as a sort of reverse style transfer problem.
Related Work: Are you aware of any, or is there any prior work that you drew on to do your project?
- CycleGAN
- Neural Network Art Generates Realistic Faces From Iconic Paintings (mymodernmet.com)
- Artist Uses AI to Make Realistic Pics of Historical Figures - Nerdist
- CycleGAN was a 2017 paper (originally implemented in Lua with Torch) that performed unpaired image-to-image translation, something like a style transfer. It was able to convert a Monet painting into a quasi-realistic photograph of a landscape, and it handled more specific tasks as well, such as converting a video of a horse into a video of a zebra performing the same motions. This paper used a GAN, which is beyond our skill/knowledge level for the scope of this project.
- Additional related work includes the linked projects (by artists, not scientists) that involve creating realistic portraits from old paintings.
Data: What data are you using (if any)?
- We are using a dataset from Kaggle: [https://www.kaggle.com/datasets/ashwingupta3012/human-faces].
- The dataset is 2 GB. Our preprocessing will consist of feeding samples from the dataset through an existing TensorFlow style transfer network to produce a painting-style counterpart for each photo.
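As a minimal sketch of the non-network part of this preprocessing, the function below crops each photo to a centered square and scales pixels to [0, 1] before the style transfer step (the style transfer network itself is not shown; the function name `preprocess` and the crop-then-normalize choice are our assumptions, not fixed yet):

```python
import numpy as np

def preprocess(image):
    # Crop the photo to a centered square so all inputs share one
    # aspect ratio, then scale uint8 pixel values to [0, 1].
    h, w = image.shape[:2]
    s = min(h, w)
    top, left = (h - s) // 2, (w - s) // 2
    square = image[top:top + s, left:left + s]
    return square.astype("float32") / 255.0
```

Resizing to the model's input resolution and the actual style transfer pass would follow this step.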
Methodology:
- The architecture of our model will be a denoising autoencoder for images.
- We will train the model by first using an existing style transfer network to generate paintings from a dataset of photographs of people. We will then give these paintings to the model and judge its performance on reconstructing the original photograph from the painting. This design is justified by the fact that, because a single existing style transfer network generates all the paintings, our "noise" should be consistent across training examples. However, there may be issues. As a backup, we may look more closely at existing architectures that have successfully converted paintings into photographs in contexts beyond portraits.
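A minimal Keras sketch of the denoising-autoencoder idea above, assuming a small convolutional encoder/decoder and an arbitrary input size (the layer widths and depths here are placeholders, not our final architecture):

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

def build_autoencoder(size=128):
    # Encoder: downsample the painting to a compact representation.
    inp = keras.Input(shape=(size, size, 3))
    x = layers.Conv2D(32, 3, strides=2, padding="same", activation="relu")(inp)
    x = layers.Conv2D(64, 3, strides=2, padding="same", activation="relu")(x)
    # Decoder: upsample back to photo resolution.
    x = layers.Conv2DTranspose(64, 3, strides=2, padding="same", activation="relu")(x)
    x = layers.Conv2DTranspose(32, 3, strides=2, padding="same", activation="relu")(x)
    out = layers.Conv2D(3, 3, padding="same", activation="sigmoid")(x)
    model = keras.Model(inp, out)
    # Train painting -> photo with a pixel-wise reconstruction loss.
    model.compile(optimizer="adam", loss="mse")
    return model
```

Training would call `model.fit(paintings, photos, ...)`, i.e., the stylized images as inputs and the original photographs as targets.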
Metrics:
- Success would be if we are able to accomplish our goal of reconstructing photos from paintings reasonably well, as determined by a low loss value and human judgment of the outputs.
What experiments do you plan to run?
- We plan to feed existing paintings into the model once it has been trained and judge the results. We will also run ablation experiments.
- For most of our assignments, we have looked at the accuracy of the model. Does the notion of “accuracy” apply for your project, or is some other metric more appropriate? The notion of accuracy does not apply to our project. We will have a loss function which we can use to judge the performance of the model, but we will also be subjectively judging its outputs as a generative model.
If you are implementing an existing project, detail what the authors of that paper were hoping to find and how they quantified the results of their model. If you are doing something new, explain how you will assess your model’s performance.
- We will assess our model’s performance both numerically with a loss function on training data and subjectively by feeding it paintings and judging the photorealism of its outputs.
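The numeric side of this assessment could be sketched as pixel-wise MSE plus PSNR, a standard pairing for reconstruction quality (the helper name `reconstruction_metrics` is illustrative; images are assumed scaled to [0, 1]):

```python
import numpy as np

def reconstruction_metrics(photo, reconstruction):
    # Pixel-wise mean squared error between the original photo and
    # the model's output, both arrays scaled to [0, 1].
    mse = float(np.mean((photo - reconstruction) ** 2))
    # PSNR in dB for a max pixel value of 1.0; higher is closer.
    psnr = float("inf") if mse == 0 else 10 * np.log10(1.0 / mse)
    return mse, psnr
```

Subjective judgments of photorealism would complement these numbers, since low MSE does not guarantee a convincing face.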
What are your base, target, and stretch goals?
- Base goal: train a model that preserves the general features of portrait paintings that are fed into it.
- Target goal: train a model that creates somewhat consistently photorealistic images from a limited set of painting styles.
- Stretch goal: train a model that creates highly photorealistic images from a wide variety of painting styles.
Ethics: Choose 2 of the following bullet points to discuss; not all questions will be relevant to all projects so try to pick questions where there’s interesting engagement with your project. (Remember that there’s not necessarily an ethical/unethical binary; rather, we want to encourage you to think critically about your problem setup.)
What broader societal issues are relevant to your chosen problem space?
Why is Deep Learning a good approach to this problem?
Deep learning is well suited to image-related tasks that are difficult to model algorithmically, such as classification and, as in our case, generative modeling.
What is your dataset? Are there any concerns about how it was collected, or labeled? Is it representative? What kind of underlying historical or societal biases might it contain?
We will use a dataset of human faces from Kaggle: [https://www.kaggle.com/datasets/ashwingupta3012/human-faces]. The dataset was designed to be inclusive across demographic groups, so ideally it would be representative. However, there will necessarily be some groups that are less well-represented in the dataset than others, reflecting the societal biases of the internet data the authors drew on.
Who are the major “stakeholders” in this problem, and what are the consequences of mistakes made by your algorithm?
How are you planning to quantify or measure error or success? What implications does your quantification have?
Add your own: if there is an issue about your algorithm you would like to discuss or explain further, feel free to do so.
Division of labor: Briefly outline who will be responsible for which part(s) of the project.
- Harys: preprocessing
- Sebastian: modeling
- Daniel: training and testing
- Together: ablation experiments
See the Google Doc linked below for a final report.
Built With
- keras
- python
- tensorflow