I recently learned an art technique for drawing heads called the Loomis method. It is taught by Andrew Loomis in his book Drawing the Head & Hands.
This method requires understanding the form of the head. I watched related videos on Stan Prokopenko's YouTube channel and found them very useful.
Anyway, a picture is worth a thousand words. Below is one of my drawings made with the Loomis method (Elif):
Apart from learning this art technique, I wanted to develop a machine learning model to transform a photo into a Loomis head drawing (and vice versa).
Specifically, I tried two GAN models: pix2pix and CycleGAN.
pix2pix maps an image from one domain to another. For example, it can colorize a black-and-white image. In my case, it takes a portrait photo and returns a drawing in the Loomis method.
How does it work? I can explain it simply as follows:
The architecture has two main components: the generator and the discriminator.
The generator generates an image. On its first try, it is just random pixels.
The discriminator, on the other hand, predicts whether an image pair was produced by the generator or not. Technically speaking, it performs binary classification.
Over time, the generator produces better images, and the discriminator gets better at distinguishing what is real from what is generated. In other words, the generator tries to fool the discriminator.
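This tug-of-war can be sketched with toy numbers. The snippet below is a minimal illustration, not the actual pix2pix objective: it assumes the discriminator outputs a single probability that its input is real, and shows how the two losses pull in opposite directions using plain binary cross-entropy.

```python
import math

def bce(prediction, label):
    # Binary cross-entropy for a single probability in (0, 1).
    return -(label * math.log(prediction) + (1 - label) * math.log(1 - prediction))

# Hypothetical discriminator outputs (probability that the input is real):
d_on_real = 0.9   # confident a real drawing is real
d_on_fake = 0.2   # fairly sure a generated drawing is fake

# Discriminator loss: real examples should score 1, generated ones 0.
d_loss = bce(d_on_real, 1.0) + bce(d_on_fake, 0.0)

# Generator loss: the generator wants its output to be scored as real.
g_loss = bce(d_on_fake, 1.0)

print(round(d_loss, 3), round(g_loss, 3))  # 0.329 1.609
```

Note how the same number `d_on_fake` appears in both losses with opposite labels: pushing it down helps the discriminator, pushing it up helps the generator. That opposition is what "fooling the discriminator" means in practice.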
To learn more about pix2pix, I would recommend watching this video by Two Minute Papers.
pix2pix requires a paired dataset, so I created the following pairs. I have nearly 100 of them. It is still a limited dataset, but I wanted to give it a try.
I trained the model for a few hours on an AWS EC2 instance with a GPU.
The results were not great, but they are OK for now. I didn't (couldn't) expect more than this.
As you can see in the illustration below, the angle of the head doesn't look correct in most of the outputs. I assume this is because my training dataset doesn't have enough examples to cover all the possible combinations.
A few outputs:
It is a pain to create such pairs. Luckily, there is another framework for image-to-image translation that doesn't require paired examples: CycleGAN.
This time I only need a bunch of input images (portraits) and a bunch of target images (drawings). No need for pairs.
There is a constraint in this architecture: generated images are translated back to the original domain, and the reconstruction must match the input, so that there is a meaningful relationship between the input images and the generated images. A good explanation is here.
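The round-trip constraint is called the cycle-consistency loss. Here is a minimal sketch of the idea with toy 1-D "images"; `G` and `F` are hypothetical stand-ins for CycleGAN's two generators (photo-to-drawing and drawing-to-photo), deliberately chosen to be exact inverses so the loss comes out to zero.

```python
# Toy stand-ins for the two generators (not real networks):
def G(photo):      # photo -> drawing
    return [p * 0.5 for p in photo]

def F(drawing):    # drawing -> photo
    return [d * 2.0 for d in drawing]

def cycle_consistency_loss(x):
    # L1 distance between the input and its round-trip reconstruction,
    # || F(G(x)) - x ||_1, averaged over pixels.
    reconstructed = F(G(x))
    return sum(abs(r - p) for r, p in zip(reconstructed, x)) / len(x)

photo = [0.1, 0.4, 0.8]
print(cycle_consistency_loss(photo))  # 0.0 -- a perfect round trip
```

In training, this loss is added to the usual adversarial losses, penalizing generators whose translations can't be undone; that is what keeps the output tied to the input even without paired examples.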
I picked 600 samples out of 100K StyleGAN-generated photos. In other words, they are photos of people who don't exist. This dataset doesn't have portrait photos with extreme angles. For simplicity, I picked the ones with natural lighting, a soft background, and no smile.
It couldn't translate photos to drawings well; the output looks like edge detection. However, drawing-to-photo translation was much better than I expected. I looked at the paper and found this statement:
On translation tasks that involve color and texture changes, like many of those reported above, the method often succeeds. We have also explored tasks that require geometric changes, with little success.
One sample for drawing to photo translation:
See the video below to see how it translates.
Finally, I also wanted to see how it performs with a training set with less variance, so I trained it on photos with studio lighting and a white background.
This task, extracting an artistic form from a photo, requires a geometric transformation.
pix2pix seems to work; however, it requires paired examples, which are difficult to obtain in most cases. I thought it could work better if I had 3D models in both domains, so that I could render images that would be pairs. However, I didn't spend time on that.
CycleGAN works well when there are no pairs. However, it can't do a geometric transformation. It is good if the task is about changing texture or color.
I found both pix2pix and CycleGAN amazing. They would have produced much better results if I had had more training examples and trained them longer.
Let me know if you spot a mistake. I will keep developing this content.