This project, hosted at styletransfer.gregorycooke.me, was a fun learning project I did to practice serverless machine learning. I came across a project called Cartoonify and thought it would be fun to build a very similar application with my own stack and design.
The front end of this project was written in C# using Blazor WebAssembly. This generates a static front-end page that I host in a serverless fashion via Azure Static Web Apps. Blazor can interop with JavaScript, so I was able to use Angular's ImageCompare component to let the user interact with the original image and the styled image.
The network used here is discussed in the paper CartoonGAN: Generative Adversarial Networks for Photo Cartoonization. In short, the idea is that real-life images lie on some manifold within the space of all possible images, and cartoon images lie on some other manifold in the same space. The neural network learns to transform from the real manifold to the cartoon manifold. This paper's method does not require paired images for training, but rather uses unpaired training data (a big deal, as paired training images for style transfer are difficult to get or create). The overall architecture is a generative adversarial network (GAN), with the generator being responsible for the projection from the real manifold to the cartoon manifold, and the discriminator being responsible for judging whether an input image is real or cartoon. The pretrained models used in this project are from this repository.
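To make the generator and discriminator roles concrete, here is a minimal sketch of a plain GAN objective in Python. It is the textbook adversarial loss, not the exact loss from the CartoonGAN paper (which adds further terms), and the function names are my own for illustration. Inputs are the discriminator's "this is a cartoon" probabilities, assumed to lie strictly between 0 and 1.

```python
import math


def discriminator_loss(d_on_cartoon, d_on_generated):
    """Binary cross-entropy for the discriminator.

    d_on_cartoon:   D's outputs on real cartoon images (should approach 1).
    d_on_generated: D's outputs on generator outputs (should approach 0).
    All values must lie strictly between 0 and 1.
    """
    real_term = -sum(math.log(p) for p in d_on_cartoon) / len(d_on_cartoon)
    fake_term = -sum(math.log(1.0 - p) for p in d_on_generated) / len(d_on_generated)
    return real_term + fake_term


def generator_loss(d_on_generated):
    """Non-saturating generator loss: G is rewarded when D scores its
    outputs as cartoons (values close to 1)."""
    return -sum(math.log(p) for p in d_on_generated) / len(d_on_generated)
```

A discriminator that separates the two sets well gets a lower loss than one that guesses 0.5 everywhere, and the generator's loss drops as its outputs fool the discriminator more often.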
When an image is “uploaded” to the static page, the page sends a request to an Azure Function. That Azure Function is a Python application that pulls and caches the models discussed above, runs the image through the neural network, and sends the cartoonized output image back to the static page. The static page then renders the cartoonized image and the original image in a React ImageCompare component.
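The request flow above can be sketched roughly as follows. This is a framework-free illustration rather than the actual function: the JSON payload with base64 `image` and `style` fields, and the names `handle_request`, `get_model`, and `load_model`, are all assumptions. A "model" is reduced to any callable that maps image bytes to image bytes.

```python
import base64
import json
from typing import Callable, Dict

# Hypothetical stand-in: a "model" is any callable from image bytes to image bytes.
Model = Callable[[bytes], bytes]

# Module-level cache, so a loaded model is reused by later requests
# that are handled by the same running process.
_MODEL_CACHE: Dict[str, Model] = {}


def get_model(style: str, load_model: Callable[[str], Model]) -> Model:
    """Return the cached model for `style`, loading it only on first use."""
    if style not in _MODEL_CACHE:
        _MODEL_CACHE[style] = load_model(style)
    return _MODEL_CACHE[style]


def handle_request(body: str, load_model: Callable[[str], Model]) -> str:
    """Decode the uploaded image, run it through the model, and
    return the styled image as a JSON string."""
    payload = json.loads(body)
    image_bytes = base64.b64decode(payload["image"])
    model = get_model(payload.get("style", "default"), load_model)
    styled_bytes = model(image_bytes)
    return json.dumps({"image": base64.b64encode(styled_bytes).decode("ascii")})
```

In a real deployment, `load_model` would download the pretrained weights and build the network, and `handle_request` would be wrapped by the serverless platform's HTTP trigger.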
This design does suffer from the common “cold start” problem of serverless functions (Azure Functions, AWS Lambda, etc.). Basically, when the function is called for the first time in a while, the resources to run it have to spin back up. In this case, that involves setting up a Python environment with some reasonably sized packages as well as pulling ~120 MB of models from Azure Blob Storage. Therefore, the first request that hits this function can take upwards of 30 seconds. After the Azure Function environment spins up, it stays up while it keeps serving requests and keeps the models cached, so subsequent requests are significantly faster. It can stay idle for roughly 10 minutes before Azure spins it down. Solutions for this problem include priming the function when the web page first loads, or continually hitting it so the environment stays active; however, as this is just a for-fun project, it's okay for cold starts to happen :)
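For illustration, the "continually hitting it" option could look something like the sketch below. I am not actually running this. The 5-minute interval is simply an assumption chosen to sit under the roughly 10-minute idle window, and the function names and any URL passed in are hypothetical.

```python
import time
import urllib.request
from typing import Callable, List


def ping(url: str, timeout: float = 60.0) -> int:
    """Send one GET request and return the HTTP status code.

    The generous timeout leaves room for a slow cold start.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def keep_warm(
    url: str,
    interval_seconds: float = 300.0,  # assumed: 5 min, below the ~10 min idle window
    rounds: int = 3,
    ping_fn: Callable[[str], int] = ping,
    sleep_fn: Callable[[float], None] = time.sleep,
) -> List[int]:
    """Ping `url` `rounds` times, sleeping between pings, and return the statuses."""
    statuses = []
    for i in range(rounds):
        statuses.append(ping_fn(url))
        if i < rounds - 1:  # no need to sleep after the final ping
            sleep_fn(interval_seconds)
    return statuses
```

Priming on page load is the same idea with a single ping: the front end fires one throwaway request as soon as the page opens, so the environment is already warming up by the time the user picks an image.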
Overall, this design costs me on the order of pennies per month, as Azure Functions are extremely cheap at the low loads I am serving.
Project link: https://styletransfer.gregorycooke.me