by Dan Whitenack, 4/14/2017
Unless you have been hiding under a rock for the past few months, you have likely seen Christopher Hesse’s demo of image-to-image translation (a TensorFlow port of pix2pix by Isola et al.). In case you missed it, search for edge2cat, and a whole new world of cat-infused artificial intelligence will be opened to you. The model is trained on cat images, and it can translate hand-drawn cats into realistic images of cats! Here are a few of our personal favorite “edge” image to cat translations generated by Chris’s model, ranging from accurate to horrifying:
But models like this aren’t restricted to creating weird cat images that take over the Internet. They are actually very generic and can be trained to perform a variety of translations including day to night, building layout to buildings, black and white to color, and many more:
You might even be thinking that you want to try using your own set of images to see what kinds of crazy things this model will start pumping out. But, that has to be a tricky process, right?
Nope! Pachyderm has created a reusable, generic pipeline that takes care of the training, pre-processing, etc. for you, so you can jump right into the fun parts! This machine learning pipeline template (produced by the team at Pachyderm in collaboration with Chris) shows how easy it can be to deploy and manage image generation models (like those pictured above). Everything you need to run the reusable pipeline can be found here on GitHub, and is described below.
Christopher Hesse’s image-to-image demos use a TensorFlow implementation of the Generative Adversarial Network (GAN) model presented in this article. Chris’s full TensorFlow implementation of this model can be found on GitHub and includes documentation about how to perform training, testing, pre-processing of images, exporting of the models for serving, and more.
To deploy and manage the model, we will execute its training, model export, pre-processing, and image generation in the reusable Pachyderm pipeline mentioned above. This will allow us to:
The general structure of our pipeline looks like this:
The cylinders represent data “repositories” in which Pachyderm will version training, model, etc. data (think “git for data”). These data repositories are then input/output of the linked data processing stages (represented by the boxes in the figure).
You can experiment with this pipeline locally using a quick local installation of Pachyderm. Alternatively, you can quickly spin up a real Pachyderm cluster in any one of the popular cloud providers. Check out the Pachyderm docs for more details on deployment.
Once deployed, you will be able to use Pachyderm’s pachctl CLI tool to create data repositories and start our deep learning pipeline.
First, let’s prepare our training and model export stages. Chris Hesse’s pix2pix.py script includes:
Thus, our “Model training and export” stage can be split into a training stage (called “checkpoint”) producing a model checkpoint and an export stage (called “model”) producing a persisted model used for image generation:
We can deploy this part of the pipeline in two quick steps:
Create the “training” data repository by running pachctl create-repo training.
Create the processing stages from a pipeline specification, training_and_export.json, telling Pachyderm to: (i) run Chris’s pix2pix.py script in “train” mode on the data in the “training” repository, outputting a checkpoint to the “checkpoint” repository, and (ii) run the pix2pix.py script in “export” mode on the data in the “checkpoint” repository, outputting a persisted model to the “model” repository. This can be done by running pachctl create-pipeline -f training_and_export.json.
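To make the two stages more concrete, here is a minimal Python sketch of what a specification like training_and_export.json might contain. It is not the actual file from the template. The JSON field names ("pipeline", "transform", "input", "pfs", "glob") follow one version of the Pachyderm pipeline specification and may differ in yours, the Docker image name is hypothetical, and the epoch count is only illustrative. The /pfs/<repo> input paths and the /pfs/out output path are the Pachyderm conventions for where a stage reads and writes its data.

```python
import json


def make_stage(name, input_repo, cmd, image="example/pix2pix-tensorflow"):
    """Build one pipeline stage as a plain dict.

    Assumptions: the field names mirror one version of the Pachyderm
    pipeline spec, and the image name is a hypothetical placeholder.
    """
    return {
        "pipeline": {"name": name},
        "transform": {"image": image, "cmd": cmd},
        "input": {"pfs": {"repo": input_repo, "glob": "/"}},
    }


# Stage (i): train on the "training" repo and write a checkpoint.
training_stage = make_stage(
    "checkpoint",
    "training",
    ["python", "pix2pix.py", "--mode", "train",
     "--input_dir", "/pfs/training",
     "--output_dir", "/pfs/out",
     "--max_epochs", "200"],  # illustrative value
)

# Stage (ii): export the checkpoint as a persisted model.
export_stage = make_stage(
    "model",
    "checkpoint",
    ["python", "pix2pix.py", "--mode", "export",
     "--checkpoint", "/pfs/checkpoint",
     "--output_dir", "/pfs/out"],
)


def render_spec(stages):
    """Render the stages as JSON objects written one after another."""
    return "\n".join(json.dumps(stage, indent=2) for stage in stages)


if __name__ == "__main__":
    print(render_spec([training_stage, export_stage]))
```

Because each stage is named after the repository it writes to, the “model” stage can name “checkpoint” as its input, which is how the two stages chain together.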
Next, let’s prepare our pre-processing and image generation stages. Our trained model expects PNG image data with certain properties (256 x 256 in size, 8-bit/color RGB, non-interlaced). As such, we need to pre-process (specifically, resize) our images as they come into our pipeline, and Chris has us covered with a process.py script to perform the resizing.
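If you want to check whether an image already has the properties listed above, the PNG header carries all of them. The sketch below is our own illustration, not part of Chris’s scripts, and the function names are hypothetical. It reads the IHDR chunk at the start of a PNG file using only the Python standard library and checks for 256 x 256, 8 bits per channel, RGB color type, and no interlacing.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_header(data):
    """Parse the IHDR chunk that follows the 8-byte PNG signature."""
    if len(data) < 29 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # IHDR layout: width, height (4 bytes each), then bit depth, color type,
    # compression method, filter method, interlace method (1 byte each).
    width, height, bit_depth, color_type, _comp, _filt, interlace = struct.unpack(
        ">IIBBBBB", data[16:29]
    )
    return {
        "width": width,
        "height": height,
        "bit_depth": bit_depth,
        "color_type": color_type,  # 2 means RGB (truecolor) in the PNG spec
        "interlace": interlace,    # 0 means non-interlaced
    }


def is_model_ready(data):
    """True if the PNG is 256 x 256, 8-bit RGB, and non-interlaced."""
    header = read_png_header(data)
    return (
        header["width"] == 256
        and header["height"] == 256
        and header["bit_depth"] == 8
        and header["color_type"] == 2
        and header["interlace"] == 0
    )
```

To use it, read a file in binary mode and pass the bytes to is_model_ready. Only the first 29 bytes are inspected, so the check is cheap even for large images.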
To actually perform our image-to-image translation, we need to use a process_local.py script. This script will take our pre-processed images and persisted model as input and output the generated, translated result:
Again, we can deploy this part of the pipeline in two quick steps:
Create the “input_images” data repository by running pachctl create-repo input_images.
Create the processing stages from a pipeline specification, pre-processing_and_generation.json, telling Pachyderm to: (i) run the process.py script on the data in the “input_images” repository, outputting to the “preprocess_images” repository, and (ii) run the process_local.py script with the model in the “model” repository and the images in the “preprocess_images” repository as input. This can be done by running pachctl create-pipeline -f pre-processing_and_generation.json.
Now that we have created our input data repositories (“input_images” and “training”) and we have told Pachyderm about all of our processing stages, our production-ready deep learning pipeline will run automatically when we put data into “training” and “input_images.” It just works.
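To see why new data in either repository sets things in motion, it helps to write the pipeline down as a small dependency graph. The Python sketch below is only an illustration of that idea, not how Pachyderm is implemented. It assumes each stage writes to a repository named after the stage, as described above, and “generate” is a hypothetical name for the final image generation stage.

```python
from collections import deque

# Stage name -> repositories the stage reads from. Each stage is assumed to
# write to a repository with the same name as the stage.
STAGE_INPUTS = {
    "checkpoint": ["training"],
    "model": ["checkpoint"],
    "preprocess_images": ["input_images"],
    "generate": ["model", "preprocess_images"],  # hypothetical stage name
}


def triggered_stages(repo):
    """List the stages downstream of `repo`, in the order they are discovered."""
    order = []
    seen = set()
    queue = deque([repo])
    while queue:
        current = queue.popleft()
        for stage, inputs in STAGE_INPUTS.items():
            if current in inputs and stage not in seen:
                seen.add(stage)
                order.append(stage)
                queue.append(stage)  # the stage's output may feed later stages
    return order


if __name__ == "__main__":
    for repo in ("training", "input_images"):
        print(repo, "->", triggered_stages(repo))
```

New training data reaches the training, export, and generation stages, while new input images only reach pre-processing and generation, so the model is not retrained every time you want to translate another picture.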
Chris provides a nice guide for preparing training sets here. You can use cat images, dog images, buildings, or anything that might interest you. Be creative and show us what you come up with! When you have your training and input images ready, you can get them into Pachyderm using the pachctl CLI tool or one of the Pachyderm clients (discussed in more detail here).
For some inspiration, we ran Pachyderm’s pipeline with Google Maps images paired with satellite images to create a model that translates Google Maps screenshots into pictures resembling satellite images. Once we had our model trained, we could stream Google Maps screenshots into the pipeline to create translations like this:
Prepare your training data set, deploy the above template pipeline, and be sure to share your results! We can’t wait to see what crazy stuff you come up with.
Be sure to:
Dan is a data scientist at Pachyderm (YC W15).