
Monday, August 3, 2015

Deep dream: Visualizing every layer of GoogLeNet

Last week I introduced bat-country, my lightweight, extendible, easy-to-use Python package for deep dreaming and inceptionism.

The reception of the library was very good, so I decided it would be interesting to do a follow-up post. But instead of generating more really trippy images like the ones on the Twitter #deepdream stream, I thought it would be more captivating to visualize every layer of GoogLeNet using bat-country.


Visualizing every layer of GoogLeNet with Python

Below is my Python script to load an image, loop over every layer of the network, and write each output image to file:

# import the necessary packages
from __future__ import print_function
from batcountry import BatCountry
from PIL import Image
import numpy as np
import argparse
import warnings
import cv2

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-b", "--base-model", required=True, help="base model path")
ap.add_argument("-i", "--image", help="path to image file")
ap.add_argument("-o", "--output", help="path to output directory")
args = ap.parse_args()

# filter warnings, initialize bat country, and grab the layer names of
# the CNN
warnings.filterwarnings("ignore")
bc = BatCountry(args.base_model)
layers = bc.layers()

# extract the filename and extension of the input image
filename = args.image[args.image.rfind("/") + 1:]
(filename, ext) = filename.split(".")

# loop over the layers
for (i, layer) in enumerate(layers):
        # perform the visualization using the current layer
        print("[INFO] processing layer `{}` {}/{}".format(layer, i + 1, len(layers)))

        try:
                # pass the image through the network
                image = bc.dream(np.float32(Image.open(args.image)), end=layer,
                        verbose=False)

                # draw the layer name on the image
                image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
                cv2.putText(image, layer, (5, image.shape[0] - 10),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.95, (0, 255, 0), 2)

                # construct the output path and write the image to file
                p = "{}/{}_{}.{}".format(args.output, filename, str(i + 1).zfill(4), ext)
                cv2.imwrite(p, image)

        except KeyError:
                # the current layer cannot be used for visualization
                print("[ERROR] cannot use layer `{}`".format(layer))

# perform housekeeping
bc.cleanup()

This script requires three command line arguments: the --base-model directory where our Caffe model lives, the path to our input --image, and finally the --output directory where our images will be stored after being passed through the network.

As you’ll also see, I am using a try/except block to catch any layers that cannot be used for visualization.

Below is the image that I fed into the network:

Figure 1: The iconic input image of Dr. Grant and the T-Rex from Jurassic Park.

I then executed the Python script using the following command:

$ python visualize_layers.py \
        --base-model $CAFFE_ROOT/caffe/models/bvlc_googlenet \
        --image images/jp.jpg --output output/jp

The visualization process then kicks off. I generated my results on an Amazon EC2 g2.2xlarge instance with GPU support enabled, so the script finished up within 30 minutes.
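As a side note, here is a minimal sketch of how you might switch pycaffe into GPU mode before constructing BatCountry. This assumes your Caffe build was compiled with CUDA support and that device 0 is the GPU you want to use; it simply flips pycaffe's global mode and is not a peek at anything bat-country does internally:

# a minimal sketch -- assumes Caffe was compiled with CUDA/GPU support
import caffe

# select the first GPU and run all forward/backward passes on it
caffe.set_device(0)
caffe.set_mode_gpu()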

You can see a .gif of all layer visualizations below:

Figure 2: Visualizing every layer of GoogLeNet using bat-country.

The .gif is pretty large at 9.6MB, so give it a few seconds to load, especially if you are on a slow connection.
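If you want to stitch together a similar .gif from your own run, here is a minimal sketch using Pillow. The output/jp directory and the 250ms frame duration are assumptions that match the command above; adjust them to wherever your frames were written:

# a minimal sketch -- assumes the frames were written to output/jp by the
# script above and that Pillow is installed
import glob
from PIL import Image

# the zero-padded index in each filename keeps the sort order stable
paths = sorted(glob.glob("output/jp/*.jpg"))
frames = [Image.open(p) for p in paths]

# write the animated GIF, showing each layer for ~250ms and looping forever
frames[0].save("output/jp.gif", save_all=True, append_images=frames[1:],
        duration=250, loop=0)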

In the meantime, here are some of my favorite layers:

Figure 3: This is by far my favorite of the bunch. The lower layers of the network reflect edge-like regions in the input image.

Figure 4: The inception_3a/3x3 layer also produces a nice effect.

Figure 5: The same goes for the inception_3b/3x3_reduce layer.

Figure 6: This one I found amusing -- it seems that Dr. Grant has developed a severe case of dog-ass.

Figure 7: Eventually, our Dr. Grant and T-Rex have morphed into something else entirely.

Summary

This blog post was a quick “just for fun” tutorial on visualizing every layer of a CNN, and it also served as a good demonstration of how to use the bat-country library.

If you haven’t had a chance to play around with deep dreaming or inceptionism, definitely give the original post on bat-country a read — I think you’ll find it amusing and enjoyable.

See you next week!

