Computer Vision, Fall 2016

Project #5: A Neural Algorithm of Artistic Style

Click here for the source code!

Date Submitted: 20 Nov 2016

446216 (Ying Wang) 446489 (Wenjia Zhang)

Project Description

Implement a state-of-the-art research paper from a recent computer vision conference or journal (CVPR, ICCV, ECCV, SIGGRAPH, PAMI, IJCV).
-- Courtesy of Course Website

Goal

We want to maintain the global arrangement from the original photograph and apply the colors and local structure of the artwork to it. We end up with a new synthetic image which has the style of the artwork and the content of the photograph.

Approach

An image is represented as a set of filtered images (feature maps), one per filter at each layer of the CNN. The size of the filtered images decreases with depth (because of max-pooling) while the number of filters increases. The filtered images at a given layer can be used to reconstruct the original image: in the early layers the reconstruction is almost perfect, while in the deeper layers it loses detailed pixel information but preserves high-level concepts. The main idea is to use the filter responses from different layers of the CNN to build the style. In this way we can capture different levels of detail, from low (strokes, points, corners) to high (patterns, objects, etc.).
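To make the shrinking-size, growing-depth pattern concrete, here is a minimal sketch that lists the feature map shape at each of the five convolutional blocks of VGG19. The channel counts are those of the standard VGG19 architecture; the helper name `feature_map_shapes` and the 224 x 224 input size are only illustrative assumptions, and the sketch assumes the image sides are multiples of 16 so that every halving is exact.

```python
# Channel counts of the five convolutional blocks in standard VGG19.
VGG19_BLOCK_CHANNELS = [64, 128, 256, 512, 512]


def feature_map_shapes(height, width):
    """Return (block_name, height, width, channels) for each VGG19 conv block.

    Hypothetical helper for illustration. Assumes the convolutions keep the
    spatial size and that a 2x2 max-pool with stride 2 follows each block,
    so only pooling changes the height and width.
    """
    shapes = []
    for i, channels in enumerate(VGG19_BLOCK_CHANNELS, start=1):
        shapes.append((f"conv{i}", height, width, channels))
        # Max-pooling halves each spatial dimension.
        height, width = height // 2, width // 2
    return shapes


# Example input size (an assumption, not a value taken from the project).
for name, h, w, c in feature_map_shapes(224, 224):
    print(f"{name}: {h} x {w} x {c}")
```

Each deeper block therefore describes the image with fewer spatial positions but more filters, which is why fine pixel detail is lost while higher-level structure remains.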

1. Install TensorFlow, SciPy, and NumPy, and download the pre-trained VGG19 model.

2. Define the style representation function (textures).
Style representation: correlations between the different filter responses over the spatial extent of the feature maps, which provide colors and local structures.
Key equations: the correlation (Gram) matrix, the cost for style reconstruction at each layer, and the accumulated cost over the lower layers.
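The three key equations of this step can be sketched in NumPy as follows. This is a minimal illustration that follows the definitions in the Gatys et al. paper (Gram matrix, per-layer cost normalized by 4 N^2 M^2, weighted sum over layers), not the project's TensorFlow code. The function names and the random arrays standing in for VGG19 feature maps are assumptions.

```python
import numpy as np


def gram_matrix(features):
    """Correlation (Gram) matrix of a feature map of shape (H, W, C).

    Entry (i, j) is the inner product of the responses of filters i and j
    over all H * W spatial positions.
    """
    h, w, c = features.shape
    flat = features.reshape(h * w, c)  # M positions by N filters
    return flat.T @ flat               # N x N


def layer_style_loss(generated, style):
    """Style reconstruction cost for one layer."""
    h, w, c = generated.shape
    n, m = c, h * w  # N filters, M spatial positions
    diff = gram_matrix(generated) - gram_matrix(style)
    return np.sum(diff ** 2) / (4.0 * n ** 2 * m ** 2)


def style_loss(generated_layers, style_layers, layer_weights):
    """Accumulated cost: weighted sum of the per-layer costs."""
    return sum(
        weight * layer_style_loss(g, s)
        for weight, g, s in zip(layer_weights, generated_layers, style_layers)
    )


# Random arrays stand in for real feature maps (illustrative assumption).
rng = np.random.default_rng(0)
generated = [rng.normal(size=(8, 8, 4)), rng.normal(size=(4, 4, 6))]
style = [rng.normal(size=(8, 8, 4)), rng.normal(size=(4, 4, 6))]
print(style_loss(generated, style, layer_weights=[0.5, 0.5]))
```

Because the Gram matrix sums over all spatial positions, shuffling the positions of a feature map leaves it unchanged. That is why this representation keeps texture and color statistics but discards the global arrangement.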

3. Define the content representation function.
Content representation: the feature responses in the higher layers of the network.
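A minimal sketch of the content cost, following the squared-error definition in the Gatys et al. paper: half the sum of squared differences between the feature responses of the generated image and those of the photograph at one chosen layer. The function name and the small arrays used as feature maps are illustrative assumptions.

```python
import numpy as np


def content_loss(generated_features, content_features):
    """Half the squared error between two feature maps of the same shape."""
    diff = generated_features - content_features
    return 0.5 * np.sum(diff ** 2)


# Tiny stand-ins for the feature maps of one higher layer (assumption).
photo_features = np.zeros((2, 2, 3))
generated_features = np.ones((2, 2, 3))
print(content_loss(generated_features, photo_features))
```

Unlike the style cost, this comparison is made position by position, so it penalizes any change in where things are in the image.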

4. Combine content and style.
Define separate loss functions for style and content, and a total loss that combines the two as a weighted sum.
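As a rough sketch, the total loss can be written as a weighted sum of the two individual losses. The function name and the default weights below are placeholders chosen for illustration, not the values used in this project.

```python
def total_loss(content_loss_value, style_loss_value,
               content_weight=1.0, style_weight=1000.0):
    """Weighted sum of content and style losses.

    The default weights are illustrative placeholders. Only the ratio
    between them matters for where the optimum lies.
    """
    return content_weight * content_loss_value + style_weight * style_loss_value


# Same individual losses, two different balances between content and style.
print(total_loss(2.0, 0.5, content_weight=1.0, style_weight=10.0))
print(total_loss(2.0, 0.5, content_weight=10.0, style_weight=1.0))
```

Raising the style weight relative to the content weight pushes the result toward the artwork's texture, while lowering it keeps the output closer to the photograph.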

5. Tune the hyperparameters (such as content weight, style weight, number of iterations, and learning rate).
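To show where these four hyperparameters enter, here is a toy gradient-descent loop in NumPy. It is a sketch under a strong simplifying assumption: the raw pixel array, flattened to M positions by N channels, stands in for the CNN feature maps, so the gradients can be written by hand instead of being backpropagated through VGG19. The function names and default values are hypothetical and are not taken from the project's code.

```python
import numpy as np


def gram(flat):
    """Gram matrix of an (M, N) array: M spatial positions, N channels."""
    return flat.T @ flat


def transfer(content, style, content_weight=1.0, style_weight=100.0,
             learning_rate=0.05, iterations=200, seed=0):
    """Toy pixel-space optimization; returns (image, loss_history).

    ASSUMPTION: pixels are used directly as features. A real run would
    use VGG19 responses and an automatic-differentiation framework.
    """
    m, n = content.shape
    target_gram = gram(style)
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 1.0, size=content.shape)  # start from noise
    history = []
    for _ in range(iterations):
        gram_diff = gram(x) - target_gram
        c_loss = 0.5 * np.sum((x - content) ** 2)
        s_loss = np.sum(gram_diff ** 2) / (4.0 * n ** 2 * m ** 2)
        history.append(content_weight * c_loss + style_weight * s_loss)
        # Hand-derived gradients of the two losses with respect to x.
        c_grad = x - content
        s_grad = x @ gram_diff / (n ** 2 * m ** 2)
        x = x - learning_rate * (content_weight * c_grad + style_weight * s_grad)
    return x, history


# Random arrays stand in for a photograph and an artwork (assumption).
rng = np.random.default_rng(1)
content = rng.uniform(size=(64, 3))
style = rng.uniform(size=(64, 3))
image, history = transfer(content, style)
print(history[0], history[-1])
```

In this toy setting, a larger learning rate or more iterations moves the image further per run, and the two weights decide which term dominates the update. The same roles carry over to the full network-based version, although suitable values there will differ.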

Results

Here are some other interesting results.