This project implements Neural Style Transfer (NST) using TensorFlow 2 and a pretrained VGG19 network. The goal is to generate a new image that combines the content of one image with the artistic style of another.
Neural Style Transfer is a deep learning technique that separates and recombines content and style from two images using feature maps extracted from a convolutional neural network.
- Content Image: Defines the structure or semantics.
- Style Image: Defines texture, color, and patterns.
- Generated Image: Optimized to preserve content from the content image and style from the style image.
Given a content image and a style image, NST produces an output that preserves the structure of the content image while adopting the texture, brush strokes, and colors of the style image.
NST uses a pretrained CNN (typically VGG19) to extract features from content and style images. These features are used to calculate losses that guide the generated image to match the desired output.
- Extract intermediate feature maps from VGG19.
- Use `block5_conv2` for the content representation.
- Use multiple layers (`block1_conv1`, ..., `block5_conv1`) for the style representation.
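The feature-extraction setup above can be sketched as a small Keras model (a minimal sketch; `build_feature_extractor` is an illustrative name, and the layer names follow the standard Keras VGG19 naming):

```python
import tensorflow as tf

# Layers listed above (standard Keras VGG19 layer names).
content_layers = ['block5_conv2']
style_layers = ['block1_conv1', 'block2_conv1', 'block3_conv1',
                'block4_conv1', 'block5_conv1']

def build_feature_extractor(layer_names):
    """Build a model mapping an image batch to the requested VGG19 feature maps."""
    vgg = tf.keras.applications.VGG19(include_top=False, weights='imagenet')
    vgg.trainable = False  # only the generated image is optimized, not the network
    outputs = [vgg.get_layer(name).output for name in layer_names]
    return tf.keras.Model(inputs=vgg.input, outputs=outputs)

extractor = build_feature_extractor(style_layers + content_layers)
```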
The content loss measures the difference between the feature representations of the content image and the generated image:

$$L_{content} = \frac{1}{2} \sum_{i,j} \left( F_{ij}^{G} - F_{ij}^{C} \right)^2$$

Where:

- $F_{ij}^C$ : Feature map of the content image at position $(i, j)$
- $F_{ij}^G$ : Feature map of the generated image at position $(i, j)$
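A minimal sketch of the content loss (the function name is illustrative; it uses a mean rather than a raw sum, a rescaling that is absorbed into the content weight $\alpha$):

```python
import tensorflow as tf

def content_loss(content_features, generated_features):
    # Mean squared difference between content and generated feature maps.
    return tf.reduce_mean(tf.square(generated_features - content_features))
```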
The style loss measures the difference between the style (texture and patterns) of the style image and the generated image, using Gram matrices.
The Gram matrix captures the correlations between feature maps:

$$G_{cd} = \sum_{i,j} F_{ijc} \, F_{ijd}$$

In code, this is implemented efficiently using:

```python
tf.linalg.einsum('bijc,bijd->bcd', input_tensor, input_tensor)
```

This computes the inner products between feature channels to form the Gram matrix.
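Wrapped in a function, with a normalization by the number of spatial positions (a sketch built on that `einsum` expression; the normalization choice is an assumption):

```python
import tensorflow as tf

def gram_matrix(input_tensor):
    """Channel-by-channel inner products, averaged over spatial positions."""
    result = tf.linalg.einsum('bijc,bijd->bcd', input_tensor, input_tensor)
    shape = tf.shape(input_tensor)
    num_locations = tf.cast(shape[1] * shape[2], tf.float32)
    return result / num_locations
```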
The style loss averages the Gram-matrix differences over the selected layers:

$$L_{style} = \frac{1}{L} \sum_{l=1}^{L} \frac{1}{4 N_l^2 M_l^2} \sum_{c,d} \left( G_{cd}^{S,l} - G_{cd}^{G,l} \right)^2$$

Where:

- $L$ : Number of selected style layers
- $N_l$ : Number of filters in layer $l$
- $M_l$ : Spatial size of the feature map in layer $l$
- $G^{S,l}$, $G^{G,l}$ : Gram matrices of the style and generated images at layer $l$
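A minimal sketch of the style loss over precomputed Gram matrices (illustrative name; each layer gets a plain mean squared error and an equal weight, folding the $1/(4 N_l^2 M_l^2)$ factor into the style weight $\beta$):

```python
import tensorflow as tf

def style_loss(style_grams, generated_grams):
    """Average per-layer mean squared error between Gram matrices."""
    per_layer = [tf.reduce_mean(tf.square(g - s))
                 for s, g in zip(style_grams, generated_grams)]
    return tf.add_n(per_layer) / len(per_layer)
```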
The total loss combines the content and style losses:

$$L_{total} = \alpha \, L_{content} + \beta \, L_{style}$$

Where:

- $\alpha$ : Weight for the content loss (e.g., 1e4)
- $\beta$ : Weight for the style loss (e.g., 1e-2)
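With the example weights above, the combination is a one-liner (names are illustrative):

```python
content_weight = 1e4  # alpha
style_weight = 1e-2   # beta

def total_loss(c_loss, s_loss):
    # Weighted sum of the content and style losses.
    return content_weight * c_loss + style_weight * s_loss
```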
“Art enables us to find ourselves and lose ourselves at the same time.” – Thomas Merton
- Initialize the generated image as a copy of the content image.
- Extract target features from the content and style images.
- Use `tf.GradientTape` to compute gradients of the total loss w.r.t. the generated image.
- Update the image using an optimizer (e.g., Adam).
- Clip pixel values between 0 and 1.
- Repeat for multiple epochs.
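The steps above can be sketched as a single update step (a minimal sketch; `loss_fn` stands in for the total loss built from the feature extractor and the losses described earlier, and the learning rate is an assumption):

```python
import tensorflow as tf

optimizer = tf.keras.optimizers.Adam(learning_rate=0.02)

def train_step(image, loss_fn):
    """One optimization step on the generated image (a tf.Variable in [0, 1])."""
    with tf.GradientTape() as tape:
        loss = loss_fn(image)          # total loss for the current image
    grad = tape.gradient(loss, image)  # gradient w.r.t. the image pixels
    optimizer.apply_gradients([(grad, image)])
    image.assign(tf.clip_by_value(image, 0.0, 1.0))  # keep pixels in [0, 1]
    return loss
```

In practice this is called in a loop over many steps, optionally wrapped in `@tf.function` for speed.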
| Type | Layer Name | Purpose |
|---|---|---|
| Content | `block5_conv2` | Preserve structure |
| Style | `block1_conv1` | Capture low-level textures |
| Style | `block2_conv1` | |
| Style | `block3_conv1` | |
| Style | `block4_conv1` | |
| Style | `block5_conv1` | Capture abstract style |
- Gatys et al. (2015), “A Neural Algorithm of Artistic Style” — https://arxiv.org/abs/1508.06576
- TensorFlow Style Transfer Tutorial — https://www.tensorflow.org/tutorials/generative/style_transfer