Tutorial
A neural network can learn a mapping from (x, y) coordinates to pixel brightness. To get an intuitive understanding of how that would work, feel free to visit http://playground.tensorflow.org and play around.
Let's use the following greyscale image as an example:

```python
image = [
    [0, 130, 255],
    [40, 170, 255],
    [80, 210, 255]
]
```

In this image, the top-left pixel is black, the center pixel is grey and the rightmost pixels are white.
The image can be represented as a mapping from (x, y) coordinates to pixel brightness values, like this:
```
(0, 0) => 0
(1, 0) => 130
(2, 0) => 255
(0, 1) => 40
(1, 1) => 170
(2, 1) => 255
(0, 2) => 80
(1, 2) => 210
(2, 2) => 255
```
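To make that mapping concrete, here's a tiny sketch (not part of the tutorial's own code) that prints exactly these pairs from the nested list:

```python
image = [
    [0, 130, 255],
    [40, 170, 255],
    [80, 210, 255]
]

# y is the row index (top to bottom), x is the column index (left to right)
for y, row in enumerate(image):
    for x, brightness in enumerate(row):
        print('({}, {}) => {}'.format(x, y, brightness))
```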
A neural network can be viewed as a function approximator that can learn this mapping. However, first we need to apply some pre-processing to the data. Ideally, the input and output should be centered around zero and have unit variance. In other words, a series of numbers like [-1, 0, 1] is fine, but a series like [0, 130, 255] is not. For now, it's good enough to simply scale all our values so they lie between 0 and 1. That means we'll
- divide x coordinates by the image width
- divide y coordinates by the image height
- divide pixel brightness values by 255
Here's some code to represent our little image and preprocess the data:

```python
import numpy as np

image = [
    [0, 130, 255],
    [40, 170, 255],
    [80, 210, 255]
]

# Scale brightness values from [0, 255] down to [0, 1]
image = np.array(image)
image = np.divide(image, 255.0)

# numpy shapes are (rows, columns), i.e. (height, width)
image_height, image_width = image.shape
print('Image with shape {}:'.format(image.shape))
print(image)

x = []
y = []
for i in range(image_height):      # i is the row (vertical) index
    for j in range(image_width):   # j is the column (horizontal) index
        x.append(
            [i / image_height, j / image_width]
        )
        y.append(
            [image[i][j]]
        )
x = np.array(x)
y = np.array(y)

print('\nScaled coordinates (input):')
print(x)
print('\nScaled pixel brightness values (output):')
print(y)
```

Go ahead and run the code. All good? Great! Let's move on and actually train a neural network on this data.
For now, let's use a tiny neural network: two input nodes (the scaled x and y coordinates), a single hidden layer with five nodes, and one output node (the predicted brightness).
The data flows from input to output. In each node, two things happen: 1) each incoming signal is multiplied by a weight and the weighted signals are summed, and 2) that sum is passed through an activation function. This function can be as simple as a rectifier, f(x) = max(0, x). This function is actually quite popular, and is typically referred to as "ReLU" (Rectified Linear Unit). The output of the activation function becomes the output of the node.
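As an illustration, here's what a single node's computation looks like in numpy. The weights and bias are made-up values, not anything the network would actually learn:

```python
import numpy as np

def relu(z):
    # The rectifier: f(x) = max(0, x)
    return np.maximum(0.0, z)

def node(inputs, weights, bias):
    # 1) weighted sum of the incoming signals, 2) pass it through the activation
    return relu(np.dot(inputs, weights) + bias)

# Made-up example: two inputs, two weights, one bias
print(node(np.array([0.5, 0.2]), np.array([0.8, -0.4]), 0.1))  # ~0.42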
Using the Keras library, the following code corresponds to the architecture described above:

```python
from keras.models import Sequential
from keras.layers import Activation, Dense

model = Sequential()
model.add(Dense(5, input_dim=2))   # hidden layer: 5 nodes, 2 inputs each
model.add(Activation('relu'))
model.add(Dense(1))                # output layer: a single brightness value
model.add(Activation('relu'))
```

When it comes to training this model, we need to define a loss function and an optimizer. Read up on those if you want, but don't worry too much about the details.
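If you want to sanity-check the architecture first, Keras can print a summary of the model:

```python
model.summary()  # should report 21 trainable parameters: 2*5 + 5 = 15 hidden, 5 + 1 = 6 output
```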
```python
model.compile(loss='mean_squared_error', optimizer='sgd')
```
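If you're curious what the mean squared error loss actually computes, here's a hand-rolled sketch with made-up values:

```python
import numpy as np

y_true = np.array([0.0, 0.5, 1.0])   # made-up targets
y_pred = np.array([0.1, 0.4, 0.8])   # made-up predictions

# Mean squared error: the average of the squared differences
mse = np.mean((y_true - y_pred) ** 2)
print(mse)  # ~0.02
```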
To train the model on the neatly prepared data, do

```python
model.fit(x, y, epochs=500)
```

epochs=500 means the training process will sweep over the data 500 times. The loss should go down from roughly 0.3-0.5 at the start to somewhere around 0.05-0.1.
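If you'd like to inspect the loss yourself, fit returns a history object, so you can capture it when you call it:

```python
history = model.fit(x, y, epochs=500)
print(history.history['loss'][0])   # loss after the first epoch
print(history.history['loss'][-1])  # loss after the last epoch
```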
To see what the neural network learned, do

```python
predicted_image = model.predict(x, verbose=False).reshape(image.shape)
print('\nPredicted image:')
print(predicted_image)
```

The result may not be perfect, but at least it's a start.
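Since the brightness values were divided by 255 during preprocessing, you can scale the predictions back up to compare them directly with the original image; a small sketch:

```python
# Undo the earlier division by 255 to get back to 0-255 brightness values
print(np.round(predicted_image * 255).astype(int))
```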
The code from this mini-tutorial is available here: tutorial.py

When you're ready to move on, take a look at the assignments.