neuralnetwork

About

This repo is my implementation of a basic feedforward neural network with stochastic gradient descent and a convolutional neural network, to help me learn the theory behind them. I consulted various research papers in order to learn and implement optimizations to make the neural net perform better. I've included comments in the code explaining where certain math comes from, and why it works.

This model achieved 91.25% accuracy (9125/10000 test images) on the well-known MNIST dataset using the convolutional neural network.

Optimizations

He initialization function

... leads to a zero-mean Gaussian distribution whose standard deviation (std) is sqrt(2/n_l). This is our way of initialization. We also initialize b = 0. [1]

Following this, I included code in initializations.py for He initialization and used it to initialize the weights matrix in linear.py as a Gaussian distribution with std sqrt(2/n), with the biases set to 0.

initializations.py

import numpy as np

def he(shape, input_size):
    return np.random.randn(*shape) * np.sqrt(2 / input_size)

linear.py

self.weights = he((neurons, input_size), input_size)
self.biases = np.zeros((neurons, 1))
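
A quick way to sanity-check this (a standalone sketch with illustrative layer sizes, not part of the repo) is to sample a large He-initialized matrix and compare its empirical statistics to the target zero mean and std sqrt(2/input_size):

import numpy as np

def he(shape, input_size):
    return np.random.randn(*shape) * np.sqrt(2 / input_size)

input_size = 784                          # e.g. a flattened 28x28 MNIST image
w = he((128, input_size), input_size)
print(w.mean())                           # close to 0 (zero-mean Gaussian)
print(w.std(), np.sqrt(2 / input_size))   # both close to 0.0505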

Xavier initialization function

W ∼ U[-sqrt(6) / sqrt(n_j + n_{j+1}), sqrt(6) / sqrt(n_j + n_{j+1})] [2]

This says that the weights are initialized from a uniform distribution bounded by these values, where n_j is the number of inputs and n_{j+1} is the number of outputs of the layer. I included code in initializations.py for Xavier initialization and used it in linear.py.

initializations.py

def xavier(shape, input_size, output_size):
    b = np.sqrt(6 / (input_size + output_size))
    return np.random.uniform(-b, b, size=shape)

linear.py

self.weights = xavier((neurons, input_size), input_size, neurons)
self.biases = np.zeros((neurons, 1))
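
As with He initialization, a standalone check (illustrative layer sizes, not the repo's code) can confirm the samples stay inside the bound and have the variance the formula implies, since Var(U(-b, b)) = b^2/3 = 2/(n_j + n_{j+1}):

import numpy as np

def xavier(shape, input_size, output_size):
    b = np.sqrt(6 / (input_size + output_size))
    return np.random.uniform(-b, b, size=shape)

input_size, neurons = 784, 128
w = xavier((neurons, input_size), input_size, neurons)
bound = np.sqrt(6 / (input_size + neurons))
print(np.abs(w).max() <= bound)               # True: every sample lies within the bound
print(w.var(), 2 / (input_size + neurons))    # both close to 0.0022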

Momentum with Stochastic Gradient Descent

v_{t+1} = µ v_t − ε ∇f(θ_t)
θ_{t+1} = θ_t + v_{t+1}
[3]

where v_t is how much we shift the weights at the t-th iteration, θ_t is the weights, ∇f(θ_t) is the gradient with respect to the weights, µ is the momentum coefficient, and ε is the learning rate.

I implemented this in linear.py by storing momentum attributes for the weights and biases, updating them on every iteration, and using them to update the weights and biases. I used 0.9 as the momentum coefficient. Momentum is implemented the same way in convolutional.py for the filters and biases.

# accumulate velocity: v_{t+1} = µ v_t − ε ∇f(θ_t), with µ = 0.9
self.weights_momentum = 0.9 * self.weights_momentum - learning_rate * weights_gradient
self.biases_momentum = 0.9 * self.biases_momentum - learning_rate * output_gradient

# apply the velocity to the parameters: θ_{t+1} = θ_t + v_{t+1}
self.weights = self.weights + self.weights_momentum
self.biases = self.biases + self.biases_momentum
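
To see why the velocity term helps, here is a minimal standalone sketch (a toy quadratic objective, not the repo's training loop) that applies exactly the same two-step update:

import numpy as np

mu, lr = 0.9, 0.01
theta = np.array([5.0, -3.0])      # parameters, standing in for the weights
v = np.zeros_like(theta)           # momentum buffer, initialized to 0

for t in range(200):
    grad = 2 * theta               # gradient of f(theta) = ||theta||^2
    v = mu * v - lr * grad         # v_{t+1} = µ v_t − ε ∇f(θ_t)
    theta = theta + v              # θ_{t+1} = θ_t + v_{t+1}

print(theta)                       # close to the minimum at [0, 0]

The velocity accumulates gradient components that point in a consistent direction, which is what lets a small learning rate still make steady progress.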

References:

[1] He, K., Zhang, X., Ren, S., & Sun, J. (2015). Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification. In Proceedings of the IEEE International Conference on Computer Vision (pp. 1026-1034). https://doi.org/10.48550/arXiv.1502.01852

[2] Glorot, X., & Bengio, Y. (2010). Understanding the difficulty of training deep feedforward neural networks. In Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, Proceedings of Machine Learning Research 9:249-256. https://proceedings.mlr.press/v9/glorot10a.html

[3] Sutskever, I., Martens, J., Dahl, G., & Hinton, G. (2013). On the importance of initialization and momentum in deep learning. In Proceedings of the 30th International Conference on Machine Learning, Proceedings of Machine Learning Research 28(3):1139-1147. https://proceedings.mlr.press/v28/sutskever13.html
