
@ben-davidson-6 ben-davidson-6 commented Jun 7, 2019

I found your paper on initializing ResNets interesting, so I have implemented Fixup initialization in TensorFlow. The implementation lets you build pre-activation ResNets with bottleneck units by defining the blocks, the number of units in each block, and the depth (filter count) of each unit. Training on CIFAR-10 with ResNet50 and mixup for 200 epochs, I have been able to reach 93.5% accuracy.
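For context when reviewing, here is a minimal sketch (not the code in this PR) of the Fixup rule as I understand it from the paper; the layer shapes below are made up for illustration:

```python
import tensorflow as tf

# Fixup in brief: the last weight layer of each residual branch is
# zero-initialized, the remaining branch layers are He-initialized and then
# rescaled by L**(-1/(2m - 2)), where L is the number of residual branches
# and m the number of weight layers per branch; scalar biases and
# multipliers stand in for batch norm.
def fixup_branch_scale(num_branches, layers_per_branch):
    return num_branches ** (-1.0 / (2 * layers_per_branch - 2))

# e.g. 16 bottleneck branches of 3 convolutions each for the ResNet50
# configuration shown further down.
scale = fixup_branch_scale(num_branches=16, layers_per_branch=3)

# Rescaling the weights by `scale` is the same as rescaling the He
# variance (scale=2.0) by scale**2.
inner_kernel = tf.get_variable(
    'inner_conv_kernel', [3, 3, 64, 64],
    initializer=tf.variance_scaling_initializer(scale=2.0 * scale ** 2))
last_kernel = tf.get_variable(
    'last_conv_kernel', [1, 1, 64, 256],
    initializer=tf.zeros_initializer())
```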

I'd be keen to hear any feedback you might have in case I have made a mistake in the implementation, and I'm happy to make any amendments you think are necessary for this to be included.

The code is laid out in two main chunks.

First, the ResNet construction is done in fixup_resnet.py. To build a network you first create a FixupResnet object:

```python
resnet = FixupResnet(classes=10)
```

then define the type of ResNet it will be by adding blocks. For example, to define the ResNet50 from the original pre-activation paper you would do:

```python
resnet.add_block(units=3, depth=64)
resnet.add_block(units=4, depth=128)
resnet.add_block(units=6, depth=256)
resnet.add_block(units=3, depth=512)
```
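(These four blocks give 3 + 4 + 6 + 3 = 16 bottleneck units; at three convolutions per unit, plus the stem convolution and the final classifier, that accounts for ResNet50's 50 weighted layers.) Other variants follow the same pattern; for example, a ResNet101-style configuration with the same API would be:

```python
resnet101 = FixupResnet(classes=10)
resnet101.add_block(units=3, depth=64)
resnet101.add_block(units=4, depth=128)
resnet101.add_block(units=23, depth=256)
resnet101.add_block(units=3, depth=512)
```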

So far we have only defined the configuration of the network; to actually build the graph we do:

```python
image = tf.placeholder(tf.float32, [None, 32, 32, 3])
logits, probabilities = resnet.build_network(image)
```

where the image placeholder could of course have different dimensions, or be replaced by whatever tensor we want to feed into the network.
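As a quick sanity check, the built graph can be run in a session as usual; this is just a sketch, with a random batch standing in for real CIFAR-10 data:

```python
import numpy as np

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    fake_batch = np.random.rand(8, 32, 32, 3).astype(np.float32)
    batch_probs = sess.run(probabilities, feed_dict={image: fake_batch})
```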

Second, the code to train on CIFAR-10 is in train_estimator.py, which uses the tf.estimator API to run training and evaluation.
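For anyone who wants the shape of that wiring without opening the file, a minimal model_fn/Estimator sketch would look roughly like this (hypothetical, not the actual contents of train_estimator.py):

```python
import tensorflow as tf
from fixup_resnet import FixupResnet

def model_fn(features, labels, mode):
    # Build the ResNet50 configuration described above.
    resnet = FixupResnet(classes=10)
    for units, depth in [(3, 64), (4, 128), (6, 256), (3, 512)]:
        resnet.add_block(units=units, depth=depth)
    logits, probabilities = resnet.build_network(features)
    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(mode, predictions=probabilities)
    loss = tf.reduce_mean(
        tf.nn.sparse_softmax_cross_entropy_with_logits(
            labels=labels, logits=logits))
    if mode == tf.estimator.ModeKeys.TRAIN:
        train_op = tf.train.MomentumOptimizer(0.1, 0.9).minimize(
            loss, global_step=tf.train.get_or_create_global_step())
        return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)
    return tf.estimator.EstimatorSpec(mode, loss=loss)  # EVAL

estimator = tf.estimator.Estimator(model_fn)
```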
