TensorFlow implementation #7
I found your paper on initializing ResNets interesting, so I have implemented Fixup initialisation in TensorFlow. The implementation lets you build preactivation ResNets with bottleneck units by defining the blocks, the number of units in each block, and the depth of each unit. Training ResNet50 on CIFAR-10 with mixup for 200 epochs, I have been able to reach 93.5% accuracy.
I'd be keen to receive any feedback, in particular if I have made a mistake in the implementation, and I can make any amendments you think are necessary for it to be included.
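For reviewers checking the initialisation, the scaling rule from the Fixup paper can be sketched in a few lines. This is a minimal illustration of the rule only, not code taken from this pull request; `fixup_scale` and its parameter names are hypothetical. The rule rescales the (standard-initialised) weight layers inside each residual branch by L^(-1/(2m-2)), where L is the number of residual branches and m is the number of layers per branch, and sets the last layer of each branch to zero.

```python
def fixup_scale(num_branches: int, layers_per_branch: int) -> float:
    """Return the Fixup multiplier L ** (-1 / (2m - 2)).

    num_branches (L): number of residual branches in the network.
    layers_per_branch (m): number of weight layers inside each branch.
    Both names are hypothetical, chosen for this sketch.
    """
    if num_branches < 1:
        raise ValueError("need at least one residual branch")
    if layers_per_branch < 2:
        raise ValueError("the rule needs at least two layers per branch")
    return num_branches ** (-1.0 / (2 * layers_per_branch - 2))


# ResNet50 has 3 + 4 + 6 + 3 = 16 bottleneck units, each with three conv layers.
scale = fixup_scale(num_branches=16, layers_per_branch=3)
print(scale)
```

For a bottleneck ResNet50 this gives 16^(-1/4), i.e. a factor of one half applied to the non-final conv layers of every unit.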
The code is organised into two main parts.
First, the network is built in fixup_resnet.py. To build a network you must first create a FixupResnet object:
then define the type of ResNet it will be by adding blocks. For example, to define the ResNet50 from the original preactivation paper you would do:
So far we have only defined the configuration of the network. To actually build the graph we need to do:
where the image could of course have different dimensions, or be whatever tensor we want to feed into the network.
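The configure-then-build pattern described above can be illustrated with a small, framework-free sketch. This is not the API of fixup_resnet.py: `ResnetConfigSketch`, `BlockSpec`, `add_block`, and all parameter names are hypothetical, and the block strides are an assumption. It only records the block layout and derives a few quantities from it, without building a TensorFlow graph.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class BlockSpec:
    num_units: int   # bottleneck units in this block
    base_depth: int  # bottleneck width; the unit outputs 4 * base_depth channels
    stride: int      # spatial downsampling applied once in this block (assumed)


@dataclass
class ResnetConfigSketch:
    """Hypothetical stand-in: stores the configuration, builds nothing."""
    blocks: List[BlockSpec] = field(default_factory=list)

    def add_block(self, num_units: int, base_depth: int, stride: int) -> "ResnetConfigSketch":
        self.blocks.append(BlockSpec(num_units, base_depth, stride))
        return self

    def num_residual_units(self) -> int:
        return sum(b.num_units for b in self.blocks)

    def num_weight_layers(self) -> int:
        # Three convs per bottleneck unit, plus the stem conv and the classifier.
        return 3 * self.num_residual_units() + 2

    def output_shape(self, height: int, width: int) -> Tuple[int, int, int]:
        # Ignores any stem downsampling; assumes "same" padding (ceil division).
        for b in self.blocks:
            height = -(-height // b.stride)
            width = -(-width // b.stride)
        return height, width, 4 * self.blocks[-1].base_depth


# ResNet50 layout: 3, 4, 6, 3 bottleneck units with base depths 64 to 512.
config = ResnetConfigSketch()
config.add_block(3, 64, 1).add_block(4, 128, 2).add_block(6, 256, 2).add_block(3, 512, 2)
print(config.num_residual_units(), config.num_weight_layers())
print(config.output_shape(32, 32))
```

Keeping configuration separate from graph construction, as the description does, means the same block layout can be reused for inputs of different sizes.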
Second, the code to train on CIFAR-10 is in train_estimator.py, which uses the Estimator class to run training and evaluation.
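Since the reported accuracy was obtained with mixup, a minimal sketch of that augmentation may help readers reproduce the setup. This is a plain-Python illustration of the general technique, not the code in train_estimator.py; `mixup_pair` is a hypothetical helper, and `alpha=1.0` is an assumed value, as the description does not state which one was used.

```python
import random
from typing import List, Sequence, Tuple


def mixup_pair(
    x1: Sequence[float], y1: Sequence[float],
    x2: Sequence[float], y2: Sequence[float],
    alpha: float = 1.0,  # assumed value, not taken from the pull request
) -> Tuple[List[float], List[float], float]:
    """Blend two examples and their one-hot labels with a Beta(alpha, alpha) weight."""
    lam = random.betavariate(alpha, alpha)
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam


# Two toy "images" (flattened) with one-hot labels for two classes.
mixed_x, mixed_y, lam = mixup_pair([0.0, 1.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0])
print(lam, mixed_x, mixed_y)
```

In a real input pipeline the same blend would be applied per batch, typically by mixing a batch with a shuffled copy of itself.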