diff --git a/.gitignore b/.gitignore new file mode 100644 index 0000000..b81e1c5 --- /dev/null +++ b/.gitignore @@ -0,0 +1,6 @@ +torch2caffe/*.pyc +*.t7b +*.caffemodel +*.prototxt +*.csv + diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md deleted file mode 100644 index e2e8d6d..0000000 --- a/CONTRIBUTING.md +++ /dev/null @@ -1,42 +0,0 @@ -# Contributing to `fb-caffe-exts` -We want to make contributing to this project as easy and transparent as -possible. - -## Our Development Process -We sync from an internal codebase, which is the source of truth for this -project. We will apply pull requests to this codebase. - -## Pull Requests -We actively welcome your pull requests. - -1. Fork the repo and create your branch from `master`. -2. If you've added code that should be tested, add tests. -3. If you've changed APIs, update the documentation. -4. Ensure the test suite passes. -5. Make sure your code lints. -6. If you haven't already, complete the Contributor License Agreement ("CLA"). - -## Contributor License Agreement ("CLA") -In order to accept your pull request, we need you to submit a CLA. You only need -to do this once to work on any of Facebook's open source projects. - -Complete your CLA here: - -## Issues -We use GitHub issues to track public bugs. Please ensure your description is -clear and has sufficient instructions to be able to reproduce the issue. - -Facebook has a [bounty program](https://www.facebook.com/whitehat/) for the safe -disclosure of security bugs. In those cases, please go through the process -outlined on that page and do not file a public issue. - -## Coding Style -* 2 spaces for indentation rather than tabs -* 80 character line length -* Lua files should pass `luacheck` -* Python files should pass `flake8` -* C++ files should be run through `clang-format` - -## License -By contributing to `fb-caffe-exts` you agree that your contributions will be licensed -under its BSD license. 
diff --git a/README.md b/README.md new file mode 100644 index 0000000..b38b31c --- /dev/null +++ b/README.md @@ -0,0 +1,49 @@ +# Torch to Caffe Model Converter
+(forked from [Cysu's branch](https://github.com/Cysu/fb-caffe-exts), originally from [fb-caffe-exts](https://github.com/facebook/fb-caffe-exts))
+
+:x: **I have stopped maintaining this repo. If you still want to use the converter, please use the AWS option:**
+
+The easiest option is to launch the AWS EC2 g2.2xlarge instance I created. Choose the N. California server and search for the instance name FB-Torch2Caffe (ami-03542e63). (You can follow the [AWS tutorial](http://cs231n.github.io/aws-tutorial/).)
+
+### Get Started
+
+0. Please make sure Torch and Caffe (with pycaffe and the Python layer) are correctly installed.
+0. Download the code and install the dependencies (:x: **If you still would like to set it up on your own workstation, please follow the steps here to install the dependencies ([fbcunn/install](https://github.com/facebook/fbcunn/blob/master/INSTALL.md)). Good luck!**)
+
+   ```bash
+   git clone https://github.com/zhanghang1989/fb-caffe-exts.git
+   sudo bash install-dep.sh
+   ```
+
+0. Add the environment variables (change the paths for your own machine). Note the escaped `\$PYTHONPATH`, so the variable is expanded when `.bashrc` runs rather than frozen at `echo` time:
+
+   ```bash
+   echo "export LD_PRELOAD=/path/to/libcaffe.so; export PYTHONPATH=\$PYTHONPATH:/path/to/caffe/python/:/path/to/fb-caffe-exts/;" >>~/.bashrc
+   source ~/.bashrc
+   ```
+
+0. Convert your first model:
+
+   ```bash
+   th convert.lua torch_model.t7b
+   ```
+
+0. Or customize the conversion:
+
+   ```bash
+   th torch2caffe/torch2caffe.lua --input torch_model.t7b --preprocessing prepnv.lua --prototxt name.prototxt --caffemodel name.caffemodel --input_dims 1 3 224 224
+   ```
+
+### Layers We Added Support For
+0. ``ELU``
+0. ``SpatialDropout`` We scale the weights of the preceding layer by (1-p) to hide the difference between Torch and Caffe.
+0. ``SpatialMaxPooling`` It behaves slightly differently in Torch and Caffe.
Torch computes the output size with floor((n - k)/s) + 1 while Caffe rounds up with ceil, so the two disagree whenever the division is inexact. Therefore, only the conversion of even feature-map sizes is supported.
+0. ``SpatialBatchNormalization`` Caffe's BatchNorm layer doesn't have a bias term, so we only support non-affine BN. Alternatively, you can convert it into a customized version of BN as in [Cysu's branch](https://github.com/Cysu/fb-caffe-exts).
+
+### Known Issues
+0. ``LD_PRELOAD`` crashes your gedit. (If you know how to fix it, please update this wiki.)
+0. The OpenCV package that Caffe relies on may cause a "libdc1394 failed to initialize" error; just create a fake device:
+   ```bash
+   sudo ln /dev/null /dev/raw1394
+   ```
+
diff --git a/README.org b/README.org deleted file mode 100644 index abf5604..0000000 --- a/README.org +++ /dev/null @@ -1,126 +0,0 @@
-* =fb-caffe-exts=
-=fb-caffe-exts= is a collection of extensions developed at FB while using Caffe
-in (mainly) production scenarios.
-
-** =predictor/=
-A simple C++ library that wraps the common pattern of running a =caffe::Net= in
-multiple threads while sharing weights. It also provides a slightly more
-convenient usage API for the inference case.
-
-#+BEGIN_SRC c++
-  #include "caffe/predictor/Predictor.h"
-
-  // In your setup phase
-  predictor_ = folly::make_unique(FLAGS_prototxt_path,
-                                  FLAGS_weights_path);
-
-  // When calling in a worker thread
-  static thread_local caffe::Blob input_blob;
-  input_blob.set_cpu_data(input_data); // avoid the copy.
-  const auto& output_blobs = predictor_->forward({&input_blob});
-  return output_blobs[FLAGS_output_layer_name];
-#+END_SRC
-
-Of note is the =predictor/Optimize.{h,cpp}=, which optimizes memory
-usage by automatically reusing the intermediate activations when this is safe.
-This reduces the amount of memory required for intermediate activations by
-around 50% for AlexNet-style models, and around 75% for GoogLeNet-style
-models.
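The ``SpatialMaxPooling`` caveat listed earlier can be checked numerically. Here is a small standalone sketch (not part of the converter), assuming the usual conventions of Torch's default pooling rounding the output size down and Caffe rounding it up:

```python
import math

def torch_pool_out(n, k, s):
    # Torch SpatialMaxPooling (default mode): round down
    return int(math.floor((n - k) / s)) + 1

def caffe_pool_out(n, k, s):
    # Caffe PoolingLayer: round up
    return int(math.ceil((n - k) / s)) + 1

# kernel 2, stride 2: even input sizes agree, odd ones differ by a column
for n in (6, 7, 8):
    print(n, torch_pool_out(n, 2, 2), caffe_pool_out(n, 2, 2))
```

For a 7x7 feature map with a 2x2, stride-2 pool, Torch produces 3x3 while Caffe produces 4x4, which is why only even feature-map sizes convert cleanly.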
- -We can plot each set of activations in the topological ordering of the network, -with a unique color for each reused activation buffer, with the height of the -blob proportional to the size of the buffer. - -For example, in an AlexNet-like model, the allocation looks like -#+ATTR_HTML: :height 300px -[[./doc/caffenet.png]] - -A corresponding allocation for GoogLeNet looks like -#+ATTR_HTML: :height 300px -[[./doc/googlenet.png]] - - -The idea is essentially linear scan register allocation. We - -- compute a set of "live ranges" for each =caffe::SyncedMemory= (due to sharing, - we can't do this at a =caffe::Blob= level) -- compute a set of live intervals, and schedule each =caffe::SyncedMemory= in a - non-overlapping fashion onto each live interval -- allocate a canonical =caffe::SyncedMemory= buffer for each live interval -- Update the blob internal pointers to point to the canonical buffer - -Depending on the model, the buffer reuse can also lead to some non-trivial -performance improvements at inference time. - -To enable this just pass =Predictor::Optimization::MEMORY= to the =Predictor= -constructor. - -** =torch2caffe/= -A library for converting pre-trained Torch models to the equivalent Caffe models. - -=torch_layers.lua= describes the set of layers that we can automatically -convert, and =test.lua= shows some examples of more complex models being -converted end to end. - -For example, complex CNNs ([[http://arxiv.org/abs/1409.4842][GoogLeNet]], etc), deep LSTMs (created in [[https://github.com/torch/nngraph][nngraph]]), -models with tricky parallel/split connectivity structures ([[http://arxiv.org/abs/1103.0398][Natural Language -Processing (almost) from Scratch]]), etc. 
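The linear-scan scheduling described above (compute live ranges, then pack non-overlapping ranges onto shared buffers) can be sketched in a few lines. This is a toy illustration with made-up blob names and intervals, not the actual C++ in =predictor/Optimize.cpp=:

```python
# Toy liveness intervals in topological order: (blob, first_def, last_use).
ranges = [("conv1", 0, 1), ("pool1", 1, 2), ("conv2", 2, 3), ("pool2", 3, 4)]

def assign(ranges):
    """Greedy linear scan: place each range on the first buffer whose
    last tenant's interval ends strictly before this range is defined."""
    buffers = []
    for rng in sorted(ranges, key=lambda r: r[2]):  # by last use
        for buf in buffers:
            if rng[1] > buf[-1][2]:  # no overlap with the buffer's last tenant
                buf.append(rng)
                break
        else:
            buffers.append([rng])  # no compatible buffer: allocate a new one
    return buffers

# Four intermediate activations fold onto two reusable buffers.
for i, buf in enumerate(assign(ranges)):
    print(i, [name for name, _, _ in buf])
```

Each returned list corresponds to one canonical buffer sized to its largest tenant, which is where the memory savings come from.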
- -This can be invoked as - -#+BEGIN_EXAMPLE - ∴ th torch2caffe/torch2caffe.lua --help - --input (default "") Input model file - --preprocessing (default "") Preprocess the model - --prototxt (default "") Output prototxt model file - --caffemodel (default "") Output model weights file - --format (default "lua") Format: lua | luathrift - --input-tensor (default "") (Optional) Predefined input tensor - --verify (default "") (Optional) Verify existing - (number) Input dimensions (e.g. 10N x 3C x 227H x 227W) - -#+END_EXAMPLE - - -This works by - -- (optionally) preprocessing the model provided in =--input=, (folding - BatchNormalization layers into the preceding layer, etc), -- walking the Torch module graph of the model provide in =--input=, -- converting it to the equivalent Caffe module graph, -- copying the weights into the Caffe model, -- Running some test inputs (of size =input_dims...=) through both models and - verifying the outputs are identical. -** =conversions/= -A simple CLI tool for running some simple Caffe network transformations. - -#+BEGIN_EXAMPLE - ∴ python conversions.py vision --help - Usage: conversions.py vision [OPTIONS] - - Options: - --prototxt TEXT [required] - --caffemodel TEXT [required] - --output-prototxt TEXT [required] - --output-caffemodel TEXT [required] - --help Show this message and exit. -#+END_EXAMPLE - -The main usage at the moment is automating the [[https://github.com/BVLC/caffe/blob/master/examples/net_surgery.ipynb][Net Surgery]] notebook. - - -** Building and Installing -As you might expect, this library depends on an up-to-date [[http://caffe.berkeleyvision.org/][BVLC Caffe]] installation. - -The additional dependencies are - -- The C++ libraries require [[https://github.com/facebook/folly][folly]]. -- The Python =conversions= libraries requires [[http://click.pocoo.org/5/][click]]. - -You can drop the C++ components into an existing Caffe installation. 
We'll -update the repo with an example modification to an existing =Makefile.config= -and a =CMake= based solution. - -** Contact -Feel free to open issues on this repo for requests/bugs, or contact [[mailto:tulloch@fb.com][Andrew -Tulloch]] directly. diff --git a/clean.sh b/clean.sh new file mode 100755 index 0000000..a0a8231 --- /dev/null +++ b/clean.sh @@ -0,0 +1,3 @@ +rm *.caffemodel +rm *.prototxt +rm *t7b diff --git a/conversions/conversions.py b/conversions/conversions.py deleted file mode 100644 index 5c9504f..0000000 --- a/conversions/conversions.py +++ /dev/null @@ -1,370 +0,0 @@ -""" -Copyright (c) 2015-present, Facebook, Inc. -All rights reserved. - -This source code is licensed under the BSD-style license found in the -LICENSE file in the root directory of this source tree. An additional grant -of patent rights can be found in the PATENTS file in the same directory. -""" -from __future__ import absolute_import -from __future__ import division -from __future__ import print_function -from __future__ import unicode_literals - -import logging -import itertools -import os -import tempfile - -import click -import numpy as np - -import caffe -import caffe.proto.caffe_pb2 as pb2 - -import google.protobuf.text_format - - -log = logging.getLogger(__name__) - -THRESHOLD = 1E-3 - -# TODO - refactor this in to a sequence of (prototxt, caffemodel) -> -# (prototxt, caffemodel) passes. 
- - -def flatmap(f, items): - return itertools.chain.from_iterable(itertools.imap(f, items)) - - -def load_prototxt(params_file): - params = pb2.NetParameter() - with open(params_file) as f: - google.protobuf.text_format.Merge(f.read(), params) - return params - - -def convert_fc_layer(net, fc_layer): - conv_layer = pb2.LayerParameter() - conv_layer.name = "{}_conv".format(fc_layer.name) - conv_layer.type = "Convolution" - conv_layer.bottom.extend(list(fc_layer.bottom)) - conv_layer.top.extend(list(fc_layer.top)) - # get input - assert len(fc_layer.bottom) == 1 - bottom_name = fc_layer.bottom[0] - bottom_shape = list(net.blobs[bottom_name].shape) - if len(bottom_shape) == 2: - bottom_shape.extend([1, 1]) - - num_output = net.params[fc_layer.name][0].data.shape[0] - assert bottom_shape[-1] == bottom_shape[-2], bottom_shape - conv_layer.convolution_param.kernel_size = bottom_shape[-1] - conv_layer.convolution_param.num_output = num_output - return conv_layer - - -def convert_fc_prototxt(params_file): - params = load_prototxt(params_file) - - def find_layer(name): - (layer,) = [l for l in params.layer if l.name == name] - return layer - - def f(layer): - if layer.type == "Flatten": - new_layer = pb2.LayerParameter() - new_layer.CopyFrom(layer) - new_layer.type = "Reshape" - new_layer.reshape_param.shape.dim.extend([0, 0, 0, 0]) - return new_layer - if layer.type == "InnerProduct": - new_layer = pb2.LayerParameter() - new_layer.CopyFrom(layer) - new_layer.inner_product_param.axis = 1 - return new_layer - return layer - - new_layers = [f(l) for l in params.layer] - new_params = pb2.NetParameter() - new_params.CopyFrom(params) - del new_params.layer[:] - new_params.layer.extend(new_layers) - return new_params - - -def convert_spatial_prototxt(params_file): - net = caffe.Net(str(params_file), caffe.TEST) - params = load_prototxt(params_file) - - def find_layer(name): - (layer,) = [l for l in params.layer if l.name == name] - return layer - - def f(layer): - if layer.type 
!= "InnerProduct": - return [layer] - return [convert_fc_layer(net, layer)] - - new_layers = flatmap(f, params.layer) - new_params = pb2.NetParameter() - new_params.CopyFrom(params) - del new_params.layer[:] - new_params.layer.extend(new_layers) - return new_params - - -def convert_spatial_net(spatial_params_file, spatial_weights_file, - conv_params_file): - spatial_net = caffe.Net( - str(spatial_params_file), str(spatial_weights_file), caffe.TEST) - # Initialize from the SPATIAL layer - conv_net = caffe.Net( - str(conv_params_file), str(spatial_weights_file), caffe.TEST) - spatial_params = load_prototxt(spatial_params_file) - - converted_layer_names = [ - (layer.name, convert_fc_layer(spatial_net, layer).name) - for layer in spatial_params.layer - if layer.type == "InnerProduct" - ] - - for layer_pair in converted_layer_names: - log.info("Converting layer pair: %s", layer_pair) - (spatial_layer_name, conv_layer_name) = layer_pair - spatial_params = spatial_net.params[spatial_layer_name] - conv_params = conv_net.params[conv_layer_name] - - assert len(spatial_params) == len(conv_params) - for spatial_param, conv_param in zip(spatial_params, conv_params): - log.info("Spatial Layer: %s - %s, Conv Layer: %s - %s", - spatial_layer_name, spatial_param.data.shape, - conv_layer_name, conv_param.data.shape) - assert(conv_param.data.size == spatial_param.data.size) - conv_param.data.flat = spatial_param.data.flat - return spatial_net, conv_net - - -def verify_equivalent(fc_net, conv_net): - log.info("Verifying convnets") - input_names = fc_net.inputs - log.info("Running on inputs: %s", input_names) - inputs = { - input_name: np.random.random( - size=tuple(list(fc_net.blobs[input_name].shape))) - for input_name in input_names} - - fc_outputs = fc_net.forward(**inputs) - conv_outputs = conv_net.forward(**inputs) - # Verify convolutional model works - for k, conv_output in conv_outputs.iteritems(): - log.info("%s: %s", k, conv_output.shape) - fc_output = fc_outputs[k] - delta = 
np.amax(np.abs(conv_output.flatten() - fc_output.flatten())) - log.info("Maximum delta: %s", delta) - if delta < THRESHOLD: - log.info("Delta: %s < threshold: %s", delta, THRESHOLD) - continue - - log.info("Conv output: %s", conv_output.flatten()) - log.info("FC output: %s", fc_output.flatten()) - for ((fcn, fcb), (cnn, cnb)) in zip( - list(fc_net.blobs.iteritems()), - list(conv_net.blobs.iteritems())): - log.info("FCN: %s - %s, CNN: %s - %s", - fcn, fcb.data.shape, cnn, cnb.data.shape) - log.info(np.amax(np.abs(fcb.data.flatten() - cnb.data.flatten()))) - raise Exception("Failed to precisely convert models") - - -@click.group() -def cli(): - pass - - -@cli.command() -@click.option("--conv-prototxt", type=str, required=True) -@click.option("--output-scanning-prototxt", type=str, required=True) -def scanning(conv_prototxt, output_scanning_prototxt): - """ - Add a scanning layer on top of all softmax layers, so we max-pool - the class probabilities over spatial locations. - """ - conv_params = load_prototxt(conv_prototxt) - - def add_scanning(layer): - if layer.type != "Softmax": - return [layer] - scanning_layer = pb2.LayerParameter() - scanning_layer.name = "{}_scanning".format(layer.name) - scanning_layer.bottom.extend(layer.top) - scanning_layer.top.extend([scanning_layer.name]) - scanning_layer.type = "Pooling" - scanning_layer.pooling_param.pool = pb2.PoolingParameter.MAX - scanning_layer.pooling_param.global_pooling = True - return [layer, scanning_layer] - - scanning_layers = flatmap(add_scanning, conv_params.layer) - scanning_params = pb2.NetParameter() - scanning_params.CopyFrom(conv_params) - del scanning_params.layer[:] - scanning_params.layer.extend(scanning_layers) - scanning_prototxt = tempfile.NamedTemporaryFile( - dir=os.path.dirname(output_scanning_prototxt), - delete=False).name - with open(scanning_prototxt, "w") as f: - f.write(google.protobuf.text_format.MessageToString(scanning_params)) - # Verify the net loads with the scanning change. 
- caffe.Net(str(scanning_prototxt), caffe.TEST) - log.info("Moving: %s to %s", scanning_prototxt, output_scanning_prototxt) - os.rename(scanning_prototxt, output_scanning_prototxt) - - -@cli.command() -@click.option("--fc-prototxt", type=str, required=True) -@click.option("--fc-caffemodel", type=str, required=True) -@click.option("--output-spatial-prototxt", type=str, required=True) -@click.option("--output-spatial-caffemodel", type=str, required=True) -def spatial(fc_prototxt, fc_caffemodel, output_spatial_prototxt, - output_spatial_caffemodel): - """ - Remove `Flatten` layers to preserve the spatial structure - """ - logging.basicConfig(level=logging.INFO) - - spatial_net_params = convert_fc_prototxt(fc_prototxt) - spatial_prototxt = tempfile.NamedTemporaryFile( - dir=os.path.dirname(output_spatial_prototxt), - suffix=".spatial_prototxt", - delete=False).name - with open(spatial_prototxt, "w") as f: - f.write(google.protobuf.text_format.MessageToString( - spatial_net_params)) - log.info("Spatial params: %s", spatial_prototxt) - fc_net = caffe.Net(str(fc_prototxt), str(fc_caffemodel), caffe.TEST) - spatial_net = caffe.Net(str(spatial_prototxt), str(fc_caffemodel), - caffe.TEST) - verify_equivalent(fc_net, spatial_net) - - spatial_caffemodel = tempfile.NamedTemporaryFile( - dir=os.path.dirname(output_spatial_caffemodel), - suffix=".spatial_caffemodel", - delete=False).name - spatial_net.save(str(spatial_caffemodel)) - log.info("Moving: %s to %s", spatial_prototxt, output_spatial_prototxt) - os.rename(spatial_prototxt, output_spatial_prototxt) - log.info("Moving: %s to %s", spatial_caffemodel, output_spatial_caffemodel) - os.rename(spatial_caffemodel, output_spatial_caffemodel) - - -@cli.command() -@click.option("--spatial-prototxt", type=str, required=True) -@click.option("--spatial-caffemodel", type=str, required=True) -@click.option("--output-conv-prototxt", type=str, required=True) -@click.option("--output-conv-caffemodel", type=str, required=True) -def 
convolutional(spatial_prototxt, spatial_caffemodel, - output_conv_prototxt, output_conv_caffemodel): - """ - Convert all fully connected layers to convolutional layers. - """ - logging.basicConfig(level=logging.INFO) - - conv_net_params = convert_spatial_prototxt(spatial_prototxt) - conv_prototxt = tempfile.NamedTemporaryFile( - dir=os.path.dirname(output_conv_prototxt), - suffix=".conv_prototxt", - delete=False).name - with open(conv_prototxt, "w") as f: - f.write(google.protobuf.text_format.MessageToString(conv_net_params)) - log.info("Conv params: %s", conv_prototxt) - - (spatial_net, conv_net) = convert_spatial_net( - spatial_prototxt, spatial_caffemodel, conv_prototxt) - verify_equivalent(spatial_net, conv_net) - conv_caffemodel = tempfile.NamedTemporaryFile( - dir=os.path.dirname(output_conv_caffemodel), - suffix=".conv_caffemodel", - delete=False).name - conv_net.save(str(conv_caffemodel)) - - log.info("Moving: %s to %s", conv_prototxt, output_conv_prototxt) - os.rename(conv_prototxt, output_conv_prototxt) - log.info("Moving: %s to %s", conv_caffemodel, output_conv_caffemodel) - os.rename(conv_caffemodel, output_conv_caffemodel) - - -@cli.command() -@click.option("--conv-prototxt", type=str, required=True) -@click.option("--scale", type=float, multiple=True) -def scales(conv_prototxt, scale): - """ - Examine the network output dimensions across a series of input scales. 
- """ - logging.basicConfig(level=logging.INFO) - - net = caffe.Net(str(conv_prototxt), caffe.TEST) - input_names = net.inputs - input_shapes = { - input_name: tuple(net.blobs[input_name].shape) - for input_name in input_names} - - for scalar in scale: - log.info("Running on scale: %s", scalar) - - def perturb(i, n): - # only perturb HxW in NxCxHxW - if i in (2, 3): - return int(n * scalar) - return n - - inputs = { - input_name: np.random.random( - size=tuple( - perturb(i, n) - for (i, n) in enumerate(shape))) - for input_name, shape in input_shapes.iteritems()} - - for input_name, input in inputs.iteritems(): - log.info("Input: %s, shape: %s", input_name, input.shape) - net.blobs[input_name].reshape(*input.shape) - net.reshape() - conv_outputs = net.forward(**inputs) - for output_name, conv_output in conv_outputs.iteritems(): - log.info("%s: %s", output_name, conv_output.shape) - - -@cli.command() -@click.option("--prototxt", required=True) -@click.option("--caffemodel", required=True) -@click.option("--output-prototxt", required=True) -@click.option("--output-caffemodel", required=True) -@click.pass_context -def vision(ctx, prototxt, caffemodel, output_prototxt, output_caffemodel): - spatial_prototxt = tempfile.NamedTemporaryFile( - suffix=".spatial_prototxt", delete=False).name - spatial_caffemodel = tempfile.NamedTemporaryFile( - suffix=".spatial_caffemodel", delete=False).name - ctx.invoke(spatial, - fc_prototxt=prototxt, - fc_caffemodel=caffemodel, - output_spatial_prototxt=spatial_prototxt, - output_spatial_caffemodel=spatial_caffemodel) - conv_prototxt = tempfile.NamedTemporaryFile( - suffix=".conv_prototxt", delete=False).name - conv_caffemodel = tempfile.NamedTemporaryFile( - dir=os.path.dirname(output_caffemodel), - suffix=".conv_prototxt", delete=False).name - ctx.invoke(convolutional, - spatial_prototxt=spatial_prototxt, - spatial_caffemodel=spatial_caffemodel, - output_conv_prototxt=conv_prototxt, - output_conv_caffemodel=conv_caffemodel) - - 
ctx.invoke(scanning, - conv_prototxt=conv_prototxt, - output_scanning_prototxt=output_prototxt) - log.info("Moving: %s to %s", conv_caffemodel, output_caffemodel) - os.rename(conv_caffemodel, output_caffemodel) - -if __name__ == "__main__": - cli() diff --git a/convert.lua b/convert.lua new file mode 100644 index 0000000..0addd48 --- /dev/null +++ b/convert.lua @@ -0,0 +1,40 @@ +require 'nn'; +require 'cunn'; +require 'cudnn'; +require 'torch2caffe/prepnv.lua' +local t2c=require 'torch2caffe.lib' + +-- Figure out the path of the model and load it +local path = arg[1] +local basename = paths.basename(path, 't7b') +local ext = path:match("^.+(%..+)$") +local model = nil +if ext == '.t7b' then + model = torch.load(path) +elseif ext == '.txt' then + error('wrong model') +else + assert(false, "We assume models end in either .t7b or .txt") +end + +if model.net then + model = model.net +end +model2 = model:clone() +model=g_t2c_preprocess(model, opts) + +local function check(module, module2,input_dims) + module:apply(function(m) m:evaluate() end) + local opts = { + prototxt = string.format('%s.prototxt', basename), + caffemodel = string.format('%s.caffemodel', basename), + inputs={{name="data", input_dims=input_dims}}, + } + t2c.convert(opts, module) + t2c.compare(opts, module2) + return opts +end + + +check(model, model2, {1,3,224,224}) + diff --git a/convert_all.sh b/convert_all.sh new file mode 100755 index 0000000..c5a97ed --- /dev/null +++ b/convert_all.sh @@ -0,0 +1,6 @@ +export LD_PRELOAD=$HOME/caffe/.build_release/lib/libcaffe.so; + +for f in *.t7b +do + th convert.lua $f +done diff --git a/doc/caffenet.png b/doc/caffenet.png deleted file mode 100644 index 5a80e52..0000000 Binary files a/doc/caffenet.png and /dev/null differ diff --git a/doc/googlenet.png b/doc/googlenet.png deleted file mode 100644 index bd5c66c..0000000 Binary files a/doc/googlenet.png and /dev/null differ diff --git a/install-dep-14.04.sh b/install-dep-14.04.sh new file mode 100644 index 
0000000..9675614 --- /dev/null +++ b/install-dep-14.04.sh @@ -0,0 +1,126 @@ + +echo +echo This script will install fblualib and all its dependencies. +echo It has been tested on Ubuntu 13.10 and Ubuntu 14.04, Linux x86_64. +echo + +set -e +set -x + + +if [[ $(arch) != 'x86_64' ]]; then + echo "x86_64 required" >&2 + exit 1 +fi + +issue=$(cat /etc/issue) +extra_packages= +if [[ $issue =~ ^Ubuntu\ 14\.04 ]]; then + extra_packages=libiberty-dev +else + echo "Ubuntu 14.04 required" >&2 + exit 1 +fi + +dir=$(mktemp --tmpdir -d fblualib-build.XXXXXX) + +echo Working in $dir +echo +cd $dir + +echo Installing required packages +echo +sudo apt-get install -y \ + git \ + curl \ + wget \ + g++ \ + automake \ + autoconf \ + autoconf-archive \ + libtool \ + libboost-all-dev \ + libevent-dev \ + libdouble-conversion-dev \ + libgoogle-glog-dev \ + libgflags-dev \ + liblz4-dev \ + liblzma-dev \ + libsnappy-dev \ + make \ + zlib1g-dev \ + binutils-dev \ + libjemalloc-dev \ + $extra_packages \ + flex \ + bison \ + libkrb5-dev \ + libsasl2-dev \ + libnuma-dev \ + pkg-config \ + libssl-dev \ + libedit-dev \ + libmatio-dev \ + libpython-dev \ + libpython3-dev \ + python-numpy + +echo +echo Cloning repositories +echo +git clone -b v0.35.0 --depth 1 https://github.com/facebook/folly +git clone -b v0.24.0 --depth 1 https://github.com/facebook/fbthrift +git clone -b v1.0 https://github.com/facebook/thpp +git clone -b v1.0 https://github.com/facebook/fblualib + +echo +echo Building folly +echo + +cd $dir/folly/folly +autoreconf -ivf +./configure +make +sudo make install +sudo ldconfig + +echo +echo Building fbthrift +echo + +cd $dir/fbthrift/thrift +autoreconf -ivf +./configure +make +sudo make install + +echo +echo 'Installing TH++' +echo + +cd $dir/thpp/thpp +./build.sh + +echo +echo 'Installing FBLuaLib' +echo + +cd $dir/fblualib/fblualib +./build.sh +cd $dir/fblualib/fblualib/python +luarocks make rockspec/fbpython-0.1-1.rockspec + +echo +echo 'Almost done!' 
+echo
+
+git clone https://github.com/torch/nn && ( cd nn && git checkout getParamsByDevice && luarocks make rocks/nn-scm-1.rockspec )
+
+git clone https://github.com/facebook/fbtorch.git && ( cd fbtorch && luarocks make rocks/fbtorch-scm-1.rockspec )
+
+git clone https://github.com/facebook/fbnn.git && ( cd fbnn && luarocks make rocks/fbnn-scm-1.rockspec )
+
+
+echo
+echo 'All done!'
+echo
diff --git a/install-dep-16.04.sh b/install-dep-16.04.sh new file mode 100755 index 0000000..f05b0ee --- /dev/null +++ b/install-dep-16.04.sh @@ -0,0 +1,141 @@
+
+echo
+echo This script will install fblualib and all its dependencies.
+echo It has been tested on Ubuntu 16.04, Linux x86_64.
+echo
+
+set -e
+set -x
+
+
+if [[ $(arch) != 'x86_64' ]]; then
+  echo "x86_64 required" >&2
+  exit 1
+fi
+
+issue=$(cat /etc/issue)
+extra_packages=
+current=0
+if [[ $issue =~ ^Ubuntu\ 16\.04 ]]; then
+  extra_packages=libiberty-dev
+  current=1
+else
+  echo "Ubuntu 16.04 required" >&2
+  exit 1
+fi
+
+dir=$(mktemp --tmpdir -d fblualib-build.XXXXXX)
+
+echo Working in $dir
+echo
+cd $dir
+
+echo Installing required packages
+echo
+sudo apt-get install -y \
+  git \
+  curl \
+  wget \
+  g++ \
+  automake \
+  autoconf \
+  autoconf-archive \
+  libtool \
+  libboost-all-dev \
+  libevent-dev \
+  libdouble-conversion-dev \
+  libgoogle-glog-dev \
+  libgflags-dev \
+  liblz4-dev \
+  liblzma-dev \
+  libsnappy-dev \
+  make \
+  zlib1g-dev \
+  binutils-dev \
+  libjemalloc-dev \
+  $extra_packages \
+  flex \
+  bison \
+  libkrb5-dev \
+  libsasl2-dev \
+  libnuma-dev \
+  pkg-config \
+  libssl-dev \
+  libedit-dev \
+  libmatio-dev \
+  libpython-dev \
+  libpython3-dev \
+  python-numpy
+
+echo
+echo Cloning repositories
+echo
+
+git clone --depth 1 https://github.com/facebook/folly
+git clone --depth 1 https://github.com/facebook/fbthrift
+git clone https://github.com/facebook/thpp
+git clone https://github.com/facebook/fblualib
+git clone https://github.com/facebook/wangle
+
+echo
+echo Building folly
+echo
+
+cd $dir/folly/folly
+autoreconf -ivf +./configure +make +sudo make install +sudo ldconfig + +if [ $current -eq 1 ]; then + echo + echo Wangle + echo + + cd $dir/wangle/wangle + cmake . + make + sudo make install +fi + +echo +echo Building fbthrift +echo + +cd $dir/fbthrift/thrift +autoreconf -ivf +./configure +make +sudo make install + +echo +echo 'Installing TH++' +echo + +cd $dir/thpp/thpp +./build.sh + +echo +echo 'Installing FBLuaLib' +echo + +cd $dir/fblualib/fblualib +./build.sh +cd $dir/fblualib/fblualib/python +luarocks make rockspec/fbpython-0.1-1.rockspec + +echo +echo 'Almost done!' +echo + +git clone https://github.com/torch/nn && ( cd nn && git checkout getParamsByDevice && luarocks make rocks/nn-scm-1.rockspec ) + +git clone https://github.com/facebook/fbtorch.git && ( cd fbtorch && luarocks make rocks/fbtorch-scm-1.rockspec ) + +git clone https://github.com/facebook/fbnn.git && ( cd fbnn && luarocks make rocks/fbnn-scm-1.rockspec ) + + +echo +echo 'All done!' +echo diff --git a/predictor/Optimize.cpp b/predictor/Optimize.cpp deleted file mode 100644 index d4759b1..0000000 --- a/predictor/Optimize.cpp +++ /dev/null @@ -1,217 +0,0 @@ -/** - * Copyright (c) 2015-present, Facebook, Inc. - * All rights reserved. - * - * This source code is licensed under the BSD-style license found in the - * LICENSE file in the root directory of this source tree. An additional grant - * of patent rights can be found in the PATENTS file in the same directory. 
- */ -#include "Optimize.h" - -#include -#include - -#include - -#include -#include - -#include "caffe/net.hpp" -#include "caffe/syncedmem.hpp" - -namespace caffe { -namespace fb { - -namespace { - -constexpr int64_t kNotDefined = 0; -constexpr int64_t kNotUsed = -1; -constexpr int64_t kAlwaysLive = 10000; -constexpr int64_t kMinimumCountForSharing = 10000; - -struct LiveRange { - int64_t defined{kNotDefined}, used{kNotUsed}; -}; - -template -using Analysis = std::unordered_map; -template -using OrderedAnalysis = std::vector>; -using SyncedMemoryRange = std::pair; -using Assignment = std::vector; -using Assignments = std::vector; - -template -T& findOrInsert(OrderedAnalysis* analysis, SyncedMemory* needle) { - for (auto& kv : *analysis) { - if (kv.first == needle) { - return kv.second; - } - } - analysis->push_back({needle, T()}); - return analysis->back().second; -} - -OrderedAnalysis analyze(const caffe::Net& net) { - // Build up the liveness analysis by walking the SyncedMemory - // pointers attached the the blobs in the network. - const auto& bottoms = net.bottom_vecs(); - const auto& tops = net.top_vecs(); - OrderedAnalysis analysis; - for (int64_t i = 0; i < bottoms.size(); ++i) { - for (const auto* bottom : bottoms[i]) { - auto& range = findOrInsert(&analysis, bottom->data().get()); - if (range.used == kNotUsed) { - range.used = i; - continue; - } - range.used = std::max(range.used, i); - } - } - for (int64_t i = 0; i < tops.size(); ++i) { - for (const auto* top : tops[i]) { - auto& range = findOrInsert(&analysis, top->data().get()); - if (range.defined == kNotDefined) { - range.defined = i; - continue; - } - range.defined = std::min(range.defined, i); - } - } - for (const auto* input : net.input_blobs()) { - findOrInsert(&analysis, input->data().get()).defined = -kAlwaysLive; - findOrInsert(&analysis, input->data().get()).used = kAlwaysLive; - } - return analysis; -} - -// Is the candidate range compatible with this assignment? 
-bool isCompatible(const SyncedMemoryRange& candidate, - const Assignment& assignment) { - if (candidate.second.used == kNotUsed || - assignment.back().second.used == kNotUsed) { - return false; - } - if (candidate.first->size() <= kMinimumCountForSharing) { - return false; - } - CHECK_GE(assignment.size(), 1); - return candidate.second.defined > assignment.back().second.used; -}; - -Analysis> blobNames(const caffe::Net& net) { - Analysis> names; - const auto& blobs = net.blobs(); - for (auto i = 0; i < blobs.size(); ++i) { - names[blobs[i]->data().get()].push_back(net.blob_names().at(i)); - } - return names; -} - -// Compute an assignment of blobs to non-overlapping blobs. -Assignments assign(const Net& net, OrderedAnalysis analysis) { - const auto& names = blobNames(net); - std::stable_sort(analysis.begin(), - analysis.end(), - [](const SyncedMemoryRange& a, const SyncedMemoryRange& b) { - return a.second.used < b.second.used; - }); - for (const auto& kv : analysis) { - LOG(INFO) << names.at(kv.first) - << folly::format(": {}->{}", kv.second.defined, kv.second.used); - } - - Assignments assignments; - for (const auto& candidate : analysis) { - auto assigned = false; - for (auto& assignment : assignments) { - if (isCompatible(candidate, assignment)) { - assignment.push_back(candidate); - assigned = true; - break; - } - } - if (assigned) { - continue; - } - assignments.push_back({candidate}); - } - return assignments; -} - -template -void logAssignmentMetrics(const OrderedAnalysis& analysis, - const Assignments& assignments) { - size_t beforeTotalSize = 0; - for (const auto& kv : analysis) { - beforeTotalSize += kv.first->size(); - } - size_t afterTotalSize = 0; - for (const auto& assignment : assignments) { - size_t assignmentMaxSize = 0; - for (const auto& kv : assignment) { - assignmentMaxSize = std::max(assignmentMaxSize, kv.first->size()); - } - LOG(INFO) << "Assignment max size: " << assignmentMaxSize; - afterTotalSize += assignmentMaxSize; - } - LOG(INFO) - 
-      << folly::format("Before: {}, After: {}, Compression: {:.2f}%",
-                       beforeTotalSize,
-                       afterTotalSize,
-                       100.0 * (1.0 - afterTotalSize * 1.0 / beforeTotalSize));
-}
-
-void applyAssignments(caffe::Net<float>* net, const Assignments& assignments) {
-  const auto& names = blobNames(*net);
-  Analysis<boost::shared_ptr<caffe::Blob<float>>> reusedBlobs;
-  for (const auto& assignment : assignments) {
-    auto reused = boost::make_shared<caffe::Blob<float>>(1, 1, 1, 1);
-    // Instantiate so blob->data() is valid.
-    reused->cpu_data();
-    LOG(INFO) << "Assignment: ";
-    for (const auto& kv : assignment) {
-      LOG(INFO) << "Blob: " << names.at(kv.first);
-      reusedBlobs[kv.first] = reused;
-    }
-  }
-
-  using BV = std::vector<caffe::Blob<float>*>;
-  using SBV = std::vector<boost::shared_ptr<caffe::Blob<float>>>;
-  for (auto& blob : const_cast<BV&>(net->input_blobs())) {
-    reusedBlobs.at(blob->data().get())->ReshapeLike(*blob);
-    blob = reusedBlobs.at(blob->data().get()).get();
-  }
-  for (auto& blob : const_cast<BV&>(net->output_blobs())) {
-    blob = reusedBlobs.at(blob->data().get()).get();
-  }
-  for (auto& vec : net->top_vecs()) {
-    for (auto& blob : const_cast<BV&>(vec)) {
-      blob = reusedBlobs.at(blob->data().get()).get();
-    }
-  }
-  for (auto& vec : net->bottom_vecs()) {
-    for (auto& blob : const_cast<BV&>(vec)) {
-      blob = reusedBlobs.at(blob->data().get()).get();
-    }
-  }
-  for (auto& blob : const_cast<SBV&>(net->blobs())) {
-    auto reusedBlob = reusedBlobs.at(blob->data().get());
-    blob = reusedBlob;
-  }
-}
-}
-
-void optimizeMemory(caffe::Net<float>* net) {
-  net->Reshape();
-  // If the net does sharing (e.g. SplitLayer), run a forward pass to
-  // get the sharing setup so that it is indentified when we use the
-  // SyncedMemory addresses as identifiers for def/use ranges.
-  net->ForwardPrefilled();
-  const auto& analysis = analyze(*net);
-  const auto& assignments = assign(*net, analysis);
-  logAssignmentMetrics(analysis, assignments);
-  applyAssignments(net, assignments);
-}
-}
-}
diff --git a/predictor/Optimize.h b/predictor/Optimize.h
deleted file mode 100644
index e80d54c..0000000
--- a/predictor/Optimize.h
+++ /dev/null
@@ -1,21 +0,0 @@
-/**
- * Copyright (c) 2015-present, Facebook, Inc.
- * All rights reserved.
- *
- * This source code is licensed under the BSD-style license found in the
- * LICENSE file in the root directory of this source tree. An additional grant
- * of patent rights can be found in the PATENTS file in the same directory.
- */
-#pragma once
-
-namespace caffe {
-template <typename Dtype>
-class Net;
-}
-
-namespace caffe { namespace fb {
-
-void optimizeMemory(caffe::Net<float>* net);
-
-}
-}
diff --git a/predictor/Predictor.cpp b/predictor/Predictor.cpp
deleted file mode 100644
index 8e5a03a..0000000
--- a/predictor/Predictor.cpp
+++ /dev/null
@@ -1,173 +0,0 @@
-/**
- * Copyright (c) 2015-present, Facebook, Inc.
- * All rights reserved.
- *
- * This source code is licensed under the BSD-style license found in the
- * LICENSE file in the root directory of this source tree. An additional grant
- * of patent rights can be found in the PATENTS file in the same directory.
- */
-#include "Predictor.h"
-
-#include "caffe/net.hpp"
-#include "caffe/util/io.hpp"
-#include "caffe/util/upgrade_proto.hpp"
-#include "folly/Memory.h"
-
-#include <google/protobuf/io/coded_stream.h>
-#include <google/protobuf/io/zero_copy_stream_impl_lite.h>
-#include <google/protobuf/text_format.h>
-
-#include <mkl.h>
-
-#include "Optimize.h"
-
-namespace caffe { namespace fb {
-
-namespace {
-
-template <typename C>
-bool vectorContains(const C& container, const typename C::value_type& value) {
-  return std::find(container.begin(), container.end(), value) !=
-      container.end();
-}
-}
-
-namespace detail {
-void disable_blas_threading() {
-  // Disable threading for users of this Predictor.
-  // Ideally, we'd be able to just link against either mkl_lp64_gomp
-  // or mkl_lp64_seq, but Buck's build system doesn't allow this.
-  // Instead, just link to _gomp everywhere (including in tp2, etc),
-  // and for users of this library (people who explicitly instantiate
-  // Predictor), set mkl_num_threads/omp_num_threads to 1.
-  // See t8682905 for details.
-  LOG(INFO) << "Setting BLAS (MKL, OMP) threads to 1";
-  mkl_set_num_threads(1);
-}
-}
-
-std::unique_ptr<Predictor> Predictor::Predictor::paths(
-    const std::string& prototxt_path,
-    const std::string& weights_path,
-    Optimization optimization) {
-  auto prototxt = folly::make_unique<caffe::NetParameter>();
-  CHECK(caffe::ReadProtoFromTextFile(prototxt_path.c_str(), prototxt.get()));
-  CHECK(caffe::UpgradeNetAsNeeded(prototxt_path, prototxt.get()));
-
-  auto weights = folly::make_unique<caffe::NetParameter>();
-  CHECK(caffe::ReadProtoFromBinaryFile(weights_path, weights.get()));
-  CHECK(caffe::UpgradeNetAsNeeded(weights_path, weights.get()));
-  // Can't make_unique b/c of private constructor
-  return std::unique_ptr<Predictor>(
-      new Predictor(*prototxt, *weights, optimization));
-}
-
-std::unique_ptr<Predictor> Predictor::Predictor::strings(
-    const std::string& text_prototxt,
-    const std::string& binary_weights,
-    Optimization optimization) {
-  auto prototxt = folly::make_unique<caffe::NetParameter>();
-  CHECK(google::protobuf::TextFormat::ParseFromString(text_prototxt,
-                                                      prototxt.get()));
-  CHECK(caffe::UpgradeNetAsNeeded("", prototxt.get()));
-  auto weights = folly::make_unique<caffe::NetParameter>();
-  auto input_stream =
-      folly::make_unique<google::protobuf::io::ArrayInputStream>(
-          binary_weights.data(), binary_weights.size());
-  auto stream = folly::make_unique<google::protobuf::io::CodedInputStream>(
-      input_stream.get());
-  // from caffe/util/io.cpp
-  constexpr auto kProtoReadBytesLimit =
-      INT_MAX;  // Max size of 2 GB minus 1 byte.
-  stream->SetTotalBytesLimit(kProtoReadBytesLimit, 536870912);
-  CHECK(weights->ParseFromCodedStream(stream.get()));
-  CHECK(caffe::UpgradeNetAsNeeded("", weights.get()));
-  // Can't make_unique b/c of private constructor
-  return std::unique_ptr<Predictor>(
-      new Predictor(*prototxt, *weights, optimization));
-}
-
-Predictor::Predictor(const caffe::NetParameter& param,
-                     const caffe::NetParameter& weights,
-                     Optimization optimization)
-    : optimization_(optimization) {
-  detail::disable_blas_threading();
-
-  // Check that we have some layers - empty strings/files, for
-  // example, are forgivingly deserialized.
-  CHECK(param.layer().size());
-  CHECK(weights.layer().size());
-  param_ = std::make_shared<caffe::NetParameter>(param);
-  param_->mutable_state()->set_phase(caffe::TEST);
-  weights_ = folly::make_unique<caffe::Net<float>>(*param_);
-  weights_->CopyTrainedLayersFrom(weights);
-}
-
-void Predictor::runForward(
-    const std::vector<caffe::Blob<float>*>& input_blobs) {
-  if (!predictors_.get()) {
-    auto predictor =
-        folly::make_unique<caffe::Net<float>>(*param_);
-    predictor->ShareTrainedLayersWith(weights_.get());
-    if (optimization_ == Optimization::MEMORY) {
-      optimizeMemory(predictor.get());
-    }
-    predictors_.reset(predictor.release());
-  }
-  auto* predictor = predictors_.get();
-  CHECK(predictor);
-  CHECK_EQ(input_blobs.size(), predictor->input_blobs().size());
-  for (auto i = 0; i < input_blobs.size(); ++i) {
-    auto& input_blob = input_blobs[i];
-    CHECK(input_blob);
-    predictor->input_blobs()[i]->ReshapeLike(*input_blob);
-    // mutable_cpu_data b/c the interface demands it, but logically const.
-    predictor->input_blobs()[i]->set_cpu_data(input_blob->mutable_cpu_data());
-  }
-  predictor->Reshape();
-  predictor->ForwardPrefilled();
-}
-
-void Predictor::forward(
-    const std::vector<caffe::Blob<float>*>& input_blobs,
-    const std::vector<std::string>& output_layer_names,
-    std::vector<caffe::Blob<float>*>* output_blobs) {
-  runForward(input_blobs);
-  auto* predictor = predictors_.get();
-  output_blobs->reserve(output_layer_names.size());
-  for (const auto& layer_name: output_layer_names) {
-    auto& output_blob = predictor->blob_by_name(layer_name);
-    CHECK(output_blob) << "Misspecified layer_name: " << layer_name;
-    if (optimization_ == Optimization::MEMORY) {
-      CHECK(vectorContains(predictor->output_blobs(), output_blob.get()));
-    }
-    output_blobs->push_back(output_blob.get());
-  }
-}
-
-std::vector<caffe::Blob<float>*> Predictor::forward(
-    const std::vector<caffe::Blob<float>*>& input_blobs,
-    const std::vector<std::string>& output_layer_names) {
-  std::vector<caffe::Blob<float>*> output_blobs;
-  output_blobs.reserve(input_blobs.size());
-  forward(input_blobs, output_layer_names, &output_blobs);
-  return output_blobs;
-}
-
-std::unordered_map<std::string, caffe::Blob<float>*> Predictor::forward(
-    const std::vector<caffe::Blob<float>*>& input_blobs) {
-  runForward(input_blobs);
-  auto* predictor = predictors_.get();
-  auto blob_names = predictor->blob_names();
-  std::unordered_map<std::string, caffe::Blob<float>*> output_blobs;
-  for (const auto& blob_name: blob_names) {
-    auto& output_blob = predictor->blob_by_name(blob_name);
-    if (optimization_ == Optimization::MEMORY) {
-      CHECK(vectorContains(predictor->output_blobs(), output_blob.get()));
-    }
-    output_blobs[blob_name] = output_blob.get();
-  }
-  return output_blobs;
-}
-}
-}
diff --git a/predictor/Predictor.h b/predictor/Predictor.h
deleted file mode 100644
index 4c46d3a..0000000
--- a/predictor/Predictor.h
+++ /dev/null
@@ -1,70 +0,0 @@
-/**
- * Copyright (c) 2015-present, Facebook, Inc.
- * All rights reserved.
- *
- * This source code is licensed under the BSD-style license found in the
- * LICENSE file in the root directory of this source tree. An additional grant
- * of patent rights can be found in the PATENTS file in the same directory.
- */
-#pragma once
-#include <folly/ThreadLocal.h>
-#include <memory>
-#include <unordered_map>
-
-namespace caffe {
-template <typename Dtype>
-class Net;
-template <typename Dtype>
-class Blob;
-class NetParameter;
-}
-
-namespace caffe {
-namespace fb {
-
-class Predictor {
- public:
-  enum Optimization {
-    NONE,
-    MEMORY
-  };
-  static std::unique_ptr<Predictor> strings(
-      const std::string& text_prototxt,
-      const std::string& binary_weights,
-      Optimization optimization = Optimization::NONE);
-
-  static std::unique_ptr<Predictor> paths(
-      const std::string& prototxt_path,
-      const std::string& weights_path,
-      Optimization optimization = Optimization::NONE);
-
-  std::vector<caffe::Blob<float>*> forward(
-      const std::vector<caffe::Blob<float>*>& input_blobs,
-      const std::vector<std::string>& output_layer_names);
-
-  void forward(const std::vector<caffe::Blob<float>*>& input_blobs,
-               const std::vector<std::string>& output_layer_names,
-               std::vector<caffe::Blob<float>*>* output_blobs);
-
-  std::unordered_map<std::string, caffe::Blob<float>*> forward(
-      const std::vector<caffe::Blob<float>*>& input_blobs);
-
-  caffe::Net<float>* canonicalNet() const { return weights_.get(); }
-
- private:
-  Predictor(const caffe::NetParameter& params,
-            const caffe::NetParameter& weights,
-            Optimization optimization = Optimization::NONE);
-
-  void runForward(
-      const std::vector<caffe::Blob<float>*>& input_blobs);
-
-  // Shared for forward declaration
-  std::shared_ptr<caffe::NetParameter> param_;
-  std::shared_ptr<caffe::Net<float>> weights_;
-  const Optimization optimization_;
-
-  folly::ThreadLocalPtr<caffe::Net<float>> predictors_;
-};
-}
-}
diff --git a/predictor/PredictorTest.cpp b/predictor/PredictorTest.cpp
deleted file mode 100644
index 3d3a21a..0000000
--- a/predictor/PredictorTest.cpp
+++ /dev/null
@@ -1,117 +0,0 @@
-/**
- * Copyright (c) 2015-present, Facebook, Inc.
- * All rights reserved.
- *
- * This source code is licensed under the BSD-style license found in the
- * LICENSE file in the root directory of this source tree. An additional grant
- * of patent rights can be found in the PATENTS file in the same directory.
- */
-#include "Predictor.h"
-#include "caffe/blob.hpp"
-#include "caffe/filler.hpp"
-
-#include <folly/FileUtil.h>
-#include <gtest/gtest.h>
-#include <thread>
-
-namespace caffe {
-namespace fb {
-
-enum class InputTy {
-  PATHS = 0,
-  STRINGS = 1,
-};
-
-struct ModelSpec {
-  std::string prototxt;
-  std::string caffemodel;
-  std::vector<int> inputDims;
-  std::string outputLayer;
-  std::vector<std::pair<int, float>> outputValues;
-};
-
-using Param = std::tuple<InputTy, Predictor::Optimization, ModelSpec>;
-
-class PredictorTest : public ::testing::TestWithParam<Param> {};
-
-TEST_P(PredictorTest, ConsistentAcrossThreads) {
-  const auto &inputTy = std::get<0>(GetParam());
-  const auto &optimization = std::get<1>(GetParam());
-  const auto &ms = std::get<2>(GetParam());
-  Caffe::set_random_seed(1701);
-
-  std::unique_ptr<Predictor> pp;
-  if (inputTy == InputTy::PATHS) {
-    pp = Predictor::paths(ms.prototxt, ms.caffemodel, optimization);
-  } else if (inputTy == InputTy::STRINGS) {
-    std::string prototxt_str;
-    folly::readFile(ms.prototxt.c_str(), prototxt_str);
-    std::string caffemodel_str;
-    folly::readFile(ms.caffemodel.c_str(), caffemodel_str);
-    pp = Predictor::strings(prototxt_str, caffemodel_str, optimization);
-  }
-  CHECK(pp);
-  auto &p = *pp;
-  FillerParameter param;
-  param.set_min(-1000);
-  param.set_max(1000);
-  UniformFiller<float> filler(param);
-  Blob<float> blob;
-  blob.Reshape(ms.inputDims);
-  filler.Fill(&blob);
-  auto output_blobs = p.forward({&blob}, {ms.outputLayer});
-  // Test output blobs in-place.
-  EXPECT_EQ(1, output_blobs.size());
-  output_blobs.clear();
-  p.forward({&blob}, {ms.outputLayer}, &output_blobs);
-  EXPECT_EQ(1, output_blobs.size());
-  for (const auto &kv : ms.outputValues) {
-    EXPECT_FLOAT_EQ(kv.second, output_blobs[0]->cpu_data()[kv.first]);
-  }
-
-  auto output_blobs2 = p.forward({&blob});
-  for (const auto &kv : ms.outputValues) {
-    EXPECT_FLOAT_EQ(kv.second,
-                    output_blobs2[ms.outputLayer]->cpu_data()[kv.first]);
-  }
-
-  // True across threads as well.
-  std::vector<std::thread> ts;
-  for (auto i = 0; i < 3; ++i) {
-    ts.emplace_back([&]() {
-      auto output_blobs = p.forward({&blob}, {ms.outputLayer});
-      EXPECT_EQ(1, output_blobs.size());
-      for (const auto &kv : ms.outputValues) {
-        EXPECT_FLOAT_EQ(kv.second, output_blobs[0]->cpu_data()[kv.first]);
-      }
-    });
-  }
-  for (auto &t : ts) {
-    t.join();
-  }
-}
-
-INSTANTIATE_TEST_CASE_P(
-    P,
-    PredictorTest,
-    ::testing::Combine(
-        ::testing::Values(InputTy::PATHS, InputTy::STRINGS),
-        ::testing::Values(Predictor::Optimization::MEMORY,
-                          Predictor::Optimization::NONE),
-        ::testing::Values(
-            ModelSpec{
-                "bvlc_caffenet/deploy.prototxt",
-                "bvlc_caffenet/bvlc_caffenet.caffemodel",
-                {1, 3, 227, 227},
-                "prob",
-                {{5, 0.00015368311}}},
-            ModelSpec{
-                "bvlc_googlenet/deploy.prototxt",
-                "bvlc_googlenet/bvlc_googlenet.caffemodel",
-                {1, 3, 227, 227},
-                "prob",
-                {{5, 0.0020543954}}})));
-}
-}
diff --git a/torch2caffe/caffe_layers.py b/torch2caffe/caffe_layers.py
index ca14bd8..e4baeb8 100644
--- a/torch2caffe/caffe_layers.py
+++ b/torch2caffe/caffe_layers.py
@@ -84,7 +84,7 @@ def inner_product(torch_layer):
     num_output = int(torch_layer["num_output"])
     weight = torch_layer["weight"]
     layer.inner_product_param.num_output = num_output
-    layer.inner_product_param.axis = -1
+    # layer.inner_product_param.axis = -1
     if "bias" in torch_layer:
         bias = torch_layer["bias"]
         layer.blobs.extend([as_blob(weight), as_blob(bias)])
@@ -174,9 +174,9 @@ def pooling(torch_layer):
     if not torch_layer["ceil_mode"]:
         # layer.pooling_param.torch_pooling = True
-        if dH > 1:
+        if dH > 1 and padH > 0:
             layer.pooling_param.pad_h = padH - 1
-        if dW > 1:
+        if dW > 1 and padW > 0:
             layer.pooling_param.pad_w = padW - 1
     return layer
@@ -186,12 +186,25 @@ def dropout(torch_layer):
     layer = pb2.LayerParameter()
     layer.type = "Dropout"
     layer.dropout_param.dropout_ratio = torch_layer["p"]
-    assert torch_layer["v2"], "Only handle nn.Dropout v2"
+    #assert torch_layer["v2"], "Only handle nn.Dropout v2"
     train_only = pb2.NetStateRule()
     train_only.phase = pb2.TEST
     layer.exclude.extend([train_only])
     return layer

+def elu(torch_layer):
+    layer = pb2.LayerParameter()
+    layer.type = "ELU"
+    layer.elu_param.alpha = torch_layer["alpha"]
+    return layer
+
+def power(torch_layer):
+    layer = pb2.LayerParameter()
+    layer.type = "Power"
+    layer.power_param.power = 1
+    layer.power_param.scale = 1-torch_layer["p"]
+    layer.power_param.shift = 0
+    return layer

 def fbthreshold(torch_layer):
     layer = pb2.LayerParameter()
@@ -282,6 +295,18 @@ def bn(torch_layer):
     return layer

+def batchnorm(torch_layer):
+    layer = pb2.LayerParameter()
+    layer.type = "BatchNorm"
+    # Caffe BN doesn't support bias
+    assert torch_layer["affine"]==0
+    layer.batch_norm_param.use_global_stats = 1
+    blobs_weight = np.ones(1)
+    layer.blobs.extend([as_blob(torch_layer["running_mean"]),
+                        as_blob(torch_layer["running_var"]), as_blob(blobs_weight)])
+    return layer
+
+
 def build_converter(opts):
     return {
         'caffe.Concat': concat,
@@ -299,11 +324,14 @@ def build_converter(opts):
         'caffe.SpatialConvolution': spatial_convolution,
         'caffe.Pooling': pooling,
         'caffe.Dropout': dropout,
+        'caffe.ELU': elu,
+        'caffe.Power': power,
         'caffe.Flatten': ty('Flatten'),
         'caffe.FBThreshold': fbthreshold,
         'caffe.LSTM': lstm,
         'caffe.Eltwise': eltwise,
         'caffe.BN': bn,
+        'caffe.BatchNorm': batchnorm,
     }
diff --git a/torch2caffe/lib.lua b/torch2caffe/lib.lua
index 495b703..76a5eab 100644
--- a/torch2caffe/lib.lua
+++ b/torch2caffe/lib.lua
@@ -53,9 +53,7 @@ local function debug_nets(caffe_net, torch_net)
                 sums = torch.sum(m.output)
             end
             print("Layer %s, %s, Sum: %s",
-                  torch.typename(m),
-                  sizes,
-                  sums)
+                  torch.typename(m), sizes, sums)
         end
     end
     )
@@ -104,7 +102,7 @@ function M.compare(opts, torch_net)
        torch_outputs = torch_net:forward(torch_inputs)
    end)
    if not ok then
-       print("Got error running forward: %s", err)
+       print("\n\n\nGot error running forward: %s", err)
        torch_net:cuda()
        local torch_inputs = inputs_to_torch_inputs(
            inputs, 'torch.CudaTensor')
@@ -120,8 +118,8 @@ function M.compare(opts, torch_net)
    end

    if #caffe_outputs ~= #torch_outputs then
-       error("Inconsistent output blobs: Caffe: %s, Torch: %s",
-             #caffe_outputs, #torch_outputs)
+       error(string.format("Inconsistent output blobs: Caffe: %s, Torch: %s",
+             #caffe_outputs, #torch_outputs))
        error("Inconsistent output blobs")
    end
@@ -129,22 +127,24 @@ function M.compare(opts, torch_net)
        local torch_output = torch_outputs[i]
        local caffe_output = caffe_outputs[i]
        print("Caffe norm: %s, Torch norm: %s",
-            torch.norm(caffe_output), torch.norm(torch_output))
+             torch.norm(caffe_output), torch.norm(torch_output))
        if not caffe_output:isSameSizeAs(torch_output) then
-           error("Inconsistent output size: Caffe: %s, Torch: %s",
-                 caffe_output:size(), torch_output:size())
+           error(string.format("Inconsistent output size: Caffe: %s, Torch: %s",
+                 caffe_output:size(), torch_output:size()))
            error("Inconsistent output sizes")
        end
        local max_absolute_error = (caffe_output - torch_output):abs():max()
        print("Maximum difference between Caffe and Torch output: %s",
              max_absolute_error)
-       if (max_absolute_error > 0.001) then
+       if 1 then --(max_absolute_error > 0.001) then
            debug_nets(caffe_net, torch_net)
            if os.getenv('LUA_DEBUG_ON_ERROR') then
                require('fb.debugger').enter()
            end
-           error("Error in conversion!")
+           if (max_absolute_error > 0.001) then
+               error("Error in conversion!")
+           end
        end
    end
    if os.getenv('LUA_DEBUG_ON_ERROR') then
diff --git a/torch2caffe/lib_py.py b/torch2caffe/lib_py.py
index cff29d4..0f20f8c 100644
--- a/torch2caffe/lib_py.py
+++ b/torch2caffe/lib_py.py
@@ -141,6 +141,11 @@ def load(opts):
     assert net, "Net is none?"
     return net

+def load_train(opts):
+    net = caffe.Net(opts["prototxt"], caffe.TRAIN)
+    assert net, "Net is none?"
+    return net
+
 def check_layer_names(opts, expected_names):
     net = caffe.proto.caffe_pb2.NetParameter()
     with open(opts["prototxt"]) as f:
diff --git a/torch2caffe/prep.lua b/torch2caffe/prep.lua
deleted file mode 100644
index e2b0e47..0000000
--- a/torch2caffe/prep.lua
+++ /dev/null
@@ -1,67 +0,0 @@
-require 'nn'
-require 'cunn'
-require 'cudnn'
-
-local trans = require 'torch2caffe.transforms'
-
-local function adapt_conv1(layer)
-    local std = torch.FloatTensor({0.229, 0.224, 0.225}) * 255
-    local sz = layer.weight:size()
-    sz[2] = 1
-    layer.weight = layer.weight:cdiv(std:view(1,3,1,1):repeatTensor(sz))
-    local tmp = layer.weight:clone()
-    tmp[{{}, 1, {}, {}}] = layer.weight[{{}, 3, {}, {}}]
-    tmp[{{}, 3, {}, {}}] = layer.weight[{{}, 1, {}, {}}]
-    layer.weight = tmp:clone()
-end
-
-local function adapt_sequential_dropout(model)
-    -- does not support recursive sequential(dropout)
-    for k, block in pairs(model:findModules('nn.SequentialDropout')) do
-        -- find last conv / bn / linear layer and scale its weight by 1-p
-        local found = false
-        for j = #block.modules,1,-1 do
-            local block_type = torch.type(block.modules[j])
-            if block_type == 'nn.SpatialConvolution'
-                or block_type == 'nn.Linear'
-                or block_type == 'nn.SpatialBatchNormalization' then
-                block.modules[j].weight:mul(1 - block.p)
-                if block.modules[j].bias then
-                    block.modules[j].bias:mul(1 - block.p)
-                end
-                found = true
-                break
-            end
-        end
-        if not found then
-            error('SequentialDropout module cannot find weight to scale')
-        end
-    end
-end
-
-g_t2c_preprocess = function(model, opts)
-    model = cudnn.convert(model, nn)
-    --model = trans.fold_batch_normalization_layers(model, opts)
-    for _, layer in pairs(model:findModules('nn.SpatialConvolution')) do
-        layer.weight = layer.weight:float()
-        if layer.bias then
-            layer.bias = layer.bias:float()
-        end
-    end
-    for _, layer in pairs(model:findModules('nn.Linear')) do
-        layer.weight = layer.weight:float()
-        if layer.bias then
-            layer.bias = layer.bias:float()
-        end
-    end
-    for _, layer in pairs(model:findModules('nn.SpatialBatchNormalization')) do
-        layer.weight = layer.weight:float()
-        layer.bias = layer.bias:float()
-        layer.running_mean = layer.running_mean:float()
-        layer.running_var = layer.running_var:float()
-    end
-    adapt_conv1(model.modules[1])
-    adapt_sequential_dropout(model)
-    return model
-end
-
diff --git a/torch2caffe/prepnv.lua b/torch2caffe/prepnv.lua
new file mode 100644
index 0000000..28dd8bd
--- /dev/null
+++ b/torch2caffe/prepnv.lua
@@ -0,0 +1,140 @@
+require 'nn'
+require 'cunn'
+require 'cudnn'
+
+local trans = require 'torch2caffe.transforms'
+
+local function adapt_conv1(layer)
+    local std = torch.FloatTensor({0.229, 0.224, 0.225}) * 255
+    local sz = layer.weight:size()
+    sz[2] = 1
+    layer.weight = layer.weight:cdiv(std:view(1,3,1,1):repeatTensor(sz))
+    local tmp = layer.weight:clone()
+    tmp[{{}, 1, {}, {}}] = layer.weight[{{}, 3, {}, {}}]
+    tmp[{{}, 3, {}, {}}] = layer.weight[{{}, 1, {}, {}}]
+    layer.weight = tmp:clone()
+end
+
+local function adapt_spatial_dropout(net)
+    --print (model)
+    for i = 1, #net.modules do
+        local c = net:get(i)
+        local t = torch.type(c)
+        if c == nil then
+            break
+        end
+        if c.modules then
+            adapt_spatial_dropout(c)
+        elseif t == 'nn.SpatialDropout' then
+            local found = false
+            -- find the previous layer and scale
+            for j = i,1,-1 do
+                local block_type = torch.type(net:get(j))
+                if block_type == 'nn.SpatialConvolution'
+                    or block_type == 'nn.Linear' then
+                    --or block_type == 'nn.SpatialBatchNormalization' then
+                    net.modules[j].weight:mul(1 - c.p)
+                    if net.modules[j].bias then
+                        net.modules[j].bias:mul(1 - c.p)
+                    end
+                    found = true
+                    break
+                end
+            end
+            if not found then
+                error('SpatialDropout module cannot find weight to scale')
+            end
+            for j = i, net:size()-1 do
+                net.modules[j] = net.modules[j + 1]
+            end
+            net.modules[net:size()] = nil
+        end
+    end
+end
+
+remove_flatten = function(net)
+    for i = 1, #net.modules do
+        local c = net:get(i)
+        local t = torch.type(c)
+        if c.modules then
+            remove_flatten(c)
+        elseif t == 'nn.Reshape' then
+            print('Flatten layer found!')
+            for j = i, #net.modules-1 do
+                net.modules[j] = net.modules[j+1]
+            end
+            net.modules[#net.modules] = nil
+            break
+        end
+    end
+end
+
+g_t2c_preprocess = function(model, opts)
+    -- convert the model to cpu mode
+    if model.net then
+        model = model.net
+    end
+    model = cudnn.convert(model, nn)
+    model = nn.utils.recursiveType(model, 'torch.FloatTensor')
+
+    for _, layer in pairs(model:findModules('nn.SpatialBatchNormalization')) do
+        if layer.save_mean == nil then
+            layer.save_mean = layer.running_mean
+            layer.save_std = layer.running_var
+            layer.save_std:pow(-0.5)
+        end
+        --layer.train = true
+    end
+    --adapt_spatial_dropout(model)
+    remove_flatten(model)
+    return model
+end
+
+save_model_params = function(model, basename)
+    -- saving the model-parameters
+    local n_frames = model.parameters.nFrames
+    local n_channels = model.parameters.nChannels
+    local nGPU = model.parameters.n_gpu
+    local frameInterval = model.parameters.frame_interval
+    local patch_height = model.parameters.patch_height
+    local patch_width = model.parameters.patch_width
+    local roiWidth = model.parameters.roi_width
+    local roiVerticalOffset = model.parameters.roi_vertical_offset
+    local roiWidthMeters = model.parameters.roi_width_m
+    local roiCenterX = model.parameters.roi_center_x
+    local targetClamp = string.format('\'%s\'', paths.basename(model.parameters.target_clamp))
+    local supervisor = string.format('\'%s\'', model.parameters.supervisor[1])
+    local supervisorNorm = model.parameters.supervisor_norms.one_over_r
+
+    local csvf = csv.File(string.format('%s-model-params.csv', basename), "w")
+    csvf:write({
+        'nChannels',
+        'nFrames',
+        'nGPU',
+        'frameInterval',
+        'patchHeight',
+        'patchWidth',
+        'roiWidth',
+        'roiVerticalOffset',
+        'roiWidthMeters',
+        'roiCenterX',
+        'baseClamp',
+        'supervisor',
+        'supervisorNorm'})
+
+    csvf:write({
+        n_channels,
+        n_frames,
+        nGPU,
+        frameInterval,
+        patch_height,
+        patch_width,
+        roiWidth,
+        roiVerticalOffset,
+        roiWidthMeters,
+        roiCenterX,
+        targetClamp,
+        supervisor,
+        supervisorNorm})
+    csvf:close()
+end
diff --git a/torch2caffe/torch_layers.lua b/torch2caffe/torch_layers.lua
index f9a84b5..c865bf4 100644
--- a/torch2caffe/torch_layers.lua
+++ b/torch2caffe/torch_layers.lua
@@ -224,6 +224,8 @@ M.CONVERTER = {
             return layer
         end},
     ['nn.Dropout'] = simple{typename='caffe.Dropout', inplace=true},
+    ['nn.ELU'] = simple{typename='caffe.ELU', inplace=true},
+    ['nn.SpatialDropout'] = simple{typename='caffe.Power', inplace=true},
     ['nn.View'] = simple{
         typename='caffe.Flatten',
         layer=function(layer)
@@ -272,7 +274,7 @@ M.CONVERTER = {
                 false))
         end,
     ['nn.SpatialBatchNormalization'] = simple{
-        typename='caffe.BN'
+        typename='caffe.BatchNorm'--'caffe.BN'
     },
 }
diff --git a/verify.lua b/verify.lua
new file mode 100644
index 0000000..4a1117d
--- /dev/null
+++ b/verify.lua
@@ -0,0 +1,68 @@
+require 'nn';
+require 'cunn';
+require 'cudnn';
+require 'paths';
+require 'image'
+require 'torch2caffe/prepnv.lua'
+local t2c = require 'torch2caffe.lib'
+
+
+-- Figure out the path of the model and load it
+local path = arg[1]
+local intenpath = arg[2]
+local basename = paths.basename(path, 't7b')
+local ext = path:match("^.+(%..+)$")
+local model = nil
+if ext == '.t7b' then
+    model1 = torch.load(path)
+else
+    assert(false, "We assume models end in .t7b")
+end
+
+if model1.net then
+    model1 = model1.net
+end
+
+local function check_input(net, input_dims, input_tensor)
+    net:apply(function(m) m:evaluate() end)
+    local opts = {
+        prototxt = string.format('%s.prototxt', basename),
+        caffemodel = string.format('%s.caffemodel', basename),
+        inputs = {{
+            name = "data",
+            input_dims = input_dims,
+            tensor = input_tensor
+        }}
+    }
+    t2c.compare(opts, net)
+    return opts
+end
+
+local function check(net, input_dims)
+    net:apply(function(m) m:evaluate() end)
+    local opts = {
+        prototxt = string.format('%s.prototxt', basename),
+        caffemodel = string.format('%s.caffemodel', basename),
+        inputs = {{
+            name = "data",
+            input_dims = input_dims,
+        }}
+    }
+    t2c.compare(opts, net)
+    return opts
+end
+
+if intenpath ~= nil then
+    print('Using given input tensor', intenpath)
+    input = torch.load(intenpath):view(table.unpack(input_dims))
+    check_input(model1, {1, n_channels, patch_height, patch_width}, input)
+    --
+    torch.save(string.format("%s.t7b", input), testpatch)
+    image.save(string.format("%s.JPEG", input), image.toDisplayTensor(testpatch))
+else
+    print('Creating new tensor')
+    input = nil
+    check(model1, {1, 1, 66, 200})
+end