Home
Welcome to the RNNCPP wiki!
Some notes regarding the structure of the code. The notes should be cleaned up; for now, this is simply a copy of my template_for_cpp_code.txt file.
We are using Markdown for ease of writing. To add syntax-highlighted code, open a block with three backticks followed by c++, and close it with three backticks after the last line of code.
Classes:
class Gaussian
Types of Layers: Dense, LSTM, GMM (subclasses of Layer).
Types of Activation: tanh, sigmoid, ReLU (not implemented) (subclasses of Activation).
Top level classes are Layers, Activation. Polymorphism is used to define the subclasses.
Other classes include Weights, which wraps the weight matrix, at least until we decide on a more permanent choice for an array library. The wrapping approach will make it easier to adapt the code to different matrix and tensor libraries.
Classes
- Abstract class for Activation

```c++
// Abstract class for Activation (signature simplified for now)
class Activation {
public:
    virtual float operator()(float x) = 0;
    virtual ~Activation() {}
};
```

- Abstract class for model
- Abstract class for layer
```c++
class Layer {
private:
    int batch_size;
public:
    Layer();
    virtual ~Layer();
    Layer(const Layer&);
    void setBatchSize(int batch_size);
    int getBatchSize();
    void setSeqLen(int seq_len);
    int getSeqLen();
    // for experimentation: different learning rates per layer
    // if not set, use the model's learning rate
    void setLearningRate(float lr);
    float getLearningRate();
    void computeGradient();
};
```
```c++
class Dense : public Layer { };
class LSTM : public Layer { };
class TimeDistributed : public Layer { };
class GMM : public Layer { };
```
```c++
class Model {
public:
    Model();
    ~Model();
    void setOptimizer(Optimizer* opt);
    Optimizer* getOptimizer();
    void setReturnState(bool state);
    bool getReturnState();
    void setLearningRate(float lr);  // not required
    float getLearningRate();
};
```
```c++
Model* m = new Model();
Layer* l1 = new LSTM();
Layer* l2 = new LSTM();
Layer* gmm = new GMM();
l1->setActivation(new Tanh());     // do not use strings; use polymorphism
l2->setActivation(new Sigmoid());
m->add(l1);
m->add(l2);
m->add(gmm);
Optimizer* rmsprop = new OptRMSProp();
rmsprop->setLearningRate(1.e-5);
m->setOptimizer(rmsprop);
m->setReturnState(true);
m->compile();
m->train();
m->predict();
m->test();
```
Aug. 8, 2016
The data input into the network is of type VF3D (batch, seq_len, dimensionality). See typedefs.h for definitions. As much as possible, the Armadillo library is hidden behind these typedefs. Use loops for matrix multiplication if higher-level operators are not available. Consider creating your own operator calculus (that is what I would do) using Armadillo matrices. Something like:
```c++
operator*(VF3D x, VF y, axis)
```
First create the layers, then set the layer properties, then create the model, then add the layers to the model.
```c++
// list of layers
LAYERS layers = m->getLayers();
```
I am not currently checking for dimension compatibility between layers. I suggest you do (unless I do it first).
The execute method will execute a layer or a model (not implemented).

```c++
m->predict(VF3D x);  // generate the output of the network
```
Keep all arrays 3D except weights that are 2D.
I very strongly suggest creating specialized operators to avoid writing complex loops.
Since we are using libraries that work with containers, pass everything by value (for now), and return by value as much as possible. We will profile the code at a later time.
```c++
WEIGHTS w;
VF3D x;
VF3D y;
Activation f;

y = f(w * x);
```
should work. You can overload the operators to make this so.
You can find Armadillo documentation at http://arma.sourceforge.net/ .
If you have used Keras, you should understand all of this.
What have I done:
- created the infrastructure for this code (the templates you wanted).
- decided (with Nathan) on Armadillo as our matrix library. We may change it in the future, so avoid references to Armadillo in the main code. I am already violating this principle when accessing elements such as:

```c++
printf("rows, col= %d, %d\n", weights.n_rows, weights.n_cols);
```

and

```c++
weights = arma::randu<WEIGHTS>(arma::size(weights));
```
So in this case, it is best to create a Utility class and put in it functions such as:

```c++
getDims(WEIGHTS& w);  // or getDims(WEIGHTS w);
```

and

```c++
getRandomUniform(WEIGHTS& w);
```

which will hide Armadillo from the main program classes.
If using a different library, the definition of WEIGHTS would change in typedefs.h .
Alternatively, one could create:

```c++
getRandomUniform(Weights& w);
```

which takes a class wrapper around the weights (which will likely not change much from here on out). In the future, we can create an intermediate weight class responsible for aggregating weights for more efficient execution on a GPU.
=============================================================