Description
The ParamServer is responsible for storing model parameters. It is a service made available over a network. The parameters are updated asynchronously by applying deltas sent from stochastic gradient processes (the ParamServer clients).
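To make the update model concrete, here is a minimal sketch of the idea. It is not the proposed implementation: it runs in a single process with threads standing in for networked clients, and the names `ParamServer`, `get_params`, `apply_delta`, `client`, and `run_demo` are all hypothetical, chosen only for illustration. The toy gradient is likewise an assumption.

```python
import threading


class ParamServer:
    """In-process stand-in for the networked service (hypothetical API)."""

    def __init__(self, initial_params):
        self._params = list(initial_params)
        self._lock = threading.Lock()

    def get_params(self):
        # Return a copy so clients never mutate the stored parameters directly.
        with self._lock:
            return list(self._params)

    def apply_delta(self, delta):
        # Add the delta element-wise; the lock keeps each update atomic.
        if len(delta) != len(self._params):
            raise ValueError("delta length does not match parameter length")
        with self._lock:
            for i, d in enumerate(delta):
                self._params[i] += d


def client(server, steps, learning_rate):
    """A toy stochastic gradient process (assumed objective: 0.5 * sum(p**2))."""
    for _ in range(steps):
        params = server.get_params()  # may be stale by the time we push
        delta = [-learning_rate * p for p in params]  # gradient of objective is p
        server.apply_delta(delta)


def run_demo(num_clients=4, steps=50):
    server = ParamServer([1.0, -2.0, 3.0])
    threads = [
        threading.Thread(target=client, args=(server, steps, 0.01))
        for _ in range(num_clients)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return server.get_params()


if __name__ == "__main__":
    print(run_demo())
```

Clients never wait for each other: each one reads a possibly stale copy of the parameters, computes a delta, and pushes it. In a real service the two methods would become network calls, and the locked update loop is the part that would be worth writing in a lower-level language.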
Since the ParamServer is a separate process, it can be implemented in a lower-level language like C, so weight updates can be quick. At the moment, weight updates take up 50% of the training time, so it makes sense to move them out to a new service, for two reasons:
- Allow for asynchronous training (data parallelism)
- Move a bottleneck of the code into a specialized service, which will place focus on it and allow us to implement it in another language
One reason to delay this implementation is that it is not known how easy it will be to generalize this approach to other types of networks. This approach is employed by Google, and I have personal experience in applying it. In both cases it is used to manage the parameters of a feed-forward network. It may be a good idea to delay this implementation until stronger patterns in the code have emerged.