The constant η0 is determined by performing preliminary experiments on a data subsample (see http://leon.bottou.org/projects/sgd). We could also add `asgd.tune_...()` methods to "tune" speed and accuracy (here, `step_size0` would be the parameter being optimized).
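As a rough illustration of what such a tuning step might look like, here is a minimal sketch in pure Python. It assumes that `step_size0` plays the role of η0, that the model is a linear SVM trained with plain SGD on the hinge loss, and that the step size decays as `step_size0 / (1 + l2_reg * step_size0 * t)`. The names `tune_step_size0`, `sgd_epoch`, `objective`, and `make_toy_data` are hypothetical and are not part of any existing `asgd` API. Each candidate value is tried for one pass over a small random subsample, and the one with the lowest regularized cost is kept.

```python
import math
import random


def sgd_epoch(data, step_size0, l2_reg=1e-4):
    """One pass of plain SGD on the L2-regularized hinge loss (linear SVM)."""
    w = [0.0] * len(data[0][0])
    for t, (x, y) in enumerate(data):
        # Assumed decreasing step-size schedule.
        eta = step_size0 / (1.0 + l2_reg * step_size0 * t)
        margin = y * sum(wi * xi for wi, xi in zip(w, x))
        # Sub-gradient step: shrink for the regularizer,
        # then move toward the example if it violates the margin.
        w = [wi * (1.0 - eta * l2_reg) for wi in w]
        if margin < 1.0:
            w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w


def objective(w, data, l2_reg=1e-4):
    """Regularized average hinge loss of w on data."""
    hinge = sum(
        max(0.0, 1.0 - y * sum(wi * xi for wi, xi in zip(w, x)))
        for x, y in data
    )
    return 0.5 * l2_reg * sum(wi * wi for wi in w) + hinge / len(data)


def tune_step_size0(data, candidates, subsample_size=200, l2_reg=1e-4, seed=0):
    """Hypothetical helper: pick the candidate with the lowest subsample cost."""
    rng = random.Random(seed)
    subsample = rng.sample(data, min(subsample_size, len(data)))
    scores = {}
    for step_size0 in candidates:
        w = sgd_epoch(subsample, step_size0, l2_reg)
        cost = objective(w, subsample, l2_reg)
        # Treat diverged runs as infinitely bad.
        scores[step_size0] = cost if math.isfinite(cost) else float("inf")
    best = min(scores, key=scores.get)
    return best, scores


def make_toy_data(n=1000, seed=1):
    """Linearly separable 2-D points with a constant bias feature."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = [rng.gauss(0, 1), rng.gauss(0, 1), 1.0]
        y = 1.0 if x[0] + 0.5 * x[1] > 0 else -1.0
        data.append((x, y))
    return data


candidates = [1e-3, 1e-2, 1e-1, 1.0, 10.0]
best, scores = tune_step_size0(make_toy_data(), candidates)
print("best step_size0:", best)
```

The candidate grid and the subsample size are arbitrary choices for the sketch. A real `tune_...()` method would probably also want to trade accuracy against training time rather than rank candidates by cost alone.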