Add convergenceTol

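Spark's `GradientDescent` optimizer has a `convergenceTol` parameter, which is very helpful: it stops the optimization early once the weights barely change between iterations, instead of always running the full number of iterations. It would be good to add that here as well.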

See https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/optimization/GradientDescent.scala#L99
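
For illustration, here is a rough sketch of the kind of check I have in mind. It follows Spark's criterion as I understand it: stop when the norm of the weight change is below `convergenceTol` times the larger of the current weight norm and 1. The function names, the plain-list weights, and the defaults below are assumptions for the sketch, not this project's API.

```python
import math


def is_converged(previous_weights, current_weights, convergence_tol):
    """Relative-change test modeled on Spark's GradientDescent (hypothetical helper)."""
    # L2 norm of the difference between successive weight vectors.
    diff_norm = math.sqrt(
        sum((p - c) ** 2 for p, c in zip(previous_weights, current_weights))
    )
    # L2 norm of the current weights; max(., 1.0) keeps the threshold
    # from collapsing to zero when the weights themselves are tiny.
    current_norm = math.sqrt(sum(c * c for c in current_weights))
    return diff_norm < convergence_tol * max(current_norm, 1.0)


def gradient_descent(gradient, initial_weights, step_size=0.1,
                     num_iterations=100, convergence_tol=0.001):
    """Plain gradient descent with early stopping (illustrative only).

    Returns the final weights and the number of iterations actually run.
    """
    weights = list(initial_weights)
    for i in range(1, num_iterations + 1):
        grad = gradient(weights)
        new_weights = [w - step_size * g for w, g in zip(weights, grad)]
        converged = is_converged(weights, new_weights, convergence_tol)
        weights = new_weights
        if converged:
            return weights, i
    return weights, num_iterations
```

With this sketch, minimizing `(w - 3)^2` from `w = 0` stops well before the 100-iteration cap, while `convergence_tol=0.0` never triggers the check and so runs all iterations.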