This is an alternative version of gdal_calc that handles nodata values better and runs 15-30% faster, with minimal changes to the original source code.
It has been submitted via Trac for inclusion in the GDAL trunk distribution, but it may take a long time to show up there. In the meantime, you can use it from here if you like.
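
For readers unfamiliar with what nodata handling means in a raster calculation, here is a minimal NumPy sketch of the general idea. It is not the code from this script: the function name `calc_with_nodata`, its parameters, and the sample values are all hypothetical, and it assumes each input band has a single nodata value.

```python
import numpy as np


def calc_with_nodata(a, b, nodata_a, nodata_b, nodata_out):
    """Add two bands, writing nodata_out wherever either input is nodata.

    Hypothetical helper for illustration only; not part of gdal_calc.
    """
    a = np.asarray(a, dtype="float64")
    b = np.asarray(b, dtype="float64")
    # A pixel is invalid if either input carries its band's nodata value.
    invalid = (a == nodata_a) | (b == nodata_b)
    result = a + b
    # Overwrite invalid pixels so nodata values do not leak into the result.
    result[invalid] = nodata_out
    return result


# Made-up 2x2 bands: -9999 marks nodata in `a`, 255 marks nodata in `b`.
a = [[1.0, -9999.0], [3.0, 4.0]]
b = [[10.0, 20.0], [255.0, 40.0]]
out = calc_with_nodata(a, b, nodata_a=-9999.0, nodata_b=255.0, nodata_out=-1.0)
```

Without the mask, the nodata sentinels would be summed like ordinary pixel values and the output would contain meaningless numbers at those positions.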