Should I expect a custom-generated 13-bit adder for a given architecture (e.g. Brent-Kung) to be strictly better than taking a pregenerated 16-bit adder from this library, tying the top 3 bits of each input to 1'b0, and leaving the top 3 bits of the output unconnected?
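
To make the hookup I have in mind concrete, here is a minimal behavioral sketch in Python. It only models the wiring, not any particular prefix architecture, and it says nothing about area or delay, which is what I am really asking about. The names `add16` and `add13_via_add16` are made up for illustration and are not part of the library.

```python
WIDE = 16    # width of the pregenerated adder
NARROW = 13  # width actually needed


def add16(a: int, b: int) -> int:
    """Behavioral stand-in for a 16-bit adder (hypothetical, not the library's API)."""
    return (a + b) & ((1 << WIDE) - 1)


def add13_via_add16(a: int, b: int) -> tuple[int, int]:
    """Use the 16-bit adder as a 13-bit adder.

    The top 3 bits of each input are forced to 0 (the 1'b0 tie-off),
    and only the low 13 bits of the sum are used.
    """
    mask = (1 << NARROW) - 1
    s16 = add16(a & mask, b & mask)  # inputs zero-extended to 16 bits
    sum13 = s16 & mask               # sum bits 15..13 left "unconnected"
    carry13 = (s16 >> NARROW) & 1    # bit 13 carries the 13-bit carry-out
    return sum13, carry13
```

One thing the sketch shows: with the upper input bits tied to zero, bit 13 of the 16-bit sum is the carry-out of the 13-bit addition, so if I need that carry only the top 2 output bits are truly unused.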