Now that we've made bit-packing first class, and it can be skipped trivially when W == T, we could move the generic width W into the FastLanes trait.
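As a rough sketch of what that could look like, assuming W becomes a const generic on the trait's methods: everything below is hypothetical (the method names `pack`/`unpack`, their signatures, and the associated constant `T` are placeholders, not the crate's current API), and it uses a plain sequential bit layout rather than the real transposed FastLanes layout.

```rust
/// Hypothetical stand-in for the FastLanes trait; names and signatures are assumptions.
pub trait FastLanes: Sized + Copy {
    /// Bit width of the native type (the T in "W == T").
    const T: usize;

    /// Pack each input value into W bits.
    fn pack<const W: usize>(input: &[Self]) -> Vec<Self>;

    /// Recover `len` values from a buffer produced by `pack::<W>`.
    fn unpack<const W: usize>(packed: &[Self], len: usize) -> Vec<Self>;
}

impl FastLanes for u32 {
    const T: usize = 32;

    fn pack<const W: usize>(input: &[u32]) -> Vec<u32> {
        assert!(W >= 1 && W <= Self::T);
        // W == T: nothing to pack, so this is a plain copy.
        if W == Self::T {
            return input.to_vec();
        }
        let mask = (1u64 << W) - 1;
        let mut out = vec![0u32; (input.len() * W + Self::T - 1) / Self::T];
        for (i, &v) in input.iter().enumerate() {
            let bit = i * W;
            let (word, shift) = (bit / Self::T, bit % Self::T);
            // Widen to u64 so a value straddling two words is not truncated.
            let wide = ((v as u64) & mask) << shift;
            out[word] |= wide as u32;
            if shift + W > Self::T {
                out[word + 1] |= (wide >> Self::T) as u32;
            }
        }
        out
    }

    fn unpack<const W: usize>(packed: &[u32], len: usize) -> Vec<u32> {
        assert!(W >= 1 && W <= Self::T);
        // W == T: the buffer already holds the values verbatim.
        if W == Self::T {
            return packed[..len].to_vec();
        }
        let mask = (1u64 << W) - 1;
        (0..len)
            .map(|i| {
                let bit = i * W;
                let (word, shift) = (bit / Self::T, bit % Self::T);
                let mut wide = (packed[word] as u64) >> shift;
                if shift + W > Self::T {
                    // Pull in the high bits stored in the next word.
                    wide |= (packed[word + 1] as u64) << (Self::T - shift);
                }
                (wide & mask) as u32
            })
            .collect()
    }
}
```

With this shape a caller would write something like `<u32 as FastLanes>::pack::<3>(&values)`. Since `W == Self::T` compares two compile-time constants, the packing path should be eliminated for the full-width instantiation.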