This isn't consistent with the format the NIST tool takes, and it doesn't appear to be documented. For smaller files, the number of symbols is adjusted, but for larger files there appears to be no mechanism for processing more than 2^20 symbols. There also shouldn't be an assumption that the number of symbols times the symbol size (in bits) is divisible by 8.
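
To make the last two points concrete, here is a minimal sketch of how input could be read without either limitation. It assumes symbols are packed MSB-first into bytes, which is a guess about the format and not something confirmed here. The names `unpack_symbols` and `iter_blocks` are hypothetical and do not come from the code under review.

```python
from typing import Iterator, List


def unpack_symbols(data: bytes, bits_per_symbol: int, num_symbols: int) -> List[int]:
    """Read num_symbols fixed-width symbols from a packed bit stream.

    Assumption: symbols are packed MSB-first. The total bit count does
    not need to be a multiple of 8; trailing pad bits are ignored.
    """
    if bits_per_symbol <= 0:
        raise ValueError("bits_per_symbol must be positive")
    total_bits = num_symbols * bits_per_symbol
    needed_bytes = (total_bits + 7) // 8  # round up to whole bytes
    if len(data) < needed_bytes:
        raise ValueError("input is shorter than the requested symbol count")

    mask = (1 << bits_per_symbol) - 1
    symbols: List[int] = []
    acc = 0       # bit accumulator
    acc_bits = 0  # number of unread bits currently held in acc
    for byte in data[:needed_bytes]:
        acc = (acc << 8) | byte
        acc_bits += 8
        while acc_bits >= bits_per_symbol and len(symbols) < num_symbols:
            acc_bits -= bits_per_symbol
            symbols.append((acc >> acc_bits) & mask)
        acc &= (1 << acc_bits) - 1  # discard bits already consumed
    return symbols


def iter_blocks(symbols: List[int], block_size: int = 1 << 20) -> Iterator[List[int]]:
    """Yield consecutive blocks of at most block_size symbols.

    The final block may be shorter, so inputs larger than 2^20 symbols
    are covered instead of being truncated.
    """
    for start in range(0, len(symbols), block_size):
        yield symbols[start:start + block_size]
```

For example, four 3-bit symbols occupy 12 bits, so the second byte carries 4 bits of padding, and `unpack_symbols(bytes([0b10110100, 0b11100000]), 3, 4)` still reads all four symbols. Whether large inputs should be split into blocks or handled in a single pass is a design decision for the tool; the generator only shows that nothing needs to be silently dropped.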