I am testing the Qwen3-8B model with xFT, with the dtype set to bf16 and the KV cache set to fp16, and I would like to know which instruction sets will be used. My CPU is an 8592+, which supports AVX512-BF16 and AMX-BF16.
I have two questions:
1. Will both of these instruction sets be used when the KV cache is not used?
2. If the computation does involve the KV cache, will AVX512-FP16 be used?
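
For context, here is a minimal sketch of how the CPU-side support could be confirmed on Linux, assuming the kernel exposes the usual `/proc/cpuinfo` flag names (`avx512_bf16`, `amx_bf16`, `avx512_fp16`). The helper names `parse_cpu_flags` and `isa_support` are hypothetical and are not part of xFT. It only reports what the CPU advertises, not which kernels xFT actually dispatches to.

```python
from pathlib import Path

# Linux /proc/cpuinfo flag names for the instruction sets in question
# (assumed naming; older kernels may not report all of them).
ISA_FLAGS = ("avx512_bf16", "amx_bf16", "avx512_fp16")


def parse_cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def isa_support(cpuinfo_text, wanted=ISA_FLAGS):
    """Map each wanted flag name to True/False depending on CPU support."""
    flags = parse_cpu_flags(cpuinfo_text)
    return {name: name in flags for name in wanted}


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    if path.exists():
        for name, supported in isa_support(path.read_text()).items():
            print(f"{name}: {'yes' if supported else 'no'}")
```

Running the script on the test machine prints one line per instruction set, which helps separate "the CPU lacks the instruction" from "xFT does not use the instruction".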
