I have two 8592+ EMR CPUs, with four nodes in my system. When running the run_benchmark.sh script with the "-s" parameter, the program just creates four processes, as follows:

The performance metrics show only one process, which means they reflect only a single node's performance:

Why is it designed this way? In fact, the results of my tests using all cores were no better than those of the single-node tests. Why?