Currently, gpu_min_ci_balanced_factor in nbnxn_ocl_data_mgmt.cpp is set to the same value as in the equivalent CUDA implementation.
Check whether a different value of this parameter would improve performance on AMD devices.
If necessary, use a different coefficient for each vendor.
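
A minimal sketch of what a vendor-dependent coefficient could look like. Everything here is an assumption made for illustration: the DeviceVendor enum, the function names, and the numeric values are hypothetical placeholders, not the values in nbnxn_ocl_data_mgmt.cpp and not benchmark results. The sketch also assumes that the factor is multiplied by the number of compute units of the device to get the minimum pair-list size used for balancing.

```cpp
// Hypothetical vendor tag. In real code this would be derived from the
// OpenCL device information (for example the CL_DEVICE_VENDOR string).
enum class DeviceVendor
{
    Nvidia,
    Amd,
    Other
};

// PLACEHOLDER values, chosen only so that the two cases differ.
// They are not tuned and do not come from any measurement.
constexpr int c_placeholderFactorDefault = 50; // assumed: value shared with CUDA
constexpr int c_placeholderFactorAmd     = 40; // assumed: to be found by benchmarking

// Returns the balancing factor for the given vendor.
// Only AMD gets its own value; all other vendors keep the default.
constexpr int minCiBalancedFactor(DeviceVendor vendor)
{
    return vendor == DeviceVendor::Amd ? c_placeholderFactorAmd
                                       : c_placeholderFactorDefault;
}

// Assumed use of the factor: scale it by the number of compute units
// to get the minimum number of pair-list entries for a balanced list.
constexpr int minCiBalanced(DeviceVendor vendor, int computeUnits)
{
    return minCiBalancedFactor(vendor) * computeUnits;
}
```

With this layout, tuning for AMD would only mean changing c_placeholderFactorAmd after benchmarking, while other vendors keep the current behaviour.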