Checking the BADGR source code, I found that the MPPI planner generates its actions with:
action_h = self._beta * (shifted_mean[h, :] + eps[:, h, :]) + (1. - self._beta) * actions[-1]
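
To make the recursion in that line concrete, here is a minimal NumPy sketch of how I read it. The function name, the argument shapes, and the way the `actions` list is seeded with a previous action are my assumptions for illustration, not the actual BADGR implementation; only the update line itself is taken from the source.

```python
import numpy as np

def sample_smoothed_actions(shifted_mean, eps, beta, prev_action):
    """Hypothetical stand-alone version of the smoothing step.

    shifted_mean: (H, A) mean action sequence, shifted by one step
    eps:          (N, H, A) sampled noise for N candidate sequences
    beta:         smoothing coefficient in [0, 1]
    prev_action:  (A,) action used to seed the recursion (assumed)
    """
    n_samples, horizon, action_dim = eps.shape
    # Assumption: the list starts with the previous action, repeated per sample.
    actions = [np.broadcast_to(prev_action, (n_samples, action_dim))]
    for h in range(horizon):
        # Same form as the quoted line: blend the noisy mean at step h
        # with the previously generated action of the same sample.
        action_h = beta * (shifted_mean[h, :] + eps[:, h, :]) \
            + (1.0 - beta) * actions[-1]
        actions.append(action_h)
    # Drop the seed and stack to shape (N, H, A).
    return np.stack(actions[1:], axis=1)
```

Under this reading, the second term at step h is the smoothed action produced at step h-1 for the same sample, because `actions[-1]` is whatever was appended last.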
So I think there is a typo in Eqn. 3 of the paper, which is

I guess it should be
