Now that the back-end server is live, it should probably behave like a real server, spinning up threads (up to a threshold) to handle requests concurrently. One could argue for introducing a load balancer, but for now a hard limit on the number of concurrent requests is enough.
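
As a rough illustration of the hard limit, here is a minimal Python sketch. It assumes a thread pool sized to the threshold plus a semaphore that turns away requests once every slot is taken. The names `BoundedDispatcher`, `try_submit`, and `MAX_CONCURRENT_REQUESTS`, as well as the threshold of 4, are hypothetical and not taken from the existing server code.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable, Optional

# Hypothetical threshold; the real value would depend on the server's capacity.
MAX_CONCURRENT_REQUESTS = 4


class BoundedDispatcher:
    """Runs request handlers on worker threads, up to a fixed limit."""

    def __init__(self, limit: int = MAX_CONCURRENT_REQUESTS) -> None:
        # One semaphore slot per request that may be in flight at once.
        self._slots = threading.BoundedSemaphore(limit)
        self._pool = ThreadPoolExecutor(max_workers=limit)

    def _run(self, handler: Callable[[Any], Any], request: Any) -> Any:
        # Free the slot when the handler finishes, even if it raises.
        try:
            return handler(request)
        finally:
            self._slots.release()

    def try_submit(
        self, handler: Callable[[Any], Any], request: Any
    ) -> Optional[Future]:
        """Return a Future for the request, or None if the limit is reached."""
        # Non-blocking acquire: refuse immediately instead of queueing.
        if not self._slots.acquire(blocking=False):
            return None
        try:
            return self._pool.submit(self._run, handler, request)
        except BaseException:
            # Submission failed (e.g. pool already shut down): give the slot back.
            self._slots.release()
            raise

    def shutdown(self) -> None:
        # Wait for in-flight requests to finish before stopping the workers.
        self._pool.shutdown(wait=True)
```

In this sketch a caller that receives `None` from `try_submit` would be expected to reject the request (for an HTTP server, perhaps with a 503 response). Blocking until a slot frees up is the other obvious policy; rejecting outright keeps the limit hard, which seems closer to what is described above.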