As it stands, the combination of global memory and async code sends multiple redundant requests to the AI, making the demo slow, expensive, or both.
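
To make the failure mode concrete, here is a minimal sketch in Python with `asyncio`. It is not the demo's actual code: `fake_ai_request`, `naive_get`, `deduped_get`, and the module-level dictionaries are all hypothetical names, and the AI call is simulated with a short sleep. It shows how a check-then-await on shared global state lets every concurrent caller fire its own request, and one possible way to avoid that by sharing the in-flight task.

```python
import asyncio

call_count = 0   # number of simulated AI requests issued
_cache = {}      # global memory shared by every coroutine (assumed setup)
_inflight = {}   # prompt -> asyncio.Task for requests still in progress


async def fake_ai_request(prompt):
    """Hypothetical stand-in for a slow, billable AI call."""
    global call_count
    call_count += 1
    await asyncio.sleep(0.01)
    return f"answer to {prompt}"


async def naive_get(prompt):
    # Check-then-await: every coroutine that arrives before the first
    # response is stored sees an empty cache and issues its own request.
    if prompt not in _cache:
        _cache[prompt] = await fake_ai_request(prompt)
    return _cache[prompt]


async def deduped_get(prompt):
    # Register the task before awaiting it, so later callers reuse the
    # same in-flight request instead of starting a new one.
    task = _inflight.get(prompt)
    if task is None:
        task = asyncio.ensure_future(fake_ai_request(prompt))
        _inflight[prompt] = task
    return await task


async def count_calls(getter, n=5):
    """Run n concurrent lookups of one prompt and count the requests made."""
    global call_count
    call_count = 0
    _cache.clear()
    _inflight.clear()
    await asyncio.gather(*(getter("hello") for _ in range(n)))
    return call_count


def run_demo():
    async def main():
        naive = await count_calls(naive_get)
        deduped = await count_calls(deduped_get)
        return naive, deduped
    return asyncio.run(main())


if __name__ == "__main__":
    print(run_demo())
```

With five concurrent callers, the naive version issues one request per caller, while the deduplicated version issues a single request that all callers await. A real fix would also need to decide when entries in `_inflight` expire and how failed requests are retried, which this sketch leaves out.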