Implement Streaming mode for Medium Model

The Medium model is slower because it makes more LLM calls. However, these extra calls are internal behavior, not part of a client-facing review, so the added delay may seem unusual. A possible solution is to return the response in chunks using an AsyncGenerator.
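
A minimal sketch of the AsyncGenerator idea, assuming a Python backend. The names `fake_llm_call`, `stream_medium_response`, and `collect` are hypothetical placeholders, not existing functions in this codebase, and the sleep stands in for real model latency. It shows each partial result being yielded as soon as its LLM call finishes, instead of waiting for all calls to complete.

```python
import asyncio
from typing import AsyncGenerator, List


async def fake_llm_call(step: int) -> str:
    """Hypothetical stand-in for one of the Medium model's LLM calls."""
    await asyncio.sleep(0.01)  # simulate network/model latency
    return f"chunk-{step}"


async def stream_medium_response(num_calls: int) -> AsyncGenerator[str, None]:
    """Yield each partial result as soon as its LLM call completes."""
    for step in range(num_calls):
        yield await fake_llm_call(step)


async def collect(num_calls: int) -> List[str]:
    """Example consumer; a real caller would forward each chunk to the client."""
    chunks: List[str] = []
    async for chunk in stream_medium_response(num_calls):
        chunks.append(chunk)
    return chunks


if __name__ == "__main__":
    print(asyncio.run(collect(3)))
```

With this shape the total time is unchanged, but the client sees the first chunk after one call rather than after all of them, which may make the extra latency less noticeable.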