Feature Request: Model unload

Hi,
thanks for the hard work!

Would it be possible to unload the model from VRAM after a certain time?
For testing, and when VRAM is constrained because multiple services are running, that would be really helpful.

Something like the "keep_alive" option in Ollama:
"0" to unload immediately after the request,
"-1" to never unload,
"5m" to unload 5 minutes after the last request.

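To make the semantics I have in mind concrete, here is a rough Python sketch of how the three values could be interpreted. The names `parse_keep_alive` and `should_unload` are hypothetical and are not part of this project or of Ollama. Treating a bare number as seconds and supporting only the `s`/`m`/`h` suffixes are my own assumptions.

```python
import re
from typing import Optional

# Seconds per supported unit suffix (assumed set: seconds, minutes, hours).
_UNITS = {"s": 1, "m": 60, "h": 3600}


def parse_keep_alive(value: str) -> Optional[float]:
    """Return the idle time in seconds before unloading, or None for "never unload"."""
    text = value.strip().lower()
    match = re.fullmatch(r"(-?\d+)([smh]?)", text)
    if match is None:
        raise ValueError(f"unrecognized keep_alive value: {value!r}")
    amount = int(match.group(1))
    if amount < 0:
        # Any negative value means the model stays loaded indefinitely.
        return None
    unit = match.group(2) or "s"  # assumption: a bare number is in seconds
    return float(amount * _UNITS[unit])


def should_unload(keep_alive: str, idle_seconds: float) -> bool:
    """Decide whether a model idle for `idle_seconds` should be unloaded."""
    timeout = parse_keep_alive(keep_alive)
    return timeout is not None and idle_seconds >= timeout
```

With this sketch, `should_unload("0", 0)` is true right after a request, `should_unload("-1", ...)` is never true, and `should_unload("5m", idle)` becomes true once `idle` reaches 300 seconds.
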
Thank you