Model / Session Memory Bloat with New Sessions #113

@james-333i

When sessions are created, models are loaded into memory but never released. If you switch models for a new chat, the new model is loaded alongside the ones already in memory. Load enough models and the device eventually crashes with an out-of-memory (OOM) error.

Requesting a function to clear models from a session and/or the ability to destroy a session completely, so that its memory can be freed.

This is especially problematic with local MLX models, as opposed to foundation or remote API models, because the local models themselves are the massive memory consumer.

In other similar libraries, the approach is to keep a model in memory only as long as the session object is alive. After a respond or streamResponse request completes and the calling function returns, the model is released automatically UNLESS you hold the session in a strong property. To release it later, you just set that strong property to nil.
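A minimal sketch of that lifetime-based approach, to make the request concrete. Every type here (`MemoryTracker`, `LoadedModel`, `Session`, `ChatController`) is a hypothetical stand-in and not this library's API. The sketch assumes the session is the only strong owner of its model and that no global cache holds a second reference.

```swift
// Counts live model instances so the example can observe deallocation.
final class MemoryTracker {
    var live = 0
}

// Hypothetical stand-in for a large local model; not this library's API.
final class LoadedModel {
    let name: String
    private let tracker: MemoryTracker

    init(name: String, tracker: MemoryTracker) {
        self.name = name
        self.tracker = tracker
        tracker.live += 1
    }

    deinit { tracker.live -= 1 }
}

// Hypothetical session that is the sole strong owner of its model.
final class Session {
    private let model: LoadedModel

    init(modelName: String, tracker: MemoryTracker) {
        model = LoadedModel(name: modelName, tracker: tracker)
    }

    func respond(to prompt: String) -> String {
        "[\(model.name)] \(prompt)"
    }
}

// The session is a local value, so it and its model are released
// by the time this function has returned.
func oneShot(tracker: MemoryTracker) -> String {
    let session = Session(modelName: "local-a", tracker: tracker)
    return session.respond(to: "hello")
}

// Holding the session in a strong property keeps the model alive
// until the property is set to nil.
final class ChatController {
    var session: Session?

    init(tracker: MemoryTracker) {
        session = Session(modelName: "local-b", tracker: tracker)
    }
}
```

With this shape, a caller frees the model by writing `controller.session = nil`, and one-off requests need no cleanup at all.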

Otherwise, it seems this needs a cleanup method.
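For illustration, a cleanup method could look roughly like the sketch below. `Weights`, `Session`, `unloadModel()`, and `isModelLoaded` are hypothetical names and not part of this library. The sketch assumes the model is loaded lazily on first use, so a session stays usable after it has been unloaded.

```swift
// Hypothetical stand-in for model weights; not this library's API.
final class Weights {
    let name: String
    init(name: String) { self.name = name }
}

// Hypothetical session with lazy loading and an explicit cleanup method.
final class Session {
    let modelName: String
    private var weights: Weights?        // nil while nothing is loaded
    private(set) var loadCount = 0       // how many times weights were loaded

    init(modelName: String) {
        self.modelName = modelName
    }

    var isModelLoaded: Bool { weights != nil }

    func respond(to prompt: String) -> String {
        let loaded = loadIfNeeded()
        return "[\(loaded.name)] \(prompt)"
    }

    /// Drops the session's reference so the weights can be freed.
    func unloadModel() {
        weights = nil
    }

    private func loadIfNeeded() -> Weights {
        if let existing = weights { return existing }
        let fresh = Weights(name: modelName)  // a real loader would read from disk here
        weights = fresh
        loadCount += 1
        return fresh
    }
}
```

The trade-off is that the next request after `unloadModel()` pays the full load cost again, which is what `loadCount` makes visible.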

This is critical when you need to use different models for different purposes.

Note that Ollama handles this on a first-in, first-out basis and has options for how many models to keep in memory at once, so you can keep several loaded if you want to.
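A rough sketch of a first-in, first-out policy with a configurable limit, in case it helps the discussion. `ModelCache`, `LoadedModel`, and `capacity` are hypothetical names. This is neither Ollama's implementation nor this library's API, only an illustration of the eviction rule.

```swift
// Hypothetical stand-in for a loaded model.
final class LoadedModel {
    let name: String
    init(name: String) { self.name = name }
}

// Keeps at most `capacity` models and evicts the one that was
// loaded earliest (first in, first out).
final class ModelCache {
    private let capacity: Int
    private var order: [String] = []                 // load order, oldest first
    private var models: [String: LoadedModel] = [:]

    init(capacity: Int) {
        precondition(capacity > 0, "capacity must be positive")
        self.capacity = capacity
    }

    var loadedNames: [String] { order }

    func model(named name: String) -> LoadedModel {
        // Cache hit: reuse the model. FIFO order does not change on access.
        if let cached = models[name] { return cached }

        // Cache full: drop the cache's reference to the oldest model.
        if order.count == capacity {
            let oldest = order.removeFirst()
            models[oldest] = nil
        }

        let fresh = LoadedModel(name: name)
        models[name] = fresh
        order.append(name)
        return fresh
    }
}
```

With `capacity` set to 2, requesting "a", "b", "a", then "c" evicts "a", because FIFO ignores how recently a model was used. An evicted model is only freed once no session still holds a strong reference to it.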

Labels: bug
