LLM inference pipeline with async workload orchestration

thenameisvicky/llm-wrapper

Description

  • This is a CPU-only, queue-based inference system for running local LLMs under constrained resources.
  • I built this out of curiosity, to learn how request queues and CPU-based LLM inference work together.
  • I made this README.md as readable as possible; I hope it helps.
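The core idea above — requests enter a queue and a single CPU worker drains them one at a time — can be sketched as follows. This is a minimal illustration, not the repo's actual code: `run_inference` is a hypothetical placeholder standing in for a real local-LLM call.

```python
import asyncio

async def run_inference(prompt: str) -> str:
    # Hypothetical stand-in for a real CPU-bound LLM call
    # (e.g. via llama.cpp bindings -- an assumption, not this repo's API).
    await asyncio.sleep(0)  # yield control, simulating work
    return f"echo: {prompt}"

async def worker(queue: asyncio.Queue, results: list) -> None:
    # Pull requests one at a time so only one inference runs at once,
    # keeping memory and CPU load bounded on constrained hardware.
    while True:
        prompt = await queue.get()
        results.append(await run_inference(prompt))
        queue.task_done()

async def main() -> list:
    queue: asyncio.Queue = asyncio.Queue()
    results: list = []
    task = asyncio.create_task(worker(queue, results))
    for p in ["hello", "world"]:
        await queue.put(p)       # enqueue incoming requests
    await queue.join()           # block until every request is processed
    task.cancel()                # shut the worker down cleanly
    return results

if __name__ == "__main__":
    print(asyncio.run(main()))
```

Serializing inference through a queue like this is what lets a single low-memory CPU box accept bursts of requests without running models concurrently.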

Architecture
