
[Doc] Add FAQ section and multi-model troubleshooting guide #227

Open
yurekami wants to merge 1 commit into ovg-project:main from yurekami:docs/multi-model-guide-and-faq


Summary

Addresses issues #183 and #193.

Changes

FAQ Section (README.md)

  • Explain kvcached vs Paged Attention with comparison table
  • Document that --gpu-memory-utilization should NOT be used with kvcached
  • Add prefix caching compatibility note
  • Add kvctl monitoring instructions

Multi-Model Troubleshooting (controller/README.md)

  • Add troubleshooting section for memory allocation failures
  • Document startup delays between instances (launch_delay_seconds)
  • Add kvctl monitoring examples
  • Create comparison table for memory settings
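To illustrate the startup-delay item above, here is a minimal sketch of staggering instance launches. It assumes `launch_delay_seconds` is applied between consecutive launches. The helper name `launch_offsets` and the instance names are hypothetical, and how the controller actually applies the setting is not shown in this PR.

```python
def launch_offsets(instance_names, launch_delay_seconds):
    """Map each instance to its start offset in seconds.

    Assumption: the delay is applied between consecutive launches, so the
    first instance starts immediately and each later one waits one more
    delay interval than the one before it.
    """
    if launch_delay_seconds < 0:
        raise ValueError("launch_delay_seconds must be non-negative")
    return {
        name: index * launch_delay_seconds
        for index, name in enumerate(instance_names)
    }


if __name__ == "__main__":
    # Hypothetical instance names, for illustration only.
    for name, offset in launch_offsets(["model-a", "model-b", "model-c"], 30).items():
        print(f"{name}: start after {offset}s")
```

Under this assumption, a delay gives each instance time to finish its initial allocation before the next one starts.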

Config Fix (controller/example-config.yaml)

  • Remove conflicting --gpu-memory-utilization and --mem-fraction-static flags
  • Add explanatory comments about kvcached memory management
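As a rough aid for checking existing launch commands against this fix, the sketch below scans a command string for the two flags this PR removes. The function name `find_conflicting_flags` and the sample command are hypothetical. The flag names come from this PR, and nothing here reflects the actual layout of `example-config.yaml`.

```python
import shlex

# Flags this PR removes from the example config as conflicting with kvcached.
CONFLICTING_FLAGS = ("--gpu-memory-utilization", "--mem-fraction-static")


def find_conflicting_flags(command):
    """Return the conflicting flags found in a launch command string."""
    found = []
    for token in shlex.split(command):
        # Accept both "--flag value" and "--flag=value" spellings.
        name = token.split("=", 1)[0]
        if name in CONFLICTING_FLAGS and name not in found:
            found.append(name)
    return found


if __name__ == "__main__":
    # Hypothetical launch command, for illustration only.
    cmd = "my-engine serve model-a --port 8000 --gpu-memory-utilization 0.5"
    print(find_conflicting_flags(cmd))
```

An empty result means neither flag is present in the command.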

Test plan

  • Verify markdown renders correctly
  • User feedback on documentation clarity

Closes #183
Closes #193

🤖 Generated with Claude Code

Addresses issues ovg-project#183 and ovg-project#193:

## FAQ Section (README.md)
- Add explanation of kvcached vs Paged Attention difference
- Document memory management settings (do NOT use --gpu-memory-utilization)
- Add prefix caching compatibility note
- Add kvctl monitoring instructions

## Multi-Model Troubleshooting (controller/README.md)
- Add troubleshooting section for memory allocation failures
- Document startup delays between instances
- Add kvctl monitoring examples
- Create comparison table for memory settings

## Config Fix (example-config.yaml)
- Remove conflicting --gpu-memory-utilization and --mem-fraction-static
- Add explanatory comments about kvcached memory management

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Development

Successfully merging this pull request may close these issues.

  • start mutiple models
  • What is the difference between kvcached and Paged Attention ?
