This document answers a practical question for each profile:
what should I expect to become reachable after startup, and what should I check first?
For any profile:
scripts/aoa-profile-modules --profile <name> --paths
scripts/aoa-profile-endpoints --profile <name>
scripts/aoa-render-services --profile <name>
scripts/aoa-up --profile <name>
scripts/aoa-wait --profile <name>
scripts/aoa-smoke --profile <name>
If the profile includes internal-only services, follow with:
scripts/aoa-internal-probes --profile <name>
Or combine host-facing and internal-only checks in one pass:
scripts/aoa-smoke --with-internal --profile <name>
For profiles that include canonical llama.cpp inference, aoa-up now performs a bounded readiness wait on the llama.cpp health surface before handing control back to the operator.
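When debugging that readiness wait by hand, the bounded poll can be sketched with plain bash and curl. This is a sketch of the idea, not aoa-up itself; the URL and timeout are illustrative values, not the tool's defaults:

```shell
#!/usr/bin/env bash
# Bounded readiness-wait sketch: poll an HTTP health URL until it answers,
# or give up at the deadline. Illustrates the shape of aoa-up's wait only.
wait_for_health() {
  local url=$1 deadline=$((SECONDS + ${2:-60}))
  while (( SECONDS < deadline )); do
    if curl -fsS --max-time 2 "$url" >/dev/null 2>&1; then
      echo "ready: $url"
      return 0
    fi
    sleep 1
  done
  echo "timed out: $url"
  return 1
}

# Example against the core-profile llama.cpp health surface, short timeout:
wait_for_health http://127.0.0.1:11435/health 3 || true
```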
If you want a bounded alternate llama.cpp benchmark or promotion lane beyond the canonical runtime path, use:
scripts/aoa-llamacpp-pilot run --preset intel-full
That lane is separate from the canonical profile-driven runtime and exists only for explicit benchmark or promotion work. Use the same explicit lane for additive Intel 285H host-profile work such as Gemma 4 screening, Vulkan-first validation, or KV-cache candidate checks. Keep those runs as benchmark or promotion artifacts until machine-fit and reviewed runtime docs say otherwise.
Common Intel 285H candidate examples:
scripts/aoa-llamacpp-pilot run --preset intel-full --overlay compose/tuning/llamacpp.intel-285h.cpu-safe.yml
scripts/aoa-llamacpp-pilot run --preset intel-full --overlay compose/tuning/llamacpp.intel-285h.cpu-balanced.yml --overlay compose/tuning/llamacpp.intel-285h.server-cache.yml
scripts/aoa-llamacpp-pilot run --preset intel-full --overlay compose/tuning/llamacpp.intel-285h.vulkan-lab.yml
vulkan-lab is a dedicated image-seam packet, not just a device flag. It swaps llama-cpp to ghcr.io/ggml-org/llama.cpp:server-vulkan for that run.
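The image-seam swap can be pictured as a compose overlay of roughly this shape. This is a sketch of the idea, not the packet's actual file; only the image reference comes from the behavior described above, and the surrounding keys are assumed:

```yaml
# Illustrative overlay sketch: override the llama-cpp service image for a
# single vulkan-lab run. Everything except the image value is assumed.
services:
  llama-cpp:
    image: ghcr.io/ggml-org/llama.cpp:server-vulkan
```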
When one of those challenger packets looks strong enough to replace the current live winner, move to RUNTIME_WINNER_PROMOTION_LOOP rather than promoting directly from the pilot output.
Before screening a new donor, open or create its entry in MODEL_CARDS and keep the donor explicit in the packet.
Model-card-first Intel text screening:
scripts/aoa-sync-configs
export AOA_OVMS_TEXT_SOURCE_MODEL=OpenVINO/Qwen3-8B-int4-ov
export AOA_OVMS_TEXT_MODEL_NAME=OpenVINO/Qwen3-8B-int4-ov
podman compose \
-f /srv/abyss-stack/Configs/compose/tuning/intel-text.ovms-gpu-lab.yml \
-f /srv/abyss-stack/Configs/compose/tuning/intel-text.ovms-qwen3-settings.yml \
up -d
scripts/aoa-qwen-check --case exact-reply --url http://127.0.0.1:5404/run
scripts/aoa-qwen-bench --profile intel --url http://127.0.0.1:5404/run --backend-label "langchain-api-intel-text -> ovms-openai" --model-label "OpenVINO/Qwen3-8B-int4-ov" --runtime-variant "OVMS text-generation sidecar on GPU" --target-label "intel-text-qwen3-8b-int4-gpu-lab"
This same standalone OVMS sidecar lab packet is the first explicit non-llama.cpp Intel text lane; use it instead of rewriting the canonical profile.
Use LLAMACPP_PILOT for the full operator contract.
The smallest useful local substrate. Good for validating storage, orchestration, and local model-serving basics.
postgres -> 127.0.0.1:5432
redis -> 127.0.0.1:6379
qdrant -> http://127.0.0.1:6333/
neo4j -> http://127.0.0.1:7474/
n8n -> http://127.0.0.1:5678/
llama-cpp -> http://127.0.0.1:11435/health
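Those endpoints split into two probe styles: postgres and redis expose raw TCP ports, while the rest answer HTTP. A hand-check sketch using only bash and curl (this is not an aoa tool; the ports and paths are the ones listed above):

```shell
#!/usr/bin/env bash
# Hand probes for the core endpoints: raw TCP connect for postgres/redis,
# HTTP GET for the web-facing services.
tcp_up()  { timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null; }
http_up() { curl -fsS --max-time 2 "$1" >/dev/null 2>&1; }

tcp_up  127.0.0.1 5432                && echo "postgres up"  || echo "postgres down"
tcp_up  127.0.0.1 6379                && echo "redis up"     || echo "redis down"
http_up http://127.0.0.1:6333/        && echo "qdrant up"    || echo "qdrant down"
http_up http://127.0.0.1:7474/        && echo "neo4j up"     || echo "neo4j down"
http_up http://127.0.0.1:5678/        && echo "n8n up"       || echo "n8n down"
http_up http://127.0.0.1:11435/health && echo "llama-cpp up" || echo "llama-cpp down"
```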
scripts/aoa-profile-endpoints --profile core
scripts/aoa-render-services --profile core
scripts/aoa-up --profile core
scripts/aoa-wait --profile core
scripts/aoa-smoke --profile core
The generic local agent runtime.
This profile uses langchain-api -> llama.cpp as the canonical chat path and does not require OVMS.
All core endpoints, plus:
langchain-api -> http://127.0.0.1:5403/health
scripts/aoa-profile-endpoints --profile agentic
scripts/aoa-render-services --profile agentic
scripts/aoa-up --profile agentic
scripts/aoa-wait --profile agentic
scripts/aoa-smoke --profile agentic
scripts/aoa-qwen-check --case exact-reply
scripts/aoa-qwen-bench --profile agentic
The Intel-aware agent runtime.
This profile adds OVMS and applies the Intel overlay for the canonical agent API.
In the current reviewed posture, embeddings move to OVMS while the canonical chat path stays on llama.cpp.
Broader Intel-serving lanes remain additive and separately reviewed rather than silently promoted through this profile.
If you are screening an explicit Intel-served text lane, point langchain-api at it through LC_BASE_URL, LC_API_KEY, and LC_MODEL in the secret langchain-api.env file rather than rewriting the profile itself.
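A sketch of what that secret-file override might look like. The three variable names come from the sentence above; every value here is an illustrative placeholder, not a reviewed setting:

```shell
# Illustrative Secrets/Configs/langchain-api.env fragment for screening an
# explicit Intel-served text lane.
# Placeholder base URL: point this at the lane actually under screening.
LC_BASE_URL=http://127.0.0.1:8200/v3
# Placeholder key: many local lanes ignore it, but the variable must be set.
LC_API_KEY=local-screening-key
# Donor under screening, kept explicit per MODEL_CARDS.
LC_MODEL=OpenVINO/Qwen3-8B-int4-ov
```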
All agentic endpoints, plus:
ovms rest -> http://127.0.0.1:8200/v2/health/live
ovms grpc -> 127.0.0.1:9200
scripts/aoa-doctor
scripts/aoa-profile-endpoints --profile intel
scripts/aoa-render-services --profile intel
scripts/aoa-up --profile intel
scripts/aoa-wait --profile intel
scripts/aoa-smoke --profile intel
scripts/aoa-qwen-check --case exact-reply
scripts/aoa-qwen-bench --profile intel
A localhost-only federation seam that reads mirrored aoa-agents contracts, mirrored aoa-routing advisory surfaces, mirrored aoa-memo recall surfaces, mirrored aoa-evals eval-selection surfaces, mirrored aoa-playbooks activation/composition advisory surfaces, mirrored aoa-kag retrieval/regrounding surfaces, and a source-owned tos-source handoff companion from the runtime tree.
This profile is metadata-only for reads and does not change langchain-api, but it also enables filesystem-first memo export candidates and filesystem-first eval export candidates.
route-api -> http://127.0.0.1:5402/health
scripts/aoa-sync-federation-surfaces --layer aoa-agents
scripts/aoa-sync-federation-surfaces --layer aoa-routing
scripts/aoa-sync-federation-surfaces --layer aoa-memo
scripts/aoa-sync-federation-surfaces --layer aoa-evals
scripts/aoa-sync-federation-surfaces --layer aoa-playbooks
scripts/aoa-sync-federation-surfaces --layer aoa-kag
scripts/aoa-sync-federation-surfaces --layer tos-source
scripts/aoa-profile-endpoints --profile federation
scripts/aoa-render-services --profile federation
scripts/aoa-up --profile federation
scripts/aoa-wait --profile federation
scripts/aoa-smoke --profile federation
A route-first helper for Tree of Sophia graph curation.
This slice uses the storage substrate so neo4j is available, but it keeps the
helper itself projection-only, read-first, and localhost-only.
Machine-fit overlays that do not touch the selected services are skipped
automatically in this profile, so it does not silently pull in llama-cpp.
postgres -> 127.0.0.1:5432
redis -> 127.0.0.1:6379
qdrant -> http://127.0.0.1:6333/
neo4j -> http://127.0.0.1:7474/
tos-graph -> http://127.0.0.1:5410/health
scripts/aoa-profile-endpoints --profile curation
scripts/aoa-render-services --profile curation
scripts/aoa-up --profile curation
scripts/aoa-wait --profile curation
scripts/aoa-smoke --profile curation
Before launch, ensure AOA_TOS_ROOT points at the real Tree-of-Sophia checkout and Secrets/Configs/tos-graph.env exists in the deployed runtime.
If TOS_GRAPH_NEO4J_PASSWORD is not set there, tos-graph falls back to the mounted ${AOA_STACK_ROOT}/Configs/stack.env and derives the password from NEO4J_AUTH.
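That fallback derivation is easy to reproduce by hand: NEO4J_AUTH conventionally packs user and password as user/password, so the password is everything after the first slash. A sketch (the sample value is illustrative only):

```shell
#!/usr/bin/env bash
# Derive the tos-graph password from NEO4J_AUTH the way the fallback above
# describes: NEO4J_AUTH holds "user/password", strip up to the first "/".
unset TOS_GRAPH_NEO4J_PASSWORD
NEO4J_AUTH="neo4j/example-password"            # illustrative value only
: "${TOS_GRAPH_NEO4J_PASSWORD:=${NEO4J_AUTH#*/}}"
echo "$TOS_GRAPH_NEO4J_PASSWORD"               # -> example-password
```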
Optional helper surfaces for speech and browser-like tooling.
qwen-tts -> http://127.0.0.1:5101/health
tts-router -> http://127.0.0.1:5201/health
docs-api is internal-only
aoa-browser is internal-only
scripts/aoa-profile-endpoints --profile tools
scripts/aoa-render-services --profile tools
scripts/aoa-up --profile tools
scripts/aoa-wait --profile tools
scripts/aoa-smoke --profile tools
scripts/aoa-internal-probes --profile tools
Optional visibility into the body rather than the body itself.
prometheus -> http://127.0.0.1:9090/-/ready
alertmanager -> http://127.0.0.1:9093/-/ready
grafana -> http://127.0.0.1:3000/api/health
cadvisor is internal-only
scripts/aoa-profile-endpoints --profile observability
scripts/aoa-render-services --profile observability
scripts/aoa-up --profile observability
scripts/aoa-wait --profile observability
scripts/aoa-smoke --profile observability
scripts/aoa-internal-probes --profile observability
What it gives you:
- the generic local agent path
- speech endpoints on the host
- browser-tools surfaces kept internal-only
Try:
scripts/aoa-profile-modules --profile agentic --profile tools --paths
scripts/aoa-profile-endpoints --profile agentic --profile tools
scripts/aoa-render-services --profile agentic --profile tools
scripts/aoa-up --profile agentic --profile tools
scripts/aoa-smoke --with-internal --profile agentic --profile tools
Preset form:
aoa-preset-profiles --preset agent-tools --paths
aoa-up --preset agent-tools
aoa-smoke --with-internal --preset agent-tools
aoa-qwen-bench --preset agent-tools
What it gives you:
- the generic local agent path
- dashboards and metrics visibility
- internal-only cadvisor
Try:
scripts/aoa-profile-modules --profile agentic --profile observability --paths
scripts/aoa-profile-endpoints --profile agentic --profile observability
scripts/aoa-render-services --profile agentic --profile observability
scripts/aoa-up --profile agentic --profile observability
scripts/aoa-smoke --with-internal --profile agentic --profile observability
Preset form:
aoa-preset-profiles --preset agent-observability --paths
aoa-up --preset agent-observability
aoa-smoke --with-internal --preset agent-observability
aoa-qwen-bench --preset agent-observability
What it gives you:
- the generic local agent path
- a localhost-only federation seam for mirrored aoa-agents contracts, aoa-routing advisory surfaces, aoa-memo recall surfaces, aoa-evals eval-selection surfaces, aoa-playbooks activation/composition advisory surfaces, aoa-kag retrieval/regrounding surfaces, and the tos-source handoff companion
- filesystem-first memo export candidates under Logs/memo-exports/
- filesystem-first eval export candidates under Logs/eval-exports/
- no change to the existing /run or /embeddings surfaces
Try:
scripts/aoa-sync-federation-surfaces --layer aoa-agents
scripts/aoa-sync-federation-surfaces --layer aoa-routing
scripts/aoa-sync-federation-surfaces --layer aoa-memo
scripts/aoa-sync-federation-surfaces --layer aoa-evals
scripts/aoa-sync-federation-surfaces --layer aoa-playbooks
scripts/aoa-sync-federation-surfaces --layer aoa-kag
scripts/aoa-sync-federation-surfaces --layer tos-source
scripts/aoa-profile-modules --profile agentic --profile federation --paths
scripts/aoa-profile-endpoints --profile agentic --profile federation
scripts/aoa-render-services --profile agentic --profile federation
scripts/aoa-up --profile agentic --profile federation
scripts/aoa-smoke --profile agentic --profile federation
scripts/aoa-federated-check
Preset form:
aoa-preset-profiles --preset agent-federation --paths
aoa-up --preset agent-federation
aoa-smoke --preset agent-federation
aoa-federated-check
If you want the live federated advisory consumer path, set AOA_FEDERATED_RUN_ENABLED=true in the runtime-secret Secrets/Configs/langchain-api.env file before startup.
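Before startup it is cheap to confirm which way the gate is set. A grep sketch against the file named above (nothing aoa-specific is assumed, only grep):

```shell
#!/usr/bin/env bash
# Report the federated-gate state from the runtime-secret env file.
# Path and variable name are the ones given above.
env_file="Secrets/Configs/langchain-api.env"
if grep -qx 'AOA_FEDERATED_RUN_ENABLED=true' "$env_file" 2>/dev/null; then
  echo "federated gate: enabled"
else
  echo "federated gate: disabled (or file missing)"
fi
```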
When that gate is intentionally on, prove the live advisory boundary explicitly:
scripts/aoa-federated-check --require-enabled
scripts/aoa-federated-check --require-enabled --playbook-id AOA-P-0008
scripts/aoa-federated-check --require-enabled --inspect-id AOA-K-0011
scripts/aoa-federated-check --require-enabled --memo-id AOA-M-0001
What it gives you:
- the Intel-aware agent runtime with OVMS
- the same localhost-only federation seam: aoa-routing advisory layer, aoa-memo recall layer, aoa-evals eval-selection layer, aoa-playbooks advisory layer, and aoa-kag/tos-source handoff layer
- the same filesystem-first memo export candidates
- filesystem-first eval export candidates under Logs/eval-exports/
- no change to the existing Intel overlay contract
Try:
scripts/aoa-sync-federation-surfaces --layer aoa-agents
scripts/aoa-sync-federation-surfaces --layer aoa-routing
scripts/aoa-sync-federation-surfaces --layer aoa-memo
scripts/aoa-sync-federation-surfaces --layer aoa-evals
scripts/aoa-sync-federation-surfaces --layer aoa-playbooks
scripts/aoa-sync-federation-surfaces --layer aoa-kag
scripts/aoa-sync-federation-surfaces --layer tos-source
scripts/aoa-profile-modules --profile intel --profile federation --paths
scripts/aoa-profile-endpoints --profile intel --profile federation
scripts/aoa-render-services --profile intel --profile federation
scripts/aoa-up --profile intel --profile federation
scripts/aoa-smoke --profile intel --profile federation
scripts/aoa-federated-check
Preset form:
aoa-preset-profiles --preset intel-federation --paths
aoa-up --preset intel-federation
aoa-smoke --preset intel-federation
aoa-federated-check
If you want the live federated advisory consumer path, set AOA_FEDERATED_RUN_ENABLED=true in the runtime-secret Secrets/Configs/langchain-api.env file before startup.
When that gate is intentionally on, prove the live advisory boundary explicitly:
scripts/aoa-federated-check --require-enabled
scripts/aoa-federated-check --require-enabled --playbook-id AOA-P-0008
scripts/aoa-federated-check --require-enabled --inspect-id AOA-K-0011
scripts/aoa-federated-check --require-enabled --memo-id AOA-M-0001
What it gives you:
- Intel-aware agent runtime with OVMS
- speech helpers
- observability surfaces
- all internal-only surfaces checked in one pass
Try:
scripts/aoa-profile-modules --profile intel,tools,observability --paths
scripts/aoa-profile-endpoints --profile intel,tools,observability
scripts/aoa-render-services --profile intel,tools,observability
scripts/aoa-up --profile intel,tools,observability
scripts/aoa-smoke --with-internal --profile intel,tools,observability
Preset form:
aoa-preset-profiles --preset intel-full --paths
aoa-up --preset intel-full
aoa-smoke --with-internal --preset intel-full
aoa-qwen-bench --preset intel-full