A Trusted Execution Environment (TEE) provides a secure enclave for executing code, but it cannot host a database internally. This creates a challenge: how can an application running inside a TEE (like an AI agent) interact with a database that resides in the untrusted outside world, while ensuring the data has not been tampered with?
This Proof of Concept explores a solution to this problem by comparing two primary architectural approaches.
The first approach treats the agent's entire memory or database state as decentralized content. The data is stored on a network like IPFS, and its integrity is anchored to the blockchain using a Merkle tree, similar to Solana's compressed NFT (cNFT) standard.
How it works:
- Content: Stored on IPFS.
- Ownership & State: Managed in an off-chain Merkle tree.
- Proof of Integrity: The Merkle root of the tree is stored on the Solana blockchain.
Pros:
- Full Decentralization: The entire state is distributed and not reliant on a single database server.
- Data Recoverability: Since all data is on a public network, the state can be fully reconstructed from the chain and IPFS, even if the primary service is lost.
Cons:
- High Latency: Every state change requires interacting with IPFS and recalculating the tree, which is significantly slower than a traditional database write.
- Higher Cost: While cheaper than storing all data on-chain, frequent updates can still incur notable costs.
- Inconvenience: Not well-suited for applications requiring frequent, low-latency reads and writes, complex queries, or relational data.
The second approach uses a high-performance, traditional database (like PostgreSQL) for storing data and leverages a Sparse Merkle Tree (SMT) to anchor the integrity of that data on-chain.
How it works:
- Content: Stored in a standard PostgreSQL database.
- Proof of Integrity: A single cryptographic hash (the SMT root) representing the entire database state is stored on the Solana blockchain.
Pros:
- High Performance: Achieves the speed and querying capabilities of a traditional database. Writes and reads are extremely fast.
- Low Cost: On-chain transactions are only needed to update a single hash, and with batching, this cost is minimized across many thousands of database operations.
- Convenience: Developers can use standard SQL and database tools.
Cons:
- Centralized Data Storage: The data itself is not decentralized. If the database is destroyed, the data is lost.
- No On-Chain Recoverability: The blockchain only guarantees the integrity of the data; it cannot be used to restore the data.
For the target use case of AI agents, the verifiable database model was chosen as the superior architecture. The critical requirement is to prevent and detect tampering of the agent's state, not necessarily to ensure the state is fully decentralized or recoverable from the blockchain.
An agent's operational integrity depends on knowing its memory is correct; losing the memory entirely, however, is not a critical failure for the overall system. Therefore, the immense performance, cost, and convenience benefits of the verifiable database approach far outweigh the advantage of full data decentralization offered by the cNFT model for this specific application.
This project is a Proof of Concept for a verifiable database system that uses a Sparse Merkle Tree (SMT) to create cryptographic commitments to its state. These commitments (Merkle roots) are anchored on the Solana blockchain, creating a tamper-evident audit trail.
The architecture ensures that a TEE (Trusted Execution Environment) can interact with a database, verify the integrity and authenticity of the data it receives, and prove to any third party that the data is correct and hasn't been tampered with.
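The core idea — commit to the whole state with a single Merkle root, then verify individual records with inclusion proofs — can be sketched in a few lines. This is an illustrative Python sketch, not the crate's actual SMT implementation (which lives in `storage/smt/store.rs` and uses its own hashing and domain separation):

```python
import hashlib

def h(b: bytes) -> bytes:
    return hashlib.sha256(b).digest()

def merkle_root(leaves):
    """Compute the root of a binary Merkle tree over leaf hashes."""
    level = [h(l) for l in leaves]
    if not level:
        return h(b"")
    while len(level) > 1:
        if len(level) % 2:          # duplicate the last node on odd-sized levels
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def verify_inclusion(leaf, proof, root):
    """proof is a list of (sibling_hash, sibling_is_left) pairs from leaf to root."""
    node = h(leaf)
    for sibling, is_left in proof:
        node = h(sibling + node) if is_left else h(node + sibling)
    return node == root

leaves = [b"row1", b"row2", b"row3", b"row4"]
root = merkle_root(leaves)
# Inclusion proof for b"row3" (index 2): its right sibling, then the left subtree.
left_subtree = h(h(b"row1") + h(b"row2"))
proof = [(h(b"row4"), False), (left_subtree, True)]
assert verify_inclusion(b"row3", proof, root)
```

Anyone holding the on-chain root can check a record with such a proof, without trusting the database that served it.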
The system is composed of two main pieces:
- TEE Agent (external client): any application running in a TEE (e.g. an AI agent service) that calls the Verifiable DB HTTP API. In this repo, the “agent flow” is demonstrated via integration tests under tests/ (e.g. test_schema_update.rs), and you can also interact manually via Swagger UI.
- Verifiable API Service (runs in TEE): the trusted component. It handles DB interactions, proof generation + verification against the trusted root, and commits new roots to Solana.
The crate is organized by responsibility (transport vs domain vs infra). The main executable for manual interaction is src/bin/api_server.rs (Swagger UI), and the end-to-end flows live in integration tests under tests/.
src/
lib.rs # crate public surface + re-exports
app/
database_service.rs # DB + SMT orchestration (use-cases)
transport/
http/
types.rs # HTTP types + AppState
router.rs # Axum router + OpenAPI (ApiDoc)
handlers/ # endpoint handlers split by concern
domain/
commitment/
root_manager.rs # dual-root batching + trusted_state.json
model/
mod.rs # VerifiableModel trait
registry.rs # ModelRegistry
examples.rs # UserModel/ProductModel/WidgetModel (PoC models)
verify/
verifier.rs # SMT proof verification helpers
crypto/
hashing.rs # canonical JSON hashing + domain separation
storage/
smt/
store.rs # Sparse Merkle Tree wrapper (in-memory + proof generation)
postgres.rs # merkle_nodes persistence
infra/
config.rs # env parsing (DATABASE_URL, SOLANA_RPC_URL, SOLANA_PROGRAM_ID, BATCH_COMMIT_SIZE)
solana/
client.rs # Solana RPC client (reads SOLANA_RPC_URL + SOLANA_PROGRAM_ID from env)
bin/
api_server.rs # standalone API server binary (Swagger UI)
This PoC started with a few hardcoded example models, but the Verifiable DB service is now designed to be generic: you can plug in your own application tables without modifying Rust code.
- Schema bootstrap endpoint: you POST a JSON schema spec to POST /bootstrap/apply-schema.
- The service will create the tables in PostgreSQL (DDL is generated by the server, not executed from raw SQL supplied by the client).
- The schema is persisted into a registry table (verifiable_models) and loaded into an in-memory ModelRegistry so the API can validate/route requests at runtime.
- Model registry tables (created automatically):
  - verifiable_models: per-table metadata (table name, PK field/kind, column specs, server-generated CREATE TABLE SQL).
  - verifiable_registry_meta: stores a schema_hash used for change detection.
This repo currently runs in a single-tenant mode with a strong safety property: if the database structure changes, we reset state to avoid mixing old SMT commitments with a new schema.
On POST /bootstrap/apply-schema, the service computes a deterministic schema_hash of the provided table specs. If the hash differs from what was previously stored (or force_reset=true), it performs a full reset:
- Drops all managed tables (both previously registered tables and the tables in the incoming schema spec)
- Clears SMT persistence by truncating merkle_nodes
- Zeros BOTH roots:
  - sets temporary_root = 0 in memory
  - writes main_root = 0 to Solana
- Deletes trusted_state.json (so the trusted local root cannot conflict with the new zeroed chain root)
This makes schema changes explicit and safe: when you change your application tables, the verifiable layer resets cleanly and starts committing from a known zero state.
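The change-detection step can be sketched as follows. The exact encoding and hash function the service uses may differ; the property that matters is determinism — the same table specs must always produce the same hash, regardless of JSON key order:

```python
import hashlib
import json

def schema_hash(tables: list) -> str:
    """Hash a canonical (key-sorted, whitespace-free) encoding of the table specs."""
    canonical = json.dumps(tables, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

a = [{"table_name": "agents", "columns": [{"name": "name", "col_type": "text"}]}]
b = [{"columns": [{"col_type": "text", "name": "name"}], "table_name": "agents"}]
assert schema_hash(a) == schema_hash(b)   # key order does not change the hash
assert schema_hash(a) != schema_hash([{"table_name": "actions"}])  # change detected
```

If the computed hash matches the stored one (and force_reset is false), the schema is considered unchanged and no reset is triggered.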
The generic model layer supports auto-increment IDs (Postgres SERIAL/BIGSERIAL).
- On write (create-batch), you do not send id.
- The service inserts rows and uses RETURNING id to fetch the generated PK.
- The API returns the inserted ids, which you can then pass back to read-batch.
This is important for real apps: most production schemas rely on DB-generated IDs, and the SMT keys are derived from the canonical string form of the primary key.
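As a rough sketch of how SMT keys and leaf hashes could be derived — names and details are illustrative; the crate's actual canonical hashing and domain separation live in `crypto/hashing.rs` and may differ:

```python
import hashlib
import json

def smt_key(pk_value) -> bytes:
    """Derive the SMT key from the canonical string form of the primary key."""
    return hashlib.sha256(str(pk_value).encode()).digest()

def leaf_hash(row: dict) -> bytes:
    """Hash the canonical JSON of the full DB-returned row (key-sorted)."""
    canonical = json.dumps(row, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).digest()

# A DB-generated SERIAL id of 42, and the row as Postgres returned it
key = smt_key(42)
leaf = leaf_hash({"id": 42, "name": "agent-a", "created_at": "2024-01-01T00:00:00Z"})
assert key == smt_key("42")  # integer PK and its canonical string map to the same key
```

Because the key is derived from the canonical PK string, a DB-generated SERIAL id and the string the client later passes to read-batch address the same SMT leaf.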
To achieve high performance and reduce on-chain transaction costs, the system uses a dual-root architecture for batching state commitments. Instead of writing to the blockchain on every operation, it groups multiple updates together.
- temporary_root (In-Memory): This root lives inside the TEE API service. It is updated on every single write operation, representing the absolute latest state of the database. Write and read operations are extremely fast as they only interact with this in-memory root.
- main_root (On-Chain): This is the globally trusted root stored on the Solana blockchain. It is updated periodically by a background task.
The key security guarantee is that the Verifiable API Service (running in a TEE) handles the entire verification and commitment process internally.
1. Request Initiation: The TEE Agent sends a request to the API to create new data.
2. Proposed State Generation: The API Service receives the request. It interacts with the Database to:
   - Insert the new records.
   - Calculate the new Merkle root (proposed_root).
   - Generate a Merkle proof linking the current temporary_root to the proposed_root.
3. Fetch Trusted State (In-Memory): The API Service retrieves the current temporary_root from its memory. This is the trusted "before" state for this operation.
4. Internal Verification: The API Service verifies that the proof correctly links the temporary_root to the proposed_root.
5. Update Temporary State:
   - If valid, the API Service updates its in-memory temporary_root to the proposed_root and returns a success response to the agent. The write is now complete from the client's perspective.
   - If invalid, an error is returned.
6. Asynchronous Commit to Blockchain: The update increments a counter. When the counter reaches the BATCH_COMMIT_SIZE threshold:
   - A background task pauses new writes to prevent race conditions.
   - It sends a transaction to Solana to write the current temporary_root on-chain, making it the new main_root.
   - Writes are resumed.
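The dual-root batching above can be condensed into a small sketch. This is illustrative only — the real RootManager also persists trusted_state.json, pauses writers during the commit, and talks to Solana rather than calling a local function:

```python
class RootManagerSketch:
    """Every verified write advances temporary_root; main_root is only
    updated once BATCH_COMMIT_SIZE writes have accumulated."""

    def __init__(self, batch_commit_size: int, commit_fn):
        self.temporary_root = b"\x00" * 32   # zeroed root at bootstrap
        self.main_root = b"\x00" * 32
        self.batch_commit_size = batch_commit_size
        self.commit_fn = commit_fn           # stands in for the Solana transaction
        self.pending = 0

    def apply_write(self, proposed_root: bytes, proof_is_valid: bool) -> bool:
        if not proof_is_valid:
            return False                     # proof must link old root -> new root
        self.temporary_root = proposed_root
        self.pending += 1
        if self.pending >= self.batch_commit_size:
            self.commit_fn(self.temporary_root)
            self.main_root = self.temporary_root
            self.pending = 0
        return True

committed = []
rm = RootManagerSketch(3, committed.append)
for i in range(7):
    rm.apply_write(bytes([i + 1]) * 32, True)
assert len(committed) == 2                   # on-chain commits after writes 3 and 6
assert rm.temporary_root != rm.main_root     # write 7 is not yet on-chain
```

With BATCH_COMMIT_SIZE = 10, a thousand writes cost roughly a hundred on-chain transactions instead of a thousand.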
One of the critical challenges in a TEE environment is handling unexpected shutdowns or crashes. If the system crashes after updating the database but before committing the new root to the blockchain, the on-chain state (the "Master Root") will be stale.
To solve this, the system implements a Trusted Local Storage mechanism:
- Trusted State File: The RootManager maintains a secure local file (trusted_state.json) inside the TEE. This file acts as a persistent, trusted cache for the latest temporary_root.
- Atomic Updates: On every write operation, the new root is written to this trusted file before the in-memory state is updated. This ensures that if the system crashes, the latest root is safely persisted.
- Automatic Recovery:
  - On startup, the system reads the Master Root from Solana and the Local Root from the trusted file.
  - If they differ, the system detects that a crash occurred.
  - It initializes using the Local Root (restoring the latest valid state) and automatically schedules a background commit to sync this state to the blockchain.
This ensures that the TEE can always recover its latest state and verify database integrity, even if the blockchain is lagging behind due to a crash.
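The recovery decision reduces to a simple comparison. This sketch mirrors the startup logic described above (function and variable names are illustrative, not the crate's API):

```python
def recover_root(chain_root: bytes, local_root):
    """Return (root_to_use, needs_background_commit) on startup."""
    if local_root is None:
        return chain_root, False   # fresh install: trust the chain
    if local_root == chain_root:
        return chain_root, False   # clean shutdown: nothing to do
    # Crash detected: the local trusted file is ahead of the chain.
    return local_root, True        # restore local state, schedule a commit

root, needs_commit = recover_root(b"\x00" * 32, b"\xaa" * 32)
assert root == b"\xaa" * 32 and needs_commit
```

Because the trusted file is updated before the in-memory root on every write, the local root can only ever be equal to or ahead of the chain, never behind it.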
After bootstrapping schema, you can use the generic per-model endpoints:
- Healthcheck: GET /health
  - Use this for load balancers / clients to confirm the service is up and the database is reachable.
  - Returns:
    - 200 with { "success": true, "data": { "status": "ok" } } when the DB ping succeeds
    - 503 with { "success": false, "error": "DB ping failed: ..." } when the DB ping fails
  - Example: curl -i http://localhost:3000/health
- Write: POST /api/models/{model}/create-batch
- Upsert (update semantics): POST /api/models/{model}/upsert
- Read: POST /api/models/{model}/read-batch
- Read latest N: POST /api/models/{model}/read-latest
- Inspect live DB schema: GET /bootstrap/schema
  - Returns the current Postgres schema (tables/columns/PK) as seen by the database.
  - Internal service tables are hidden (e.g. merkle_nodes, verifiable_models, verifiable_registry_meta), so you only see your application tables (e.g. agents, actions, etc.).
- Clear all client data + reset roots: POST /bootstrap/clear-data
These endpoints work for any registered model/table. The service:
- Writes to Postgres
- Updates the SMT
- Verifies the SMT proof transition inside the trusted API
- Updates temporary_root (and batches commits to Solana using BATCH_COMMIT_SIZE)
All read/write endpoints return ApiResponse with a consistent data object:
{
  "success": true,
  "data": {
    "records": [{ "...": "..." }],
    "ids": ["..."],
    "verified": true,
    "meta": { }
  }
}

- records: canonical rows as returned by Postgres (row_to_json(table.*)).
- ids: primary key values for the returned rows.
- verified: true when the SMT proof verifies against the trusted temporary_root.
- meta: optional extra info (e.g. limit, committed, proposed_root).
To fetch the most recent rows from a table (ordered by primary key descending) while still getting SMT verification, call:
POST /api/models/{model}/read-latest
Request body: { "limit": 5 }

Notes:
- The response includes ids, records, and verified: true when the SMT proof verifies against the trusted temporary_root.
- limit is clamped server-side to a safe maximum (currently 100).
read-latest also supports restricted server-side queries:
- where: equality-only filters ({ "field": value })
- order_by: { "field": "some_column", "direction": "asc" | "desc" }
Example:
curl -sS -X POST "http://localhost:3000/api/models/agents/read-latest" \
-H "Content-Type: application/json" \
-d '{
"limit": 10,
"where": { "poll_duration": "1440", "enabled": "true" },
"order_by": { "field": "id", "direction": "desc" }
}'

Apps often need “current value” tables (rate limits, cursors, since_id, etc.). For that, use:
POST /api/models/{model}/upsert
Notes:
- Each record must include the model’s primary key field.
- Implemented as INSERT .. ON CONFLICT(pk) DO UPDATE .. and still updates/verifies the SMT transition.
Example:
curl -sS -X POST "http://localhost:3000/api/models/agents/upsert" \
-H "Content-Type: application/json" \
-d '{ "records": [ { "id": 1, "poll_duration": "1440" } ] }'

The server tries to reduce client/LLM friction by coercing common scalar values based on the model’s column types, e.g.:
- "1440" → integer columns
- "true" / "false" / "1" / "0" → boolean columns
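A minimal sketch of this kind of coercion (the service's actual rules are implemented in Rust and may differ in edge cases):

```python
def coerce(value, col_type: str):
    """Best-effort scalar coercion; raises ValueError on mismatch."""
    if col_type == "int":
        if isinstance(value, bool):
            raise ValueError("bool is not an int")
        if isinstance(value, int):
            return value
        return int(str(value))                 # "1440" -> 1440; "abc" raises
    if col_type == "bool":
        if isinstance(value, bool):
            return value
        if str(value).lower() in ("true", "1"):
            return True
        if str(value).lower() in ("false", "0"):
            return False
    raise ValueError(f"cannot coerce {value!r} to {col_type}")

assert coerce("1440", "int") == 1440
assert coerce("true", "bool") is True and coerce("0", "bool") is False
```

A value like "abc" for an int column raises, which is exactly the case surfaced by the structured error list shown below.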
If coercion/validation fails, the API returns a structured error list (record index + field):
{
"success": false,
"data": {
"errors": [
{ "index": 0, "field": "poll_duration", "expected": "int", "got": "string", "value": "abc" }
]
},
"error": "Validation/coercion failed"
}

Earlier versions of this PoC could exhibit an inconsistency where a Postgres row was written but the verifiable state transition failed (proof verification failed), leading to root drift and “one write succeeds, the next write fails” symptoms.
This repo now applies the following safety fixes:
- Single-writer root lock: the service uses a single in-process lock to serialize the entire write critical section so writes cannot interleave with each other or with the background commit task.
- Atomic verifiable writes (rollback on proof failure): create-batch and upsert keep the SQL transaction open until the SMT proof is verified. If verification fails, the transaction is rolled back so no DB row persists.
- Single API instance enforced by default: the service takes a Postgres advisory lock on startup; a second instance against the same Postgres will fail fast.
  - Override (not recommended): set ALLOW_MULTI_INSTANCE=true.
- Optional optimistic concurrency: write requests accept expected_root (hex string). If it doesn’t match the current trusted temporary_root, the API returns 409 with code: ROOT_CHANGED so clients can retry cleanly.
- Repair path: if you ever suspect drift, rebuild the SMT from DB rows and force-set roots with POST /bootstrap/repair-roots with { "confirm": true }.
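A client-side retry loop against the expected_root mechanism might look like this. The FakeClient below is hypothetical (it only simulates one concurrent root change); a real client would call the HTTP API:

```python
def write_with_retry(client, records, max_attempts=3):
    """Optimistic-concurrency loop: pin the write to the root we last saw;
    on ROOT_CHANGED (HTTP 409), refresh the root and retry."""
    for _ in range(max_attempts):
        expected = client.current_root()
        resp = client.create_batch(records, expected_root=expected)
        if resp.get("code") != "ROOT_CHANGED":
            return resp
    raise RuntimeError("root kept changing; giving up")

class FakeClient:
    """Hypothetical stand-in; the real API shape may differ."""
    def __init__(self):
        self.root, self.flaky = "aa", 1
    def current_root(self):
        return self.root
    def create_batch(self, records, expected_root):
        if self.flaky:                       # simulate one concurrent root change
            self.flaky -= 1
            self.root = "bb"
            return {"success": False, "code": "ROOT_CHANGED"}
        if expected_root != self.root:
            return {"success": False, "code": "ROOT_CHANGED"}
        return {"success": True}

assert write_with_retry(FakeClient(), [{"id": 1}])["success"] is True
```

The first attempt fails because the root moved underneath the client; the second attempt, pinned to the fresh root, succeeds.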
To let a client wipe its data (single-tenant/dev flow), call:
POST /bootstrap/clear-data
Request body:
{ "confirm": true }

This will:
- Delete rows from all client-managed tables (registered in verifiable_models)
- Clear SMT persistence (merkle_nodes) and reset the in-memory SMT store
- Reset both temporary_root and the on-chain main_root to zero
This project runs against the public Solana devnet for a realistic, production-like simulation.
- Devnet (recommended for this repo right now / PoC):
  - Fast iteration and debugging.
  - Free test SOL via solana airdrop.
  - Good for integration testing and validating the end-to-end cryptographic flow.
- Devnet limitations (important):
  - Not a production trust anchor: devnet is a testing cluster, not intended for long-term guarantees.
  - State can be reset / pruned: history and accounts are not guaranteed to be preserved forever.
  - Reliability is best-effort: outages, rate limits, and RPC instability can happen.
  - No economic security: it does not provide the same “real-value” finality assumptions as mainnet.
  - Program IDs may change if you redeploy; your service must be configured with the correct SOLANA_PROGRAM_ID.
- Mainnet (future / optional):
  - Use it when the system is production-ready and you need a durable, externally verifiable audit trail.
  - Requires ops/security work: key management, cost model for writes, monitoring/runbooks, and a hardened deployment story.
You only need to run the PostgreSQL database locally; the public Solana devnet is already running.
- PostgreSQL Database:

  ./scripts/start_db.sh

  Or manually:

  docker run --name pg-verifiable-memory -e POSTGRES_PASSWORD=password -e POSTGRES_DB=verifiable_memory -p 5432:5432 -d postgres
Create a .env file in the root of the project with the following content:
DATABASE_URL="postgres://postgres:password@localhost:5432/verifiable_memory"
SOLANA_RPC_URL="https://api.devnet.solana.com"
# Program ID from `anchor deploy`
SOLANA_PROGRAM_ID="6fSQZwqdsr8zVSbE8DTo4tsHDW4af3iZyB5KGzEGqyW8"
# Number of temporary_root updates before committing to blockchain (default: 10)
BATCH_COMMIT_SIZE=10

You also need to ensure your Solana CLI is configured for devnet and you have some devnet SOL.
- Set CLI to Devnet: solana config set --url https://api.devnet.solana.com
- Get Free Devnet SOL (if needed): solana airdrop 2
Before running the app, you need to deploy the smart contract to the devnet.
# Navigate to the program directory
cd solana_program
# This command will build and deploy the contract.
# It will output a new Program ID.
anchor deploy

IMPORTANT: After deploying, you must copy the new Program ID from the output and update it in these two files:
- solana_program/programs/verifiable_db_program/src/lib.rs (in the declare_id! macro)
- Your .env file (set SOLANA_PROGRAM_ID=<new_program_id> for the verifiable service)
After updating the IDs, run anchor deploy one more time.
Now you can run the simulation against the live devnet.
# Navigate back to the project root
cd ..
# Run the simulation
./scripts/start_simulation.sh

If you want the API to run inside Docker (closer to a deployment model), use the helper scripts in scripts/.
# First time only (make scripts executable)
chmod +x scripts/*.sh
# Start DB + run Solana preflight + start API in Docker
./scripts/docker_up.sh

What docker_up.sh does (purposefully “fail-fast”):
- Starts Postgres using scripts/start_db.sh (or reuses the existing container)
- Validates env vars from .env are present (DATABASE_URL, SOLANA_RPC_URL, SOLANA_PROGRAM_ID, BATCH_COMMIT_SIZE)
- Validates Solana connectivity + program + PDA using a Rust preflight (cargo run --bin preflight)
- Starts the API service inside Docker using scripts/start_api_docker.sh
On some Linux setups, the API container may fail to reach SOLANA_RPC_URL when running on the default Docker bridge network (you’ll see startup failures while initializing RootManager).
If that happens, run with host networking:
NETWORK_MODE=host BUILD=false ./scripts/docker_up.sh

Notes:
- NETWORK_MODE=host is Linux-only.
- With host networking, the API uses your host network stack (often fixing DNS/egress issues to Solana RPC).
The verifiable service uses trusted_state.json as durable “trusted local storage” for crash recovery.
When running via scripts/start_api_docker.sh, this file is bind-mounted from your repo into the container, so it persists across container restarts:
- Host file: ./trusted_state.json
- Container path: /app/trusted_state.json
scripts/start_api_docker.sh supports:
- BUILD=auto (default): build only if the image is missing
- BUILD=true: always rebuild the API image
- BUILD=false: never build (expects the image to already exist)
You can also run just the API server with its Swagger UI. It will connect to whichever Solana cluster is defined in your .env file.
cargo run --bin api_server

- Swagger UI: http://localhost:3000/swagger-ui
The service supports dynamic application tables. A client can define the database structure by calling:
POST /bootstrap/apply-schema
Example payload (creates an agents table):
{
"tables": [
{
"table_name": "agents",
"primary_key_field": "id",
"primary_key_kind": "serial",
"columns": [
{ "name": "name", "col_type": "text", "unique": true, "nullable": false },
{ "name": "description", "col_type": "text", "nullable": true },
{ "name": "metadata", "col_type": "jsonb", "nullable": true },
{ "name": "created_at", "col_type": "timestamptz", "nullable": false }
]
}
],
"force_reset": false
}

If you include a created_at column with:
- col_type: "timestamptz"
- nullable: false
- name: "created_at"
…the server-generated DDL will automatically add DEFAULT now() for that column:
created_at timestamptz NOT NULL DEFAULT now()
This means clients can omit created_at in create-batch. The service hashes the DB-returned row (from RETURNING row_to_json(...)), so the generated timestamp is included in the leaf hash.
Apply it with curl:
curl -sS -X POST "http://localhost:3000/bootstrap/apply-schema" \
-H "Content-Type: application/json" \
-d @schema.json

After applying the schema, you can:
- Inspect the current DB schema: GET /bootstrap/schema
- Write records: POST /api/models/agents/create-batch
- Read records: POST /api/models/agents/read-batch
- Read latest records: POST /api/models/agents/read-latest
Important behavior (single-tenant safety mode): if the schema_hash changes, the service will reset verifiable state (drop managed tables, clear SMT persistence, and reset roots). Use force_reset=true to force a reset even if the schema hash matches.
The schema types use snake_case enums. For example:
- primary_key_kind: "big_serial" (NOT "bigserial")
If you send the wrong enum string, you’ll see:
Invalid JSON body: Failed to deserialize the JSON body into the target type
While this Proof of Concept demonstrates the core mechanics of a verifiable database, a production-grade deployment would require additional security layers to protect against a malicious or compromised host environment. The following architecture outlines a robust approach for achieving end-to-end security.
The database contains the agent's sensitive state. To protect this data from being stolen or inspected by the host operator, it must be encrypted at rest.
- Mechanism: The TEE's virtual disk or file system (where PostgreSQL or another database stores its files) is encrypted using a standard like LUKS.
- Key Management: The decryption key is not stored locally. Instead, it is managed by a Key Management Service (KMS).
- Attestation-Gated Access: The TEE instance is configured to perform remote attestation with the KMS upon startup. It generates a cryptographic quote proving it is a genuine TEE running the expected, unmodified code. The KMS only releases the disk decryption key to the TEE after successfully verifying this quote.
This flow ensures that the sensitive database files can only be decrypted and accessed by a legitimate, trusted TEE instance, effectively preventing offline data theft.
To guarantee that the Merkle roots committed to the blockchain are authentic and originate from a trusted TEE, all on-chain transactions must be signed by a unique, TEE-bound key.
- TEE Key (PK_TDX): Upon its first launch, the TEE generates a unique, persistent key pair. The private key is sealed and never leaves the enclave. The public key (PK_TDX) serves as its cryptographic identity.
- On-Chain Key Registration: A one-time setup process occurs where a trusted attestation service verifies the TEE's quote and registers its public key (PK_TDX) in the Solana smart contract. This links the TEE's identity to the on-chain program.
- Signed State Updates: From then on, every transaction that commits a new main_root to the blockchain must be signed with the TEE's private key. The smart contract verifies this signature against the registered PK_TDX before accepting the update.
This architecture ensures that only the authenticated TEE can update the on-chain state, preventing a malicious host from injecting fraudulent Merkle roots and creating a non-repudiable, fully verifiable audit trail.
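To make the signature gate concrete, here is a stand-in sketch that uses HMAC in place of the asymmetric TEE signature. This is purely illustrative: the real design sketched above would use an asymmetric scheme (e.g. Ed25519) with PK_TDX registered on-chain, so the contract never holds a secret:

```python
import hashlib
import hmac

# Stand-in for the sealed TEE private key (never leaves the enclave).
TEE_KEY = b"sealed-inside-the-enclave"

def sign_root(root: bytes) -> bytes:
    """TEE side: sign the new main_root before sending the transaction."""
    return hmac.new(TEE_KEY, root, hashlib.sha256).digest()

def contract_accepts(root: bytes, signature: bytes, registered_key: bytes) -> bool:
    """On-chain side: accept the root only if the signature checks out
    against the registered TEE identity."""
    expected = hmac.new(registered_key, root, hashlib.sha256).digest()
    return hmac.compare_digest(expected, signature)

root = b"\x11" * 32
assert contract_accepts(root, sign_root(root), TEE_KEY)
assert not contract_accepts(root, sign_root(root), b"malicious-host-key")
```

A host that does not hold the sealed key cannot forge an acceptable signature, so it cannot push a fraudulent main_root even with full control of the machine.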
By combining these two mechanisms, the system achieves:
- Confidentiality and Integrity: Data is protected from host access by TEE isolation and disk encryption.
- Durability: The database's own crash-recovery mechanisms (like WAL) ensure state is preserved across restarts.
- Verifiable History & Tamper-Detection: The on-chain, TEE-signed Merkle roots provide a secure, auditable history of the database state, making host-level attacks detectable.

