Multi-server support: Externalize attestation session states #82

@AnthonyRonning

Problem Description

The application stores attestation session states in server-local memory, preventing session continuity across multiple servers. This affects the secure enclave attestation flow and encrypted communication sessions.

Current Implementation

  • Location:
    • src/main.rs line 384
    • src/web/attestation_routes.rs lines 24-52
  • Structure:
// In AppState
session_states: Arc<tokio::sync::RwLock<HashMap<Uuid, SessionState>>>,

// SessionState structure
pub struct SessionState {
    pub session_key: [u8; 32], // ChaCha20Poly1305 key
}
  • Usage: Stores encryption keys for secure communication after attestation handshake

Impact on Multi-Server Deployment

Without Changes (Sticky Sessions)

  • Will work if user maintains connection to same server
  • ⚠️ Graceful degradation if server changes:
    • User must re-establish secure session (automatic in most cases)
    • Brief interruption but transparent to user
    • No login required, just new handshake

User Experience Impact

LOW-MEDIUM SEVERITY - Session re-establishment is automatic:

  • ~1-2 second delay for new handshake
  • Transparent to user (handled by client SDK)
  • No data loss (client can retry with new session)
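The degradation path described above can be sketched as follows. This is an illustrative model only, not the project's code: `ServerNode`, `SessionLookup`, and the use of a `u128` in place of `Uuid` are assumptions made to keep the example free of external crates. Each server holds its own map, so a session created on one server is unknown to another, and the miss is what prompts the client to repeat the handshake.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for one server's local session map.
/// A plain `u128` replaces `Uuid` so the sketch needs no external crates.
#[derive(Default)]
pub struct ServerNode {
    sessions: HashMap<u128, [u8; 32]>,
}

/// Outcome of a session lookup on whichever server received the request.
#[derive(Debug, PartialEq)]
pub enum SessionLookup {
    /// This server holds the key and can decrypt the request.
    Found([u8; 32]),
    /// This server has never seen the session; the client must redo the handshake.
    HandshakeRequired,
}

impl ServerNode {
    /// Records the key agreed during the attestation handshake.
    pub fn complete_handshake(&mut self, session_id: u128, key: [u8; 32]) {
        self.sessions.insert(session_id, key);
    }

    /// Looks the session up in this server's local memory only.
    pub fn lookup(&self, session_id: u128) -> SessionLookup {
        match self.sessions.get(&session_id) {
            Some(key) => SessionLookup::Found(*key),
            None => SessionLookup::HandshakeRequired,
        }
    }
}
```

With sticky sessions the `HandshakeRequired` branch is rare; without them it fires whenever the load balancer routes a request to a different server.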

Proposed Solutions

Option 1: Redis-based Session Store

Implementation:

use redis::{Client, Commands, RedisResult};
use serde::{Serialize, Deserialize};

#[derive(Serialize, Deserialize)]
pub struct SessionState {
    pub session_key: Vec<u8>, // Serialized as Vec for Redis
    pub created_at: i64,
    pub last_accessed: i64,
}

pub struct RedisSessionStore {
    client: Client,
    ttl_seconds: u64, // e.g., 3600 for 1 hour
}

impl RedisSessionStore {
    pub async fn store_session(&self, session_id: Uuid, state: SessionState) -> Result<(), Error> {
        let mut conn = self.client.get_connection()?;
        let key = format!("session:{}", session_id);
        let value = serde_json::to_string(&state)?;
        conn.set_ex(&key, value, self.ttl_seconds)?;
        Ok(())
    }
    
    pub async fn get_session(&self, session_id: &Uuid) -> Result<Option<SessionState>, Error> {
        let mut conn = self.client.get_connection()?;
        let key = format!("session:{}", session_id);
        let value: Option<String> = conn.get(&key)?;
        
        match value {
            Some(v) => {
                // Refresh TTL on access
                conn.expire(&key, self.ttl_seconds as i64)?;
                Ok(Some(serde_json::from_str(&v)?))
            },
            None => Ok(None),
        }
    }
    
    pub async fn delete_session(&self, session_id: &Uuid) -> Result<(), Error> {
        let mut conn = self.client.get_connection()?;
        let key = format!("session:{}", session_id);
        conn.del(&key)?;
        Ok(())
    }
}

Security Considerations:

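// Encrypt session keys before storing in Redis
use aes_gcm::aead::{Aead, AeadCore, KeyInit, OsRng};
use aes_gcm::{Aes256Gcm, Key};

impl RedisSessionStore {
    pub async fn store_session_encrypted(&self, session_id: Uuid, state: SessionState, master_key: &[u8]) -> Result<(), Error> {
        // Encrypt session_key with master_key before storing.
        // master_key must be exactly 32 bytes; Key::from_slice panics otherwise.
        let cipher = Aes256Gcm::new(Key::<Aes256Gcm>::from_slice(master_key));
        // Fresh random 96-bit nonce per encryption; never reuse one with the same key
        let nonce = Aes256Gcm::generate_nonce(&mut OsRng);
        let ciphertext = cipher.encrypt(&nonce, state.session_key.as_ref())?;

        // Store nonce || ciphertext so the key can be decrypted on read
        let mut encrypted_key = nonce.to_vec();
        encrypted_key.extend_from_slice(&ciphertext);

        let encrypted_state = SessionState {
            session_key: encrypted_key,
            ..state
        };

        self.store_session(session_id, encrypted_state).await
    }
}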

Pros:

  • ✅ Fast access (< 1ms typical)
  • ✅ Built-in TTL with automatic cleanup
  • ✅ Supports session extension on access
  • ✅ Can use Redis Cluster for HA

Cons:

  • ❌ Requires Redis infrastructure
  • ❌ Session keys in memory (even if encrypted)
  • ❌ Additional attack surface

Option 2: PostgreSQL-based Session Store

Implementation:

-- Migration: Create session_states table
CREATE TABLE session_states (
    session_id UUID PRIMARY KEY,
    encrypted_session_key BYTEA NOT NULL,
    nonce BYTEA NOT NULL,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    last_accessed TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    expires_at TIMESTAMP WITH TIME ZONE NOT NULL
);

-- Indexes for performance
CREATE INDEX idx_session_states_expires_at ON session_states(expires_at);
CREATE INDEX idx_session_states_last_accessed ON session_states(last_accessed);

use diesel::prelude::*;
use chrono::{DateTime, Utc, Duration};

#[derive(Queryable, Insertable)]
#[diesel(table_name = session_states)]
pub struct DbSessionState {
    pub session_id: Uuid,
    pub encrypted_session_key: Vec<u8>,
    pub nonce: Vec<u8>,
    pub created_at: DateTime<Utc>,
    pub last_accessed: DateTime<Utc>,
    pub expires_at: DateTime<Utc>,
}

impl PostgresSessionStore {
    pub async fn store_session(&self, session_id: Uuid, session_key: [u8; 32]) -> Result<(), Error> {
        // Encrypt session key with master key
        let (encrypted_key, nonce) = self.encrypt_session_key(&session_key)?;
        
        let new_session = DbSessionState {
            session_id,
            encrypted_session_key: encrypted_key,
            nonce,
            created_at: Utc::now(),
            last_accessed: Utc::now(),
            expires_at: Utc::now() + Duration::hours(1),
        };
        
        diesel::insert_into(session_states::table)
            .values(&new_session)
            .on_conflict(session_states::session_id)
            .do_update()
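            .set((
                session_states::encrypted_session_key.eq(&new_session.encrypted_session_key),
                // The nonce must be updated together with the ciphertext,
                // otherwise the stored key can no longer be decrypted
                session_states::nonce.eq(&new_session.nonce),
                session_states::last_accessed.eq(Utc::now()),
                session_states::expires_at.eq(Utc::now() + Duration::hours(1)),
            ))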
            .execute(&self.conn)?;
            
        Ok(())
    }
    
    pub async fn get_session(&self, session_id: &Uuid) -> Result<Option<[u8; 32]>, Error> {
        let result = session_states::table
            .filter(session_states::session_id.eq(session_id))
            .filter(session_states::expires_at.gt(Utc::now()))
            .first::<DbSessionState>(&self.conn)
            .optional()?;
            
        match result {
            Some(state) => {
                // Update last_accessed
                diesel::update(session_states::table)
                    .filter(session_states::session_id.eq(session_id))
                    .set(session_states::last_accessed.eq(Utc::now()))
                    .execute(&self.conn)?;
                
                // Decrypt and return session key
                let session_key = self.decrypt_session_key(&state.encrypted_session_key, &state.nonce)?;
                Ok(Some(session_key))
            },
            None => Ok(None),
        }
    }
}

Cleanup Task:

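// Run every 15 minutes
// Diesel 2.x takes the connection as `&mut`
async fn cleanup_expired_sessions(db: &mut PgConnection) -> Result<usize, Error> {
    // Returns the number of rows deleted
    let deleted = diesel::delete(session_states::table)
        .filter(session_states::expires_at.lt(Utc::now()))
        .execute(db)?;
    Ok(deleted)
}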

Pros:

  • ✅ No additional infrastructure
  • ✅ Encrypted at rest in database
  • ✅ Audit trail capability
  • ✅ Can correlate with user activity

Cons:

  • ❌ Higher latency (5-10ms typical)
  • ❌ More database load
  • ❌ Requires cleanup task

Security Considerations

Encryption Requirements

  1. Always encrypt session keys before external storage
  2. Use hardware security module (HSM) or AWS KMS for master key if possible
  3. Rotate master keys periodically
  4. Implement key derivation for per-session encryption keys

Session Security

pub struct SecureSessionStore {
    store: Box<dyn SessionStore>,
    master_key: Arc<SecretKey>,
    
    // Security settings
    max_session_age: Duration,
    max_idle_time: Duration,
    require_attestation: bool,
}

impl SecureSessionStore {
    pub async fn validate_session(&self, session_id: &Uuid, request_attestation: Option<&Attestation>) -> Result<bool, Error> {
        if self.require_attestation {
            // Verify attestation matches session
            self.verify_attestation(session_id, request_attestation)?;
        }
        
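        // Check session age and idle time
        // (`age()` and `idle_time()` are assumed helpers on the metadata type)
        let metadata = self.get_session_metadata(session_id).await?;
        if metadata.age() > self.max_session_age || metadata.idle_time() > self.max_idle_time {
            self.invalidate_session(session_id).await?;
            return Ok(false);
        }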
        
        Ok(true)
    }
}

Implementation Steps

  1. Create session store trait:
#[async_trait]
pub trait SessionStore: Send + Sync {
    async fn store(&self, session_id: Uuid, session_key: [u8; 32]) -> Result<(), Error>;
    async fn get(&self, session_id: &Uuid) -> Result<Option<[u8; 32]>, Error>;
    async fn delete(&self, session_id: &Uuid) -> Result<(), Error>;
    async fn extend_ttl(&self, session_id: &Uuid) -> Result<(), Error>;
}
  2. Update attestation routes to use injected store

  3. Add encryption layer for session keys

  4. Implement monitoring for session metrics
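Before wiring up Redis or PostgreSQL, the trait from step 1 could be exercised against an in-memory implementation, which also gives tests a cheap backend. The sketch below is a simplification under stated assumptions: it is synchronous rather than `async_trait`-based, uses `u128` instead of `Uuid`, and takes `now` as a parameter so TTL behaviour can be tested deterministically. `SyncSessionStore` and `InMemorySessionStore` are hypothetical names.

```rust
use std::collections::HashMap;
use std::sync::RwLock;
use std::time::{Duration, Instant};

/// Synchronous simplification of the proposed `SessionStore` trait.
pub trait SyncSessionStore: Send + Sync {
    fn store(&self, session_id: u128, session_key: [u8; 32], now: Instant);
    fn get(&self, session_id: u128, now: Instant) -> Option<[u8; 32]>;
    fn delete(&self, session_id: u128);
    fn extend_ttl(&self, session_id: u128, now: Instant);
}

pub struct InMemorySessionStore {
    ttl: Duration,
    // Maps session id -> (session key, expiry instant).
    entries: RwLock<HashMap<u128, ([u8; 32], Instant)>>,
}

impl InMemorySessionStore {
    pub fn new(ttl: Duration) -> Self {
        Self { ttl, entries: RwLock::new(HashMap::new()) }
    }
}

impl SyncSessionStore for InMemorySessionStore {
    fn store(&self, session_id: u128, session_key: [u8; 32], now: Instant) {
        let mut entries = self.entries.write().unwrap();
        entries.insert(session_id, (session_key, now + self.ttl));
    }

    fn get(&self, session_id: u128, now: Instant) -> Option<[u8; 32]> {
        let entries = self.entries.read().unwrap();
        match entries.get(&session_id) {
            // Only return keys whose TTL has not elapsed.
            Some((key, expires_at)) if now < *expires_at => Some(*key),
            _ => None,
        }
    }

    fn delete(&self, session_id: u128) {
        self.entries.write().unwrap().remove(&session_id);
    }

    fn extend_ttl(&self, session_id: u128, now: Instant) {
        let mut entries = self.entries.write().unwrap();
        if let Some(entry) = entries.get_mut(&session_id) {
            // Do not revive a session that has already expired.
            if now < entry.1 {
                entry.1 = now + self.ttl;
            }
        }
    }
}
```

Keeping the clock outside the store is a design choice for testability; a production implementation would more likely read the time internally or rely on the backend's own TTL.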

Recommendation

For this use case: PostgreSQL is recommended.

Reasoning:

  • Session states are security-critical
  • PostgreSQL provides better audit capabilities
  • Encryption at rest is easier to manage
  • One less infrastructure component
  • Acceptable latency for session operations

Testing Plan

  1. Security tests:

    • Verify encryption/decryption
    • Test session hijacking prevention
    • Validate TTL enforcement
  2. Performance tests:

    • Measure latency impact
    • Test under concurrent load
    • Verify cleanup performance
  3. Failover tests:

    • Server restart during active session
    • Database failover scenarios
    • Network partition handling

Monitoring

Add metrics for:

  • Session creation rate
  • Session cache hit rate
  • Session validation latency
  • Expired session cleanup rate
  • Failed session validations
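One way to collect most of these counters without committing to a metrics library is a struct of atomics, sketched below. `SessionMetrics` and its method names are assumptions for illustration; validation latency is left out because it needs a histogram rather than a counter.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical counter set covering the metrics listed above
/// (latency excluded; it would need a histogram).
#[derive(Default)]
pub struct SessionMetrics {
    sessions_created: AtomicU64,
    cache_hits: AtomicU64,
    cache_misses: AtomicU64,
    failed_validations: AtomicU64,
    expired_cleaned: AtomicU64,
}

impl SessionMetrics {
    pub fn record_created(&self) {
        self.sessions_created.fetch_add(1, Ordering::Relaxed);
    }

    /// Records one session lookup as either a hit or a miss.
    pub fn record_lookup(&self, hit: bool) {
        let counter = if hit { &self.cache_hits } else { &self.cache_misses };
        counter.fetch_add(1, Ordering::Relaxed);
    }

    pub fn record_failed_validation(&self) {
        self.failed_validations.fetch_add(1, Ordering::Relaxed);
    }

    /// Adds the number of rows removed by one cleanup run.
    pub fn record_cleanup(&self, removed: u64) {
        self.expired_cleaned.fetch_add(removed, Ordering::Relaxed);
    }

    pub fn sessions_created(&self) -> u64 {
        self.sessions_created.load(Ordering::Relaxed)
    }

    pub fn failed_validations(&self) -> u64 {
        self.failed_validations.load(Ordering::Relaxed)
    }

    pub fn expired_cleaned(&self) -> u64 {
        self.expired_cleaned.load(Ordering::Relaxed)
    }

    /// Hit rate in [0, 1], or `None` before any lookup has been recorded.
    pub fn cache_hit_rate(&self) -> Option<f64> {
        let hits = self.cache_hits.load(Ordering::Relaxed);
        let total = hits + self.cache_misses.load(Ordering::Relaxed);
        if total == 0 {
            None
        } else {
            Some(hits as f64 / total as f64)
        }
    }
}
```

The struct can be shared across handlers behind an `Arc` and scraped periodically by whatever exporter the deployment already uses.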

Migration Strategy

  1. Dual-write to both stores initially
  2. Read from external store, fallback to memory
  3. Monitor for inconsistencies
  4. Remove in-memory store
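Steps 1 to 3 could be implemented as a wrapper that writes to both stores, reads from the external one first, and counts how often the in-memory fallback is needed. The sketch below is a minimal synchronous model with hypothetical names (`KeyStore`, `MapStore`, `MigratingStore`) and `u128` in place of `Uuid`; once the fallback counter stays at zero for longer than the maximum session lifetime, step 4 should be safe.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Mutex;

/// Minimal store interface, assumed for this sketch only.
pub trait KeyStore {
    fn put(&self, session_id: u128, key: [u8; 32]);
    fn get(&self, session_id: u128) -> Option<[u8; 32]>;
}

/// Map-backed store standing in for both the in-memory and the external backend.
#[derive(Default)]
pub struct MapStore(Mutex<HashMap<u128, [u8; 32]>>);

impl KeyStore for MapStore {
    fn put(&self, session_id: u128, key: [u8; 32]) {
        self.0.lock().unwrap().insert(session_id, key);
    }

    fn get(&self, session_id: u128) -> Option<[u8; 32]> {
        self.0.lock().unwrap().get(&session_id).copied()
    }
}

/// Dual-writes to both stores and reads external-first with a memory fallback.
pub struct MigratingStore<E: KeyStore, M: KeyStore> {
    pub external: E,
    pub memory: M,
    fallback_reads: AtomicU64,
}

impl<E: KeyStore, M: KeyStore> MigratingStore<E, M> {
    pub fn new(external: E, memory: M) -> Self {
        Self { external, memory, fallback_reads: AtomicU64::new(0) }
    }

    /// Step 1: write to both stores.
    pub fn put(&self, session_id: u128, key: [u8; 32]) {
        self.external.put(session_id, key);
        self.memory.put(session_id, key);
    }

    /// Step 2: read from the external store, fall back to memory.
    pub fn get(&self, session_id: u128) -> Option<[u8; 32]> {
        if let Some(key) = self.external.get(session_id) {
            return Some(key);
        }
        let key = self.memory.get(session_id)?;
        // Step 3: a fallback hit means the external store is missing a live session.
        self.fallback_reads.fetch_add(1, Ordering::Relaxed);
        Some(key)
    }

    pub fn fallback_reads(&self) -> u64 {
        self.fallback_reads.load(Ordering::Relaxed)
    }
}
```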

Performance Optimization

// Consider adding local cache with short TTL
pub struct CachedSessionStore {
    external_store: Arc<dyn SessionStore>,
    local_cache: Arc<RwLock<HashMap<Uuid, CachedSession>>>,
    cache_ttl: Duration, // e.g., 30 seconds
}

struct CachedSession {
    key: [u8; 32],
    cached_at: Instant,
}
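The lookup side of such a cache might look like the sketch below. It is an assumption-laden illustration, not the proposed implementation: `LocalCache` is a hypothetical single-threaded type (the real one would sit behind the `RwLock` shown above), `u128` replaces `Uuid`, and `now` is passed in so expiry can be tested without sleeping.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

struct CachedSession {
    key: [u8; 32],
    cached_at: Instant,
}

/// Hypothetical local cache layer; callers fall through to the
/// external store whenever `get` returns `None`.
pub struct LocalCache {
    entries: HashMap<u128, CachedSession>,
    cache_ttl: Duration,
}

impl LocalCache {
    pub fn new(cache_ttl: Duration) -> Self {
        Self { entries: HashMap::new(), cache_ttl }
    }

    /// Caches a key that was just read from (or written to) the external store.
    pub fn insert(&mut self, session_id: u128, key: [u8; 32], now: Instant) {
        self.entries.insert(session_id, CachedSession { key, cached_at: now });
    }

    /// Returns the cached key if it is younger than `cache_ttl`;
    /// stale entries are evicted and reported as a miss.
    pub fn get(&mut self, session_id: u128, now: Instant) -> Option<[u8; 32]> {
        let fresh = match self.entries.get(&session_id) {
            Some(entry) => now.saturating_duration_since(entry.cached_at) < self.cache_ttl,
            None => return None,
        };
        if fresh {
            self.entries.get(&session_id).map(|entry| entry.key)
        } else {
            self.entries.remove(&session_id);
            None
        }
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }
}
```

The trade-off is that a session deleted or invalidated on one server can remain usable on another for up to `cache_ttl`, which is why the TTL should stay short.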

Related Issues
