API Security Best Practices

Authentication

JWT Authentication

Include the JWT token in requests:

curl -H "Authorization: Bearer <your-jwt-token>" \
     https://inferno.example.com/api/v1/models

API Key Authentication

Include the API key in requests:

curl -H "X-API-Key: <your-api-key>" \
     https://inferno.example.com/api/v1/models
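
Both schemes come down to a single request header. The following is a minimal sketch of a client-side helper that builds that header for either scheme. The name `build_auth_headers` is hypothetical, and rejecting the case where both credentials are supplied is a choice made for this example, not documented server behavior.

```python
def build_auth_headers(jwt_token=None, api_key=None):
    """Return the request header for one of the two authentication schemes."""
    if jwt_token and api_key:
        # Assumption for this sketch: send exactly one credential per request.
        raise ValueError("Provide either a JWT token or an API key, not both")
    if jwt_token:
        return {"Authorization": f"Bearer {jwt_token}"}
    if api_key:
        return {"X-API-Key": api_key}
    raise ValueError("No credentials provided")
```

The returned dictionary can be passed as the `headers` argument of `requests.get` or merged with other headers such as `Content-Type`.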

Obtaining Tokens

# Log in to get a JWT token
curl -X POST https://inferno.example.com/api/v1/auth/login \
     -H "Content-Type: application/json" \
     -d '{"username": "admin", "password": "your-password"}'

# Response
{
  "token": "eyJ...",
  "expires_at": "2024-01-02T00:00:00Z"
}
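
Because the login response carries an `expires_at` timestamp, a client can decide when to log in again instead of waiting for a 401. A minimal sketch, assuming `expires_at` is always an ISO 8601 UTC timestamp as in the sample response; the function name `token_needs_refresh` and the 60-second safety margin are illustrative choices.

```python
from datetime import datetime, timedelta, timezone

def token_needs_refresh(login_response, now=None, margin_seconds=60):
    """Return True if the token has expired or expires within the margin."""
    # Replace the trailing "Z" so fromisoformat also works before Python 3.11.
    expires_at = datetime.fromisoformat(
        login_response["expires_at"].replace("Z", "+00:00")
    )
    now = now or datetime.now(timezone.utc)
    return now >= expires_at - timedelta(seconds=margin_seconds)
```

Passing `now` explicitly keeps the function easy to test; in normal use it can be omitted.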

API Key Management

Creating API Keys

curl -X POST https://inferno.example.com/api/v1/auth/api-keys \
     -H "Authorization: Bearer <admin-token>" \
     -H "Content-Type: application/json" \
     -d '{
       "name": "production-key",
       "permissions": ["read_models", "run_inference"],
       "expires_in_days": 90
     }'

API Key Permissions

Available permissions:

  • read_models - List and view model information
  • write_models - Upload and modify models
  • delete_models - Delete models
  • run_inference - Execute model inference
  • manage_cache - Manage cache operations
  • read_metrics - View system metrics
  • write_config - Modify configuration
  • manage_users - Manage users
  • view_audit_logs - View audit logs
  • use_streaming - Use streaming inference
  • manage_queue - Manage job queue
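
To catch typos before they reach the server, a client can check requested permissions against the list above when building the key-creation payload. This is only a sketch: `KNOWN_PERMISSIONS` mirrors the list in this document and would need updating if the server adds permissions, and `build_api_key_request` is a hypothetical helper name.

```python
KNOWN_PERMISSIONS = frozenset({
    "read_models", "write_models", "delete_models", "run_inference",
    "manage_cache", "read_metrics", "write_config", "manage_users",
    "view_audit_logs", "use_streaming", "manage_queue",
})

def build_api_key_request(name, permissions, expires_in_days=90):
    """Build the JSON payload for creating an API key, rejecting unknown permissions."""
    unknown = sorted(set(permissions) - KNOWN_PERMISSIONS)
    if unknown:
        raise ValueError(f"Unknown permissions: {', '.join(unknown)}")
    return {
        "name": name,
        "permissions": list(permissions),
        "expires_in_days": expires_in_days,
    }
```

Requesting only the permissions a key actually needs keeps the impact of a leaked key small.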

Revoking API Keys

curl -X DELETE https://inferno.example.com/api/v1/auth/api-keys/<key-id> \
     -H "Authorization: Bearer <admin-token>"

Rate Limiting

Rate Limit Headers

Responses include rate limit information:

X-RateLimit-Limit: 60
X-RateLimit-Remaining: 45
X-RateLimit-Reset: 1704067200
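
A client can read these headers to slow down before it hits the limit. The sketch below assumes `X-RateLimit-Reset` is a Unix timestamp in seconds, which matches the sample value but is not stated explicitly here; `parse_rate_limit` is a hypothetical helper name.

```python
from datetime import datetime, timezone

def parse_rate_limit(headers):
    """Extract (limit, remaining, reset_at) from rate limit response headers."""
    limit = int(headers["X-RateLimit-Limit"])
    remaining = int(headers["X-RateLimit-Remaining"])
    # Assumption: the reset value is seconds since the Unix epoch (UTC).
    reset_at = datetime.fromtimestamp(int(headers["X-RateLimit-Reset"]), tz=timezone.utc)
    return limit, remaining, reset_at
```

With the `requests` library, `response.headers` can be passed in directly, since it behaves like a dictionary.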

Handling Rate Limits

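When rate limited (HTTP 429), wait for the interval given in the Retry-After header if the server sends one; otherwise retry with exponential backoff:

import time
import requests

def make_request_with_retry(url, headers, max_retries=5, base_delay=1.0):
    """GET url, retrying on HTTP 429 up to max_retries times."""
    for attempt in range(max_retries):
        response = requests.get(url, headers=headers, timeout=30)

        if response.status_code != 429:
            return response

        # Prefer the server's Retry-After (in seconds); otherwise back off
        # exponentially: base_delay, 2x, 4x, ...
        retry_after = response.headers.get("Retry-After", "")
        delay = int(retry_after) if retry_after.isdigit() else base_delay * (2 ** attempt)
        time.sleep(delay)

    raise RuntimeError("Max retries exceeded")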

Input Validation

Request Size Limits

  • Maximum request body: 10 MB (configurable)
  • Maximum prompt length: 10,000 characters (configurable)
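
Checking these limits on the client avoids sending requests the server will reject. A minimal sketch using the default limits listed above; it assumes "10 MB" means 10 × 1024 × 1024 bytes and that the request body has `prompt` and `model` fields. Since both limits are configurable, the constants should match the actual deployment. `encode_inference_request` is a hypothetical helper name.

```python
import json

MAX_BODY_BYTES = 10 * 1024 * 1024  # assumption: 10 MB interpreted as 10 MiB
MAX_PROMPT_CHARS = 10_000

def encode_inference_request(prompt, model):
    """Serialize an inference request, enforcing the default size limits."""
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError(f"Prompt exceeds {MAX_PROMPT_CHARS} characters")
    body = json.dumps({"prompt": prompt, "model": model}).encode("utf-8")
    if len(body) > MAX_BODY_BYTES:
        raise ValueError(f"Request body exceeds {MAX_BODY_BYTES} bytes")
    return body
```

The returned bytes can be sent as the request body together with a `Content-Type: application/json` header.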

Content Types

Always specify the content type:

curl -X POST https://inferno.example.com/api/v1/inference \
     -H "Content-Type: application/json" \
     -H "Authorization: Bearer <token>" \
     -d '{"prompt": "Hello, world!", "model": "llama-7b"}'

Error Handling

Error Response Format

{
  "error": {
    "code": "AUTHENTICATION_FAILED",
    "message": "Invalid API key",
    "details": null
  }
}

Common Error Codes

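| HTTP Status | Code                  | Description                    |
|-------------|-----------------------|--------------------------------|
| 401         | AUTHENTICATION_FAILED | Invalid or missing credentials |
| 403         | PERMISSION_DENIED     | Insufficient permissions       |
| 429         | RATE_LIMITED          | Too many requests              |
| 400         | VALIDATION_ERROR      | Invalid input                  |
| 500         | INTERNAL_ERROR        | Server error                   |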
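
Client code is usually easier to maintain when it branches on the `code` field rather than on the message text. The sketch below turns an error response into an exception; `ApiError` and `error_from_response` are hypothetical names, and the `UNKNOWN` fallback code is an invention of this example for bodies that do not follow the format above.

```python
import json

class ApiError(Exception):
    """Structured error built from the API's error response body."""

    def __init__(self, status, code, message, details=None):
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.message = message
        self.details = details

def error_from_response(status, body_text):
    """Parse an error body; fall back to a generic error if it is malformed."""
    try:
        error = json.loads(body_text)["error"]
        return ApiError(status, error["code"], error["message"], error.get("details"))
    except (ValueError, KeyError, TypeError):
        # Not JSON, or JSON that does not match the documented shape.
        return ApiError(status, "UNKNOWN", body_text)
```

A caller can then write `if err.code == "RATE_LIMITED": ...` to decide whether to retry.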

CORS Configuration

Allowed Origins

Configure allowed origins in your config:

[server]
cors_origins = ["https://your-frontend.com"]
cors_methods = ["GET", "POST", "DELETE"]
cors_headers = ["Authorization", "Content-Type", "X-API-Key"]

Secure Communication

TLS Requirements

  • Minimum TLS version: 1.2
  • Recommended: TLS 1.3
  • Always use HTTPS in production
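
On the client side, the minimum TLS version can be enforced with Python's standard `ssl` module. This is a sketch of one way to do it, not a required configuration; `make_tls_context` is a hypothetical helper name.

```python
import ssl

def make_tls_context():
    """Create a client TLS context that refuses anything older than TLS 1.2."""
    # create_default_context() enables certificate and hostname verification.
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context
```

The context can be passed to standard-library clients, for example `urllib.request.urlopen(url, context=make_tls_context())`. TLS 1.3 is negotiated automatically when both sides support it.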

Certificate Validation

When making requests, always validate certificates:

import requests

# Good - validates certificates (verify=True is the requests default)
requests.get("https://inferno.example.com", verify=True)

# Bad - disables certificate validation
# requests.get("https://inferno.example.com", verify=False)

Logging and Monitoring

Request Logging

All API requests are logged with:

  • Timestamp
  • Client IP
  • User ID (if authenticated)
  • Endpoint
  • Response status
  • Response time

Sensitive Data

Sensitive data is automatically redacted from logs:

  • API keys
  • JWT tokens
  • Passwords
  • Email addresses
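
Client applications that keep their own logs may want similar redaction. The sketch below is an illustration only and does not describe how the server implements it: the patterns are simplified assumptions that cover Bearer tokens, `X-API-Key` header values, JSON `password` fields, and email addresses, and `redact` is a hypothetical helper name.

```python
import re

# (pattern, replacement) pairs applied in order; all are simplified assumptions.
_REDACTIONS = [
    (re.compile(r"(Bearer\s+)[A-Za-z0-9._~+/=-]+"), r"\1[REDACTED]"),
    (re.compile(r"(X-API-Key:\s*)\S+", re.IGNORECASE), r"\1[REDACTED]"),
    (re.compile(r'("password"\s*:\s*")[^"]*(")'), r"\1[REDACTED]\2"),
    (re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"), "[REDACTED_EMAIL]"),
]

def redact(text):
    """Mask credentials and email addresses in a log line."""
    for pattern, replacement in _REDACTIONS:
        text = pattern.sub(replacement, text)
    return text
```

Regex-based redaction is best-effort; avoiding logging request bodies and credential headers in the first place is more reliable.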

Best Practices Checklist

  • Always use HTTPS
  • Rotate API keys regularly
  • Use minimum required permissions
  • Implement client-side rate limiting and backoff
  • Handle errors gracefully
  • Log and monitor API usage
  • Validate all inputs
  • Keep credentials out of version control
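
One common way to keep credentials out of version control is to read them from the environment at startup. A minimal sketch; the variable name `INFERNO_API_KEY` and the helper `load_api_key` are assumptions made for this example, not names defined by the API.

```python
import os

def load_api_key(env_var="INFERNO_API_KEY", environ=None):
    """Read the API key from an environment variable, failing loudly if unset."""
    environ = os.environ if environ is None else environ
    key = environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Environment variable {env_var} is not set")
    return key
```

The `environ` parameter exists so the function can be tested without touching the real process environment.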