
feat: Add support for OpenAI Agents SDK #637

@NiveditJain

Description


Allow developers to use the OpenAI Agents SDK with Exosphere as the underlying reliability and orchestration platform.

```python
# Goal: familiar OpenAI patterns, Exosphere reliability
from exospherehost.openai import agents

agent = agents.Agent(
    name="research-assistant",
    instructions="You help with research tasks",
    tools=[web_search, file_reader]
)

# Runs on Exosphere with retries, state persistence, observability
result = await agents.run(agent, "Find recent papers on LLM agents")
```

Why This Matters

| Without Exosphere | With Exosphere |
| --- | --- |
| Agent crashes = lost progress | State persisted, resumable |
| No retry logic | Configurable retry policies |
| No observability | Full execution trace in dashboard |
| Single machine | Distributed across workers |
| Manual orchestration | Graph-based multi-agent workflows |

OpenAI Agents SDK Overview

The OpenAI Agents SDK provides:

  • Agents: LLM + instructions + tools
  • Tools: Functions the agent can call
  • Handoffs: Agent-to-agent delegation
  • Guardrails: Input/output validation
  • Tracing: Execution visibility

```python
# Standard OpenAI Agents SDK usage
from agents import Agent, Runner

agent = Agent(
    name="assistant",
    instructions="You are helpful",
    tools=[my_tool]
)

result = Runner.run_sync(agent, "Hello")
```

Proposed Integration

Approach: Wrap OpenAI SDK, Execute on Exosphere

```mermaid
flowchart TB
    subgraph "Developer Code"
        DEV[from exospherehost.openai import agents]
    end

    subgraph "exospherehost.openai"
        WRAP[Thin wrapper over OpenAI SDK]
    end

    subgraph "Exosphere Runtime"
        NODE[AgentNode executes agent]
        SM[State Manager tracks progress]
    end

    DEV --> WRAP
    WRAP --> NODE
    NODE --> SM
```

Key Design Decisions

  1. Minimal wrapper: wrap the OpenAI SDK rather than reimplementing it
  2. Transparent to developers: Same API they already know
  3. Opt-in reliability: Easy to add retries, persistence
  4. Graph integration: Agents as nodes in larger workflows

Prototype Scope

In Scope

  • exospherehost.openai.agents module
  • Wrap Agent and Runner classes
  • Execute agent runs as Exosphere states
  • Basic retry support
  • Execution tracking in dashboard

Out of Scope (Future)

  • Multi-agent handoffs as graph edges
  • Streaming responses through Exosphere
  • Custom tool execution as separate nodes
  • Guardrails integration

API Design (Draft)

Simple Usage

```python
from exospherehost.openai import agents

# Define agent (same as OpenAI SDK)
assistant = agents.Agent(
    name="assistant",
    instructions="You help users with questions",
    model="gpt-4o"
)

# Run with Exosphere reliability
result = await agents.run(
    assistant,
    "What is the capital of France?",
    retry_policy={"max_retries": 3, "strategy": "EXPONENTIAL"}
)

print(result.output)
```

As Part of a Graph

```python
from pydantic import BaseModel

from exospherehost import BaseNode
from exospherehost.openai import agents

class ResearchAgent(BaseNode):
    class Inputs(BaseModel):
        query: str

    class Outputs(BaseModel):
        findings: str

    async def execute(self, inputs):
        agent = agents.Agent(
            name="researcher",
            instructions="Find information on the given topic",
            tools=[web_search]  # web_search defined elsewhere
        )

        result = await agents.run(agent, inputs.query)
        return self.Outputs(findings=result.output)
```
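Because `agents.run` calls out to a model, node logic like the above can be unit-tested by swapping in a stub. A minimal sketch of that pattern (`stub_agent_run` and `execute` are illustrative names, and a plain object stands in for the pydantic models to keep the sketch dependency-free):

```python
import asyncio
from types import SimpleNamespace

# Stub standing in for agents.run() so node logic can be tested offline;
# SimpleNamespace mimics the result object's .output attribute.
async def stub_agent_run(agent, message):
    return SimpleNamespace(output=f"stub findings for: {message}")

# Node logic reduced to a plain coroutine for the test
async def execute(query: str) -> str:
    result = await stub_agent_run(None, query)
    return result.output

findings = asyncio.run(execute("LLM agents"))
print(findings)  # → stub findings for: LLM agents
```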

Implementation Sketch

Module Structure

```
python-sdk/
  exospherehost/
    openai/
      __init__.py      # exports agents
      agents.py        # Agent, run() wrapper
      _runner.py       # Internal execution logic
```

Core Wrapper

```python
# exospherehost/openai/agents.py

import asyncio

from agents import Agent as OpenAIAgent, Runner  # openai-agents package
from exospherehost import StateManager  # future: state tracking

# Re-export Agent class unchanged
Agent = OpenAIAgent

async def run(agent: Agent, message: str, retry_policy: dict | None = None):
    """
    Execute an OpenAI agent with Exosphere reliability.

    - Tracks execution as a state
    - Applies retry policy on failure
    - Records output for observability
    """
    # For prototype: direct execution with retry wrapper
    # Future: create state in State Manager, execute via Runtime

    max_retries = retry_policy.get("max_retries", 0) if retry_policy else 0

    for attempt in range(max_retries + 1):
        try:
            return await Runner.run(agent, message)
        except Exception:
            if attempt == max_retries:
                raise
            # Exponential backoff; only the EXPONENTIAL strategy is
            # implemented in the prototype
            await asyncio.sleep(2 ** attempt)
```
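The retry loop can be exercised in isolation with a flaky stub in place of `Runner.run`. A minimal sketch (`run_with_retries` and `flaky` are illustrative names, not part of either SDK):

```python
import asyncio

async def run_with_retries(fn, *, max_retries=0, base_delay=0.01):
    """Retry an async callable with exponential backoff, mirroring the
    loop in run() above (base delay shortened for the demo)."""
    for attempt in range(max_retries + 1):
        try:
            return await fn()
        except Exception:
            if attempt == max_retries:
                raise
            await asyncio.sleep(base_delay * 2 ** attempt)

# A stub "agent run" that fails twice, then succeeds
calls = {"n": 0}
async def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "ok"

result = asyncio.run(run_with_retries(flaky, max_retries=3))
print(result, calls["n"])  # → ok 3
```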

Open Questions

  1. How deep should integration go?

    • Shallow: Just wrap run() with retries
    • Deep: Each tool call is a separate Exosphere state
  2. How to handle streaming?

  3. Multi-agent handoffs?

    • Model as graph edges between AgentNodes?
    • Or let OpenAI SDK handle internally?
  4. Dependency management?

    • Make openai-agents an optional dependency
    • pip install exospherehost[openai]
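The optional-dependency route could be implemented with a guarded import that turns a bare ImportError into an actionable install hint. A sketch (the helper name is hypothetical; the openai-agents package installs a top-level `agents` module):

```python
def require_openai_agents():
    """Import the optional openai-agents package, or fail with an
    install hint pointing at the extras syntax."""
    try:
        import agents  # module shipped by the openai-agents package
        return agents
    except ImportError as exc:
        raise ImportError(
            "The OpenAI Agents integration requires the optional "
            "'openai-agents' dependency: pip install exospherehost[openai]"
        ) from exc
```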

Effort Estimate

| Task | Effort |
| --- | --- |
| Module structure + exports | 0.5 days |
| Basic run() wrapper with retries | 1 day |
| State tracking integration | 1-2 days |
| Dashboard visibility | 1 day |
| Documentation + examples | 1 day |
| Testing | 1 day |

Total: 5-7 days for the prototype


Success Criteria

  • Developer can `from exospherehost.openai import agents`
  • Agent execution has automatic retries
  • Executions visible in Exosphere dashboard
  • Works alongside existing graph-based workflows
  • No changes required to existing OpenAI agent code
