
MiroFish — Original Design Setup

How MiroFish was designed to work out of the box.

Architecture

User uploads seed document (PDF/MD/TXT)
  |
  v
[1] Alibaba Qwen-plus LLM --> Ontology (10 entity types + relationships)
  |
  v
[2] Zep Cloud --> Knowledge Graph (entities, edges, memory)
  |
  v
[3] Qwen-plus --> 93+ Agent Profiles (persona, behavior, stance)
  |
  v
[4] OASIS Engine (camel-ai) --> Multi-Agent Simulation (Twitter + Reddit)
  |
  v
[5] Qwen-plus --> Prediction Report (ReACT agent with tools)
  |
  v
[6] Interview Mode --> Chat with any simulated agent

Prerequisites

Component   Version     Purpose
---------   ---------   ------------------------------------
Python      3.11-3.12   Backend runtime
Node.js     18+         Frontend runtime
uv          latest      Python package manager
Docker      20+         Optional: containerized deployment

Step 1: Get API Keys

Alibaba Bailian (LLM)

  1. Go to https://bailian.console.aliyun.com/
  2. Register an Alibaba Cloud account
  3. Enable the Bailian (百炼) service
  4. Create an API key under "API密钥管理"
  5. Note your key — it looks like: sk-xxxxxxxxxxxxxxxxxxxxxxxx

Model options:

  • qwen-plus — RECOMMENDED (balanced quality/speed/cost)
  • qwen-turbo — Faster, cheaper (good for testing)
  • qwen-max — Highest quality (expensive)

API endpoint: https://dashscope.aliyuncs.com/compatible-mode/v1

Pricing: Token-based. A full 93-agent simulation costs ~$1-3 with qwen-plus.
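Because Bailian exposes an OpenAI-compatible API, any OpenAI-style client can talk to the endpoint above. A stdlib-only sketch of the request shape (the model name and base URL come from this document; actually sending the request requires a valid key, and the `response_format` field matches the JSON mode MiroFish relies on):

```python
import json
import urllib.request

def build_chat_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-compatible chat request for the Bailian endpoint."""
    payload = {
        "model": "qwen-plus",
        "messages": [{"role": "user", "content": prompt}],
        # Structured JSON output, as MiroFish uses throughout.
        "response_format": {"type": "json_object"},
    }
    return urllib.request.Request(
        "https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
```

Swapping `qwen-plus` for `qwen-turbo` or `qwen-max` is the only change needed to switch models.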

Zep Cloud (Knowledge Graph + Memory)

  1. Go to https://app.getzep.com/
  2. Sign up (free, no credit card)
  3. Create a project
  4. Copy your API key from Settings → API Keys
  5. Key format: z_xxxxx.yyyyy

Free tier includes:

  • 1,000 episodes/month (enough for ~5 simulations)
  • Up to 2,000 nodes per graph
  • Basic search and memory features

Paid tier ($25/month):

  • 20,000 credits
  • Custom entity types (5-20)
  • Higher rate limits

Step 2: Configure Environment

cd MiroFish
cp .env.example .env

Edit .env:

# LLM — Alibaba Bailian Qwen-plus (ORIGINAL DESIGN)
LLM_API_KEY=sk-your-alibaba-key
LLM_BASE_URL=https://dashscope.aliyuncs.com/compatible-mode/v1
LLM_MODEL_NAME=qwen-plus

# Zep Cloud — Knowledge Graph
ZEP_API_KEY=z_your-zep-key

# Optional: Boost LLM (faster model for bulk operations)
# LLM_BOOST_API_KEY=sk-your-key
# LLM_BOOST_BASE_URL=https://dashscope.aliyuncs.com/compatible-mode/v1
# LLM_BOOST_MODEL_NAME=qwen-turbo
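A minimal sketch of how a backend might read these variables, with the boost model falling back to the main model when unset (illustrative only; MiroFish's actual settings module may differ):

```python
import os

def load_llm_config() -> dict:
    """Read the LLM settings from the .env variables above (illustrative)."""
    config = {
        "api_key": os.environ["LLM_API_KEY"],  # required, no fallback
        "base_url": os.environ.get(
            "LLM_BASE_URL",
            "https://dashscope.aliyuncs.com/compatible-mode/v1",
        ),
        "model": os.environ.get("LLM_MODEL_NAME", "qwen-plus"),
    }
    # Optional "boost" model for bulk operations; fall back to the main model.
    config["boost_model"] = os.environ.get("LLM_BOOST_MODEL_NAME", config["model"])
    return config
```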

Step 3: Install Dependencies

# One command — installs Node + Python deps
npm run setup:all

# Or step by step:
npm run setup          # Node deps (root + frontend)
npm run setup:backend  # Python deps (creates venv via uv)

Step 4: Start Services

npm run dev

This starts the backend and the frontend dev server together in one terminal.

Step 5: Use the UI

  1. Open http://localhost:3000
  2. Upload a seed document (PDF, MD, or TXT)
  3. Enter your prediction requirement in natural language
  4. Click through the 5 steps:
    • Step 1: Graph Building (ontology + Zep knowledge graph)
    • Step 2: Environment Setup (agent profiles + simulation config)
    • Step 3: Simulation (run OASIS, watch real-time)
    • Step 4: Report (ReACT agent generates analysis)
    • Step 5: Interaction (interview agents, chat with report agent)

Docker Deployment (Alternative)

cp .env.example .env
# Edit .env with your keys

docker compose up -d
# Frontend: http://localhost:3000
# Backend: http://localhost:5001

How the Simulation Engine Works

OASIS Framework (camel-ai)

  • Library: camel-oasis==0.2.5 + camel-ai==0.2.78
  • Runs as subprocess: run_parallel_simulation.py
  • Dual-platform: Twitter + Reddit simultaneously
  • Each agent has: persona, memory, behavior config, stance

Agent Actions (Twitter)

CREATE_POST, LIKE_POST, REPOST, FOLLOW, DO_NOTHING, QUOTE_POST

Agent Actions (Reddit)

LIKE_POST, DISLIKE_POST, CREATE_POST, CREATE_COMMENT,
LIKE_COMMENT, DISLIKE_COMMENT, SEARCH_POSTS, SEARCH_USER,
TREND, REFRESH, DO_NOTHING, FOLLOW, MUTE
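The two action sets can be written down as plain enums; this is a sketch for reference, not OASIS's actual Python API:

```python
from enum import Enum, auto

class TwitterAction(Enum):
    """Actions available to Twitter-platform agents (per the list above)."""
    CREATE_POST = auto()
    LIKE_POST = auto()
    REPOST = auto()
    FOLLOW = auto()
    DO_NOTHING = auto()
    QUOTE_POST = auto()

class RedditAction(Enum):
    """Actions available to Reddit-platform agents (per the list above)."""
    LIKE_POST = auto()
    DISLIKE_POST = auto()
    CREATE_POST = auto()
    CREATE_COMMENT = auto()
    LIKE_COMMENT = auto()
    DISLIKE_COMMENT = auto()
    SEARCH_POSTS = auto()
    SEARCH_USER = auto()
    TREND = auto()
    REFRESH = auto()
    DO_NOTHING = auto()
    FOLLOW = auto()
    MUTE = auto()
```

Note that Reddit agents have a richer action space (13 actions vs. 6), including dislikes and comment-level interactions that the Twitter set lacks.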

Simulation Time Model

  • Based on China Standard Time (CST) by default
  • Peak hours: 19-22 (1.5x activity multiplier)
  • Dead hours: 0-5 (0.05x activity)
  • Work hours: 9-18 (0.7x activity)
  • LLM adjusts per simulation context
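The time model above maps directly to an hour-to-multiplier function. A sketch of the stated defaults; the value for hours the document does not name (6-8 and 23) is an assumed neutral 1.0x:

```python
def activity_multiplier(hour: int) -> float:
    """Default CST activity multiplier for a given hour (0-23).

    Values follow the time model above; hours 6-8 and 23 are not
    specified in the doc, so 1.0x is an assumption.
    """
    if 0 <= hour <= 5:      # dead hours
        return 0.05
    if 9 <= hour <= 18:     # work hours
        return 0.7
    if 19 <= hour <= 22:    # peak hours
        return 1.5
    return 1.0              # unspecified hours (assumed neutral)
```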

Files Generated Per Simulation

/backend/uploads/simulations/<sim_id>/
  state.json              # Simulation metadata
  reddit_profiles.json    # Agent profiles (Reddit format)
  twitter_profiles.csv    # Agent profiles (Twitter format)
  simulation_config.json  # Full config (time, events, agents)
  reddit/actions.jsonl    # All Reddit actions
  twitter/actions.jsonl   # All Twitter actions
  simulation.log          # Process log
  run_state.json          # Real-time status

Report Agent (ReACT Pattern)

The report agent uses Reasoning + Acting:

  1. Plans table of contents
  2. For each section: reasons → calls tools → writes → reflects
  3. Available tools:
    • search(query) — Full-text Zep graph search
    • insight_forge(question) — Deep multi-dimensional search
    • panorama_search(topic) — Broad search including expired data
    • get_statistics() — Simulation statistics

Config:

REPORT_AGENT_MAX_TOOL_CALLS = 5         # Tool calls per section
REPORT_AGENT_MAX_REFLECTION_ROUNDS = 2  # Reflection iterations per section
REPORT_AGENT_TEMPERATURE = 0.5          # Lower temperature for focused writing

Key Design Decisions

  1. Alibaba Qwen-plus — chosen for its native Chinese-language support, OpenAI-compatible API, JSON-mode support, and good quality/cost balance.
  2. Zep Cloud — chosen for graph-based memory, entity extraction, temporal memory, and a free tier sufficient for development.
  3. OASIS — chosen because it is purpose-built for social-media simulation, supports dual platforms, and is compatible with ReACT agents.
  4. response_format: json_object — used throughout for structured LLM outputs.
  5. Chinese-timezone defaults — all activity patterns are based on CST, adjustable by the LLM per simulation context.