TaxHacker Claw Edition — Integration Architecture

Version: 1.0 | Date: 2026-04-07 | Author: Himanshu Jain + Claude Code

1. WHAT IS CLAW (And What It Is Not)

TaxHacker is the accounting engine: it stores transactions, parses invoices, computes GST, and generates reports. It is a database + API with a web UI, and it has no conversational interface.

Claw is the conversational skin: it lets you talk to TaxHacker via WhatsApp or Telegram. Claw does NOT store financial data. It does NOT compute GST. It translates natural language into TaxHacker API calls and formats the responses as human-friendly messages.

The division of responsibility is absolute:

  CLAW owns:                    TAXHACKER owns:
  -------------                 --------------------
  Conversation state            All transaction data
  Intent classification         GST calculations
  LLM context window            GSTR-1/GSTR-3B reports
  Message formatting            File storage & AI analysis
  Channel adapters              Authentication & rate limiting
  User session memory           Business logic & validation
  Alert scheduling              Database (PostgreSQL + pgvector)

2. SYSTEM DESIGN

[User WhatsApp/Telegram]
        |
        v
[CLAW LAYER - nano-claw Docker container]
  - Channel adapters (Telegram Bot API, Evolution API for WA)
  - Intent Router (LLM call #1: classify + extract params)
  - Session Store (Redis, 8-turn history + context snapshot)
  - Intent Handlers (one per supported action)
  - Response Formatter (LLM call #2: natural language reply)
  - LLM: OpenRouter -> gemini-flash-2.0 (primary), mistral-small (fallback)
        |
        | X-Agent-Key auth (SHA-256)
        | Internal Docker network (no public hop)
        v
[TAXHACKER AGENT API - Next.js 15 on port 3000]
  GET  /api/agent/transactions
  POST /api/agent/transactions
  GET  /api/agent/transactions/:id
  PUT  /api/agent/transactions/:id    (MISSING - add in Phase 3)
  POST /api/agent/files
  POST /api/agent/analyze
  GET  /api/agent/digest
  GET  /api/agent/search
  GET  /api/agent/gstr1
  GET  /api/agent/gstr3b
  GET  /api/agent/setup
        |
        v
[PostgreSQL 18 + pgvector + File System + LangChain AI]
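
The primary/fallback model arrangement in the Claw layer can be sketched as a small wrapper. This is an illustration only: `ModelCall` and `completeWithFallback` are hypothetical names, and the model strings are copied from the diagram rather than verified OpenRouter slugs.

```typescript
// Hypothetical signature: sends a prompt to one model and resolves with its text.
type ModelCall = (model: string, prompt: string) => Promise<string>;

// Names taken from the diagram; the exact OpenRouter slugs are an assumption.
const PRIMARY_MODEL = "gemini-flash-2.0";
const FALLBACK_MODEL = "mistral-small";

// Try the primary model first; on any error, retry once with the fallback.
async function completeWithFallback(
  call: ModelCall,
  prompt: string,
): Promise<{ model: string; text: string }> {
  try {
    return { model: PRIMARY_MODEL, text: await call(PRIMARY_MODEL, prompt) };
  } catch {
    return { model: FALLBACK_MODEL, text: await call(FALLBACK_MODEL, prompt) };
  }
}
```

If the fallback also fails, the error propagates to the caller, which is where a user-facing "try again later" reply would be produced.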

3. DATA FLOW EXAMPLES

3.1 Add Transaction by Text

User (Telegram): "Paid 1450 to Swiggy for team lunch yesterday"
  -> Channel adapter normalizes message
  -> Session manager loads last 8 turns + context snapshot
  -> Intent Router (LLM call #1, ~200 tokens):
       intent: "add_transaction"
       extracted: { merchant: "Swiggy", amount: 145000, date: "2026-04-06",
                    type: "expense", category_hint: "food/meals" }
  -> Handler: POST /api/agent/transactions
  -> TaxHacker creates transaction, returns id
  -> Response Formatter (LLM call #2, ~100 tokens):
       "Done. Recorded Rs. 1,450 expense at Swiggy (Meals & Entertainment)
        for Apr 6. Food spend this month: Rs. 8,340."
  -> Sent back to user
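
Two details in the flow above are easy to get wrong: converting rupees to integer paisa, and resolving "yesterday" to a calendar date. A minimal sketch follows; the helper names are hypothetical, and the field names follow the extracted example rather than a confirmed API schema.

```typescript
// Shape of the parameters extracted by the intent router, following the example above.
interface ExtractedTransaction {
  merchant: string;
  amount: number; // integer paisa
  date: string; // YYYY-MM-DD
  type: "expense" | "income";
  category_hint?: string;
}

// Convert rupees to integer paisa; rounding guards against float drift.
function rupeesToPaisa(rupees: number): number {
  return Math.round(rupees * 100);
}

// Resolve "yesterday" relative to a given day (YYYY-MM-DD), in UTC to avoid timezone shifts.
function yesterdayOf(today: string): string {
  const d = new Date(today + "T00:00:00Z");
  d.setUTCDate(d.getUTCDate() - 1);
  return d.toISOString().slice(0, 10);
}

const extracted: ExtractedTransaction = {
  merchant: "Swiggy",
  amount: rupeesToPaisa(1450),
  date: yesterdayOf("2026-04-07"),
  type: "expense",
  category_hint: "food/meals",
};
```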

3.2 Receipt Upload Flow

User sends photo of invoice
  -> Channel downloads media buffer
  -> Intent: upload_receipt (auto-detected from attachment)
  -> Step 1: POST /api/agent/files (multipart) -> { fileId }
  -> Step 2: POST /api/agent/analyze { fileId } -> extracted data
  -> Step 3: POST /api/agent/transactions (from extracted data)
  -> Reply: "Invoice from Zomato Platform Pvt Ltd processed.
             Rs. 2,360 (incl. GST Rs. 360 at 18%). GSTIN: 27AAJCZ...
             Saved under Business Expenses. Want to change anything?"
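
The three-step receipt flow can be sketched against an injected client so it is testable without a running TaxHacker. The `ReceiptApi` interface, its method names, and the response fields (`merchant`, `total`) are assumptions for illustration; only the three paths and `fileId` come from the flow above.

```typescript
// Hypothetical client interface; each method wraps one step of the flow.
interface ReceiptApi {
  // POST /api/agent/files (multipart)
  uploadFile(file: Uint8Array, filename: string): Promise<{ fileId: string }>;
  // POST /api/agent/analyze; the response shape here is assumed
  analyze(fileId: string): Promise<{ merchant: string; total: number }>;
  // POST /api/agent/transactions; passing fileId along is an assumption
  createTransaction(tx: { merchant: string; total: number; fileId: string }): Promise<{ id: string }>;
}

// Run upload -> analyze -> save strictly in order; any failure aborts the chain.
async function processReceipt(
  api: ReceiptApi,
  file: Uint8Array,
  filename: string,
): Promise<{ id: string; merchant: string; total: number }> {
  const { fileId } = await api.uploadFile(file, filename);
  const extracted = await api.analyze(fileId);
  const { id } = await api.createTransaction({ ...extracted, fileId });
  return { id, ...extracted };
}
```

Because each step awaits the previous one, a failed analysis never creates a transaction; the uploaded file would remain and could be retried.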

3.3 Spend Query Flow

User: "How much did I spend on travel this month?"
  -> Intent: query_spend, filters: { category: travel, dateFrom: 2026-04-01 }
  -> GET /api/agent/transactions?categoryCode=travel&dateFrom=...
  -> Reply: "Travel spend Apr 1-7: Rs. 45,000
             Top entries: Uber Rs. 12,000 | IndiGo Rs. 28,000 | Parking Rs. 5,000"
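
Turning the extracted filters into the GET request is a small query-string exercise. The sketch below uses the standard `URLSearchParams`; `SpendFilters` and `buildTransactionsQuery` are hypothetical names, and the parameter names are the ones listed for `query_spend` in this document.

```typescript
// Filters the intent router may extract for a spend query.
interface SpendFilters {
  search?: string;
  categoryCode?: string;
  dateFrom?: string; // YYYY-MM-DD
  dateTo?: string; // YYYY-MM-DD
}

// Build the request path, skipping filters that were not extracted.
function buildTransactionsQuery(filters: SpendFilters): string {
  const keys: (keyof SpendFilters)[] = ["search", "categoryCode", "dateFrom", "dateTo"];
  const params = new URLSearchParams();
  for (const key of keys) {
    const value = filters[key];
    if (value) params.set(key, value);
  }
  const qs = params.toString();
  return qs ? `/api/agent/transactions?${qs}` : "/api/agent/transactions";
}

const path = buildTransactionsQuery({ categoryCode: "travel", dateFrom: "2026-04-01" });
// path === "/api/agent/transactions?categoryCode=travel&dateFrom=2026-04-01"
```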

4. LLM CONTEXT STRUCTURE

Claw makes at most 2 LLM calls per user message: one to classify intent, one to format the response. Context is bounded to keep token spend predictable.

4.1 System Prompt (~400 tokens, static per session)

You are a financial assistant for {businessName}, an Indian SME.
GSTIN: {gstin}. Currency: INR. Today: {date}. FY: April to March.

You help the user:
- Add income and expenses by natural language
- Query spending by category, date range, merchant
- Understand GST liability
- Get daily/monthly/quarterly summaries

Rules:
- Respond in the same language the user writes in (Hindi, English, Hinglish all fine)
- All amounts are INR unless stated. Storage format is paisa (Rs. 1,450 = 145000).
- Dates default to today. "last month" = calendar month, not rolling 30 days.
- Available categories: {category_list}
- Available projects: {project_list}

Return intent classification as JSON. Never make up transaction data.
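
The `{placeholder}` slots in the prompt need to be filled once per session. A minimal sketch of that substitution follows; `renderPrompt` is a hypothetical helper, and failing loudly on a missing variable is a design choice made here, not something the specification requires.

```typescript
// Replace {name} placeholders with values; throw if any placeholder has no value.
function renderPrompt(template: string, vars: Record<string, string>): string {
  return template.replace(/\{(\w+)\}/g, (_match, name: string) => {
    const value = vars[name];
    if (value === undefined) throw new Error(`Missing prompt variable: ${name}`);
    return value;
  });
}

// Example with two lines of the prompt above; all values are placeholders.
const rendered = renderPrompt(
  "You are a financial assistant for {businessName}, an Indian SME.\nGSTIN: {gstin}. Currency: INR. Today: {date}.",
  { businessName: "Example Traders", gstin: "TEST-GSTIN", date: "2026-04-07" },
);
```

Throwing on a missing variable keeps a half-rendered prompt (with a literal `{gstin}` in it) from ever reaching the model.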

4.2 Context Snapshot (~300 tokens, Redis-cached, 1h TTL)

Populated by GET /api/agent/digest. Lets the LLM answer follow-up questions without a live API call on every message.

{
  "month_summary": {
    "income": 125000000,
    "expense": 89000000,
    "net": 36000000,
    "gst_collected": 12500000,
    "gst_paid": 8900000
  },
  "top_categories": ["travel", "meals_entertainment", "software"],
  "recent_merchants": ["Swiggy", "Uber", "AWS", "Zomato"],
  "pending_gst_period": "032026",
  "open_invoice_count": 3
}
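
The JSON above uses snake_case with a nested `month_summary`, while the session's `ContextSnapshot` interface later in this document is flat and camelCase. A sketch of the mapping between the two, assuming the cached JSON has exactly the shape shown (`DigestSnapshotJson` and `toContextSnapshot` are hypothetical names):

```typescript
// Shape of the cached JSON shown above (assumed to be exact).
interface DigestSnapshotJson {
  month_summary: { income: number; expense: number; net: number; gst_collected: number; gst_paid: number };
  top_categories: string[];
  recent_merchants: string[];
  pending_gst_period: string | null;
  open_invoice_count: number;
}

// Flat session-side shape, matching the ContextSnapshot interface in this document.
interface ContextSnapshot {
  monthIncome: number; // paisa
  monthExpense: number; // paisa
  gstCollected: number; // paisa
  gstPaid: number; // paisa
  topCategories: string[];
  recentMerchants: string[];
  openInvoiceCount: number;
  pendingGstPeriod: string | null;
}

// "net" is dropped because it can be derived as monthIncome - monthExpense.
function toContextSnapshot(json: DigestSnapshotJson): ContextSnapshot {
  return {
    monthIncome: json.month_summary.income,
    monthExpense: json.month_summary.expense,
    gstCollected: json.month_summary.gst_collected,
    gstPaid: json.month_summary.gst_paid,
    topCategories: json.top_categories,
    recentMerchants: json.recent_merchants,
    openInvoiceCount: json.open_invoice_count,
    pendingGstPeriod: json.pending_gst_period,
  };
}
```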

4.3 Conversation History (last 8 turns, ~500 tokens max)

Bounded to 8 turns. Older messages drop FIFO. Financial data is authoritative in TaxHacker's database, not in chat history.
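
FIFO eviction reduces to appending and keeping the tail. The sketch below assumes each stored message counts as one turn; if a turn is meant to be a user/assistant pair, the cap would be 16 entries instead. `Message` and `appendTurn` are hypothetical names.

```typescript
// Minimal message shape; the real Message type is not defined in this document.
interface Message {
  role: "user" | "assistant";
  text: string;
}

const MAX_TURNS = 8;

// Append a message and drop the oldest entries beyond the cap (FIFO).
function appendTurn(history: Message[], message: Message, maxTurns: number = MAX_TURNS): Message[] {
  return [...history, message].slice(-maxTurns);
}
```

Returning a new array rather than mutating keeps the function easy to test and safe to use on a session object that is about to be serialized to Redis.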

4.4 Total Context Budget Per Request

Component	Tokens	Refresh Cadence
System prompt	~400	Static per session
Context snapshot	~300	1h TTL (Redis)
Conversation history	~500	Every message
Current message + intent schema	~200	Every message
Total per request	~1,400	—

At Gemini Flash 2.0 pricing (~$0.075/1M tokens), ~1,400 tokens cost about $0.0001 per message. At 1,000 messages/day that is roughly $0.10/day, or ~$3/month.
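
The arithmetic behind that estimate, worked through under the document's own figures. It assumes a flat per-token price, a 30-day month, and no output-token cost, so it is a lower bound rather than a quote.

```typescript
// Token budget from the table above: prompt + snapshot + history + message/schema.
const TOKENS_PER_REQUEST = 400 + 300 + 500 + 200; // 1,400
const USD_PER_MILLION_TOKENS = 0.075; // approximate price quoted in this document

// Cost of one message at a flat per-token price.
function costPerMessage(tokens: number, usdPerMillion: number): number {
  return (tokens / 1_000_000) * usdPerMillion;
}

// Monthly cost for a given daily volume, assuming a 30-day month.
function monthlyCost(messagesPerDay: number, days: number = 30): number {
  return costPerMessage(TOKENS_PER_REQUEST, USD_PER_MILLION_TOKENS) * messagesPerDay * days;
}

const perMessage = costPerMessage(TOKENS_PER_REQUEST, USD_PER_MILLION_TOKENS); // about $0.000105
const perMonth = monthlyCost(1000); // about $3.15
```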

5. INTEGRATION POINTS WITH EXISTING AGENT API

5.1 Claw-to-TaxHacker Endpoint Map

Claw Intent	HTTP Method + Path	Notes
add_transaction	POST /api/agent/transactions	total in paisa (integer)
edit_transaction	PUT /api/agent/transactions/:id	MISSING - Phase 3 addition
query_spend	GET /api/agent/transactions	search, dateFrom, dateTo, categoryCode
upload_receipt	POST /api/agent/files	multipart/form-data, returns fileId
analyze_receipt	POST /api/agent/analyze	{ fileId } -> structured extraction
daily_digest	GET /api/agent/digest?date=today	stats + 10 recent transactions
monthly_summary	GET /api/agent/digest?dateFrom=&dateTo=	custom range
gst_summary	GET /api/agent/gstr1?period=MMYYYY	full GSTR-1 data
semantic_search	GET /api/agent/search?q=	pgvector cosine similarity
session_bootstrap	GET /api/agent/setup	categories, projects, settings
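
The endpoint map translates directly into a typed lookup table, which lets the compiler flag an intent that has no route. The sketch below copies intents, methods, and paths from the table; `ENDPOINTS` and `resolveEndpoint` are hypothetical names, and query parameters are left to the caller.

```typescript
type HttpMethod = "GET" | "POST" | "PUT";

type ClawIntent =
  | "add_transaction" | "edit_transaction" | "query_spend" | "upload_receipt"
  | "analyze_receipt" | "daily_digest" | "monthly_summary" | "gst_summary"
  | "semantic_search" | "session_bootstrap";

// One entry per row of the endpoint map.
const ENDPOINTS: Record<ClawIntent, { method: HttpMethod; path: string }> = {
  add_transaction: { method: "POST", path: "/api/agent/transactions" },
  edit_transaction: { method: "PUT", path: "/api/agent/transactions/:id" }, // missing until Phase 3
  query_spend: { method: "GET", path: "/api/agent/transactions" },
  upload_receipt: { method: "POST", path: "/api/agent/files" },
  analyze_receipt: { method: "POST", path: "/api/agent/analyze" },
  daily_digest: { method: "GET", path: "/api/agent/digest" },
  monthly_summary: { method: "GET", path: "/api/agent/digest" },
  gst_summary: { method: "GET", path: "/api/agent/gstr1" },
  semantic_search: { method: "GET", path: "/api/agent/search" },
  session_bootstrap: { method: "GET", path: "/api/agent/setup" },
};

// Resolve an intent to a concrete method and path, filling :id where required.
function resolveEndpoint(intent: ClawIntent, id?: string): { method: HttpMethod; path: string } {
  const endpoint = ENDPOINTS[intent];
  if (!endpoint.path.includes(":id")) return endpoint;
  if (!id) throw new Error(`Intent ${intent} requires a transaction id`);
  return { method: endpoint.method, path: endpoint.path.replace(":id", encodeURIComponent(id)) };
}
```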

5.2 Authentication Contract

Header:     X-Agent-Key: thk_{64 hex chars}
Server:     SHA-256 hash incoming key, timing-safe compare against stored hash
Storage:    Settings table, code "agent_api_key", encrypted at rest (read via decrypt())
Rate limit: 60 req/min per user (in-process Map)
Mode:       Self-hosted only. Cloud mode returns 403.
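
A sketch of what the server side of this contract might look like, using Node's built-in `crypto` module. It is not TaxHacker's actual code: hashing the full key including the `thk_` prefix, lowercase hex, and the fixed-window limiter are all assumptions, and every function name is hypothetical.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// SHA-256 of a string, as lowercase hex.
function sha256Hex(value: string): string {
  return createHash("sha256").update(value).digest("hex");
}

// Validate the X-Agent-Key header against the stored hash with a timing-safe compare.
function isValidAgentKey(presented: string | undefined, storedHashHex: string): boolean {
  if (!presented || !/^thk_[0-9a-f]{64}$/.test(presented)) return false;
  const a = Buffer.from(sha256Hex(presented), "hex");
  const b = Buffer.from(storedHashHex, "hex");
  // timingSafeEqual throws on unequal lengths, so check length first.
  return a.length === b.length && timingSafeEqual(a, b);
}

// In-process fixed-window limiter: at most `limit` requests per user per window.
const windows = new Map<string, { start: number; count: number }>();

function allowRequest(userId: string, now: number, limit: number = 60, windowMs: number = 60_000): boolean {
  const w = windows.get(userId);
  if (!w || now - w.start >= windowMs) {
    windows.set(userId, { start: now, count: 1 });
    return true;
  }
  if (w.count >= limit) return false;
  w.count += 1;
  return true;
}
```

An in-process `Map` resets on restart and is not shared across instances, which is acceptable for the single-container, self-hosted mode described here.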

5.3 ONE Missing Endpoint (the only TaxHacker code change needed)

PUT /api/agent/transactions/:id does not exist yet. Adding it requires ~40 lines following the same pattern as the POST handler. It is the only TaxHacker code change needed across all Claw phases.

5.4 Zero TaxHacker Changes for Phases 1 and 2

Phases 1 and 2 (working Telegram bot) require no changes to TaxHacker. TaxHacker is a black box that Claw calls over HTTP.

6. SCOPE: MVP vs FULL

MVP - Basic CRUD via Chat (Phases 1-3, 5.5 developer days)

Capability	Input	Output
Add expense	"Paid 500 to Swiggy"	Confirmation + category
Add income	"Received 50K from client"	Confirmation
Upload receipt photo	Photo or PDF	AI-extracted + saved + confirmed
Query spend	"How much on travel this month?"	Amount + top transactions
Daily digest	"What happened today?"	Income/expense count + totals
GST query	"What's my GST this quarter?"	GSTR-1 summary figures
Edit entry	"Make that 600 not 500"	Updated confirmation
Semantic search	"Show petrol expenses"	Matching transactions

Not in MVP: bulk CSV import, PDF report delivery via chat, scheduled proactive alerts, WhatsApp Business API (Telegram first), voice note transcription.

Full - Analysis, Recommendations, Alerts (Phase 4+, weeks 5-12)

Capability	What It Adds
Proactive alerts	"No expenses logged in 3 days"
Tax deadline warnings	"GSTR-1 due in 5 days. Estimated: Rs. 28,400."
Spend anomaly detection	"Travel up 80% vs last month"
CA report sharing	GSTR-1 PDF sent to CA's WhatsApp number
Voice note support	Audio -> Whisper transcription -> intent pipeline
Reminder scheduling	"Remind me to record petrol every Friday 6pm"
WhatsApp Business API	Production channel via Evolution API
AATO projection	"At this rate you exceed Rs. 40L by December"
Email receipt forwarding	Forward bill email -> auto-processed
Hindi/Hinglish auto-response	Detect locale, respond in kind

7. CLAW INTERNAL FILE STRUCTURE

nano-claw/
├── src/
│   ├── index.ts                  Entry point, starts all channel listeners
│   ├── channels/
│   │   ├── telegram.ts           Telegram Bot API adapter (Phase 2)
│   │   └── whatsapp.ts           Evolution API adapter (Phase 4)
│   ├── core/
│   │   ├── intent.ts             LLM call #1 - classify intent + extract params
│   │   ├── context.ts            Build LLM context from session + API snapshot
│   │   ├── session.ts            Redis session load/save/expire
│   │   └── formatter.ts          LLM call #2 - format API response as message
│   ├── handlers/
│   │   ├── add_transaction.ts
│   │   ├── query_spend.ts
│   │   ├── upload_receipt.ts
│   │   ├── daily_digest.ts
│   │   ├── gst_summary.ts
│   │   ├── edit_transaction.ts
│   │   └── semantic_search.ts
│   ├── api/
│   │   └── taxhacker.ts          Typed wrapper for all Agent API endpoints
│   └── config.ts                 Env vars, LLM config, TaxHacker base URL
├── docker-compose.yml
├── .env.example
└── package.json

Session Shape (Redis key: `session:{userId}`)

interface ClawSession {
  userId:         string
  channelId:      string           // Telegram chat ID or WhatsApp number
  history:        Message[]        // last 8 turns, FIFO eviction
  contextSnap:    ContextSnapshot
  contextSnapAge: number           // unix ms, refresh when > 1h old
  lastActive:     number
  locale:         "en" | "hi" | "hinglish"
}

interface ContextSnapshot {
  monthIncome:        number       // paisa
  monthExpense:       number       // paisa
  gstCollected:       number       // paisa
  gstPaid:            number       // paisa
  topCategories:      string[]
  recentMerchants:    string[]
  openInvoiceCount:   number
  pendingGstPeriod:   string | null
}
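
The `contextSnapAge` field drives the 1h refresh. A minimal sketch of that check follows, with the fetch injected so it can be exercised without Redis or TaxHacker. `SnapshotHolder`, `isSnapshotStale`, and `ensureFreshSnapshot` are hypothetical names.

```typescript
const SNAPSHOT_TTL_MS = 60 * 60 * 1000; // 1h, per the comment on contextSnapAge

// The two session fields involved in the refresh decision.
interface SnapshotHolder<S> {
  contextSnap: S;
  contextSnapAge: number; // unix ms when the snapshot was taken
}

// A snapshot is stale once it is strictly older than the TTL.
function isSnapshotStale(takenAtMs: number, nowMs: number): boolean {
  return nowMs - takenAtMs > SNAPSHOT_TTL_MS;
}

// Return the session unchanged if fresh; otherwise fetch and stamp a new snapshot.
async function ensureFreshSnapshot<S>(
  session: SnapshotHolder<S>,
  fetchSnapshot: () => Promise<S>,
  nowMs: number = Date.now(),
): Promise<SnapshotHolder<S>> {
  if (!isSnapshotStale(session.contextSnapAge, nowMs)) return session;
  return { ...session, contextSnap: await fetchSnapshot(), contextSnapAge: nowMs };
}
```

In practice `fetchSnapshot` would wrap the GET /api/agent/digest call; it is only invoked when the cached copy has expired.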

8. DEPLOYMENT TOPOLOGY

OVH VPS (8 cores, 22GB RAM)
├── Coolify
│   ├── taxhacker    port 3000  ->  taxhacker.yourdomain.com  (Traefik)
│   │   ├── Next.js 15 app
│   │   └── PostgreSQL 18 with pgvector
│   │
│   └── nano-claw   port 4000  ->  no public exposure needed
│       ├── Node.js 24 service
│       └── Redis on port 6379 (shared with existing VPS services)
│
└── [existing] SearXNG, Crawl4ai, n8n, LightPanda

Claw communicates with TaxHacker over Docker's internal network via http://taxhacker:3000, with no public internet hop. The Claw service itself publishes no port; Telegram and WhatsApp deliver webhooks through a Traefik route to /claw/webhook, which is the only externally reachable path.

Environment Variables (`.env` for nano-claw)

TAXHACKER_BASE_URL=http://taxhacker:3000
TAXHACKER_AGENT_KEY=thk_...
OPENROUTER_API_KEY=...
TELEGRAM_BOT_TOKEN=...
REDIS_URL=redis://redis:6379
CLAW_WEBHOOK_SECRET=...
EVOLUTION_API_URL=...     # Phase 4 only
EVOLUTION_API_KEY=...     # Phase 4 only
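
A sketch of how `config.ts` might validate these variables at startup. The split into required and Phase 4 optional variables follows the comments above; `ClawConfig` and `loadConfig` are hypothetical names, and the environment is passed in as a plain object so the function can be tested.

```typescript
interface ClawConfig {
  taxhackerBaseUrl: string;
  taxhackerAgentKey: string;
  openrouterApiKey: string;
  telegramBotToken: string;
  redisUrl: string;
  clawWebhookSecret: string;
  evolutionApiUrl?: string; // Phase 4 only
  evolutionApiKey?: string; // Phase 4 only
}

const REQUIRED = [
  "TAXHACKER_BASE_URL", "TAXHACKER_AGENT_KEY", "OPENROUTER_API_KEY",
  "TELEGRAM_BOT_TOKEN", "REDIS_URL", "CLAW_WEBHOOK_SECRET",
] as const;

// Fail fast, naming every missing variable at once.
function loadConfig(env: Record<string, string | undefined>): ClawConfig {
  const missing = REQUIRED.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required env vars: ${missing.join(", ")}`);
  }
  return {
    taxhackerBaseUrl: env.TAXHACKER_BASE_URL as string,
    taxhackerAgentKey: env.TAXHACKER_AGENT_KEY as string,
    openrouterApiKey: env.OPENROUTER_API_KEY as string,
    telegramBotToken: env.TELEGRAM_BOT_TOKEN as string,
    redisUrl: env.REDIS_URL as string,
    clawWebhookSecret: env.CLAW_WEBHOOK_SECRET as string,
    evolutionApiUrl: env.EVOLUTION_API_URL,
    evolutionApiKey: env.EVOLUTION_API_KEY,
  };
}
```

At startup this would be called as `loadConfig(process.env)`, so a misconfigured container exits immediately instead of failing on the first message.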

9. IMPLEMENTATION TIMELINE

Phase 1 - Core Pipeline (Week 1, ~2 developer days)

Task	Hours	Output File
Scaffold repo (TypeScript, Docker Compose)	2h	Project structure
TaxHacker API wrapper, typed client for all 9 endpoints	3h	api/taxhacker.ts
Redis session manager (load/save/expire)	2h	core/session.ts
LLM context builder	2h	core/context.ts
Intent classifier (LLM call #1, structured JSON output)	3h	core/intent.ts
Response formatter (LLM call #2, natural language)	2h	core/formatter.ts

Deliverable: Full pipeline tested end-to-end without a real channel.

Phase 2 - Telegram Bot (Week 2, ~2 developer days)

Task	Hours	Output
Telegram channel adapter (webhook receive + send)	3h	channels/telegram.ts
Handler: add_transaction	2h	Text -> create transaction
Handler: query_spend (date/category filter)	2h	Spend summary reply
Handler: daily_digest	1h	"What happened today?"
Handler: upload_receipt (3-step: upload, analyze, save)	3h	Photo processing
Deploy to VPS, end-to-end test with real Telegram	2h	Live bot

Deliverable: Working Telegram bot deployed on VPS. Personal daily use begins.

Phase 3 - Polish + Full MVP Scope (Week 3, ~1.5 developer days)

Task	Hours	Notes
Add PUT /api/agent/transactions/:id to TaxHacker	1.5h	Only TaxHacker code change
Handler: edit_transaction	2h	"Change that to Rs. 600"
Handler: gst_summary	1.5h	Monthly/quarterly GST
Handler: semantic_search	1h	"Show petrol expenses"
Hindi/Hinglish locale detection + prompt tuning	2h	Respond in kind
Error handling (never surface raw API errors to user)	2h	Graceful failures

Deliverable: All 8 intents working. Full MVP scope complete.

Phase 4 - WhatsApp + Proactive Features (Weeks 5-8, ~8 developer days)

Task	Hours	Outcome
Evolution API WhatsApp adapter	3h	WA channel live
Cron-based proactive alert system	4h	GST reminders, inactivity alerts
Spend anomaly detection (MTD vs prior month)	3h	Trend warnings
GSTR-1 PDF delivery via chat	4h	CA sharing use case
Voice note pipeline (Whisper transcription)	4h	Audio messages supported
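
The spend anomaly check reduces to a percent-change comparison. The sketch below is one possible shape: the 50% threshold is an arbitrary assumption, the function names are hypothetical, and the caller is expected to supply comparable figures (for example, both month-to-date through the same day).

```typescript
// Percent change from previous to current; null when there is no baseline.
function percentChange(current: number, previous: number): number | null {
  if (previous === 0) return null;
  return ((current - previous) / previous) * 100;
}

// Produce an alert line when spend rose by at least thresholdPct, else null.
function spendAlert(
  category: string,
  currentPaisa: number,
  previousPaisa: number,
  thresholdPct: number = 50, // assumed threshold, not specified in this document
): string | null {
  const change = percentChange(currentPaisa, previousPaisa);
  if (change === null || change < thresholdPct) return null;
  return `${category} up ${Math.round(change)}% vs last month`;
}

const alert = spendAlert("Travel", 4500000, 2500000);
// alert === "Travel up 80% vs last month"
```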

10. EFFORT AND COST SUMMARY

Phase	Developer Days	LLM Cost/month	Infra Cost
Phase 1 - Foundation	2	—	—
Phase 2 - Telegram MVP	2	~$3	0 (Redis already on VPS)
Phase 3 - Full MVP	1.5	~$3	0
Phase 4 - WhatsApp + Alerts	8	~$5	0 (Evolution API self-hosted)
Total MVP (P1+P2+P3)	5.5 days	~$3/month	free

11. RISKS AND MITIGATIONS

Risk	Probability	Mitigation
LLM misclassifies intent	Medium	Echo extracted data before writing; edit handler covers corrections
LLM hallucinates amounts	Low	Always confirm extracted values in message before saving
WhatsApp Business API rejection by Meta	Medium	Telegram-first; WA is Phase 4, not Phase 2
Redis session loss on VPS restart	Low	Redis AOF persistence; history is convenience not source of truth
TaxHacker rate limit (60 req/min)	Low	Single-user personal use; 60/min is ample
VPS data residency in France	High	Already flagged in TAXHACKER_DNA.md; migrate to Mumbai before multi-tenant
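
The first two mitigations both rely on echoing extracted values back to the user before anything is written. A sketch of that echo follows, including paisa-to-rupee formatting with Indian digit grouping. Function names and the wording of the prompt are hypothetical.

```typescript
// Format integer paisa as "Rs. 1,450" using Indian grouping (last three digits, then pairs).
function formatInr(paisa: number): string {
  const abs = Math.abs(paisa);
  const digits = String(Math.floor(abs / 100));
  const fraction = abs % 100;
  const head = digits.slice(0, -3);
  const tail = digits.slice(-3);
  const grouped = head ? head.replace(/\B(?=(\d{2})+(?!\d))/g, ",") + "," + tail : tail;
  const paise = fraction ? "." + (fraction < 10 ? "0" : "") + fraction : "";
  return `${paisa < 0 ? "-" : ""}Rs. ${grouped}${paise}`;
}

// Build the confirmation shown to the user before the transaction is saved.
function confirmationPrompt(tx: { merchant: string; amount: number; date: string; type: string }): string {
  return `About to record ${formatInr(tx.amount)} ${tx.type} at ${tx.merchant} for ${tx.date}. Reply YES to save or send a correction.`;
}
```

Formatting from the stored paisa value, rather than from the user's original text, means the echo shows exactly what would be written.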

12. WHAT CHANGES WHEN CLAW IS LIVE

Before Claw: "Upload your invoices to a web app when you get around to it."

After Claw: "Forward any bill to this number right now. It is filed automatically."

That is the difference between software and a service. It is also the pitch that closes at Rs. 5,000/month per client with zero training required, because everyone already knows how to forward a WhatsApp photo.

The accounting engine is built. Claw is the last-mile delivery mechanism.

KEY ARCHITECTURAL DECISIONS

Separation boundary: Claw owns nothing financial. It is a translation layer whose only state is disposable conversation context in Redis. Redeploy, reset, or replace Claw without touching any accounting data. TaxHacker is the single source of truth.

2 LLM calls per message maximum: Call 1 classifies intent and extracts structured parameters. Call 2 formats the response naturally. This is intended to keep latency under 2 seconds and cost around $3/month at 1,000 messages/day.

Context snapshot pattern: Rather than querying TaxHacker on every message, a snapshot is cached in Redis for 1 hour via /api/agent/digest. The expectation is that roughly 90% of conversations need no extra API call for context.

Telegram before WhatsApp: Telegram Bot API requires no approval, no business registration, and no Meta review, so a working bot is possible in hours. WhatsApp Business API takes weeks of Meta approval. Phase 2 ships a fully functional Telegram bot. WhatsApp is a Phase 4 upgrade.

One missing endpoint in TaxHacker: PUT /api/agent/transactions/:id is the only TaxHacker code change needed across all Claw phases. ~40 lines.

Revenue unlock: The service pitch changes from "here is a web app" to "forward your bills here and I handle the rest." That is what closes at Rs. 5,000/month.

This document is the implementation-ready specification for TaxHacker Claw Edition. Next action: scaffold the nano-claw repository and begin Phase 1.
