A small demo app that allows a user to interrogate the FinCEN data in Neo4j via natural language.
This project serves as a practical demonstration of modern GenAI engineering patterns, focusing on reliable interactions with structured databases:
- PydanticAI Framework: The core agent is built using
pydantic-ai, enforcing type-safe, structured outputs. This ensures the model's responses adhere strictly to defined schemas (e.g., separating findings from extracted entities). - Model Context Protocol (MCP): The agent connects to Neo4j via an MCP server. This provides the LLM with native tool access to query the graph database directly, allowing it to autonomously gather context.
- Self-Reflection & Validation: Domain-level validation is used to verify the LLM's output. For example, if the model indicates data was found but returns an empty answer, a retry is triggered, prompting the agent to correct itself.
- Observability: The agent is instrumented with Langfuse for detailed tracing, providing visibility into the broader LLM calls, tool usage, and prompt execution.
- Evaluation: An evaluation suite powered by
pydantic-evalshelps measure the agent's performance and accuracy against a dataset of test cases over time. - Durable Multi-Agent Orchestration: A deep research investigation mode uses Temporal to orchestrate a three-agent workflow (planner, researcher, synthesiser) with durable execution guarantees. Each step is automatically retried on failure and completed steps are never re-executed.
sequenceDiagram
participant User
participant FastAPI
participant Temporal as Temporal Server
participant Planner as Planning Agent
participant Researcher as Research Agent
participant Synthesiser as Synthesis Agent
participant Neo4j as Neo4j MCP
User->>FastAPI: POST /api/v1/investigations
FastAPI->>Temporal: Start InvestigationWorkflow
FastAPI-->>User: investigation_id
Temporal->>Planner: Decompose query
Planner-->>Temporal: ResearchPlan with sub-queries
loop For each sub-query
Temporal->>Researcher: Execute sub-query
Researcher->>Neo4j: Query graph via MCP
Neo4j-->>Researcher: Graph data
Researcher-->>Temporal: SubQueryResult
end
Temporal->>Synthesiser: Synthesise all findings
Synthesiser-->>Temporal: InvestigationReport
User->>FastAPI: GET /investigations/{id}/status
FastAPI->>Temporal: Query workflow state
Temporal-->>FastAPI: Status and progress
FastAPI-->>User: Status response
User->>FastAPI: GET /investigations/{id}/result
FastAPI->>Temporal: Get workflow result
Temporal-->>FastAPI: InvestigationReport
FastAPI-->>User: Full report
Provide a Google Gemini API key (or change the MODEL variable in .env to another provider and provide the relevant API key).
mise run setup # Boot Docker services and load FinCEN data
mise run run # Start the FastAPI dev server
mise run teardown # Stop services and delete local dataThese are simple, single-agent queries for entity lookups and basic relationship analysis.
- 'Which banks appear in the FinCEN files?'
- 'What countries are most frequently mentioned in suspicious activity reports?'
- 'Show me the transaction flow involving Deutsche Bank.'
- 'What entities are connected to HSBC in the FinCEN data?'
These complex, multi-stage queries leverage the Temporal-orchestrated workflow for in-depth research.
- 'Investigate potential money laundering networks linking offshore shell companies in the British Virgin Islands to high-value real estate purchases in Miami and New York between 2018 and 2022. Map out the key front companies and financial intermediaries involved.'
- 'Identify suspicious patterns of rapid fund transfers through correspondent banking accounts at JP Morgan Chase and Standard Chartered, specifically involving Baltic-region entities and known tax havens, highlighting any links to entities under international sanctions.'
- 'Examine the role of professional money laundering facilitators (lawyers, accountants, and trust providers) based in Cyprus and Malta who have assisted Eastern European PEPs (Politically Exposed Persons) in moving assets into the London property market.'
.env doesn't contain any real secrets and I wanted to keep manual configuration to a minimum, hence why it isn't in .gitignore (or, even better, a secrets manager isn't used).
The guardrails are very limited and basic at the moment but the model will decline to answer many messages that aren't related to the FinCEN data.
The evals have some failures (93% pass rate)- I'll address these at some point soon.