An open benchmark for evaluating the accuracy of Japanese financial data APIs against EDINET XBRL filings.
EDINET is operated by Japan's Financial Services Agency (金融庁). It is the authoritative source for Japanese corporate financial disclosures. This benchmark checks whether data providers correctly parse and return the values from these filings.
| # | Provider | Accuracy | Score |
|---|---|---|---|
| 1 | Axiora | 100.00% | 285 / 285 |
| 2 | edinetdb.jp | 74.74% | 213 / 285 |
Excluding total_equity (which edinetdb.jp does not serve): edinetdb.jp scores 213/235 (90.6%).
| Field | Axiora | edinetdb.jp | Notes |
|---|---|---|---|
| revenue | 50/50 (100%) | 45/50 (90%) | |
| net_income | 50/50 (100%) | 48/50 (96%) | |
| operating_income | 35/35 (100%) | 29/35 (83%) | |
| total_assets | 50/50 (100%) | 48/50 (96%) | |
| total_equity | 50/50 (100%) | 0/50 (0%) | edinetdb.jp returns null for all companies |
| eps | 50/50 (100%) | 43/50 (86%) | |
```bash
git clone https://github.com/axioradev/edinet-benchmark.git
cd edinet-benchmark
pip install httpx numpy
export AXIORA_API_KEY="..."    # https://axiora.dev
export EDINETDB_API_KEY="..."  # https://edinetdb.jp
python compare.py
```

Results are written to `results/`.
You only need API keys for the providers you want to test. Providers without a key are skipped.
- Add an entry to `providers.json`:

  ```json
  {
    "name": "your_api",
    "display_name": "Your API",
    "base_url": "https://api.example.com/v1",
    "api_key_env": "YOUR_API_KEY",
    "auth_header": "Authorization",
    "auth_prefix": "Bearer ",
    "endpoint": "/companies/{edinet_code}/financials",
    "params": {},
    "response_path": "data",
    "fiscal_year_key": "fiscal_year",
    "rate_limit_ms": 100
  }
  ```

- Run the benchmark:

  ```bash
  python compare.py
  ```

- Open a PR with your results
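To make the config fields concrete, here is a sketch of how a `providers.json` entry could drive a single request. The helper name `fetch_financials` is hypothetical and this is not the actual `compare.py` internals; it only illustrates how the documented fields fit together.

```python
import os

def fetch_financials(provider, edinet_code):
    """Build and issue one request from a providers.json entry.

    Hypothetical helper: compare.py's real internals may differ, but the
    config fields are used as documented above.
    """
    api_key = os.environ.get(provider["api_key_env"])
    if api_key is None:
        return None  # providers without an API key are skipped

    import httpx  # installed in the setup step above

    url = provider["base_url"] + provider["endpoint"].format(edinet_code=edinet_code)
    headers = {provider["auth_header"]: provider["auth_prefix"] + api_key}
    resp = httpx.get(url, headers=headers, params=provider["params"])
    resp.raise_for_status()

    payload = resp.json()
    for key in provider["response_path"].split("."):  # e.g. "data"
        payload = payload[key]
    return payload
```

Rate limiting (`rate_limit_ms`) would be applied between successive calls and is omitted here for brevity.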
Requirements: Your API must accept an EDINET code and return financial data with fiscal_year, revenue, net_income, operating_income, total_assets, total_equity, and eps fields.
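A response of roughly the following shape would satisfy these requirements, assuming `response_path` is set to `"data"` as in the example config. All figures below are made up for illustration:

```json
{
  "data": [
    {
      "fiscal_year": 2024,
      "revenue": 450000000000,
      "net_income": 32000000000,
      "operating_income": 41000000000,
      "total_assets": 980000000000,
      "total_equity": 510000000000,
      "eps": 142.51
    }
  ]
}
```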
Ground truth: 285 data points across 50 companies, verified against EDINET XBRL annual filings (有価証券報告書, doc_type=120). Every expected value can be independently checked against the original filing on EDINET.
Sample: 50 companies selected via stratified sampling across 19 sectors and 3 accounting standards (IFRS, JP-GAAP, US-GAAP). Deterministic seed: 2026-03-03. Universe snapshot committed as universe.json. See select_sample.py.
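The sampling procedure can be sketched as follows. This is illustrative only (see `select_sample.py` for the real implementation), and the field names `sector` and `accounting_standard` are assumptions about the `universe.json` schema:

```python
import random

def select_sample(universe, n=50, seed="2026-03-03"):
    """Deterministic stratified sampling sketch (not the real select_sample.py).

    Groups companies into (sector, accounting standard) strata, then draws
    proportionally from each stratum with a fixed string seed.
    """
    rng = random.Random(seed)  # string seed -> fully reproducible draws
    strata = {}
    for company in universe:
        key = (company["sector"], company["accounting_standard"])
        strata.setdefault(key, []).append(company)

    sample = []
    for key in sorted(strata):  # sorted for order-independence of the input
        members = strata[key]
        k = max(1, round(n * len(members) / len(universe)))  # at least one per stratum
        sample.extend(rng.sample(members, min(k, len(members))))
    return sample[:n]
```

Because the seed is a fixed string and strata are iterated in sorted order, rerunning against the committed `universe.json` snapshot reproduces the same 50 companies.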
Matching rules (see matching_rules.json):
- Monetary fields (JPY): exact match below ¥1B, ±0.01% relative tolerance above
- Per-share fields: ±¥0.01 absolute tolerance, auto sen-to-yen conversion
- Null expected values excluded from scoring
- Null/missing provider values count as mismatch
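The rules above can be sketched in a few lines. This is an illustrative reading of `matching_rules.json`, not the authoritative logic in `compare.py`; the function name `values_match` is hypothetical:

```python
def values_match(field, expected, actual):
    """Sketch of the matching rules: returns True/False, or None if excluded."""
    if expected is None:
        return None  # null expected values are excluded from scoring
    if actual is None:
        return False  # null/missing provider values count as a mismatch

    if field == "eps":
        # Per-share: try automatic sen-to-yen conversion (1 yen = 100 sen),
        # then apply a +/- 0.01 yen absolute tolerance
        if abs(actual / 100 - expected) <= 0.01:
            actual = actual / 100
        return abs(actual - expected) <= 0.01

    # Monetary fields (JPY): exact match below 1 billion yen,
    # +/- 0.01% relative tolerance at or above
    if abs(expected) < 1_000_000_000:
        return actual == expected
    return abs(actual - expected) <= abs(expected) * 1e-4
```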
compare.py # Comparison engine (uses providers.json)
providers.json # Provider configurations (extensible)
golden.json # Ground truth dataset (285 data points)
matching_rules.json # Matching tolerances
universe.json # Company universe snapshot for reproducible sampling
select_sample.py # Sample selection script
company_lookup.json # Company metadata + EDINET doc IDs
evidence.json # Partial source traces for 10 companies
results/
scoreboard.json # Full results
scoreboard.csv # CSV export
summary.json # Aggregate scores, CIs, binomial bounds, sensitivity
response_hashes.json # SHA-256 of raw API responses per provider
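The response hashes allow anyone to verify that published scores were computed from the recorded raw responses. A minimal sketch of the fingerprinting step, assuming one raw response body per EDINET code (the exact keying in `response_hashes.json` may differ):

```python
import hashlib

def hash_responses(raw_responses):
    """Fingerprint raw API response bytes with SHA-256, keyed by EDINET code."""
    return {code: hashlib.sha256(body).hexdigest()
            for code, body in raw_responses.items()}
```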
- 50 companies out of ~4,000 on EDINET
- Annual filings only (doc_type=120)
- 6 financial metrics (no cash flow, dividends, or segments)
- `total_equity` not served by all providers; see results with and without it
- Point-in-time: providers may fix issues after publication
- Golden truth constructed by the developer of one evaluated provider (see methodology.md for conflict of interest discussion)
MIT