An autonomous, AI-driven version of Statistics Canada's "The Daily" — generating bilingual statistical bulletins from CANSIM data.
The D-AI-LY runs daily at 8am, automatically:
- Discovering newsworthy CANSIM table updates
- Fetching real data from Statistics Canada
- Generating bilingual articles (EN + FR) following The Daily's voice
- Publishing to a static website
┌─────────────────────────────────────────────────────────────┐
│                 SCHEDULED TRIGGER (launchd)                 │
│                      Runs daily at 8am                      │
└─────────────────────┬───────────────────────────────────────┘
                      │
          ┌───────────▼────────────┐
          │       AI LAYER 1       │ ← discover_topics.R
          │    Topic Selection     │   Score by recency, sector
          └───────────┬────────────┘
                      │
          ┌───────────▼────────────┐
          │   DETERMINISTIC CORE   │ ← fetch_table.R
          │   R + cansim package   │   Config-driven extraction
          └───────────┬────────────┘
                      │
          ┌───────────▼────────────┐
          │       AI LAYER 2       │ ← Claude Code
          │   Article Generation   │   /the-daily-generator skill
          └───────────┬────────────┘
                      │
          ┌───────────▼────────────┐
          │   DETERMINISTIC CORE   │ ← Observable Framework
          │    Build + Publish     │   npm run build
          └────────────────────────┘
- R with packages: cansim, dplyr, tidyr, jsonlite
- Node.js 20+
- Claude Code CLI (npm install -g @anthropic-ai/claude-code)
# Clone and install
git clone https://github.com/mountainmath/the-daily.git
cd the-daily
npm install
# Install R packages
Rscript -e 'install.packages(c("cansim", "dplyr", "tidyr", "jsonlite"))'

# Full pipeline (discovery → fetch → generate → build)
./automation/run_pipeline.sh
# Specific table
./automation/run_pipeline.sh --table=18-10-0004
# Prep only (no article generation)
./automation/run_pipeline.sh --prep-only

# Install launchd agent (runs at 8am daily)
./automation/install.sh
# Check status
./automation/install.sh --status
# Remove automation
./automation/install.sh --remove

the-daily/
├── automation/
│ ├── run_pipeline.sh # Daily orchestrator
│ ├── install.sh # Automation installer
│ └── com.the-daily.pipeline.plist
│
├── r-tools/
│ ├── discover_topics.R # Topic discovery & ranking
│ ├── fetch_table.R # CANSIM data fetcher
│ └── table_configs.json # Table extraction configs (25 tables)
│
├── docs/ # Observable Framework site
│ ├── en/ # English articles
│ ├── fr/ # French articles
│ └── style.css # StatCan-inspired styling
│
├── .claude/skills/
│ ├── the-daily-generator/ # Article generation skill
│ ├── the-daily-discover/ # Topic discovery skill
│ └── the-daily-publish/ # Build & deploy skill
│
├── .github/workflows/
│ └── daily.yml # GitHub Action (fallback)
│
└── output/ # Generated data files
The project uses Claude Code skills for AI-driven tasks:
| Skill | Purpose |
|---|---|
| /the-daily-generator | Generate bilingual articles from CANSIM data |
| /the-daily-discover | Identify newsworthy table updates |
| /the-daily-publish | Build and deploy the site |
The R script discover_topics.R scans CANSIM for recently updated tables and ranks them by:
- Recency (25%) — How recently was data released?
- Diversity (25%) — Avoid covering same sector repeatedly
- Public Interest (50%) — Labour, prices, housing score highest
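
The weighting above can be illustrated with a minimal Python sketch. It is not the code in discover_topics.R: it assumes each component has already been normalized to a value between 0 and 1, and the names `topic_score` and `rank_topics` are hypothetical. Only the three weights come from the list above.

```python
# Weights taken from the list above; everything else is illustrative.
WEIGHTS = {"recency": 0.25, "diversity": 0.25, "public_interest": 0.50}


def topic_score(recency: float, diversity: float, public_interest: float) -> float:
    """Combine three component scores (each assumed to be in [0, 1]) into one score."""
    components = {
        "recency": recency,
        "diversity": diversity,
        "public_interest": public_interest,
    }
    for name, value in components.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return sum(WEIGHTS[name] * value for name, value in components.items())


def rank_topics(candidates: list[dict]) -> list[dict]:
    """Return candidate tables sorted from highest to lowest combined score."""
    return sorted(
        candidates,
        key=lambda c: topic_score(c["recency"], c["diversity"], c["public_interest"]),
        reverse=True,
    )
```

Because Public Interest carries half the weight, a slightly older labour or prices release can outrank a fresher table from a lower-interest sector.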
The fetch_table.R script uses configs from table_configs.json to:
- Fetch data via the cansim R package
- Apply dimension filters (GEO, categories)
- Calculate MoM and YoY changes
- Export analysis-ready JSON
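
As a rough illustration of the filter and change-calculation steps, here is a Python sketch. It does not reproduce fetch_table.R: it assumes rows arrive as dictionaries keyed by dimension name and that the series is monthly and ordered oldest first. The function names and the one-decimal rounding are hypothetical choices.

```python
def apply_filters(rows: list[dict], filters: dict[str, str]) -> list[dict]:
    """Keep rows whose dimension values match every filter exactly."""
    return [r for r in rows if all(r.get(dim) == val for dim, val in filters.items())]


def pct_change(current: float, previous: float) -> float:
    """Percentage change from previous to current, rounded to one decimal."""
    if previous == 0:
        raise ValueError("previous value is zero; percentage change is undefined")
    return round((current - previous) / previous * 100, 1)


def mom_yoy(series: list[float]) -> dict[str, float]:
    """MoM and YoY change for the latest point of a monthly series (oldest first)."""
    if len(series) < 13:
        raise ValueError("need at least 13 monthly values for a YoY comparison")
    latest = series[-1]
    return {
        "mom_pct": pct_change(latest, series[-2]),    # versus the previous month
        "yoy_pct": pct_change(latest, series[-13]),   # versus the same month a year earlier
    }
```

A filter dictionary such as `{"GEO": "Canada"}` would be passed to `apply_filters` before the resulting values are handed to `mom_yoy`.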
Claude Code follows the skill documentation to:
- Write in The Daily's neutral, clinical voice
- Create Observable markdown with embedded charts
- Generate both English and French versions
- Verify data integrity against source JSON
Articles follow strict style guidelines:
- Neutral and clinical — no emotional language ("increased" not "surged")
- Inverted pyramid — most important facts first
- Plain language — accessible to general audiences
- Headlines lead with the key statistic
- Always include MoM and YoY comparisons
- Hedge causation: "amid", "coinciding with" (not "caused by")
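
Two of these rules lend themselves to a mechanical check. The Python sketch below is an illustration only and is not part of the project's skills: the function name `style_issues` is hypothetical, and only "surged" and "caused by" come from the guidelines above. The other listed terms are assumed examples.

```python
import re

# Only "surged" and "caused by" come from the guidelines above; the other
# entries are hypothetical additions for illustration.
EMOTIONAL_TERMS = ["surged", "plunged", "soared"]
CAUSAL_PHRASES = ["caused by", "because of"]


def style_issues(text: str) -> list[str]:
    """Return a list of style problems found in an article draft."""
    issues = []
    for term in EMOTIONAL_TERMS:
        if re.search(rf"\b{re.escape(term)}\b", text, flags=re.IGNORECASE):
            issues.append(f"emotional language: '{term}'")
    for phrase in CAUSAL_PHRASES:
        if re.search(rf"\b{re.escape(phrase)}\b", text, flags=re.IGNORECASE):
            issues.append(f"unhedged causation: '{phrase}'")
    return issues
```

A draft that says prices "increased ... amid higher fuel costs" returns an empty list, while one that says prices "surged, caused by fuel costs" is flagged on both counts.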
- Add an entry to r-tools/table_configs.json:
"18-10-0004": {
"name": "Consumer Price Index",
"headline": "Consumer prices",
"unit": "index",
"filters": {
"GEO": "Canada",
"Products and product groups": "All-items"
}
}
- Test the fetch:
Rscript r-tools/fetch_table.R 18-10-0004 output

Default: 8:00 AM daily. To change, edit automation/com.the-daily.pipeline.plist and reinstall.
# Start dev server
npm run dev
# Build site
npm run build
# Run discovery only
Rscript r-tools/discover_topics.R --configured --json

If local automation fails (Mac offline, Claude Code issues):
- GitHub Action runs at 1pm UTC (8am ET during standard time, 9am ET during daylight time)
- Runs discovery + fetch
- Creates GitHub Issue with instructions
- User runs claude "/the-daily-generator TABLE" when available
MIT License. Data is from Statistics Canada (Crown Copyright).
- Statistics Canada for the CANSIM data
- The cansim R package by Jens von Bergmann
- Observable Framework for the static site
- Anthropic Claude for AI capabilities