fshaan/medical-guidelines-suite

Medical Guidelines Suite v2.5.0

Clinical guidelines knowledge base builder, retrieval engine, and batch patient report generator.

Installation

Option A: Claude Code Ecosystem Skill

npx skills add fshaan/medical-guidelines-suite -g -y

Option B: Manual Installation

# Clone to Claude Code skills directory
git clone https://github.com/fshaan/medical-guidelines-suite \
  ~/.claude/skills/medical-guidelines-suite

# Install Python dependencies
pip install openpyxl pdfplumber
# or, for pdftotext-based PDF extraction: brew install poppler

Option C: Project-Local Installation

# Copy to project's .agent/skills/ directory
cp -r medical-guidelines-suite /path/to/project/.agent/skills/

# Or symlink
ln -s "$(pwd)" /path/to/project/.agent/skills/medical-guidelines-suite

Quick Start

1. Build Knowledge Base

Place guideline PDFs/DOCXs in a directory and run:

# Create knowledge base structure
mkdir -p guidelines/NCCN/extracted
cp NCCN_Gastric_2026.pdf guidelines/NCCN/

# Extract text
python scripts/extract_all.py --force

Then ask Claude: "Build the knowledge base index"
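
Before building the index, it can help to check what the knowledge base directory actually contains. The following is a minimal sketch, not part of the suite: `scan_knowledge_base` is a hypothetical helper, and it assumes each organization folder (such as `NCCN`) holds its source PDFs/DOCXs directly, with extracted text files under `extracted/`, as the `mkdir` and `cp` commands above suggest.

```python
from pathlib import Path


def scan_knowledge_base(kb_root):
    """Map each organization folder to its source documents and extracted files.

    Assumed layout (hypothetical, inferred from the commands above):
        <kb_root>/<ORG>/<guideline>.pdf or .docx
        <kb_root>/<ORG>/extracted/<extracted text files>
    """
    layout = {}
    org_dirs = sorted(p for p in Path(kb_root).iterdir() if p.is_dir())
    for org_dir in org_dirs:
        # Source documents sit directly inside the organization folder.
        sources = sorted(
            f.name
            for f in org_dir.iterdir()
            if f.is_file() and f.suffix.lower() in {".pdf", ".docx"}
        )
        # Extracted text is assumed to live in the "extracted" subfolder.
        extracted_dir = org_dir / "extracted"
        extracted = []
        if extracted_dir.is_dir():
            extracted = sorted(f.name for f in extracted_dir.iterdir() if f.is_file())
        layout[org_dir.name] = {"sources": sources, "extracted": extracted}
    return layout
```

Calling `scan_knowledge_base("./guidelines")` returns a dictionary keyed by organization name. An organization with sources but an empty `extracted` list is one where extraction probably has not run yet.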

2. Single-Patient Query

Ask Claude: "What do the guidelines recommend for first-line treatment of HER2-positive advanced gastric cancer?"

3. Batch Patient Processing

# Parse patient Excel
python scripts/batch_pipeline.py parse --input patients.xlsx --output Output/patients.json

# Orchestrate: auto-scan KB, extract features, generate grep commands + batch prompts
python scripts/batch_pipeline.py orchestrate \
  --patients Output/patients.json --kb-root ./guidelines --batch-size 5

# (Claude executes each batch prompt → Output/batches/rag_batch_*.json)

# Verify execution evidence + merge batch results + validate quality
python scripts/batch_pipeline.py verify-batch --input-dir Output/batches/ --kb-root ./guidelines
python scripts/batch_pipeline.py merge --input-dir Output/batches/ --output Output/rag_results.json
python scripts/batch_pipeline.py validate --input Output/rag_results.json --patients Output/patients.json

# Generate reports
python scripts/batch_pipeline.py generate --input Output/rag_results.json --format md

Or simply ask Claude: "Batch-retrieve guideline recommendations for the patients in patients.xlsx"

The orchestrate command replaces manual splitting: it automatically scans the knowledge base, extracts 9 clinical dimensions from each patient, and generates self-contained batch prompts with pre-built grep commands. It also supports checkpoint recovery, so an interrupted run can be resumed.
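
To make the batching idea concrete, here is a minimal sketch of the three mechanics described above: splitting patients into fixed-size batches, pre-building a grep command from keywords, and skipping batches that already have results. It is an illustration under stated assumptions, not the code in `batch_pipeline.py`. All function names are hypothetical, the grep flags are a guess at a reasonable default, and the `rag_batch_<n>.json` numbering is assumed from the `rag_batch_*.json` pattern shown earlier.

```python
import shlex
from pathlib import Path


def split_into_batches(patients, batch_size=5):
    """Split the patient list into consecutive batches of at most batch_size."""
    return [patients[i:i + batch_size] for i in range(0, len(patients), batch_size)]


def build_grep_command(keywords, kb_root="./guidelines"):
    """Build a recursive, case-insensitive grep over the knowledge base.

    Each keyword becomes its own -e pattern; shlex.quote keeps keywords
    containing spaces safe to paste into a shell.
    """
    parts = ["grep", "-r", "-i", "-n"]
    for keyword in keywords:
        parts += ["-e", keyword]
    parts.append(kb_root)
    return " ".join(shlex.quote(part) for part in parts)


def pending_batches(batches, output_dir):
    """Checkpoint recovery: keep only batches with no result file yet.

    Assumes (hypothetically) that batch n writes rag_batch_<n>.json,
    numbered from 1, into output_dir.
    """
    out = Path(output_dir)
    return [
        (number, batch)
        for number, batch in enumerate(batches, start=1)
        if not (out / f"rag_batch_{number}.json").exists()
    ]
```

With 12 patients and a batch size of 5, `split_into_batches` yields batches of 5, 5, and 2. If `rag_batch_1.json` already exists in the output directory, `pending_batches` returns only batches 2 and 3, which is the essence of resuming an interrupted run.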

Small Model Mode (--profile slim)

For local models (Qwen 27B etc.) that struggle with complex prompts:

python scripts/batch_pipeline.py orchestrate \
  --patients Output/patients.json \
  --output-dir Output/batches \
  --batch-size 5 \
  --profile slim

python scripts/batch_pipeline.py verify-batch --input-dir Output/batches/ --profile slim
python scripts/batch_pipeline.py merge --input-dir Output/batches/ --output Output/rag_results.json
python scripts/batch_pipeline.py validate --input Output/rag_results.json --profile slim

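Natural-language invocation from an agent: in agent interfaces such as Claude Code or OpenClaw, slim mode can be triggered in natural language:

| Phrase | Effect |
|--------|--------|
| "Use slim mode for batch retrieval" | --profile slim |
| "Process the patients in small-model mode" | --profile slim |
| "Run the batch with a local 27B model" | The agent infers slim |
| Not specified | Defaults to --profile full |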

Output Deliverables

| File | Description |
|------|-------------|
| 批量指南推荐报告_YYYYMMDD.md | A single Markdown report covering all patients, with TOC navigation, guideline cards, and an evidence appendix |
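
The `YYYYMMDD` part of the report name is a date stamp. As a small sketch of how such a name can be built, assuming the stamp is the generation date (the source does not say which date is used) and using a hypothetical `report_filename` helper:

```python
from datetime import date


def report_filename(report_date=None):
    """Return the report file name with a YYYYMMDD date stamp.

    Assumption: the stamp is the date the report is generated;
    pass report_date explicitly to override it.
    """
    if report_date is None:
        report_date = date.today()
    return f"批量指南推荐报告_{report_date:%Y%m%d}.md"
```

For example, `report_filename(date(2026, 1, 5))` gives `批量指南推荐报告_20260105.md`.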

File Structure

medical-guidelines-suite/
├── SKILL.md                    # Main skill definition (build + query + batch)
├── skill.json                  # Package metadata
├── CHANGELOG.md                # Version history
├── README.md                   # This file
├── references/
│   ├── pdf_reading.md          # PDF processing guide
│   ├── pdf_extraction.md       # PDF extraction methods
│   ├── docx_reading.md         # DOCX processing guide
│   ├── docx_extraction.md      # DOCX extraction methods
│   ├── index_generation.md     # Index template guide
│   └── input_format.md         # Batch input Excel spec
├── templates/
│   ├── data_structure_root.md  # Root index template
│   └── data_structure_org.md   # Organization index template
├── scripts/
│   ├── extract_pdf.py          # PDF text extraction
│   ├── extract_docx.py         # DOCX text extraction
│   ├── extract_all.py          # Batch extraction
│   └── batch_pipeline.py       # Batch patient pipeline (8 subcommands incl. verify-batch)
├── tests/                      # pytest test suite (148 tests)
├── docs/
│   ├── v2.3-anti-laziness-spec.md  # v2.3 execution evidence spec
│   ├── v2.2-fix-plan.md       # v2.2 design spec
│   ├── v2.2-decisions.md      # Confirmed design decisions (D1-D9)
│   └── architecture.md        # Engineering review report
└── examples/
    └── sample_queries.md       # Example clinical questions

Requirements

  • Python 3.9+
  • openpyxl — Excel input parsing
  • pdftotext (poppler) — PDF text extraction (optional, for build phase)

Acknowledgments

This project was inspired by ConardLi/rag-skill, which demonstrated the hierarchical index + progressive retrieval pattern for local knowledge bases using Claude Code Skills. We adopted and extended its core architectural ideas — data_structure.md layered indexing, grep-based search, and the "learn before process" constraint — into the medical guidelines domain, adding cross-guideline comparison, batch patient processing, and Markdown report generation.

License

This work is licensed under CC BY-NC-SA 4.0.
