BioClaw

AI-Powered Bioinformatics Research Assistant on WhatsApp

BioClaw brings the power of computational biology directly into WhatsApp group chats. Researchers can run BLAST searches, render protein structures, generate publication-quality plots, perform sequencing QC, and search the literature — all through natural language messages.

Built on the NanoClaw architecture with bioinformatics tools and skills from the STELLA project, powered by the Claude Agent SDK.

Join WeChat Group

Join our WeChat group to discuss BioClaw and exchange ideas. Scan the QR code to join the BioClaw community:

[Image: WeChat Group QR Code]
Overview

The rapid growth of biomedical data, tools, and literature has created a fragmented research landscape that outpaces human expertise. Researchers frequently need to switch between command-line bioinformatics tools, visualization software, databases, and literature search engines — often across different machines and environments.

BioClaw addresses this by providing a conversational interface to a comprehensive bioinformatics toolkit. By messaging @Bioclaw in a WhatsApp group, researchers can:

  • Sequence Analysis — Run BLAST searches against NCBI databases, align reads with BWA/minimap2, and call variants
  • Quality Control — Generate FastQC reports on sequencing data with automated interpretation
  • Structural Biology — Fetch and render 3D protein structures from PDB with PyMOL
  • Data Visualization — Create volcano plots, heatmaps, and expression figures from CSV data
  • Literature Search — Query PubMed for recent papers with structured summaries
  • Workspace Management — Triage files, recommend analysis steps, and manage shared group workspaces

Results — including images, plots, and structured reports — are delivered directly back to the chat.

Quick Start

Prerequisites

  • macOS / Linux / Windows (Windows requires PowerShell 5.1+)
  • Node.js 20+
  • Docker Desktop
  • Anthropic API key or OpenRouter API key

Installation

One-command setup (recommended for first-time users):

macOS / Linux:

git clone https://github.com/Runchuan-BU/BioClaw.git
cd BioClaw
bash scripts/setup.sh

Windows (PowerShell):

git clone https://github.com/Runchuan-BU/BioClaw.git
cd BioClaw
powershell -ExecutionPolicy Bypass -File scripts\setup.ps1

The setup script will check prerequisites, install dependencies, build the Docker image, and walk you through API key configuration interactively.

Manual setup:

git clone https://github.com/Runchuan-BU/BioClaw.git
cd BioClaw
npm install
cp .env.example .env        # Edit with your API keys (see Model Provider Configuration below)
docker build --no-cache -t bioclaw-agent:latest container/  # If the build fails with "100" errors, uncomment the image source line in the Dockerfile
npm start

Model Provider Configuration

BioClaw now supports two provider paths:

  • Anthropic — default, keeps the original Claude Agent SDK flow
  • OpenRouter / OpenAI-compatible — optional path for OpenRouter and similar /chat/completions providers

Create a .env file in the project root and choose one of the following setups.

Option A — Anthropic (default)

ANTHROPIC_API_KEY=your_anthropic_key

Option B — OpenRouter (Gemini, DeepSeek, Claude, GPT, and more)

MODEL_PROVIDER=openrouter
OPENROUTER_API_KEY=sk-or-v1-your-key
OPENROUTER_BASE_URL=https://openrouter.ai/api/v1
OPENROUTER_MODEL=deepseek/deepseek-chat-v3.1

Popular model IDs: deepseek/deepseek-chat-v3.1, google/gemini-2.5-flash, anthropic/claude-3.5-sonnet. Full list: openrouter.ai/models

Note: Use models that support tool calling (e.g. DeepSeek, Gemini, Claude). Session history is preserved within a container session; after idle timeout, a new container starts with a fresh context.

Generic OpenAI-compatible setup

MODEL_PROVIDER=openai-compatible
OPENAI_COMPATIBLE_API_KEY=your_api_key
OPENAI_COMPATIBLE_BASE_URL=https://your-provider.example/v1
OPENAI_COMPATIBLE_MODEL=your-model-name

After updating .env, restart BioClaw:

npm run dev

When a container starts, docker logs <container-name> will show which provider path is active.
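Before restarting, it can help to confirm that .env is complete for the provider you picked. The following is a minimal sketch, not BioClaw's own loader: it assumes that MODEL_PROVIDER falls back to "anthropic" when unset, and it treats every variable shown in each example above as required, which may be stricter than BioClaw itself. The function names are hypothetical.

```python
from typing import Dict, List, Tuple

# Variables per provider path, copied from the .env examples above.
# Treating all of them as required is an assumption of this sketch.
REQUIRED_KEYS: Dict[str, List[str]] = {
    "anthropic": ["ANTHROPIC_API_KEY"],
    "openrouter": ["OPENROUTER_API_KEY", "OPENROUTER_BASE_URL", "OPENROUTER_MODEL"],
    "openai-compatible": [
        "OPENAI_COMPATIBLE_API_KEY",
        "OPENAI_COMPATIBLE_BASE_URL",
        "OPENAI_COMPATIBLE_MODEL",
    ],
}


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    env: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env


def check_provider(env: Dict[str, str]) -> Tuple[str, List[str]]:
    """Return (provider, missing variables) for the selected provider path."""
    # Assumption: an unset MODEL_PROVIDER means the default Anthropic path.
    provider = env.get("MODEL_PROVIDER", "anthropic")
    if provider not in REQUIRED_KEYS:
        raise ValueError(f"Unknown MODEL_PROVIDER: {provider!r}")
    missing = [key for key in REQUIRED_KEYS[provider] if not env.get(key)]
    return provider, missing
```

Calling check_provider(parse_env(open(".env").read())) returns the provider name and a list of variables that are still empty, so an empty list means the file matches one of the setups above.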

Usage

In any connected chat, simply message:

@Bioclaw <your request>
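As an illustration of the trigger format, the sketch below separates the @Bioclaw mention from the request text. It is not the orchestrator's actual matcher: it assumes the mention must open the message and is case-insensitive, and extract_request is a hypothetical name.

```python
import re
from typing import Optional

# Assumption: the mention opens the message and is matched case-insensitively.
TRIGGER = re.compile(r"^\s*@bioclaw\b[\s,:]*(.*)$", re.IGNORECASE | re.DOTALL)


def extract_request(message: str) -> Optional[str]:
    """Return the text after the @Bioclaw mention, or None if there is no request."""
    match = TRIGGER.match(message)
    if match is None:
        return None
    request = match.group(1).strip()
    return request or None
```

A message such as "@Bioclaw run FastQC on sample_R1.fastq.gz" yields the request text, while ordinary group chatter and a bare mention both yield None.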

Messaging channels

Supported platforms include WhatsApp (default), Feishu (Lark), WeCom, Discord, Slack (Socket Mode), WeChat Personal (experimental), and optional local web (browser) chat. Full setup steps, environment variables, and instructions for disabling channels are in docs/CHANNELS.md (Simplified Chinese: docs/CHANNELS.zh-CN.md).

Lab trace (SSE timeline, workspace tree) is built into the local web UI — no extra config needed. See docs/DASHBOARD.md.

Quick Start via OpenClaw

To install BioClaw through OpenClaw, send OpenClaw this message:

install https://github.com/Runchuan-BU/BioClaw

See the ExampleTask document for 6 ready-to-use demo prompts with expected outputs.

Demo Examples

Below are live demonstrations of BioClaw handling real bioinformatics tasks via WhatsApp.

1. Workspace Triage & Next Steps

Analyze files in a shared workspace and recommend the best next analysis steps.
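One simple way to picture triage is a lookup from file type to a suggested next step. The sketch below is only an illustration under that assumption; the suffix table and the suggestions are hypothetical and do not describe how the BioClaw agent actually decides.

```python
from collections import defaultdict
from pathlib import PurePosixPath
from typing import Dict, Iterable, List, Tuple

# Hypothetical mapping: file suffix -> (category, suggested next step).
SUFFIX_RULES: Dict[str, Tuple[str, str]] = {
    ".fastq": ("raw reads", "run FastQC, then trim with fastp"),
    ".fq": ("raw reads", "run FastQC, then trim with fastp"),
    ".bam": ("alignments", "index and summarize with SAMtools"),
    ".vcf": ("variants", "inspect and filter with BCFtools"),
    ".fasta": ("sequences", "search with BLAST+"),
    ".fa": ("sequences", "search with BLAST+"),
    ".csv": ("tables", "plot with pandas and matplotlib"),
    ".pdb": ("structures", "render with PyMOL"),
}
COMPRESSION_SUFFIXES = {".gz", ".bz2"}


def classify(filename: str) -> Tuple[str, str]:
    """Return (category, suggestion), ignoring compression suffixes such as .gz."""
    suffixes = [s.lower() for s in PurePosixPath(filename).suffixes]
    while suffixes and suffixes[-1] in COMPRESSION_SUFFIXES:
        suffixes.pop()
    key = suffixes[-1] if suffixes else ""
    return SUFFIX_RULES.get(key, ("other", "inspect manually"))


def triage(filenames: Iterable[str]) -> Dict[str, List[str]]:
    """Group file names by category, sorted within each group."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name in sorted(filenames):
        category, _ = classify(name)
        groups[category].append(name)
    return dict(groups)
```

For a workspace holding sample_R1.fastq.gz and counts.csv, triage groups them under "raw reads" and "tables", and classify supplies the suggested step for each.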


2. FastQC Quality Control

Run FastQC on paired-end FASTQ files and deliver the QC report with key findings.
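FastQC writes a tab-separated summary.txt in each report, with one line per module in the form status, module name, file name. The sketch below pulls the failed and warned modules out of that text as a starting point for the "key findings"; the function names are hypothetical and this is not BioClaw's interpretation logic.

```python
from typing import Dict, List


def parse_fastqc_summary(text: str) -> Dict[str, List[str]]:
    """Group FastQC module names by status (PASS, WARN, FAIL)."""
    findings: Dict[str, List[str]] = {"PASS": [], "WARN": [], "FAIL": []}
    for line in text.splitlines():
        parts = line.split("\t")
        # Skip anything that is not a "STATUS<TAB>module<TAB>file" line.
        if len(parts) < 2 or parts[0] not in findings:
            continue
        findings[parts[0]].append(parts[1])
    return findings


def key_findings(text: str) -> List[str]:
    """List failed modules first, then warnings, as short labelled strings."""
    findings = parse_fastqc_summary(text)
    return [f"FAIL: {m}" for m in findings["FAIL"]] + [
        f"WARN: {m}" for m in findings["WARN"]
    ]
```

Passing the contents of summary.txt for each read file gives a short list that can be compared between the R1 and R2 reports.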


3. BLAST Sequence Search

BLAST a protein sequence against the NCBI nr database and return structured top hits.
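To show what "structured top hits" can look like, the sketch below parses BLAST+ tabular output (the default columns of -outfmt 6) and ranks hits by E-value. It assumes the search was run with that output format; BlastHit, parse_blast_tabular, and top_hits are hypothetical names, not part of BioClaw.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BlastHit:
    query: str
    subject: str
    percent_identity: float
    alignment_length: int
    evalue: float
    bit_score: float


def parse_blast_tabular(text: str) -> List[BlastHit]:
    """Parse default -outfmt 6 rows (12 tab-separated columns per hit)."""
    hits: List[BlastHit] = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # not a default-format hit line
        # Columns: qseqid sseqid pident length mismatch gapopen
        #          qstart qend sstart send evalue bitscore
        hits.append(
            BlastHit(
                query=fields[0],
                subject=fields[1],
                percent_identity=float(fields[2]),
                alignment_length=int(fields[3]),
                evalue=float(fields[10]),
                bit_score=float(fields[11]),
            )
        )
    return hits


def top_hits(hits: List[BlastHit], n: int = 5) -> List[BlastHit]:
    """Return the n best hits: lowest E-value first, ties broken by bit score."""
    return sorted(hits, key=lambda h: (h.evalue, -h.bit_score))[:n]
```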


4. Volcano Plot Generation

Create a differential expression volcano plot from a CSV file and interpret the results.
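The core of a volcano plot is a transform of each gene to (log2 fold change, -log10 adjusted p-value) plus a significance label. The sketch below shows that step only, using the standard library; it assumes DESeq2-style column names (gene, log2FoldChange, padj) and thresholds of |log2FC| >= 1 and padj < 0.05, both of which are illustrative choices rather than BioClaw defaults.

```python
import csv
import io
import math
from typing import Dict, List, Union

Point = Dict[str, Union[str, float]]


def classify_genes(
    csv_text: str, lfc_cutoff: float = 1.0, padj_cutoff: float = 0.05
) -> List[Point]:
    """Compute volcano coordinates and an up/down/ns label for each gene."""
    points: List[Point] = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        lfc = float(row["log2FoldChange"])
        padj = float(row["padj"])
        if padj < padj_cutoff and lfc >= lfc_cutoff:
            status = "up"
        elif padj < padj_cutoff and lfc <= -lfc_cutoff:
            status = "down"
        else:
            status = "ns"  # not significant
        # Clamp padj so that a reported value of 0 does not break log10.
        y = -math.log10(max(padj, 1e-300))
        points.append({"gene": row["gene"], "x": lfc, "y": y, "status": status})
    return points
```

The resulting points can be passed to any plotting library, for example as a scatter plot of x against y colored by status.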


5. Protein Structure Rendering

Fetch a PDB structure, render it in rainbow coloring with PyMOL, and send the image.


6. PubMed Literature Search

Search PubMed for recent high-impact papers and provide structured summaries.
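A PubMed query of this kind can be expressed as an NCBI E-utilities ESearch request. The sketch below only builds the URL and makes no network call; it is not BioClaw's search skill, and build_pubmed_search_url is a hypothetical helper. ESearch can sort by date or relevance but has no notion of "high impact", so that filtering would have to happen in a later step.

```python
from typing import Optional
from urllib.parse import urlencode

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_search_url(
    query: str, max_results: int = 10, since_year: Optional[int] = None
) -> str:
    """Build an ESearch URL that returns PubMed IDs as JSON, newest first."""
    params = {
        "db": "pubmed",
        "term": query,
        "retmax": max_results,
        "retmode": "json",
        "sort": "pub_date",
    }
    if since_year is not None:
        # Restrict by publication date; mindate and maxdate must be given together.
        params.update({"datetype": "pdat", "mindate": since_year, "maxdate": 3000})
    return f"{ESEARCH_URL}?{urlencode(params)}"
```

Fetching the URL returns a list of PubMed IDs, which can then be passed to ESummary or EFetch to retrieve titles and abstracts for summarization.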


7. Hydrogen Bond Analysis

Visualize hydrogen bonds between a ligand and protein in PDB 1M17.


8. Binding Site Visualization

Show residues within 5 Å of ligand AQ4 in PDB 1M17.
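The selection behind this demo is a distance filter: a residue belongs to the binding site if any of its atoms lies within the cutoff of any ligand atom. The sketch below shows that geometry in plain Python on already-parsed atoms; the Atom record and residues_near_ligand are hypothetical, and in BioClaw the selection is done by PyMOL rather than by code like this.

```python
import math
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Atom:
    chain: str
    resname: str
    resseq: int
    x: float
    y: float
    z: float
    is_hetero: bool  # True for HETATM records (ligands, waters)


def residues_near_ligand(
    atoms: Iterable[Atom], ligand_resname: str = "AQ4", cutoff: float = 5.0
) -> List[Tuple[str, int, str]]:
    """Return sorted (chain, residue number, residue name) within cutoff angstroms."""
    atoms = list(atoms)
    ligand = [a for a in atoms if a.is_hetero and a.resname == ligand_resname]
    nearby = set()
    for atom in atoms:
        if atom.is_hetero:
            continue  # only report polymer residues, not the ligand or waters
        for lig in ligand:
            distance = math.dist((atom.x, atom.y, atom.z), (lig.x, lig.y, lig.z))
            if distance <= cutoff:
                nearby.add((atom.chain, atom.resseq, atom.resname))
                break
    return sorted(nearby)
```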


System Architecture

BioClaw is built on the NanoClaw container-based agent architecture, extended with biomedical tools and domain knowledge from the STELLA framework.

WhatsApp ──► Node.js Orchestrator ──► SQLite (state) ──► Docker Container
                                                              │
                                                     Claude Agent SDK
                                                              │
                                                   ┌──────────┴──────────┐
                                                   │   Bioinformatics    │
                                                   │      Toolbox        │
                                                   ├─────────────────────┤
                                                   │ BLAST+  │ SAMtools  │
                                                   │ BWA     │ BEDTools  │
                                                   │ FastQC  │ PyMOL     │
                                                   │ minimap2│ seqtk     │
                                                   ├─────────────────────┤
                                                   │   Python Libraries  │
                                                   ├─────────────────────┤
                                                   │ BioPython │ pandas  │
                                                   │ RDKit     │ scanpy  │
                                                   │ PyDESeq2  │ pysam   │
                                                   │ matplotlib│ seaborn │
                                                   └─────────────────────┘

Key design principles (inherited from NanoClaw):

| Component | Description |
| --- | --- |
| Container Isolation | Each conversation group runs in its own Docker container with pre-installed bioinformatics tools |
| Filesystem IPC | Text and image results are communicated between the agent and orchestrator via the filesystem |
| Per-Group State | SQLite database tracks messages, sessions, and group-specific workspaces |
| Channel Agnostic | Channels self-register at startup; the orchestrator connects whichever ones have credentials |
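To make the filesystem IPC idea concrete, the sketch below has a producer write each result as a JSON file and a consumer collect and delete them. The directory layout, file naming, and payload fields are assumptions for illustration and are not taken from the NanoClaw or BioClaw source. Writing to a temporary file and renaming it keeps the consumer from reading a half-written result.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import List


def write_result(outbox: Path, name: str, payload: dict) -> Path:
    """Agent side: write a result atomically as <outbox>/<name>.json."""
    outbox.mkdir(parents=True, exist_ok=True)
    final_path = outbox / f"{name}.json"
    # Write to a temp file in the same directory, then rename into place.
    fd, tmp_name = tempfile.mkstemp(dir=outbox, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(payload, handle)
    os.replace(tmp_name, final_path)
    return final_path


def collect_results(outbox: Path) -> List[dict]:
    """Orchestrator side: read pending results in name order, then remove them."""
    results: List[dict] = []
    for path in sorted(outbox.glob("*.json")):
        results.append(json.loads(path.read_text(encoding="utf-8")))
        path.unlink()
    return results
```

An image result would carry a path to the rendered file rather than the image bytes, for example {"type": "image", "path": "plot.png"}, and the consumer would forward that file to the chat.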

Biomedical capabilities (attributed to STELLA):

The bioinformatics tool suite and domain-specific skills — including sequence analysis, structural biology, literature mining, and data visualization — draw from the tool ecosystem developed in the STELLA project, a self-evolving multi-agent framework for biomedical research.

Included Tools

Command-Line Bioinformatics

| Tool | Purpose |
| --- | --- |
| BLAST+ | Sequence similarity search against NCBI databases |
| SAMtools | Manipulate alignments in SAM/BAM format |
| BEDTools | Genome arithmetic and interval manipulation |
| BWA | Burrows-Wheeler short read aligner |
| minimap2 | Long read and assembly alignment |
| FastQC | Sequencing quality control reports |
| fastp | FASTQ filtering and trimming (QC/preprocessing) |
| MultiQC | Aggregate QC reports into one summary |
| seqtk | FASTA/FASTQ file manipulation |
| seqkit | FASTA/FASTQ toolkit (extended) |
| BCFtools | Variant calling and VCF/BCF manipulation |
| tabix | Index/query compressed VCF/BED (bgzip/tabix) |
| pigz | Parallel gzip compression/decompression |
| SRA Toolkit | Download data from NCBI SRA (prefetch/fasterq-dump) |
| Salmon | RNA-seq transcript quantification |
| kallisto | RNA-seq transcript quantification |
| PyMOL | Molecular visualization and rendering |

Python Libraries

| Library | Purpose |
| --- | --- |
| BioPython | Biological computation (sequences, PDB, BLAST parsing) |
| pandas / NumPy / SciPy | Data manipulation and scientific computing |
| matplotlib / seaborn | Publication-quality plotting |
| scikit-learn | Machine learning for biological data |
| RDKit | Cheminformatics and molecular descriptors |
| PyDESeq2 | Differential expression analysis |
| scanpy | Single-cell RNA-seq analysis |
| pysam | SAM/BAM file access from Python |

Scripts

All utility scripts are in the scripts/ directory:

| Command | Script | Description |
| --- | --- | --- |
| bash scripts/setup.sh | scripts/setup.sh | One-command setup for macOS/Linux |
| powershell scripts\setup.ps1 | scripts/setup.ps1 | One-command setup for Windows |
| npm run web | scripts/start-web.mjs | Start BioClaw with local web UI (chat + lab trace) |
| npm run open:web | scripts/open-local-web.mjs | Open the web UI in default browser |
| npm run stop:web | scripts/stop-bioclaw-web.mjs | Stop the web server process |
| bash scripts/clear-local-web.sh | scripts/clear-local-web.sh | Clear all local-web chat history and trace events |
| npx tsx scripts/test-cli.ts "prompt" | scripts/test-cli.ts | Run a single prompt through the container agent (CLI test) |
| npx tsx scripts/manage-groups.ts list | scripts/manage-groups.ts | Manage WhatsApp group registrations (list / register / remove) |
| python3 scripts/demo.py | scripts/demo.py | TP53 gene analysis demo (runs inside container) |

Project Structure

BioClaw/
├── src/                       # Node orchestrator
│   └── channels/              # WhatsApp, WeCom, Feishu, Discord, Slack, WeChat, local web
├── container/                 # Agent Dockerfile + skills
├── scripts/                   # Utility scripts (setup, web, testing)
├── groups/                    # Per-group workspace & CLAUDE.md
├── docs/
│   ├── CHANNELS.md            # Messaging platform setup (EN)
│   ├── CHANNELS.zh-CN.md      # Messaging platform setup (ZH)
│   ├── DASHBOARD.md           # Lab trace & observability
│   ├── SECURITY.md            # Trust model & container isolation
│   ├── SPEC.md                # Technical specification
│   ├── DEBUG_CHECKLIST.md     # Troubleshooting guide
│   └── images/                # Doc screenshots
├── ExampleTask/               # Demo prompts + screenshots
└── README.md

Citation

BioClaw builds upon the STELLA framework. If you use BioClaw in your research, please cite:

@article{jin2025stella,
  title={STELLA: Towards a Biomedical World Model with Self-Evolving Multimodal Agents},
  author={Jin, Ruofan and Xu, Mingyang and Meng, Fei and Wan, Guancheng and Cai, Qingran and Jiang, Yize and Han, Jin and Chen, Yuanyuan and Lu, Wanqing and Wang, Mengyang and Lan, Zhiqian and Jiang, Yuxuan and Liu, Junhong and Wang, Dongyao and Cong, Le and Zhang, Zaixi},
  journal={bioRxiv},
  year={2025},
  doi={10.1101/2025.07.01.662467}
}

License

This project is licensed under the MIT License. See LICENSE for details.
