This repo implements (and partially prototypes) the graph-based epidemic modeling approach described in the report “Mô phỏng mô hình lan truyền dịch bệnh” (“Simulating an epidemic spread model”; HUST – Applied Mathematics & Informatics, 2020).
Instead of only using compartmental models (SI/SIR/SEIR), the report treats spread as a time-varying influence process over a network (patients + shared locations), and ranks “influential” patients per day using PageRank.
The report models the outbreak as a graph:
- Patient nodes: each case is a node with attributes such as age group, symptom onset date, and announcement date.
- Edges: represent potential transmission influence (direct relationships).
- Location nodes (optional but important): to connect otherwise fragmented patient-only graphs, location nodes (e.g., wards/communes) are added and connected to patients.
- This encodes indirect links (shared environment) and “unknown source” links among co-located patients.
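The node and edge types above can be sketched with plain Python dictionaries. This is only an illustration: the sample records, the field names, the `LOC:` prefix, and the choice of an undirected adjacency structure are assumptions, not the repo's schema.

```python
from collections import defaultdict

# Hypothetical sample records; field names are illustrative only.
patients = [
    {"id": "BN1", "age_group": "G2", "onset": "2020-03-01", "location": "Ward A"},
    {"id": "BN2", "age_group": "G3", "onset": None, "location": "Ward A"},
    {"id": "BN3", "age_group": "G1", "onset": "2020-03-05", "location": "Ward B"},
]
direct_edges = [("BN1", "BN3")]  # a known direct relationship


def build_graph(patients, direct_edges):
    """Return (adjacency, node_kind) for patients plus their location nodes."""
    adjacency = defaultdict(set)
    node_kind = {}
    for p in patients:
        node_kind[p["id"]] = "patient"
        adjacency[p["id"]]  # touch the key so isolated patients still appear
    for a, b in direct_edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    for p in patients:
        loc = "LOC:" + p["location"]  # assumed naming scheme for location nodes
        node_kind[loc] = "location"
        adjacency[p["id"]].add(loc)
        adjacency[loc].add(p["id"])
    return adjacency, node_kind


adjacency, node_kind = build_graph(patients, direct_edges)
```

In this sample, BN1 and BN2 have no direct edge, yet both are attached to `LOC:Ward A`, which is the kind of indirect link the location nodes are meant to encode.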
The report highlights that onset dates are often missing. It defines
and fits a distribution to observed
The report proposes a daily weighted influence network. Edge weights vary over time, peaking around symptom onset and decaying afterward. The weight combines:
- relationship-type strength,
- age-group mixing,
- intervention/media factors,
- (when using locations) a short surface-survival window (e.g., ~3 days) and an indirect transmission decay factor $\gamma$.
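A rough sketch of how such a daily weight could be composed is shown here. The kernel shape (linear rise to the onset day, exponential decay afterward), the multiplicative combination, and every parameter name and default value are assumptions made for illustration; the report's actual formula is not reproduced here.

```python
import math
from datetime import date


def temporal_kernel(day, onset, rise_days=3.0, decay_rate=0.3):
    """Hypothetical kernel: ramps linearly up to 1.0 on the onset day, then decays."""
    delta = (day - onset).days
    if delta < -rise_days:
        return 0.0
    if delta <= 0:
        return 1.0 + delta / rise_days
    return math.exp(-decay_rate * delta)


def edge_weight(day, onset, relation_strength, age_mixing, intervention,
                indirect=False, gamma=0.5, survival_days=3):
    """Combine the factors multiplicatively (an assumption, not the report's formula)."""
    weight = relation_strength * age_mixing * intervention * temporal_kernel(day, onset)
    if indirect:
        # Location-mediated links: active only inside the assumed survival window,
        # and scaled down by the indirect decay factor gamma.
        if (day - onset).days > survival_days:
            return 0.0
        weight *= gamma
    return weight


onset = date(2020, 3, 10)
peak = edge_weight(onset, onset, 1.0, 1.0, 1.0)
later = edge_weight(date(2020, 3, 12), onset, 1.0, 1.0, 1.0)
```

With all factors set to 1.0, `peak` is the kernel maximum on the onset day and `later` is strictly smaller, matching the "peak then decay" behaviour described above.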
Each day
The code in server/ provides a lightweight prototype pipeline to:
- transform raw case CSV → relationship CSV,
- load the graph into Neo4j,
- export graph JSON for visualization (Sigma.js),
- compute PageRank on the exported graph (in the notebook).
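For readers unfamiliar with the PageRank step in the list above, a weighted power iteration can be sketched in pure Python. This is a stand-in for illustration, not the igraph call used in the notebook; the damping factor of 0.85, the tolerance, and the sample edges are assumptions.

```python
def pagerank(edges, damping=0.85, tol=1e-10, max_iter=200):
    """Weighted PageRank. edges: list of (source, target, weight) tuples.

    Returns {node: score}; the scores sum to 1.
    """
    nodes = sorted({n for s, t, _ in edges for n in (s, t)})
    out_weight = {v: 0.0 for v in nodes}
    for s, _, w in edges:
        out_weight[s] += w
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Rank held by nodes without outgoing edges is spread evenly over all nodes.
        dangling = sum(rank[v] for v in nodes if out_weight[v] == 0.0)
        new = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for s, t, w in edges:
            if out_weight[s] > 0.0:
                new[t] += damping * rank[s] * w / out_weight[s]
        if sum(abs(new[v] - rank[v]) for v in nodes) < tol:
            return new
        rank = new
    return rank


# Hypothetical weighted edges for a single day.
edges = [("BN1", "BN2", 0.9), ("BN1", "BN3", 0.4), ("BN2", "BN3", 0.42)]
scores = pagerank(edges)
```

Note that PageRank rewards nodes that receive weight. If the goal is to rank likely spreaders rather than likely recipients, one option is to reverse the edge direction before ranking; which convention the report uses is not stated here.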
- server/: data prep + Neo4j import + analysis/export tooling
- visualization/: Sigma.js-based viewer reading visualization/data.json
Key files in server/:
- data.csv: raw case table (Vietnamese column names)
- relation_new.csv: processed edge list (also includes a relationship id column in this repo)
- Generate data.ipynb: builds an edge list from data.csv
- Update DB.ipynb: loads relation_new.csv into Neo4j (nodes + typed edges)
- Graph.ipynb: pulls Neo4j → igraph, computes PageRank, exports JSON
- pass_igraph.py: quick Neo4j → igraph → JSON export (clusters + colors)
- test.py: minimal Neo4j driver wrapper used by pass_igraph.py
The easiest way to run Neo4j locally is using Docker Compose:
docker-compose up -d

This will start Neo4j 5.15.0 with:
- Browser UI: http://localhost:7474
- Bolt Protocol: bolt://localhost:7687
- Credentials: username=neo4j, password=123
Check if Neo4j is ready:
docker-compose logs neo4j | grep "Started"

To stop Neo4j:
docker-compose down

Alternative: If you have Neo4j installed locally or running on a remote server, update the .env file with your connection details:
NEO4J_URI=bolt://your-host:7687
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=your-password

Install required Python packages:
pip install pandas numpy neo4j python-dotenv python-igraph

Security note: The default credentials are neo4j / 123. Update .env before using in production.
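Before running the pipeline, it can help to confirm that the connection settings resolve as expected. The sketch below is an assumption about how the three .env variables might be consumed, not the repo's own code: `load_neo4j_config` and `check_connection` are hypothetical helper names, and the fallback values simply mirror the Docker Compose defaults listed above.

```python
import os


def load_neo4j_config(env=None):
    """Read connection settings, falling back to the local Docker defaults."""
    env = os.environ if env is None else env
    return {
        "uri": env.get("NEO4J_URI", "bolt://localhost:7687"),
        "username": env.get("NEO4J_USERNAME", "neo4j"),
        "password": env.get("NEO4J_PASSWORD", "123"),
    }


def check_connection(config):
    """Raise if the server is unreachable or the credentials are rejected."""
    # Imported lazily so the config helper works without the driver installed.
    from neo4j import GraphDatabase

    driver = GraphDatabase.driver(
        config["uri"], auth=(config["username"], config["password"])
    )
    try:
        driver.verify_connectivity()
    finally:
        driver.close()


# Typical use (requires python-dotenv and a running Neo4j instance):
#   from dotenv import load_dotenv
#   load_dotenv()
#   check_connection(load_neo4j_config())
```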
The modern pipeline uses refactored Python modules in the src/ directory.
Run the complete pipeline with a single command:
python src/main.py

This will:
- Process raw data (data.csv → relation_new.csv)
- Migrate to Neo4j (create nodes and relationships)
- Calculate PageRank and export visualization JSON
Run specific pipeline steps:
# Process data only
python src/main.py --step process
# Migrate to Neo4j only
python src/main.py --step migrate
# Calculate metrics only
python src/main.py --step analyze

Additional options:
# Force re-import even if data exists
python src/main.py --force
# Skip if files already exist
python src/main.py --skip-existing
# Custom output path
python src/main.py --output-json custom/path/output.json
# View all options
python src/main.py --help

Alternatively, run each module individually:
- Process raw data
python src/components/data_processing.py

This converts server/data/data.csv → server/data/relation_new.csv with processed relationships.
- Migrate to Neo4j
python src/components/migrate_graph_db.py

This loads nodes (patient cases) and relationships (transmission links) into Neo4j.
- Calculate metrics and export visualization
python src/models/calculator.py

This:
- Fetches graph data from Neo4j
- Computes PageRank influence metrics
- Identifies outbreak clusters
- Exports visualization/data.json for Sigma.js
- View the visualization
Open visualization/index.html in a browser to see the interactive epidemic network with:
- Nodes sized by PageRank (influence)
- Color-coded clusters
- Interactive exploration (drag, zoom, click)
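To make the sizing and coloring concrete, here is a small sketch of how PageRank scores and clusters could be turned into node attributes. It assumes that a cluster is a connected component and that sizes are scaled linearly between a minimum and maximum; the palette, the size range, and the function names are hypothetical and may differ from what src/models/calculator.py does.

```python
from collections import defaultdict, deque

PALETTE = ["#34c0eb", "#eb5e34", "#5eeb34", "#c034eb"]  # hypothetical colors


def connected_components(nodes, edges):
    """Map each node to a cluster index using breadth-first search."""
    neighbors = defaultdict(set)
    for s, t in edges:
        neighbors[s].add(t)
        neighbors[t].add(s)
    cluster_of = {}
    cluster_id = 0
    for start in nodes:
        if start in cluster_of:
            continue
        cluster_of[start] = cluster_id
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for u in neighbors[v]:
                if u not in cluster_of:
                    cluster_of[u] = cluster_id
                    queue.append(u)
        cluster_id += 1
    return cluster_of


def style_nodes(nodes, edges, scores, min_size=1.0, max_size=10.0):
    """Attach a size (from the score) and a color (from the cluster) to each node."""
    cluster_of = connected_components(nodes, edges)
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0  # avoid division by zero when all scores are equal
    styled = []
    for v in nodes:
        size = min_size + (max_size - min_size) * (scores[v] - lo) / span
        color = PALETTE[cluster_of[v] % len(PALETTE)]
        styled.append({"label": v, "size": size, "_color": color})
    return styled


nodes = ["BN1", "BN2", "BN3", "BN4"]
edges = [("BN1", "BN2"), ("BN3", "BN4")]
scores = {"BN1": 0.4, "BN2": 0.3, "BN3": 0.2, "BN4": 0.1}  # made-up scores
styled = style_nodes(nodes, edges, scores)
```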
The original notebooks in server/ are still available:
- (Optional) regenerate relationship CSV
Open and run server/Generate data.ipynb to produce an edge list.
- Load into Neo4j
Open and run server/Update DB.ipynb. It creates:
- nodes labeled by age group (G1, G2, G3, G4)
- relationship types mapped from the relationship column: 0 → Unknown, 1 → Staff_Patience, 2 → Fellow, 3 → Relatives, 4 → Social
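The code-to-type mapping above can be written as a lookup table. The sketch below is illustrative rather than a copy of the notebook: the fallback to Unknown for unrecognized codes, the `name` node property, and the `$source` / `$target` parameter names are assumptions. Cypher does not accept a relationship type as a query parameter, so the type is placed into the query text from this fixed whitelist.

```python
RELATIONSHIP_TYPES = {
    0: "Unknown",
    1: "Staff_Patience",  # spelled as in the repo
    2: "Fellow",
    3: "Relatives",
    4: "Social",
}
AGE_GROUP_LABELS = ("G1", "G2", "G3", "G4")


def relationship_type(code):
    """Translate a relationship code; unknown or malformed codes map to Unknown."""
    try:
        return RELATIONSHIP_TYPES.get(int(code), "Unknown")
    except (TypeError, ValueError):
        return "Unknown"


def merge_edge_query(code):
    """Build a Cypher statement for one edge (property name `name` is hypothetical)."""
    rel = relationship_type(code)  # always one of the whitelisted names above
    return (
        "MATCH (a {name: $source}), (b {name: $target}) "
        f"MERGE (a)-[:{rel}]->(b)"
    )


query = merge_edge_query("3")
```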
- Export JSON for visualization
Option A (notebook): run server/Graph.ipynb and it writes visualization/data.json.
Option B (script):
cd server
python pass_igraph.py
- View
Open visualization/index.html. It reads visualization/data.json.
Both export paths create a Sigma.js-like JSON:
{
  "nodes": [{
    "id": 0,
    "label": "BN416",
    "x": 0,
    "y": 0,
    "size": 2,
    "_color": "#34c0eb"
  }],
  "edges": [{
    "id": 0,
    "source": 0,
    "target": 1,
    "weight": 0.42,
    "type": "Relatives"
  }]
}
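A quick consistency check on an exported file can catch edges that point at missing nodes before the viewer is opened. The sketch below assumes only the keys shown in the JSON above; `validate_export` is a hypothetical helper, not part of the repo, and the sample graph is made up.

```python
import json


def validate_export(graph):
    """Return a list of human-readable problems found in the exported graph."""
    problems = []
    node_ids = {node["id"] for node in graph["nodes"]}
    if len(node_ids) != len(graph["nodes"]):
        problems.append("duplicate node ids")
    for edge in graph["edges"]:
        for end in ("source", "target"):
            if edge[end] not in node_ids:
                problems.append(f"edge {edge['id']} has unknown {end} {edge[end]}")
    return problems


# Made-up sample with the same shape as visualization/data.json.
sample = json.loads("""
{
  "nodes": [
    {"id": 0, "label": "BN416", "x": 0, "y": 0, "size": 2, "_color": "#34c0eb"},
    {"id": 1, "label": "BN417", "x": 1, "y": 1, "size": 1, "_color": "#34c0eb"}
  ],
  "edges": [
    {"id": 0, "source": 0, "target": 1, "weight": 0.42, "type": "Relatives"}
  ]
}
""")
problems = validate_export(sample)

# To check a real export instead:
#   with open("visualization/data.json", encoding="utf-8") as fh:
#       print(validate_export(json.load(fh)))
```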