This repository reproduces the pipeline from the attached paper 2025.findings-acl.314.pdf: an end-to-end cycle of pretrain → agents → filter R → update X → ranking fine-tuning → metrics, over GNN backbones (MAGI, DMoN) with a local LLM.
- Install PyTorch with the appropriate CUDA/CPU build: see https://pytorch.org/get-started/locally/
- Install dependencies: `pip install -r requirements.txt`
- Export the data directory: `set TORCH_GEOMETRIC_HOME=./data` (Windows PowerShell) or `export TORCH_GEOMETRIC_HOME=./data`
- If necessary, set the LLM endpoint/model in `configs/agents.yaml` (defaults: `http://10.100.10.70:9999/v1`, `openai/gpt-oss-120b`).
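Any OpenAI-compatible server works. As a quick sanity check before a full run, a minimal chat-completions request body can be built like this; the endpoint and model names are the defaults from `configs/agents.yaml`, and the helper itself is illustrative (the repo's agents build their own requests):

```python
import json

# Defaults from configs/agents.yaml; adjust to your local server.
BASE_URL = "http://10.100.10.70:9999/v1"
MODEL = "openai/gpt-oss-120b"

def build_chat_request(prompt: str, temperature: float = 0.0, max_tokens: int = 512) -> dict:
    """Build an OpenAI-compatible /chat/completions request body."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }

body = build_chat_request("Name the main concept of this node.")
print(json.dumps(body))
```

POSTing this body to `{BASE_URL}/chat/completions` should return a completion if the server is reachable.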
- `mark/`: package with data loading, augmentations, metrics, utilities, backbones (MAGI, DMoN), agents, and the ranking loss.
- `configs/`: `data.yaml`, `engine.yaml`, `agents.yaml`, `train.yaml`.
- `scripts/`: `check_data.py`, `pretrain.py`, `run_mark.py`, `eval.py`, `ablations.py`, `sensitivity.py`.
- `experiments/`: logs, checkpoints, caches (contains `.gitkeep`).
python scripts/check_data.py
python scripts/pretrain.py
python scripts/run_mark.py
python scripts/eval.py
- All paths and hyperparameters are taken from `configs/*.yaml`.
- The pretrain checkpoint is copied to `experiments/checkpoints/<dataset>-<backbone>-pretrain.pt`.
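The checkpoint naming above can be reproduced with a small helper (illustrative only; the actual path is governed by `checkpoint_path` in `configs/train.yaml`):

```python
from pathlib import Path

def pretrain_checkpoint_path(dataset: str, backbone: str,
                             root: str = "experiments/checkpoints") -> Path:
    """Build the <dataset>-<backbone>-pretrain.pt checkpoint path."""
    return Path(root) / f"{dataset}-{backbone}-pretrain.pt"

print(pretrain_checkpoint_path("cora", "magi"))
```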
- `configs/data.yaml`: `dataset` (cora|citeseer|pubmed|wikics), `data_dir`, `plm_model` (SentenceTransformer), `batch_size_plm`.
- `configs/engine.yaml`: `backbone` (magi|dmon), `hidden_dim`, `proj_dim`, `num_clusters` (0 = from dataset), `tau_align`, `lambda_clu`, `t_rank` (temperature/weight calibration).
- `configs/agents.yaml`: `llm.base_url`, `llm.model`, `temperature`, `max_tokens`, `top_n_concept`, `k_neighbors`, `batch_nodes`, `concurrency`, `retries`, `retry_backoff`.
- `configs/train.yaml`: `device`, `amp`, `seed`, `lr`, `weight_decay`, `T` (number of agent+FT steps), `T_prime` (agent period), `ft_epochs_per_step`, `log_dir`, `pretrain_epochs`, `checkpoint_path`.
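As a sketch of how the `num_clusters: 0` convention can be resolved (the helper name and label-counting logic are assumptions for illustration, not the repo's actual code):

```python
def resolve_num_clusters(configured: int, dataset_labels: list[int]) -> int:
    """Treat 0 as 'infer the number of clusters from the dataset labels'."""
    if configured > 0:
        return configured
    return len(set(dataset_labels))

# Cora has 7 classes, so configured=0 resolves to 7 given its label list.
print(resolve_num_clusters(0, [0, 1, 2, 3, 4, 5, 6, 0, 1]))
```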
- Data loading (`check_data.py`): downloads the TAG dataset (Cora by default), prints sizes, and checks internet access.
- Pretrain (`pretrain.py`): self-supervised MAGI (NT-Xent + clustering) or DMoN (modularity) with two augmentations; the checkpoint, cluster centers, and disagreement set S are saved.
- MARK cycle (`run_mark.py`), every `T_prime` epochs:
  - Concept induction (LLM, `agents/concept.py`)
  - Synthetic generation for S (LLM)
  - Consistency inference → set R
  - Feature update by averaging with PLM embeddings of the summarized nodes
  - Fine-tuning with ranking calibration: `L_ft = L_eng + L_cal`
  - Time logs (`timing_report.json`), token counts (`costs.json`), per-step metrics.
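The two data-side steps of the cycle (consistency filter → R, then feature update) can be sketched in isolation. Everything below is a simplified illustration under assumed data shapes, not the repo's implementation: R keeps a node only when its repeated LLM inferences agree, and an updated feature vector is the mean of the old features and the PLM embedding of the node's summary.

```python
def consistency_filter(llm_labels: dict[int, list[str]]) -> set[int]:
    """Keep nodes from S whose repeated LLM inferences all agree (set R)."""
    return {node for node, labels in llm_labels.items() if len(set(labels)) == 1}

def update_features(x: dict[int, list[float]],
                    plm_emb: dict[int, list[float]],
                    r: set[int]) -> dict[int, list[float]]:
    """For nodes in R, average old features with the PLM embedding of the summary."""
    out = dict(x)
    for node in r:
        out[node] = [(a + b) / 2.0 for a, b in zip(x[node], plm_emb[node])]
    return out

labels = {0: ["ML", "ML"], 1: ["ML", "DB"]}  # node 1 is inconsistent
r = consistency_filter(labels)               # -> {0}
x = {0: [0.0, 1.0], 1: [1.0, 1.0]}
emb = {0: [1.0, 0.0], 1: [0.0, 0.0]}
print(update_features(x, emb, r))
```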
- Evaluation (`eval.py`): ACC/NMI/ARI/F1 written to `metrics.json` / `metrics.csv`.
- Ablations/sensitivity: `ablations.py` (disabling concept induction/generation/inference) and `sensitivity.py` (grids over top-n and k, plots `sensitivity.png`).
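A parameter grid like the one `sensitivity.py` sweeps can be enumerated with `itertools.product`; the value ranges below are illustrative placeholders, not the script's actual grid:

```python
from itertools import product

# Illustrative grids; the real values live in the script/configs.
top_n_concept = [3, 5, 10]
k_neighbors = [5, 10, 20]

grid = list(product(top_n_concept, k_neighbors))
print(len(grid))  # 9 combinations, each a (top_n, k) pair to run and score
for top_n, k in grid[:2]:
    print(f"run with top_n_concept={top_n}, k_neighbors={k}")
```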
- All agents require a local LLM with an OpenAI-compatible API; concurrency, batching, and retries are managed in `agents.yaml`.
- PLM caches and agent responses are stored in `experiments/<run>/`.
- AMP is enabled when `device=cuda` and `amp=true`.
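The `retries`/`retry_backoff` settings in `agents.yaml` imply a retry loop roughly like the sketch below. The function name, signature, and backoff formula are assumptions for illustration; in the real pipeline `fn` would be an LLM call:

```python
import time

def call_with_retries(fn, retries: int = 3, retry_backoff: float = 2.0):
    """Retry fn() up to `retries` times, sleeping retry_backoff * 2**attempt between tries."""
    for attempt in range(retries):
        try:
            return fn()
        except Exception:
            if attempt == retries - 1:
                raise  # out of attempts: surface the last error
            time.sleep(retry_backoff * (2 ** attempt))
```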
- After `pretrain.py` and one step of `run_mark.py` on Cora, you should see: a non-empty set S, a non-zero R, a decreasing `Lali` (MAGI), and ACC/NMI/ARI/F1 metrics in `metrics_step*.json` and from `scripts/eval.py`.