CiteAgent

Traditional social science research often faces limitations in experimental control and contextual generalizability, with lab studies lacking ecological validity and field studies offering limited manipulation of variables.

To address this, we introduce CiteAgent, an LLM-agent-based platform for simulating citation network dynamics. CiteAgent enables realistic, scalable, and controlled experimentation in academic environments, supporting rigorous hypothesis testing through:

Realistic modeling of citation behaviors;
Precise environmental control for causal analysis;
Scalable, reproducible simulations across diverse research contexts.

CiteAgent is built upon the AgentScope framework. We thank the AgentScope team for providing an excellent, flexible foundation for multi-agent research!

Figure 1: CiteAgent Framework Workflow

🛠️ Setup

Before getting started, configure your OpenAI API key in LLMGraph\llms\default_model_configs.json. Each model entry should have the following format:

{
    "model_type": "openai_chat",
    "config_name": "gpt-3.5-turbo-0125",
    "model_name": "gpt-3.5-turbo-0125",
    "api_key": "sk-.*",
    "generate_args": {
        "max_tokens": 2000,
        "temperature": 0.8
    },
    "client_args": {
        "base_url": ""
    }
}

Next, set up the experiment environment and install the required packages by running: pip install -r requirements.txt
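
Before launching a simulation, it can help to confirm that the model configuration is complete. The following is a minimal sketch, not part of CiteAgent: the helper names `load_model_configs` and `check_model_configs` are hypothetical, and treating the four top-level keys from the example above as required is an assumption.

```python
import json
from pathlib import Path

# Keys copied from the example entry above; treating them as required is an assumption.
REQUIRED_KEYS = {"model_type", "config_name", "model_name", "api_key"}


def load_model_configs(path):
    """Load the JSON file; accept either a single object or a list of objects."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    return data if isinstance(data, list) else [data]


def check_model_configs(entries):
    """Return a list of human-readable problems found in the config entries."""
    problems = []
    for index, entry in enumerate(entries):
        missing = REQUIRED_KEYS - set(entry)
        if missing:
            problems.append(f"entry {index}: missing keys {sorted(missing)}")
        api_key = entry.get("api_key", "")
        # "sk-.*" is the placeholder shown in the example, not a usable key.
        if not api_key or api_key == "sk-.*":
            problems.append(f"entry {index}: api_key is empty or still the placeholder")
    return problems
```

Calling `check_model_configs(load_model_configs("LLMGraph/llms/default_model_configs.json"))` from the repository root returns an empty list when every entry passes these checks.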

📦 Usage

We offer three seed networks enriched with text features for authors and papers: Cora, Citeseer, and LLM_Agent.

To begin constructing a citation graph, specify the task_name and config_name (passed as --task and --config):

task_name: Selects the seed network. Choose from "cora", "citeseer", or "llm_agent_*".
config_name: Controls the academic environment setup in CiteAgent.

Then, execute the following commands:

# Build the citation graph using the Cora dataset
python main.py --task cora --config <template_config_name> --build 

# Build the citation graph using the Citeseer dataset
python main.py --task citeseer --config <template_config_name> --build 

# Build the citation graph using the LLM_Agent dataset
python main.py --task llm_agent_1 --config <template_config_name> --build

Make sure to adjust the task_name according to the seed network you wish to use.
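
To build several seed networks with the same configuration, the commands above can be wrapped in a small driver. This is a sketch under assumptions: `build_command` and `run_builds` are hypothetical helpers, and only the `--task`, `--config`, and `--build` flags shown above are used.

```python
import subprocess
import sys


def build_command(task_name, config_name, python=sys.executable):
    """Assemble the documented CLI call as an argument list."""
    return [python, "main.py", "--task", task_name, "--config", config_name, "--build"]


def run_builds(task_names, config_name, dry_run=True):
    """Build each seed network in turn; with dry_run=True only return the commands."""
    commands = [build_command(task, config_name) for task in task_names]
    if not dry_run:
        for command in commands:
            # check=True stops the loop at the first failing build.
            subprocess.run(command, check=True)
    return commands
```

For example, `run_builds(["cora", "citeseer"], "<template_config_name>")` returns the two argument lists without executing them; pass `dry_run=False` from the repository root to run the builds.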

Template Configuration

To customize the simulation, adjust the configuration file found at LLMGraph\tasks\llm_agent_1\configs\template_*.

We offer support for multiple scholarly search engines, including Generated Papers, Arxiv, and Google Scholar. Change the online_retriever_kwargs field to specify the search engine you wish to use.

🧪 Experiments

For the experiments outlined in the paper, we provide a script for execution.

Download the Datasets:

citation

Arrange the downloaded data as follows:

tasks/
├── citeseer/
│   ├── data/
│   ├── configs/
├── citeseer_1/
├── cora/
├── cora_1/
├── llm_agent/
├── llm_agent_*/
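
A quick check of this layout can catch a misplaced download before a long run. The sketch below is illustrative only: `missing_task_dirs` is a hypothetical helper, and it assumes every task folder has the same `data/` and `configs/` subfolders shown for `citeseer/`.

```python
from pathlib import Path


def missing_task_dirs(tasks_root, task_names, subdirs=("data", "configs")):
    """Return the expected subfolders (relative to tasks_root) that do not exist."""
    root = Path(tasks_root)
    missing = []
    for task in task_names:
        for sub in subdirs:
            candidate = root / task / sub
            if not candidate.is_dir():
                # as_posix() keeps the reported path identical across platforms.
                missing.append(candidate.relative_to(root).as_posix())
    return missing
```

For instance, `missing_task_dirs("tasks", ["citeseer", "cora"])` returns an empty list when both task folders contain `data/` and `configs/`.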

Run Simulation Experiments:

Start the launchers in one terminal:
```
python start.py --start_server
```
Then run the simulation experiments in another terminal:
```
python start.py 
```
Run Evaluation Metrics for Simulation Experiments:
```
python evaluate.py
```
Visualize Experimental Results: Please refer to evaluate/Graph/readme.md for detailed instructions.

✅ Results

CiteAgent simulates key phenomena in citation networks, including power-law degree distributions and citational distortion. To analyze the mechanisms underlying these phenomena, we propose two LLM-based social science research (SSR) paradigms for examining human referencing behavior: LLM-SA (Synthetic Analysis) and LLM-CA (Counterfactual Analysis). Additional simulations and analyses of other phenomena are provided in the paper.

Power Law Distribution

The degree distribution of citation networks often follows a power law [1], reflecting a scale-free characteristic. Citation networks generated by the CiteAgent framework replicate this property, exhibiting scale-free behavior that closely mirrors real-world citation dynamics.

Figure 2: Power Law Distribution
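
One way to check the scale-free property on a generated graph is to estimate the power-law exponent of its degree sequence. The sketch below is not the evaluation code used in the paper: it assumes the graph is available as (citing, cited) pairs, uses in-degree (citation count) as the degree, and applies a commonly used maximum-likelihood approximation for discrete data, alpha ≈ 1 + n / Σ ln(d_i / (d_min − 0.5)). The function names are hypothetical.

```python
import math
from collections import Counter


def in_degrees(edges):
    """Count citations received per paper from (citing, cited) pairs."""
    counts = Counter(cited for _, cited in edges)
    return sorted(counts.values(), reverse=True)


def estimate_power_law_exponent(degrees, d_min=1):
    """Approximate MLE of the exponent for degrees >= d_min (discrete data)."""
    if d_min < 1:
        raise ValueError("d_min must be at least 1")
    tail = [d for d in degrees if d >= d_min]
    if len(tail) < 2:
        raise ValueError("need at least two degrees at or above d_min")
    # The 0.5 shift corrects for treating integer degrees as continuous values.
    log_sum = sum(math.log(d / (d_min - 0.5)) for d in tail)
    return 1.0 + len(tail) / log_sum
```

Papers with no citations do not appear in `in_degrees` and are therefore excluded from the fit. The estimate is sensitive to the choice of `d_min`, so it is worth comparing a few cutoffs before drawing conclusions.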

Citational Distortion

This phenomenon, which captures biases in citation practices [2], is effectively simulated within the CiteAgent framework. Through interactions among LLM-based agents, CiteAgent reproduces this distortion phenomenon.

Figure 3: Citational Distortion

References

[1] Barabási A L, Albert R. Emergence of scaling in random networks[J]. Science, 1999, 286(5439): 509-512.
[2] Gomez C J, Herman A C, Parigi P. Leading countries in global science increasingly receive more citations than other countries doing similar research[J]. Nature Human Behaviour, 2022, 6(7): 919-929.

Citation

@inproceedings{ji-etal-2025-llm,
    title = "{LLM}-Based Multi-Agent Systems are Scalable Graph Generative Models",
    author = "Ji, Jiarui  and
      Lei, Runlin  and
      Bi, Jialing  and
      Wei, Zhewei  and
      Chen, Xu  and
      Lin, Yankai  and
      Pan, Xuchen  and
      Li, Yaliang  and
      Ding, Bolin",
    editor = "Che, Wanxiang  and
      Nabende, Joyce  and
      Shutova, Ekaterina  and
      Pilehvar, Mohammad Taher",
    booktitle = "Findings of the Association for Computational Linguistics: ACL 2025",
    month = jul,
    year = "2025",
    address = "Vienna, Austria",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.findings-acl.78/",
    doi = "10.18653/v1/2025.findings-acl.78",
    pages = "1492--1523",
    ISBN = "979-8-89176-256-5",
}
