Plurality-Institute/bridging-bot

Bridging Bot

Bridging Bot monitors Reddit conversations and generates constructive interventions in polarized discussions. It uses LLMs to identify threads where a bridging comment could help, generates context-aware responses, and optionally posts them to Reddit.

How it works

The system runs as a Prefect pipeline that processes subreddits on a schedule:

  1. Sync — Fetches new posts and comments from Reddit, stores them in PostgreSQL
  2. Enrich — Scores comments for toxicity using Google's Perspective API
  3. Classify — Uses an LLM (via Braintrust) to determine if a post would benefit from intervention
  4. Thread Selection — Identifies the best thread within a post to respond to
  5. Intervention — Generates a bridging comment tailored to the conversation context
  6. Post — Optionally posts the intervention to Reddit via the API

Each step is configurable, and the pipeline can be run end-to-end or step-by-step via the CLI.
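The step sequence above can be sketched as a simple ordered chain. This is illustrative only: the function and step names below are hypothetical stand-ins, not the project's actual API.

```python
# Illustrative sketch of the six-step pipeline. Each step receives the
# post id and the results of earlier steps, mirroring how later stages
# (e.g. intervention) depend on earlier ones (e.g. thread selection).

def run_pipeline(post_id, steps):
    """Run each configured step in order, collecting results by name."""
    results = {}
    for name, step in steps:
        results[name] = step(post_id, results)
    return results

# Stub steps standing in for sync, enrich, classify, select, intervene, post.
steps = [
    ("sync", lambda pid, r: f"synced {pid}"),
    ("enrich", lambda pid, r: "toxicity scored"),
    ("classify", lambda pid, r: "needs intervention"),
    ("select_thread", lambda pid, r: "thread t1"),
    ("intervene", lambda pid, r: "draft comment"),
    ("post", lambda pid, r: "posted"),
]

print(list(run_pipeline("abc123", steps)))
```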

Prerequisites

Docker with Docker Compose. You will also need API credentials for Reddit, Google's Perspective API, and Braintrust (see .env.example).

Getting Started

  1. Create a .env file from the example:

    cp .env.example .env

    Fill in your credentials. See the comments in .env.example for details on each variable.

  2. Build and start the services:

    docker-compose up --build

    This starts PostgreSQL, a local Prefect server, a Prefect worker, and the main application container.

  3. Set up the database:

    docker-compose run --rm bridging_bot python -m src.db.setup

    docker-compose run --rm bridging_bot python -m src.db.migrations.apply_migrations

  4. Set up Braintrust prompts — The pipeline requires prompts configured in Braintrust for classification, thread classification, thread selection, and intervention generation. Set the prompt slugs and versions in your .env file.

  5. Access the Prefect UI at localhost:4200 to monitor and trigger pipeline runs.

Running the Pipeline

Via Prefect

The pipeline is deployed as Prefect flows. After starting the services:

  1. Open the Prefect UI at localhost:4200
  2. Go to Deployments
  3. Trigger sync-partner-subreddits with your subreddit list

For production, deploy to Prefect Cloud using prefect.yaml.

Via CLI

Run individual pipeline steps from inside the container:

docker-compose run --rm bridging_bot bash

Core workflow

Run the full pipeline for a single post:

python -m src.cli.reddit pipeline <post_id>

Sync a post from Reddit:

python -m src.cli.reddit sync <post_id>

Sync all recent posts from a subreddit:

python -m src.cli.reddit sync-subreddit <subreddit> [--days-back 3] [--limit 50]

Classify a post for intervention:

python -m src.cli.reddit classify-post <post_id> <run_id>

Classify all posts in a subreddit:

python -m src.cli.reddit classify-subreddit <subreddit> [--limit 50]

Classify all threads in a post:

python -m src.cli.reddit classify-threads <post_id> <run_id>

Select the best thread for intervention:

python -m src.cli.reddit select-thread <post_id> <run_id>

Generate an intervention:

python -m src.cli.reddit intervene <post_id> <comment_id>

Post an intervention to Reddit:

python -m src.cli.reddit post-intervention <intervention_id>

Inspection

Display a conversation tree:

python -m src.cli.reddit display <post_id>

View thread classifications for a post:

python -m src.cli.reddit list-thread-classifications <post_id>

Display a specific thread classification:

python -m src.cli.reddit display-thread-classification <thread_classification_id>

Enrichment and reporting

Enrich comments with Perspective API scores:

python -m src.cli.reddit enrich <run_id>

Generate a subreddit statistics report:

python -m src.cli.reddit subreddit-report <subreddit> [--days-back 7]

Generate a data quality report:

python -m src.cli.reddit data-quality-report <subreddit>

Configuration

Auto-posting

By default, interventions are generated but not posted to Reddit. To enable auto-posting:

AUTO_POST_ENABLED=true
AUTO_POST_SUBREDDITS=subreddit1,subreddit2  # optional allowlist

If AUTO_POST_SUBREDDITS is empty and auto-posting is enabled, interventions will be posted to all monitored subreddits.
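The allowlist logic above can be expressed as a small predicate. This is a hypothetical helper written for illustration, not the project's actual implementation:

```python
# Mirrors the auto-posting rules: post only when enabled, and, if an
# allowlist is set, only to subreddits on it. An empty allowlist means
# all monitored subreddits are eligible.

def should_auto_post(subreddit: str, enabled: bool, allowlist: str) -> bool:
    if not enabled:
        return False
    allowed = {s.strip().lower() for s in allowlist.split(",") if s.strip()}
    return not allowed or subreddit.lower() in allowed

print(should_auto_post("changemyview", True, ""))               # empty allowlist: eligible
print(should_auto_post("changemyview", True, "politics,news"))  # not on allowlist
print(should_auto_post("news", False, "news"))                  # auto-posting disabled
```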

Reddit credentials

The system supports multiple Reddit API credential sets for rate limit management:

REDDIT_CLIENT_ID_1=...
REDDIT_CLIENT_SECRET_1=...
REDDIT_USER_AGENT_1=...

REDDIT_CLIENT_ID_2=...
REDDIT_CLIENT_SECRET_2=...
REDDIT_USER_AGENT_2=...

Set REDDIT_CREDENTIAL_MODE=random for random rotation or db_lock for database-locked credential assignment.
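As an illustration of how the numbered credential sets might be collected and rotated in random mode, here is a hedged sketch (the loader function is hypothetical; only the env var naming pattern comes from the docs above):

```python
import random

# Collect numbered credential sets (REDDIT_CLIENT_ID_1, _2, ...) from an
# environment mapping, stopping at the first missing index.
def load_credential_sets(env):
    sets, n = [], 1
    while f"REDDIT_CLIENT_ID_{n}" in env:
        sets.append({
            "client_id": env[f"REDDIT_CLIENT_ID_{n}"],
            "client_secret": env.get(f"REDDIT_CLIENT_SECRET_{n}", ""),
            "user_agent": env.get(f"REDDIT_USER_AGENT_{n}", ""),
        })
        n += 1
    return sets

env = {
    "REDDIT_CLIENT_ID_1": "id1", "REDDIT_CLIENT_SECRET_1": "s1", "REDDIT_USER_AGENT_1": "ua1",
    "REDDIT_CLIENT_ID_2": "id2", "REDDIT_CLIENT_SECRET_2": "s2", "REDDIT_USER_AGENT_2": "ua2",
}
creds = load_credential_sets(env)
chosen = random.choice(creds)  # "random" mode: pick a set per run
print(len(creds), chosen["client_id"] in {"id1", "id2"})
```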

Braintrust prompts

The pipeline uses Braintrust for prompt management. Each stage requires a configured prompt:

  • BRAINTRUST_CLASSIFICATION_PROMPT_SLUG: Post-level classification
  • BRAINTRUST_THREAD_CLASSIFICATION_PROMPT_SLUG: Thread-level classification
  • BRAINTRUST_THREAD_SELECTION_PROMPT_SLUG: Thread selection
  • BRAINTRUST_INTERVENTION_PROMPT_SLUG: Intervention generation

Optionally pin prompt versions with *_PROMPT_VERSION env vars.
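A .env fragment might look like the following. The slug and version values are placeholders, and the version variable names are assumed to follow the *_PROMPT_VERSION pattern described above:

```shell
# Hypothetical example values; replace with your Braintrust prompt slugs.
BRAINTRUST_CLASSIFICATION_PROMPT_SLUG=post-classification
BRAINTRUST_THREAD_CLASSIFICATION_PROMPT_SLUG=thread-classification
BRAINTRUST_THREAD_SELECTION_PROMPT_SLUG=thread-selection
BRAINTRUST_INTERVENTION_PROMPT_SLUG=intervention-generation

# Optional: pin a specific prompt version instead of always using the latest.
BRAINTRUST_CLASSIFICATION_PROMPT_VERSION=a1b2c3d4
```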

Dataset Creation

Create datasets for evaluation and training via the CLI:

# Thread classification dataset from a post
python -m src.cli.braintrust create-thread-classification-dataset <post_id> [--upload]

# Classification dataset from a subreddit
python -m src.cli.braintrust create-classification-dataset <subreddit> <dataset_name> [--upload]

# Unlabeled dataset from random posts
python -m src.cli.braintrust create-unlabeled-dataset <dataset_name> [--subreddit <name>] [--count 50] [--upload]

Labeling Webapp

A FastAPI webapp for viewing and labeling thread classifications and interventions.

docker-compose up labeling-webapp

Access at localhost:8000. Features:

  • View thread classifications with full comment context
  • See bot-generated interventions inline
  • Label threads for dataset creation (requires login)
  • Public /intervention/{id} route for sharing intervention previews

Development

Testing

docker-compose run --rm bridging_bot python -m pytest tests/ -x -v

Tests use a separate bridging_bot_test database that is automatically set up by docker-compose.

Code style

Ruff is configured for formatting and import sorting via pre-commit:

pre-commit install

Adding dependencies

Dependencies are managed with Poetry. Add new packages to pyproject.toml, then run poetry lock to update the lock file.

Migrations

Database migrations live in src/db/migrations/ as numbered SQL files. To create a new migration:

  1. Add a SQL file in src/db/migrations/ (e.g. 025_your_change.sql)
  2. Register it in src/db/migrations/apply_migrations.py
  3. Register it in tests/conftest.py
  4. Apply with: docker-compose run --rm bridging_bot python -m src.db.migrations.apply_migrations
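A new migration file from step 1 might look like this. The table and column names are invented for illustration; only the numbered-filename convention comes from the docs above:

```sql
-- 025_your_change.sql (hypothetical example migration)
ALTER TABLE interventions
    ADD COLUMN reviewed_at TIMESTAMPTZ;
```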

License

Apache License 2.0
