
Abuse Vector Reporting Pipeline

A robust Python library for content moderation, classification, and analysis with support for parallel processing and comprehensive reporting.

Features

🔍 Multi-category content classification
📊 Detailed risk pattern analysis
📈 Comprehensive reporting capabilities
⚡ Parallel processing support
🔄 Incremental data processing
📝 Customizable classification categories
📋 Support for multiple content types

The example uses the google/civil_comments dataset, but any dataset with a prompt/text field can be used.

Instructions

The library is developed using Python 3.13.1.

Copy .env.example, rename the copy (conventionally to .env), and fill in your API key. Do not put quotes around the value. This is critical!

OPEN_AI_API_KEY=sk....
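The repository's own key-loading code is not shown here, so the following is only a sketch of why quotes can matter. It assumes a naive `KEY=value` parser; the `parse_env_line` helper and the `sk-example` value are hypothetical. A parser like this keeps any quote characters as part of the value, which would corrupt the key sent to the API.

```python
from typing import Optional, Tuple


def parse_env_line(line: str) -> Optional[Tuple[str, str]]:
    """Parse one KEY=value line naively (hypothetical helper, not the library's loader)."""
    line = line.strip()
    # Skip blank lines, comments, and lines without an assignment.
    if not line or line.startswith("#") or "=" not in line:
        return None
    key, value = line.split("=", 1)
    return key.strip(), value.strip()


unquoted = parse_env_line("OPEN_AI_API_KEY=sk-example")
quoted = parse_env_line('OPEN_AI_API_KEY="sk-example"')

print(unquoted)  # the value is the bare key
print(quoted)    # the value still contains the quote characters
```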

Set up virtual environment using venv

    python3 -m venv env 
    source env/bin/activate

Install dependencies

    pip3 install -r requirements.txt

Generate the report. Note: classifying all the prompts and writing the analysis to a text file takes about 3-4 minutes. When new data is added to the JSON, later runs include it automatically without reclassifying events that already have labels.

    python -m generate_report
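The incremental behavior described above can be pictured as follows. This is a minimal sketch, not the pipeline's actual code: the record layout (`text`, `label`), the `classify_new_only` helper, and the stand-in classifier are all assumptions made for illustration.

```python
from typing import Callable, Dict, List


def classify_new_only(records: List[Dict], classify: Callable[[str], str]) -> int:
    """Label only records that have no 'label' yet; return how many were classified."""
    newly_labeled = 0
    for record in records:
        if record.get("label") is None:
            record["label"] = classify(record["text"])
            newly_labeled += 1
    return newly_labeled


def fake_classify(text: str) -> str:
    """Stand-in classifier; the real pipeline presumably calls a remote model."""
    return "N"


records = [
    {"text": "already seen", "label": "H"},  # labeled on an earlier run
    {"text": "newly added"},                 # appended since the last run
]
count = classify_new_only(records, fake_classify)
print(count, records)
```

Running the function a second time on the same records classifies nothing, which is the property that keeps repeated report runs cheap.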

Key Concepts

graph LR
    A[Input Data] --> B[ContentDataSet]
    B --> C[ContentClassifier]
    C --> D[Classification Results]
    D --> E[ReportGenerator]
    E --> F[Analysis Reports]
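As a rough illustration of the flow in the graph, the sketch below passes input texts through a classification step that runs in parallel and then summarizes the results. It does not use the library's `ContentDataSet`, `ContentClassifier`, or `ReportGenerator` classes, whose interfaces are not documented here; the `classify` and `run_pipeline` functions and the keyword rule are hypothetical stand-ins.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import List


def classify(text: str) -> str:
    """Toy stand-in for the classification step: flags a single keyword."""
    return "V" if "attack" in text.lower() else "N"


def run_pipeline(texts: List[str], max_workers: int = 4) -> str:
    """Input data -> classification results -> a one-line summary report."""
    # executor.map preserves input order, so labels stay aligned with texts.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        labels = list(executor.map(classify, texts))
    counts = Counter(labels)
    return ", ".join(f"{label}={counts[label]}" for label in sorted(counts))


summary = run_pipeline(["They will attack at dawn", "Nice weather today", "hello"])
print(summary)  # N=2, V=1
```

A thread pool suits this kind of workload when each classification waits on a network call, since the threads spend most of their time idle rather than competing for the CPU.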

The library is organized into several key components:

Categories: Defines the taxonomy and classification schema

| Category | Label | Definition |
|---|---|---|
| sexual | `S` | Content meant to arouse sexual excitement, such as the description of sexual activity, or that promotes sexual services (excluding sex education and wellness). |
| hate | `H` | Content that expresses, incites, or promotes hate based on race, gender, ethnicity, religion, nationality, sexual orientation, disability status, or caste. |
| violence | `V` | Content that promotes or glorifies violence or celebrates the suffering or humiliation of others. |
| harassment | `HR` | Content that may be used to torment or annoy individuals in real life, or make harassment more likely to occur. |
| self-harm | `SH` | Content that promotes, encourages, or depicts acts of self-harm, such as suicide, cutting, and eating disorders. |
| sexual/minors | `S3` | Sexual content that includes an individual who is under 18 years old. |
| hate/threatening | `H2` | Hateful content that also includes violence or serious harm towards the targeted group. |
| violence/graphic | `V2` | Violent content that depicts death, violence, or serious physical injury in extreme graphic detail. |
| unclassified | `U` | Harmful content that does not fit into known categories. |
| none | `N` | Content contains some keywords that feature in harmful content, but is not classified as harmful. |
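One way to represent this taxonomy in code is an enumeration keyed by label. The sketch below is an assumption about how such a schema could be written, not the library's actual definition; the `Category` enum and `is_harmful` helper are hypothetical names.

```python
from enum import Enum


class Category(Enum):
    """Category names mapped to the labels from the taxonomy table."""
    SEXUAL = "S"
    HATE = "H"
    VIOLENCE = "V"
    HARASSMENT = "HR"
    SELF_HARM = "SH"
    SEXUAL_MINORS = "S3"
    HATE_THREATENING = "H2"
    VIOLENCE_GRAPHIC = "V2"
    UNCLASSIFIED = "U"
    NONE = "N"


def is_harmful(label: str) -> bool:
    """Every label except N ('none') denotes harmful content."""
    # Category(label) raises ValueError for labels outside the taxonomy.
    return Category(label) is not Category.NONE


print(Category("H2").name)  # HATE_THREATENING
print(is_harmful("N"))      # False
```

Looking a label up through the enum rejects anything outside the ten defined labels, which helps catch malformed classifier output early.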

Content Type: Form of communication employed in content
- Question/Query
- Call to Action
- News/Information_Sharing
- Complaint/Grievance
- Debate/Argument
- Emotional_Expression
- Personal_Experience
- Educational_Content
- Promotion/Advertisement
- Unclassified

Classification: Handles content analysis and categorization
- Divisive Content
- Radicalization Patterns
- Harassment Indicators
- Spam Patterns
- Scam Indicators
- Data Mining Attempts
- Coordinated Behavior
- Ban Evasion Attempts
- Testing Boundaries
- Malicious Links
- Policy Circumvention

Report: Generates comprehensive analysis reports
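To give a sense of what a report step might aggregate, here is a small sketch that tallies results by category label and content type and renders them as plain text. The result layout (`label`, `content_type`) and the `render_report` function are assumptions; the format of the library's real report is not described here.

```python
from collections import Counter
from typing import Dict, List


def render_report(title: str, results: List[Dict[str, str]]) -> str:
    """Build a plain-text summary of label and content-type counts."""
    by_label = Counter(r["label"] for r in results)
    by_type = Counter(r["content_type"] for r in results)
    lines = [title, "=" * len(title), f"Total items: {len(results)}", "", "By label:"]
    lines += [f"  {label}: {n}" for label, n in by_label.most_common()]
    lines += ["", "By content type:"]
    lines += [f"  {ctype}: {n}" for ctype, n in by_type.most_common()]
    return "\n".join(lines)


results = [
    {"label": "H", "content_type": "Debate/Argument"},
    {"label": "N", "content_type": "Question/Query"},
    {"label": "N", "content_type": "Debate/Argument"},
]
report = render_report("Daily Moderation Report", results)
print(report)
```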

Command Line Usage

The library includes a command-line interface for batch processing:

    python generate_report.py \
        --pickle-path data_latest.pkl \
        --report-title "Daily Moderation Report" \
        --output-dir reports

Command Line Options

--pickle-path: Path for data serialization (default: data_latest.pkl)

--report-title: Custom report title

--keep-last-classification: Only use most recent classifications

--skip-classification: Skip classification step

--output-dir: Output directory for reports
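The options above could be declared with `argparse` roughly as shown below. This is a sketch under stated assumptions, not the contents of generate_report.py: only the `data_latest.pkl` default is documented, so the other defaults are left as `None`, and treating the two classification options as boolean switches is a guess based on their names.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented options; defaults other than --pickle-path are assumed."""
    parser = argparse.ArgumentParser(description="Generate a moderation report.")
    parser.add_argument("--pickle-path", default="data_latest.pkl",
                        help="Path for data serialization")
    parser.add_argument("--report-title", default=None,
                        help="Custom report title")
    parser.add_argument("--keep-last-classification", action="store_true",
                        help="Only use most recent classifications")
    parser.add_argument("--skip-classification", action="store_true",
                        help="Skip classification step")
    parser.add_argument("--output-dir", default=None,
                        help="Output directory for reports")
    return parser


# Parse an explicit argument list so the example runs without a real command line.
args = build_parser().parse_args(
    ["--report-title", "Daily Moderation Report", "--skip-classification"]
)
print(args.pickle_path, args.report_title, args.skip_classification)
```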

Repository: sofasogood/moderation_pipeline

Folders and files

Latest commit

History

Repository files navigation

Abuse Vector Reporting Pipeline

Instructions

Key Concepts

Command Line Usage

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages