Privacy-Pilot

Overview

A Python-based tool for analyzing and evaluating website Terms & Conditions documents, with a focus on privacy and data protection compliance.

Note: This project now uses crawl4ai for web scraping instead of Selenium. See MIGRATION_SUMMARY.md and CRAWL4AI_GUIDE.md for details.

Features

GDPR compliance scoring
Privacy and legal compliance analysis
Web scraping of Terms & Conditions pages (powered by crawl4ai)
Multi-query document retrieval
Comprehensive JSON output of privacy metrics

Prerequisites

Python 3.8+
Internet connection (for first-time browser setup)

Installation

Clone the repository:

git clone https://github.com/yourusername/privacy-analyzer.git
cd privacy-analyzer

Create a virtual environment:

python -m venv venv
source venv/bin/activate  # On Windows, use `venv\Scripts\activate`

Install dependencies:

pip install -r requirements.txt

Set up environment variables:

Create a .env file in the project root (the repository includes .env.example as a template)
Add your Groq API key to it: GROQ_API_KEY=your_api_key_here
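
To confirm the key is visible to Python before running an analysis, a check along the following lines may help. This is a minimal sketch using only the standard library; the helper names load_env_file and require_groq_key are hypothetical and not part of this project, which may load the .env file differently.

```python
import os


def load_env_file(path=".env"):
    """Read KEY=VALUE lines from a .env-style file into os.environ.

    Hypothetical stand-in for a dotenv loader. Variables that are
    already set in the environment are not overwritten.
    """
    if not os.path.exists(path):
        return
    with open(path, encoding="utf-8") as handle:
        for raw_line in handle:
            line = raw_line.strip()
            # Skip blank lines, comments, and lines without an assignment.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip().strip("'\""))


def require_groq_key():
    """Return GROQ_API_KEY, or raise a clear error if it is missing."""
    key = os.environ.get("GROQ_API_KEY", "")
    if not key or key == "your_api_key_here":
        raise RuntimeError("GROQ_API_KEY is not set; add it to your .env file")
    return key
```

Calling load_env_file() followed by require_groq_key() at startup fails early with a readable message instead of an authentication error deep inside a request.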

Configuration

Model Selection

Modify the llm initialization in get_json() to select a different language model, for example:

mixtral-8x7b-32768
llama3-8b-8192
Other Groq-supported models
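
One way to switch models without editing code each time is to read the model name from the environment. The sketch below is illustrative only: it assumes the llm is created with ChatGroq from the langchain-groq package, which this README does not confirm, and the PRIVACY_PILOT_MODEL variable and both function names are hypothetical.

```python
import os


def pick_model(default="llama3-8b-8192"):
    """Return the model name from PRIVACY_PILOT_MODEL (hypothetical), else the default."""
    name = os.environ.get("PRIVACY_PILOT_MODEL", default).strip()
    if not name:
        raise ValueError("model name must not be empty")
    return name


def build_llm(model_name=None):
    """Create a Groq chat model.

    Assumes the langchain-groq package is installed and GROQ_API_KEY is set.
    The import is done lazily so pick_model() works without the package.
    """
    from langchain_groq import ChatGroq

    return ChatGroq(model=model_name or pick_model(), temperature=0)
```

With this in place, setting PRIVACY_PILOT_MODEL=mixtral-8x7b-32768 in the shell or the .env file would change the model used by build_llm() without touching get_json() again.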

Embedding Model

Update local_model_path to use a different Hugging Face embedding model.

Usage

from main import get_json

# Analyze Terms & Conditions
url = "https://example.com/terms"
results = get_json(url)
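
To keep the metrics for later inspection, the results can be written to disk. This is a small sketch, assuming get_json() returns either a JSON-serializable dict or a JSON string (the README does not specify which); save_results is a hypothetical helper, and the output.json default is only a suggestion.

```python
import json
from pathlib import Path


def save_results(results, path="output.json"):
    """Write analysis results to a JSON file and return the path written.

    Accepts either a JSON string or a JSON-serializable object.
    """
    if isinstance(results, str):
        # Parse first so the file is always pretty-printed, valid JSON.
        results = json.loads(results)
    target = Path(path)
    target.write_text(
        json.dumps(results, indent=2, ensure_ascii=False),
        encoding="utf-8",
    )
    return target
```

For example, save_results(results, "example_terms.json") would store the analysis from the snippet above in a file of that name.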

Dependencies

LangChain
Crawl4AI (web scraping)
Hugging Face Transformers
Groq API
Chroma Vector Store

Limitations

Accuracy depends on webpage structure
Requires internet connection for scraping
Limited to publicly accessible web pages

Disclaimer

This tool provides an automated analysis and should not replace professional legal advice.

Repository: VishalPainjane/PrivacyPilot
