📘 Automated Book Publication Workflow

An agent-driven pipeline that automates the rewriting and publication of public-domain books, with human-in-the-loop editing and content versioning.

✨ Features

🌐 Web Scraping
Scrapes book chapters from online sources (e.g., Wikisource) using Playwright.
🧠 AI Content Generation
Uses a multi-agent architecture (Writer & Reviewer) powered by LLMs (like Google Gemini) to rewrite and refine chapters.
🧑‍💻 Human-in-the-Loop Editing
After AI rewriting, an optional manual feedback step lets human writers and editors make changes before the content is finalized.
📚 Content Versioning
Final outputs are saved into ChromaDB with automatic version labels (v1, v2, ...), UUIDs, and metadata.
🔍 RL-inspired Retrieval
Query past versions using TF-IDF + cosine similarity (stubbed for future reinforcement learning-based retrieval).

📂 Directory Structure

auto_book_pub/
├── main.py                    # Entry point
├── README.md
├── LICENSE
├── requirements.txt
├── .env
├── .gitignore
├── config/
│   └── settings.yaml
├── data/
│   ├── raw/                   # Raw scraped HTML + screenshots
│   ├── processed/             # AI-edited versions
│   └── versions/              # Finalized, versioned outputs
├── scraping/
│   └── scraper.py             # Playwright-based scraper
├── ai_agents/
│   ├── writer_agent.py        # AI "spinner"
│   ├── reviewer_agent.py      # Reviewer LLM
│   └── editor_agent.py        # Optional human/AI edit flow
├── human_loop/
│   └── feedback_manager.py    # Handle user input iterations
├── versioning/
│   └── chromadb_manager.py    # Store/retrieve versioned content
├── rl_search/
│   └── retriever.py           # TF-IDF retriever (stub for future RL-based retrieval)
└── utils/
    └── helpers.py             # Common utilities

🚀 How It Works

Scrape Content

raw_text = scrape_chapter(target_url)
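
As a rough illustration of the scraping step, here is a minimal sketch of what `scrape_chapter` might look like with Playwright's synchronous API. The `screenshot_path` parameter, the `normalize_text` helper, and the choice to read the text of the page `body` are assumptions made for this example, not the project's actual code.

```python
import re
from pathlib import Path


def normalize_text(text: str) -> str:
    """Collapse runs of spaces/tabs and runs of blank lines (hypothetical helper)."""
    text = re.sub(r"[ \t]+", " ", text)
    text = re.sub(r"\n\s*\n+", "\n\n", text)
    return text.strip()


def scrape_chapter(target_url: str, screenshot_path: str = "data/raw/chapter.png") -> str:
    """Open the page, save a full-page screenshot, and return its visible text."""
    # Imported lazily so this module still loads when Playwright is not installed.
    from playwright.sync_api import sync_playwright

    Path(screenshot_path).parent.mkdir(parents=True, exist_ok=True)
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            page.goto(target_url)
            page.screenshot(path=screenshot_path, full_page=True)
            text = page.inner_text("body")
        finally:
            browser.close()
    return normalize_text(text)
```

A real scraper would likely target the chapter's content container rather than the whole `body`; the right selector depends on the source site.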

AI Rewriting + Review

rewritten = WriterAgent().spin(raw_text)
reviewed  = ReviewerAgent().review(rewritten)
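
The Writer and Reviewer hand-off above can be sketched as two small classes that share a text-generation function. This is an assumption-laden sketch: the `generate` constructor argument, the prompt wording, and the `echo_llm` stand-in are all hypothetical, and the project's agents (which are called without arguments above) may be wired to Gemini quite differently.

```python
from typing import Callable

# Any function mapping a prompt string to generated text. A real setup would
# wrap an LLM client here (assumption; not taken from the project).
Generate = Callable[[str], str]


class WriterAgent:
    def __init__(self, generate: Generate):
        self.generate = generate

    def spin(self, raw_text: str) -> str:
        """Ask the model for a rewritten version of the raw chapter."""
        prompt = "Rewrite the following chapter in fresh prose:\n\n" + raw_text
        return self.generate(prompt)


class ReviewerAgent:
    def __init__(self, generate: Generate):
        self.generate = generate

    def review(self, rewritten: str) -> str:
        """Ask the model to refine the writer's draft."""
        prompt = "Review and refine this draft for clarity:\n\n" + rewritten
        return self.generate(prompt)


def echo_llm(prompt: str) -> str:
    """Stand-in model: returns the text after the instruction line, upper-cased."""
    return prompt.split("\n\n", 1)[1].upper()


rewritten = WriterAgent(echo_llm).spin("It was a dark night.")
reviewed = ReviewerAgent(echo_llm).review(rewritten)
```

Passing the model in as a plain callable keeps the agents testable offline and makes it easy to swap one LLM backend for another.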

Human Feedback (Optional)

final_version = collect_feedback(reviewed)
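
One possible shape for the optional feedback step is a simple accept-or-edit loop on the console. The prompts, the `ask` parameter (injected so the loop can be driven without a terminal), and the accept/edit choices are assumptions for illustration only.

```python
from typing import Callable


def collect_feedback(reviewed: str, ask: Callable[[str], str] = input) -> str:
    """Show the reviewed text and let a human accept it or supply a replacement."""
    current = reviewed
    while True:
        print("\n--- Current draft ---\n" + current)
        choice = ask("Accept this draft [y] or edit it [e]? ").strip().lower()
        if choice in ("", "y"):
            return current
        if choice == "e":
            edited = ask("Enter the revised text: ").strip()
            if edited:  # ignore empty edits and ask again
                current = edited
```

Pressing Enter or typing `y` accepts the draft unchanged, which matches the step being optional.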

Save to ChromaDB

save_version(final_version)
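
To make the version labels, UUIDs, and metadata concrete, here is a small sketch of the labelling logic. It deliberately uses an in-memory dictionary as a stand-in for a ChromaDB collection, so it shows the bookkeeping only; the `VersionStore` class, the `chapter` argument, and the metadata fields are hypothetical.

```python
import uuid
from datetime import datetime, timezone


class VersionStore:
    """In-memory stand-in for a ChromaDB collection (assumption, for illustration)."""

    def __init__(self):
        self.records = {}  # id -> {"document": str, "metadata": dict}

    def save_version(self, text: str, chapter: str = "chapter_1") -> dict:
        # The next label is one past the number of versions stored for this chapter.
        existing = [
            r for r in self.records.values() if r["metadata"]["chapter"] == chapter
        ]
        metadata = {
            "chapter": chapter,
            "version": f"v{len(existing) + 1}",
            "created_at": datetime.now(timezone.utc).isoformat(),
        }
        record_id = str(uuid.uuid4())
        self.records[record_id] = {"document": text, "metadata": metadata}
        return {"id": record_id, **metadata}


store = VersionStore()
first = store.save_version("First final draft.")
second = store.save_version("Second final draft.")
```

With a real ChromaDB collection, the same id, document, and metadata triple would be passed to the collection instead of being stored in `self.records`.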

Retrieve Similar Versions

results = retrieve_version("version-number")
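
The TF-IDF plus cosine similarity retrieval mentioned under Features can be sketched in pure Python as follows. The function names (`retrieve_similar`, `tfidf_vectors`, and so on), the smoothed IDF formula, and the label-to-text dictionary are assumptions for this example; the project's `retrieve_version` may work differently.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list:
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs: list) -> list:
    """Return one sparse TF-IDF vector (term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF, so a term found in every document still gets a positive weight.
    idf = {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: (c / len(tokens)) * idf[t] for t, c in tf.items()})
    return vectors


def cosine(a: dict, b: dict) -> float:
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_similar(query: str, versions: dict, top_k: int = 1) -> list:
    """Rank stored versions (label -> text) by cosine similarity to the query."""
    labels = list(versions)
    # The query is vectorized together with the documents so they share one IDF table.
    vectors = tfidf_vectors([versions[label] for label in labels] + [query])
    query_vec = vectors[-1]
    scored = [(label, cosine(query_vec, vec)) for label, vec in zip(labels, vectors[:-1])]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


versions = {
    "v1": "the captain sailed the ship at dawn",
    "v2": "rain fell on the quiet village",
}
results = retrieve_similar("ship captain", versions)
```

Here `results` holds `(label, score)` pairs, best match first; `v1` ranks first because it is the only version sharing terms with the query.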

⚙️ Setup

🔧 Prerequisites

- Python 3.10+
- Google Gemini API Key (optional)
- Playwright dependencies

📦 Installation

git clone https://github.com/yourusername/auto_book_pub
cd auto_book_pub

Set up Python environment

pip install -r requirements.txt

Set up Playwright

playwright install

🔐 .env Configuration

Create a .env file:

GEMINI_API_KEY=your_google_gemini_key
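
For reference, a minimal sketch of reading this key at startup using only the standard library. The `load_env_file` and `get_gemini_key` helpers are hypothetical; the project may instead rely on a dedicated package for loading `.env` files.

```python
import os
from pathlib import Path
from typing import Optional


def load_env_file(path: str = ".env") -> None:
    """Copy KEY=value pairs from a .env file into os.environ (existing values win)."""
    env_path = Path(path)
    if not env_path.exists():
        return
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not a KEY=value pair.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        os.environ.setdefault(key.strip(), value.strip().strip('"').strip("'"))


def get_gemini_key(path: str = ".env") -> Optional[str]:
    """Return the key if configured; None is acceptable because the key is optional."""
    load_env_file(path)
    return os.environ.get("GEMINI_API_KEY") or None
```

Because `.env` holds a secret, it should stay listed in `.gitignore` and out of version control.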

🧪 Dev Mode (Optional)

To avoid re-scraping each time, enable dev mode in main.py:

dev_mode = True
if dev_mode:
    raw_text = load_cached_text()
else:
    raw_text = scrape_chapter(target_url)
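
The snippet above assumes a cached copy already exists. One way to keep that cache fresh is sketched below: scrape once, write the result to disk, and read it back in dev mode. The `CACHE_PATH` location, the `get_raw_text` wrapper, and the injected `scrape` argument are assumptions for illustration.

```python
from pathlib import Path
from typing import Callable

CACHE_PATH = Path("data/raw/chapter.txt")  # hypothetical cache location


def load_cached_text(cache_path: Path = CACHE_PATH) -> str:
    return cache_path.read_text(encoding="utf-8")


def get_raw_text(
    url: str,
    scrape: Callable[[str], str],
    dev_mode: bool = False,
    cache_path: Path = CACHE_PATH,
) -> str:
    """Use the cached chapter in dev mode when present; otherwise scrape and refresh it."""
    if dev_mode and cache_path.exists():
        return load_cached_text(cache_path)
    raw_text = scrape(url)
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    cache_path.write_text(raw_text, encoding="utf-8")
    return raw_text
```

Falling back to a live scrape when the cache is missing means the first dev-mode run still works on a fresh checkout.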

📄 License

MIT License. See LICENSE for details.

AndyFerns/auto_book_pub
