Python package for interacting with the PubPub v6 API (pubpub.org). Created for The Unjournal (unjournal.org) to automate the production process for evaluation packages.
90% automation achieved! Now supports importing Word documents with properly rendered tables directly via the PubPub API.
- Word Document Import: Upload `.docx` files and import them with proper table rendering
- Table Support: Tables are converted to native ProseMirror format (not raw HTML text)
- Automated Pipeline: HTML → Word → PubPub import with full formatting preservation
- Fixed Table Headers: Proper multi-row headers for metrics and journal ranking tables
```python
from scripts.pubpub_automation.package_assembler import PaperMetadata, EvaluationData, EvaluationPackageData
from scripts.pubpub_automation.create_package_from_data import EvaluationPackageCreator
import conf

# 1. Define paper
paper = PaperMetadata(
    title='Your Paper Title',
    authors=['Author 1', 'Author 2'],
    doi='10.1234/example'
)

# 2. Define evaluations
evaluations = [
    EvaluationData(
        ratings={'overall_assessment': 90, 'methods': 85},
        review_source_type='latex',  # or 'markdown', 'text'
        review_source_path='/path/to/review.tex',
        evaluator_name='Jane Doe',
        is_public=False  # Anonymous for draft mode
    )
]

# 3. Create package
creator = EvaluationPackageCreator(
    email=conf.email, password=conf.password,
    community_url=conf.community_url, community_id=conf.community_id
)
package_data = EvaluationPackageData(paper=paper, evaluations=evaluations)
result = creator.create_package(package_data, draft_mode=True)
# Done! Package is live with all content automatically imported
```
- ✅ Word Document Import - Import `.docx` files with tables via API
- ✅ Automatic LaTeX Conversion - LaTeX reviews → markdown → PubPub
- ✅ Automatic Ratings Tables - Generate formatted tables from data
- ✅ Draft/Final Workflow - Anonymous posting → add names after consent
- ✅ Template System - Auto-filled evaluation summaries
- ✅ Coda Integration - Fetch evaluations from Coda.io (ready to test)
- ✅ General Purpose - Works for any evaluation
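As an illustration of the ratings-table step, a generator of this kind can turn a ratings dict into a markdown table in a few lines (a simplified sketch; the actual `ratings_table_generator.py` may format tables differently):

```python
def ratings_to_markdown(ratings):
    """Render a {criterion: score} dict as a simple markdown table (illustrative)."""
    lines = ["| Criterion | Rating (0-100) |", "|---|---|"]
    for criterion, score in ratings.items():
        # Turn snake_case keys like 'overall_assessment' into readable labels
        label = criterion.replace('_', ' ').title()
        lines.append(f"| {label} | {score} |")
    return "\n".join(lines)
```

For example, `ratings_to_markdown({'overall_assessment': 90, 'methods': 85})` yields a two-row table with "Overall Assessment" and "Methods" rows.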
```python
import os

from pypubpub import Pubshelper_v6

pubhelper = Pubshelper_v6(
    community_url="https://unjournal.pubpub.org",
    community_id="your-community-id",
    email="your@email.com",
    password="your-password"
)
pubhelper.login()

# Upload and import a Word document with tables
file_url = pubhelper.upload_file('evaluation_summary.docx')
file_size = os.path.getsize('evaluation_summary.docx')

# Import to pub - tables will render properly!
result = pubhelper.import_to_pub(pub_id, file_url, 'evaluation_summary.docx', file_size)
```

| Task | Before | After |
|---|---|---|
| Convert LaTeX review | 30 min | Automatic |
| Create ratings tables | 20 min | Automatic |
| Fill templates | 20 min | Automatic |
| Import to PubPub | 20 min | Automatic |
| Total | 2-3 hours | ~12 minutes |
- AUTOMATION_COMPLETE.md - Complete automation overview
- docs/AUTOMATION_WORKFLOW.md - Detailed usage guide
- scripts/pubpub_automation/README.md - Quick reference
- AUTOMATION_STATUS.md - Current capabilities (85% automated)
- CLAUDE.md - Developer guide
```bash
# Install package in development mode
pip install -e .

# Or with dev dependencies
pip install -e ".[dev]"
```

- Create `.env` file in repository root (already templated for you)
- Add your Coda API credentials:

```
CODA_API_KEY=your_api_key_here
CODA_DOC_ID=your_doc_id_here
CODA_TABLE_ID=your_table_id_here
```

- See docs/CODA_SETUP.md for detailed instructions
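For reference, a `.env` file of `KEY=value` lines can be loaded with a few lines of standard-library Python (a minimal sketch; `check_env.py` and packages like python-dotenv do this more robustly):

```python
import os

def load_env(path=".env"):
    """Parse KEY=value lines from a .env file into os.environ.

    Skips blank lines and comments; already-set environment variables win.
    """
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```

After `load_env()`, credentials are available as `os.environ['CODA_API_KEY']` and so on.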
```bash
# Check .env configuration
python scripts/coda_integration/check_env.py

# Test Coda connection (after adding API key)
python scripts/coda_integration/test_coda_connection.py

# Test LaTeX conversion
python scripts/pubpub_automation/latex_to_markdown.py input.tex output.md
```

```
pypubpub/
├── pypubpub/                        # Core API client library
│   ├── Pubv6.py                     # Main API classes (Pubshelper_v6, EvaluationPackage)
│   ├── utils.py                     # Utility functions
│   └── repec/                       # RePEc metadata generation
│
├── scripts/
│   ├── pubpub_automation/           # NEW: Automated package creation
│   │   ├── create_package_from_data.py   # Main automation script
│   │   ├── package_assembler.py          # Package assembly
│   │   ├── latex_to_markdown.py          # LaTeX converter
│   │   ├── ratings_table_generator.py    # Table generator
│   │   └── template_generator.py         # Template system
│   │
│   ├── coda_integration/            # Coda.io API integration
│   │   ├── fetch_from_coda.py       # Fetch evaluation data
│   │   ├── setup_coda.py            # Setup wizard
│   │   └── check_env.py             # Verify configuration
│   │
│   └── utilities/                   # Utility scripts
│
├── docs/                            # Documentation
│   ├── AUTOMATION_WORKFLOW.md       # Complete usage guide
│   ├── AUTOMATION_GUIDE.md          # Original guide
│   ├── CODA_SETUP.md                # Coda setup instructions
│   └── CODA_WORKFLOW.md             # Coda integration details
│
├── examples/                        # Example evaluation packages
│   └── evaluation_packages/
│       └── scale_use_heterogeneity/ # Working example
│
└── tests/                           # Test suite
```
```python
# For LaTeX reviews, PDF ratings, local data
creator.create_from_files(
    paper_metadata=paper,
    evaluation_files=[...],
    draft_mode=True
)
```

```python
# Fetch from Coda and create package
from scripts.coda_integration.fetch_from_coda import fetch_evaluation_data

coda_data = fetch_evaluation_data("Paper Title")
creator.create_from_coda(coda_data, paper_metadata)
```

```python
# Step 1: Draft (anonymous)
result = creator.create_package(package_data, draft_mode=True)

# Share with authors...

# Step 2: Final (with names after consent)
for eval in evaluations:
    if evaluator_consented:
        eval.is_public = True
result = creator.create_package(package_data, draft_mode=False)
```

- ✅ `.env` file gitignored - safe for credentials
- ✅ `conf.py` gitignored - never committed
- ✅ Sensitive evaluation data in `evaluation_data/confidential/` (gitignored)
- ✅ Comprehensive .gitignore patterns for secrets

⚠️ Never commit API keys or passwords
⚠️ Never commit evaluator pseudonyms or confidential comments
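Based on the items above, the relevant `.gitignore` entries look roughly like this (an illustrative fragment, not the repository's exact file):

```gitignore
.env
conf.py
evaluation_data/confidential/
```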
```bash
# Run all tests
pytest

# Run specific test
pytest tests/test_create/test_create_eval_package.py

# Test with verbose output
pytest -v
```

Test configuration lives in `tests/conf_settings.py` (copy from `tests/conf_settings_template.py`).
Main API client with methods for:
- `get_many_pubs()` - Query and retrieve pubs
- `create_pub()` - Create new publications
- `connect_pub()` - Link pubs together
- `set_attribution()` - Manage authors
- `replace_pub_text()` - Update content
- `downloadpubexport()` - Export in various formats
- `upload_file()` - Upload files to PubPub's S3 storage
- `import_to_pub()` - Import Word/HTML documents with proper table rendering
- `import_html_to_pub()` - Convenience method for HTML import
High-level class for creating complete evaluation packages:
- Looks up paper metadata from DOI
- Creates evaluation summary + individual evaluations
- Sets up all connections
- Associates authors/evaluators
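The DOI lookup step can be done against a public metadata service. A sketch using Crossref's REST endpoint `https://api.crossref.org/works/{doi}` (an assumption about the lookup source; the package's actual implementation may use a different service):

```python
import json
import urllib.request

def crossref_url(doi):
    """Build the Crossref works endpoint URL for a DOI."""
    return f"https://api.crossref.org/works/{doi}"

def lookup_doi(doi):
    """Fetch title and author names for a DOI from Crossref (makes a network call)."""
    with urllib.request.urlopen(crossref_url(doi)) as resp:
        msg = json.load(resp)["message"]
    return {
        # Crossref returns 'title' as a list of strings
        "title": msg.get("title", [""])[0],
        # and 'author' as a list of {'given': ..., 'family': ...} dicts
        "authors": [f"{a.get('given', '')} {a.get('family', '')}".strip()
                    for a in msg.get("author", [])],
    }
```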
NEW automated workflow:
- Convert LaTeX/Word → Markdown
- Generate ratings tables
- Fill evaluation templates
- Import content to PubPub
- Handle draft/final modes
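To illustrate the LaTeX → Markdown step, a few common constructs can be mapped with regular expressions (a toy sketch; the real `latex_to_markdown.py` converter handles far more):

```python
import re

# Minimal illustrative substitutions; a full conversion (e.g. via pandoc) is much richer
RULES = [
    (re.compile(r"\\section\{([^}]*)\}"), r"## \1"),
    (re.compile(r"\\subsection\{([^}]*)\}"), r"### \1"),
    (re.compile(r"\\textbf\{([^}]*)\}"), r"**\1**"),
    (re.compile(r"\\emph\{([^}]*)\}"), r"*\1*"),
]

def latex_to_markdown(text):
    """Apply each LaTeX-to-markdown substitution rule in turn."""
    for pattern, repl in RULES:
        text = pattern.sub(repl, text)
    return text
```

For instance, `latex_to_markdown(r"\section{Methods}")` produces a `##` heading.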
```bash
# Create from data with automation
python scripts/pubpub_automation/create_package_from_data.py --config config.json
```

```python
from pypubpub.scripttasks.backup import backupV6

backupV6(pubhelper=pubhelper, output_dir="./backups", format='plain')
```

```python
from pypubpub.repec import RePEcPopulator

populator = RePEcPopulator(pubhelper=pubhelper, inputdir="./repec_rdfs")
metadata = populator.build_metadata_file()
```

- PubPub API Docs: https://www.pubpub.org/apiDocs
- The Unjournal: https://unjournal.org
- Production Work: https://github.com/daaronr/unjournalpubpub_production
- Task Tracking: https://coda.io/d/_dOyXJoZ6imx
- Automate evaluation package creation ✅ 85% complete
- Enable ad-hoc adjustments and bulk fixes ✅ Complete
- Build RePEc metadata ✅ Complete
- Enable feeds and updates 🚧 In progress
- Coda integration 🚧 Ready to test
This is an internal tool for The Unjournal. For issues or questions:
- See documentation in `docs/`
- Check `AUTOMATION_STATUS.md` for current capabilities
- Refer to `CLAUDE.md` for development guidelines
Internal project for The Unjournal.
Status: Production-ready automation system (90% automated)
Last Updated: December 2024
Maintainer: The Unjournal team