Content Knowledge Base

Entity dataset for movies, books, and related content (characters, locations, items, people, organizations).

Quick Start

npm install
npm run pipeline -- titles.json

This runs the full pipeline:

Extract - LLM extracts entities (characters, locations, etc.) from each title
Generate - LLM generates detailed entity JSON files
Validate - Checks for issues (missing targets, duplicates, etc.)
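
The Validate step can be pictured with a minimal sketch like the one below. It is not the project's actual validator. It assumes each relationship is an object of the form `{ type, target }` where `target` holds another entity's `id`, and it treats two entities of the same type with the same case-insensitive name as duplicates. The function name `findIssues` and the issue shapes are hypothetical.

```javascript
// Sketch of two validation checks: missing relationship targets and duplicates.
// Assumption: a relationship looks like { type, target }, with `target` = entity id.
function findIssues(entities) {
  const ids = new Set(entities.map((e) => e.id));
  const seenNames = new Map(); // "type:lowercased name" -> id of first entity seen
  const issues = [];

  for (const entity of entities) {
    // A relationship pointing at an id that no entity has is a missing target.
    for (const rel of entity.relationships ?? []) {
      if (!ids.has(rel.target)) {
        issues.push({ kind: "missing-target", id: entity.id, target: rel.target });
      }
    }

    // Same type and same name (ignoring case) is reported as a duplicate.
    const key = `${entity.type}:${entity.name.toLowerCase()}`;
    if (seenNames.has(key)) {
      issues.push({ kind: "duplicate", id: entity.id, duplicateOf: seenNames.get(key) });
    } else {
      seenNames.set(key, entity.id);
    }
  }
  return issues;
}

const sampleEntities = [
  { id: "a", type: "character", name: "Example Hero", relationships: [{ type: "ally-of", target: "b" }] },
  { id: "b", type: "character", name: "Example Ally", relationships: [{ type: "ally-of", target: "z" }] },
  { id: "c", type: "character", name: "example hero", relationships: [] },
];
console.log(findIssues(sampleEntities));
```

Here entity `b` points at an id that does not exist, and entity `c` repeats the name of entity `a`, so the sketch reports one issue of each kind.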

Project Structure

entities/           # Individual entity JSON files (914 files)
prompts/            # LLM prompt templates
schemas/            # JSON schema for entities
scripts/            # Node.js scripts
batches/            # Generated batch configs (intermediate files)
titles.json         # Canonical list of all titles

Scripts

| Command | Description |
| --- | --- |
| `npm run pipeline -- <titles.json>` | Full end-to-end generation |
| `npm run extract -- <type> <title>` | Extract entities from a single title |
| `npm run generate -- <type> <name> [source]` | Generate a single entity |
| `npm run generate-batch -- <config.json>` | Generate entities from batch config |
| `npm run validate` | Validate all entities |
| `npm run build` | Build search index (see below) |
| `npm run reconcile` | Interactive relationship reconciliation |

Pipeline Options

# Full pipeline
npm run pipeline -- titles.json

# Extract only (review before generating)
npm run pipeline -- titles.json --extract-only

# Skip extraction, use existing batch files
npm run pipeline -- titles.json --skip-extract

# Process in smaller chunks (for API rate limits)
npm run pipeline -- titles.json --chunk-size=10
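
How `--chunk-size` is implemented inside the pipeline script is not documented here. As a rough illustration of the idea only, the sketch below splits a list of titles into groups of at most `size` entries so that each group can be sent to the API separately. The helper name `chunk` is hypothetical.

```javascript
// Sketch: split an array into consecutive groups of at most `size` items.
// This illustrates chunking in general, not the pipeline's actual code.
function chunk(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("chunk size must be a positive integer");
  }
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// 25 placeholder titles split with a chunk size of 10.
const placeholderTitles = Array.from({ length: 25 }, (_, i) => ({
  type: "movie",
  name: `Placeholder Title ${i + 1}`,
}));
const groups = chunk(placeholderTitles, 10);
console.log(groups.map((g) => g.length));
```

A smaller chunk size means more, shorter requests, which is the usual trade-off when working under a rate limit.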

Adding New Content

Add titles to titles.json:

{ "type": "movie", "name": "New Movie Title" }

Run the pipeline:
```
npm run pipeline -- titles.json
```

Or generate individually:

npm run extract -- movie "New Movie" --save
npm run generate-batch -- batches/new-movie.json
npm run validate
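
If titles are added programmatically rather than by hand, a small helper can avoid inserting the same title twice. The sketch below assumes that titles.json holds a JSON array of `{ type, name }` objects, as the entry above suggests. The function name `addTitle` is hypothetical and not part of the project's scripts.

```javascript
// Sketch: append a title entry unless an entry with the same type and
// (case-insensitive) name already exists.
// Assumption: titles.json is a JSON array of { type, name } objects.
function addTitle(titles, entry) {
  const exists = titles.some(
    (t) => t.type === entry.type && t.name.toLowerCase() === entry.name.toLowerCase()
  );
  // Return the original array untouched when nothing needs to be added.
  return exists ? titles : [...titles, entry];
}

const currentTitles = [{ type: "movie", name: "Existing Movie" }];
const updatedTitles = addTitle(currentTitles, { type: "movie", name: "New Movie Title" });
console.log(JSON.stringify(updatedTitles, null, 2));
```

In a real script the array would be read with `JSON.parse(fs.readFileSync("titles.json", "utf8"))` and written back with `fs.writeFileSync` before running the pipeline.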

Search Index (Optional)

If you need a combined index file for a search interface:

npm run build

This creates build/index.json with all entities bundled together, indexed by ID, type, and tag. The build folder is gitignored.
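
The exact layout of build/index.json is not described here. As a sketch of what "indexed by ID, type, and tag" could look like, the function below builds three lookup tables from an array of entities. The key names `byId`, `byType`, and `byTag` are assumptions made for illustration.

```javascript
// Sketch: bundle entities into one object with lookups by id, type, and tag.
// The property names byId / byType / byTag are hypothetical.
function buildIndex(entities) {
  const index = { byId: {}, byType: {}, byTag: {} };
  for (const entity of entities) {
    index.byId[entity.id] = entity;
    // Type and tag lookups store ids only, so each entity appears once in full.
    (index.byType[entity.type] ??= []).push(entity.id);
    for (const tag of entity.tags ?? []) {
      (index.byTag[tag] ??= []).push(entity.id);
    }
  }
  return index;
}

const indexed = buildIndex([
  { id: "a", type: "character", name: "Example Hero", tags: ["protagonist"] },
  { id: "b", type: "location", name: "Example City", tags: ["city", "protagonist"] },
]);
console.log(JSON.stringify(indexed, null, 2));
```

A search interface could then resolve a tag to ids through `byTag` and fetch the full records from `byId`.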

Entity Schema

Each entity has:

id - UUID
type - person, character, location, item, organization, movie, book, franchise
name - canonical name
description - 1-2 sentence summary
content - array of {title, body} sections
aliases - alternative names
relationships - links to other entities
properties - type-specific attributes
tags - categorization tags
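
To make the field list concrete, here is a hypothetical entity together with a minimal shape check. The authoritative rules live in the JSON schema under schemas/. Which fields are required, the `{ type, target }` relationship shape, and all sample values below are assumptions.

```javascript
// Entity types as listed in this README.
const ENTITY_TYPES = [
  "person", "character", "location", "item",
  "organization", "movie", "book", "franchise",
];

// A made-up entity; every value here is a placeholder.
const exampleEntity = {
  id: "123e4567-e89b-12d3-a456-426614174000",
  type: "character",
  name: "Example Character",
  description: "A placeholder character used to illustrate the entity shape.",
  content: [{ title: "Background", body: "Longer free-form text goes here." }],
  aliases: ["The Example"],
  relationships: [{ type: "appears-in", target: "00000000-0000-4000-8000-000000000000" }],
  properties: { species: "human" },
  tags: ["protagonist"],
};

// Sketch of a shape check; returns a list of human-readable problems.
function checkShape(entity) {
  const errors = [];
  for (const field of ["id", "type", "name", "description"]) {
    if (typeof entity[field] !== "string" || entity[field] === "") {
      errors.push(`${field} must be a non-empty string`);
    }
  }
  if (typeof entity.type === "string" && !ENTITY_TYPES.includes(entity.type)) {
    errors.push(`unknown type: ${entity.type}`);
  }
  for (const field of ["content", "aliases", "relationships", "tags"]) {
    if (!Array.isArray(entity[field])) {
      errors.push(`${field} must be an array`);
    }
  }
  return errors;
}

console.log(checkShape(exampleEntity));
```

For real validation, `npm run validate` remains the source of truth.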

Environment Variables

For entity generation, set:

OPENAI_API_KEY_CONTENTGEN=<api-key>
OPENAI_API_BASE_URL=<base-url>      # optional
OPENAI_API_ORG=<org-id>             # optional
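
A script that needs these settings might read them along the following lines. This is a sketch, not the project's code: the function name `readApiConfig` and the returned property names are hypothetical, and only `OPENAI_API_KEY_CONTENTGEN` is treated as required, matching the "optional" notes above.

```javascript
// Sketch: collect API settings from environment variables.
// Throws when the required key is missing; optional values become undefined.
function readApiConfig(env = process.env) {
  const apiKey = env.OPENAI_API_KEY_CONTENTGEN;
  if (!apiKey) {
    throw new Error("OPENAI_API_KEY_CONTENTGEN is not set");
  }
  return {
    apiKey,
    baseURL: env.OPENAI_API_BASE_URL || undefined,
    organization: env.OPENAI_API_ORG || undefined,
  };
}

// Example with an explicit object instead of the real environment.
const exampleConfig = readApiConfig({ OPENAI_API_KEY_CONTENTGEN: "placeholder-key" });
console.log(Object.keys(exampleConfig));
```

Passing the environment as a parameter keeps the function easy to test without touching real credentials.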
