Summary
Make it one-command easy for newcomers to spin up Postgres, load the provided halo-data.dump, and run the training queries in self-join-training/ and window-function-training/. This will reduce onboarding friction and ensure everyone runs against the same schema/data.
Why
- The repo includes a Postgres dump (halo-data.dump) and schema SQL (base-data-model/tables), but there are no clear, end-to-end setup instructions or automation to restore and verify the dataset.
- Having a reproducible environment will help students focus on SQL concepts rather than local setup.
Proposed changes
- Add Dockerized Postgres
  - Add docker-compose.yml that starts a Postgres service (e.g., postgres:16) with a named volume (a minimal sketch follows this list).
  - Expose port 5432; set POSTGRES_DB, POSTGRES_USER, POSTGRES_PASSWORD via .env.
  - Optionally add a simple healthcheck.
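For reference, a minimal compose file could look like the sketch below. The `db` service name, `pgdata` volume, default user/database values, and the `./scripts/init` mount are assumptions to be adjusted, not existing repo paths:

```yaml
# docker-compose.yml — minimal sketch; service/volume names and defaults are assumptions
services:
  db:
    image: postgres:16
    ports:
      - "5432:5432"
    environment:
      POSTGRES_DB: ${POSTGRES_DB:-halo}
      POSTGRES_USER: ${POSTGRES_USER:-halo}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:?set POSTGRES_PASSWORD in .env}
    volumes:
      - pgdata:/var/lib/postgresql/data
      # Files in /docker-entrypoint-initdb.d run on the first start of an empty volume,
      # so the schema can be applied automatically before any restore.
      - ./scripts/init/01_schema.sql:/docker-entrypoint-initdb.d/01_schema.sql:ro
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U $$POSTGRES_USER -d $$POSTGRES_DB"]
      interval: 5s
      timeout: 5s
      retries: 10

volumes:
  pgdata:
```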
- Bootstrap the database with schema + data
  - Add scripts/init/01_schema.sql that stitches together base-data-model/tables/*.sql (or a single combined schema.sql) to create all tables.
  - Add scripts/init/02_restore.sh that loads halo-data.dump using pg_restore/psql after the DB is up (a sketch follows this list). If halo-data.dump already includes the schema, document which step comes first and whether a schema-only or data-only restore is needed.
  - Provide a Makefile (or package.json scripts) with targets: make up, make down, make reset, make psql, make restore.
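A sketch of what the restore script might look like, assuming the `db` service from the compose sketch above and a dump at the repo root. It distinguishes a custom-format dump (PGDMP magic header, restored with pg_restore) from a plain-SQL dump (fed to psql); names and paths are assumptions:

```bash
#!/usr/bin/env bash
# scripts/init/02_restore.sh — sketch only; the "db" service name and dump path are assumptions.
set -euo pipefail

# Pick up credentials from .env if present (same variables the compose file uses).
if [ -f .env ]; then set -a; . ./.env; set +a; fi
: "${POSTGRES_DB:?set in .env}"
: "${POSTGRES_USER:?set in .env}"
DUMP="${1:-halo-data.dump}"

# Wait until Postgres accepts connections before restoring.
until docker compose exec -T db pg_isready -U "$POSTGRES_USER" -d "$POSTGRES_DB" >/dev/null 2>&1; do
  echo "Waiting for Postgres..."
  sleep 2
done

# Custom-format dumps start with the "PGDMP" magic bytes and need pg_restore;
# plain-SQL dumps are piped to psql instead.
if head -c 5 "$DUMP" | grep -q '^PGDMP'; then
  docker compose exec -T db pg_restore -U "$POSTGRES_USER" -d "$POSTGRES_DB" \
    --no-owner --clean --if-exists < "$DUMP"
else
  docker compose exec -T db psql -U "$POSTGRES_USER" -d "$POSTGRES_DB" -v ON_ERROR_STOP=1 < "$DUMP"
fi

echo "Restore of $DUMP complete."
```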
- Update README with a Quickstart
  - Prereqs: Docker, Docker Compose, and optionally psql.
  - One-liners:
    - cp .env.example .env (a sample .env.example is sketched after this list)
    - docker compose up -d
    - make restore (or ./scripts/init/02_restore.sh)
  - How to connect:
    - psql "postgresql://$POSTGRES_USER:$POSTGRES_PASSWORD@localhost:5432/$POSTGRES_DB"
  - How to run the training queries:
    - psql -f self-join-training/self-join-analysis-query.sql
    - psql -f window-function-training/hot_and_cold_ratios.sql
    - psql -f window-function-training/win_and_lose_streaks.sql
    - psql -f window-function-training/rank_filter_streaks.sql
  - Link to database_model_diagram.jpg and briefly explain the main tables.
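For the cp .env.example .env step, a minimal .env.example could look like this (variable names match the compose sketch above; values are placeholders):

```
# .env.example — copy to .env and pick your own password; values here are placeholders
POSTGRES_DB=halo
POSTGRES_USER=halo
POSTGRES_PASSWORD=change-me
```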
- Add verification checks
  - Add scripts/verify.sql to sanity-check row counts per table (players, matches, match_details, medals, medals_matches_players, game_variants, maps, etc.); a sketch follows this list.
  - Document the expected counts, or at least the expected shape (non-zero rows), so users know the restore worked.
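A possible shape for the verify script, using the table names listed above (to be adjusted to whatever base-data-model/tables actually defines):

```sql
-- scripts/verify.sql — sketch; table names are taken from this issue and may need adjusting.
-- Every table should report a non-zero row count after a successful restore.
SELECT 'players' AS table_name, count(*) AS row_count FROM players
UNION ALL SELECT 'matches', count(*) FROM matches
UNION ALL SELECT 'match_details', count(*) FROM match_details
UNION ALL SELECT 'medals', count(*) FROM medals
UNION ALL SELECT 'medals_matches_players', count(*) FROM medals_matches_players
UNION ALL SELECT 'game_variants', count(*) FROM game_variants
UNION ALL SELECT 'maps', count(*) FROM maps
ORDER BY table_name;
```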
- Optional: lightweight sample dataset
  - If halo-data.dump is large, include a small sample dump (or an INSERT-based seed) to keep the Quickstart fast, and mention the full dump as an optional download step.
Acceptance criteria
- docker compose up -d followed by a single command (make restore or ./scripts/init/02_restore.sh) creates a working database with the schema and data loaded.
- README has a Quickstart section with copy/paste commands and troubleshooting notes.
- Running each SQL file in self-join-training/ and window-function-training/ succeeds without modification.
- A verify script confirms that key tables are populated.
Notes
- Please specify whether the target is strictly PostgreSQL (assumed from .dump) and any version-specific features used.
- If indexes or constraints are needed for performance with the training queries, document or add them in the schema step.