From eff16e0eecdded75d3bcb4a7cdec0df9b3d44258 Mon Sep 17 00:00:00 2001
From: Nicholas Karlson <nicholaskarlson@gmail.com>
Date: Sun, 18 Jan 2026 11:38:43 -0800
Subject: [PATCH] Docs: add Track D lab + TA notes (workbook)

---
 docs/source/workbook/index.rst                |   1 +
 docs/source/workbook/track_d_lab_ta_notes.rst | 399 ++++++++++++++++++
 2 files changed, 400 insertions(+)
 create mode 100644 docs/source/workbook/track_d_lab_ta_notes.rst

diff --git a/docs/source/workbook/index.rst b/docs/source/workbook/index.rst
index 658a4c1..3d12053 100644
--- a/docs/source/workbook/index.rst
+++ b/docs/source/workbook/index.rst
@@ -14,3 +14,4 @@ PyStatsV1 Workbook
    troubleshooting
    track_c
    track_d
+   track_d_lab_ta_notes
diff --git a/docs/source/workbook/track_d_lab_ta_notes.rst b/docs/source/workbook/track_d_lab_ta_notes.rst
new file mode 100644
index 0000000..a5dbef0
--- /dev/null
+++ b/docs/source/workbook/track_d_lab_ta_notes.rst
@@ -0,0 +1,399 @@
+Track D Lab 0/1 (PyPI-only)
+===========================
+
+TA Notes + Script for Explaining the Lab and the Outputs
+--------------------------------------------------------
+
+This handout is for a TA running a lab section where students install **PyStatsV1** from **PyPI**, initialize the
+**Track D workbook**, and run:
+
+- ``d00_peek_data`` (tour the datasets)
+- ``d01`` (Chapter 1: accounting checks + key metrics)
+- ``business_smoke`` (a short automated check suite)
+
+It includes what to say, what students should see, and how to explain the output.
+
+1. Learning goals
+=================
+
+By the end of this lab, students should be able to:
+
+1. Set up a clean Python environment (a virtual environment created with ``venv``) for reproducible analysis.
+2. Install a “batteries included” workbook from PyPI (no cloning repos).
+3. Initialize a Track D project folder that contains:
+
+   - a Track D workbook template
+   - pre-installed synthetic datasets (seed=123)
+
+4. Run a “data tour” to see what files exist and what they look like.
+5. Run an “accounting data sanity check” and interpret the outputs:
+
+   - Are entries balanced?
+   - Does the accounting equation hold?
+   - What are the basic business metrics?
+
+6. Run a lightweight test suite (``business_smoke``) as a professional habit.
+
+TA framing line
+---------------
+
+“Today isn’t about memorizing accounting terms. It’s about learning the analyst’s workflow:
+install → initialize → inspect → validate → summarize → repeat.”
+
+2. Lab structure
+================
+
+**Total time:** ~40–60 minutes
+
+1) Setup (10–15 min)
+   Create venv, upgrade pip, install ``pystatsv1[workbook]``.
+
+2) Initialize workbook (5 min)
+   Create a Track D workbook folder with datasets pre-installed.
+
+3) Explore data (10–15 min)
+   Run ``d00_peek_data``, interpret what’s in LedgerLab + NSO.
+
+4) Run first analysis/checks (10 min)
+   Run ``d01``, interpret checks + key metrics.
+
+5) Confidence check (5 min)
+   Run ``business_smoke`` and explain what “13 passed” means.
+
+3. Environment setup talk track
+===============================
+
+3.1 Why virtual environments matter (30 seconds)
+------------------------------------------------
+
+“A virtual environment is an isolated sandbox for this project’s packages. Everyone in the class runs the same commands
+in a clean environment, which prevents dependency conflicts and makes troubleshooting easier.”
+
+3.2 Commands
+------------
+
+.. code-block:: bash
+
+   python -m venv .venv
+   # Windows (Git Bash) shown below; on macOS/Linux use: source .venv/bin/activate
+   source .venv/Scripts/activate
+   python -m pip install -U pip
+   pip install "pystatsv1[workbook]"
+
+What students should notice
+---------------------------
+
+- pip upgrades successfully.
+- The install pulls scientific stack packages (NumPy/Pandas/SciPy/Statsmodels/Matplotlib…).
+- The workbook extra includes ``pytest``, which powers ``workbook check``.
+
+TA note: If installs are slow, reassure them it’s normal (large compiled wheels).
+
+4. Initialize the Track D workbook
+==================================
+
+4.1 What ``init`` does
+----------------------
+
+“``workbook init`` creates a new project folder. It copies a starter template and unpacks the datasets into a predictable
+location. You now have a ready-to-run lab workspace.”
+
+.. code-block:: bash
+
+   pystatsv1 workbook init --track d --dest track_d_workbook
+   cd track_d_workbook
+
+Students should see a message like:
+
+- “✅ Track D workbook starter created at …”
+- “Datasets are pre-installed under ``data/synthetic/``, seed=123.”
+
+4.2 Why seed=123 matters
+------------------------
+
+“Seed=123 means the synthetic datasets are deterministic. If you and I run the same scripts, we get the same numbers.
+That’s key for teaching, grading, and reproducibility.”
+
+5. List the available Track D runs
+==================================
+
+.. code-block:: bash
+
+   pystatsv1 workbook list --track d
+
+Explain the list
+----------------
+
+- Each ``dNN`` name corresponds to a chapter or checkpoint.
+- ``d00_peek_data`` is the dataset tour.
+- ``d01`` is the runner for the first content chapter (Chapter 1).
+- Later runs (``d02``–``d23``) provide a consistent “run menu” over the course.
+
+TA line: “You can think of this as a menu of mini-programs: run, inspect outputs, then modify and extend.”
+
+6. Run ``d00_peek_data`` (data tour)
+====================================
+
+.. code-block:: bash
+
+   pystatsv1 workbook run d00_peek_data
+
+6.1 What ``d00_peek_data`` is doing
+-----------------------------------
+
+Explain it as three steps:
+
+1. Locate datasets under ``data/synthetic/…``
+2. Read each CSV and print:
+
+   - file name
+   - number of rows/columns
+   - column names
+   - a small preview
+
+3. Write a Markdown summary file:
+
+   - ``outputs/track_d/d00_peek_data_summary.md``
+
+TA line: “Before statistics, confirm what data exists and what shape it’s in.”
+
+6.2 Two datasets: LedgerLab vs NSO
+----------------------------------
+
+LedgerLab (Ch01)
+^^^^^^^^^^^^^^^^
+
+LedgerLab is a small “training wheels” business dataset you can trace end-to-end:
+
+- ``chart_of_accounts.csv`` (account dictionary)
+- ``gl_journal.csv`` (debit/credit lines by transaction)
+- ``trial_balance_monthly.csv`` (monthly balances)
+- ``statements_is_monthly.csv`` (income statement)
+- ``statements_bs_monthly.csv`` (balance sheet)
+- ``statements_cf_monthly.csv`` (cash flow)
+
+TA point: “LedgerLab helps you trace journal → trial balance → statements.”
+
+NSO v1 running case
+^^^^^^^^^^^^^^^^^^^
+
+NSO is the “bigger business system” with multiple subledgers and derived outputs:
+
+- ``bank_statement.csv`` (includes a deliberately duplicated ID)
+- ``ar_events.csv`` / ``ap_events.csv``
+- ``inventory_movements.csv``
+- ``payroll_events.csv``
+- ``sales_tax_events.csv``
+- ``fixed_assets.csv`` + ``depreciation_schedule.csv``
+- ``debt_schedule.csv``
+- plus statement/trial balance outputs
+
+TA point: “NSO is designed to feel like real company data: multiple sources and common quality issues.”
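+
+The deliberately duplicated ID in ``bank_statement.csv`` is the kind of issue a quick duplicate count can surface. The
+sketch below uses made-up ID values, and ``find_duplicates`` is an illustrative helper, not part of PyStatsV1:
+
+.. code-block:: python
+
+   from collections import Counter
+
+   def find_duplicates(ids):
+       """Return the IDs that appear more than once, in sorted order."""
+       counts = Counter(ids)
+       return sorted(i for i, n in counts.items() if n > 1)
+
+   # Hypothetical bank transaction IDs; "B002" is repeated on purpose.
+   dupes = find_duplicates(["B001", "B002", "B002", "B003"])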
+
+6.3 Key columns to explain
+--------------------------
+
+``chart_of_accounts.csv``
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+- ``account_type``: Asset, Liability, Equity, Revenue, Expense, Contra Asset
+- ``normal_side``:
+
+  - Assets/Expenses normally **Debit**
+  - Liabilities/Equity/Revenue (and Contra Assets) normally **Credit**
+
+TA line: “Normal side tells you which side increases the account; it’s the system’s sign convention.”
+
+``gl_journal.csv``
+^^^^^^^^^^^^^^^^^^
+
+- ``txn_id`` groups multiple lines into one transaction.
+- Each transaction should balance: sum(debits) = sum(credits).
+
+TA line: “A transaction is a mini-equation: where value came from and where it went.”
+
+Statements and trial balance
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+- The trial balance is database-style output: account-level balances in table form.
+- Statements are human-facing summaries.
+
+TA line: “The trial balance is the structured summary of the ledger; statements are the story.”
+
+7. Run ``d01`` (Chapter 1 checks + key metrics)
+===============================================
+
+.. code-block:: bash
+
+   pystatsv1 workbook run d01
+
+The run prints **Checks** and **Key metrics**, then writes artifacts under ``outputs/track_d/``.
+
+7.1 Checks (what they mean)
+---------------------------
+
+``transactions_balanced: True``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Every transaction’s total debits equal its total credits.
+
+TA line: “If this fails, you fix the data pipeline before analysis.”
+
+``n_transactions``
+^^^^^^^^^^^^^^^^^^
+
+Count of transaction groups (distinct ``txn_id`` values) in the LedgerLab data used for ``d01``.
+
+``n_unbalanced`` and ``max_abs_diff``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+- ``n_unbalanced``: number of transactions with debits ≠ credits
+- ``max_abs_diff``: largest absolute imbalance amount
+
+TA line: “If max_abs_diff is nonzero, we have an integrity error.”
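+
+As a rough illustration of what these checks compute, the sketch below groups journal lines by ``txn_id`` and compares
+debit and credit totals. The ``debit`` and ``credit`` field names, the tolerance, the sample rows, and the
+``balance_report`` helper are assumptions for illustration; they are not taken from ``d01`` itself.
+
+.. code-block:: python
+
+   from collections import defaultdict
+
+   def balance_report(lines, tol=0.005):
+       """Summarize per-transaction debit/credit balance (illustrative helper)."""
+       diffs = defaultdict(float)
+       for line in lines:
+           # Debits add to the running difference, credits subtract from it.
+           diffs[line["txn_id"]] += line["debit"] - line["credit"]
+       abs_diffs = [abs(d) for d in diffs.values()]
+       n_unbalanced = sum(1 for d in abs_diffs if d > tol)
+       return {
+           "n_transactions": len(diffs),
+           "n_unbalanced": n_unbalanced,
+           "max_abs_diff": max(abs_diffs, default=0.0),
+           "transactions_balanced": n_unbalanced == 0,
+       }
+
+   # Hypothetical journal lines: T1 balances, T2 is off by 5.00.
+   sample = [
+       {"txn_id": "T1", "debit": 100.0, "credit": 0.0},
+       {"txn_id": "T1", "debit": 0.0, "credit": 100.0},
+       {"txn_id": "T2", "debit": 50.0, "credit": 0.0},
+       {"txn_id": "T2", "debit": 0.0, "credit": 45.0},
+   ]
+   report = balance_report(sample)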
+
+``accounting_equation_balances: True``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Total assets equal total liabilities plus equity (system-wide sanity check).
+
+TA line: “Even if each transaction balances, you still want the big equation to hold.”
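+
+A minimal sketch of the system-wide check, assuming the balance sheet totals are already available as plain numbers
+(the ``equation_holds`` helper, its tolerance, and the sample figures are hypothetical):
+
+.. code-block:: python
+
+   import math
+
+   def equation_holds(assets, liabilities, equity, tol=0.01):
+       """Return True when assets = liabilities + equity within a small tolerance."""
+       return math.isclose(assets, liabilities + equity, abs_tol=tol)
+
+   # Made-up totals: the first set balances, the second is off by 500.
+   ok = equation_holds(assets=12500.0, liabilities=4000.0, equity=8500.0)
+   broken = equation_holds(assets=12500.0, liabilities=4000.0, equity=8000.0)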
+
+7.2 Key metrics (how to interpret)
+----------------------------------
+
+Revenue and sales behavior
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+- ``sales_total``: total sales
+- ``n_sales``: number of sales events
+- ``avg_sale``: average sale size
+- ``pct_sales_on_account``: share of sales made on credit (on account)
+
+TA line: “Accounts receivable (A/R) exists because not all sales are paid immediately. This hints at liquidity risk.”
+
+Cost and margin
+^^^^^^^^^^^^^^^
+
+- ``cogs_total``: cost of goods sold
+- ``gross_profit`` = ``sales_total`` − ``cogs_total``
+- ``gross_margin_pct`` = ``gross_profit`` / ``sales_total``
+
+TA line: “Gross margin is a core health metric and often a driver variable later.”
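+
+The arithmetic can be sketched as follows, assuming the totals are plain numbers (the ``margin_metrics`` helper and the
+sample figures are hypothetical, not output from ``d01``):
+
+.. code-block:: python
+
+   def margin_metrics(sales_total, cogs_total):
+       """Compute gross profit and gross margin as a fraction of sales."""
+       gross_profit = sales_total - cogs_total
+       # Guard against division by zero when there are no sales.
+       gross_margin_pct = gross_profit / sales_total if sales_total else float("nan")
+       return {"gross_profit": gross_profit, "gross_margin_pct": gross_margin_pct}
+
+   # Made-up totals chosen so the margin comes out near the 45% used in the discussion prompts.
+   metrics = margin_metrics(sales_total=2000.0, cogs_total=1100.0)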
+
+Net income and cash
+^^^^^^^^^^^^^^^^^^^
+
+- ``net_income`` may be negative
+- ``ending_cash`` may still be positive
+
+Teaching moment
+^^^^^^^^^^^^^^^
+
+“Profit and cash are not the same thing. You can lose money but still have cash (owner contributions, timing).
+You can also earn profit and run out of cash.”
+
+8. Outputs (what to open)
+=========================
+
+Outputs are written under:
+
+- ``outputs/track_d/``
+
+Students should open:
+
+- ``outputs/track_d/d00_peek_data_summary.md`` (readable dataset inventory)
+- Any CSV artifacts written by the runs (trial balance, statements, etc.)
+
+TA line: “In real work, reproducible artifacts matter more than console output.”
+
+9. Run the smoke tests (``business_smoke``)
+===========================================
+
+.. code-block:: bash
+
+   pystatsv1 workbook check business_smoke
+
+Students should see something like:
+
+- “13 passed …”
+
+Explain plainly
+---------------
+
+“These are automated checks that verify the workbook behaves as promised: commands run, outputs appear, and key
+invariants stay true. Passing tests means your lab environment is healthy.”
+
+10. Common issues and quick fixes
+=================================
+
+Command not found
+-----------------
+
+If ``pystatsv1`` isn’t recognized, use module form:
+
+.. code-block:: bash
+
+   python -m pystatsv1 workbook --help
+
+Wrong folder
+------------
+
+If outputs or data can’t be found, confirm students are running commands from inside the workbook folder:
+
+.. code-block:: bash
+
+   pwd
+   ls
+
+Reset everything
+----------------
+
+.. code-block:: bash
+
+   pystatsv1 workbook run d00_setup_data --force
+   pystatsv1 workbook run d00_peek_data
+
+Confusion about negative income
+-------------------------------
+
+Teaching moment: owner contributions are cash inflows but not revenue.
+Show them the contribution entry in ``gl_journal.csv`` and compare to sales lines.
+
+11. Discussion prompts (if time)
+================================
+
+1. Why is ``pct_sales_on_account`` not zero? What do credit sales imply about cash planning?
+2. Gross margin of about 45%: what types of businesses might fit?
+3. Net income negative but cash positive: what events create that pattern?
+4. NSO includes a deliberate duplicate bank transaction ID: why include intentional errors?
+
+12. Closing script (30 seconds)
+===============================
+
+“Today you proved you can set up a reproducible environment, inspect accounting-style datasets, validate integrity
+constraints, and generate a first business summary. That’s the workflow: make data trustworthy before analyzing it.
+Next labs build on this foundation toward statistical reasoning and decision support.”
+
+Appendix A: Command block (TA slide)
+====================================
+
+.. code-block:: bash
+
+   # Setup (once)
+   python -m venv .venv
+   source .venv/Scripts/activate  # Windows (Git Bash); macOS/Linux: source .venv/bin/activate
+   python -m pip install -U pip
+   pip install "pystatsv1[workbook]"
+
+   # Start Track D
+   pystatsv1 workbook init --track d --dest track_d_workbook
+   cd track_d_workbook
+
+   # Tour + first checks
+   pystatsv1 workbook list --track d
+   pystatsv1 workbook run d00_peek_data
+   pystatsv1 workbook run d01
+
+   # Confidence check
+   pystatsv1 workbook check business_smoke