diff --git a/docs/source/workbook/index.rst b/docs/source/workbook/index.rst index 5ecbcc2..56d00a2 100644 --- a/docs/source/workbook/index.rst +++ b/docs/source/workbook/index.rst @@ -18,4 +18,5 @@ PyStatsV1 Workbook track_d_dataset_map track_d_outputs_guide track_d_my_own_data + track_d_assignments track_d_lab_ta_notes \ No newline at end of file diff --git a/docs/source/workbook/track_d_assignments.rst b/docs/source/workbook/track_d_assignments.rst new file mode 100644 index 0000000..8c10054 --- /dev/null +++ b/docs/source/workbook/track_d_assignments.rst @@ -0,0 +1,208 @@ +.. _track_d_assignments: + +========================================= +Track D assignments: labs + rubric (TA) +========================================= + +This page turns Track D into a **gradeable lab sequence**. + +- Students produce a small set of files under ``outputs/track_d/``. +- Students answer a few short prompts (and write one short memo). +- TAs can grade quickly with a clear checklist + rubric. + +If students are new to Track D, point them to: + +- :doc:`track_d_student_edition` +- :doc:`track_d_outputs_guide` (what outputs mean) +- :doc:`track_d_dataset_map` (tables, keys, and intentional QC “warts”) + +Prerequisites (students) +======================== + +Students should start from a fresh Track D workbook folder created by:: + + pystatsv1 workbook init --track d --dest track_d_workbook + cd track_d_workbook + +All labs below assume the default Track D scripts in that folder. + +Submission format +================= + +Ask students to submit **one ZIP file** containing: + +1) ``outputs/track_d/`` (artifacts) +2) ``answers.md`` (short answers for Labs 1–3) + +Suggested ZIP command (from the workbook folder):: + + # Windows PowerShell (inside track_d_workbook) + Compress-Archive -Path outputs/track_d,answers.md -DestinationPath track_d_submission.zip -Force + + # macOS/Linux (inside track_d_workbook) + zip -r track_d_submission.zip outputs/track_d answers.md + +Lab 1 — Run + interpret outputs +=============================== + +Goal +---- + +Students can run Track D, locate outputs, and explain the key metrics in plain language. + +Run +--- + +From the Track D workbook folder:: + + pystatsv1 workbook run d00_peek_data + pystatsv1 workbook run d01 + +Required artifacts +------------------ + +Students must include these files in ``outputs/track_d/``: + +- ``d00_peek_data_summary.md`` +- ``business_ch01_summary.json`` +- ``business_ch01_income_statement.png`` +- ``business_ch01_balance_sheet.png`` + +Short answers (write in ``answers.md``) +--------------------------------------- + +Answer in 1–3 sentences each. + +1) **Where do outputs land?** + - Give the exact folder path (relative to the workbook). + +2) **Net income can be negative. Why?** + - Explain one realistic reason a startup month might show negative net income. + +3) **Gross margin, in plain language.** + - What is it, and why do analysts care? + +4) **What does ``pct_sales_on_account`` mean?** + - Explain it as if you were speaking to a non-accountant manager. + +5) **What good looks like (sanity checks).** + - List 2 checks you can run (or observe) that increase confidence you didn’t break the data. + +TA quick-check (optional) +------------------------- + +For the stock Track D case, ``d01`` prints a small “sanity summary” to the terminal +that includes ``n_transactions`` and whether the accounting equation balances. +Students do **not** need exact matching numbers, but their explanations should be consistent. + +Lab 2 — QC detective: duplicate bank transaction IDs +==================================================== + +Goal +---- + +Students demonstrate “accounting-data analyst” habits: +identify a QC issue, explain why it matters, and propose a defensible fix. + +Task +---- + +Using the NSO v1 bank statement table (from the Track D dataset), find a duplicated +transaction identifier and document it. + +Students may use any method (spreadsheet filter, Python, etc.). +Here is an optional Python snippet they can adapt:: + + import pandas as pd + p = "data/synthetic/nso_v1/bank_statement.csv" + df = pd.read_csv(p) + dup = df[df.duplicated("bank_txn_id", keep=False)].sort_values("bank_txn_id") + print(dup[["bank_txn_id","date","amount","description"]].head(20)) + +Required artifacts +------------------ + +Students must include in ``outputs/track_d/``: + +- ``lab2_qc_report.md`` (their QC write-up) + +Their report must include: + +- the duplicated ``bank_txn_id`` value(s) +- how many rows were duplicated +- why the duplication matters (what could go wrong downstream) +- a proposed fix policy (example: “drop exact duplicates”, or “keep the earliest posted row”) +- one “before vs after” check (example: total cash inflow/outflow changes, or count of transactions) + +Short answers (write in ``answers.md``) +--------------------------------------- + +1) **What did you find?** + - One sentence summary of the duplicate. + +2) **What could break if we ignore it?** + - Name one analysis result that could be distorted. + +3) **What fix policy did you choose and why?** + +Lab 3 — Executive summary memo +============================== + +Goal +---- + +Students turn outputs into a clear, decision-friendly narrative. + +Prompt +------ + +Write a **250–400 word** memo (in ``answers.md``) titled: + +**“Track D Executive Summary (Month 1)”** + +Use the Lab 1 outputs (plots + JSON summary) and include: + +- 2–3 key facts about sales, gross margin, and expenses +- net income interpretation (including why it may be negative) +- 1 balance sheet insight (example: cash position, liabilities, equity) +- 2 “next questions” you’d ask before making a real decision + +Rubric (100 points) +=================== + +This rubric is designed to be fast to grade. + +Artifacts + reproducibility (30) +-------------------------------- + +- (10) Required Lab 1 artifacts present in ``outputs/track_d/`` +- (10) Lab 2 QC report present and complete +- (10) Student’s ``answers.md`` is organized with headings for Lab 1–3 + +Interpretation of outputs (30) +------------------------------ + +- (10) Net income explanation is realistic and correctly framed +- (10) Gross margin explained correctly (and why it matters) +- (10) ``pct_sales_on_account`` explained correctly and connected to business meaning + +QC reasoning (20) +----------------- + +- (10) Duplicate ID issue clearly demonstrated with evidence +- (10) Fix policy is defensible; “before vs after” check is included + +Communication quality (20) +-------------------------- + +- (10) Executive memo is clear, concise, and decision-oriented +- (10) Writing uses plain language, defines terms, and avoids jargon where possible + +Extensions (optional) +===================== + +If you want to stretch advanced students, ask for one extra artifact: + +- a small table (CSV) that summarizes totals by month (or by account) + +But the core labs above are intentionally minimal so students can complete them quickly. diff --git a/docs/source/workbook/track_d_lab_ta_notes.rst b/docs/source/workbook/track_d_lab_ta_notes.rst index dca620b..da5bfbb 100644 --- a/docs/source/workbook/track_d_lab_ta_notes.rst +++ b/docs/source/workbook/track_d_lab_ta_notes.rst @@ -22,6 +22,23 @@ If you have 5 minutes before lab, skim these pages: * :doc:`track_d_dataset_map` — the table mental model + intentional QC “warts” * :doc:`track_d_outputs_guide` — what each generated output means and how to interpret it * :doc:`track_d_my_own_data` — a practical "bring your own data" bridge (30-minute recipe) +* :doc:`track_d_assignments` — classroom-ready labs + rubric (gradeable artifacts + short answers) + +How to use the assignments page +------------------------------- + +If you want Track D to be **turnkey as a lab**, point students at: + +* :doc:`track_d_assignments` + +That page is designed to be self-contained: + +* **Lab 1**: run Track D and interpret outputs +* **Lab 2**: reconcile a teachable QC issue (duplicate bank transaction id) +* **Lab 3**: write a short executive summary memo from the outputs + +The rubric is intended to reward analysis habits (contracts, checks, reconciliation, and clear writing), +not just code execution. 1. Learning goals ================= diff --git a/docs/source/workbook/track_d_student_edition.rst b/docs/source/workbook/track_d_student_edition.rst index cc669da..1ebd6be 100644 --- a/docs/source/workbook/track_d_student_edition.rst +++ b/docs/source/workbook/track_d_student_edition.rst @@ -53,6 +53,7 @@ These pages live inside the workbook documentation subtree (they build cleanly o - :doc:`track_d_dataset_map` — the **dataset mental model** (what each table is, how they relate, and why some rows are “warts” on purpose) - :doc:`track_d_outputs_guide` — how to read the outputs folders and key CSV artifacts - :doc:`track_d_my_own_data` — bring your own exports: a 30-minute "one table + one chart + one check" recipe +- :doc:`track_d_assignments` — **student labs + rubric** (gradeable artifacts + short answers) - :doc:`track_d_lab_ta_notes` — a lab handout + TA notes (walkthrough + interpretation) What you are building (the pipeline)