feat: mkdocs/readthedocs #22
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,21 @@ | ||
| name: Read the Docs PR preview | ||
|
|
||
| on: | ||
| pull_request_target: | ||
| types: | ||
| - opened | ||
| paths: | ||
| - "docs/**" | ||
| - "mkdocs.yml" | ||
| - ".readthedocs.yaml" | ||
|
|
||
| permissions: | ||
| pull-requests: write | ||
|
|
||
| jobs: | ||
| documentation-links: | ||
| runs-on: ubuntu-latest | ||
| steps: | ||
| - uses: readthedocs/actions/preview@v1 | ||
| with: | ||
| project-slug: "fgmetric" | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,21 @@ | ||
| version: 2 | ||
|
|
||
| build: | ||
| os: ubuntu-24.04 | ||
| tools: | ||
| python: "3.12" | ||
| jobs: | ||
| pre_create_environment: | ||
| - asdf plugin add uv | ||
| - asdf install uv latest | ||
| - asdf global uv latest | ||
| create_environment: | ||
| - uv venv "${READTHEDOCS_VIRTUALENV_PATH}" | ||
| install: | ||
| - UV_PROJECT_ENVIRONMENT="${READTHEDOCS_VIRTUALENV_PATH}" uv sync --frozen --group docs | ||
| build: | ||
| html: | ||
| - uv run --frozen mkdocs build --strict --site-dir "${READTHEDOCS_OUTPUT}/html" | ||
|
|
||
| mkdocs: | ||
| configuration: mkdocs.yml |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,7 @@ | ||
| # Metric | ||
|
|
||
| ::: fgmetric.Metric | ||
| options: | ||
| show_bases: false | ||
| members: | ||
| - read |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,10 @@ | ||
| # MetricWriter | ||
|
|
||
| ::: fgmetric.MetricWriter | ||
| options: | ||
| show_bases: false | ||
| members: | ||
| - __init__ | ||
| - write | ||
| - writeall | ||
| - close |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| --8<-- "CONTRIBUTING.md" |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,148 @@ | ||
| # User Guide | ||
|
|
||
| ## Defining a Metric | ||
|
|
||
| A `Metric` is a Pydantic model that maps to rows in a delimited file. | ||
| Define one by subclassing `Metric` and declaring fields with type annotations: | ||
|
|
||
| ```python | ||
| from fgmetric import Metric | ||
|
|
||
|
|
||
| class AlignmentMetric(Metric): | ||
| read_name: str | ||
| mapping_quality: int | ||
| is_duplicate: bool = False | ||
| ``` | ||
|
|
||
| Each field corresponds to a column in your file. | ||
| Pydantic handles type coercion automatically --- string values like `"60"` become `int`, `"true"` becomes `bool`, and so on. | ||
|
|
||
| ## Reading Metrics | ||
|
|
||
| Use the `read()` class method to iterate over rows in a delimited file: | ||
|
|
||
| ```python | ||
| from pathlib import Path | ||
|
|
||
| for metric in AlignmentMetric.read(Path("alignments.tsv")): | ||
| print(f"{metric.read_name}: MQ={metric.mapping_quality}") | ||
| ``` | ||
|
|
||
| `read()` yields one `Metric` instance per row, lazily --- so you can process files larger than memory. | ||
| If any row fails validation, Pydantic raises a `ValidationError` with details about which field failed and why. | ||
|
|
||
| Example input file (`alignments.tsv`): | ||
|
|
||
| ```tsv | ||
| read_name mapping_quality is_duplicate | ||
| read1 60 false | ||
| read2 30 true | ||
| ``` | ||
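The lazy, row-at-a-time behaviour described above can be sketched with the standard library alone. This is only an illustration of the idea, not fgmetric's implementation: `iter_rows` is a hypothetical helper, the file is replaced by an in-memory buffer, and the values stay as plain strings because no Pydantic validation is applied here.

```python
import csv
import io
from typing import Iterator, TextIO


def iter_rows(handle: TextIO, delimiter: str = "\t") -> Iterator[dict[str, str]]:
    """Yield one dict per data row, pulling lines from the handle on demand."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    for row in reader:
        yield row


# In-memory stand-in for the alignments.tsv example above.
example = io.StringIO(
    "read_name\tmapping_quality\tis_duplicate\n"
    "read1\t60\tfalse\n"
    "read2\t30\ttrue\n"
)

rows = iter_rows(example)
first = next(rows)  # nothing is parsed until a row is requested
second = next(rows)
```

Because the generator never builds a list, memory use stays flat no matter how many rows the file holds.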
|
|
||
| ### Custom Delimiters | ||
|
|
||
| By default, `read()` expects tab-separated values. Pass a `delimiter` argument for other formats: | ||
|
|
||
| ```python | ||
| # Reading CSV files | ||
| for metric in AlignmentMetric.read(Path("data.csv"), delimiter=","): | ||
| ... | ||
| ``` | ||
|
|
||
| ### Empty Fields | ||
|
|
||
| Empty fields in optional columns are automatically converted to `None`: | ||
|
|
||
| ```python | ||
| class QualityMetric(Metric): | ||
| sample: str | ||
| score: float | None # Empty string in file becomes None | ||
| ``` | ||
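As a rough picture of what the empty-field rule amounts to, the sketch below maps empty cells to `None` before any type conversion would run. The helper name `empty_to_none` is hypothetical; the diff does not show how fgmetric performs this step internally.

```python
from typing import Optional


def empty_to_none(cell: str) -> Optional[str]:
    """Map an empty cell to None; leave any other text untouched."""
    return None if cell == "" else cell


# One parsed row in which the optional "score" column was left blank.
row = {"sample": "s1", "score": ""}
cleaned = {name: empty_to_none(cell) for name, cell in row.items()}
```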
|
|
||
| ## Writing Metrics | ||
|
|
||
| Use `MetricWriter` as a context manager to write metrics to a file: | ||
|
|
||
| ```python | ||
| from fgmetric import MetricWriter | ||
|
|
||
| metrics = [ | ||
| AlignmentMetric(read_name="read1", mapping_quality=60), | ||
| AlignmentMetric(read_name="read2", mapping_quality=30, is_duplicate=True), | ||
| ] | ||
|
|
||
| with MetricWriter(AlignmentMetric, Path("output.tsv")) as writer: | ||
| writer.writeall(metrics) | ||
| ``` | ||
|
Comment on lines +67 to +77

🧩 Analysis chain

🏁 Script executed: `cd "$(git rev-parse --show-toplevel)" && cat -n docs/guide.md | sed -n '67,77p'`
Repository: fg-labs/fgmetric. Length of output: 436

🏁 Script executed: `cd "$(git rev-parse --show-toplevel)" && rg "class AlignmentMetric" --type python`
Repository: fg-labs/fgmetric. Length of output: 90

🏁 Script executed: `cd "$(git rev-parse --show-toplevel)" && rg "class AlignmentMetric"`
Repository: fg-labs/fgmetric. Length of output: 313

🏁 Script executed: `cd "$(git rev-parse --show-toplevel)" && cat -n fgmetric/__init__.py`
Repository: fg-labs/fgmetric. Length of output: 240

🏁 Script executed: `cd "$(git rev-parse --show-toplevel)" && cat -n docs/guide.md | sed -n '1,67p' | tail -20`
Repository: fg-labs/fgmetric. Length of output: 614

🏁 Script executed: `cd "$(git rev-parse --show-toplevel)" && rg "AlignmentMetric" docs/guide.md -B 5 | head -50`
Repository: fg-labs/fgmetric. Length of output: 949

Make the writing example self-contained.

Lines 75-76 use …

Proposed change

```diff
+from pathlib import Path
+from fgmetric import Metric
+
+
+class AlignmentMetric(Metric):
+    read_name: str
+    mapping_quality: int
+    is_duplicate: bool = False
+
+
 from fgmetric import MetricWriter
 metrics = [
```

Alternatively, if the intent is to keep the class definition separate, import it from the earlier section (though this assumes readers have already executed that code block):

```diff
+from pathlib import Path
+from fgmetric import Metric
+
+class AlignmentMetric(Metric):
+    read_name: str
+    mapping_quality: int
+    is_duplicate: bool = False
+
 from fgmetric import MetricWriter
 metrics = [
```
|
||
|
|
||
| The writer automatically outputs the header row based on the Metric's field names. | ||
| You can also write one metric at a time with `writer.write(metric)`. | ||
|
|
||
| ### Custom Delimiters | ||
|
|
||
| Just like reading, writing supports custom delimiters: | ||
|
|
||
| ```python | ||
| with MetricWriter(AlignmentMetric, Path("output.csv"), delimiter=",") as writer: | ||
| writer.writeall(metrics) | ||
| ``` | ||
|
|
||
| ## List Fields | ||
|
|
||
| Fields typed as `list[T]` are automatically parsed from and serialized to delimited strings: | ||
|
|
||
| ```python | ||
| class TaggedRead(Metric): | ||
| read_id: str | ||
| tags: list[str] # "A,B,C" becomes ["A", "B", "C"] | ||
| scores: list[int] # "1,2,3" becomes [1, 2, 3] | ||
| optional_tags: list[str] | None # "" becomes None | ||
| ``` | ||
|
|
||
| The list delimiter defaults to `,` but can be customized per-metric with the `collection_delimiter` class variable: | ||
|
|
||
| ```python | ||
| class SemicolonMetric(Metric): | ||
| collection_delimiter = ";" | ||
| values: list[int] # "1;2;3" becomes [1, 2, 3] | ||
| ``` | ||
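The round trip between a delimited cell and a list can be sketched in a few lines of plain Python. The helpers `parse_int_list` and `format_int_list` are hypothetical names used for illustration, and the `delimiter` argument stands in for the `collection_delimiter` class variable; empty cells are deliberately not handled in this sketch.

```python
def parse_int_list(cell: str, delimiter: str = ",") -> list[int]:
    """Split a delimited cell such as "1;2;3" into a list of integers."""
    return [int(item) for item in cell.split(delimiter)]


def format_int_list(values: list[int], delimiter: str = ",") -> str:
    """Join integers back into a single delimited cell."""
    return delimiter.join(str(value) for value in values)


parsed = parse_int_list("1;2;3", delimiter=";")
written = format_int_list(parsed, delimiter=";")
```

Note that the collection delimiter has to differ from the column delimiter, otherwise a list cell cannot be told apart from several columns without quoting.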
|
|
||
| ## Counter Fields | ||
|
|
||
| When your file has categorical data with one column per category (e.g. base counts A, C, G, T), you can model them as a single `Counter[StrEnum]` field: | ||
|
|
||
| ```python | ||
| from collections import Counter | ||
| from enum import StrEnum | ||
|
|
||
| from fgmetric import Metric | ||
|
|
||
|
|
||
| class Base(StrEnum): | ||
| A = "A" | ||
| C = "C" | ||
| G = "G" | ||
| T = "T" | ||
|
|
||
|
|
||
| class BaseCountMetric(Metric): | ||
| position: int | ||
| counts: Counter[Base] | ||
| ``` | ||
|
|
||
| Given an input file: | ||
|
|
||
| ```tsv | ||
| position A C G T | ||
| 1 10 5 3 2 | ||
| ``` | ||
|
|
||
| This parses to: | ||
|
|
||
| ```python | ||
| BaseCountMetric(position=1, counts=Counter({Base.A: 10, Base.C: 5, Base.G: 3, Base.T: 2})) | ||
| ``` | ||
|
|
||
| The enum members define both the expected column names and the keys in the resulting `Counter`. | ||
| When writing, the `Counter` is "pivoted" back into separate columns. | ||
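The folding and pivoting described above can be sketched with the standard library. This is an illustration under stated assumptions, not fgmetric's code: `fold_counts` and `pivot_counts` are hypothetical helpers, and `Base` is declared as a `str`-mixin `Enum` (rather than `StrEnum`) only so the sketch also runs on Python versions before 3.11.

```python
from collections import Counter
from enum import Enum


class Base(str, Enum):
    A = "A"
    C = "C"
    G = "G"
    T = "T"


def fold_counts(row: dict[str, str]) -> Counter:
    """Collect the per-base columns of one row into a single Counter."""
    return Counter({base: int(row[base.value]) for base in Base})


def pivot_counts(counts: Counter) -> dict[str, str]:
    """Spread a Counter back out into one column per enum member."""
    return {base.value: str(counts[base]) for base in Base}


# One row of the example file above, as a header-to-cell mapping.
row = {"position": "1", "A": "10", "C": "5", "G": "3", "T": "2"}
counts = fold_counts(row)
columns = pivot_counts(counts)
```

Iterating over the enum in both directions is what keeps the column order stable between reading and writing.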
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,60 @@ | ||
| # fgmetric | ||
|
|
||
| [](https://github.com/fulcrumgenomics/fgmetric/actions/workflows/python_package.yml?query=branch%3Amain) | ||
| [](https://github.com/fulcrumgenomics/fgmetric) | ||
| [](http://mypy-lang.org/) | ||
|
|
||
| **Type-validated Python models for delimited data files.** | ||
|
|
||
| `fgmetric` lets you define Python classes ("Metrics") that map directly to rows in CSV/TSV files. | ||
| It handles parsing, type coercion (strings to int, float, bool), and validation automatically using [Pydantic](https://docs.pydantic.dev/latest/). | ||
|
|
||
| ## Installation | ||
|
|
||
| === "pip" | ||
|
|
||
| ```console | ||
| pip install fgmetric | ||
| ``` | ||
|
|
||
| === "uv" | ||
|
|
||
| ```console | ||
| uv add fgmetric | ||
| ``` | ||
|
|
||
| ## Quick Example | ||
|
|
||
| Define a class to represent each row, then read or write: | ||
|
|
||
| ```python | ||
| from pathlib import Path | ||
| from fgmetric import Metric, MetricWriter | ||
|
|
||
|
|
||
| class AlignmentMetric(Metric): | ||
| read_name: str | ||
| mapping_quality: int | ||
| is_duplicate: bool = False | ||
|
|
||
|
|
||
| # Reading | ||
| for metric in AlignmentMetric.read(Path("alignments.tsv")): | ||
| print(f"{metric.read_name}: MQ={metric.mapping_quality}") | ||
|
|
||
| # Writing | ||
| metrics = [ | ||
| AlignmentMetric(read_name="read1", mapping_quality=60), | ||
| AlignmentMetric(read_name="read2", mapping_quality=30, is_duplicate=True), | ||
| ] | ||
| with MetricWriter(AlignmentMetric, Path("output.tsv")) as writer: | ||
| writer.writeall(metrics) | ||
| ``` | ||
|
|
||
| ## Why fgmetric? | ||
|
|
||
| - **vs. csv + dataclasses** --- Automatic type coercion and validation without boilerplate. Built on Pydantic, so custom validators and serializers can be readily added. | ||
| - **vs. pandas** --- Processes records lazily, handling files larger than memory. Metrics are type-validated and can be made immutable. | ||
| - **vs. Pydantic alone** --- Handles CSV/TSV specifics (header parsing, delimiter configuration) and provides out-of-the-box features like empty value handling and Counter field pivoting. | ||
|
|
||
| [Get started with the User Guide](guide.md) |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,63 @@ | ||
| site_name: fgmetric | ||
| site_url: https://fgmetric.readthedocs.io/en/stable/ | ||
|
🧩 Analysis chain

💡 Result: For MkDocs projects on Read the Docs (including versioned builds), configure …

    # mkdocs.yml
    site_url: !ENV READTHEDOCS_CANONICAL_URL

Read the Docs computes and exposes …

Use Read the Docs' canonical URL environment variable instead of hardcoding the stable URL.

Hardcoding …

Proposed change

    -site_url: https://fgmetric.readthedocs.io/en/stable/
    +site_url: !ENV [READTHEDOCS_CANONICAL_URL, "https://fgmetric.readthedocs.io/en/stable/"]
|
||
| site_description: Type-validated Python models for delimited data files. | ||
| repo_name: fulcrumgenomics/fgmetric | ||
| repo_url: https://github.com/fulcrumgenomics/fgmetric | ||
|
|
||
| nav: | ||
| - Home: index.md | ||
| - User Guide: guide.md | ||
| - API Reference: | ||
| - Metric: api/metric.md | ||
| - MetricWriter: api/metric_writer.md | ||
| - Benchmarks: benchmarks.md | ||
| - Contributing: contributing.md | ||
|
|
||
| theme: | ||
| name: material | ||
| palette: | ||
| - media: "(prefers-color-scheme: light)" | ||
| scheme: default | ||
| toggle: | ||
| icon: material/brightness-7 | ||
| name: Switch to dark mode | ||
| - media: "(prefers-color-scheme: dark)" | ||
| scheme: slate | ||
| toggle: | ||
| icon: material/brightness-4 | ||
| name: Switch to light mode | ||
| features: | ||
| - content.code.copy | ||
| - navigation.instant | ||
| - navigation.instant.progress | ||
| - navigation.sections | ||
| - navigation.top | ||
|
|
||
| markdown_extensions: | ||
| - admonition | ||
| - pymdownx.details | ||
| - pymdownx.highlight: | ||
| anchor_linenums: true | ||
| - pymdownx.superfences | ||
| - pymdownx.tabbed: | ||
| alternate_style: true | ||
| - pymdownx.snippets: | ||
| base_path: ["."] | ||
| - toc: | ||
| permalink: true | ||
|
|
||
| plugins: | ||
| - search | ||
| - mkdocstrings: | ||
| default_handler: python | ||
| handlers: | ||
| python: | ||
| options: | ||
| docstring_style: google | ||
| merge_init_into_class: true | ||
| members_order: source | ||
| separate_signature: true | ||
| show_root_heading: true | ||
| show_source: false | ||
| show_symbol_type_heading: true | ||
| show_symbol_type_toc: true | ||
🧩 Analysis chain

🌐 Web query: For GitHub Actions pull_request_target workflows, which event types are needed so jobs rerun when new commits are pushed to an existing PR?

💡 Result: To have a `pull_request_target` workflow re-run when new commits are pushed to an existing PR, you need the `synchronize` activity type (that's the PR action GitHub sends on new commits). (docs.github.com)

If you're explicitly setting `types:`, include at least: …

(If you omit `types`, GitHub's default for `pull_request_target` already includes `opened`, `synchronize`, and `reopened`.) (docs.github.com)

Preview link won't refresh after new PR commits.

Line 6 listens only to `opened`, so later `synchronize` updates won't retrigger this workflow and preview data can go stale.

💡 Suggested fix

     on:
       pull_request_target:
         types:
           - opened
    +      - synchronize
    +      - reopened