helpers — Causify’s Python Toolkit for High-Leverage Engineering

helpers is a public collection of Python utilities, configuration patterns, and developer tooling extracted from real production work at Causify.

It exists for one reason:

Make engineering repeatable.
Turn "tribal knowledge" into composable primitives: less glue code, fewer one-off scripts, and more reliable systems.

This repo is useful in three common contexts:

Platform/product engineering: predictable building blocks for I/O, debugging, system operations, Git/Docker, dates/time, dataframes, and more.
Repo hygiene at scale: linting, import validation, CI utilities, pre-commit workflows; things large repos need to stay healthy.
LLM/agentic workflows: lightweight wrappers for completions, structured outputs, caching modes, and cost tracking.

One-minute map (how the repo fits together)

---
config:
  flowchart:
    curve: monotoneX
    htmlLabels: true
  layout: elk
  themeVariables:
    primaryColor: '#e8ebf0'
    primaryTextColor: '#0f172a'
    secondaryColor: '#f5f6f8'
    tertiaryColor: '#f9fafb'
    lineColor: '#94a3b8'
---

flowchart LR
 subgraph Lib["🐍 Python Library (`helpers/`)"]
    direction TB
        Core["Core Helpers<br>(hdbg, hio, hsystem, hgit, hdocker, hdatetime, …)"]
        Data["Data Helpers<br>(hpandas and related modules)"]
        LLM["LLM &amp; Agentic Helpers<br>(hllm, hllm_cost, hllm_cli, hchatgpt, …)"]
  end
 subgraph Config["⚙️ Configuration Patterns (`config_root/`)"]
        Conf["Env-aware configuration objects<br>and builders"]
  end
 subgraph Tooling["🛠️ Tooling & Automation"]
        Scripts@{ label: "Dev Scripts<br/>(`dev_scripts_helpers/`)" }
        Import@{ label: "Import Hygiene<br/>(`import_check/`)" }
        Linters@{ label: "Linting Framework<br/>(`linters/`, `linters2/`)" }
        Tasks@{ label: "Repo Tasks & Automation<br/>(`tasks.py`, `invoke.yaml`)" }
  end
 subgraph Docs["Documentation & Examples"]
        Human@{ label: "Human Docs<br/>(`docs/`)" }
        Mk@{ label: "MkDocs Site<br/>(`docs_mkdocs/`)" }
        NB@{ label: "Tutorial Notebooks<br/>(`helpers/notebooks/`)" }
  end
 subgraph CI["Continuous Integration & Hygiene"]
        GH@{ label: "GitHub Workflows<br/>(`.github/`)" }
        PC@{ label: "Pre-commit & Scanning<br/>(`.pre-commit-config.yaml`, semgrep, …)" }
  end
    Core -- provides base utilities to --> Conf
    Data -- feeds data into --> Conf
    LLM -- adds AI logic to --> Conf
    Scripts -- triggers tasks in --> GH
    Import -- enforces rules in --> GH
    Linters -- runs in --> GH
    Tasks -- executes workflows in --> GH
    Human -- documented in --> GH
    Mk -- deployed via --> GH
    NB -- tested in --> GH
    PC -- validates via --> GH

    Core@{ shape: rounded}
    Data@{ shape: rounded}
    LLM@{ shape: rounded}
    Conf@{ shape: rect}
    Scripts@{ shape: rect}
    Import@{ shape: rect}
    Linters@{ shape: rect}
    Tasks@{ shape: rect}
    Human@{ shape: doc}
    Mk@{ shape: doc}
    NB@{ shape: doc}
    GH@{ shape: rect}
    PC@{ shape: rect}
     Core:::lib
     Data:::lib
     LLM:::lib
     Conf:::config
     Scripts:::tooling
     Import:::tooling
     Linters:::tooling
     Tasks:::tooling
     Human:::docs
     Mk:::docs
     NB:::docs
     GH:::ci
     PC:::ci
    classDef lib fill:#e0f7fa,stroke:#0097a7,stroke-width:2px,color:#004d40
    classDef config fill:#fff9c4,stroke:#fbc02d,stroke-width:2px,color:#795548
    classDef tooling fill:#e1bee7,stroke:#7b1fa2,stroke-width:2px,color:#4a148c
    classDef docs fill:#c5cae9,stroke:#3949ab,stroke-width:2px,color:#1a237e
    classDef ci fill:#c8e6c9,stroke:#2e7d32,stroke-width:2px,color:#1b5e20

Why this exists (the philosophy)

Most engineering orgs accumulate dozens of tiny scripts and helper snippets. Over time that becomes:

duplicated logic across repos,
inconsistent behavior,
hard-to-test workflows,
brittle operations.

helpers is the opposite: small, well-scoped primitives that are easy to discover and safe to reuse. The code is intentionally "boring"; because boring tools are what make fast teams.

Micro-examples

These examples are deliberately small. The repo is optimized for everyday leverage.

1) Defensive checks & clear failures

import helpers.hdbg as hdbg

hdbg.dassert_eq(2 + 2, 4)
hdbg.dassert(10 > 3, "Math still works")

2) Safe system command execution (without subprocess boilerplate)

import helpers.hsystem as hsystem

out = hsystem.system_to_string("python --version")
print(out)

3) Simple JSON I/O helpers

import helpers.hio as hio

payload = {"hello": "world"}
hio.to_json("tmp.json", payload)

loaded = hio.from_json("tmp.json")
print(loaded)

4) DataFrame utilities (example: convert DF to JSON string)

import pandas as pd
import helpers.hpandas as hpandas

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
print(hpandas.convert_df_to_json_string(df))

5) LLM structured output + cost tracking (agentic building blocks)

helpers.hllm supports providers like OpenAI and OpenRouter (via openrouter/<model>). It also supports caching modes and structured output.

from pydantic import BaseModel

import helpers.hllm as hllm
import helpers.hllm_cost as hllmcost

class Summary(BaseModel):
    summary: str
    risks: list[str]

# Track cost for a specific provider/model.
tracker = hllmcost.LLMCostTracker(provider_name="openai", model="gpt-4o-mini")

result = hllm.get_structured_completion(
    "Summarize key risks of shipping without observability.",
    response_format=Summary,
    system_prompt="Be concise and practical.",
    model="gpt-4o-mini",
    cache_mode="DISABLE_CACHE",
    print_cost=True,
    cost_tracker=tracker,
)

print(result.summary)
print("Cost so far ($):", tracker.get_current_cost())

What’s in here (a guided tour)

Core Python helpers (`helpers/`)

The main Python library lives in helpers/. Many modules follow the h<name>.py naming pattern (e.g., hdbg.py, hio.py, hsystem.py) to make discovery fast and avoid collisions.

A few high-signal examples:

helpers/hdbg.py: defensive checks and assertions
helpers/hsystem.py: safe system command execution
helpers/hio.py: practical file I/O helpers
helpers/hgit.py: git utilities
helpers/hdocker.py: docker helpers
helpers/henv.py: environment utilities
helpers/hdatetime.py: date/time utilities

Data utilities (`hpandas` and friends)

helpers/hpandas.py re-exports a suite of pandas helpers implemented across hpandas_* modules (conversion, compare, display, cleaning, assertions, IO, etc.). This makes the most common "pandas hygiene" tasks consistent across projects.

LLM & agentic utilities

This repo includes lightweight building blocks for LLM workflows:

helpers/hllm.py: completions + structured outputs + caching modes
helpers/hllm_cost.py: cost tracking and provider cost utilities (incl. LLMCostTracker)
helpers/hllm_cli.py: CLI-oriented LLM workflows
helpers/hchatgpt.py / helpers/hchatgpt_instructions.py: utilities around assistants/instructions workflows

Tutorial notebooks live under helpers/notebooks/ (see hllm.tutorial.py).

Config patterns (`config_root/`)

Reusable configuration patterns: config objects/builders and environment-aware configuration.

Tooling and repo hygiene

This repo also ships "how we keep repos healthy" primitives:

dev_scripts_helpers/: automation scripts for common workflows
import_check/: import hygiene validation
linters/, linters2/: lint framework and configs
.github/: CI workflows and checks
.claude/: Claude Code configuration and hooks
CLAUDE.md: Architecture overview and development patterns for Claude Code
conftest.py: Pytest configuration and shared test fixtures
instr.md: Development instructions and task specifications
main_pytest.py: Main pytest runner and test execution controller
tasks.py: Entry point for pyinvoke task automation system
pre-commit and scanning configs (.pre-commit-config.yaml, .semgrepignore, etc.)

Design principles (what makes this repo useful)

Small modules > big frameworks Helpers should be composable and easy to understand.
Safety by default Defensive checks, predictable failures, sensible defaults.
Fast discovery Naming conventions encourage "find the right tool quickly."
Reproducibility If it matters, it should be runnable in CI and documented.
Low dependency footprint Dependencies are added only when they buy real leverage.

Getting started

Use as a git submodule (common in larger repos)

git submodule add https://github.com/causify-ai/helpers.git helpers_root
git submodule update --init --recursive

Use locally (development / experimentation)

python -m venv .venv
source .venv/bin/activate
pip install -U pip
pip install -e .

If contributing, enable pre-commit hooks:

pip install pre-commit
pre-commit install

Tests & quality checks

Run tests:

pytest -q
# or
python main_pytest.py

Run pre-commit:

pre-commit run -a

List repo automation tasks (if using invoke):

pip install invoke
invoke -l

Documentation

Human docs live under docs/. A browsable site is maintained under docs_mkdocs/.

Serve MkDocs locally:

pip install mkdocs mkdocs-material
cd docs_mkdocs
mkdocs serve

Contributing

Keep changes small and reviewable.
Add tests when behavior changes.
Prefer reusable utilities over one-off scripts.
Keep backwards compatibility in mind for downstream consumers.

Security

This repo includes secret-scanning and standard hygiene. Do not commit secrets. If you suspect a leak: rotate credentials and open an incident.

License

See LICENSE.

Name		Name	Last commit message	Last commit date
Latest commit History 857 Commits
.claude		.claude
.github		.github
.vscode		.vscode
config_root		config_root
dev_scripts_helpers		dev_scripts_helpers
devops		devops
docs		docs
docs_mkdocs		docs_mkdocs
helpers		helpers
import_check		import_check
linters		linters
linters2		linters2
papers		papers
.coveragerc		.coveragerc
.gitignore		.gitignore
.gitleaksignore		.gitleaksignore
.isort.cfg		.isort.cfg
.pre-commit-config.yaml		.pre-commit-config.yaml
.semgrepignore		.semgrepignore
CLAUDE.md		CLAUDE.md
LICENSE		LICENSE
README.md		README.md
__init__.py		__init__.py
changelog.txt		changelog.txt
conftest.py		conftest.py
coverage.sh		coverage.sh
fix_repo.sh		fix_repo.sh
invoke.yaml		invoke.yaml
main_pytest.py		main_pytest.py
mypy.ini		mypy.ini
notify.sh		notify.sh
puppeteerConfig.json		puppeteerConfig.json
pyproject.toml		pyproject.toml
pytest.ini		pytest.ini
repo_config.yaml		repo_config.yaml
setenv.sh		setenv.sh
tasks.py		tasks.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

helpers — Causify’s Python Toolkit for High-Leverage Engineering

One-minute map (how the repo fits together)

Why this exists (the philosophy)

Micro-examples

1) Defensive checks & clear failures

2) Safe system command execution (without subprocess boilerplate)

3) Simple JSON I/O helpers

4) DataFrame utilities (example: convert DF to JSON string)

5) LLM structured output + cost tracking (agentic building blocks)

What’s in here (a guided tour)

Core Python helpers (`helpers/`)

Data utilities (`hpandas` and friends)

LLM & agentic utilities

Config patterns (`config_root/`)

Tooling and repo hygiene

Design principles (what makes this repo useful)

Getting started

Use as a git submodule (common in larger repos)

Use locally (development / experimentation)

Tests & quality checks

Documentation

Contributing

Security

License

About

Uh oh!

Releases

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

helpers — Causify’s Python Toolkit for High-Leverage Engineering

One-minute map (how the repo fits together)

Why this exists (the philosophy)

Micro-examples

1) Defensive checks & clear failures

2) Safe system command execution (without subprocess boilerplate)

3) Simple JSON I/O helpers

4) DataFrame utilities (example: convert DF to JSON string)

5) LLM structured output + cost tracking (agentic building blocks)

What’s in here (a guided tour)

Core Python helpers (helpers/)

Data utilities (hpandas and friends)

LLM & agentic utilities

Config patterns (config_root/)

Tooling and repo hygiene

Design principles (what makes this repo useful)

Getting started

Use as a git submodule (common in larger repos)

Use locally (development / experimentation)

Tests & quality checks

Documentation

Contributing

Security

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Core Python helpers (`helpers/`)

Data utilities (`hpandas` and friends)

Config patterns (`config_root/`)

Packages