Modern Python ecosystem for complex survey design, weighting, estimation, and small area estimation.

samplics-org/svy

svy

Modern Python tools for complex survey analysis, built for real-world statistical workflows.

svy is a rigorously design-based yet production-oriented ecosystem for survey design, weighting, estimation, and small area estimation — without sacrificing transparency or scalability.

🌐 Website: https://svylab.com
📘 Documentation: https://svylab.com/docs


Tip

Validation: Want to assess the correctness of svy?
See our comparison with R’s survey package, showing numerically identical results across Taylor linearization, replication methods, and complex survey designs.

⚠️ Current Status (Read This First)

The svy libraries are not yet publicly downloadable.

This repository is intentionally public before the code release so that early users can:

  • ask questions,
  • report documentation gaps,
  • suggest features,
  • discuss real-world survey use cases,
  • help shape stable APIs.

📘 Documentation is live
🧪 Code is under finalization
🐞 Issues & discussions are open

When the first public releases are ready, this repository will become the main code home.


What is svy?

svy is designed for people who actually work with complex survey data, including:

  • National statistical offices
  • Public health and development programs
  • Survey methodologists
  • Data scientists working with complex samples

The guiding principle is:

Correct inference first — without hiding assumptions or sacrificing usability.

svy prioritizes statistical validity while remaining compatible with modern Python workflows.


Planned Capabilities

The svy ecosystem is being built to support:

  • Complex survey design (strata, clusters, weights)
  • Design-based estimation with valid standard errors
  • Replication methods (BRR, bootstrap, jackknife)
  • Small Area Estimation (area- and unit-level models)
  • Explicit, inspectable, reproducible outputs
  • Integration with Polars, NumPy, SciPy, and JAX-based tooling

All methods are grounded in established survey methodology.
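
To make the replication idea in the list above concrete, here is a minimal NumPy sketch of a delete-one-PSU stratified jackknife for a weighted mean. It is not svy's implementation: the function names `weighted_mean` and `jackknife_se` are hypothetical, and the sketch assumes at least two PSUs per stratum and no finite population correction.

```python
import numpy as np


def weighted_mean(y, w):
    """Weighted mean: sum(w * y) / sum(w)."""
    return np.sum(w * y) / np.sum(w)


def jackknife_se(y, w, stratum, psu):
    """Delete-one-PSU stratified jackknife SE of the weighted mean.

    Hypothetical helper for illustration only (not part of svy).
    """
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    stratum = np.asarray(stratum)
    psu = np.asarray(psu)

    theta = weighted_mean(y, w)
    variance = 0.0
    for h in np.unique(stratum):
        in_h = stratum == h
        psus_h = np.unique(psu[in_h])
        n_h = len(psus_h)
        if n_h < 2:
            raise ValueError(f"stratum {h!r} has fewer than two PSUs")
        for c in psus_h:
            dropped = in_h & (psu == c)
            # Replicate weights: zero out the dropped PSU and rescale
            # the remaining PSUs of the same stratum by n_h / (n_h - 1).
            w_rep = w.copy()
            w_rep[dropped] = 0.0
            w_rep[in_h & ~dropped] *= n_h / (n_h - 1)
            theta_rep = weighted_mean(y, w_rep)
            variance += (n_h - 1) / n_h * (theta_rep - theta) ** 2
    return theta, np.sqrt(variance)
```

Weights outside the stratum being perturbed are left unchanged, so each replicate differs from the full sample only within one stratum.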


Example (Illustrative API)

The examples below show the intended public API. They reflect the current design but cannot be run until the first release.

Design-based estimation

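# Shell step, once the package is published: pip install svy
import svy

# Load the example household dataset
hld_data = svy.load_dataset(name="hld_sample_wb_2023", limit=None)

# Declare the design: strata, primary sampling units (PSUs), and weights
hld_design = svy.Design(stratum=("geo1", "urbrur"), psu="ea", wgt="hhweight")

# Bind the data to the design
hld_sample = svy.Sample(data=hld_data, design=hld_design)

# Design-based estimate of the mean of tot_exp
tot_exp_mean = hld_sample.estimation.mean(y="tot_exp")

print(tot_exp_mean)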
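
For readers who want to see what a design-based mean with a linearized standard error involves, the following is a minimal NumPy sketch of Taylor linearization for a weighted mean under a stratified cluster design. It is an illustration under stated assumptions, not svy's code: the function `weighted_mean_taylor` is hypothetical, PSUs are treated as sampled with replacement within strata, no finite population correction is applied, and `stratum` is a single label array (a two-column stratification such as `("geo1", "urbrur")` would first need to be combined into one label).

```python
import numpy as np


def weighted_mean_taylor(y, w, stratum, psu):
    """Weighted mean and its Taylor-linearized standard error.

    Hypothetical helper for illustration only (not part of svy).
    """
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    stratum = np.asarray(stratum)
    psu = np.asarray(psu)

    w_sum = w.sum()
    mean = (w * y).sum() / w_sum

    # Linearized variable of the ratio sum(w*y) / sum(w).
    z = w * (y - mean) / w_sum

    variance = 0.0
    for h in np.unique(stratum):
        in_h = stratum == h
        psus_h = np.unique(psu[in_h])
        n_h = len(psus_h)
        if n_h < 2:
            raise ValueError(f"stratum {h!r} has fewer than two PSUs")
        # PSU-level totals of the linearized variable within stratum h.
        totals = np.array([z[in_h & (psu == c)].sum() for c in psus_h])
        # Between-PSU variance contribution of stratum h.
        variance += n_h / (n_h - 1) * ((totals - totals.mean()) ** 2).sum()

    return mean, np.sqrt(variance)
```

The variance is built from PSU totals rather than individual observations, which is how clustering enters the standard error.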

Fay-Herriot model (SAE)

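# Shell step, once the package is published: pip install svy-sae
import svy
import svy_sae as sae

# Load the example area-level dataset
milk = svy.load_dataset(name="milk", limit=None)

milk_model = sae.AreaLevel(milk)

# Fit the Fay-Herriot model by REML with Prasad-Rao MSE estimates
milk_preds = milk_model.fh(
    y="yi",
    x=svy.Cat("MajorArea", ref=1),
    variance="variance",
    area="SmallArea",
    method="REML",
    mse="prasad_rao",
)

print(milk_preds)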
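
As background for the call above, the sketch below shows the shrinkage step at the core of an area-level Fay-Herriot predictor, written in plain NumPy. It is a simplified illustration, not the svy-sae implementation: the function `fay_herriot_blup` is hypothetical, the model variance `sigma2_u` is taken as given instead of being estimated by REML, and no MSE estimate (such as Prasad-Rao) is computed.

```python
import numpy as np


def fay_herriot_blup(y, X, psi, sigma2_u):
    """Fay-Herriot predictor for a known model variance.

    y        : direct estimates, one per area
    X        : area-level covariates, shape (areas, p)
    psi      : known sampling variances of the direct estimates
    sigma2_u : model (random effect) variance, assumed known here

    Hypothetical helper for illustration only (not part of svy-sae).
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    psi = np.asarray(psi, dtype=float)

    # Inverse of the total variance of each direct estimate.
    v_inv = 1.0 / (sigma2_u + psi)

    # Generalized least squares estimate of the regression coefficients.
    xt_vinv = X.T * v_inv
    beta = np.linalg.solve(xt_vinv @ X, xt_vinv @ y)

    # Shrinkage factor: weight given to the direct estimate in each area.
    gamma = sigma2_u * v_inv

    # Predictor: blend of the direct estimate and the regression prediction.
    theta = gamma * y + (1.0 - gamma) * (X @ beta)
    return theta, beta, gamma
```

Areas with a large sampling variance get a small `gamma` and are pulled toward the regression prediction, while precisely estimated areas stay close to their direct estimates.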

No shortcuts.
No hidden assumptions.
Just correct survey inference.


Ecosystem Packages (Upcoming)

| Package | Purpose | Status |
|---------|---------|--------|
| svy | Core survey design & estimation | In progress |
| svy-sae | Small Area Estimation | In progress |
| svy-io | SPSS / Stata / SAS I/O | In progress |

Installation instructions will be added once packages are published.


Documentation (Available Now)

👉 https://svylab.com/docs

Includes conceptual guides, tutorials, and methodological notes reflecting the intended stable APIs.


Feedback & Early Engagement

Early feedback is strongly encouraged.

If you work with complex surveys and want to influence the design of a modern Python survey stack, this is the right place to engage.


License

MIT License
Copyright © 2026 Samplics LLC


svy is built for practitioners who need statistical rigor that survives contact with reality.
