PhosPy

PhosPy is a small Python library for selected phosphoproteomics workflows inspired by the R PhosR package.

Use it when you want to:

- preprocess total and phospho tables
- analyse kinase activity from a predMat
- run the native Python kinase workflow

It is intentionally narrow. It does not aim to reproduce all of PhosR.

Install

PhosPy supports Python 3.10 and newer.

pip install phospy

The file paths below use examples/data/..., so they assume a repository checkout. If you installed from PyPI, use your own input-file paths instead.

Pick the Method You Need

1. Core preprocessing from total and phospho tables

Use PhosphoDataset when you want validated inputs and the standard preprocessing flow.

from phospy import CoreOutputWriter, PhosphoDataset

dataset = PhosphoDataset.from_files(
    "examples/data/total.tsv",
    "examples/data/phospho.tsv",
    phospho_encoding="utf-16le",
)
core = dataset.preprocessing.run(max_unmatched_fraction=0.1)

writer = CoreOutputWriter()
writer.write(core, outdir="examples/output", format="csv")

site_matrix = core.site_matrix.matrix
corrected = core.phospho_corrected

dataset.preprocessing.run(...) returns a CoreProcessingResult with:

- total_unique
- total_filtered
- phospho_filtered
- phospho_corrected
- site_matrix

Use CoreOutputWriter when you want to write the core outputs to disk.
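
The example above passes max_unmatched_fraction=0.1 without defining it. One plausible reading, and it is only an assumption, is that it caps the share of phospho rows whose protein has no counterpart in the total table. The sketch below illustrates that idea in plain Python; the function names are hypothetical and this is not PhosPy's implementation. See docs/api.md for the actual behaviour.

```python
def unmatched_fraction(phospho_ids, total_ids):
    """Return the fraction of phospho protein IDs missing from the total table."""
    total = set(total_ids)
    ids = list(phospho_ids)
    if not ids:
        return 0.0
    missing = sum(1 for protein_id in ids if protein_id not in total)
    return missing / len(ids)


def check_unmatched(phospho_ids, total_ids, max_unmatched_fraction=0.1):
    """Raise ValueError when the unmatched fraction exceeds the allowed maximum.

    Assumption: this mirrors what a max_unmatched_fraction guard might do.
    """
    fraction = unmatched_fraction(phospho_ids, total_ids)
    if fraction > max_unmatched_fraction:
        raise ValueError(
            f"{fraction:.1%} of phospho rows have no matching total protein "
            f"(limit {max_unmatched_fraction:.1%})"
        )
    return fraction
```

Under this reading, a value of 0.1 would tolerate up to 10% of phospho rows without a matching total protein before the run is rejected.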

2. Kinase activity analysis from an existing predMat

Use KinaseActivityAnalyzer when you already have a phosphosite matrix and a predMat.

from phospy import KinaseActivityAnalyzer, PhosphoDataset

dataset = PhosphoDataset.from_files(
    "examples/data/total.tsv",
    "examples/data/phospho.tsv",
    phospho_encoding="utf-16le",
)
core = dataset.preprocessing.run(max_unmatched_fraction=0.1)

analyzer = KinaseActivityAnalyzer()
result = analyzer.load_and_analyze(
    pred_mat_path="examples/data/predMat.csv",
    phospho_matrix=core.site_matrix.matrix,
    threshold=0.6,
    min_substrates=1,
    top_n_substrates=1,
)
analyzer.write_outputs(result, outdir="examples/output")

ksea_scores = result.ksea_scores
target_counts = result.target_counts

The bundled example uses min_substrates=1 and top_n_substrates=1 because the example data is very small.
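
To give a feel for how threshold, min_substrates, and top_n_substrates could interact, here is a minimal sketch in plain Python. It assumes that threshold keeps sites whose prediction score for a kinase is at least that value, that top_n_substrates keeps only the best-scoring sites, and that min_substrates skips kinases with too few substrates. The score is a simple mean of substrate values, a simplification rather than the statistic KinaseActivityAnalyzer computes. All names in the sketch are hypothetical.

```python
from statistics import mean


def select_substrates(pred_mat, kinase, threshold=0.6, top_n_substrates=None):
    """Return sites scoring >= threshold for `kinase`, highest score first.

    pred_mat maps site -> {kinase: prediction score} (assumed layout).
    """
    scored = []
    for site, scores in pred_mat.items():
        score = scores.get(kinase, 0.0)
        if score >= threshold:
            scored.append((site, score))
    scored.sort(key=lambda item: item[1], reverse=True)
    if top_n_substrates is not None:
        scored = scored[:top_n_substrates]
    return [site for site, _ in scored]


def mean_substrate_score(pred_mat, site_values, kinase, threshold=0.6,
                         min_substrates=1, top_n_substrates=None):
    """Average the site values of the selected substrates, or return None
    when fewer than min_substrates are available."""
    substrates = [
        site
        for site in select_substrates(pred_mat, kinase, threshold, top_n_substrates)
        if site in site_values
    ]
    if len(substrates) < min_substrates:
        return None
    return mean(site_values[site] for site in substrates)
```

With only a handful of sites, higher values of min_substrates would leave most kinases without a score, which is consistent with the small settings used for the bundled data.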

3. One-shot pipeline from files

Use PhosRPipeline when you want file loading, preprocessing, optional kinase analysis, and output writing in one call.

from phospy import PhosRPipeline

pipeline = PhosRPipeline.from_files(
    total_path="examples/data/total.tsv",
    phospho_path="examples/data/phospho.tsv",
    pred_mat_path="examples/data/predMat.csv",
    phospho_encoding="utf-16le",
    max_unmatched_fraction=0.1,
)
outputs = pipeline.run(outdir="examples/output")

This writes the core CSV outputs and, when pred_mat_path is provided, the downstream kinase-analysis tables. A pipeline run also writes run_manifest.json.
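
After a run, it can be handy to check what landed in the output directory. The helper below is a small sketch using only the standard library. It lists the files and reports the top-level keys of run_manifest.json without assuming anything about the manifest's schema, which is not described here. The function name summarize_outputs is hypothetical.

```python
import json
from pathlib import Path


def summarize_outputs(outdir):
    """List files in outdir and the top-level keys of run_manifest.json, if any."""
    out = Path(outdir)
    files = sorted(path.name for path in out.iterdir() if path.is_file())

    manifest_keys = None
    manifest_path = out / "run_manifest.json"
    if manifest_path.exists():
        with manifest_path.open(encoding="utf-8") as handle:
            manifest = json.load(handle)
        # Only report keys when the manifest is a JSON object.
        if isinstance(manifest, dict):
            manifest_keys = sorted(manifest)

    return {"files": files, "manifest_keys": manifest_keys}
```

For example, summarize_outputs("examples/output") would return a dict with a "files" list and a "manifest_keys" list (or None when no manifest is present).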

4. Native Python kinase workflow

Use KinaseWorkflow for the native end-to-end prediction workflow.

A complete runnable example lives in examples/native_workflow_demo.py.

From a repository checkout, run:

make native-workflow-demo

Input Files

For file-based workflows:

- total input is read as TSV
- phospho input is read as TSV
- predMat is read as CSV, using the first column as the phosphosite index

For the default schema, the expected columns are documented in docs/api.md.
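
If you want to sanity-check your files before handing them to PhosPy, the reading rules above can be approximated with the standard library. This is a sketch, not how PhosPy loads data: read_tsv and read_indexed_csv are hypothetical helpers, and the assumption that every predMat value parses as a float may not hold for your files.

```python
import csv


def read_tsv(path, encoding="utf-8"):
    """Read a tab-separated file into a list of row dicts keyed by the header."""
    with open(path, newline="", encoding=encoding) as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def read_indexed_csv(path, encoding="utf-8"):
    """Read a CSV into {first_column_value: {column_name: float}}.

    The first column is treated as the row index (the phosphosite).
    """
    with open(path, newline="", encoding=encoding) as handle:
        reader = csv.reader(handle)
        header = next(reader)
        columns = header[1:]
        table = {}
        for row in reader:
            if not row:
                continue  # skip blank lines
            table[row[0]] = {
                column: float(value) for column, value in zip(columns, row[1:])
            }
        return table
```

A UTF-16 LE phospho table, as in the examples here, would be read with read_tsv(path, encoding="utf-16le"). If the header comes back with an unexpected leading character, the file probably starts with a byte-order mark and encoding="utf-16" may be the better choice.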

CLI

After installation, you can run the CLI on your own files.

phospy \
  --total examples/data/total.tsv \
  --phospho examples/data/phospho.tsv \
  --pred-mat examples/data/predMat.csv \
  --phospho-encoding utf-16le \
  --max-unmatched-fraction 0.1 \
  --outdir examples/output

The CLI covers the file-based preprocessing path and optional predMat analysis.
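
To drive the CLI from a script, you can assemble the same arguments in Python and pass them to subprocess. The sketch below uses only the flags shown above. The helper names are hypothetical, and treating --phospho-encoding and --max-unmatched-fraction as optional is an assumption.

```python
import subprocess


def build_phospy_command(total, phospho, outdir, pred_mat=None,
                         phospho_encoding=None, max_unmatched_fraction=None):
    """Build the phospy argument list from the flags shown in this README."""
    cmd = ["phospy", "--total", str(total), "--phospho", str(phospho)]
    if pred_mat is not None:
        cmd += ["--pred-mat", str(pred_mat)]
    if phospho_encoding is not None:
        cmd += ["--phospho-encoding", str(phospho_encoding)]
    if max_unmatched_fraction is not None:
        cmd += ["--max-unmatched-fraction", str(max_unmatched_fraction)]
    cmd += ["--outdir", str(outdir)]
    return cmd


def run_phospy(**kwargs):
    """Run the CLI and raise CalledProcessError on a non-zero exit code."""
    return subprocess.run(build_phospy_command(**kwargs), check=True)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.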

Where to Read Next

- docs/api.md for method signatures, parameters, validation, and examples
- docs/validation.md for the validation quick guide
- docs/parity.md for parity with the R PhosR package
- docs/fixtures.md for fixture and trace layout
- CONTRIBUTING.md for local development and test commands
