Merged
45 commits
0447504
Add functionality to build reads by stage table
kathrynmurie Oct 24, 2025
55933aa
Add to merge function
kathrynmurie Oct 24, 2025
872d00f
update for version v1.0.0 of format; updated unit tests;
nickjhathaway Nov 3, 2025
ed9e44d
Merge pull request #51 from PlasmoGenEpi/feature/add_reads_by_stage
nickjhathaway Nov 4, 2025
bee227e
be more permissive with specimen_name;
nickjhathaway Nov 5, 2025
7d6b7be
fix for auto completion;
nickjhathaway Nov 5, 2025
e0889fd
curated descriptions within schema file;
nickjhathaway Nov 5, 2025
36a3d35
fixed example files; de-bumped version;
nickjhathaway Nov 5, 2025
f1c41a3
forgot to update unit test mdsum;
nickjhathaway Nov 5, 2025
5c638b3
Add integration test for when schema updates
kathrynmurie Nov 8, 2025
811fae0
Include parasitemia in test
kathrynmurie Nov 8, 2025
4aa1e6a
Merge pull request #52 from PlasmoGenEpi/hotfix/update_to_format_2025_10
nickjhathaway Nov 11, 2025
f6b4638
Merge branch 'develop' into feature/auto_test_schema_updates
kathrynmurie Nov 17, 2025
57ffcda
Add function to merge panels
kathrynmurie Nov 17, 2025
1f763fa
Remove unused library
kathrynmurie Nov 17, 2025
97962dd
Update validation to most recent schema
kathrynmurie Nov 17, 2025
9aedb2e
Update specimen function for new schema; check nulls in required; str…
kathrynmurie Nov 18, 2025
41b24ba
update library sample function
kathrynmurie Nov 18, 2025
7781896
Allow genomes to be dict or list
kathrynmurie Nov 18, 2025
e5de099
Update the genome_id column
kathrynmurie Nov 18, 2025
c69f032
Remove duplicate test
kathrynmurie Nov 19, 2025
38b7b52
was still using old fields;
nickjhathaway Nov 19, 2025
8d3943c
fix for having the same specimen in different files when combining;
nickjhathaway Nov 19, 2025
20ab2cf
Merge pull request #53 from PlasmoGenEpi/feature/auto_test_schema_upd…
nickjhathaway Nov 19, 2025
fedf08f
Merge pull request #54 from PlasmoGenEpi/hotfix/update_spec_info
nickjhathaway Nov 19, 2025
99cfa9d
Merge pull request #55 from PlasmoGenEpi/hotfix/update_panel_info_bui…
nickjhathaway Nov 19, 2025
e72684a
Merge branch 'develop' into feature/merge_multiple_panels
nickjhathaway Nov 19, 2025
bf23701
update run_accession;
nickjhathaway Nov 19, 2025
ea78276
Merge pull request #56 from PlasmoGenEpi/feature/merge_multiple_panels
nickjhathaway Nov 19, 2025
44ea710
added update function for travel info;
nickjhathaway Nov 19, 2025
cd9c27a
added function to add travel info;
nickjhathaway Nov 20, 2025
cf2a344
Revert "added update function for travel info;"
kathrynmurie Nov 20, 2025
040633e
added more tests for properly throwing dup library_sample_names and s…
nickjhathaway Nov 21, 2025
ca7e565
Merge pull request #57 from PlasmoGenEpi/hotfix/fix_combine_multiple_…
nickjhathaway Nov 22, 2025
52ad50b
Add function to count targets per panel
kathrynmurie Nov 22, 2025
a20cc2d
moved to update traveler info to pmo_builder; change checking of date…
nickjhathaway Nov 24, 2025
1fcc76d
length of unique targets in panel incase duplicates across reactions
kathrynmurie Nov 24, 2025
bc37baf
Merge pull request #61 from PlasmoGenEpi/feature/count_targets_per_panel
nickjhathaway Nov 24, 2025
b0b9685
added several exporter functions to go from PMO to various tables;
nickjhathaway Nov 25, 2025
58e5e3a
clarify overwrite warning message;
nickjhathaway Nov 25, 2025
ce12199
Merge pull request #65 from PlasmoGenEpi/hotfix/change_pmo_writer_war…
kathrynmurie Nov 25, 2025
3794d23
changed name of file;
nickjhathaway Nov 25, 2025
7a654e5
Merge pull request #58 from PlasmoGenEpi/feature/add_update_traveler_…
kathrynmurie Nov 25, 2025
673c451
Merge branch 'develop' into feature/add_exporter_functions
nickjhathaway Nov 25, 2025
420fcbd
Merge pull request #64 from PlasmoGenEpi/feature/add_exporter_functions
kathrynmurie Nov 25, 2025
21 changes: 20 additions & 1 deletion README.md
@@ -19,8 +19,17 @@ This package is built to either be used as a library in python projects and a co
If you want to add auto-completion to the scripts master function [pmotools-python](scripts/pmotools-runner.py) you can add the following to your `~/.bash_completion`. This can also be found in etc/bash_completion in the current directory. Or can be generated with `pmotools-python --bash-completion`

```bash
# bash completion for pmotools-python
# add the below to your ~/.bash_completion

_pmotools_python_complete()
{
# Make sure underscores (and =) are NOT treated as word breaks
# so options like --pmo_files or --file=path complete as one token.
local _OLD_WB="${COMP_WORDBREAKS-}"
COMP_WORDBREAKS="${COMP_WORDBREAKS//_/}"
COMP_WORDBREAKS="${COMP_WORDBREAKS//=}"

local cur prev
COMPREPLY=()
cur="${COMP_WORDS[COMP_CWORD]}"
@@ -34,6 +43,9 @@ _pmotools_python_complete()
lines="$(${COMP_WORDS[0]} --list-plain 2>/dev/null)"
cmds="$(printf '%s\n' "${lines}" | awk -F'\t' '{print $1}')"
COMPREPLY=( $(compgen -W "${cmds}" -- "${cur}") )

# restore wordbreaks before returning
COMP_WORDBREAKS="$_OLD_WB"
return 0
fi

@@ -42,19 +54,26 @@ _pmotools_python_complete()
local helps opts
helps="$(${COMP_WORDS[0]} ${COMP_WORDS[1]} -h 2>/dev/null)"
# Pull out flag tokens and split comma-separated forms
# Keep underscores intact in the tokens.
opts="$(printf '%s\n' "${helps}" \
- | sed -n 's/^[[:space:]]\{0,\}\(-[-[:alnum:]][-[:alnum:]]*\)\(, *-[[:alnum:]][-[:alnum:]]*\)\{0,\}.*/\1/p' \
+ | sed -n 's/^[[:space:]]\{0,\}\(-[-[:alnum:]_][-[:alnum:]_]*\)\(, *-[[:alnum:]_][-[:alnum:]_]*\)\{0,\}.*/\1/p' \
| sed 's/, / /g')"
COMPREPLY=( $(compgen -W "${opts}" -- "${cur}") )

COMP_WORDBREAKS="$_OLD_WB"
return 0
fi

# 3) Otherwise, fall back to filename completion for positional args
COMPREPLY=( $(compgen -f -- "${cur}") )

# restore original word breaks
COMP_WORDBREAKS="$_OLD_WB"
return 0
}

complete -F _pmotools_python_complete pmotools-python

```
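
The `sed` change in the completion script widens the character class so that flags containing underscores are captured whole. A minimal Python sketch of that difference follows, assuming a direct translation of the old and new `sed` patterns into `re` syntax (with `[[:alnum:]]` approximated as `A-Za-z0-9`). The names `OLD_FLAG`, `NEW_FLAG`, and `first_flag`, and the sample help line, are illustrative and not part of pmotools.

```python
import re

# Assumed translations of the old and new sed patterns (not code from the repository).
OLD_FLAG = re.compile(r"^\s*(-[-A-Za-z0-9][-A-Za-z0-9]*)(, *-[A-Za-z0-9][-A-Za-z0-9]*)*.*")
NEW_FLAG = re.compile(r"^\s*(-[-A-Za-z0-9_][-A-Za-z0-9_]*)(, *-[A-Za-z0-9_][-A-Za-z0-9_]*)*.*")


def first_flag(pattern, help_line):
    """Return the first flag token on a help line, or None if the line has no flag."""
    match = pattern.match(help_line)
    return match.group(1) if match else None


# Hypothetical line of argparse-style help output.
line = "  --pmo_files PMO_FILES  input files"
print(first_flag(OLD_FLAG, line))  # --pmo (cut off at the underscore)
print(first_flag(NEW_FLAG, line))  # --pmo_files
```

With the old class the match stops at the first underscore, so the completion list would offer `--pmo` rather than `--pmo_files`.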

## Developer Setup
2 changes: 1 addition & 1 deletion docs/conf.py
@@ -14,7 +14,7 @@
project = "pmotools-python"
copyright = "2024, Plasmogenepi"
author = "Plasmogenepi"
-release = "1.0.0"
+release = "0.1.0"

# -- General configuration ---------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration
17 changes: 16 additions & 1 deletion etc/bash_completion
@@ -3,6 +3,12 @@

_pmotools_python_complete()
{
# Make sure underscores (and =) are NOT treated as word breaks
# so options like --pmo_files or --file=path complete as one token.
local _OLD_WB="${COMP_WORDBREAKS-}"
COMP_WORDBREAKS="${COMP_WORDBREAKS//_/}"
COMP_WORDBREAKS="${COMP_WORDBREAKS//=}"

local cur prev
COMPREPLY=()
cur="${COMP_WORDS[COMP_CWORD]}"
@@ -16,6 +22,9 @@ _pmotools_python_complete()
lines="$(${COMP_WORDS[0]} --list-plain 2>/dev/null)"
cmds="$(printf '%s\n' "${lines}" | awk -F'\t' '{print $1}')"
COMPREPLY=( $(compgen -W "${cmds}" -- "${cur}") )

# restore wordbreaks before returning
COMP_WORDBREAKS="$_OLD_WB"
return 0
fi

@@ -24,15 +33,21 @@ _pmotools_python_complete()
local helps opts
helps="$(${COMP_WORDS[0]} ${COMP_WORDS[1]} -h 2>/dev/null)"
# Pull out flag tokens and split comma-separated forms
# Keep underscores intact in the tokens.
opts="$(printf '%s\n' "${helps}" \
- | sed -n 's/^[[:space:]]\{0,\}\(-[-[:alnum:]][-[:alnum:]]*\)\(, *-[[:alnum:]][-[:alnum:]]*\)\{0,\}.*/\1/p' \
+ | sed -n 's/^[[:space:]]\{0,\}\(-[-[:alnum:]_][-[:alnum:]_]*\)\(, *-[[:alnum:]_][-[:alnum:]_]*\)\{0,\}.*/\1/p' \
| sed 's/, / /g')"
COMPREPLY=( $(compgen -W "${opts}" -- "${cur}") )

COMP_WORDBREAKS="$_OLD_WB"
return 0
fi

# 3) Otherwise, fall back to filename completion for positional args
COMPREPLY=( $(compgen -f -- "${cur}") )

# restore original word breaks
COMP_WORDBREAKS="$_OLD_WB"
return 0
}

99 changes: 84 additions & 15 deletions src/pmotools/cli.py
@@ -37,7 +37,7 @@
from pmotools.scripts.extractors_from_pmo.extract_pmo_with_read_filter import (
extract_pmo_with_read_filter,
)
-from pmotools.scripts.extractors_from_pmo.extract_allele_table import (
+from pmotools.scripts.pmo_to_tables.extract_allele_table import (
extract_for_allele_table,
)

@@ -66,13 +66,37 @@
)

# panel info subset
-from pmotools.scripts.extract_info_from_pmo.extract_insert_of_panels import (
+from pmotools.scripts.pmo_to_tables.extract_insert_of_panels import (
     extract_insert_of_panels,
 )
-from pmotools.scripts.extract_info_from_pmo.extract_refseq_of_inserts_of_panels import (
+from pmotools.scripts.pmo_to_tables.extract_refseq_of_inserts_of_panels import (
extract_refseq_of_inserts_of_panels,
)

# pmo to tables

from pmotools.scripts.pmo_to_tables.export_specimen_meta_table import (
export_specimen_meta_table,
)
from pmotools.scripts.pmo_to_tables.export_library_sample_meta_table import (
export_library_sample_meta_table,
)
from pmotools.scripts.pmo_to_tables.export_project_info_meta_table import (
export_project_info_meta_table,
)
from pmotools.scripts.pmo_to_tables.export_sequencing_info_meta_table import (
export_sequencing_info_meta_table,
)
from pmotools.scripts.pmo_to_tables.export_specimen_travel_meta_table import (
export_specimen_travel_meta_table,
)
from pmotools.scripts.pmo_to_tables.export_target_info_meta_table import (
export_target_info_meta_table,
)
from pmotools.scripts.pmo_to_tables.export_panel_info_meta_table import (
export_panel_info_meta_table,
)


@dataclass(frozen=True)
class PmoCommand:
@@ -115,17 +139,6 @@ class PmoCommand:
"extract_pmo_with_read_filter": PmoCommand(
extract_pmo_with_read_filter, "Extract with a read filter"
         ),
-        "extract_allele_table": PmoCommand(
-            extract_for_allele_table,
-            "Extract allele tables for tools like dcifer or moire",
-        ),
-        "extract_insert_of_panels": PmoCommand(
-            extract_insert_of_panels, "Extract inserts of panels from a PMO"
-        ),
-        "extract_refseq_of_inserts_of_panels": PmoCommand(
-            extract_refseq_of_inserts_of_panels,
-            "Extract ref_seq of panel inserts from a PMO",
-        ),
},
"working_with_multiple_pmos": {
"combine_pmos": PmoCommand(
@@ -160,6 +173,46 @@ class PmoCommand:
validate_pmo, "Validate a PMO file against a JSON Schema"
)
    },
    "pmo_to_table": {
        "export_specimen_meta_table": PmoCommand(
            export_specimen_meta_table, "export the specimen meta table from a PMO file"
        ),
        "export_library_sample_meta_table": PmoCommand(
            export_library_sample_meta_table,
            "export the library_sample meta table from a PMO file",
        ),
        "export_project_info_meta_table": PmoCommand(
            export_project_info_meta_table,
            "export the project_info meta table from a PMO file",
        ),
        "export_sequencing_info_meta_table": PmoCommand(
            export_sequencing_info_meta_table,
            "export the sequencing_info meta table from a PMO file",
        ),
        "export_specimen_travel_meta_table": PmoCommand(
            export_specimen_travel_meta_table,
            "export the specimen travel_info meta table from a PMO file",
        ),
        "export_target_info_meta_table": PmoCommand(
            export_target_info_meta_table,
            "export the target info meta table from a PMO file",
        ),
        "export_panel_info_meta_table": PmoCommand(
            export_panel_info_meta_table,
            "export the panel info meta table from a PMO file",
        ),
        "extract_allele_table": PmoCommand(
            extract_for_allele_table,
            "Extract allele tables for tools like dcifer or moire",
        ),
        "extract_insert_of_panels": PmoCommand(
            extract_insert_of_panels, "Extract inserts of panels from a PMO"
        ),
        "extract_refseq_of_inserts_of_panels": PmoCommand(
            extract_refseq_of_inserts_of_panels,
            "Extract ref_seq of panel inserts from a PMO",
        ),
    },
}
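
The registry above groups commands by category, and each entry pairs a callable with a help string. The sketch below shows one way such a nested mapping could be searched by command name. It is an illustration under assumptions, not the dispatch logic in `cli.py`: the `Command` class, its field names `func` and `help`, the `GROUPS` dict, `find_command`, and the placeholder exporter body are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Command:
    # Hypothetical stand-in for PmoCommand; the real field names are not shown in the diff.
    func: Callable[[], str]
    help: str


def export_specimen_meta_table() -> str:
    # Placeholder body; the real exporter works on a PMO file.
    return "specimen table"


# Group name -> command name -> Command, mirroring the shape of the registry in the diff.
GROUPS: Dict[str, Dict[str, Command]] = {
    "pmo_to_table": {
        "export_specimen_meta_table": Command(
            export_specimen_meta_table,
            "export the specimen meta table from a PMO file",
        ),
    },
}


def find_command(name: str) -> Command:
    """Look a command up by name across all groups."""
    for commands in GROUPS.values():
        if name in commands:
            return commands[name]
    raise KeyError(f"unknown command: {name}")


print(find_command("export_specimen_meta_table").func())
```

Because lookup goes by command name rather than group, moving the three `extract_*` entries into `pmo_to_table` would not change how they are invoked under this scheme.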


@@ -223,6 +276,12 @@ def _print_bash_completion():

_pmotools_python_complete()
{
# Make sure underscores (and =) are NOT treated as word breaks
# so options like --pmo_files or --file=path complete as one token.
local _OLD_WB="${COMP_WORDBREAKS-}"
COMP_WORDBREAKS="${COMP_WORDBREAKS//_/}"
COMP_WORDBREAKS="${COMP_WORDBREAKS//=}"

local cur prev
COMPREPLY=()
cur="${COMP_WORDS[COMP_CWORD]}"
@@ -236,6 +295,9 @@ def _print_bash_completion():
lines="$(${COMP_WORDS[0]} --list-plain 2>/dev/null)"
cmds="$(printf '%s\n' "${lines}" | awk -F'\t' '{print $1}')"
COMPREPLY=( $(compgen -W "${cmds}" -- "${cur}") )

# restore wordbreaks before returning
COMP_WORDBREAKS="$_OLD_WB"
return 0
fi

@@ -244,19 +306,26 @@ def _print_bash_completion():
local helps opts
helps="$(${COMP_WORDS[0]} ${COMP_WORDS[1]} -h 2>/dev/null)"
# Pull out flag tokens and split comma-separated forms
# Keep underscores intact in the tokens.
opts="$(printf '%s\n' "${helps}" \
- | sed -n 's/^[[:space:]]\{0,\}\(-[-[:alnum:]][-[:alnum:]]*\)\(, *-[[:alnum:]][-[:alnum:]]*\)\{0,\}.*/\1/p' \
+ | sed -n 's/^[[:space:]]\{0,\}\(-[-[:alnum:]_][-[:alnum:]_]*\)\(, *-[[:alnum:]_][-[:alnum:]_]*\)\{0,\}.*/\1/p' \
| sed 's/, / /g')"
COMPREPLY=( $(compgen -W "${opts}" -- "${cur}") )

COMP_WORDBREAKS="$_OLD_WB"
return 0
fi

# 3) Otherwise, fall back to filename completion for positional args
COMPREPLY=( $(compgen -f -- "${cur}") )

# restore original word breaks
COMP_WORDBREAKS="$_OLD_WB"
return 0
}

complete -F _pmotools_python_complete pmotools-python

"""
import sys

45 changes: 45 additions & 0 deletions src/pmotools/pmo_builder/json_convert_utils.py
@@ -3,3 +3,48 @@ def check_additional_columns_exist(df, additional_column_list):
    missing_cols = set(additional_column_list) - set(df.columns)
    if missing_cols:
        raise ValueError(f"Missing additional columns: {missing_cols}")


def remove_optional_null_values(json_data, optional_columns):
    """
    Remove empty values from optional fields in a list of dictionaries.

    :param json_data: List of dictionaries to process
    :param optional_columns: List of optional field names to check for empty values
    :return: List of dictionaries with empty optional fields removed

    Empty values include: None, empty strings (''), empty dicts ({}), and empty lists ([])
    """
    # Convert optional_columns to a set for faster lookup
    optional_fields_set = set(optional_columns) if optional_columns else set()

    for item in json_data:
        # Collect keys to remove to avoid modifying dict while iterating
        keys_to_remove = []
        for key, value in item.items():
            if key in optional_fields_set:
                # Check if value is empty: None, empty string, empty dict, or empty list
                if value is None or value == "" or value == {} or value == []:
                    keys_to_remove.append(key)

        # Remove the empty optional fields
        for key in keys_to_remove:
            del item[key]

    return json_data


def check_null_values(df, columns):
    """
    Check for null values in a list of columns

    :param df: DataFrame to check
    :param columns: List of column names to check
    :return: None
    """
    null_columns = []
    for col in columns:
        if df[col].isna().any():
            null_columns.append(col)
    if null_columns:
        raise ValueError(f"The following columns contain null values: {null_columns}")
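
A short usage sketch for `remove_optional_null_values` may help show which values it drops. To keep the example self-contained, the function is restated here in condensed form, intended to behave the same as the version in the diff. The records and the field names `host_age` and `alternate_ids` are hypothetical.

```python
def remove_optional_null_values(json_data, optional_columns):
    # Condensed restatement of the function in the diff, so the demo runs on its own.
    optional = set(optional_columns) if optional_columns else set()
    for item in json_data:
        empty_keys = [
            key
            for key, value in item.items()
            if key in optional
            and (value is None or value == "" or value == {} or value == [])
        ]
        for key in empty_keys:
            del item[key]
    return json_data


# Hypothetical records; field names are illustrative only.
records = [
    {"specimen_name": "s1", "host_age": None, "alternate_ids": []},
    {"specimen_name": "", "host_age": 0, "alternate_ids": ["x"]},
]
cleaned = remove_optional_null_values(records, ["host_age", "alternate_ids"])
print(cleaned)
# [{'specimen_name': 's1'}, {'specimen_name': '', 'host_age': 0, 'alternate_ids': ['x']}]
print(cleaned is records)  # True: the input list is modified in place
```

Three behaviours are worth noting: `0` is kept because it is not one of the listed empty values, the empty `specimen_name` is kept because that field is not in the optional list, and the dictionaries are mutated in place rather than copied. A float NaN would also be kept, since it is not equal to any of the checked values.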