Merged
2 changes: 1 addition & 1 deletion Snakefile
@@ -25,7 +25,7 @@ algorithm_params = _config.config.algorithm_params
 algorithm_directed = _config.config.algorithm_directed
 pca_params = _config.config.pca_params
 hac_params = _config.config.hac_params
-FRAMEWORK = _config.config.container_framework
+FRAMEWORK = _config.config.container_settings.framework
 include_aggregate_algo_eval = _config.config.analysis_include_evaluation_aggregate_algo

 # Return the dataset or gold_standard dictionary from the config file given the label
47 changes: 24 additions & 23 deletions config/config.yaml
@@ -3,11 +3,30 @@
 # The length of the hash used to identify a parameter combination
 hash_length: 7

-# Specify the container framework used by each PRM wrapper. Valid options include:
-# - docker (default if not specified)
-# - singularity OR apptainer -- Apptainer (formerly Singularity) is useful in HPC/HTC environments where docker isn't allowed
-# - dsub -- experimental with limited support, used for running on Google Cloud
-container_framework: docker
+containers:
+  # Specify the container framework used by each PRM wrapper. Valid options include:
+  # - docker (default if not specified)
+  # - singularity OR apptainer -- Apptainer (formerly Singularity) is useful in HPC/HTC environments where docker isn't allowed
+  # - dsub -- experimental with limited support, used for running on Google Cloud
+  framework: docker
+
+  # Only used if framework is set to singularity/apptainer, this will unpack the containers
+  # to the local filesystem. This is useful when PRM containers need to run inside another container,
+  # such as would be the case in an HTCondor/OSPool environment.
+  # NOTE: This unpacks containers to the local filesystem, which will take up space in a way
+  # that persists after the workflow is complete. To clean up the unpacked containers, the user must
+  # manually delete them. For convenience, these unpacked files will exist in the current working directory
+  # under `unpacked`.
+  unpack_singularity: false
+
+  # Allow the user to configure which container registry containers should be pulled from
+  # Note that this assumes container names are consistent across registries, and that the
+  # registry being passed doesn't require authentication for pull actions
+  registry:
+    base_url: docker.io
+    # The owner or project of the registry
+    # For example, "reedcompbio" if the image is available as docker.io/reedcompbio/allpairs
+    owner: reedcompbio

 # Enabling profiling adds a file called 'usage-profile.tsv' to the output directory of each algorithm.
 # The contents of this file describe the CPU utilization and peak memory consumption of the algorithm
@@ -21,24 +40,6 @@ container_framework: docker
 # requirements = versionGE(split(Target.CondorVersion)[1], "24.8.0") && (isenforcingdiskusage =!= true)
 enable_profiling: false

-# Only used if container_framework is set to singularity/apptainer, this will unpack the containers
-# to the local filesystem. This is useful when PRM containers need to run inside another container,
-# such as would be the case in an HTCondor/OSPool environment.
-# NOTE: This unpacks containers to the local filesystem, which will take up space in a way
-# that persists after the workflow is complete. To clean up the unpacked containers, the user must
-# manually delete them. For convenience, these unpacked files will exist in the current working directory
-# under `unpacked`.
-unpack_singularity: false
-
-# Allow the user to configure which container registry containers should be pulled from
-# Note that this assumes container names are consistent across registries, and that the
-# registry being passed doesn't require authentication for pull actions
-container_registry:
-  base_url: docker.io
-  # The owner or project of the registry
-  # For example, "reedcompbio" if the image is available as docker.io/reedcompbio/allpairs
-  owner: reedcompbio
-
 # This list of algorithms should be generated by a script which checks the filesystem for installs.
 # It shouldn't be changed by mere mortals. (alternatively, we could add a path to executable for each algorithm
 # in the list to reduce the number of assumptions of the program at the cost of making the config a little more involved)
43 changes: 23 additions & 20 deletions config/egfr.yaml
@@ -1,28 +1,31 @@
 # The length of the hash used to identify a parameter combination
 hash_length: 7

-# Specify the container framework used by each PRM wrapper. Valid options include:
-# - docker (default if not specified)
-# - singularity -- Also known as apptainer, useful in HPC/HTC environments where docker isn't allowed
-# - dsub -- experimental with limited support, used for running on Google Cloud
-container_framework: docker
+containers:
+  # Specify the container framework used by each PRM wrapper. Valid options include:
+  # - docker (default if not specified)
+  # - singularity -- Also known as apptainer, useful in HPC/HTC environments where docker isn't allowed
+  # - dsub -- experimental with limited support, used for running on Google Cloud with the All of Us cloud environment.
+  # - There is no support for other environments at the moment.
+  framework: docker

-# Only used if container_framework is set to singularity, this will unpack the singularity containers
-# to the local filesystem. This is useful when PRM containers need to run inside another container,
-# such as would be the case in an HTCondor/OSPool environment.
-# NOTE: This unpacks singularity containers to the local filesystem, which will take up space in a way
-# that persists after the workflow is complete. To clean up the unpacked containers, the user must
-# manually delete them.
-unpack_singularity: false
+  # Only used if framework is set to singularity, this will unpack the singularity containers
+  # to the local filesystem. This is useful when PRM containers need to run inside another container,
+  # such as would be the case in an HTCondor/OSPool environment.
+  # NOTE: This unpacks singularity containers to the local filesystem, which will take up space in a way
+  # that persists after the workflow is complete. To clean up the unpacked containers, the user must
+  # manually delete them. For convenience, these unpacked files will exist in the current working directory
+  # under `unpacked`.
+  unpack_singularity: false

-# Allow the user to configure which container registry containers should be pulled from
-# Note that this assumes container names are consistent across registries, and that the
-# registry being passed doesn't require authentication for pull actions
-container_registry:
-  base_url: docker.io
-  # The owner or project of the registry
-  # For example, "reedcompbio" if the image is available as docker.io/reedcompbio/allpairs
-  owner: reedcompbio
+  # Allow the user to configure which container registry containers should be pulled from
+  # Note that this assumes container names are consistent across registries, and that the
+  # registry being passed doesn't require authentication for pull actions
+  registry:
+    base_url: docker.io
+    # The owner or project of the registry
+    # For example, "reedcompbio" if the image is available as docker.io/reedcompbio/allpairs
+    owner: reedcompbio

 algorithms:
 - name: pathlinker
40 changes: 25 additions & 15 deletions docker-wrappers/SPRAS/example_config.yaml
@@ -3,21 +3,31 @@
 # The length of the hash used to identify a parameter combination
 hash_length: 7

-# Specify the container framework. Current supported versions include 'docker' and
-# 'singularity'. If container_framework is not specified, SPRAS will default to docker.
-container_framework: singularity
-
-# Unpack singularity. See config/config.yaml for details.
-unpack_singularity: true
-
-# Allow the user to configure which container registry containers should be pulled from
-# Note that this assumes container names are consistent across registries, and that the
-# registry being passed doesn't require authentication for pull actions
-container_registry:
-  base_url: docker.io
-  # The owner or project of the registry
-  # For example, "reedcompbio" if the image is available as docker.io/reedcompbio/allpairs
-  owner: reedcompbio
+containers:
+  # Specify the container framework used by each PRM wrapper. Valid options include:
+  # - docker (default if not specified)
+  # - singularity OR apptainer -- Apptainer (formerly Singularity) is useful in HPC/HTC environments where docker isn't allowed
+  # - dsub -- experimental with limited support, used for running on Google Cloud
+  framework: singularity
+
+  # Only used if framework is set to singularity/apptainer, this will unpack the containers
+  # to the local filesystem. This is useful when PRM containers need to run inside another container,
+  # such as would be the case in an HTCondor/OSPool environment.
+  # NOTE: This unpacks containers to the local filesystem, which will take up space in a way
+  # that persists after the workflow is complete. To clean up the unpacked containers, the user must
+  # manually delete them. For convenience, these unpacked files will exist in the current working directory
+  # under `unpacked`.
+  # Here, we unpack it since we're running on HTCondor.
+  unpack_singularity: true
+
+  # Allow the user to configure which container registry containers should be pulled from
+  # Note that this assumes container names are consistent across registries, and that the
+  # registry being passed doesn't require authentication for pull actions
+  registry:
+    base_url: docker.io
+    # The owner or project of the registry
+    # For example, "reedcompbio" if the image is available as docker.io/reedcompbio/allpairs
+    owner: reedcompbio

 # This list of algorithms should be generated by a script which checks the filesystem for installs.
 # It shouldn't be changed by mere mortals. (alternatively, we could add a path to executable for each algorithm
11 changes: 6 additions & 5 deletions docs/_static/config/beginner.yaml
@@ -1,9 +1,10 @@
 hash_length: 7
-container_framework: docker
-unpack_singularity: false
-container_registry:
-  base_url: docker.io
-  owner: reedcompbio
+containers:
+  framework: docker
+  unpack_singularity: false
+  registry:
+    base_url: docker.io
+    owner: reedcompbio

 # Each algorithm has an 'include' parameter. By toggling 'include' to true/false the user can change
 # which algorithms are run in a given experiment.
11 changes: 6 additions & 5 deletions docs/_static/config/intermediate.yaml
@@ -1,9 +1,10 @@
 hash_length: 7
-container_framework: docker
-unpack_singularity: false
-container_registry:
-  base_url: docker.io
-  owner: reedcompbio
+containers:
+  framework: docker
+  unpack_singularity: false
+  registry:
+    base_url: docker.io
+    owner: reedcompbio

 # Each algorithm has an 'include' parameter. By toggling 'include' to true/false the user can change
 # which algorithms are run in a given experiment.
2 changes: 1 addition & 1 deletion docs/contributing/index.rst
@@ -285,7 +285,7 @@ Local Neighborhood has no other parameters. Optionally set
 ``include: false`` for the other pathway reconstruction algorithms to
 make testing faster.

-The config file has an option ``owner`` under the ``container_registry``
+The config file has an option ``owner`` under the ``containers.registry``
 settings that controls which Docker Hub account will be used when
 pulling Docker images. The same Docker Hub account will be used for all
 images and cannot currently be set different for each algorithm. Set the
2 changes: 1 addition & 1 deletion docs/htcondor.rst
@@ -71,7 +71,7 @@ it uses the SPRAS apptainer image you created:
 container_image = < your spras image >.sif

 Make sure to modify the configuration file to have
-``unpack_singularity`` set to ``true``, and ``container_framework`` set
+``unpack_singularity`` set to ``true``, and ``containers.framework`` set
 to ``singularity``: else, the workflow will (likely) fail.

 Then run ``condor_submit spras.sub``, which will submit SPRAS to
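The HTCondor requirement stated in the docs/htcondor.rst hunk can be checked before a job is submitted. The following is a minimal sketch, assuming the configuration has already been loaded into a plain dictionary (for example with a YAML parser). `htcondor_config_problems` is a hypothetical helper for illustration and is not part of SPRAS; it follows the wording of the docs literally and only accepts `singularity` as the framework.

```python
from typing import Any


def htcondor_config_problems(config: dict[str, Any]) -> list[str]:
    """Return human-readable problems with the `containers` block.

    Hypothetical helper for illustration only; SPRAS does not ship it.
    """
    containers = config.get("containers") or {}
    problems = []

    # The docs ask for `containers.framework: singularity` on HTCondor.
    framework = str(containers.get("framework", "docker")).lower()
    if framework != "singularity":
        problems.append(f"containers.framework is '{framework}', expected 'singularity'")

    # The docs also ask for `containers.unpack_singularity: true`.
    if containers.get("unpack_singularity", False) is not True:
        problems.append("containers.unpack_singularity is not true")

    return problems


good = {"containers": {"framework": "singularity", "unpack_singularity": True}}
bad = {"containers": {"framework": "docker", "unpack_singularity": False}}
print(htcondor_config_problems(good))
print(htcondor_config_problems(bad))
```

Returning a list of messages rather than raising keeps the check usable both as a pre-submit script and inside a test.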
3 changes: 2 additions & 1 deletion docs/tutorial/advanced.rst
Original file line number Diff line number Diff line change
Expand Up @@ -177,6 +177,7 @@ The global workflow control section in the configuration file allows a user to s

 .. code-block:: yaml

-    container_framework: docker
+    containers:
+      framework: docker

 The frameworks include Docker, Apptainer/Singularity, or dsub
28 changes: 6 additions & 22 deletions spras/config/config.py
@@ -7,7 +7,7 @@
 value. For example

     import spras.config.config as config
-    container_framework = config.config.container_framework
+    container_framework = config.config.container_settings.framework

 will grab the top level registry configuration option as it appears in the config file
 """
@@ -22,7 +22,8 @@
 import numpy as np
 import yaml

-from spras.config.schema import ContainerFramework, RawConfig
+from spras.config.container_schema import ProcessedContainerSettings
+from spras.config.schema import RawConfig
 from spras.util import NpHashEncoder, hash_params_sha1_base32

 config = None
@@ -65,12 +66,6 @@ def __init__(self, raw_config: dict[str, Any]):

         # Directory used for storing output
         self.out_dir = parsed_raw_config.reconstruction_settings.locations.reconstruction_dir
-        # Container framework used by PRMs. Valid options are "docker", "dsub", and "singularity"
-        self.container_framework = None
-        # The container prefix (host and organization) to use for images. Default is "docker.io/reedcompbio"
-        self.container_prefix: str = DEFAULT_CONTAINER_PREFIX
-        # A Boolean specifying whether to unpack singularity containers. Default is False
-        self.unpack_singularity = False
         # A Boolean indicating whether to enable container runtime profiling (apptainer/singularity only)
         self.enable_profiling = False
         # A dictionary to store configured datasets against which SPRAS will be run
@@ -79,6 +74,8 @@ def __init__(self, raw_config: dict[str, Any]):
         self.gold_standards = None
         # The hash length SPRAS will use to identify parameter combinations.
         self.hash_length = parsed_raw_config.hash_length
+        # Container settings used by PRMs.
+        self.container_settings = ProcessedContainerSettings.from_container_settings(parsed_raw_config.containers, self.hash_length)
         # The list of algorithms to run in the workflow. Each is a dict with 'name' as an expected key.
         self.algorithms = None
         # A nested dict mapping algorithm names to dicts that map parameter hashes to parameter combinations.
@@ -295,20 +292,7 @@ def process_config(self, raw_config: RawConfig):
         # Set up a few top-level config variables
         self.out_dir = raw_config.reconstruction_settings.locations.reconstruction_dir

-        if raw_config.container_framework == ContainerFramework.dsub:
-            warnings.warn("'dsub' framework integration is experimental and may not be fully supported.", stacklevel=2)
-        self.container_framework = raw_config.container_framework
-
-        # Unpack settings for running in singularity mode. Needed when running PRM containers if already in a container.
-        if raw_config.unpack_singularity and self.container_framework != "singularity":
-            warnings.warn("unpack_singularity is set to True, but the container framework is not singularity. This setting will have no effect.", stacklevel=2)
-        self.unpack_singularity = raw_config.unpack_singularity
-
-        # Grab registry from the config, and if none is provided default to docker
-        if raw_config.container_registry and raw_config.container_registry.base_url != "" and raw_config.container_registry.owner != "":
-            self.container_prefix = raw_config.container_registry.base_url + "/" + raw_config.container_registry.owner
-
-        if raw_config.enable_profiling and raw_config.container_framework not in ["singularity", "apptainer"]:
+        if raw_config.enable_profiling and raw_config.containers.framework not in ["singularity", "apptainer"]:
             warnings.warn("enable_profiling is set to true, but the container framework is not singularity/apptainer. This setting will have no effect.", stacklevel=2)
         self.enable_profiling = raw_config.enable_profiling

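The `enable_profiling` check in `process_config` compares a `ContainerFramework` member against the plain strings `"singularity"` and `"apptainer"`. That only behaves as intended if enum members compare equal to strings. The diff does not show how `spras.config.util.CaseInsensitiveEnum` is implemented, so the sketch below is an assumption: `CaseInsensitiveStrEnum` is a hypothetical stand-in built from the standard library's `str`/`Enum` mixin and the `_missing_` hook, and `profiling_has_effect` is an illustrative name.

```python
from enum import Enum


class CaseInsensitiveStrEnum(str, Enum):
    """Hypothetical stand-in for spras.config.util.CaseInsensitiveEnum."""

    @classmethod
    def _missing_(cls, value):
        # Called when the normal value lookup fails; retry ignoring case.
        if isinstance(value, str):
            for member in cls:
                if member.value.lower() == value.lower():
                    return member
        return None


class ContainerFramework(CaseInsensitiveStrEnum):
    docker = "docker"
    singularity = "singularity"
    apptainer = "apptainer"
    dsub = "dsub"


def profiling_has_effect(framework: ContainerFramework) -> bool:
    # Mirrors the membership test used in process_config: because the enum
    # mixes in str, members compare equal to their string values.
    return framework in ["singularity", "apptainer"]


print(ContainerFramework("Docker") is ContainerFramework.docker)
print(profiling_has_effect(ContainerFramework("APPTAINER")))
print(profiling_has_effect(ContainerFramework.docker))
```

If the real base class does not mix in `str`, the string comparisons in the diff would need to compare against enum members instead.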
77 changes: 77 additions & 0 deletions spras/config/container_schema.py
@@ -0,0 +1,77 @@
+"""
+The separate container schema specification file.
+For information about pydantic, see schema.py.
+
+We move this to a separate file to allow `containers.py` to explicitly take in
+this subsection of the configuration.
+"""
+
+import warnings
+from dataclasses import dataclass
+
+from pydantic import BaseModel, ConfigDict
+
+from spras.config.util import CaseInsensitiveEnum
+
+DEFAULT_CONTAINER_PREFIX = "docker.io/reedcompbio"
+
+class ContainerFramework(CaseInsensitiveEnum):
+    docker = 'docker'
+    singularity = 'singularity'
+    apptainer = 'apptainer'
+    dsub = 'dsub'
+
+class ContainerRegistry(BaseModel):
+    base_url: str = "docker.io"
+    "The domain of the registry"
+
+    owner: str = "reedcompbio"
+    "The owner or project of the registry"
+
+    model_config = ConfigDict(extra='forbid', use_attribute_docstrings=True)
+
+class ContainerSettings(BaseModel):
+    framework: ContainerFramework = ContainerFramework.docker
+    unpack_singularity: bool = False
+    registry: ContainerRegistry
+
+    model_config = ConfigDict(extra='forbid')
+
+@dataclass
+class ProcessedContainerSettings:
+    framework: ContainerFramework = ContainerFramework.docker
+    unpack_singularity: bool = False
+    prefix: str = DEFAULT_CONTAINER_PREFIX
+    hash_length: int = 7
+    """
+    The hash length for container-specific usage. This does not appear in
+    the output folder, but it may show up in logs, and usually never needs
+    to be tinkered with. This will be the top-level `hash_length` specified
+    in the config.
+
+    We prefer this `hash_length` in our container-running logic to
+    avoid a (future) dependency diamond.
+    """
+
+    @staticmethod
+    def from_container_settings(settings: ContainerSettings, hash_length: int) -> "ProcessedContainerSettings":
+        if settings.framework == ContainerFramework.dsub:
+            warnings.warn("'dsub' framework integration is experimental and may not be fully supported.", stacklevel=2)
+        container_framework = settings.framework
+
+        # Unpack settings for running in singularity mode. Needed when running PRM containers if already in a container.
+        if settings.unpack_singularity and container_framework != "singularity":
+            warnings.warn("unpack_singularity is set to True, but the container framework is not singularity. This setting will have no effect.", stacklevel=2)
+        unpack_singularity = settings.unpack_singularity
+
+        # Grab registry from the config, and if none is provided default to docker
+        container_prefix = DEFAULT_CONTAINER_PREFIX
+        if settings.registry and settings.registry.base_url != "" and settings.registry.owner != "":
+            container_prefix = settings.registry.base_url + "/" + settings.registry.owner
+
+        return ProcessedContainerSettings(
+            framework=container_framework,
+            unpack_singularity=unpack_singularity,
+            prefix=container_prefix,
+            hash_length=hash_length
+        )
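This change moves three top-level keys (`container_framework`, `unpack_singularity`, `container_registry`) under a single `containers` block, so config files written for the old layout need to be updated. A minimal sketch of that mapping follows, assuming the config is held as a plain dictionary. `migrate_container_keys` and `LEGACY_KEYS` are hypothetical names introduced for illustration; nothing in this diff adds such a helper to SPRAS.

```python
from typing import Any

# Old top-level key -> new key under `containers`, as renamed in this change.
LEGACY_KEYS = {
    "container_framework": "framework",
    "unpack_singularity": "unpack_singularity",
    "container_registry": "registry",
}


def migrate_container_keys(config: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `config` with legacy container keys nested under `containers`.

    Hypothetical helper for illustration; it is not part of SPRAS.
    """
    # Keep every key that is not one of the legacy container keys.
    migrated = {k: v for k, v in config.items() if k not in LEGACY_KEYS}
    containers = dict(migrated.get("containers") or {})
    for old_key, new_key in LEGACY_KEYS.items():
        # Values already present under `containers` win over legacy ones.
        if old_key in config and new_key not in containers:
            containers[new_key] = config[old_key]
    if containers:
        migrated["containers"] = containers
    return migrated


legacy = {
    "hash_length": 7,
    "container_framework": "docker",
    "unpack_singularity": False,
    "container_registry": {"base_url": "docker.io", "owner": "reedcompbio"},
}
print(migrate_container_keys(legacy))
```

The input dictionary is left untouched, so the helper can be run repeatedly; a config that already uses the nested layout passes through unchanged.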