Add variant_selection pipeline step

## Description
Introduce a new pipeline step named **`variant_selection`** responsible for selecting reportable variants from the `variant_collection` attribute stored in the `Sample` object.

This step will act as a logical filter layer between **variant collection** and **downstream reporting**, ensuring that only variants relevant to each analysis context are propagated further in the pipeline.


## Motivation
Currently, variants are collected and stored in `variant_collection` according to category (e.g. Personal Risk (PR), Reproductive Risk (RR)) and source. However, there is no dedicated step that encapsulates the logic required to **select subsets of variants depending on clinical and analytical context**.

Adding a `variant_selection` step will:

- Centralize and formalize variant selection logic.
- Clearly separate *variant collection* from *variant interpretation/selection*.
- Improve maintainability as selection rules evolve.
- Facilitate future extensions (new categories, modes, or selection criteria).

## Current behavior
- Variants are aggregated and stored in `variant_collection` at the sample level.
- Selection criteria are either applied implicitly downstream or are not explicitly structured as a pipeline step.
- There is no single orchestration point responsible for variant selection across categories and analytical modes.


## Proposed refactor
Add a new step named `variant_selection`. This step will operate **per sample** and will orchestrate variant selection across categories (PR, RR) and modes (screening or advanced, for RR only).

### High-level design  

- The `variant_selection` step will:
  - Iterate over the categories present in the sample (e.g. PR, RR).
  - For each category, call an **auxiliary variant selection function**.
  - Store the resulting selected variants in a structured, category-aware output (exact storage model to be defined later).

- The **auxiliary selection function** will:
  - Iterate over the set of variants stored in `variant_collection`.
  - Apply selection rules based on:
    - Variant category (PR vs RR).
    - Sex of the individual.
    - RR mode (e.g. screening vs advanced).
  - Return only the variants that satisfy the applicable criteria.
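
The design above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual implementation: the `Variant` and `Sample` dataclasses, their field names, the idea that `variant_collection` is a dict keyed by category, the `select_variants` helper and the `placeholder_rule` are all hypothetical. The real selection rules and storage model are still to be defined.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Variant:
    # Hypothetical minimal variant record; the real model will carry more fields.
    gene: str
    zygosity: str  # e.g. "het" or "hom"


@dataclass
class Sample:
    # Hypothetical stand-in for the pipeline's Sample object.
    sex: str
    rr_mode: Optional[str] = None  # "screening" or "advanced"; only relevant for RR
    # Assumption: variants are grouped by category ("PR", "RR").
    variant_collection: Dict[str, List[Variant]] = field(default_factory=dict)
    variant_selection: Dict[str, List[Variant]] = field(default_factory=dict)


# A rule decides whether one variant is reportable in the given context.
Rule = Callable[[Variant, str, str, Optional[str]], bool]


def placeholder_rule(variant, category, sex, rr_mode):
    # Placeholder only: keeps every variant. Real rules are category-specific.
    return True


def select_variants(variants, category, sex, rr_mode, rule: Rule = placeholder_rule):
    """Auxiliary function: return only the variants that satisfy the rule."""
    return [v for v in variants if rule(v, category, sex, rr_mode)]


def variant_selection(sample: Sample, rule: Rule = placeholder_rule) -> Sample:
    """Step: iterate over the categories present in the sample and store the selection."""
    for category, variants in sample.variant_collection.items():
        # The RR mode is only passed on for the RR category.
        mode = sample.rr_mode if category == "RR" else None
        sample.variant_selection[category] = select_variants(
            variants, category, sample.sex, mode, rule
        )
    return sample
```

Passing the rule as a parameter keeps the orchestration independent of the category-specific logic, so new categories or modes would only require new rules.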

### Selection logic (high-level, non-exhaustive)  

- **Personal Risk (PR)**  
  - Variants will be selected according to disease/gene inheritance models.
  - Different cases (e.g. heterozygous, homozygous, compound heterozygous) will be considered depending on the inheritance pattern.

- **Reproductive Risk (RR)**  
  - Variant selection will depend on:
    - RR mode (screening or advanced).
    - Sex of the individual.
  - Different zygosity-based selection rules will apply depending on the context.
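
As an illustration of the Personal Risk part only, a zygosity rule driven by the inheritance model might look like the sketch below. Everything in it is an assumption made for the example: the `Variant` fields, the `INHERITANCE` mapping, the gene names, the `select_pr_variants` function, and the simplified rules (dominant: any variant qualifies; recessive: homozygous variants, or two or more heterozygous variants in the same gene as candidate compound heterozygotes, without phase checking). RR rules would depend on mode and sex and are not sketched here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    # Hypothetical minimal variant record.
    variant_id: str
    gene: str
    zygosity: str  # "het" or "hom"


# Hypothetical gene -> inheritance model mapping ("AD" dominant, "AR" recessive).
INHERITANCE = {"GENE_A": "AD", "GENE_B": "AR", "GENE_C": "AR"}


def select_pr_variants(variants, inheritance=INHERITANCE):
    """Select PR variants per gene according to a simplified inheritance rule."""
    by_gene = defaultdict(list)
    for v in variants:
        by_gene[v.gene].append(v)

    selected = []
    for gene, gene_variants in by_gene.items():
        model = inheritance.get(gene)
        if model == "AD":
            # Dominant: a single affected allele is enough.
            selected.extend(gene_variants)
        elif model == "AR":
            homs = [v for v in gene_variants if v.zygosity == "hom"]
            hets = [v for v in gene_variants if v.zygosity == "het"]
            selected.extend(homs)
            if len(hets) >= 2:
                # Candidate compound heterozygous; phase is not checked here.
                selected.extend(hets)
        # Genes without a known model are skipped in this sketch.
    return selected
```

For example, a lone heterozygous variant in a recessive gene would be dropped by this sketch, while two heterozygous variants in the same recessive gene would both be kept.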

### Tasks
- [x] Add `variant_selection` attribute to Sample / SampleContext
- [x] Add `variant_selection.py` step to steps folder
- [x] Add `pr_variant_selection.py` to variant_selection folder (Personal Risk module) and auxiliary functions to `utils.py` in variant_selection folder
- [x] Add `rr_variant_selection.py` to variant_selection folder (Reproductive Risk module)

## Additional context
- This step is conceptually downstream of `variant_collection`.
- It should be designed to be extensible to additional categories or analytical modes.
- The implementation should remain compatible with previous refactors.

