1 change: 1 addition & 0 deletions README.md
@@ -9,6 +9,7 @@ Get started today and see how Rhino Health's client resources can help you build
- **Tutorials**
- [Tutorial #1 - Rhino Health Federated Computing Platform “Hello World” - Basic Usage](./tutorials/tutorial_1/README.md)
- [Tutorial #2 - Rhino Health Federated Computing Platform Data Harmonization on the FCP SDK](./tutorials/tutorial_2/README.md)
- [Advanced Tutorials](./tutorials/advanced)
- **Sandbox**
- [Pneumonia Prediction](./sandbox/pneumonia-prediction/README.md)
- [Rhino Utils](./rhino-utils/README.md) - Utilities for locally running and pushing containers to your workgroup's ECR for both FCP Generalized Compute and NVFlare models
133 changes: 133 additions & 0 deletions tutorials/advanced/secrets-manager-encryption/README.md
@@ -0,0 +1,133 @@
# Secrets Manager Integration and Encryption with Rhino FCP

This tutorial demonstrates how to use Rhino FCP's third-party Secrets Manager integration to encrypt and protect sensitive code and model weights in federated learning applications. The example uses a Chemprop-based molecular property prediction model with the NVFlare federated learning framework.

## Overview

This tutorial shows how to:
- Encrypt sensitive Python code and model weights using hybrid RSA/AES encryption
- Use Rhino FCP's third-party Secrets Manager integration to securely store decryption keys
- Deploy encrypted federated learning applications that automatically decrypt at runtime
- Use the encrypted model in a federated learning workflow with NVFlare

## Directory Structure

```
secrets-manager-encryption/
├── README.md                                   # This file
├── data/                                       # Sample molecular datasets
│   ├── cyp3a4_A.csv                            # Training dataset A
│   ├── cyp3a4_B.csv                            # Training dataset B
│   ├── cyp3a4_C.csv                            # Training dataset C
│   └── cyp3a4_test.csv                         # Test dataset
└── code/                                       # Application code
    ├── chemprop_fl_classification.py           # Main federated learning model
    ├── requirements.txt                        # Python dependencies
    ├── Dockerfile                              # Container configuration
    ├── entrypoint.sh                           # Container entrypoint for decryption
    ├── infer.py                                # Inference script
    ├── meta.conf                               # Model metadata
    ├── model_parameters.pt                     # Pre-trained model weights
    ├── encrypt_code/                           # Encryption utilities
    │   └── encrypt_code.py                     # File encryption with AWS Secrets Manager integration
    └── app/                                    # Application files
        ├── custom/                             # Custom application code
        │   ├── chemprop_fl_classification.py.enc   # Encrypted main model
        │   ├── model_parameters.pt.enc             # Encrypted model weights
        │   ├── decrypt_code.py                     # Runtime decryption utility
        │   └── encrypted_persistor.py              # Encrypted model persistence
        └── config/                             # NVFlare configuration
            ├── config_fed_client.conf          # Federated client configuration
            └── config_fed_server.conf          # Federated server configuration
```

## Prerequisites

- Python 3.12+
- Docker
- Access to Rhino FCP platform
- Access to a third-party Secrets Manager platform (this example uses AWS Secrets Manager)

## Quick Start

### 1. Configure AWS Secrets Manager

Before encrypting files, you need to configure AWS Secrets Manager:

1. **Set up AWS IAM Role**: Create an IAM role with permission to access Secrets Manager.
2. **Update Configuration**: Set your AWS account ID and role name in `encrypt_code.py`:
   ```python
   ACCOUNT_ID = '<your-account-id>'
   ROLE_NAME = '<your-role-name>'
   ```
3. **Key Management**: The encryption utility automatically generates RSA key pairs and stores each one as an individual secret in AWS Secrets Manager.
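As a rough illustration of how the two settings in step 2 might be used, the sketch below combines them into an IAM role ARN. The `arn:aws:iam::<account-id>:role/<role-name>` form is standard AWS syntax, but whether `encrypt_code.py` builds the ARN this way is an assumption, and `build_role_arn` is a hypothetical helper that is not part of the tutorial code. The account ID and role name are placeholders.

```python
ACCOUNT_ID = "123456789012"    # placeholder 12-digit account ID
ROLE_NAME = "my-secrets-role"  # placeholder role name


def build_role_arn(account_id: str, role_name: str) -> str:
    """Return the IAM role ARN for an account ID and role name (hypothetical helper)."""
    if not (account_id.isdigit() and len(account_id) == 12):
        raise ValueError("AWS account IDs are 12-digit numbers")
    return f"arn:aws:iam::{account_id}:role/{role_name}"


role_arn = build_role_arn(ACCOUNT_ID, ROLE_NAME)
print(role_arn)
```

Validating the account ID early tends to surface a misconfigured placeholder before any AWS call is attempted.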

### 2. Encrypt Sensitive Files

Encrypt your model code and weights using the AWS Secrets Manager integration:

```bash
cd code/encrypt_code

# Encrypt the main model file
python encrypt_code.py -i ../chemprop_fl_classification.py \
    -k model_key \
    -o ../app/custom/chemprop_fl_classification.py.enc

# Encrypt model weights
python encrypt_code.py -i ../model_parameters.pt \
    -k model_key \
    -o ../app/custom/model_parameters.pt.enc

# Optional: add -d to delete the original file after encryption
python encrypt_code.py -i ../chemprop_fl_classification.py \
    -k model_key \
    -o ../app/custom/chemprop_fl_classification.py.enc \
    -d
```

The encryption utility will:
- Automatically generate RSA key pairs if they don't exist
- Store each key pair as a separate secret in AWS Secrets Manager using the key name as the secret ID
- Use hybrid RSA/AES encryption for file protection
- Optionally delete original files after encryption (use `-d` flag)
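To make the command-line interface above concrete, here is a minimal `argparse` sketch that accepts the same four short flags (`-i`, `-k`, `-o`, `-d`). The long option names, the help strings, and which flags are required are assumptions for illustration; the real `encrypt_code.py` may define them differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser mirroring the short flags used above (long names are assumed)."""
    parser = argparse.ArgumentParser(description="Encrypt a file with a named key")
    parser.add_argument("-i", "--input", required=True, help="file to encrypt")
    parser.add_argument("-k", "--key-name", required=True,
                        help="key name, also used as the secret ID")
    parser.add_argument("-o", "--output", required=True,
                        help="path of the encrypted output file")
    parser.add_argument("-d", "--delete-original", action="store_true",
                        help="delete the input file after encryption")
    return parser


# Parse the "encrypt model weights" command from the example above.
args = build_parser().parse_args(
    ["-i", "../model_parameters.pt", "-k", "model_key",
     "-o", "../app/custom/model_parameters.pt.enc"]
)
print(args.input, args.key_name, args.output, args.delete_original)
```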

### 3. Configure Rhino FCP to Use Secrets Manager

Contact a Rhino FCP representative to enable direct Secrets Manager integration for your organization. Only a Rhino admin can configure this.

### 4. Deploy the Application

Build and deploy the Docker container:

```bash
cd code
../../../../rhino-utils/docker-push.sh <workgroup-ecr> <container-name>
```

The container will automatically:
- Decrypt files at startup using the private key from Secrets Manager
- Run the federated learning application
- Participate in the federated training process

## How It Works

### Encryption Process

1. **Hybrid Encryption**: RSA is used for key exchange and AES for data encryption
2. **File Encryption**: Sensitive files are encrypted with AES using a random session key
3. **Key Protection**: The session key is encrypted with the RSA public key
4. **Secure Storage**: Each key pair is stored as a separate secret in AWS Secrets Manager
5. **Key Management**: Private keys are retrieved through Rhino FCP's Secrets Manager integration at runtime
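The steps above can be sketched with the `cryptography` package. This is a minimal illustration of hybrid RSA/AES encryption, not the tutorial's `encrypt_code.py`: the choice of AES-GCM, OAEP with SHA-256, a 2048-bit key, and the function names `hybrid_encrypt` and `hybrid_decrypt` are all assumptions, and the key pair is generated locally here instead of being stored in Secrets Manager.

```python
import os

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# OAEP with SHA-256 is an assumed padding choice, not confirmed by the tutorial.
OAEP = padding.OAEP(
    mgf=padding.MGF1(algorithm=hashes.SHA256()),
    algorithm=hashes.SHA256(),
    label=None,
)


def hybrid_encrypt(public_key, plaintext: bytes) -> tuple[bytes, bytes, bytes]:
    """Encrypt data with a fresh AES session key, then wrap that key with RSA."""
    session_key = AESGCM.generate_key(bit_length=256)  # random per-file key
    nonce = os.urandom(12)                             # 96-bit AES-GCM nonce
    ciphertext = AESGCM(session_key).encrypt(nonce, plaintext, None)
    wrapped_key = public_key.encrypt(session_key, OAEP)
    return wrapped_key, nonce, ciphertext


def hybrid_decrypt(private_key, wrapped_key: bytes, nonce: bytes, ciphertext: bytes) -> bytes:
    """Unwrap the session key with RSA, then decrypt the data with AES-GCM."""
    session_key = private_key.decrypt(wrapped_key, OAEP)
    return AESGCM(session_key).decrypt(nonce, ciphertext, None)


private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
wrapped_key, nonce, ciphertext = hybrid_encrypt(private_key.public_key(), b"model weights")
recovered = hybrid_decrypt(private_key, wrapped_key, nonce, ciphertext)
```

RSA only ever encrypts the 32-byte session key, so file size is not limited by the RSA modulus, which is the usual reason for choosing a hybrid scheme.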

### Runtime Decryption

1. **Container Startup**: The `entrypoint.sh` script runs before the main application
2. **Key Retrieval**: `decrypt_code.py` retrieves the private key through Rhino FCP's integration with the third-party Secrets Manager
3. **File Decryption**: All `.enc` files are automatically decrypted using the retrieved key
4. **Application Execution**: The decrypted files are used by the federated learning application
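The file decryption step could look roughly like the sketch below, which walks a directory, decrypts every `.enc` file, and writes the result next to it without the `.enc` suffix. The helper name `decrypt_tree` and the injected `decrypt` callable are hypothetical; the actual `decrypt_code.py` is not shown in this README, and the demo uses a byte-reversing stand-in so it runs without any keys.

```python
import tempfile
from pathlib import Path
from typing import Callable


def decrypt_tree(root: Path, decrypt: Callable[[bytes], bytes]) -> list[Path]:
    """Decrypt every *.enc file under root, writing each result without the .enc suffix."""
    written = []
    for enc_path in sorted(root.rglob("*.enc")):
        out_path = enc_path.with_suffix("")  # model_parameters.pt.enc -> model_parameters.pt
        out_path.write_bytes(decrypt(enc_path.read_bytes()))
        written.append(out_path)
    return written


# Demo with a stand-in "cipher" (byte reversal) in place of real decryption.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "model_parameters.pt.enc").write_bytes(b"sthgiew")
    outputs = decrypt_tree(root, lambda blob: blob[::-1])
    names = [p.name for p in outputs]
    contents = outputs[0].read_bytes()

print(names, contents)
```

Passing the decryption function in as a parameter keeps the file handling separate from key retrieval, which makes this part easy to test in isolation.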

## Getting Help

For additional support, please reach out to [support@rhinohealth.com](mailto:support@rhinohealth.com).
68 changes: 68 additions & 0 deletions tutorials/advanced/secrets-manager-encryption/code/Dockerfile
@@ -0,0 +1,68 @@
FROM python:3.12-slim-bullseye

# Set env vars to be able to run apt-get commands without issues.
ENV LC_ALL="C.UTF-8"
ENV TZ=Etc/UTC

# The base image already provides Python 3.12. This step keeps downloaded apt
# packages so the cache mounts below are effective, then installs the
# distribution's python3 packages.
RUN rm -f /etc/apt/apt.conf.d/docker-clean; \
    echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' > /etc/apt/apt.conf.d/keep-cache
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt,sharing=locked \
    export DEBIAN_FRONTEND=noninteractive \
    && apt-get update -qq \
    && apt-get install -q -y --no-install-recommends python3 python3-dev python3-venv

# Set up user and group.
ARG UID=5642
ARG GID=5642
RUN <<"EOF" bash
set -eu -o pipefail
if [[ $UID -eq 0 ]]; then
    # Symlink /home/localuser to the root users home dir
    home_dir="$(getent passwd $UID | cut -d: -f6)"
    ln -s "$home_dir" /home/localuser
else
    if [[ $UID -ge 1000 ]] && getent passwd $UID >/dev/null; then
        # Delete the existing user
        user_name="$(getent passwd $UID | cut -d: -f1)"
        userdel "$user_name" >/dev/null
    fi
    if [[ $GID -ge 1000 ]] && getent group $GID >/dev/null; then
        # Delete the existing group
        group_name="$(getent group $GID | cut -d: -f1)"
        groupdel "$group_name" >/dev/null
    fi
    getent group $GID >/dev/null || groupadd -r -g $GID localgroup
    useradd -m -l -s /bin/bash -g $GID -N -u $UID localuser
fi
EOF

# Create and "activate" venv.
ENV VIRTUAL_ENV="/venv"
RUN mkdir $VIRTUAL_ENV \
    && chmod g+s $VIRTUAL_ENV \
    && chown $UID:$GID $VIRTUAL_ENV \
    && python3 -m venv --upgrade-deps $VIRTUAL_ENV
ENV PATH="$VIRTUAL_ENV/bin:$PATH"

# Install dependencies.
RUN --mount=type=cache,target=/root/.cache/pip \
    --mount=type=bind,source=requirements.txt,target=/requirements.txt \
    PIP_ROOT_USER_ACTION=ignore pip install -r /requirements.txt

WORKDIR /home/localuser
USER localuser

COPY --chown=$UID:$GID ./app ./app
COPY --chown=$UID:$GID ./infer.py ./meta.conf ./entrypoint.sh ./

ENV PYTHONPATH="/home/localuser/app/custom"
ENV PYTHONUNBUFFERED=1

# Use a custom entrypoint to decrypt code before executing commands.
# Ensure files are executable.
RUN chmod +x ./infer.py
RUN chmod +x ./entrypoint.sh

ENTRYPOINT ["./entrypoint.sh"]
78 changes: 78 additions & 0 deletions tutorials/advanced/secrets-manager-encryption/code/app/config/config_fed_client.conf
@@ -0,0 +1,78 @@
{
  format_version = 2
  app_script = "chemprop_fl_classification.py"
  app_config = ""
  executors = [
    {
      tasks = [
        "train"
      ]
      executor {
        path = "nvflare.app_opt.pt.client_api_launcher_executor.PTClientAPILauncherExecutor"
        args {
          launcher_id = "launcher"
          pipe_id = "pipe"
          heartbeat_timeout = 60
          params_exchange_format = "pytorch"
          params_transfer_type = "DIFF"
          train_with_evaluation = true
        }
      }
    }
  ]
  task_data_filters = []
  task_result_filters = []
  components = [
    {
      id = "launcher"
      path = "nvflare.app_common.launchers.subprocess_launcher.SubprocessLauncher"
      args {
        # script = "python3 -u custom/{app_script} {app_config} "
        script = "python3 /home/localuser/app/custom/{app_script} {app_config} "
        launch_once = true
      }
    }
    {
      id = "pipe"
      path = "nvflare.fuel.utils.pipe.cell_pipe.CellPipe"
      args {
        mode = "PASSIVE"
        site_name = "{SITE_NAME}"
        token = "{JOB_ID}"
        root_url = "{ROOT_URL}"
        secure_mode = "{SECURE_MODE}"
        workspace_dir = "{WORKSPACE}"
      }
    }
    {
      id = "metrics_pipe"
      path = "nvflare.fuel.utils.pipe.cell_pipe.CellPipe"
      args {
        mode = "PASSIVE"
        site_name = "{SITE_NAME}"
        token = "{JOB_ID}"
        root_url = "{ROOT_URL}"
        secure_mode = "{SECURE_MODE}"
        workspace_dir = "{WORKSPACE}"
      }
    }
    {
      id = "metric_relay"
      path = "nvflare.app_common.widgets.metric_relay.MetricRelay"
      args {
        pipe_id = "metrics_pipe"
        event_type = "fed.analytix_log_stats"
        read_interval = 0.1
      }
    }
    {
      id = "config_preparer"
      path = "nvflare.app_common.widgets.external_configurator.ExternalConfigurator"
      args {
        component_ids = [
          "metric_relay"
        ]
      }
    }
  ]
}
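The launcher's `script` value contains `{app_script}` and `{app_config}` placeholders that refer to the top-level settings in the same file. As a rough illustration of the command the launcher would end up running, the sketch below expands the template with Python's `str.format`. This stands in for NVFlare's own variable resolution, which is assumed here and not reproduced.

```python
# Values copied from the client configuration above.
config_vars = {"app_script": "chemprop_fl_classification.py", "app_config": ""}
script_template = "python3 /home/localuser/app/custom/{app_script} {app_config} "

# str.format is only a stand-in for NVFlare's variable substitution (assumption).
command = script_template.format(**config_vars)
print(command.split())
```

Because the script path points at the decrypted `.py` file in `app/custom`, the file must already be decrypted by the time the launcher starts the subprocess.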
63 changes: 63 additions & 0 deletions tutorials/advanced/secrets-manager-encryption/code/app/config/config_fed_server.conf
@@ -0,0 +1,63 @@
{
  format_version = 2
  task_data_filters = []
  task_result_filters = []
  model_class_path = "chemprop_fl_classification.ClassificationMPNN"
  workflows = [
    {
      id = "scatter_and_gather"
      path = "nvflare.app_common.workflows.scatter_and_gather.ScatterAndGather"
      args {
        min_clients = 3
        num_rounds = 2
        start_round = 0
        wait_time_after_min_received = 10
        aggregator_id = "aggregator"
        persistor_id = "persistor"
        shareable_generator_id = "shareable_generator"
        train_task_name = "train"
        train_timeout = 0
      }
    }
  ]
  components = [
    {
      id = "persistor"
      path = "encrypted_persistor.EncryptedPersistor"
      args {
        model {
          path = "{model_class_path}"
        }
        global_model_file_name = "/output/model_parameters.pt.enc"
      }
    }
    {
      id = "shareable_generator"
      path = "nvflare.app_common.shareablegenerators.full_model_shareable_generator.FullModelShareableGenerator"
      args {}
    }
    {
      id = "aggregator"
      path = "nvflare.app_common.aggregators.intime_accumulate_model_aggregator.InTimeAccumulateWeightedAggregator"
      args {
        expected_data_kind = "WEIGHT_DIFF"
      }
    }
    {
      id = "model_selector"
      path = "nvflare.app_common.widgets.intime_model_selector.IntimeModelSelector"
      args {
        key_metric = "val/roc"
      }
    }
    {
      id = "receiver"
      path = "nvflare.app_opt.tracking.tb.tb_receiver.TBAnalyticsReceiver"
      args {
        events = [
          "fed.analytix_log_stats"
        ]
      }
    }
  ]
}
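The server configuration pairs `params_transfer_type = "DIFF"` on the clients with an aggregator that expects `WEIGHT_DIFF` data, so each round combines per-client weight differences and applies the result to the global model. The plain-Python sketch below shows the general idea of a weighted average of diffs across three clients (matching `min_clients = 3`). It is not NVFlare's implementation: the function names, the dictionary-of-lists representation, and the per-client weights are assumptions for illustration.

```python
def aggregate_weight_diffs(updates):
    """Weighted average of per-client weight diffs.

    updates: list of (diff, weight) pairs, where diff maps a parameter name
    to a list of floats and weight is that client's aggregation weight.
    """
    total = sum(weight for _, weight in updates)
    if total <= 0:
        raise ValueError("total aggregation weight must be positive")
    first_diff = updates[0][0]
    return {
        name: [
            sum(diff[name][i] * weight for diff, weight in updates) / total
            for i in range(len(values))
        ]
        for name, values in first_diff.items()
    }


def apply_diff(global_weights, diff):
    """Add an aggregated diff to the current global weights."""
    return {
        name: [w + d for w, d in zip(values, diff[name])]
        for name, values in global_weights.items()
    }


# Three hypothetical clients; the third carries twice the weight of the others.
updates = [
    ({"w": [1.0, 2.0]}, 1),
    ({"w": [3.0, 4.0]}, 1),
    ({"w": [5.0, 6.0]}, 2),
]
avg_diff = aggregate_weight_diffs(updates)
new_global = apply_diff({"w": [10.0, 20.0]}, avg_diff)
print(avg_diff, new_global)
```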