113 commits
f528cee
initial flwr integration commit
kminhta Oct 21, 2024
f0c41b8
further enabling work
kminhta Oct 21, 2024
aa08b5e
additional updates
kminhta Oct 22, 2024
e0b1103
updates
kminhta Oct 23, 2024
6a815b7
enable supernode process configuration to pull number of partitions a…
kminhta Nov 1, 2024
be6927a
Merge branch 'develop' into flwr-integration-taskrunner
kminhta Nov 22, 2024
6bc91e2
Merge branch 'develop' into flwr-integration-taskrunner
kminhta Dec 7, 2024
a40dc48
update for flwr-nightly
kminhta Dec 7, 2024
3a0e01e
update to flwr 1.14
kminhta Dec 8, 2024
49ef2eb
enable runner to automatically set different client ports
kminhta Dec 16, 2024
f529fe2
add queue-based processing to avoid communication cancellation throug…
kminhta Dec 16, 2024
ffe7b47
add todo
kminhta Dec 16, 2024
a7ebd91
add FLEX component
kminhta Dec 17, 2024
ebf9552
move local grpc server to FLEX
kminhta Dec 17, 2024
8020a2f
change fim to flex
kminhta Dec 17, 2024
700b21b
add method to flex base class for acquiring local grpc client
kminhta Dec 17, 2024
5995fcf
move message conversion out of openfl client methods
kminhta Dec 18, 2024
37893a7
add flwr run to FLEX, add FLEXAssigner
kminhta Dec 18, 2024
48475fd
remove some TODOs and commented out code
kminhta Dec 18, 2024
2c72feb
click path for data.yaml set to true
kminhta Dec 18, 2024
83e6c80
add requirements
kminhta Dec 19, 2024
07f2db7
make importing flwr components conditioned on existing lib
kminhta Dec 19, 2024
da975e5
enable send local task results in order to gracefully terminate
kminhta Dec 20, 2024
d2d6907
fix superlink shutdown
kminhta Dec 20, 2024
18aed1f
gracefully terminate agg processes
kminhta Dec 20, 2024
e290e31
graceful shutdown at collaborators
kminhta Dec 20, 2024
bb000f4
add automatic shutdown to taskrunner
kminhta Dec 20, 2024
81dfdda
remove flower readme, add --insecure as default
kminhta Dec 20, 2024
961982e
add readme
kminhta Dec 20, 2024
ba02e9c
modify plan and update readme
kminhta Dec 20, 2024
7db176d
install openfl instructions
kminhta Dec 20, 2024
bf9f955
update torch and torchvision
kminhta Dec 20, 2024
4a3b37d
expand conditional
kminhta Jan 2, 2025
76b7936
edit taskrunner docstrings
kminhta Jan 2, 2025
ab3093e
update name FLEX to Connector
kminhta Jan 2, 2025
21c13b5
more docstring
kminhta Jan 3, 2025
12ee6df
move app-pytorch to src so that workspace can be properly exported an…
kminhta Jan 7, 2025
27ea8d1
testing gramine
kminhta Jan 9, 2025
caa0ff0
fix monitor subprocess
kminhta Jan 13, 2025
b91e887
adding try excepts for subprocess shutdown
kminhta Jan 14, 2025
1a0149c
termination event fix
kminhta Jan 14, 2025
d1e85d3
more signal handling
kminhta Jan 14, 2025
0f48742
docstrings
kminhta Jan 14, 2025
7b82004
improve docstrings
kminhta Jan 14, 2025
78bd7fd
patch flower
kminhta Jan 16, 2025
f738e49
method to ctrl+c shutdown
kminhta Jan 16, 2025
eb959b2
add os env variable to install flower in workspace
kminhta Jan 17, 2025
753ab7c
save tmp and fab in flwr home
kminhta Jan 17, 2025
806975d
make flwr_home if it doesn't exist
kminhta Jan 17, 2025
8bb0740
fixes to patch
kminhta Jan 17, 2025
cea7bd9
monitor output instead of subprocess calls
kminhta Jan 22, 2025
3d750ef
add exceptions for graceful shutdown to forcefully terminate
kminhta Jan 23, 2025
5fe3c37
updating auto shutdown mechanism to shut off supernode and local grpc…
kminhta Jan 28, 2025
eddb028
update to run shut down command from server
kminhta Jan 28, 2025
121570c
download data beforehand
kminhta Jan 30, 2025
0e74f1c
remove debug break
kminhta Jan 30, 2025
f9b6f20
Merge branch 'develop' into flwr-integration-taskrunner
kminhta Jan 31, 2025
7b84635
Merge branch 'develop' into flwr-integration-taskrunner
kminhta Jan 31, 2025
7664cc6
fixes around log and persistent db
kminhta Jan 31, 2025
f5903e9
fixes to workspace
kminhta Jan 31, 2025
ce9ceb5
update auto shutdown and dataset
kminhta Jan 31, 2025
f996199
give time for child processes to stop
kminhta Jan 31, 2025
3809ff9
write logs set to false to avoid issues with gramine
kminhta Jan 31, 2025
46fa4f9
fix condition to forcefully shutdown process
kminhta Feb 5, 2025
5f48691
add sleep to let client app close
kminhta Feb 6, 2025
07870e8
fixing termination
kminhta Feb 6, 2025
c26c9d2
create a separate try-except block for subprocess
kminhta Feb 6, 2025
96d8570
adjust exception in signal handler
kminhta Feb 7, 2025
80f515a
update connector.yaml
kminhta Feb 10, 2025
d4fa28c
update flwr run command
kminhta Feb 11, 2025
e859d36
fix run-id flag
kminhta Feb 11, 2025
e9f1e36
run_id test
kminhta Feb 11, 2025
efb6752
add debug steps for sgx
kminhta Feb 11, 2025
343de5e
fix flwr run patch
kminhta Feb 11, 2025
38b4f88
new automatic shutdown mech
kminhta Feb 11, 2025
28fa32d
changes to local_grpc_client to enable new shutdown mechanism
kminhta Feb 12, 2025
c219c66
fix automatic_shutdown flag
kminhta Feb 12, 2025
499dedc
debugging
kminhta Feb 12, 2025
9607bb0
more debug lines
kminhta Feb 12, 2025
156ba1f
debug 3
kminhta Feb 12, 2025
951db9d
additional debug statements
kminhta Feb 13, 2025
3af36fe
use process mode and directly track serverapp
kminhta Feb 13, 2025
02c45db
remove some debug stuff
kminhta Feb 13, 2025
d9c9913
remove auto shutdown
kminhta Feb 13, 2025
160ab8a
adding additional failsafes
kminhta Feb 13, 2025
04d173c
logic to save out model
kminhta Feb 14, 2025
6a558c7
stable commit. TODO: fix docstrings. NotImplemented: callbacks
kminhta Feb 20, 2025
306f7ff
stable commit 2.0: switched order of stop to kill serverapp first
kminhta Feb 20, 2025
32e96d0
stable commit 3.0: new self attribute for serverapp
kminhta Feb 20, 2025
7c413f1
stable commit 4.0: adding flower taskrunner and dataloader to init
kminhta Feb 20, 2025
676c273
keep openfl_client at the collaborator
kminhta Feb 20, 2025
7dc9de0
move serverapp callback out
kminhta Feb 20, 2025
70b02a8
attempts to decouple lgs from task runner - still WIP
kminhta Feb 20, 2025
9265d21
add connector utils
kminhta Feb 20, 2025
54cb629
more seamless separate of base class and extensibility of sub class
kminhta Feb 21, 2025
806c0f8
run local grpc server on separate ports
kminhta Feb 21, 2025
1a4bb4b
add flower app installation and fab installation to task runner
kminhta Feb 24, 2025
27866fc
update dataloader to accept datapath
kminhta Feb 24, 2025
95bc634
move installation back to requirements.txt
kminhta Feb 24, 2025
1a34378
add initialize_tensorkey_for_functions method with pass
kminhta Feb 25, 2025
b32054b
debug
kminhta Feb 25, 2025
092c55d
update data loader
kminhta Feb 28, 2025
0525520
fix defaults
kminhta Feb 28, 2025
40b58e4
clean up save function
kminhta Feb 28, 2025
7f25dd0
udpate README.md
kminhta Feb 28, 2025
1253150
pass local_server_port key from yaml to taksrunner
kminhta Mar 3, 2025
5188b3c
remove atomic connection by default
kminhta Mar 3, 2025
f98189f
Merge branch 'develop' into flwr-integration-taskrunner
kminhta Mar 6, 2025
d99d62f
updates
kminhta Mar 6, 2025
1f3de0a
more changes
kminhta Mar 7, 2025
422af90
Merge branch 'develop' into flwr-integration-taskrunner
kminhta Mar 7, 2025
4cba60a
fix headers
kminhta Mar 7, 2025
fb26660
add .workspace
kminhta Mar 7, 2025
2 changes: 2 additions & 0 deletions openfl-workspace/flower-app-pytorch/.workspace
@@ -0,0 +1,2 @@
current_plan_name: default

280 changes: 280 additions & 0 deletions openfl-workspace/flower-app-pytorch/README.md
@@ -0,0 +1,280 @@
# Open(FL)ower

This workspace demonstrates a new functionality in OpenFL to interoperate with [Flower](https://flower.ai/). In particular, a user can now use the Flower API to run on OpenFL infrastructure. OpenFL will act as an intermediary step between the Flower SuperLink and Flower SuperNode to relay messages across the network using OpenFL's transport mechanisms.

## Overview

In this repository, you'll notice a directory under `src` called `app-pytorch`. This is essentially a Flower PyTorch app created using Flower's `flwr new` command that has been modified to run a local federation. The `client_app.py` and `server_app.py` dictate what will be run by the client and server respectively. `task.py` defines the logic that will be executed by each app, such as the model definition, train/test tasks, etc. Under `server_app.py` a section titled "Save Model" is added in order to save the `best.pbuf` and `last.pbuf` models from the experiment in your local workspace under `./save`. This uses native OpenFL logic to store the model as a `.pbuf` in order to later be retrieved by `fx model save` into a native format (limited to `.npz` to be deep learning framework agnostic), but this can be overridden to save the model directly following Flower's recommended method for [saving model checkpoints](https://flower.ai/docs/framework/how-to-save-and-load-model-checkpoints.html).
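As a rough illustration of the override path mentioned above — saving the aggregated parameters directly instead of going through OpenFL's `.pbuf` flow — a framework-agnostic checkpoint helper might look like the sketch below. This is a minimal sketch only: the `save_checkpoint`/`load_checkpoint` names and the plain-`pickle` format are assumptions for illustration, not this workspace's actual API (Flower's how-to uses `torch.save` on a `state_dict`).

```python
import pickle
from pathlib import Path


def save_checkpoint(ndarrays, path="save/last.ckpt"):
    """Persist a list of parameter arrays to disk.

    stdlib pickle is used here only to keep the sketch
    framework-agnostic; swap in torch.save / np.savez as needed.
    """
    p = Path(path)
    p.parent.mkdir(parents=True, exist_ok=True)
    with p.open("wb") as f:
        pickle.dump(ndarrays, f)
    return p


def load_checkpoint(path="save/last.ckpt"):
    """Reload the parameter arrays saved by save_checkpoint."""
    with open(path, "rb") as f:
        return pickle.load(f)
```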

## Execution Methods

There are two ways to execute this:

1. Automatic shutdown, which spawns a `server-app` in isolation and triggers experiment termination once it shuts down. (Default/Recommended)
2. Running `SuperLink` and `SuperNode` as [long-lived components](#long-lived-superlink-and-supernode) that will indefinitely wait for new runs. (Limited Functionality)

## Getting Started

### Install OpenFL

Create virtual env
```sh
pip install virtualenv
virtualenv ./venv
source ./venv/bin/activate
```

Install OpenFL from source
```sh
git clone https://github.com/securefederatedai/openfl.git
cd openfl
pip install -e .
```

### Create a Workspace

Start by creating a workspace:

```sh
fx workspace create --template flower-app-pytorch --prefix my_workspace
cd my_workspace
```

This will create a workspace in your current working directory called `./my_workspace` and install the Flower app defined in `./app-pytorch`. This is where the experiment takes place.

### Configure the Experiment
Under `./plan`, you will find the familiar OpenFL YAML files for configuring the experiment. `cols.yaml` and `data.yaml` will be populated by the collaborators that will run the Flower client app, along with the respective data shard or directory they will train and test on.

`plan.yaml` configures the experiment itself. The Open-Flower integration makes a few key changes to the `plan.yaml`:

1. Introduction of a new top-level key (`connector`) to configure a newly introduced component called `Connector`. Specifically, the Flower integration uses a `Connector` subclass called `ConnectorFlower`. This component is run by the aggregator and is responsible for initializing the Flower `SuperLink` and connecting to the OpenFL server. The `SuperLink` parameters can be configured via `connector.settings.superlink_params`. If nothing is supplied, it will simply run `flower-superlink --insecure` with the command's default settings as dictated by Flower. The `ConnectorFlower` can also run the `flwr run` command, configured via `connector.settings.flwr_run_params`. If `flwr_run_params` is not provided, the user is expected to run `flwr run <app>` from the aggregator machine to initiate the experiment.

```yaml
connector:
  defaults: plan/defaults/connector.yaml
  template: openfl.component.ConnectorFlower
  settings:
    superlink_params:
      insecure: True
      serverappio-api-address: 127.0.0.1:9091
      fleet-api-address: 127.0.0.1:9092
      exec-api-address: 127.0.0.1:9093
    flwr_run_params:
      flwr_app_name: "app-pytorch"
      federation_name: "local-poc"
```
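To make the mapping from `superlink_params` to the launched process concrete, here is a hedged sketch of how a connector might translate those YAML settings into a `flower-superlink` command line. The `build_superlink_command` helper is hypothetical — it is not OpenFL's actual implementation — but it reproduces the documented behavior: no params yields `flower-superlink --insecure`, and each supplied key becomes a CLI flag.

```python
def build_superlink_command(superlink_params=None):
    """Translate a superlink_params mapping into CLI arguments.

    A value of True becomes a bare flag (e.g. --insecure); any other
    value becomes a "--key value" pair. With no params, this falls back
    to the documented default of `flower-superlink --insecure`.
    """
    cmd = ["flower-superlink"]
    if not superlink_params:
        return cmd + ["--insecure"]
    for key, value in superlink_params.items():
        flag = f"--{key.replace('_', '-')}"
        if value is True:
            cmd.append(flag)
        else:
            cmd.extend([flag, str(value)])
    return cmd
```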

2. A `ConnectorAssigner` with task groups that explicitly run the `start_client_adapter` task, defined by the task runner, for every authorized collaborator.

```yaml
assigner:
  defaults: plan/defaults/assigner.yaml
  template: openfl.component.ConnectorAssigner
  settings:
    task_groups:
      - name: Connector_Flower
        tasks:
          - start_client_adapter
```

3. A `FlowerTaskRunner` that executes the `start_client_adapter` task. This task starts the Flower `SuperNode` and connects it to the OpenFL client. The `FlowerTaskRunner` also has a setting, `FlowerTaskRunner.settings.auto_shutdown`, which defaults to `True`. When set to `True`, the task runner shuts down the `SuperNode` at the completion of an experiment; otherwise, it runs continuously.

```yaml
task_runner:
  defaults: plan/defaults/task_runner.yaml
  template: openfl.federated.task.runner_flower.FlowerTaskRunner
  settings:
    auto_shutdown: True
```
4. A `FlowerDataLoader` with similar high-level functionality to other OpenFL dataloaders.

**IMPORTANT NOTE**: `aggregator.settings.rounds_to_train` is set to 1. __Do not edit this__. The actual number of rounds for the experiment is controlled by Flower logic inside `./app-pytorch/pyproject.toml`. The entire Flower experiment runs in a single OpenFL round; the aggregator round exists only to stop the OpenFL components once the experiment completes.
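For reference, in a Flower app generated by `flwr new`, the round count typically lives in the app's `pyproject.toml` under the run config — something like the fragment below. The key names follow Flower's template convention; check `./src/app-pytorch/pyproject.toml` for the exact keys this workspace uses.

```toml
[tool.flwr.app.config]
# Controls the number of Flower rounds; OpenFL's rounds_to_train stays at 1.
num-server-rounds = 3
```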

## Running the Workspace
Run the workspace as normal (certify the workspace, initialize the plan, register the collaborators, etc.):

```SH
# Generate a Certificate Signing Request (CSR) for the Aggregator
fx aggregator generate-cert-request

# The CA signs the aggregator's request, which is now available in the workspace
fx aggregator certify --silent

# Initialize FL Plan and Model Weights for the Federation
fx plan initialize

################################
# Setup Collaborator 1
################################

# Create a collaborator named "collaborator1" that will use shard "0"
fx collaborator create -n collaborator1 -d 0

# Generate a CSR for collaborator1
fx collaborator generate-cert-request -n collaborator1

# The CA signs collaborator1's certificate
fx collaborator certify -n collaborator1 --silent

################################
# Setup Collaborator 2
################################

# Create a collaborator named "collaborator2" that will use shard "1"
fx collaborator create -n collaborator2 -d 1

# Generate a CSR for collaborator2
fx collaborator generate-cert-request -n collaborator2

# The CA signs collaborator2's certificate
fx collaborator certify -n collaborator2 --silent

##############################
# Start to Run the Federation
##############################

# Run the Aggregator
fx aggregator start
```

This will prepare the workspace and start the OpenFL aggregator, the Flower `SuperLink`, and the Flower `ServerApp`. You should see something like:

```SH
INFO 🧿 Starting the Aggregator Service.
.
.
.
INFO : Starting Flower SuperLink
WARNING : Option `--insecure` was set. Starting insecure HTTP server.
INFO : Flower Deployment Engine: Starting Exec API on 127.0.0.1:9093
INFO : Flower ECE: Starting ServerAppIo API (gRPC-rere) on 127.0.0.1:9091
INFO : Flower ECE: Starting Fleet API (GrpcAdapter) on 127.0.0.1:9092
.
.
.
INFO : [INIT]
INFO : Using initial global parameters provided by strategy
INFO : Starting evaluation of initial global parameters
INFO : Evaluation returned no results (`None`)
INFO :
INFO : [ROUND 1]
```

### Start Collaborators
Open 2 additional terminals for collaborators.
For collaborator 1's terminal, run:
```SH
fx collaborator start -n collaborator1
```
For collaborator 2's terminal, run:
```SH
fx collaborator start -n collaborator2
```
This will start the collaborator nodes, the Flower `SuperNode`, and the Flower `ClientApp`, and begin running the Flower experiment. You should see something like:

```SH
INFO 🧿 Starting a Collaborator Service.
.
.
.
INFO : Starting Flower SuperNode
WARNING : Option `--insecure` was set. Starting insecure HTTP channel to 127.0.0.1:...
INFO : Starting Flower ClientAppIo gRPC server on 127.0.0.1:...
INFO :
INFO : [RUN 297994661073077505, ROUND 1]
```
### Completion of the Experiment
Upon completion of the experiment, on the `aggregator` terminal, the Flower components should print an experiment summary while the `SuperLink` continues to receive requests from the `SuperNode`:
```SH
INFO : [SUMMARY]
INFO : Run finished 3 round(s) in 93.29s
INFO : History (loss, distributed):
INFO : round 1: 2.0937052175497555
INFO : round 2: 1.8027011854633406
INFO : round 3: 1.6812996898487116
```
If `automatic_shutdown` is enabled, this will be shortly followed by the OpenFL `aggregator` receiving "results" from the `collaborator` and subsequently shutting down:

```SH
INFO Round 0: Collaborators that have completed all tasks: ['collaborator1', 'collaborator2']
INFO Experiment Completed. Cleaning up...
INFO Sending signal to collaborator collaborator2 to shutdown...
INFO Sending signal to collaborator collaborator1 to shutdown...
INFO [OpenFL Connector] Stopping server process with PID: ...
INFO : SuperLink terminated gracefully.
INFO [OpenFL Connector] Server process stopped.
```
Upon completion of the experiment, on the `collaborator` terminals, the Flower components should output information about the run:

```SH
INFO : [RUN ..., ROUND 3]
INFO : Received: evaluate message
INFO : Start `flwr-clientapp` process
INFO : [flwr-clientapp] Pull `ClientAppInputs` for token ...
INFO : [flwr-clientapp] Push `ClientAppOutputs` for token ...
```

If `automatic_shutdown` is enabled, this will be shortly followed by the OpenFL `collaborator` shutting down:

```SH
INFO : SuperNode terminated gracefully.
INFO SuperNode process terminated.
INFO Shutting down local gRPC server...
INFO local gRPC server stopped.
INFO Waiting for tasks...
INFO Received shutdown signal. Exiting...
```
Congratulations, you have run a Flower experiment through OpenFL's task runner!

## Advanced Usage
### Long-lived SuperLink and SuperNode
A user can set `automatic_shutdown: False` in the `Connector` settings of the `plan.yaml`.

```yaml
connector:
  defaults: plan/defaults/connector.yaml
  template: openfl.component.ConnectorFlower
  settings:
    automatic_shutdown: False
```

By doing so, Flower's `ServerApp` and `ClientApp` will still shut down at the completion of the Flower experiment, but the `SuperLink` and `SuperNode` will continue to run. As a result, on the `aggregator` terminal, you will see a constant stream of requests coming from the `SuperNode`:

```SH
INFO : GrpcAdapter.PullTaskIns
INFO : GrpcAdapter.PullTaskIns
INFO : GrpcAdapter.PullTaskIns
```
You can run another experiment by opening another terminal, navigating to this workspace, and running:
```SH
flwr run ./src/app-pytorch
```
It will run another experiment. Once you are done, you can manually shut down OpenFL's `collaborator` and Flower's `SuperNode` with `CTRL+C`. This triggers a task completion by the task runner, which subsequently begins the graceful shutdown of the OpenFL and Flower components.

### Running in SGX Enclave
Gramine does not support all Linux system calls. The Flower FAB is built and installed at runtime; during this, `utime()` is called, which is an [unsupported call](https://gramine.readthedocs.io/en/latest/devel/features.html#list-of-system-calls), resulting in errors or unexpected behavior. To navigate this, when running in an SGX enclave, we opt to build and install the FAB during initialization and package it alongside the OpenFL workspace. To make this work, we introduce some patches to Flower's build command. In addition, since secure enclaves have strict read/write permissions, dictated by a set of trusted/allowed files, we also patch Flower's telemetry in order to consolidate written file locations.

To enable these patches, simply add `patch: True` to the `Connector` and `Task Runner` settings. For the `Task Runner`, also include the name of the Flower app to build and install.

```yaml
connector:
  defaults: plan/defaults/connector.yaml
  template: openfl.component.ConnectorFlower
  settings:
    superlink_params:
      insecure: True
      serverappio-api-address: 127.0.0.1:9091
      fleet-api-address: 127.0.0.1:9092
      exec-api-address: 127.0.0.1:9093
      patch: True
    flwr_run_params:
      flwr_app_name: "app-pytorch"
      federation_name: "local-poc"
      patch: True

task_runner:
  defaults: plan/defaults/task_runner.yaml
  template: openfl.federated.task.runner_flower.FlowerTaskRunner
  settings:
    patch: True
    flwr_app_name: "app-pytorch"
```
5 changes: 5 additions & 0 deletions openfl-workspace/flower-app-pytorch/plan/cols.yaml
@@ -0,0 +1,5 @@
# Copyright (C) 2024 Intel Corporation
# Licensed subject to the terms of the separately executed evaluation license agreement between Intel Corporation and you.

collaborators:

2 changes: 2 additions & 0 deletions openfl-workspace/flower-app-pytorch/plan/data.yaml
@@ -0,0 +1,2 @@
# Copyright (C) 2024 Intel Corporation
# Licensed subject to the terms of the separately executed evaluation license agreement between Intel Corporation and you.
58 changes: 58 additions & 0 deletions openfl-workspace/flower-app-pytorch/plan/plan.yaml
@@ -0,0 +1,58 @@
# Copyright (C) 2024 Intel Corporation
# Licensed subject to the terms of the separately executed evaluation license agreement between Intel Corporation and you.

aggregator :
  defaults : plan/defaults/aggregator.yaml
  template : openfl.component.Aggregator
  settings :
    rounds_to_train : 1 # DO NOT EDIT. This is to indicate OpenFL communication rounds
    persist_checkpoint : false
    write_logs : false

connector :
  defaults : plan/defaults/connector.yaml
  template : openfl.component.ConnectorFlower
  settings :
    superlink_params :
      insecure : True
      serverappio-api-address : 127.0.0.1:9091 # note [kta-intel]: ServerApp will connect here
      fleet-api-address : 127.0.0.1:9092 # note [kta-intel]: local gRPC client will connect here
      exec-api-address : 127.0.0.1:9093 # note [kta-intel]: port for server-app toml (for flwr run)
    flwr_run_params :
      flwr_app_name : "app-pytorch"
      federation_name : "local-poc"

collaborator :
  defaults : plan/defaults/collaborator.yaml
  template : openfl.component.Collaborator

data_loader :
  defaults : plan/defaults/data_loader.yaml
  template : openfl.federated.data.loader_flower.FlowerDataLoader
  settings :
    collaborator_count : 2

task_runner :
  defaults : plan/defaults/task_runner.yaml
  template : openfl.federated.task.runner_flower.FlowerTaskRunner

network :
  defaults : plan/defaults/network.yaml

assigner :
  defaults : plan/defaults/assigner.yaml
  template : openfl.component.RandomGroupedAssigner
  settings :
    task_groups :
      - name : Connector_Flower
        percentage : 1.0
        tasks :
          - start_client_adapter

tasks :
  defaults : plan/defaults/tasks_connector.yaml
  settings :
    connect_to : Flower

compression_pipeline :
  defaults : plan/defaults/compression_pipeline.yaml
1 change: 1 addition & 0 deletions openfl-workspace/flower-app-pytorch/requirements.txt
@@ -0,0 +1 @@
./src/app-pytorch
@@ -0,0 +1 @@
"""app-pytorch: A Flower / PyTorch app."""