Kubeflow is an open-source platform that makes it easy to deploy and manage machine learning workflows on Kubernetes. MLflow is another open-source platform that provides tools for tracking and managing machine learning experiments. In this project, you will be setting up a Kubeflow cluster and integrating it with MLflow to track the experiments run on the cluster. You will then use this setup to train and track the performance of a model on a dataset.
- Install and configure Kubeflow on a Kubernetes/minikube cluster
- Install and configure MLflow on the same cluster
- Write a Python script to train a model on a dataset and log the experiment with MLflow
- Use the Kubeflow pipeline system to run multiple experiments with different hyperparameters and track them with MLflow
- Compare the performance of the different models using the MLflow UI
- Integrate MLflow with Charmed Kubeflow and use MinIO as artifact storage for all of the model's data outputs
- Docker — version 1.19
- kubectl — version 1.15
- minikube — version 1.15
1a) You can deploy Kubeflow Pipelines on a Kubernetes/minikube cluster from a Windows host machine using PowerShell with administrative privileges and the following commands:
$PIPELINE_VERSION = "2.0.0"
kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/cluster-scoped-resources?ref=$PIPELINE_VERSION"
kubectl wait --for condition=established --timeout=60s crd/applications.app.k8s.io
kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/env/platform-agnostic-pns?ref=$PIPELINE_VERSION"kubectl get pods -AIt'll show all the pods in the default as well as Kubeflow namespace.
To view only the pods in the kubeflow namespace, use the following command:
kubectl get pods -n kubeflow
Use the command below to port-forward the pipeline UI service:
kubectl port-forward -n kubeflow svc/ml-pipeline-ui 8080:80
It'll give you a local address through which you can view the Kubeflow dashboard.
After opening localhost:8080 in your browser, you can view the Kubeflow dashboard.
Hence, we have successfully installed and configured Kubeflow on our minikube cluster.
To integrate MLflow and Kubeflow together on a minikube cluster, follow these steps:
2c) Install MLflow on your minikube cluster. You can use Helm charts to simplify the installation process.
Run the following commands to install MLflow using Helm charts:
helm repo add community-charts https://community-charts.github.io/helm-charts
helm install my-mlflow community-charts/mlflow --version 0.7.19
Use the following kubectl command to verify the installation:
kubectl get pods -n default
You'll see your MLflow pod up and running.
Once you type the "mlflow ui" in your terminal it'll give you the Localhost address for accessing your MLflow Dashboard.
Hence, we have successfully Integrated Kubeflow and MLflow on our minikube cluster.
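Before logging anything from your training code, point MLflow at the tracking server running in the cluster. A minimal sketch in Python, assuming you have port-forwarded the MLflow service installed above (the release name my-mlflow and port 5000 depend on your Helm values and may differ):

import mlflow

# Assumes: kubectl port-forward -n default svc/my-mlflow 5000:5000
# The service name and port depend on your Helm release and chart values.
mlflow.set_tracking_uri("http://localhost:5000")

# Quick smoke test: this run should show up in the MLflow UI.
with mlflow.start_run(run_name="connectivity-check"):
    mlflow.log_param("setup", "minikube + helm")
    mlflow.log_metric("ok", 1.0)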
- Create conda environment
- Activate conda environment
You can activate the conda environment created in the previous step using the following command:
conda activate <ENV_NAME>
- Launch Jupyter Notebook from the Anaconda prompt
Type the following command to launch Jupyter Notebook:
jupyter notebook
3b) Create the ".ipynb" file to write a Python script to train a model on a dataset and log the experiment with MLflow
First, let's clone the repository so you have access to the code. You can use the terminal or do it directly in the browser.
git clone https://github.com/adilshaikh165/ML-OPS.git
Then open "MLOPS-INTERNSHIP-ASSESSMENT-TASK.ipynb" to get the gist of the Python script I created to train a model on "bank-full.csv" and log the experiment with MLflow.
Refer the "create_experiment()" Function from the "MLOPS-INTERNSHIP-ASSESSMENT-TASK.ipynb" file.
It'll create and log a "basic classfier experiment" in the MLflow UI and will record all the relavant metrics as shown in the below few screenshots :
You can view all the relavant metrics, tags and artificats related to that perticular run.
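The exact implementation lives in the notebook; this is only a minimal sketch of what such a create_experiment() function typically does, where the dataset separator, feature columns, and metric choices are illustrative assumptions rather than the notebook's exact code:

import mlflow
import mlflow.sklearn
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split

def create_experiment(experiment_name="basic classifier experiment"):
    # Illustrative: load the bank marketing dataset and pick a few numeric columns.
    df = pd.read_csv("bank-full.csv", sep=";")
    X = df[["age", "balance", "duration", "campaign"]]
    y = (df["y"] == "yes").astype(int)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

    mlflow.set_experiment(experiment_name)
    with mlflow.start_run(run_name="basic_classifier"):
        model = LogisticRegression(max_iter=1000)
        model.fit(X_train, y_train)
        preds = model.predict(X_test)

        # Log metrics, tags, and the trained model so they show up in the MLflow UI.
        mlflow.log_metric("accuracy", accuracy_score(y_test, preds))
        mlflow.log_metric("f1_score", f1_score(y_test, preds))
        mlflow.set_tag("model_type", "LogisticRegression")
        mlflow.sklearn.log_model(model, "model")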
Refer the "hyper_parameter_tuning()" Function from the "MLOPS-INTERNSHIP-ASSESSMENT-TASK.ipynb" file.
It'll create and log a "Optimized Classifier Experiment" in the MLflow UI and will record all the relavant metrics as well as Parameters as shown in the below few screenshots :
This time along with the Metrics, tags and artifcats you'll also get to log all hyper parameters, metrics, and artifacts which contains model, roc_auc curve PNG, confusion Matrix PNG Related to that Optimized Model.
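Again, the notebook holds the real implementation; the sketch below only illustrates the hyperparameter-tuning pattern it describes, and the parameter grid, file names, and plotting choices are assumptions for illustration:

import mlflow
import mlflow.sklearn
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import roc_auc_score, RocCurveDisplay, ConfusionMatrixDisplay

def hyper_parameter_tuning(X_train, X_test, y_train, y_test):
    mlflow.set_experiment("Optimized Classifier Experiment")
    param_grid = {"C": [0.01, 0.1, 1, 10], "penalty": ["l2"]}

    with mlflow.start_run(run_name="optimized_classifier"):
        search = GridSearchCV(LogisticRegression(max_iter=1000), param_grid, cv=5, scoring="roc_auc")
        search.fit(X_train, y_train)
        best = search.best_estimator_

        # Log the winning hyperparameters and the test metric.
        mlflow.log_params(search.best_params_)
        mlflow.log_metric("test_roc_auc", roc_auc_score(y_test, best.predict_proba(X_test)[:, 1]))

        # Save the ROC curve and confusion matrix as PNG artifacts.
        RocCurveDisplay.from_estimator(best, X_test, y_test)
        plt.savefig("roc_auc_curve.png")
        mlflow.log_artifact("roc_auc_curve.png")
        plt.close()

        ConfusionMatrixDisplay.from_estimator(best, X_test, y_test)
        plt.savefig("confusion_matrix.png")
        mlflow.log_artifact("confusion_matrix.png")
        plt.close()

        mlflow.sklearn.log_model(best, "model")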
Kubeflow Pipelines (KFP) is the most widely used component of Kubeflow. It lets you turn every step or function in your ML project into a reusable, containerized pipeline component, and chain these components together into an ML pipeline.
For this classifier application, the pipeline is already created with the Python SDK. You can find the code in the file "kf-pipeline.ipynb".
- Write the Python functions needed to train and predict
We need to create several functions in order to train our ML model and make predictions: prepare_data(), train_test_split(), and training_basic_classifier(). You can find all of these functions in the "kf-pipeline.ipynb" file. A hedged sketch of turning one of them into a containerized component is shown below.
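This sketch assumes the KFP v1 SDK; the dataset URL, base image, and package versions are placeholder assumptions, and the real function bodies live in the notebook:

from kfp import components as comp

def prepare_data():
    # Illustrative body: download the dataset and persist it on the mounted volume.
    import pandas as pd
    df = pd.read_csv("https://example.com/bank-full.csv", sep=";")  # placeholder URL
    df = df.dropna()
    df.to_csv("/data/final_df.csv", index=False)  # /data is the mounted volume path

# Wrap the plain Python function as a reusable, containerized KFP component.
create_step_prepare_data = comp.create_component_from_func(
    prepare_data,
    base_image="python:3.9",
    packages_to_install=["pandas==1.5.3"],
)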
- Define the pipeline function and put together all the components
@dsl.pipeline(
    name='Basic MLOPS classifier Kubeflow Demo Pipeline',
    description='A sample pipeline that performs IRIS classifier task'
)
def basic_classifier_pipeline(data_path: str):
    vop = dsl.VolumeOp(
        name="t-vol-1",
        resource_name="t-vol-1",
        size="1Gi",
        modes=dsl.VOLUME_MODE_RWO)

    prepare_data_task = create_step_prepare_data().add_pvolumes({data_path: vop.volume})
    train_test_split = create_step_train_test_split().add_pvolumes({data_path: vop.volume}).after(prepare_data_task)
    classifier_training = create_step_training_basic_classifier().add_pvolumes({data_path: vop.volume}).after(train_test_split)

    prepare_data_task.execution_options.caching_strategy.max_cache_staleness = "P0D"
    train_test_split.execution_options.caching_strategy.max_cache_staleness = "P0D"
    classifier_training.execution_options.caching_strategy.max_cache_staleness = "P0D"
- Mounting a volume for the components' output storage and binding this volume to all the components. The pipeline defines a volume named "t-vol-1" with a size of 1GiB. This volume is used to store the dataset and the model artifacts.
- Compiling the pipeline and generating the YAML
Once the pipeline is compiled, the YAML file is generated automatically and can be uploaded directly to Kubeflow to create experiments and runs from the UI. You can refer to the sample YAML file in the GitHub repo, named "basic_classifier_pipeline_adil.yaml".
kfp.compiler.Compiler().compile(
    pipeline_func=basic_classifier_pipeline,
    package_path='basic_classifier_pipeline_adil.yaml')
- Create a run from the pipeline function using code. A minimal sketch with the KFP client is shown below.
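This sketch assumes the pipeline UI is still port-forwarded to localhost:8080 as in step 1a; the experiment and run names are illustrative:

import kfp

# Connect to the Kubeflow Pipelines API behind the port-forwarded UI from step 1a.
client = kfp.Client(host="http://localhost:8080")

client.create_run_from_pipeline_func(
    basic_classifier_pipeline,
    arguments={"data_path": "/data"},
    experiment_name="basic_classifier_experiment",
    run_name="basic_classifier_run_1",
)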
- Creation of the Persistent Volume
- Prepare data for the train-test split. prepare_data_task loads the dataset from a URL and saves it to a subdirectory called data in the pipeline's working directory.
- Generation of train-test split. train_test_split splits the dataset into a training set and a test set.
- Training of the basic classifier model. classifier_training trains a logistic regression model on the training set. This step involves converting the string columns "job" and "marital" to numeric values.
I have mapped the various categories of the "job" column to static float values and performed one-hot encoding on the "marital" column, along the lines of the sketch below.
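A rough sketch of that encoding step, where the specific float values are illustrative assumptions rather than the notebook's exact mapping:

import pandas as pd

def encode_features(df: pd.DataFrame) -> pd.DataFrame:
    # Map the categorical "job" column to static float values (illustrative mapping).
    job_mapping = {
        "admin.": 0.0, "blue-collar": 1.0, "entrepreneur": 2.0, "housemaid": 3.0,
        "management": 4.0, "retired": 5.0, "self-employed": 6.0, "services": 7.0,
        "student": 8.0, "technician": 9.0, "unemployed": 10.0, "unknown": 11.0,
    }
    df["job"] = df["job"].map(job_mapping)

    # One-hot encode the "marital" column into separate indicator columns.
    df = pd.get_dummies(df, columns=["marital"])
    return df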
6. Integration of MLflow with Charmed Kubeflow, plus artifact storage for all of the model's data outputs using MinIO
Kindly refer to the blog link below, where I explain in depth the MLflow integration with Kubeflow on Charmed Kubeflow using "microk8s". This stable version of Charmed Kubeflow addresses the drawbacks of a traditional local Kubeflow deployment.
Blog link : https://adilshaikh165.hashnode.dev/mlflow-integration-with-kubeflow-on-charmed-kubeflow
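For the MinIO part specifically, MLflow treats MinIO as an S3-compatible artifact store, so the client only needs the S3 endpoint and credentials before it logs artifacts. A minimal sketch, where the endpoint, credentials, and tracking URI are placeholder assumptions for your own deployment (the blog above covers the Charmed Kubeflow specifics):

import os
import mlflow

# MinIO is S3-compatible; MLflow reads these variables when uploading artifacts.
# The endpoint, credentials, and tracking URI below are placeholders for your deployment.
os.environ["MLFLOW_S3_ENDPOINT_URL"] = "http://minio.kubeflow.svc.cluster.local:9000"
os.environ["AWS_ACCESS_KEY_ID"] = "minio"
os.environ["AWS_SECRET_ACCESS_KEY"] = "minio123"

mlflow.set_tracking_uri("http://localhost:5000")
mlflow.set_experiment("minio-artifact-demo")

with mlflow.start_run():
    with open("output.txt", "w") as f:
        f.write("model output")
    # The file is uploaded to the MinIO bucket configured as the experiment's artifact root.
    mlflow.log_artifact("output.txt")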