This repository contains the setup process for deploying a remote tracking server using MLflow on AWS. The guide provides step-by-step instructions for configuring AWS services, launching an EC2 instance, setting up MLflow, and integrating it with model training code.
The purpose of this repository is to guide you through the process of setting up a centralized tracking server for managing machine learning experiments using MLflow on AWS. By following this guide, users can establish a scalable and secure infrastructure for tracking experiment metadata, parameters, metrics, and artifacts, facilitating collaboration, reproducibility, and efficient model development in the MLOps pipeline.
- Objective: Launch a new EC2 instance to host the MLflow tracking server.
- Actions: From the AWS console, launch an Ubuntu instance (the later steps use `apt`), attach a key pair for SSH access, and open port 5000 in its security group so the tracking server is reachable. A CLI equivalent is sketched below.
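As a rough sketch, the same launch can be scripted with the AWS CLI; the AMI ID, key-pair name, and security-group ID below are placeholders rather than values from this guide:

```bash
# Launch a single Ubuntu instance to host the MLflow tracking server.
# <ubuntu-ami-id>, <your-key-pair>, and <mlflow-sg-id> are placeholders.
aws ec2 run-instances \
    --image-id <ubuntu-ami-id> \
    --instance-type t2.micro \
    --key-name <your-key-pair> \
    --security-group-ids <mlflow-sg-id> \
    --count 1

# Open port 5000 (MLflow's default) so the tracking UI and API are reachable.
aws ec2 authorize-security-group-ingress \
    --group-id <mlflow-sg-id> \
    --protocol tcp \
    --port 5000 \
    --cidr 0.0.0.0/0
```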
- Objective: Create a new PostgreSQL database on RDS to serve as the backend store for the MLflow remote server.
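The guide doesn't record the exact RDS settings; a minimal CLI sketch, with the instance identifier, size, credentials, and database name all as placeholder assumptions, would be:

```bash
# Create a small PostgreSQL instance on RDS to act as the MLflow backend store.
# The identifier, instance class, storage size, and credentials are placeholders.
aws rds create-db-instance \
    --db-instance-identifier mlflow-db \
    --engine postgres \
    --db-instance-class db.t3.micro \
    --allocated-storage 20 \
    --master-username <db-username> \
    --master-user-password <db-password> \
    --db-name mlflow_db
```

The database's security group must also allow inbound traffic on port 5432 from the EC2 instance so the tracking server can reach it.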
- Objective: Set up the EC2 instance with MLflow and its dependencies.
- Actions:
  - SSH into the EC2 instance to install the necessary dependencies.
  - Install pip, Pipenv, and the MLflow dependencies using Pipenv:

```bash
sudo apt update
sudo apt install python3-pip
pip3 install pipenv   # Pipenv itself must be installed before the commands below
pipenv install mlflow
pipenv install awscli
pipenv install boto3
```
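After the installs, the dependencies live in the project's Pipenv environment; entering its shell (a standard Pipenv command, not shown in the original steps) puts `mlflow` and `aws` on the PATH:

```bash
# Enter the Pipenv-managed environment for the subsequent commands.
pipenv shell
```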
- Objective: To allow uploading artifacts to the S3 bucket, set AWS credentials as environment variables on the EC2 instance.

```bash
export AWS_ACCESS_KEY_ID=<your-aws-access-key-id>
export AWS_SECRET_ACCESS_KEY=<your-aws-secret-access-key>
```

- Objective: Before launching the server, check that the instance can access the S3 bucket.
- Action: Run the following command from the EC2 instance and confirm that the mlops101-storage bucket appears in the output.
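The command itself is missing from the original text; listing the buckets visible to the configured credentials is one way to run this check:

```bash
# The artifact bucket should appear in the listing.
aws s3 ls
```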

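With the S3 check passing, the server can be launched. The original steps don't include the command; a typical invocation that uses the RDS database as the backend store and the S3 bucket as the artifact root would look like this, with all endpoints, credentials, and the bucket name as placeholders:

```bash
# Start the tracking server, listening on all interfaces on port 5000.
# Requires a PostgreSQL driver (e.g. psycopg2-binary) in the environment.
mlflow server \
    -h 0.0.0.0 \
    -p 5000 \
    --backend-store-uri postgresql://<db-username>:<db-password>@<rds-endpoint>:5432/mlflow_db \
    --default-artifact-root s3://<your-bucket-name>
```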
- Objective: Integrate MLflow into the model training code to log parameters, metrics, and artifacts.
- Actions:
  - Import MLflow in the code, set the tracking URI, and create an experiment.
  - Use the MLflow Tracking APIs to log parameters, metrics, and artifacts, as in the snippet below.
```python
import mlflow

# Point the MLflow client at the remote tracking server on EC2.
mlflow.set_tracking_uri("http://ec2-44-208-155-40.compute-1.amazonaws.com:5000")

# params is assumed to be the project's configuration dictionary.
for prod_cat in params["Sales"]["product_categories"]:
    print(f"Processing product category: {prod_cat}")
    mlflow.set_experiment(prod_cat)
    with mlflow.start_run():
        mlflow.log_param("Product Category", prod_cat)
        # Metrics and artifacts follow the same pattern, e.g.
        # mlflow.log_metric(...) and mlflow.log_artifact(...).
```
- Access the MLflow UI at the EC2 instance's public IP address on port 5000 to view model runs, metadata, and other MLflow components.
To reproduce the work, you can create a virtual environment:

```bash
# Create a virtual environment named 'myenv'
python3 -m venv myenv

# Activate the virtual environment
source myenv/bin/activate
```

With the virtual environment activated, install the required packages listed in the requirements.txt file using pip:

```bash
pip install -r requirements.txt
```

Alternatively, if you have an environment.yml file, you can use Conda to create a new environment based on it. Run the following command in your terminal:

```bash
conda env create -f environment.yml
```

Once the environment is created, activate it using the following command:

```bash
conda activate <environment_name>
```

Replace <environment_name> with the name of the Conda environment you created. After activating the Conda environment, you can run the project as usual, using the appropriate commands or scripts provided.
Contributions to this repository are welcome! If you identify any improvements or have additional insights, feel free to open a pull request.