The challenge is: “Create a Machine Learning Service”
This project addresses the employee attrition prediction problem using Python and packages the solution with Docker. The application is a web service built on a Machine Learning model trained with the HR_Employee_Attrition dataset, which contains detailed information about each employee. Its purpose is to generate a score indicating how likely a collaborator is to leave the company.
- Docker installed on your machine (Install Docker)
- Python virtual env
- Open a terminal and create the virtual env with:
pyenv virtualenv 3.12.2 env-name
Replace env-name with your desired virtual env name.
- Install dependencies from requirements.txt:
cd erp-dev-security-1
pyenv activate env-name
pip install -r requirements.txt
pip install --upgrade pip
Create a .env file in the project root and set the following variables:
- APP_ENVIRONMENT: development.
- DB_PORT: Port of the PostgreSQL database (5432).
- DB_HOST: Host of the database.
- DB_USER: User of the database.
- DB_PASSWORD: Password of the database.
- DB_NAME: Name of the database.
POSTGRES_DB="{DB_NAME}"
POSTGRES_USER="{DB_USER}"
POSTGRES_PASSWORD="{DB_PASSWORD}"
DATABASE_URL="postgresql://${DB_USER}:${DB_PASSWORD}@${DB_HOST}:${DB_PORT}/${DB_NAME}"
IMAGE_NAME=ml_turnover_service
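To make the relationship between the DB_* variables and DATABASE_URL concrete, here is a minimal Python sketch that assembles the URL from a mapping of variables. The helper name `build_database_url` and the sample values are hypothetical and not part of the project; the percent-encoding of the user and password is an added precaution, not something the project is known to do.

```python
import os
from urllib.parse import quote


def build_database_url(env=None):
    """Assemble a PostgreSQL URL from the DB_* variables listed above.

    `env` is any mapping of variable names to values; it defaults to the
    process environment. (Hypothetical helper, for illustration only.)
    """
    env = os.environ if env is None else env
    # Percent-encode credentials so characters such as "@" or ":" do not
    # break the URL structure.
    user = quote(env["DB_USER"], safe="")
    password = quote(env["DB_PASSWORD"], safe="")
    host = env["DB_HOST"]
    port = env.get("DB_PORT", "5432")  # 5432 is the PostgreSQL default
    name = env["DB_NAME"]
    return f"postgresql://{user}:{password}@{host}:{port}/{name}"


# Sample values, made up for this example.
example_env = {
    "DB_USER": "hr_user",
    "DB_PASSWORD": "p@ss",
    "DB_HOST": "db",
    "DB_PORT": "5432",
    "DB_NAME": "hr",
}
print(build_database_url(example_env))
```

Calling `build_database_url()` with no argument reads the same variables from the current process environment and raises `KeyError` if a required one is missing.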
Follow these steps to build the Python and DB containers:
- Open a terminal.
- Navigate to the directory of the project.
- Run the following commands:
docker compose build --no-cache
docker compose up -d
You can remove the -d flag to show the logs in the terminal while the container is running. Alternatively, you can run:
make build
make up
You can now consume the API on your localhost at http://localhost:88, or use it via Swagger at http://localhost/docs
If you are using Swagger (http://localhost/docs) to interact with the API, you will see this documentation:
The first step is to populate the database with the collaborators' data from the CSV file HR_Employee_Attrtition.csv. To do this, use the route collaborators/upload_file.
Expand the route and select Try it out.

A file section will appear where you can select a file. Upload HR_Employee_Attrtition.csv, then execute the endpoint.
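The same upload can be done outside Swagger. The sketch below builds a multipart/form-data request for collaborators/upload_file using only the Python standard library. It assumes the endpoint accepts a POST with the file under a form field named `file`; both the HTTP method and the field name are assumptions, so check the Swagger page for the actual values. The helper names are hypothetical.

```python
import uuid
import urllib.request

BASE_URL = "http://localhost:88"  # from this README; adjust if your port differs
UPLOAD_ROUTE = "/collaborators/upload_file"
FORM_FIELD = "file"  # assumption: the real form field name may differ


def build_multipart(field_name, filename, content):
    """Encode a single file as a multipart/form-data body.

    Returns a tuple (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'
        "Content-Type: text/csv\r\n"
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail, f"multipart/form-data; boundary={boundary}"


def build_upload_request(csv_bytes, filename="HR_Employee_Attrtition.csv"):
    """Build (but do not send) the upload request for the CSV contents."""
    body, content_type = build_multipart(FORM_FIELD, filename, csv_bytes)
    return urllib.request.Request(
        BASE_URL + UPLOAD_ROUTE,
        data=body,
        headers={"Content-Type": content_type},
        method="POST",  # assumption: the upload route is a POST
    )
```

To send the request, read the CSV in binary mode and pass the result to `urllib.request.urlopen(build_upload_request(csv_bytes))` while the containers are running.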

Once the endpoint executes and returns the message "The collaborators and their scores were added correctly", the database is populated with the collaborators' data and you can use the GET endpoint [collaborators/score] or the POST endpoint [collaborators/].

The GET method receives the id as a query param (the employee_number, passed as collaborator_id) and returns the corresponding score generated by the model for that collaborator.

Try out the GET endpoint using an employee_number from the CSV file, that is, one you know is already stored in the DB. You can also use another collaborator_id (whether it exists or not) before or after using the POST method collaborators/.

The GET method returns the employee_number and the score generated by the model. The score is expressed as a percentage to make it friendlier and more readable.
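As a rough illustration of calling the GET endpoint from code instead of Swagger, the sketch below builds the request URL and fetches the response with the standard library. It assumes the query parameter is named `collaborator_id` (as the description above suggests) and that the response body is JSON; the exact response fields are not documented here, so the parsed payload is returned as-is. The function names are hypothetical.

```python
import json
import urllib.request
from urllib.parse import urlencode

BASE_URL = "http://localhost:88"  # from this README; adjust if your port differs


def score_url(collaborator_id, base_url=BASE_URL):
    """Build the URL for the collaborators/score endpoint."""
    # Assumption: the query parameter is called "collaborator_id".
    query = urlencode({"collaborator_id": collaborator_id})
    return f"{base_url}/collaborators/score?{query}"


def fetch_score(collaborator_id, base_url=BASE_URL):
    """Call the endpoint and return the parsed JSON payload.

    Requires the containers to be running; raises urllib.error.HTTPError
    on non-2xx responses (for example, for an unknown collaborator_id).
    """
    with urllib.request.urlopen(score_url(collaborator_id, base_url)) as response:
        return json.load(response)


print(score_url(88888))
```

With the service up, `fetch_score(88888)` returns whatever JSON the API sends back for that collaborator.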

The POST method receives the information of a new collaborator in the request body. For instance, a body could be represented by the following JSON:
{
"age": 30,
"attrition": "No",
"business_travel": "Travel_Rarely",
"daily_rate": 750,
"department": "Sales",
"distance_from_home": 10,
"education": 4,
"education_field": "Marketing",
"employee_count": 1,
"employee_number": 88888,
"environment_satisfaction": 3,
"gender": "Male",
"hourly_rate": 35,
"job_involvement": 3,
"job_level": 2,
"job_role": "Sales Executive",
"job_satisfaction": 4,
"marital_status": "Married",
"monthly_income": 6000,
"monthly_rate": 12000,
"num_companies_worked": 2,
"over_18": "Y",
"over_time": "Yes",
"percent_salary_hike": 15,
"performance_rating": 3,
"relationship_satisfaction": 2,
"standard_hours": 8,
"stock_option_level": 1,
"total_working_years": 10,
"training_times_last_year": 2,
"work_life_balance": 3,
"years_at_company": 5,
"years_in_current_role": 3,
"years_since_last_promotion": 1,
"years_with_curr_manager": 2
}
When trying it out, you can use the JSON above, substitute the data of another valid collaborator, or even use the request body generated by Swagger (this will return an error).
You can also try using the request body of a collaborator that is already registered.

The POST method returns a generated recommendation (score) and stores the collaborator and the score generated by the model in the database.
The response includes the employee_number and the collaborator's score as a percentage for readability.
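For scripted use, the POST call can be sketched as follows. The function only builds the request; it assumes the endpoint accepts a JSON body with the fields from the example above and replies with JSON. The names `build_collaborator_request` and `send` are hypothetical, and the dictionary at the bottom is deliberately abbreviated: the service presumably expects the full body.

```python
import json
import urllib.request

BASE_URL = "http://localhost:88"  # from this README; adjust if your port differs


def build_collaborator_request(collaborator, base_url=BASE_URL):
    """Build (but do not send) a POST request for collaborators/."""
    body = json.dumps(collaborator).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/collaborators/",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send a prepared request and return the parsed JSON response.

    Requires the containers to be running.
    """
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Abbreviated body for illustration only; use the full JSON shown above
# when calling the real service.
sample = {"employee_number": 88888, "age": 30, "department": "Sales"}
request = build_collaborator_request(sample)
print(request.get_method(), request.full_url)
```

Keeping request construction separate from sending makes it easy to inspect the URL, headers, and body before anything reaches the API.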


