Backend/Support Engineer Challenge

The challenge is: “Create a Machine Learning Service”

This project addresses the employee attrition prediction problem using Python and packages the solution with Docker. The application is a web service built on the HR_Employee_Attrition dataset, which contains detailed information about each employee and is used to train a Machine Learning model. Its purpose is to generate a score indicating how likely a collaborator is to leave the company.

Prerequisites

Docker installed on your machine (Install Docker)
Python virtual env

Open a terminal and create the virtual environment with:
```
pyenv virtualenv 3.12.2 env-name
```

Install dependencies from requirements.txt

cd erp-dev-security-1
pyenv activate env-name
pip install -r requirements.txt
pip install --upgrade pip

Replace env-name with your desired virtual env name.

Environment Variables

Create a .env file in the project root and set the following variables:

Set the database credentials

APP_ENVIRONMENT: development.
DB_PORT: Port of the PostgreSQL database (5432).
DB_HOST: Host of the database.
DB_USER: User of the database.
DB_PASSWORD: Password of the database.
DB_NAME: Name of the database.

Set the variables for the PostgreSQL Docker image

POSTGRES_DB="${DB_NAME}"
POSTGRES_USER="${DB_USER}"
POSTGRES_PASSWORD="${DB_PASSWORD}"

Set the PostgreSQL database URI

DATABASE_URL="postgresql://${DB_USER}:${DB_PASSWORD}@${DB_HOST}:${DB_PORT}/${DB_NAME}"

Set the Docker image name

IMAGE_NAME=ml_turnover_service
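
As a quick way to check that the DB_* values combine into the DATABASE_URL shown above, the following minimal Python sketch assembles the same URI from a mapping of variables. The function name build_database_url is hypothetical and not part of the project. Unlike the plain interpolation in .env, this sketch also percent-encodes the user and password, which is an added assumption so that special characters do not break the URI.

```python
from urllib.parse import quote


def build_database_url(env):
    """Assemble a PostgreSQL URI from DB_* variables (hypothetical helper)."""
    # Percent-encode credentials so characters such as "@" or "/" stay valid.
    user = quote(env["DB_USER"], safe="")
    password = quote(env["DB_PASSWORD"], safe="")
    host = env["DB_HOST"]
    # Fall back to the default PostgreSQL port mentioned for DB_PORT.
    port = env.get("DB_PORT", "5432")
    name = env["DB_NAME"]
    return f"postgresql://{user}:{password}@{host}:{port}/{name}"
```

Once the variables are loaded into the process environment, calling build_database_url(os.environ) should give the same value as DATABASE_URL for credentials without special characters.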

Building the Docker containers

Follow these steps to build the Python and database containers:

Open a terminal.
Navigate to the directory of the project.
Run the following command:
```
docker compose build --no-cache
docker compose up -d 
```
You can omit the -d flag to see the logs in the terminal while the containers are running.

Alternatively, you can use:
```
make build
make up
```

Now you can consume the API on your localhost at http://localhost:88 or use it via Swagger at http://localhost/docs

How to use

If you are using Swagger [http://localhost/docs] to interact with the API, you will see the following documentation:

Fill database with data from CSV file

The first step is to populate the database with the collaborators' data from the CSV file HR_Employee_Attrition.csv. To do this, use the route collaborators/upload_file

Expand the route and select Try it out

A file field will appear. Select HR_Employee_Attrition.csv, then execute the endpoint.

Once the endpoint has executed and returned the message "The collaborators and their scores were added correctly", the database contains the collaborators' data and you can use the GET endpoint [collaborators/score] or the POST endpoint [collaborators/]
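
The upload can also be done without Swagger. The sketch below builds a multipart/form-data request for collaborators/upload_file using only the Python standard library. It makes several assumptions that the README does not confirm: that the route accepts POST, that the form field is named file, and that the service is reachable at the base URL you pass in. The helper name build_upload_request is hypothetical.

```python
import uuid
import urllib.request


def build_upload_request(base_url, filename, content, field_name="file"):
    """Build (but do not send) a multipart POST carrying one CSV file.

    Assumptions: POST method and a form field called "file".
    """
    boundary = uuid.uuid4().hex
    # Part header describing the uploaded file.
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        "Content-Type: text/csv\r\n"
        "\r\n"
    ).encode("utf-8")
    # Closing boundary that terminates the multipart body.
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/collaborators/upload_file",
        data=head + content + tail,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


# Example usage (requires the containers to be running):
# with open("HR_Employee_Attrition.csv", "rb") as f:
#     req = build_upload_request("http://localhost:88", "HR_Employee_Attrition.csv", f.read())
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```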

GET method

The GET method receives the id as a query parameter (the employee_number, passed as collaborator_id) and returns the score generated by the model for that collaborator.

Try out the GET endpoint with an employee_number from the CSV file, for example one that you know is already stored in the database. You can also use any other collaborator_id (whether it exists or not), before or after using the POST method collaborators/

The GET method returns the employee_number and the score generated by the model. The score is expressed as a percentage to make it more readable.
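
For scripted access, the GET call described above could look like the following sketch. It assumes the query parameter is named collaborator_id, as the description suggests, and that the response body is JSON. Both are assumptions, and the helper names build_score_url and fetch_score are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def build_score_url(base_url, collaborator_id):
    """Return the URL of the score endpoint for one collaborator."""
    # The parameter name "collaborator_id" is an assumption.
    query = urllib.parse.urlencode({"collaborator_id": collaborator_id})
    return f"{base_url.rstrip('/')}/collaborators/score?{query}"


def fetch_score(base_url, collaborator_id):
    """Call the endpoint and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(build_score_url(base_url, collaborator_id)) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example usage (requires the containers to be running):
# print(fetch_score("http://localhost:88", 88888))
```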

POST method

The POST method receives the information of a new collaborator in the request body. For instance, the body could be represented by the following JSON:

{
    "age": 30,
    "attrition": "No",
    "business_travel": "Travel_Rarely",
    "daily_rate": 750,
    "department": "Sales",
    "distance_from_home": 10,
    "education": 4,
    "education_field": "Marketing",
    "employee_count": 1,
    "employee_number": 88888,
    "environment_satisfaction": 3,
    "gender": "Male",
    "hourly_rate": 35,
    "job_involvement": 3,
    "job_level": 2,
    "job_role": "Sales Executive",
    "job_satisfaction": 4,
    "marital_status": "Married",
    "monthly_income": 6000,
    "monthly_rate": 12000,
    "num_companies_worked": 2,
    "over_18": "Y",
    "over_time": "Yes",
    "percent_salary_hike": 15,
    "performance_rating": 3,
    "relationship_satisfaction": 2,
    "standard_hours": 8,
    "stock_option_level": 1,
    "total_working_years": 10,
    "training_times_last_year": 2,
    "work_life_balance": 3,
    "years_at_company": 5,
    "years_in_current_role": 3,
    "years_since_last_promotion": 1,
    "years_with_curr_manager": 2
}

When trying it out, you can use the JSON above, replace its values with those of another valid collaborator, or even use the request body generated by Swagger (this will return an error). You can also try a request body for a collaborator that is already registered.

The POST method returns the generated recommendation (score) and stores the collaborator and the score generated by the model in the database. The response includes the employee_number and the score as a percentage for readability.
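
Outside Swagger, the same POST can be sent from a script. This minimal sketch builds a JSON request for collaborators/ with the Python standard library. The payload is whatever dictionary you pass in, for example the JSON body shown above. The helper name build_collaborator_request is hypothetical, and the JSON response format is an assumption.

```python
import json
import urllib.request


def build_collaborator_request(base_url, payload):
    """Build (but do not send) a JSON POST for a new collaborator."""
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/collaborators/",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Example usage (requires the containers to be running; `collaborator`
# is a dict holding every field of the request body):
# req = build_collaborator_request("http://localhost:88", collaborator)
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read().decode("utf-8")))
```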
