This project showcases the development and deployment of a Digit Recognition App using a Convolutional Neural Network (CNN) trained on the MNIST dataset. The application integrates an interactive web interface for digit recognition, a backend API for prediction services, and is fully containerized and deployed on AWS ECS.
The system allows users to draw digits on a web canvas, and the model predicts the digit based on the drawing in real-time. It leverages Docker for containerization, FastAPI for the backend API, and Streamlit for the frontend. The entire system is deployed in the cloud using AWS ECS Fargate to ensure scalability and reliability.
- Project Overview
- Features
- Model Training
- Frontend: Streamlit Application
- Backend: FastAPI for Predictions
- Connecting the Frontend and Backend
- Containerization with Docker
- Pushing Docker Image to Docker Hub
- Deploying the Application on AWS ECS
- Deployment Results
- Next Steps
The Digit Recognition App allows users to easily recognize handwritten digits (0–9) through a trained Convolutional Neural Network (CNN). The application includes:
- Model Training: A PyTorch-based CNN trained on the MNIST dataset, with GPU acceleration using Google Colab and tracked using Weights & Biases (W&B) for experiment tracking and visualization.
- Interactive Web UI: Built with Streamlit to enable users to draw a digit on a canvas, which is then sent to the backend for prediction.
- REST API: Developed with FastAPI to provide a prediction endpoint, allowing the frontend to send requests and receive predictions.
- Cloud Deployment: The application is fully containerized using Docker and deployed on AWS ECS using the Fargate launch type for easy scaling and management.
- Trainable Model: Allows for easy training of the CNN model on the MNIST dataset using PyTorch and GPU acceleration.
- Logging and Visualization: Training and testing metrics are logged using Weights & Biases (W&B) for analysis and model tracking.
- Frontend and Backend:
- Frontend: Streamlit provides an intuitive user interface where users can draw digits, and predictions are displayed instantly.
- Backend: FastAPI serves as the backend API to handle the digit recognition requests and provides predictions via HTTP endpoints.
- Scalable Deployment: The application is containerized with Docker and deployed to AWS ECS, providing a scalable, cloud-based solution.
This setup makes it easy for users to access the app through a web interface or interact with the backend API for digit recognition, providing both an interactive experience and a robust, production-ready API.
The training script is implemented in PyTorch with the following key features:
The LargerCNN model is a three-layer convolutional neural network with the following architecture:
- Convolutional Layers:
- 1st Layer: 64 filters.
- 2nd Layer: 128 filters.
- 3rd Layer: 256 filters.
- Pooling: MaxPooling layers to reduce spatial dimensions.
- Fully Connected Layers:
- Layer 1: 512 neurons.
- Layer 2: 256 neurons.
- Layer 3: Output of 10 neurons (digits 0–9).
- Dropout: Applied to prevent overfitting.
The model is trained on Google Colab with a T4 GPU for acceleration. The process includes:
- Dataset: MNIST for handwritten digit recognition.
- Optimizer: Adam with a learning rate of 0.001.
- Loss Function: Cross-entropy loss for multi-class classification.
- Logging: All metrics and model checkpoints are logged to W&B.
-
Training Loss vs. Epoch Graph:
Below is a visualization of the training loss across epochs, showing consistent convergence:
-
Test Accuracy: Achieved a high test accuracy of 99.28% on the MNIST dataset.
-
Setup the Environment:
- Open
train_model.pyin Google Colab. - Ensure that the GPU runtime is enabled (
Runtime > Change Runtime Type > GPU).
- Open
-
Install Dependencies:
pip install torch torchvision matplotlib wandb
-
Run the Training Script:
python train_model.py
-
Save the Model: The trained model weights will be saved as
mnist_larger_cnn.pthinside the root directory. -
Visualize Results: The script includes a visualization function to display predictions on test images.
The frontend for the digit recognition application allows users to draw a digit on a canvas and get predictions. The application is implemented using Streamlit, making it interactive and easy to use.
- Interactive Canvas: Users can draw a digit using a drawable canvas widget.
- Real-time Predictions: The drawn digit is sent to a backend API for inference, and the prediction is displayed immediately.
The frontend is implemented in app.py using Streamlit and the streamlit-drawable-canvas library. The user interface includes:
- A canvas for drawing digits.
- Communication with a backend FastAPI server for predictions.
-
Install the required dependencies:
pip install streamlit streamlit-drawable-canvas requests pillow numpy
-
Start the frontend application:
streamlit run app.py
-
Access the app in your browser at
http://localhost:8501.
The backend is implemented using FastAPI and is responsible for serving predictions from the trained CNN model. It processes the uploaded images, preprocesses them, and performs inference using the PyTorch model.
- Model Loading: Loads the trained
LargerCNNmodel for inference. - Image Preprocessing: Converts uploaded images to grayscale, resizes them to 28x28 pixels, and normalizes them for model input.
- Error Handling: Detects blank images and returns an appropriate error message.
The backend is implemented in backendapi.py and includes:
- A root endpoint (
/) for health checks. - A prediction endpoint (
/predict/) that accepts an image file, processes it, and returns the predicted digit.
-
Install the required dependencies:
pip install fastapi uvicorn torch torchvision pillow numpy
-
Start the backend server:
uvicorn backendapi:app --host 0.0.0.0 --port 8000
-
The API will be accessible at
http://127.0.0.1:8000. Use the/predict/endpoint to send images for predictions.
To connect the frontend with the backend:
- Ensure the backend is running on
http://127.0.0.1:8000. - Update the
app.pyfile'srequests.post()URL if needed:response = requests.post("http://127.0.0.1:8000/predict/", files={"file": buffer})
When both services are running, users can draw digits in the Streamlit app, which sends the image to the FastAPI backend for prediction.
To simplify deployment and ensure consistency across environments, the application (both frontend and backend) is containerized using Docker. This allows running the entire application in isolated environments.
The Dockerfile specifies the instructions to build the Docker image for the application:
# Step 1: Use the official Python runtime as the base image
FROM python:3.9-slim
# Step 2: Set the working directory inside the container
WORKDIR /app
# Step 3: Copy the requirements.txt file first and install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Step 4: Copy the entire app directory into the container (the contents of ./app)
COPY ./app /app
# Step 5: Expose necessary ports
EXPOSE 8000 8501
# Step 6: Command to run both FastAPI and Streamlit concurrently
CMD ["sh", "-c", "uvicorn backendapi:app --host 0.0.0.0 --port 8000 & streamlit run /app/app.py --server.port=8501 --server.address=0.0.0.0"]- Base Image: Use the official Python 3.9-slim image to keep the container lightweight.
- Working Directory: Set
/appas the working directory inside the container. - Install Dependencies: Copy the
requirements.txtfile and install Python dependencies usingpip. - Copy Application Code: Copy the entire
appdirectory (includingapp.pyandbackendapi.py) into the container. - Expose Ports: Open ports
8000for FastAPI and8501for Streamlit. - Run Commands: Use a shell command (
sh -c) to run both FastAPI and Streamlit services concurrently.
- Save the
Dockerfilein your project directory. - Run the following command to build the Docker image:
docker build -t digit-recognition-app .
After building the image, start a container using:
docker run -p 8000:8000 -p 8501:8501 digit-recognition-app- The backend API will be accessible at
http://127.0.0.1:8000. - The frontend Streamlit app will be accessible at
http://127.0.0.1:8501.
- Open the Streamlit app at
http://127.0.0.1:8501. - Draw a digit on the canvas, and the frontend will send the image to the backend for prediction.
- View the predicted digit displayed in the app.
After containerizing the application, the next step is to make the Docker image available online. This allows for easy deployment to cloud services like AWS ECS or Kubernetes. In this project, the Docker Hub repository aiinabox/digit-recognition-app-rep is used for hosting the Docker image.
-
Tag the Docker Image:
Add a tag to the locally built Docker image to link it to your Docker Hub repository. Replaceaiinabox/digit-recognition-app-repwith your Docker Hub repository name:docker tag digit-recognition-app aiinabox/digit-recognition-app-rep:1.0
-
Log In to Docker Hub:
Authenticate with Docker Hub using thedocker logincommand:docker login
- Enter your Docker Hub username and password when prompted.
-
Push the Image to Docker Hub:
Push the tagged image to your Docker Hub repository:docker push aiinabox/digit-recognition-app-rep:1.0
-
Verify the Image on Docker Hub:
After the push completes:- Visit your Docker Hub repository: Docker Hub - aiinabox/digit-recognition-app-rep.
- Confirm that the image appears with the
l.0tag.
Once the image is hosted on Docker Hub, it can be used for deployment on various platforms, such as:
-
Local Machines:
Pull and run the Docker image locally using:docker run -p 8000:8000 -p 8501:8501 aiinabox/digit-recognition-app-rep:1.0
-
Cloud Services (e.g., AWS ECS):
Use the Docker Hub image URLaiinabox/digit-recognition-app-rep:1.0for cloud-based deployments.
To make the digit recognition application accessible on the cloud, AWS ECS (Elastic Container Service) is used. The deployment uses the Fargate launch type, which is a serverless compute engine for containers.
- Go to the AWS ECS Console.
- Click on Create Cluster.
- Choose Fargate as the cluster type.
- Provide a name for your cluster (e.g.,
digit-recognition-cluster). - Complete the setup to create the cluster.
- Navigate to Task Definitions in the ECS Console.
- Click Create new Task Definition.
- Choose Fargate as the launch type.
- Configure the task definition with the following settings:
- Container Name:
digit-recognition-app. - Image: Specify the Docker Hub image URL:
aiinabox/digit-recognition-app-rep:1.0. - Port Mappings:
- 8000: for FastAPI backend.
- 8501: for Streamlit frontend.
- CPU and Memory: Allocate resources as per your application’s requirements, e.g.,
0.5 vCPUand1GB memory.
- Container Name:
- Save the task definition.
- Go to your ECS cluster in the console.
- Click on Create Service.
- Choose Fargate as the launch type.
- Select the task definition created in the previous step.
- Specify the desired number of tasks (typically 1 for small-scale apps).
- Review the configuration and click Launch Service to deploy.
- After ECS deploys the service, go to the cluster page in the ECS Console.
- Look for the public IP or DNS assigned to your service.
- Use the following URLs to access your application:
- Streamlit Frontend:
http://<public-ip>:8501 - FastAPI Backend:
http://<public-ip>:8000
- Streamlit Frontend:
After deploying the application on AWS ECS, everything was set up successfully. Below are the results:
Upon visiting the AWS ECS cluster, I confirmed that the service was up and running. Here’s a screenshot of the ECS Cluster dashboard with the deployed service and task details:
The service deployed successfully with a public IP, and I was able to access the frontend and backend APIs through the respective ports:
-
Frontend (Streamlit):
Access the digit recognition UI via the Streamlit app. It was successfully served on port8501.
(This image shows the user interface of the frontend where users can draw a digit.) -
Backend (FastAPI):
The backend API is running on port8000, where it serves as the endpoint for handling predictions. If you visithttp://<public-ip>:8000, you will see the following message indicating the API is live:{ "message": "Welcome to the Digit Recognition API" }This is a simple response from the FastAPI server, confirming that the API is running. The actual predictions are made when you send an image to the
/predict/endpoint through the Streamlit frontend.
Here’s a step-by-step breakdown of how the application works:
-
The user draws a digit (0-9) on the Streamlit Canvas.
-
The image is sent to the FastAPI Backend via an HTTP POST request.
-
The backend predicts the digit and sends back the result.
-
The Streamlit UI displays the predicted digit in real-time.
Here’s a screenshot of the Streamlit app in action where users draw a digit, and the model predicts it. The images below show the predicted digits from 0 to 9.
Digit 0 Digit 1 Digit 2 Digit 3 Digit 4 Digit 5 Digit 6 Digit 7 Digit 8 Digit 9
Now that the application is deployed and running smoothly, here are some improvements and enhancements I plan to explore in the future:
- Add Authentication: Implement a simple authentication system to secure both the API and frontend, ensuring only authorized users can access and interact with the app.
- Container Orchestration: Explore container orchestration tools like Kubernetes to manage and scale the application, ensuring high availability and efficient load balancing.
- Monitor Application Performance: Set up monitoring and logging services on AWS ECS to track key performance metrics, including latency, resource utilization, and error rates, for better operational insights.
- Continuous Deployment: Automate the deployment process by integrating the project with a CI/CD pipeline (e.g., GitHub Actions), enabling seamless updates and faster iterations.