In this repository, I have created a Jupyter notebook that shows my complete implementation of Logistic Regression from scratch. Through this project, I aimed to deepen my understanding by combining mathematical theory with hands-on coding and visualization using numpy and matplotlib.
Logistic Regression is a fundamental binary classification algorithm that models the probability of an input belonging to a class using the logistic sigmoid function. In my notebook, I cover the following:
- Loading and visualizing a binary classification dataset.
- The math behind logistic regression, including the sigmoid function and cost optimization.
- Writing the cost function manually (binary cross-entropy loss).
- Computing gradients of weights and bias for learning.
- Implementing gradient descent to optimize parameters.
- Evaluating the model and visualizing the decision boundary.
I learned that the heart of logistic regression is the sigmoid function, which maps any real number $z$ (a linear combination of the input features and model parameters) to a value between 0 and 1:

$$\sigma(z) = \frac{1}{1 + e^{-z}}, \qquad z = \mathbf{w} \cdot \mathbf{x} + b$$
This output represents the probability of the input belonging to the positive class.
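A minimal NumPy sketch of this mapping (the function name is illustrative, not necessarily the one used in the notebook):

```python
import numpy as np

def sigmoid(z):
    """Map any real-valued z (scalar or array) into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# z = 0 sits exactly at the decision midpoint.
print(sigmoid(0.0))                          # 0.5
# Large negative z -> probability near 0; large positive z -> near 1.
print(sigmoid(np.array([-5.0, 0.0, 5.0])))
```

Because `np.exp` broadcasts, the same function handles a single score or a whole vector of scores at once.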
To measure how well the model fits the data, I implemented the binary cross-entropy loss function:

$$J(\mathbf{w}, b) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log \hat{y}^{(i)} + \left(1 - y^{(i)}\right) \log \left(1 - \hat{y}^{(i)}\right) \right]$$

where $\hat{y}^{(i)} = \sigma(\mathbf{w} \cdot \mathbf{x}^{(i)} + b)$ and $m$ is the number of training examples.
This function heavily penalizes wrong predictions, guiding the model to improve.
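A short sketch of this loss in NumPy (the `eps` clipping is a common numerical safeguard I am assuming here, to avoid `log(0)`):

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean binary cross-entropy between labels in {0, 1} and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1 - eps)  # keep log() finite
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

y = np.array([1, 0, 1])
good = binary_cross_entropy(y, np.array([0.9, 0.1, 0.8]))  # confident, correct
bad = binary_cross_entropy(y, np.array([0.1, 0.9, 0.2]))   # confident, wrong
print(good, bad)  # the wrong predictions incur a much larger loss
```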
Through calculus, I derived the gradients of the cost function with respect to each parameter, which are necessary for gradient descent:

$$\frac{\partial J}{\partial \mathbf{w}} = \frac{1}{m} \sum_{i=1}^{m} \left(\hat{y}^{(i)} - y^{(i)}\right) \mathbf{x}^{(i)}, \qquad \frac{\partial J}{\partial b} = \frac{1}{m} \sum_{i=1}^{m} \left(\hat{y}^{(i)} - y^{(i)}\right)$$
Using these, I updated the parameters iteratively:

$$\mathbf{w} \leftarrow \mathbf{w} - \alpha \frac{\partial J}{\partial \mathbf{w}}, \qquad b \leftarrow b - \alpha \frac{\partial J}{\partial b}$$

where $\alpha$ is the learning rate controlling the step size.
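Putting the gradients and update rule together, a vectorized training loop can be sketched like this (a minimal illustration on toy data, not the notebook's exact code):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(X, y, alpha=0.1, n_iters=1000):
    """Batch gradient descent for logistic regression on X (m, n) and labels y (m,)."""
    m, n = X.shape
    w = np.zeros(n)
    b = 0.0
    for _ in range(n_iters):
        y_hat = sigmoid(X @ w + b)      # predicted probabilities, shape (m,)
        dw = X.T @ (y_hat - y) / m      # gradient w.r.t. weights
        db = np.sum(y_hat - y) / m      # gradient w.r.t. bias
        w -= alpha * dw                 # parameter updates
        b -= alpha * db
    return w, b

# Toy linearly separable data: the class depends on the sign of x.
X = np.array([[-2.0], [-1.0], [1.0], [2.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])
w, b = train(X, y, alpha=0.5, n_iters=2000)
preds = (sigmoid(X @ w + b) >= 0.5).astype(float)
```

The `X.T @ (y_hat - y)` form computes all per-example gradient contributions in one matrix product, which is the payoff of vectorizing rather than looping over examples.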
- I used `numpy` for efficient numerical computation and `matplotlib` for plotting data and results.
- I learned how to structure and vectorize the code for efficient training.
- I tracked the cost over iterations to ensure the model was converging.
- The notebook also includes detailed plots of the data points and the final decision boundary.
- Clone the repository to your local machine.
- Open the `logistic-regression.ipynb` notebook with Jupyter Notebook or any Jupyter-compatible environment.
- Run all cells step by step to see the training process and visualize the results.
- Feel free to modify parameters like learning rate, number of iterations, or try your own dataset for further experimentation.
- The model training shows a clear decrease in the cost function over epochs.
- Accuracy on training data is printed, and the decision boundary effectively separates the classes.
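Training accuracy can be computed along these lines (a sketch with hand-picked weights for illustration; the 0.5 threshold is the conventional choice):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def accuracy(X, y, w, b, threshold=0.5):
    """Fraction of examples whose thresholded probability matches the label."""
    preds = (sigmoid(X @ w + b) >= threshold).astype(int)
    return np.mean(preds == y)

X = np.array([[-2.0], [-1.0], [1.0], [2.0]])
y = np.array([0, 0, 1, 1])
acc = accuracy(X, y, w=np.array([3.0]), b=0.0)  # illustrative, not learned, weights
print(acc)  # 1.0 on this separable toy data
```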
- This process helped me solidify my understanding of both the theory and implementation of logistic regression.
- Python 3.x
- `numpy`
- `matplotlib`
- `pandas` (for data handling)
You can easily install these with:

```bash
pip install numpy matplotlib pandas
```
Feel free to dive into the code, ask questions, or suggest improvements!