Gradient descent is an iterative optimization algorithm used to find the minimum of a function. In the context of linear regression, it minimizes the least squares cost function to find the best-fit line y = mx + b through a set of data points.
The algorithm works by:
- Starting with initial guesses for the slope (theta_1) and y-intercept (theta_0)
- Computing the gradient (partial derivatives) of the cost function with respect to each parameter
- Updating the parameters in the direction opposite to the gradient
- Repeating until convergence
The cost function (mean squared error) is:
J(theta_0, theta_1) = (1/2m) * sum((h(x_i) - y_i)^2)
where h(x) = theta_1 * x + theta_0 is the hypothesis (predicted value).
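The steps above can be sketched as a small standalone Python function. This is a minimal illustration, not necessarily how `gradient_descent.py` is organized; it uses the same cost function and the stopping parameters described later in this README:

```python
import numpy as np

def gradient_descent(x, y, alpha=0.01, max_iter=1000, min_tol=1e-4, min_cost=1e-3):
    """Fit y = theta_1 * x + theta_0 by batch gradient descent."""
    m = len(x)
    theta_0, theta_1 = 0.0, 0.0              # initial guesses
    costs = []
    for _ in range(max_iter):
        error = (theta_1 * x + theta_0) - y  # h(x_i) - y_i for every point
        cost = (error @ error) / (2 * m)     # J(theta_0, theta_1)
        costs.append(cost)
        grad_0 = error.sum() / m             # dJ/d(theta_0)
        grad_1 = (error @ x) / m             # dJ/d(theta_1)
        # step opposite the gradient
        step_0, step_1 = alpha * grad_0, alpha * grad_1
        theta_0 -= step_0
        theta_1 -= step_1
        # stop when the parameters barely move or the fit is good enough
        if max(abs(step_0), abs(step_1)) < min_tol or cost < min_cost:
            break
    return theta_0, theta_1, costs
```

On noiseless data generated from a known line, the recovered `theta_0` and `theta_1` approach the true intercept and slope, and the recorded `costs` list decreases over iterations.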
Each visualization shows:
- Left panel: The data points (blue), regression line (red), and vertical error lines (gray) showing the residuals
- Right panel: The cost function convergence over iterations (log scale), showing how the error decreases with each step
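A two-panel figure of this kind can be produced with matplotlib roughly as follows. This is a hedged sketch using synthetic data; the output file name `gd_demo.png`, the dataset, and the styling are illustrative, not what `gradient_descent.py` actually uses:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 30)
y = 2.5 * x + 1.0 + rng.normal(0, 2, x.size)  # noisy synthetic line

# plain batch gradient descent, tracking the cost at each iteration
m = x.size
t0 = t1 = 0.0
costs = []
for _ in range(500):
    err = (t1 * x + t0) - y
    costs.append((err @ err) / (2 * m))
    t0 -= 0.01 * err.sum() / m
    t1 -= 0.01 * (err @ x) / m

fig, (ax_fit, ax_cost) = plt.subplots(1, 2, figsize=(10, 4))
ax_fit.scatter(x, y, color="blue", label="data")
ax_fit.plot(x, t1 * x + t0, color="red", label="regression line")
ax_fit.vlines(x, y, t1 * x + t0, color="gray", alpha=0.5)  # residuals
ax_fit.legend()
ax_cost.semilogy(costs)  # cost on a log scale
ax_cost.set_xlabel("iteration")
ax_cost.set_ylabel("cost J")
fig.savefig("gd_demo.png")
plt.close(fig)
```

The `vlines` call draws the gray residual segments between each data point and the fitted line, and `semilogy` gives the log-scale convergence plot.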
The repository includes the original Sage implementation, its Python translation, and the plots they produce:

| File | Description |
|---|---|
| `sage_code` | Original SageMath implementation |
| `gradient_descent.py` | Python 3 translation with matplotlib |
| `gradient_descent_small.png` | Visualization for the small dataset |
| `gradient_descent_large.png` | Visualization for the larger dataset |
| `sage0.png` | Original Sage output (small dataset) |
| `sage1.png` | Original Sage output (larger dataset) |
Run the Python translation with `python3 gradient_descent.py`. You can also run the original SageMath code on SageMathCell.
| Parameter | Default | Description |
|---|---|---|
| `alpha` | 0.01 | Learning rate |
| `max_iter` | 1000 | Maximum iterations |
| `min_tol` | 1e-4 | Minimum parameter change tolerance |
| `min_cost` | 1e-3 | Minimum acceptable cost |
- The learning rate (alpha) controls how large each step is: too large causes divergence, too small causes slow convergence
- The cost function decreases monotonically when the learning rate is properly chosen
- The vertical error lines visualize the residuals: the quantities that gradient descent is minimizing
- This is the foundation of machine learning: fitting a model to data by minimizing a cost function
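The learning-rate tradeoff in the first bullet can be checked numerically. This small experiment (the data and the two `alpha` values are chosen purely for illustration) runs the same batch update with a moderate and an oversized learning rate:

```python
import numpy as np

x = np.linspace(0, 1, 20)
y = 3 * x + 0.5  # noiseless line

def final_cost(alpha, iters=200):
    """Run batch gradient descent for a fixed number of steps; return the last cost."""
    m = x.size
    t0 = t1 = 0.0
    cost = float("inf")
    for _ in range(iters):
        err = (t1 * x + t0) - y
        cost = (err @ err) / (2 * m)
        t0 -= alpha * err.sum() / m
        t1 -= alpha * (err @ x) / m
    return cost

print(final_cost(0.5))  # moderate rate: cost shrinks toward zero
print(final_cost(3.0))  # oversized rate: cost grows without bound
```

With the moderate rate each step shrinks the error, while the oversized rate overshoots the minimum by more than it corrects, so the parameters oscillate with growing amplitude and the cost explodes.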



