This animation shows how linear least squares minimizes the Sum of Squared Errors (SSE). A trial line starts away from the data cloud and iteratively moves toward the best-fitting line.
The bottom-right SSE value updates each frame and decreases until the minimum is reached.
For data points (x_i, y_i), each frame evaluates a trial line y_hat = m x + b and computes SSE = Σ_i (y_i − (m x_i + b))². The animation transitions m and b from an initial guess to the least-squares optimum.
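The per-frame computation can be sketched as follows. This is a minimal illustration, not the animation's actual code: the data, the initial guess (m0, b0), and the frame count are made up, and the optimum is obtained with NumPy's polyfit. Because SSE is a convex quadratic in (m, b), linearly interpolating the parameters toward the optimum makes SSE decrease every frame.

```python
import numpy as np

# Synthetic data cloud (illustrative; not the animation's dataset).
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(0, 1.5, size=x.size)

def sse(m, b):
    # Sum of squared residuals for the trial line y_hat = m*x + b.
    residuals = y - (m * x + b)
    return np.sum(residuals ** 2)

# Least-squares optimum (slope, intercept) via a degree-1 polynomial fit.
m_opt, b_opt = np.polyfit(x, y, 1)

m0, b0 = -1.0, 8.0                 # initial guess, away from the data cloud
for t in np.linspace(0, 1, 5):     # a few animation "frames"
    m = (1 - t) * m0 + t * m_opt
    b = (1 - t) * b0 + t * b_opt
    print(f"t={t:.2f}  SSE={sse(m, b):.1f}")
```

Each printed SSE is no larger than the previous one, matching the monotone decrease shown in the animation's bottom-right readout.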
Geometrically, the optimum is the orthogonal projection of the response vector onto the column space of the design matrix (an intercept column and the feature column), which yields the smallest possible SSE over all lines.
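The projection view can be checked numerically. The sketch below (illustrative data again) builds the design matrix, solves the normal equations, and verifies that the residual is orthogonal to both columns, which is exactly the projection condition:

```python
import numpy as np

# Synthetic data (illustrative; not the animation's dataset).
rng = np.random.default_rng(1)
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(0, 1.5, size=x.size)

# Design matrix: a column of ones (intercept) and the feature column.
X = np.column_stack([np.ones_like(x), x])

# Normal equations: X^T X beta = X^T y.
b_opt, m_opt = np.linalg.solve(X.T @ X, X.T @ y)

# y_hat = X beta is the projection of y onto col(X), so the residual
# y - y_hat must be orthogonal to every column of X.
y_hat = X @ np.array([b_opt, m_opt])
print(np.allclose(X.T @ (y - y_hat), 0, atol=1e-8))
```

Solving the normal equations directly is fine for a two-column design matrix; for larger or ill-conditioned problems, `np.linalg.lstsq` is the more numerically robust choice.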