
The derivation of the Linear Regression coefficients

This article is inspired by a physics lab course I took 8 years ago

Lucas Suryana · Published in CodeX · Mar 20, 2021

One of the university subjects requiring regression that I remember most is the physics laboratory. I was asked to acquire data and then make a scatter plot, where the x-axis was the input variable I changed and the y-axis was the measured output. After that, the lab assistant asked me to find the b and m coefficients of the linear equation y = theta_1*x + theta_0 by plugging the collected data into a formula whose origin I did not know.

Linear Regression Example

To make the regression concept easy to grasp, I will use an example I found on Google (I believe it comes from The Economist) comparing a university's admission rate with the 20-year average annual return on the degree.

The figure below shows the difference between the red and blue lines: the red line is quite flat, while the blue line trends downward as the university admission rate increases. These lines are the result of the regression process, and they help us see that there is almost no difference between studying at a good or a bad university in the US (when comparing the admission rate with the return earned on the degree).

Furthermore, we can see that the lines follow the spread of the orange and blue dots. Our objective is to learn how to build those lines.

University admission rate vs. 20-year average annual return on the degree in the US

Linear Regression Derivation

Understanding the idea of linear regression helps us derive the equation. Linear regression is, at its core, an optimization process. Before optimizing, we first need the linear equation, where y_hat denotes the output of the linear regression process.

y_hat = theta_0 + theta_1*x

Linear Equation

After we have the linear equation, we need to define the objective function: we minimize the error between the observed values and the output of the linear regression process by changing the parameter values theta_0 and theta_1:

Error = sum over all data points of (y_i - y_hat_i)^2 = sum of (y_i - theta_0 - theta_1*x_i)^2

The error equation (the sum of squared errors) is the objective function that needs to be minimized.

Remember, when we take the partial derivative of the error equation with respect to theta_0 and set the result to zero, we obtain the value of theta_0 that makes the error stationary; because the sum of squared errors is convex, this stationary point is a minimum.
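The derivation image from the original post is not reproduced here; as a sketch of the standard least-squares step, setting the partial derivative with respect to theta_0 to zero gives:

```latex
\frac{\partial E}{\partial \theta_0}
  = -2 \sum_{i=1}^{n} \left( y_i - \theta_0 - \theta_1 x_i \right) = 0
\quad \Longrightarrow \quad
\theta_0 = \bar{y} - \theta_1 \bar{x}
```

Here \bar{x} and \bar{y} denote the sample means of the inputs and outputs.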

Once we have the theta_0 value, we substitute it into the derivative with respect to theta_1 to find the optimum theta_1:
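Again as a sketch of the standard result: substituting theta_0 = y_bar - theta_1*x_bar into the derivative with respect to theta_1 and solving gives:

```latex
\frac{\partial E}{\partial \theta_1}
  = -2 \sum_{i=1}^{n} x_i \left( y_i - \theta_0 - \theta_1 x_i \right) = 0
\quad \Longrightarrow \quad
\theta_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
                {\sum_{i=1}^{n} (x_i - \bar{x})^2}
```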

With the theta_0 and theta_1 equations in hand, we will compute those values in Excel to build more intuition:

Linear Regression calculation on Excel
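The Excel sheet itself is not reproduced here; as a minimal sketch, the same calculation can be done in Python with NumPy. The data values below are made-up illustrative numbers, not the ones from the original spreadsheet:

```python
import numpy as np

# Made-up illustrative data (the original article uses an Excel sheet).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # input variable
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])  # observed output

x_bar, y_bar = x.mean(), y.mean()

# theta_1 = sum((x_i - x_bar)(y_i - y_bar)) / sum((x_i - x_bar)^2)
theta_1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)

# theta_0 = y_bar - theta_1 * x_bar
theta_0 = y_bar - theta_1 * x_bar

print(f"theta_0 (intercept): {theta_0:.4f}")
print(f"theta_1 (slope):     {theta_1:.4f}")
```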

Conclusion

Congratulations if you have read the article to this point. If you follow the steps above, you can build your own linear regression model. I have to admit that we can do this easily in Python with the scikit-learn library. Still, I believe it is important to understand the basic mathematics behind the libraries we use. Hopefully, this article is of benefit to you.
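For completeness, here is a minimal sketch of the scikit-learn version mentioned above; it should recover the same theta_0 and theta_1 as the hand-derived formulas (the data are the same made-up values as in the NumPy sketch):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Same made-up illustrative data as in the NumPy sketch above.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0]).reshape(-1, 1)  # sklearn expects a 2-D feature matrix
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

model = LinearRegression().fit(x, y)

print(f"theta_0 (intercept): {model.intercept_:.4f}")
print(f"theta_1 (slope):     {model.coef_[0]:.4f}")
```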

Lucas Suryana is a PhD student at TU Delft, developing safer and more accountable automated vehicles.