Simple Linear Regression In R

Simple linear regression is used to predict a quantitative outcome y on the basis of a single predictor variable x. The goal is to build a mathematical model (or formula) that defines y as a function of the x variable.

Once we have built a statistically significant model, it's possible to use it to predict future outcomes on the basis of new x values.

Suppose, for example, that we want to evaluate the impact of the advertising budgets of three media (youtube, facebook and newspaper) on future sales. This kind of problem can be modeled with linear regression.
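As a minimal sketch of this example, the following fits a simple linear regression of sales on the youtube budget. The `marketing` data frame here is simulated for illustration; a real dataset with these columns is available in the datarium package (`data("marketing", package = "datarium")`).

```r
# Simulated stand-in for the advertising data (youtube budget vs. sales)
set.seed(123)
marketing <- data.frame(youtube = runif(100, 0, 300))
marketing$sales <- 8 + 0.05 * marketing$youtube + rnorm(100, sd = 2)

# Fit the simple linear regression: sales as a function of youtube
model <- lm(sales ~ youtube, data = marketing)
coef(model)  # the estimated intercept (b0) and slope (b1)
```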

Formula and basics

The mathematical formula of the linear regression can be written as y = b0 + b1*x + e, where:

b0 and b1 are known as the regression beta coefficients or parameters:

1) b0 is the intercept of the regression line; that is, the predicted value when x = 0.

2) b1 is the slope of the regression line.

3) e is the error term (also known as the residual error), the part of y that cannot be explained by the regression model.
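The formula above can be illustrated with simulated data where b0 and b1 are known in advance; fitting a model with `lm()` recovers estimates close to the true values (the coefficient values chosen here are arbitrary):

```r
# Simulate y = b0 + b1*x + e with known coefficients
set.seed(42)
b0 <- 2
b1 <- 3
x <- seq(0, 10, length.out = 50)
e <- rnorm(50, mean = 0, sd = 1)  # residual errors with mean ~ 0
y <- b0 + b1 * x + e

fit <- lm(y ~ x)
coef(fit)  # estimates should be close to b0 = 2 and b1 = 3
```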

The figure below illustrates the linear regression model, where:

*The best-fit regression line is in blue

*The intercept (b0) and the slope (b1) are shown in green

*The error terms (e) are represented by vertical red lines

From the scatter plot above, it can be seen that not all the data points fall exactly on the fitted regression line. Some of the points are above the blue line and some are below it; overall, the residual errors (e) have approximately zero mean.
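In fact, for a least-squares fit that includes an intercept, the residuals average out to exactly zero (up to floating-point precision), as a quick check on simulated data shows:

```r
# Check that OLS residuals have mean zero
set.seed(3)
x <- rnorm(25)
y <- 2 + x + rnorm(25)
fit <- lm(y ~ x)

mean(residuals(fit))  # effectively zero: OLS with an intercept forces this
```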

The sum of the squares of the residual errors is called the Residual Sum of Squares, or RSS.

The average variation of points around the fitted regression line is called the Residual Standard Error (RSE). This is one of the metrics used to evaluate the overall quality of the fitted regression model. The lower the RSE, the better.
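Both quantities can be computed by hand from a fitted model's residuals; the RSE obtained this way (RSS divided by the residual degrees of freedom, then square-rooted) matches the value R reports as `sigma` in the model summary:

```r
# Compute RSS and RSE from a fitted model's residuals
set.seed(1)
x <- 1:30
y <- 5 + 2 * x + rnorm(30, sd = 3)
fit <- lm(y ~ x)

rss <- sum(residuals(fit)^2)         # Residual Sum of Squares
rse <- sqrt(rss / df.residual(fit))  # Residual Standard Error

all.equal(rse, summary(fit)$sigma)   # matches R's reported sigma
```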

Since the mean of the error term is zero, the outcome variable y can be approximately estimated as follows:

y ~ b0 + b1*x
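Applying this approximation by hand (plugging the fitted coefficients into b0 + b1*x) gives the same value as R's `predict()` function, as a small simulated check shows:

```r
# Manual prediction via y ~ b0 + b1*x versus predict()
set.seed(7)
x <- runif(40, 0, 10)
y <- 1 + 0.5 * x + rnorm(40, sd = 0.5)
fit <- lm(y ~ x)

b <- coef(fit)
new_x <- 4
manual <- b[[1]] + b[[2]] * new_x                # b0 + b1*x by hand
via_predict <- predict(fit, data.frame(x = new_x))

all.equal(manual, unname(via_predict))           # identical predictions
```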

Mathematically, the beta coefficients (b0 and b1) are determined so that the RSS is as small as possible. This method of determining the beta coefficients is technically called least squares regression, or ordinary least squares (OLS) regression.
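For simple linear regression, minimizing the RSS has a well-known closed-form solution, which can be computed directly and compared against `lm()`:

```r
# Closed-form OLS estimates for simple linear regression:
#   b1 = sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
#   b0 = mean(y) - b1 * mean(x)
set.seed(99)
x <- rnorm(50)
y <- 4 + 1.5 * x + rnorm(50)

b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
b0 <- mean(y) - b1 * mean(x)

c(b0, b1)
coef(lm(y ~ x))  # lm() produces the same estimates
```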

Once the beta coefficients are calculated, a t-test is performed to check whether these coefficients are significantly different from zero. A non-zero beta coefficient means that there is a significant relationship between the predictor (x) and the outcome variable (y).
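In R, these t-tests are reported by `summary()`: the coefficient table contains the estimate, its standard error, the t value, and the corresponding p-value for each coefficient. A sketch on simulated data (true slope 0.3, an arbitrary choice here):

```r
# Inspect the t-test results for the fitted coefficients
set.seed(5)
x <- runif(60, 0, 100)
y <- 10 + 0.3 * x + rnorm(60, sd = 4)
fit <- lm(y ~ x)

summary(fit)$coefficients
# columns: Estimate, Std. Error, t value, Pr(>|t|)
# a small p-value (e.g. < 0.05) indicates the coefficient
# is significantly different from zero
```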