DATA ANALYSIS
Linear Regression Using line and medfit
Linear Regression Functions
The line function returns the parameters of the least squares line of best fit. For given x- and y-values in vectors vx and vy, respectively, the least squares line of best fit has the form y = intercept + slope·x, where intercept and slope are the parameters returned by line.
Let’s use the following matrix of xy-data (x-coordinate in first column, y-coordinate in second) to demonstrate a few linear analysis techniques:
Some of the methods described below require that the data set be ordered by ascending x-values. Use the csort function to sort the data by the values in the x-column.
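For readers working outside the worksheet, the sorting step can be sketched in Python with NumPy. This is only an illustration of ordering xy-data by the x-column, not a description of how csort works internally; the data values are made up.

```python
import numpy as np

# Hypothetical xy-data: x-coordinate in the first column, y-coordinate in the second.
data = np.array([[3.0, 7.1],
                 [1.0, 2.9],
                 [2.0, 5.2]])

# Reorder whole rows so that the x-column (column 0) is ascending.
order = np.argsort(data[:, 0], kind="stable")
sorted_data = data[order]

# Split the sorted matrix into the x- and y-vectors used in the rest of the examples.
vx = sorted_data[:, 0]
vy = sorted_data[:, 1]
```

Sorting the rows rather than each column separately keeps every y-value paired with its own x-value.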
The line Function
line returns the parameters (slope and intercept) of the least squares line of best fit for the data in vx and vy.
Arguments of the Function
line takes two arguments.
line(vx,vy)
1. vx is an n-element vector of real data values. The values in vx correspond to the x-values.
2. vy is an n-element vector of real data values. These correspond to the y-values.
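As a rough cross-check outside the worksheet, the same kind of fit can be sketched in Python using the standard least squares formulas. The function name least_squares_line and the sample data are hypothetical, and returning the intercept before the slope is an assumption made for this sketch.

```python
import numpy as np

def least_squares_line(vx, vy):
    """Return (intercept, slope) of the least squares line through the points."""
    vx = np.asarray(vx, dtype=float)
    vy = np.asarray(vy, dtype=float)
    x_mean, y_mean = vx.mean(), vy.mean()
    # Slope: sum of products of deviations divided by the sum of squared x-deviations.
    slope = np.sum((vx - x_mean) * (vy - y_mean)) / np.sum((vx - x_mean) ** 2)
    # The least squares line always passes through the point of means.
    intercept = y_mean - slope * x_mean
    return intercept, slope

# Made-up data lying exactly on y = 1 + 2x.
b, m = least_squares_line([0, 1, 2, 3], [1, 3, 5, 7])

def f(x):
    """Line of best fit built from the fitted parameters."""
    return b + m * x
```

With this data the fit recovers an intercept of 1 and a slope of 2, so f(4) evaluates to 9.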
Outputs of the Function
Intercept:
Slope:
Define a function which describes the line of best fit by typing:
It is a good idea to examine the data and the line graphically to make a visual inspection of the goodness of the fit.
There is also a built-in correlation function, which returns the coefficient of correlation of the data.
The closer this value is to 1 or -1, the more closely the data follow a straight line; a value near 0 indicates little or no linear relationship.
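A minimal Python sketch of the same quantity, assuming the coefficient of correlation meant here is the usual Pearson coefficient. The helper name pearson_r and the data are hypothetical; np.corrcoef is the NumPy routine that computes the correlation matrix.

```python
import numpy as np

def pearson_r(vx, vy):
    """Pearson correlation coefficient of two equal-length data vectors."""
    # np.corrcoef returns a 2x2 matrix; the off-diagonal entry is r.
    return float(np.corrcoef(vx, vy)[0, 1])

# Perfectly linear, increasing data.
r_up = pearson_r([0, 1, 2, 3], [1, 3, 5, 7])

# Perfectly linear, decreasing data.
r_down = pearson_r([0, 1, 2, 3], [7, 5, 3, 1])
```

Both data sets lie exactly on a line, so the coefficient has magnitude 1 in each case; only its sign differs, matching the sign of the slope.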
Exponential Data
The slope and intercept are also useful when considering exponential data. Specifically, if x and y are related by the model function
y = A·e^(k·x)
then taking the natural log of both sides gives a relationship that is linear in x:
ln(y) = ln(A) + k·x
The line function can therefore be applied to vx and the logs of the values in vy. So, you can use slope and intercept to determine the coefficients A and k in the relationship above:
A = e^intercept
and
k = slope
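The log-transform approach can be sketched in Python, assuming the exponential model has the form y = A·e^(k·x) and that all y-values are positive. The data below is synthetic, generated from known coefficients so the result can be checked; np.polyfit stands in for the straight-line fit.

```python
import numpy as np

# Synthetic data generated from y = A * exp(k * x) with known coefficients.
A_true, k_true = 2.0, 0.5
vx = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
vy = A_true * np.exp(k_true * vx)

# Fit a straight line to (x, ln y). For degree 1, np.polyfit returns
# the coefficients highest power first: [slope, intercept].
slope, intercept = np.polyfit(vx, np.log(vy), 1)

# Undo the log transform: ln(y) = ln(A) + k*x.
k = slope
A = np.exp(intercept)
```

Note that fitting in log space minimizes the squared error of ln(y) rather than of y, so on noisy data the result can differ from a direct nonlinear fit.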
Medfit
If you find that your data is not well fit by the parameters returned by line, you may want to use the medfit function. medfit applies a median-median regression to the data. Because this method is built from medians rather than means, it is less sensitive to extreme outliers and may return a line which fits the bulk of the data more accurately.
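To illustrate the idea, here is a sketch of the textbook median-median (resistant) line in Python. It is not necessarily the exact algorithm medfit uses: the grouping rule for data sets whose size is not a multiple of three, and the function name median_median_line, are assumptions of this sketch.

```python
import numpy as np

def median_median_line(vx, vy):
    """Return (intercept, slope) of the textbook median-median line."""
    vx = np.asarray(vx, dtype=float)
    vy = np.asarray(vy, dtype=float)
    n = len(vx)
    if n < 3:
        raise ValueError("need at least three points")

    # Order the points by ascending x.
    order = np.argsort(vx, kind="stable")
    x, y = vx[order], vy[order]

    # Split into left, middle, and right groups. The outer groups are kept
    # equal in size: a remainder of 1 goes to the middle group, a remainder
    # of 2 is shared by the outer groups (assumed convention).
    base, rem = divmod(n, 3)
    n_outer = base + (1 if rem == 2 else 0)
    left = slice(0, n_outer)
    middle = slice(n_outer, n - n_outer)
    right = slice(n - n_outer, n)

    # Summary point of each group: median of x and median of y.
    xl, yl = np.median(x[left]), np.median(y[left])
    xm, ym = np.median(x[middle]), np.median(y[middle])
    xr, yr = np.median(x[right]), np.median(y[right])

    # Slope from the outer summary points; intercept averaged over all three.
    slope = (yr - yl) / (xr - xl)
    intercept = ((yl - slope * xl) + (ym - slope * xm) + (yr - slope * xr)) / 3.0
    return intercept, slope

# Made-up data on y = 1 + 2x, except for one extreme outlier at the end.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [3, 5, 7, 9, 11, 13, 15, 17, 100]
b, m = median_median_line(xs, ys)
```

In this example the outlier does not move any group median, so the sketch still recovers an intercept of 1 and a slope of 2, whereas a least squares fit to the same points would be pulled toward the outlier.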
The medfit Function
medfit returns the parameters (slope and intercept) of the median-median regression line for the data in vx and vy.
Arguments of the Function
medfit takes two arguments.
medfit(vx,vy)
1. vx is an n-element vector of real data values. The values in vx correspond to the x-values.
2. vy is an n-element vector of real data values. These correspond to the y-values.
Outputs of the Function
Intercept:
Slope:
Define a function which describes the median-median line by typing: