Some Interesting Points
- Geometrically speaking, linear regression methods finds the closest path from the true data to a hypersuface spanned by the data vectors. By definition, each set of data is viewed as a basis vector. The so called closed path to the hypersuface is basically the path that is perpendicular to the surface. Thus we know the prediction we are looking for is a projection of true data onto the hypersuface.
- The argument above also indicates that degenerate data set, which contains data of the same direction, could cause problems since we have a redundant basis.
- Distribution of the parameters can be obtained for some categories of data. It might be a normal distribution.
- t-distribution, aka student’s t-distribution, is a category of distributions describing the deviation of estimated mean in a normal distribution from the true mean.
- The tail of the estimated distribution approaches the actual tail distribution as the sample size increases.
- Z score can be used to test the significance of the statistics.
“Roughly a Z score larger than two in absolute value is significantly nonzero at the p=0.05 level.” The author said in the caption of Table 3.2
- F statistic
- Eqn 3.14: plug in the definition of z and read again.