diff --git a/education/statistics/Correlation and Regression.md b/education/statistics/Correlation and Regression.md
index cc649fb..423f2ee 100644
--- a/education/statistics/Correlation and Regression.md
+++ b/education/statistics/Correlation and Regression.md
@@ -83,12 +83,18 @@ Given a scatter diagram where the average of each set lies on the point $(75, 70
 ### The Regression Line/Least Squared Regression Line (LSRL)
 - This line has a more moderate slope than the SD line. it does not go through the peaks of the "football"
 - The regression line is *used to predict* the y variable when the x variable is given
-- The regression line also goes through the point of averages
+- The regression line goes through the point of averages
 $$ slope = r(\frac{\sigma_y}{\sigma_x}) $$
 - You can find the regression line by multiplying $\sigma_y$ by $r$, for the rise, then using $\sigma_x$ for the run from the point of averages.
 
 The below formula can be used to predict a y value given a 5 number summary of a set.
 $$ \hat{y} = \frac{x-\bar{x}}{\sigma_x} * r * \sigma_y + \bar{y} $$
+1. Find $z_x$
+2. Multiply $z_x$ by $r$
+3. Multiply that by $\sigma_y$
+4. Add the average of $y$
+
+
 # Terminology
 | Term | Definition |
 | -- | -- |
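
Note on the four steps added in this change: they can be checked numerically with a minimal sketch like the one below. The function name `predict_y` and the sample values are hypothetical and chosen only for illustration; they do not come from the notes. The sketch assumes $\sigma_x > 0$ and that "5 number summary" here means $\bar{x}$, $\sigma_x$, $\bar{y}$, $\sigma_y$, and $r$.

```python
def predict_y(x, mean_x, sd_x, mean_y, sd_y, r):
    """Predict y from x by following the four steps in the notes.

    All parameter names are illustrative assumptions, not names from the notes.
    """
    if sd_x <= 0:
        raise ValueError("sd_x must be positive")
    z_x = (x - mean_x) / sd_x   # 1. Find z_x
    z_y_hat = z_x * r           # 2. Multiply z_x by r
    rise = z_y_hat * sd_y       # 3. Multiply that by sigma_y
    return rise + mean_y        # 4. Add the average of y


# Hypothetical data: x has mean 50 and SD 10, y has mean 60 and SD 8, r = 0.5.
print(predict_y(70, mean_x=50, sd_x=10, mean_y=60, sd_y=8, r=0.5))  # 68.0
```

With these made-up values, $x = 70$ is two SDs above $\bar{x}$, so the prediction is $0.5 \times 2 = 1$ SD above $\bar{y}$, that is $60 + 8 = 68$. Plugging in $x = \bar{x}$ returns $\bar{y}$, which agrees with the bullet stating that the regression line goes through the point of averages.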