diff --git a/education/statistics/Correlation and Regression.md b/education/statistics/Correlation and Regression.md
index 9751b03..ac1aec5 100644
--- a/education/statistics/Correlation and Regression.md
+++ b/education/statistics/Correlation and Regression.md
@@ -3,11 +3,24 @@
 # Correlation
 ## Scatter Diagrams
 A scatter diagram or scatter plot shows the relationship between two variables. One variable is on the X axis, the other on the Y axis.
+
+If a scatter diagram is football-shaped, it can be summarized using these five summary statistics:
+
+| Variable | Description |
+| -- | -- |
+| $ave_x$ | |
+| $SD_x$ | |
+| $ave_y$ | |
+| $SD_y$ | |
+| $r$ | |
+
+
+
 ### Association
-- Positive association is demonstrated when the dots are trend upward as $x$ increases.
-- Negative association is demonstrated when the the dots trend downward as $x$ increases.
-- Strong association is demonstrated when dots are clustered tightly together along a line.
-- Weak association is demonstrated when dots are not clustered tightly.
+- Positive association is demonstrated when the dots trend upward as $x$ increases ($r$ is positive).
+- Negative association is demonstrated when the dots trend downward as $x$ increases ($r$ is negative).
+- Strong association is demonstrated when dots are clustered tightly together along a line ($|r|$ is closer to 1).
+- Weak association is demonstrated when dots are not clustered tightly ($|r|$ is closer to 0).

 ## Correlation
 Correlation is between `-1` and `1`. Correlation near 1 means tight clustering, and correlation near 0 means loose clustering. $r$ is -1 if the points are on a line with negative slope, $r$ is positive 1 if the points are on a line with a positive slope. As $|r|$ gets closer to 1, the line points cluster more tightly around a line.
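
The five statistics named in the table ($ave_x$, $SD_x$, $ave_y$, $SD_y$, $r$) can be computed directly from paired data. The following is a minimal sketch, not part of the note itself: the function name `five_statistics` is hypothetical, and it assumes the SD divides by $n$ (the population form) and that $r$ is taken as the average of the products of the two variables in standard units.

```python
import math


def five_statistics(xs, ys):
    """Return (ave_x, sd_x, ave_y, sd_y, r) for paired observations."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("xs and ys must be non-empty and the same length")

    ave_x = sum(xs) / n
    ave_y = sum(ys) / n

    # Assumption: SD uses the divide-by-n (population) form.
    sd_x = math.sqrt(sum((x - ave_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - ave_y) ** 2 for y in ys) / n)
    if sd_x == 0 or sd_y == 0:
        raise ValueError("r is undefined when a variable has no spread")

    # r: average of the products of x and y, each converted to standard units.
    r = sum(
        ((x - ave_x) / sd_x) * ((y - ave_y) / sd_y)
        for x, y in zip(xs, ys)
    ) / n
    return ave_x, sd_x, ave_y, sd_y, r


if __name__ == "__main__":
    # Dots trend upward but do not sit exactly on a line.
    print(five_statistics([1, 2, 3, 4, 5], [1, 3, 2, 5, 4]))
```

With points that lie exactly on an upward-sloping line the function returns an $r$ of about 1, on a downward-sloping line about -1, and for the loosely upward data in the example roughly 0.8, which matches the sign and strength rules in the Association list.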