Regression Lines

Given the means ($\bar{x}, \bar{y}$), standard deviations ($s_x, s_y$), and correlation coefficient ($r$), the regression equations are:

Regression Equation of $y$ on $x$

This equation predicts $y$ given $x$.

- Slope: $b_{yx} = r \frac{s_y}{s_x}$
- Equation: $y - \bar{y} = b_{yx}(x - \bar{x})$
- Rearranged: $y = \bar{y} + b_{yx}(x - \bar{x})$

Regression Equation of $x$ on $y$

This equation predicts $x$ given $y$.

- Slope: $b_{xy} = r \frac{s_x}{s_y}$
- Equation: $x - \bar{x} = b_{xy}(y - \bar{y})$
- Rearranged: $x = \bar{x} + b_{xy}(y - \bar{y})$

Example Problem Solution

Given Data:

- Mean of Mathematics ($x$): $\bar{x} = 475$
- Mean of Physics ($y$): $\bar{y} = 39.5$
- Standard deviation of Mathematics ($x$): $s_x = 16.8$
- Standard deviation of Physics ($y$): $s_y = 10.8$
- Correlation coefficient: $r = 0.95$

1. Equation of Regression Line $y$ on $x$

Calculate the slope $b_{yx}$:

$$ b_{yx} = r \frac{s_y}{s_x} = 0.95 \times \frac{10.8}{16.8} \approx 0.95 \times 0.642857 \approx 0.6107 $$

Substitute into the equation $y - \bar{y} = b_{yx}(x - \bar{x})$:

$$ y - 39.5 = 0.6107 (x - 475) $$

$$ y = 0.6107x - 0.6107 \times 475 + 39.5 $$

$$ y = 0.6107x - 290.0825 + 39.5 $$

$$ y = 0.6107x - 250.5825 $$

2. Equation of Regression Line $x$ on $y$

Calculate the slope $b_{xy}$:

$$ b_{xy} = r \frac{s_x}{s_y} = 0.95 \times \frac{16.8}{10.8} \approx 0.95 \times 1.555556 \approx 1.4778 $$

Substitute into the equation $x - \bar{x} = b_{xy}(y - \bar{y})$:

$$ x - 475 = 1.4778 (y - 39.5) $$

$$ x = 1.4778y - 1.4778 \times 39.5 + 475 $$

$$ x = 1.4778y - 58.3731 + 475 $$

$$ x = 1.4778y + 416.6269 $$

3. Estimate $y$ for $x=30$

Using the regression equation of $y$ on $x$ ($y = 0.6107x - 250.5825$), substitute $x=30$:

$$ y = 0.6107 \times 30 - 250.5825 $$

$$ y = 18.321 - 250.5825 $$

$$ y = -232.2615 $$

Note: The value $x=30$ lies far below the given mean $\bar{x} = 475$, about 26 standard deviations away, since $(475 - 30)/16.8 \approx 26.5$. This is an extrapolation, and the result $y=-232.2615$ is a negative score, which is not meaningful in the context of examination scores.
This highlights the importance of not extrapolating too far beyond the observed data range when using regression models.
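
The calculation above can be checked with a short script. The following is a minimal Python sketch, not part of the original solution: the function name `regression_lines` and the variable names are illustrative assumptions, and the code uses only the formulas $b_{yx} = r \frac{s_y}{s_x}$ and $b_{xy} = r \frac{s_x}{s_y}$ together with the given summary statistics.

```python
def regression_lines(x_mean, y_mean, s_x, s_y, r):
    """Return ((slope, intercept) of y on x, (slope, intercept) of x on y).

    Hypothetical helper for illustration; it works from summary
    statistics only, not from raw data.
    """
    b_yx = r * s_y / s_x            # slope of the line predicting y from x
    b_xy = r * s_x / s_y            # slope of the line predicting x from y
    a_yx = y_mean - b_yx * x_mean   # intercept: y = a_yx + b_yx * x
    a_xy = x_mean - b_xy * y_mean   # intercept: x = a_xy + b_xy * y
    return (b_yx, a_yx), (b_xy, a_xy)


# Summary statistics from the example (x = Mathematics, y = Physics)
x_mean, y_mean = 475, 39.5
s_x, s_y = 16.8, 10.8
r = 0.95

(b_yx, a_yx), (b_xy, a_xy) = regression_lines(x_mean, y_mean, s_x, s_y, r)

# Estimate y at x = 30 using the y-on-x line
y_at_30 = a_yx + b_yx * 30

print(f"y on x: y = {b_yx:.4f} x + ({a_yx:.4f})")
print(f"x on y: x = {b_xy:.4f} y + ({a_xy:.4f})")
print(f"estimated y at x = 30: {y_at_30:.4f}")

# Sanity check: the product of the two slopes equals r squared
assert abs(b_yx * b_xy - r**2) < 1e-12
```

Because the script keeps the slopes unrounded, its intercepts and the estimate at $x=30$ (about $-232.27$) can differ in the later decimal places from a hand calculation that first rounds the slopes to four decimals.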