### Sampling Distribution of the Mean

If the sample size is large or the population is normal:

- **Mean of sample means:** $\mu_{\bar{x}} = \mu$
- **Standard error:** $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$

#### Example: Probability of Sample Mean

- Population mean ($\mu$) = 50, Population SD ($\sigma$) = 10, Sample size ($n$) = 25

1. **Mean of sample means:** $\mu_{\bar{x}} = 50$
2. **Standard error:** $\sigma_{\bar{x}} = \frac{10}{\sqrt{25}} = 2$
3. **Probability sample mean > 54:**
   - Z-score: $z = \frac{54 - 50}{2} = 2$
   - $P(Z > 2) = 0.0228$

### Sampling Distribution of a Proportion

- **Mean:** $\mu_{\hat{p}} = p$
- **Standard error:** $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$

#### Example: Standard Error for Proportion

- Population proportion ($p$) = 0.60, Sample size ($n$) = 100
- **Standard error (SE):** $SE = \sqrt{\frac{(0.6)(0.4)}{100}} = 0.049$

### Introduction to Estimation: Definitions

- **Point Estimate:** Single best guess of a parameter (e.g., $\bar{x}$ estimates $\mu$).
- **Interval Estimate:** Range likely containing the population parameter.
- **Confidence Level:** Probability the interval captures the true parameter (e.g., 95%).

### Confidence Interval for Mean ($\sigma$ known)

- **Formula:** $\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$

#### Example: CI for Mean

- $\bar{x} = 80$, $\sigma = 12$, $n = 36$, 95% confidence ($z = 1.96$)

1. **Standard error:** $\frac{12}{\sqrt{36}} = 2$
2. **Margin of error:** $1.96 \times 2 = 3.92$
3. **Interval:** $80 \pm 3.92 \implies (76.08, 83.92)$

### Sample Size Formula

- **Formula:** $n = \left(\frac{z_{\alpha/2} \sigma}{B}\right)^2$ (where $B$ = desired margin of error)

#### Example: Calculating Sample Size

- Estimate within 2 units ($B=2$), $\sigma=10$, 95% confidence ($z=1.96$)
- $n = \left(\frac{1.96 \times 10}{2}\right)^2 = 96.04 \implies$ Round up to $n = 97$

### Hypothesis Testing: Definitions

- **Null Hypothesis ($H_0$):** Claim assumed true.
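The three worked examples above (probability of a sample mean, CI for a mean with known $\sigma$, and the sample-size formula) can be double-checked with a short Python sketch using only the standard library. The normal CDF is built from `math.erf`, so no statistics package is assumed.

```python
import math

# Standard normal CDF via the error function (no SciPy needed)
def norm_cdf(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Sampling distribution example: mu = 50, sigma = 10, n = 25
se = 10 / math.sqrt(25)                  # standard error = 2.0
z = (54 - 50) / se                       # z = 2.0
p = 1 - norm_cdf(z)                      # P(Z > 2), about 0.0228

# CI for the mean (sigma known): x-bar = 80, sigma = 12, n = 36, 95%
me = 1.96 * 12 / math.sqrt(36)           # margin of error = 3.92
ci = (round(80 - me, 2), round(80 + me, 2))   # endpoints 76.08 and 83.92

# Sample size for margin B = 2 with sigma = 10, 95% confidence
n = math.ceil((1.96 * 10 / 2) ** 2)      # 96.04 rounded UP gives 97
```

Note that the sample-size result is always rounded up (`math.ceil`), never to the nearest integer, so the margin-of-error target is still met.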
- **Alternative Hypothesis ($H_1$):** What you seek evidence for.
- **Type I Error:** Rejecting a true $H_0$ (probability = $\alpha$).
- **Type II Error:** Failing to reject a false $H_0$.
- **Significance Level ($\alpha$):** Maximum probability of a Type I error (usually 0.05).
- **p-value:** Probability of observing data as extreme as, or more extreme than, that observed, assuming $H_0$ is true.

### Z Test for Mean ($\sigma$ known)

- **Formula:** $z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$

#### Example: Z Test

- Claim $\mu = 100$, $\bar{x} = 104$, $\sigma = 12$, $n = 36$, $\alpha = 0.05$

1. **Hypotheses:** $H_0: \mu = 100$, $H_1: \mu \neq 100$
2. **Test statistic:** $z = \frac{104 - 100}{12/\sqrt{36}} = \frac{4}{2} = 2$
3. **Critical values:** $\pm 1.96$
4. **Decision:** $2 > 1.96 \implies$ Reject $H_0$.
5. **Conclusion:** The mean is significantly different from 100.

### Inference About a Population: Mean with Unknown $\sigma$ (t-test)

- **Formula:** $t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$
- **Degrees of freedom (df):** $n - 1$
- **Confidence Interval:** $\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}$

#### Example: CI for Mean (t-test)

- $\bar{x} = 52$, $s = 8$, $n = 16$, 95% confidence, $df = 15 \implies t = 2.131$

1. **SE:** $8/\sqrt{16} = 2$
2. **ME:** $2.131 \times 2 = 4.262$
3. **CI:** $52 \pm 4.262 \implies (47.74, 56.26)$

### Inference About a Population: Population Variance

- **Test Statistic:** $\chi^2 = \frac{(n-1)s^2}{\sigma^2}$ (use the chi-square table)

#### Example: Test for Variance

- $n=10$, $s^2=25$. Test $H_0: \sigma^2=16$.
- $\chi^2 = \frac{(10-1) \times 25}{16} = \frac{9 \times 25}{16} = 14.06$

### Inference About a Population: Population Proportion

- **Test Statistic:** $z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}}$
- **Confidence Interval:** $\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$

#### Example: CI for Proportion

- Sample: 120 of 200 like the product $\implies \hat{p} = 120/200 = 0.60$
- 95% CI ($z=1.96$):
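The z-test, the t-based CI, and the chi-square statistic above are all one-line computations. A quick Python sketch (using the example numbers from these sections; the critical values 1.96 and 2.131 are taken from the tables as given):

```python
import math

# Z test for a mean (sigma known): H0: mu = 100 vs H1: mu != 100
z = (104 - 100) / (12 / math.sqrt(36))   # = 2.0
reject = abs(z) > 1.96                   # two-tailed test at alpha = 0.05

# t-based CI for a mean: x-bar = 52, s = 8, n = 16, t(0.025, df=15) = 2.131
se_t = 8 / math.sqrt(16)                 # = 2.0
ci = (52 - 2.131 * se_t, 52 + 2.131 * se_t)   # about (47.74, 56.26)

# Chi-square statistic for a variance test: n = 10, s^2 = 25, H0: sigma^2 = 16
chi2 = (10 - 1) * 25 / 16                # = 14.0625, reported as 14.06
```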
1. **SE:** $\sqrt{\frac{(0.6)(0.4)}{200}} = 0.0346$
2. **ME:** $1.96 \times 0.0346 = 0.0678$
3. **CI:** $0.60 \pm 0.0678 \implies (0.532, 0.668)$

### Comparing Two Samples: Definitions

- **Independent Samples:** Two separate, unrelated groups.
- **Paired Samples (Matched Pairs):** Same subjects measured twice or naturally matched.
- **Difference of Means ($\mu_1 - \mu_2$):** True difference between population averages.
- **Difference of Proportions ($p_1 - p_2$):** True difference between population percentages.

### Comparing Two Means (Large Samples or Known $\sigma$)

- **Formula (Z Test for Two Means):** $z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
- Usually, $H_0: \mu_1 - \mu_2 = 0$.

#### Example: Z Test for Two Means

- Factory A: $\bar{x}_1 = 105$, $\sigma_1 = 12$, $n_1 = 36$
- Factory B: $\bar{x}_2 = 100$, $\sigma_2 = 10$, $n_2 = 25$
- Test if the means differ ($\alpha = 0.05$)

1. **Hypotheses:** $H_0: \mu_1 - \mu_2 = 0$, $H_1: \mu_1 - \mu_2 \neq 0$
2. **Test Statistic:**
   - Difference: $105 - 100 = 5$
   - SE: $\sqrt{\frac{12^2}{36} + \frac{10^2}{25}} = \sqrt{4+4} = \sqrt{8} = 2.828$
   - $z = 5/2.828 = 1.77$
3. **Decision:** Critical values = $\pm 1.96$. Since $1.77$ falls inside the non-rejection region ($-1.96 < 1.77 < 1.96$), fail to reject $H_0$.
4. **Conclusion:** No significant difference in means.

### Confidence Interval for Two Means ($\sigma$ known)

- **Formula:** $(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$

#### Example: CI for Two Means

- Difference = 5, SE = 2.828, 95% confidence ($z = 1.96$)
- **Margin of error:** $1.96 \times 2.828 = 5.54$
- **Interval:** $5 \pm 5.54 \implies (-0.54, 10.54)$
- Since 0 is inside the interval, there is no clear difference.
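The two-means z-test and its confidence interval share the same standard error, so both can be checked together. A minimal sketch with the Factory A/B numbers from the example:

```python
import math

# Z test for two means (sigmas known): Factory A vs Factory B
diff = 105 - 100
se = math.sqrt(12**2 / 36 + 10**2 / 25)  # sqrt(4 + 4), about 2.828
z = diff / se                            # about 1.77
significant = abs(z) > 1.96              # False: fail to reject H0

# 95% CI for the difference of means reuses the same SE
me = 1.96 * se
ci = (diff - me, diff + me)              # about (-0.54, 10.54)
zero_inside = ci[0] < 0 < ci[1]          # True: consistent with the test
```

Note how the test and the interval agree: failing to reject $H_0$ at 5% corresponds exactly to 0 lying inside the 95% CI.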
### Comparing Two Means (Unknown $\sigma$, t-test)

- **Equal Variances Assumed (Pooled Variance):**
  - $s_p^2 = \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}$
  - $t = \frac{(\bar{x}_1 - \bar{x}_2)}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$
  - $df = n_1 + n_2 - 2$

#### Example: Pooled Variance t-test

- Group A: $\bar{x}_1 = 82, s_1 = 6, n_1 = 10$
- Group B: $\bar{x}_2 = 78, s_2 = 5, n_2 = 10$

1. **Pooled Variance:** $s_p^2 = \frac{9(6^2) + 9(5^2)}{10+10-2} = \frac{9(36) + 9(25)}{18} = \frac{324+225}{18} = \frac{549}{18} = 30.5$
   - $s_p = \sqrt{30.5} \approx 5.52$
2. **t statistic:** $t = \frac{82-78}{5.52 \sqrt{\frac{1}{10} + \frac{1}{10}}} = \frac{4}{5.52 \sqrt{0.2}} = \frac{4}{5.52 \times 0.447} = \frac{4}{2.467} \approx 1.62$
3. **Decision:** Compare with the critical t for $df = 18$ ($t_{0.025,18} = 2.101$ at $\alpha = 0.05$, two-tailed). Since $1.62 < 2.101$, fail to reject $H_0$.

### Paired Samples Test

Used when observations come in pairs (e.g., before/after).

- Let $d = x_1 - x_2$ (difference for each pair)
- **Formula:** $t = \frac{\bar{d}}{s_d/\sqrt{n}}$
  - $\bar{d}$ = average of differences
  - $s_d$ = standard deviation of differences

#### Example: Paired t-test

- 5 students' scores before/after tutoring:

| Student | Before | After | d (After-Before) |
|---------|--------|-------|------------------|
| 1       | 70     | 75    | 5                |
| 2       | 68     | 72    | 4                |
| 3       | 74     | 78    | 4                |
| 4       | 71     | 76    | 5                |
| 5       | 69     | 73    | 4                |

1. **Mean Difference:** $\bar{d} = \frac{5+4+4+5+4}{5} = 4.4$
2. **Compute $s_d$:** the deviations from $\bar{d}$ are $0.6, -0.4, -0.4, 0.6, -0.4$, so $s_d = \sqrt{\frac{2(0.6^2) + 3(0.4^2)}{5-1}} = \sqrt{\frac{1.2}{4}} \approx 0.548$
3. **t statistic:** $t = \frac{4.4}{0.548/\sqrt{5}} = \frac{4.4}{0.548/2.236} = \frac{4.4}{0.245} \approx 17.96$
   - This huge value suggests tutoring helped.
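Both t-tests above can be verified with the standard library; `statistics.stdev` uses the $n-1$ divisor, which matches the $s_d$ formula in the paired example.

```python
import math
import statistics

# Pooled-variance t-test: Group A (mean 82, s 6) vs Group B (mean 78, s 5)
n1 = n2 = 10
sp2 = ((n1 - 1) * 6**2 + (n2 - 1) * 5**2) / (n1 + n2 - 2)         # 30.5
t_pooled = (82 - 78) / (math.sqrt(sp2) * math.sqrt(1/n1 + 1/n2))  # about 1.62

# Paired t-test on the tutoring differences (After - Before)
d = [5, 4, 4, 5, 4]
d_bar = statistics.mean(d)                    # 4.4
s_d = statistics.stdev(d)                     # sample SD, about 0.548
t_paired = d_bar / (s_d / math.sqrt(len(d)))  # about 17.96
```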
### Comparing Two Proportions

- **Test Statistic:** $z = \frac{(\hat{p}_1 - \hat{p}_2)}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$
  - Where the pooled proportion $\hat{p} = \frac{x_1+x_2}{n_1+n_2}$
- **Confidence Interval:** $(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2} \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$

#### Example: Z Test for Two Proportions

- Brand A: 40 successes out of 100 ($\hat{p}_1 = 0.40$)
- Brand B: 30 successes out of 100 ($\hat{p}_2 = 0.30$)

1. **Pooled Proportion:** $\hat{p} = \frac{40+30}{100+100} = \frac{70}{200} = 0.35$
2. **Standard Error (denominator of z):** $\sqrt{0.35(0.65)\left(\frac{1}{100} + \frac{1}{100}\right)} = \sqrt{0.35 \times 0.65 \times 0.02} = \sqrt{0.00455} \approx 0.067$
3. **z statistic:** $z = \frac{0.40 - 0.30}{0.067} = \frac{0.10}{0.067} \approx 1.49$
   - Not significant at the 5% level (critical values $\pm 1.96$).

### Comparing Two Variances

- **F Test:** $F = \frac{s_1^2}{s_2^2}$ (put the larger variance in the numerator)
- **Degrees of freedom:** $df_1 = n_1 - 1$, $df_2 = n_2 - 1$

#### Example: F Test for Variances

- Sample variances: $s_1^2 = 25, s_2^2 = 16$
- $F = 25/16 = 1.56$
- Compare with the F-table critical value.

### Hypothesis Testing Steps

1. State $H_0$ and $H_1$.
2. Choose the significance level ($\alpha$).
3. Compute the test statistic.
4. Find the critical value or p-value.
5. Make a decision: Reject / Fail to reject $H_0$.
6. Write the business conclusion.

### ANOVA: Hypotheses

- **Null Hypothesis ($H_0$):** $\mu_1 = \mu_2 = \mu_3 = \dots = \mu_k$ (all population means are equal).
- **Alternative Hypothesis ($H_1$):** At least one mean is different.

### Why Not Use Many t-tests?

- Doing many t-tests inflates the Type I error rate. ANOVA controls the overall error rate.

### Core ANOVA Idea

Total variation is split into:

- **Between-Group Variation (SSA):** Differences caused by treatment/group means.
- **Within-Group Variation (SSE):** Natural random variation inside each group.
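The two-proportion z-test is easy to check in Python. Note that the exact z is about 1.48; the example above gets 1.49 only because it rounds the SE to 0.067 first.

```python
import math

# Z test for two proportions: Brand A 40/100 vs Brand B 30/100
x1, n1, x2, n2 = 40, 100, 30, 100
p1, p2 = x1 / n1, x2 / n2
p_pool = (x1 + x2) / (n1 + n2)                         # 0.35
se = math.sqrt(p_pool * (1 - p_pool) * (1/n1 + 1/n2))  # about 0.0675
z = (p1 - p2) / se                                     # about 1.48 unrounded
significant = abs(z) > 1.96                            # False at the 5% level

# F ratio for two variances, larger variance on top
F = 25 / 16                                            # 1.5625, reported as 1.56
```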
If between-group variation is much larger than within-group variation, the group means likely differ.

### ANOVA: Sums of Squares

- **Total Sum of Squares (SST):** $SST = \sum(x_{ij} - \bar{x}_{..})^2$
- **Treatment (Between) Sum of Squares (SSA):** $SSA = \sum n_j (\bar{x}_j - \bar{x}_{..})^2$
- **Error (Within) Sum of Squares (SSE):** $SSE = \sum \sum(x_{ij} - \bar{x}_j)^2$
- **Relationship:** $SST = SSA + SSE$

### ANOVA: Degrees of Freedom

- $k$ = number of groups, $N$ = total observations
- **Between Groups ($df_A$):** $k - 1$
- **Within Groups ($df_E$):** $N - k$
- **Total ($df_T$):** $N - 1$

### ANOVA: Mean Squares

- **Between (MSA):** $MSA = \frac{SSA}{k-1}$
- **Within (MSE):** $MSE = \frac{SSE}{N-k}$

### ANOVA: F Test Statistic

- **Formula:** $F = \frac{MSA}{MSE}$
- A large F is evidence that the group means differ.

### ANOVA Table

| Source  | SS  | df    | MS  | F       |
|---------|-----|-------|-----|---------|
| Between | SSA | $k-1$ | MSA | MSA/MSE |
| Within  | SSE | $N-k$ | MSE |         |
| Total   | SST | $N-1$ |     |         |

### ANOVA: Full Example

Compare exam scores from 3 teaching methods:

- **Method A:** 8, 9, 7
- **Method B:** 5, 6, 4
- **Method C:** 9, 10, 8

1. **Group Means:** $\bar{x}_A = 8$, $\bar{x}_B = 5$, $\bar{x}_C = 9$. Grand mean $\bar{x}_{..} = 66/9 = 7.33$.
2. **Compute SSA:** $SSA = 3(8-7.33)^2 + 3(5-7.33)^2 + 3(9-7.33)^2 = 1.35 + 16.29 + 8.37 = 26.01$
3. **Compute SSE:** (sum of squared deviations within each group)
   - A: $(8-8)^2+(9-8)^2+(7-8)^2 = 0+1+1 = 2$
   - B: $(5-5)^2+(6-5)^2+(4-5)^2 = 0+1+1 = 2$
   - C: $(9-9)^2+(10-9)^2+(8-9)^2 = 0+1+1 = 2$
   - $SSE = 2+2+2 = 6$
4. **Degrees of Freedom:** $k=3, N=9$. $df_A = k-1 = 2$, $df_E = N-k = 6$.
5. **Mean Squares:** $MSA = 26.01/2 = 13.005$, $MSE = 6/6 = 1$
6. **F Statistic:** $F = 13.005/1 = 13.005$
7. **Decision:** Compare with the F critical value ($\alpha=0.05$, $df_1=2$, $df_2=6$): $F_{crit} \approx 5.14$. Since $13.005 > 5.14$, reject $H_0$.
8. **Conclusion:** At least one teaching method mean differs.
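The full ANOVA example above can be reproduced in a few lines with the standard library. With the exact grand mean $66/9$ the result is $SSA = 26$ and $F = 13$; the worked example gets 26.01 and 13.005 only because it rounds the grand mean to 7.33.

```python
import statistics

# One-way ANOVA for the three teaching methods
groups = {"A": [8, 9, 7], "B": [5, 6, 4], "C": [9, 10, 8]}
all_scores = [s for g in groups.values() for s in g]
grand_mean = statistics.mean(all_scores)   # 66/9, about 7.333

# Between-group and within-group sums of squares
ssa = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups.values())
sse = sum((s - statistics.mean(g)) ** 2 for g in groups.values() for s in g)

k, N = len(groups), len(all_scores)        # k = 3 groups, N = 9 observations
msa = ssa / (k - 1)                        # about 13.0
mse = sse / (N - k)                        # 1.0
F = msa / mse                              # about 13.0
```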
### ANOVA Assumptions

- **Independence:** Observations are independent.
- **Normality:** Each population is approximately normal.
- **Equal Variances (Homoscedasticity):** Population variances are roughly equal.

### If $H_0$ is Rejected, Then What?

ANOVA only tells you that *some* mean differs. Use post-hoc comparisons (e.g., Tukey, Bonferroni) to find *which* means differ.

### ANOVA: Common Exam Interpretation

- **If p-value $< \alpha$:** Reject $H_0$. At least one mean differs.
- **If p-value $\geq \alpha$:** Fail to reject $H_0$. No significant evidence of differences.

### Calculator Shortcuts (TI-84 style)

1. **Enter Data:** STAT → EDIT (put each group in L1, L2, L3).
2. **Run ANOVA:** STAT → TESTS → ANOVA(.
3. **Use:** `ANOVA(L1,L2,L3)`.
4. **Outputs:** F statistic, p-value.

### Mega Decision Table

| Situation                | Use           |
|--------------------------|---------------|
| Compare 2 means          | t-test        |
| Compare 3+ means         | ANOVA         |
| Compare percentages      | z-test        |
| Compare variances        | F-test        |
| Before/after same people | paired t-test |

### ANOVA: Most Common Mistakes

- Using multiple t-tests instead of ANOVA.
- Forgetting the grand mean.
- Mixing up SSE and SSA.
- Concluding all means differ when $H_0$ is rejected.
- Ignoring the assumptions.

### ANOVA: Ultra Quick Formula Sheet

- **Total Variation:** $SST = SSA + SSE$
- **Mean Squares:** $MSA = SSA/(k-1)$, $MSE = SSE/(N-k)$
- **Test Statistic:** $F = MSA/MSE$

### Simple Linear Regression: Model

- **Population Model:** $Y = \beta_0 + \beta_1 X + \epsilon$
  - $Y$: dependent variable
  - $X$: independent variable
  - $\beta_0$: intercept (population)
  - $\beta_1$: slope (population)
  - $\epsilon$: error term
- This is the true population model; you estimate it, you do not calculate it directly.

### Simple Linear Regression: Estimated Equation

- **Estimated Equation:** $\hat{Y} = b_0 + b_1 X$
  - $\hat{Y}$: predicted value of Y
  - $b_0$: sample estimate of the intercept
  - $b_1$: sample estimate of the slope
- **How to use:** Plug in X to get the predicted $\hat{Y}$. Used for prediction questions.
### Simple Linear Regression: Slope ($b_1$)

- **Formula:** $b_1 = \frac{\sum(x_i - \bar{x})(y_i - \bar{y})}{\sum(x_i - \bar{x})^2}$
- **Interpretation (EXAM FAVORITE):** $b_1$ is the change in Y for a 1-unit increase in X.
- Example: If $b_1 = 5$, then when X increases by 1, Y increases by 5.

### Simple Linear Regression: Intercept ($b_0$)

- **Formula:** $b_0 = \bar{y} - b_1 \bar{x}$
- **Interpretation:** Value of Y when X = 0 (may not always be meaningful in context).

### Simple Linear Regression: Correlation Coefficient (r)

- **Formula:** $r = \frac{\sum(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum(x_i - \bar{x})^2 \sum(y_i - \bar{y})^2}}$
- **Range:** $-1$ to $+1$
- **Measures:** Strength and direction of the linear relationship.
- **Link to regression:** If $b_1 > 0 \implies r > 0$; if $b_1 < 0 \implies r < 0$.

### Simple Linear Regression: Coefficient of Determination ($R^2$)

- **Formula:** $R^2 = r^2 = \frac{SSR}{SST}$
- **Measures:** The proportion (or percentage) of the variation in Y that is explained by the independent variable X.
- Example: If $R^2 = 0.64$, then 64% of the variation in Y is explained by X.

### Simple Linear Regression: Residual (Error)

- **Formula:** $e_i = Y_i - \hat{Y}_i$
- **Measures:** Difference between the actual Y value and the predicted Y value. Used to check the accuracy of predictions.

### Simple Linear Regression: Least Squares Idea

- The regression line is chosen to minimize the sum of the squared errors ($\sum e_i^2$).
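The slope, intercept, correlation, $R^2$, and residual formulas above can all be exercised on a small made-up dataset (the x/y values below are hypothetical, chosen only to keep the arithmetic simple):

```python
import statistics

# Hypothetical toy data, just to exercise the formulas
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

x_bar, y_bar = statistics.mean(x), statistics.mean(y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)

b1 = sxy / sxx                       # slope = 0.6
b0 = y_bar - b1 * x_bar              # intercept = 2.2
r = sxy / (sxx * syy) ** 0.5         # correlation coefficient
r2 = r ** 2                          # coefficient of determination

y_hat = [b0 + b1 * xi for xi in x]   # predictions from the fitted line
residuals = [yi - yh for yi, yh in zip(y, y_hat)]   # sum to zero
```

The residuals of a least-squares line always sum to zero, which is a handy sanity check on hand calculations.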
### Simple Linear Regression: Sums of Squares (VERY IMPORTANT)

- **Total Sum of Squares (SST):** $\sum(Y_i - \bar{Y})^2$ (total variation in Y)
- **Regression Sum of Squares (SSR):** $\sum(\hat{Y}_i - \bar{Y})^2$ (variation in Y explained by X)
- **Error Sum of Squares (SSE):** $\sum(Y_i - \hat{Y}_i)^2$ (unexplained variation in Y)
- **Relationship:** $SST = SSR + SSE$

### Simple Linear Regression: Standard Error of Estimate ($s_e$)

- **Formula:** $s_e = \sqrt{\frac{SSE}{n-2}}$
- **Measures:** The average distance that the observed values fall from the regression line. Smaller = better fit.

### Simple Linear Regression: Hypothesis Test for Slope ($\beta_1$)

- **Hypotheses:** $H_0: \beta_1 = 0$ (no linear relationship), $H_1: \beta_1 \neq 0$ (a linear relationship exists)
- **Test Statistic:** $t = \frac{b_1}{s_{b_1}}$ (with $df = n - 2$)
- **Standard Error of Slope ($s_{b_1}$):** $s_{b_1} = \frac{s_e}{\sqrt{\sum(x_i - \bar{x})^2}}$
- **How to use:** If $H_0$ is rejected, X is a useful predictor of Y.

### Simple Linear Regression: Confidence Interval for Slope ($\beta_1$)

- **Formula:** $b_1 \pm t_{\alpha/2} s_{b_1}$
- **How to use:** Provides a range of plausible values for the population slope. If 0 is NOT in the interval, the relationship is statistically significant.

### Simple Linear Regression: Assumptions (VERY TESTED)

1. **Linear relationship:** Y is linearly related to X.
2. **Errors have mean = 0:** $\epsilon$ values average to zero.
3. **Constant variance (Homoscedasticity):** The variance of $\epsilon$ is the same for all values of X.
4. **Errors are independent:** Error terms are not correlated with each other.
5. **Errors are normally distributed:** The error terms follow a normal distribution.

### Simple Linear Regression: Key Interpretations (MEMORIZE)

- **Intercept ($b_0$):** The predicted value of Y when X is 0.
- **Slope ($b_1$):** The expected change in Y for a one-unit increase in X.
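The sums-of-squares identity, the standard error of estimate, and the slope t-statistic can be checked together. This sketch fits a least-squares line to the same kind of hypothetical toy data (the x/y values are made up for illustration) and verifies $SST = SSR + SSE$ numerically:

```python
import math
import statistics

# Hypothetical toy data; fit the least-squares line first
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
x_bar, y_bar = statistics.mean(x), statistics.mean(y)
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]

# Sums of squares and the SST = SSR + SSE identity
sst = sum((yi - y_bar) ** 2 for yi in y)
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
assert abs(sst - (ssr + sse)) < 1e-9     # identity holds

n = len(x)
s_e = math.sqrt(sse / (n - 2))           # standard error of estimate
s_b1 = s_e / math.sqrt(sxx)              # standard error of the slope
t = b1 / s_b1                            # test statistic for H0: beta1 = 0, df = n - 2
```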
- **$R^2$:** The proportion of total variation in Y explained by the regression model (X).

### Simple Linear Regression: Common Mistakes

- **Correlation $\neq$ Causation:** A strong correlation does not imply that X causes Y.
- **Don't extrapolate:** Do not make predictions outside the range of the observed X values.
- **Outliers can distort results:** Extreme values can heavily influence the regression line.

### Multiple Linear Regression: Model

- **Population Model:** $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k + \epsilon$
- **Estimated Model:** $\hat{Y} = b_0 + b_1 X_1 + b_2 X_2 + \dots + b_k X_k$
- **How to use:** Each coefficient $b_i$ represents the effect of that specific independent variable $X_i$.

### Multiple Linear Regression: Key Interpretation (CRITICAL)

- **Coefficient ($b_i$):** The change in Y for a 1-unit increase in $X_i$, **holding all other independent variables constant**.
- **THIS PHRASE is essential for exams:** "holding all other variables constant".

### Multiple Linear Regression: $R^2$

- **Formula:** $R^2 = \frac{SSR}{SST}$
- **Measures:** The proportion of variation in Y explained by *all* independent variables together.
- **Property:** $R^2$ never decreases when more independent variables are added, even if they are not significant.

### Multiple Linear Regression: Adjusted $R^2$ (IMPORTANT)

- **Formula:** $R^2_{adj} = 1 - \left(\frac{SSE/(n-k-1)}{SST/(n-1)}\right)$
- **How to use:** Penalizes the model for adding unnecessary independent variables. Use $R^2_{adj}$ to compare models with different numbers of predictors.

### Multiple Linear Regression: Standard Error ($s_e$)

- **Formula:** $s_e = \sqrt{\frac{SSE}{n-k-1}}$
  - $k$ = number of independent variables.
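A quick numeric sketch of the adjusted $R^2$ formula, using hypothetical sums of squares (SST = 100, SSE = 40, n = 30, k = 2 are made-up values for illustration). It shows the key property: $R^2_{adj}$ is always below $R^2$ because of the degrees-of-freedom penalty.

```python
# Hypothetical values: total and error sums of squares, sample size, predictors
sst, sse = 100.0, 40.0
n, k = 30, 2

r2 = 1 - sse / sst                                   # plain R^2 = 0.60
r2_adj = 1 - (sse / (n - k - 1)) / (sst / (n - 1))   # about 0.570

# The penalty grows with k: the same fit with more predictors scores lower.
```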
### Multiple Linear Regression: Hypothesis Test for Each Coefficient ($\beta_i$)

- **Hypotheses:** $H_0: \beta_i = 0$, $H_1: \beta_i \neq 0$
- **Test Statistic:** $t = \frac{b_i}{s_{b_i}}$
- **How to use:** If $H_0$ is rejected, the variable $X_i$ is a statistically significant predictor of Y (given the other variables in the model).

### Multiple Linear Regression: Overall F-Test

- **Hypotheses:** $H_0: \beta_1 = \beta_2 = \dots = \beta_k = 0$ (none of the independent variables explain Y)
- **Test Statistic:** $F = \frac{MSR}{MSE}$
  - $MSR = SSR/k$
  - $MSE = SSE/(n-k-1)$
- **How to use:** Tests whether the regression model as a whole has any explanatory power. If $H_0$ is rejected, at least one independent variable is significant.

### Multiple Linear Regression: Key Issues (EXAM TRAPS)

- **Multicollinearity:** Independent variables are highly correlated with each other. Causes unstable coefficients and inflated standard errors.
- **Overfitting:** Including too many independent variables, leading to a model that performs well on the training data but poorly on new data.

### Multiple Linear Regression: Assumptions

Same as simple linear regression:

1. Linear relationship between Y and each $X_i$.
2. Errors have mean = 0.
3. Constant variance (homoscedasticity) of errors.
4. Errors are independent.
5. Errors are normally distributed.

### Final Ultra-Short Summary (MEMORIZE)

- **Simple Regression:** $\hat{Y} = b_0 + b_1 X$; $b_1$ = slope; $R^2$ = % explained; residual = error.
- **Multiple Regression:** Add more X's; interpret with "holding others constant"; use adjusted $R^2$; watch for multicollinearity.

### "How to Solve Questions" Flow

- **If given data:**
  1. Compute $b_1, b_0$.
  2. Write the regression equation.
  3. Predict $\hat{Y}$.
  4. Compute residuals.
  5. Find $R^2$.
  6. Interpret.
- **If hypothesis test:**
  1. State $H_0$ and $H_1$.
  2. Compute the test statistic ($t$ or $F$).
  3. Compare to the critical value / p-value.
  4. State the conclusion (Reject / Fail to reject $H_0$).
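As a final arithmetic sketch, the overall F-test reduces to two mean squares and a ratio. The sums of squares below (SSR = 60, SSE = 40, n = 30, k = 2) are hypothetical, chosen only to illustrate the calculation:

```python
# Overall F-test sketch: hypothetical sums of squares, n observations, k predictors
ssr, sse = 60.0, 40.0
n, k = 30, 2

msr = ssr / k                # regression mean square = 30.0
mse = sse / (n - k - 1)      # error mean square, about 1.48
F = msr / mse                # 20.25: compare with F(alpha; k, n-k-1) from the table
```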