POP88162 Introduction to Quantitative Research Methods
Department of Political Science, Trinity College Dublin
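\[Y_i = \alpha + \beta X_i + \epsilon_i\]
where \(Y_i\) is the dependent variable for observation \(i\), \(X_i\) is the independent variable, \(\alpha\) is the intercept, \(\beta\) is the slope, and \(\epsilon_i\) is the error term.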
\[SSE = \sum_{i = 1}^{n} (Y_i - \hat{Y}_i)^2 = \sum_{i = 1}^{n} (Y_i - (\hat{\alpha} + \hat{\beta} X_i))^2\]
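As a minimal sketch of what minimising the SSE involves, the R code below computes \(\hat{\alpha}\) and \(\hat{\beta}\) from the closed-form least-squares solution and compares them with `lm()`. The built-in `cars` data set (with `speed` as \(X\) and `dist` as \(Y\)) is an assumption that stands in for the course data, which is not reproduced here.

```r
# Illustration only: built-in `cars` data stand in for the course data set
x <- cars$speed
y <- cars$dist

# Closed-form OLS estimates that minimise the SSE
beta_hat <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
alpha_hat <- mean(y) - beta_hat * mean(x)

# Sum of squared errors evaluated at the OLS estimates
sse <- sum((y - (alpha_hat + beta_hat * x))^2)

# The same estimates as returned by lm()
lm_check <- lm(dist ~ speed, data = cars)
c(alpha_hat = alpha_hat, beta_hat = beta_hat, sse = sse)
coef(lm_check)
```

Any other pair of values for the intercept and slope plugged into the SSE formula yields a larger sum than `sse`.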
\[t = \frac{\hat{\beta} - \beta_{H_0}}{\hat{\sigma}_{\hat{\beta}}}\]
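where \(\hat{\beta}\) is the estimated slope, \(\beta_{H_0}\) is the value of the slope under the null hypothesis, and \(\hat{\sigma}_{\hat{\beta}}\) is the estimated standard error of \(\hat{\beta}\).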
Note that in the very common case where the null hypothesis is \(\beta_{H_0} = 0\), the t-statistic simplifies to \(t = \frac{\hat{\beta}}{\hat{\sigma}_{\hat{\beta}}}\).
\[Y_i - \bar{Y} = (Y_i - \hat{Y_i}) + (\hat{Y_i} - \bar{Y})\]
\[\sum_{i = 1}^{n} (Y_i - \bar{Y})^2 = \sum_{i = 1}^{n} (Y_i - \hat{Y_i})^2 + \sum_{i = 1}^{n} (\hat{Y_i} - \bar{Y})^2\]
\[TSS = SSE + ESS\]
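where \(TSS\) is the total sum of squares, \(SSE\) is the sum of squared errors (residuals), and \(ESS\) is the explained sum of squares.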
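The decomposition can be checked numerically. The sketch below does so for a simple regression and also computes \(R^2 = ESS / TSS\), the quantity that `summary()` reports as "Multiple R-squared". The built-in `cars` data set is an assumption used in place of the course data.

```r
# Illustration only: built-in `cars` data stand in for the course data set
fit <- lm(dist ~ speed, data = cars)
y <- cars$dist
y_hat <- fitted(fit)

tss <- sum((y - mean(y))^2)      # total sum of squares
sse <- sum((y - y_hat)^2)        # sum of squared errors (residuals)
ess <- sum((y_hat - mean(y))^2)  # explained sum of squares

# The decomposition holds up to floating-point error
all.equal(tss, sse + ess)

# R-squared as the share of total variation explained by the model
r_squared <- ess / tss
all.equal(r_squared, summary(fit)$r.squared)
```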
Call:
lm(formula = gdp_per_capita ~ democracy_duration, data = democracy_gdp_2020)
Residuals:
   Min     1Q Median     3Q    Max
-44806  -8756  -4944   4820 163717
Coefficients:
                   Estimate Std. Error t value Pr(>|t|)
(Intercept)         5051.44    2370.78   2.131   0.0345 *
democracy_duration   182.22      35.15   5.185 5.99e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 20900 on 173 degrees of freedom
  (20 observations deleted due to missingness)
Multiple R-squared:  0.1345,    Adjusted R-squared:  0.1295
F-statistic: 26.88 on 1 and 173 DF,  p-value: 5.995e-07
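As a quick check on the output above, the t value for `democracy_duration` can be reproduced from the reported estimate and standard error under \(H_0: \beta = 0\). The difference in the last digit arises because the printed values are rounded:
\[t = \frac{\hat{\beta}}{\hat{\sigma}_{\hat{\beta}}} = \frac{182.22}{35.15} \approx 5.18\]
With a single independent variable, the F-statistic equals the square of this t value, and it can also be recovered from \(R^2\) and the 173 residual degrees of freedom:
\[F = t^2 \approx 5.185^2 \approx 26.88, \qquad F = \frac{R^2 / 1}{(1 - R^2) / 173} = \frac{0.1345}{0.8655 / 173} \approx 26.88\]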
\[t = \frac{\bar{Y}_{X = 0} - \bar{Y}_{X = 1}}{se_{\bar{Y}_{X = 0} - \bar{Y}_{X = 1}}} = \frac{\bar{Y}_{X = 0} - \bar{Y}_{X = 1}}{\sqrt{\frac{s^2_{X = 0}}{n_{X = 0}} + \frac{s^2_{X = 1}}{n_{X = 1}}}}\]
\[\bar{Y}_{X = 0} - \bar{Y}_{X = 1} = 54.61 - 45.05 = 9.56\]
\[se_{\bar{Y}_{X = 0} - \bar{Y}_{X = 1}} \approx \sqrt{\frac{s^2_{X = 0}}{n_{X = 0}} + \frac{s^2_{X = 1}}{n_{X = 1}}} = \sqrt{\frac{2498.004}{77} + \frac{1521.604}{118}} = 6.73\]
\[t = \frac{\bar{Y}_{X = 0} - \bar{Y}_{X = 1}}{\sqrt{\frac{s^2_{X = 0}}{n_{X = 0}} + \frac{s^2_{X = 1}}{n_{X = 1}}}} \approx \frac{9.56}{6.73} = 1.42\]
Welch Two Sample t-test
data: democracy_gdp_2020$democracy_duration by democracy_gdp_2020$democracy
t = 1.4198, df = 134.61, p-value = 0.158
alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
95 percent confidence interval:
-3.75709 22.87617
sample estimates:
mean in group 0 mean in group 1
       54.61039        45.05085
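The test statistic can be reproduced from the summary statistics alone. The sketch below plugs the group means, variances, and sample sizes reported above into the Welch formula. The degrees of freedom use the Welch-Satterthwaite approximation, which does not appear in the formulas above but is what `t.test()` applies by default. The variable names are chosen for this example only.

```r
# Summary statistics as reported above (group 0 and group 1)
mean_0 <- 54.61039; mean_1 <- 45.05085
var_0 <- 2498.004; var_1 <- 1521.604
n_0 <- 77; n_1 <- 118

# Standard error of the difference in means (unequal variances)
se_diff <- sqrt(var_0 / n_0 + var_1 / n_1)

# Welch t-statistic
t_stat <- (mean_0 - mean_1) / se_diff

# Welch-Satterthwaite approximation to the degrees of freedom
df_welch <- (var_0 / n_0 + var_1 / n_1)^2 /
  ((var_0 / n_0)^2 / (n_0 - 1) + (var_1 / n_1)^2 / (n_1 - 1))

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df = df_welch)

round(c(t = t_stat, df = df_welch, p = p_value), 4)
```

The results should agree with the `t.test()` output up to rounding of the inputs.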
\[Y_i = \alpha + \beta X_i + \epsilon_i\]
\[\widehat{Longevity}_i = \hat{\alpha} + \hat{\beta} Democracy_i\]
Call:
lm(formula = democracy_duration ~ democracy, data = democracy_gdp_2020)
Residuals:
    Min      1Q  Median      3Q     Max
-52.610 -24.610 -10.051   7.949 175.949
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   54.610      4.975  10.976   <2e-16 ***
democracy     -9.560      6.396  -1.495    0.137
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 43.66 on 193 degrees of freedom
Multiple R-squared:  0.01144,   Adjusted R-squared:  0.00632
F-statistic: 2.234 on 1 and 193 DF,  p-value: 0.1366
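In the output above the intercept (54.610) is the mean of group 0 and the slope (-9.560) is the mean of group 1 minus the mean of group 0. The t value (-1.495) differs from the Welch statistic (1.4198) in sign, because the slope subtracts in the opposite order, and slightly in magnitude, because `lm()` assumes equal error variance in both groups. A minimal sketch of this relationship follows. The built-in `mtcars` data set, with `am` as the binary variable and `mpg` as the outcome, is an assumption used in place of the course data.

```r
# Illustration only: built-in `mtcars` data stand in for the course data set
fit <- lm(mpg ~ am, data = mtcars)

mean_0 <- mean(mtcars$mpg[mtcars$am == 0])
mean_1 <- mean(mtcars$mpg[mtcars$am == 1])

# Intercept = mean of group 0; slope = mean of group 1 minus mean of group 0
all.equal(unname(coef(fit)), c(mean_0, mean_1 - mean_0))

# The slope's t value matches the pooled-variance (equal variances) t-test,
# up to sign, because t.test() reports group 0 minus group 1
t_lm <- summary(fit)$coefficients["am", "t value"]
t_pooled <- unname(t.test(mpg ~ am, data = mtcars, var.equal = TRUE)$statistic)
all.equal(t_lm, -t_pooled)

# The Welch test (the default) relaxes the equal-variance assumption
t_welch <- unname(t.test(mpg ~ am, data = mtcars)$statistic)
c(lm = t_lm, pooled = t_pooled, welch = t_welch)
```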
\[Y_i = \alpha + \beta_1 X_{1i} + \beta_2 X_{2i} + \ldots + \beta_k X_{ki} + \epsilon_i\]
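where \(X_{1i}, \ldots, X_{ki}\) are the \(k\) independent variables for observation \(i\), \(\beta_1, \ldots, \beta_k\) are their slope coefficients, \(\alpha\) is the intercept, and \(\epsilon_i\) is the error term.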
\[E(Y_i|X_{1i}, \ldots, X_{ki}) = \alpha + \beta_1 X_{1i} + \beta_2 X_{2i} + \ldots + \beta_k X_{ki}\]
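where each \(\beta_j\) is the expected change in \(Y_i\) for a one-unit increase in \(X_{ji}\), holding the other independent variables constant.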
\[GDP_i = \alpha + \beta_1 Longevity_i + \beta_2 Democracy_i + \epsilon_i\]
\[\widehat{GDP_i} = \hat{\alpha} + \hat{\beta_1} Longevity_i + \hat{\beta_2} Democracy_i\]
\[\widehat{GDP_i} = \hat{\alpha} + \hat{\beta_2} Democracy_i + \hat{\beta_1} Longevity_i\]
# Note the formula syntax: Y ~ X_1 + X_2
lm_fit_2 <- lm(gdp_per_capita ~ democracy + democracy_duration, data = democracy_gdp_2020)
lm_fit_2
Call:
lm(formula = gdp_per_capita ~ democracy + democracy_duration,
    data = democracy_gdp_2020)
Coefficients:
       (Intercept)           democracy  democracy_duration
           -4971.8             14649.7               201.8
\[\widehat{GDP_i} = -4971.8 + 14649.7 \times Democracy_i + 201.8 \times Longevity_i\]
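To see how the fitted equation is used, take a hypothetical democracy (\(Democracy_i = 1\)) whose regime has lasted 50 years. Both values are assumptions chosen purely for illustration:
\[\widehat{GDP_i} = -4971.8 + 14649.7 \times 1 + 201.8 \times 50 = 19767.9\]
For a non-democracy with the same regime duration, the prediction is lower by exactly the democracy coefficient:
\[\widehat{GDP_i} = -4971.8 + 14649.7 \times 0 + 201.8 \times 50 = 5118.2\]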
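# Scatter plot: colour 1 for non-democracies (democracy = 0), colour 2 for democracies (democracy = 1)
plot(democracy_gdp_2020$democracy_duration, democracy_gdp_2020$gdp_per_capita,
  xlab = "Duration of Political Regime", ylab = "GDP per capita",
  pch = 19, col = democracy_gdp_2020$democracy + 1
)
# Fitted line for non-democracies: intercept only, slope of democracy_duration
abline(a = coef(lm_fit_2)[1], b = coef(lm_fit_2)[3], col = 1)
# Fitted line for democracies: same slope, intercept shifted by the democracy coefficient
abline(a = coef(lm_fit_2)[1] + coef(lm_fit_2)[2], b = coef(lm_fit_2)[3], col = 2)
\[\widehat{GDP_i} = -4971.8 + 14649.7 \times Democracy_i + 201.8 \times Longevity_i\]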
Call:
lm(formula = gdp_per_capita ~ democracy + democracy_duration,
    data = democracy_gdp_2020)
Residuals:
   Min     1Q Median     3Q    Max
-39100 -10196  -4907   5437 158563
Coefficients:
                   Estimate Std. Error t value Pr(>|t|)
(Intercept)         -4971.8     3076.8  -1.616    0.108
democracy           14649.7     3089.0   4.742 4.41e-06 ***
democracy_duration    201.8       33.4   6.040 9.26e-09 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 19710 on 172 degrees of freedom
  (20 observations deleted due to missingness)
Multiple R-squared:  0.2346,    Adjusted R-squared:  0.2257
F-statistic: 26.35 on 2 and 172 DF,  p-value: 1.037e-10
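The adjusted \(R^2\) in this output can be reproduced from the multiple \(R^2\). The model uses \(n = 175\) observations (172 residual degrees of freedom plus 3 estimated coefficients) and \(k = 2\) independent variables:
\[R^2_{adj} = 1 - (1 - R^2) \frac{n - 1}{n - k - 1} = 1 - (1 - 0.2346) \times \frac{174}{172} \approx 0.2257\]
The adjustment penalises additional independent variables, so the adjusted value is always slightly below the multiple \(R^2\) whenever \(R^2 < 1\).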
\[\widehat{\log(GDP)_i} = \hat{\alpha} + \hat{\beta_1} \log(Longevity)_i + \hat{\beta_2} Democracy_i\]
\[\widehat{\log(GDP)_i} = 5.7157 + 0.6274 \times \log(Longevity)_i + 1.1758 \times Democracy_i\]
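As a rough guide to reading these coefficients: because both GDP and regime duration are logged, \(\hat{\beta_1} = 0.6274\) can be read as an elasticity, so a 1% increase in regime duration is associated with an increase of about 0.63% in GDP per capita, holding democracy constant. The democracy variable is not logged, so its coefficient is exponentiated to obtain a ratio on the original scale:
\[e^{\hat{\beta_2}} = e^{1.1758} \approx 3.24\]
That is, predicted GDP per capita for democracies is roughly 3.24 times that of non-democracies with the same regime duration, or about 224% higher.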
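# Fit the log-log model (formula taken from the Call line of the summary output)
lm_fit_3 <- lm(log(gdp_per_capita) ~ log(democracy_duration) + democracy, data = democracy_gdp_2020)
plot(log(democracy_gdp_2020$democracy_duration), log(democracy_gdp_2020$gdp_per_capita),
  xlab = "Duration of Political Regime (log)", ylab = "GDP per capita (log)",
  pch = 19, col = democracy_gdp_2020$democracy + 1
)
# Fitted line for non-democracies: intercept only, slope of log(democracy_duration)
abline(a = coef(lm_fit_3)[1], b = coef(lm_fit_3)[2], col = 1)
# Fitted line for democracies: same slope, intercept shifted by the democracy coefficient
abline(a = coef(lm_fit_3)[1] + coef(lm_fit_3)[3], b = coef(lm_fit_3)[2], col = 2)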
Call:
lm(formula = log(gdp_per_capita) ~ log(democracy_duration) +
    democracy, data = democracy_gdp_2020)
Residuals:
     Min       1Q   Median       3Q      Max
-2.85521 -0.87765 -0.07444  0.82037  3.10558
Coefficients:
                        Estimate Std. Error t value Pr(>|t|)
(Intercept)              5.71571    0.34602  16.519  < 2e-16 ***
log(democracy_duration)  0.62745    0.08795   7.134 2.60e-11 ***
democracy                1.17576    0.17567   6.693 2.96e-10 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 1.128 on 172 degrees of freedom
  (20 observations deleted due to missingness)
Multiple R-squared:  0.3441,    Adjusted R-squared:  0.3365
F-statistic: 45.12 on 2 and 172 DF,  p-value: < 2.2e-16