graph LR
A(Regression<br>Line) --> B(Regression<br>Model) --> C(Estimation) --> D(Statistical<br>Inference)
POP88162 Introduction to Quantitative Research Methods
Department of Political Science, Trinity College Dublin
\(Y\) takes the value of \(0.13\) when \(X = 0\).
A one unit increase in \(X\) is associated, on average, with a \(0.93\) increase in \(Y\).
The value of \(Y\) can be calculated as \(0.13\) plus \(0.93\) times the value of \(X\).
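The three readings above can be checked with a few lines of R. This is only a minimal sketch: `predict_y()` is a hypothetical helper introduced for illustration, and the coefficients are the rounded values quoted above.

```r
# Coefficients of the regression line quoted above (rounded values)
alpha_hat <- 0.13  # intercept: predicted Y when X = 0
beta_hat  <- 0.93  # slope: average change in Y per one-unit increase in X

# Hypothetical helper (not part of the original slides): predicted Y for a given X
predict_y <- function(x) alpha_hat + beta_hat * x

# Reading 1: the predicted value at X = 0 is the intercept
predict_y(0)

# Reading 2: moving X up by one unit changes the prediction by the slope
predict_y(3) - predict_y(2)

# Reading 3: any prediction is the intercept plus the slope times X
predict_y(c(1, 5, 10))
```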
A simplified description of an object.
A simplified description of relationships between variables.
\[\text{Winning Election} = \text{Party} + \text{Incumbency} + \text{Campaign Spending}\]
All models are wrong, but some are useful.
George Box
\[Y_i = \alpha + \beta X_i + \epsilon_i\]
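where \(Y_i\) is the dependent variable, \(X_i\) is the independent variable, \(\alpha\) is the intercept, \(\beta\) is the slope, and \(\epsilon_i\) is the error term for observation \(i\).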
\[Y_i = \alpha + \beta X_i + \epsilon_i\]
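has three population parameters: the intercept \(\alpha\), the slope \(\beta\), and the standard deviation \(\sigma\) of the error term \(\epsilon_i\).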
We know from basic school geometry:
There are infinitely many lines that go through \((\bar{X}, \bar{Y})\).
The residual \(\hat{\epsilon}_i\) is the difference (vertical distance) between the observed \(Y_i\) and the predicted \(\hat{Y}_i\), that is, \(\hat{\epsilon}_i = Y_i - \hat{Y}_i\).
\[\hat{\beta} = \frac{\sum_{i = 1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i = 1}^{n}(X_i - \bar{X})^2}\]
and
\[\hat{\alpha} = \bar{Y} - \hat{\beta} \bar{X}\]
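The two estimators can be computed by hand and compared with `lm()`. The sketch below uses simulated data, which is an assumption made purely for illustration (the sample size, seed, and true coefficients are arbitrary). The slope is the sum of cross-products of deviations divided by the sum of squared deviations of \(X\).

```r
# Simulated data for illustration only (not from the slides)
set.seed(1)
x <- rnorm(50, mean = 10, sd = 2)
y <- 0.13 + 0.93 * x + rnorm(50)

# Slope: sum of cross-products over sum of squared deviations of x
beta_hat <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)

# Intercept: forces the fitted line through the point (mean(x), mean(y))
alpha_hat <- mean(y) - beta_hat * mean(x)

# Predicted values and residuals (observed minus predicted)
y_hat     <- alpha_hat + beta_hat * x
resid_hat <- y - y_hat

# Compare the hand-computed estimates with lm()
fit <- lm(y ~ x)
all.equal(unname(coef(fit)), c(alpha_hat, beta_hat))

# The fitted line passes through the point of means
all.equal(alpha_hat + beta_hat * mean(x), mean(y))
```

With an intercept in the model, the residuals also sum to zero up to floating-point error, which can be checked with `sum(resid_hat)`.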
A parameter can also be called an estimand (something that is to be estimated).
\[\widehat{GDP}_i = \hat{\alpha} + \hat{\beta} Longevity_i\]
\[\widehat{GDP}_i = 5051.4 + 182.2 \times Longevity_i\]
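As a worked example of reading the fitted equation, take a hypothetical country with \(Longevity_i = 50\) (a value chosen only for illustration). Its predicted value is:

\[\widehat{GDP}_i = 5051.4 + 182.2 \times 50 = 5051.4 + 9110 = 14161.4\]

Increasing \(Longevity_i\) by one unit, to 51, raises the prediction by exactly the slope, \(182.2\), to \(14343.6\).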
Note that in the very common case where the null hypothesis is \(\beta_{H_0} = 0\), the t-statistic simplifies to \(t = \frac{\hat{\beta}}{\hat{\sigma}_{\hat{\beta}}}\).
lm_fit <- lm(gdp_per_capita ~ democracy_duration, data = democracy_gdp_2020)
summary(lm_fit) # Use `summary()` function to get a more detailed output
Call:
lm(formula = gdp_per_capita ~ democracy_duration, data = democracy_gdp_2020)

Residuals:
   Min     1Q Median     3Q    Max
-44806  -8756  -4944   4820 163717

Coefficients:
                   Estimate Std. Error t value Pr(>|t|)
(Intercept)         5051.44    2370.78   2.131   0.0345 *
democracy_duration   182.22      35.15   5.185 5.99e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 20900 on 173 degrees of freedom
  (20 observations deleted due to missingness)
Multiple R-squared:  0.1345,  Adjusted R-squared:  0.1295
F-statistic: 26.88 on 1 and 173 DF,  p-value: 5.995e-07
\[t = \frac{\hat{\beta} - \beta_{H_0}}{\hat{\sigma}_{\hat{\beta}}} = \frac{182.22 - 0}{35.15} \approx 5.184\]
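The same calculation, together with the two-sided p-value, can be sketched in R from the rounded numbers printed in the summary output. The variable names below are illustrative. The hand-computed statistic (about 5.184) differs slightly from the 5.185 in the output, most likely because `summary()` works with unrounded estimates.

```r
# Rounded values copied from the regression output above
beta_hat <- 182.22  # estimated slope
se_beta  <- 35.15   # standard error of the slope
beta_h0  <- 0       # value of the slope under the null hypothesis
df_resid <- 173     # residual degrees of freedom

# t-statistic: distance of the estimate from the null value, in standard errors
t_stat <- (beta_hat - beta_h0) / se_beta
t_stat

# Two-sided p-value from the t distribution with 173 degrees of freedom
p_value <- 2 * pt(abs(t_stat), df = df_resid, lower.tail = FALSE)
p_value
```

The resulting p-value should be close to the `Pr(>|t|)` entry for `democracy_duration`, with small differences due to rounding of the inputs.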