# We can combine different data types in a list and, optionally, name elements (e.g. B below)
l <- list(2:4, list("a"), B = c(TRUE, FALSE, FALSE))
l
[[1]]
[1] 2 3 4
[[2]]
[[2]][[1]]
[1] "a"
$B
[1] TRUE FALSE FALSE
POP88162 Introduction to Quantitative Research Methods
Department of Political Science, Trinity College Dublin
| Structure | Description | Dimensionality | Data Type |
|---|---|---|---|
| vector | Atomic vector (scalar) | 1d | homogeneous |
| matrix | Matrix | 2d | homogeneous |
| array | One-, two- or n-dimensional array | 1d/2d/nd | homogeneous |
| list | List | 1d | heterogeneous |
| data.frame | Rectangular data | 2d | heterogeneous |
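To make the table concrete, here is a minimal sketch that builds one small object of each structure and inspects its dimensions. The object names (`v`, `m`, `a`, `lst`, `df`) and the values are purely illustrative and do not come from the slides.

```r
# One small object per structure in the table (illustrative values only)
v   <- c(1.5, 2, 3)                          # atomic vector: 1d, single type
m   <- matrix(1:6, nrow = 2)                 # matrix: 2d, single type
a   <- array(1:24, dim = c(2, 3, 4))         # array: n-d, single type
lst <- list(1L, "a", TRUE)                   # list: 1d, mixed types
df  <- data.frame(x = 1:2, y = c("a", "b"))  # data frame: 2d, mixed column types

dim(v)             # NULL, a plain vector carries no dim attribute
dim(m)             # rows and columns of the matrix
dim(a)             # one entry per array dimension
length(lst)        # number of list elements
dim(df)            # rows and columns of the data frame
sapply(df, class)  # each column has its own type
```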
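Lists are created with the list() function in R.
Use [] to subset lists (the result is itself a list).
Use the [[ and $ operators to extract individual elements.
list[index]
list[[index]]
list$name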
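The difference between the three operators is easiest to see on the list `l` defined earlier. The sketch below recreates it; the names `sub` and `elem` are introduced here only for illustration.

```r
# Same list as in the earlier example
l <- list(2:4, list("a"), B = c(TRUE, FALSE, FALSE))

sub <- l[3]      # single bracket: still a list, of length 1, keeping the name B
class(sub)       # "list"

elem <- l[[3]]   # double bracket: the element itself
class(elem)      # "logical"

l$B              # extraction by name, equivalent to l[["B"]]
l[[2]][[1]]      # extraction from a nested list
```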
Data frames can be created with the data.frame() function, with named vectors as input.
'data.frame': 4 obs. of 3 variables:
$ x: int 1 2 3 4
$ y: chr "a" "b" "c" "d"
$ z: logi TRUE FALSE FALSE TRUE
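The str() output above is consistent with a call like the one sketched here. The slides do not show the original call, so the object name `df` and the exact arguments are assumptions reconstructed from that output.

```r
# Assumed reconstruction: three named vectors of equal length become columns
df <- data.frame(
  x = 1:4,
  y = c("a", "b", "c", "d"),
  z = c(TRUE, FALSE, FALSE, TRUE),
  stringsAsFactors = FALSE  # keeps y as character in R versions before 4.0 too
)

str(df)  # compact summary: one line per column with its type
```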
$x
[1] 1 2 3 4 5
$y
[1] "a" "b" "c" "d" "e"
$z
[1] TRUE FALSE TRUE FALSE TRUE
If you subset a data frame with a single vector, it behaves like a list:
data_frame[column_indices]
data_frame[column_name(s)]
data_frame$column_name
If you subset with two vectors, it behaves like a matrix:
data_frame[row_indices, column_indices]
data_frame[row_indices, column_name(s)]
  x y    z
1 1 a TRUE
3 3 c TRUE
5 5 e TRUE
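Both subsetting styles can be tried on a five-row data frame like the one printed above. This is a sketch only: the object name `df` and the particular index choices are assumptions, not the calls used in the slides.

```r
df <- data.frame(
  x = 1:5,
  y = c("a", "b", "c", "d", "e"),
  z = c(TRUE, FALSE, TRUE, FALSE, TRUE)
)

# One vector: list-style subsetting selects columns
df["x"]        # a data frame with the single column x
df[c(1, 3)]    # a data frame with columns 1 and 3
df$y           # the column itself, as a character vector

# Two vectors: matrix-style subsetting selects rows, then columns
df[1:2, c("x", "y")]  # rows 1 and 2 of columns x and y
df[df$z, ]            # rows where z is TRUE, all columns
```

Leaving the column position empty, as in the last line, keeps every column.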
[1] "x" "y" "z" "r"
[5] "r_standardised"
Oftentimes, analysis involves more than one dataset.
This requires merging (joining) data frames together.
The R function merge() can be used for this purpose.
Assuming the key columns share the same name:
merge(x, y, by = "column_name")
If the column names differ, the by.x and by.y arguments can be used:
merge(x, y, by.x = "column_name_x", by.y = "column_name_y")
Note that in either case the datasets must share some unique identifier.
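A minimal sketch of both calls on toy data follows. The data frames (`x_df`, `y_df`, `y_renamed`), their column names and their values are all hypothetical and chosen only to show the mechanics.

```r
# Hypothetical data: respondents identified by a shared id
x_df <- data.frame(id = c(1, 2, 3), age = c(34, 51, 27))
y_df <- data.frame(id = c(3, 1, 2), vote = c("yes", "no", "yes"))

# Key column has the same name in both data frames
merged <- merge(x_df, y_df, by = "id")

# Same data, but the key is called "respondent" in the second data frame
y_renamed <- data.frame(respondent = c(3, 1, 2), vote = c("yes", "no", "yes"))
merged2 <- merge(x_df, y_renamed, by.x = "id", by.y = "respondent")

merged
merged2  # the key column takes its name from x
```

Rows are matched on the identifier rather than on their position, so the differing row order in the two inputs does not matter.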
When the key column has the same name in both data frames, the order of arguments in merge() does not matter for the matching, although it does affect the order of columns in the merged data frame.
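The effect of argument order can be checked directly. In this sketch the data frames `x_df` and `y_df` are hypothetical toy data, and the default inner join of merge() is assumed.

```r
# Hypothetical data with a shared key column "id"
x_df <- data.frame(id = c(1, 2, 3), age = c(34, 51, 27))
y_df <- data.frame(id = c(3, 1, 2), vote = c("yes", "no", "yes"))

a <- merge(x_df, y_df, by = "id")  # key first, then columns of x_df, then y_df
b <- merge(y_df, x_df, by = "id")  # key first, then columns of y_df, then x_df

names(a)
names(b)

# Same content once the columns of b are put in the order of a
all.equal(a, b[names(a)])
```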
But it is important to keep track of which dataset is x and which is y: