Week 1: Introduction to R¶

Introduction to Computer Programming for Data Analysis I¶

Tom Paskhalis¶

27 April 2022¶

Overview¶

  • Backstory
  • RStudio
  • R Script
  • Markdown/R Markdown
  • Operators

R background¶

Source: University of Auckland, R Project

  • S (for statistics) is a programming language for statistical analysis developed in 1976 at AT&T Bell Labs
  • The original S language and its extension S-PLUS were closed source
  • In 1991 Ross Ihaka and Robert Gentleman began developing R, an open-source alternative to S

R release names (v. 4.1.1 -- "Kick Things")¶

Source: Twitter
Extra: More on historical R release names

Popularity of Data Analysis Software¶

Source: SPSS is dying. R is surging.

R and development environments¶

  • Several integrated development environments (IDEs) are available for R (StatET, ESS, R Commander)
  • However, over the last decade RStudio has become the de facto standard IDE for working in R
  • You can also find R extensions for your favourite text editor (Atom, Sublime Text, Visual Studio Code, Vim)

RStudio¶

R Script¶

  • Usually you want to have a record of what analysis was done and how you did it.
  • So, instead of writing all your R commands in the interactive console,
  • You can create an R script, write them there and run them together or one at a time.
  • An R script is a file with the .R extension that contains a collection of valid R commands.
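
As a minimal sketch of this workflow, the snippet below writes three commands to a temporary .R file and then runs them together with source(). The temporary path and the variable names are assumptions made for illustration; in practice you would create and save the script yourself, for example in RStudio.

```r
# Create a small R script on disk (a temporary file, purely for illustration)
script_path <- tempfile(fileext = ".R")
writeLines(c(
  "x <- 3",
  "y <- x + 5",
  "print(y)"
), script_path)

# Run every command in the script, top to bottom
source(script_path)
```

By default source() evaluates the script in the global environment, so x and y remain available in the session afterwards.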

R Markdown¶

  • Markdown:
    • Easy-to-read and easy-to-write plain text format;
    • Separates content from its appearance (rendition);
    • Widely used across industry sectors and academic fields;
    • .md file extension.
  • RMarkdown:
    • Allows combining of R commands with regular text;
    • Compiles into PDF/DOC/HTML and other formats;
    • Can be converted into a slide deck or even a website!
    • .Rmd file extension.

Extra: Ch 27: R Markdown in Wickham & Grolemund 2017

Markdown formatting basics¶

  • Use _ or * for emphasis (single - italic, double - bold, triple - bold and italic)
    • *one* renders in italic, __two__ in bold and ***three*** in bold italic
  • Headers of decreasing levels follow #, ##, ###, #### and so on
  • (Unordered) Lists follow marker -, + or *
    • Start at the left-most position for top-level
    • Indent four spaces and use another marker for nesting like here
  • (Numbered) Lists use 1. (counter is auto-incremented)
  • Links have syntax of [some text here](url_here)
  • Images similarly: ![alt text](url or path to image)

R Markdown example (source)¶

Some text in *italic* and **bold**

Simple list:

- A
- B

Ordered list:

1. A
1. B

Example, where $Y = X + 5$

```{r}
x <- 3
y <- x + 5
y
```

R Markdown example (rendered)¶

Some text in italic and bold

Simple list:

  • A
  • B

Ordered list:

  1. A
  2. B

Example, where $Y = X + 5$

{r}
x <- 3
y <- x + 5
y

R basics¶

  • R is an interpreted language (like Python and Stata)
  • It is geared towards statistical analysis
  • R is often used for interactive data analysis (one command at a time)
  • But it also allows entire scripts to be executed in batch mode
In [2]:
print("Hello World!")
[1] "Hello World!"
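
One way to see the difference between interactive and batch use is the base R function interactive(). The sketch below is only illustrative, and the variable name mode is an assumption. It reports which way the code is being run: in a console session interactive() returns TRUE, while a script run non-interactively (for example with Rscript from a terminal) sees FALSE.

```r
# interactive() is TRUE in a console session and FALSE in batch mode
mode <- if (interactive()) "interactive" else "batch"
print(paste("Running in", mode, "mode"))
```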

Operators¶

Key operators (infix functions) in R are:

  • Arithmetic (+, -, *, ^, /, %/%, %%, %*%)
  • Boolean (&, &&, |, ||, !)
  • Relational (==, !=, >, >=, <, <=)
  • Assignment (<-, <<-, =)
  • Membership (%in%)
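
A few of the operators listed above, shown as a short illustrative sketch. The variable names x and m are assumptions chosen for the example.

```r
# Assignment: <- is the conventional operator
x <- c(1, 2, 3)

# Membership: is each element on the left found in the vector on the right?
2 %in% x          # TRUE
c(2, 5) %in% x    # TRUE FALSE

# & works element-wise on vectors; && expects single TRUE/FALSE values
c(TRUE, FALSE) & c(TRUE, TRUE)   # TRUE FALSE
TRUE && FALSE                    # FALSE

# Operators are functions: backticks allow calling them in prefix form
`+`(1, 2)         # 3

# Matrix multiplication (matrix() fills the values column by column)
m <- matrix(1:4, nrow = 2)
m %*% m
```

The last expression returns a 2 x 2 matrix with rows (7, 15) and (10, 22), whereas m * m would multiply the matrices element by element.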

Basic mathematical operations in R¶

In [3]:
1 + 1
[1] 2
In [4]:
5 - 3
[1] 2
In [5]:
6 / 2
[1] 3
In [6]:
4 * 4
[1] 16
In [7]:
## Exponentiation, note that 2 ** 4 also works, but is not recommended
2 ^ 4
[1] 16

Advanced mathematical operations in R¶

In [8]:
# Integer division
7 %/% 3
[1] 2
In [9]:
# Modulo operation (remainder of division)
7 %% 3
[1] 1
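
Integer division and modulo fit together: the quotient times the divisor plus the remainder gives back the original number. The sketch below checks this for the values used above and also shows how R handles a negative dividend. The variable names are assumptions for illustration.

```r
a <- 7
b <- 3
quotient  <- a %/% b             # 2
remainder <- a %% b              # 1
quotient * b + remainder == a    # TRUE

# With a negative dividend R rounds the quotient down (towards -Inf),
# so the remainder takes the sign of the divisor
(-7) %/% 3                       # -3
(-7) %% 3                        # 2
```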

Basic logical operations in R¶

In [10]:
3 != 1 # Not equal
[1] TRUE
In [11]:
3 > 3 # Greater than
[1] FALSE
In [12]:
FALSE | TRUE # TRUE if either the first or the second operand is TRUE, FALSE otherwise
[1] TRUE
In [13]:
F | T # R also treats F and T as Boolean, but it is not recommended due to poor legibility
[1] TRUE
In [14]:
3 > 3 | 3 >= 3 # Combining 2 Boolean expressions
[1] TRUE
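
Two details behind the examples above, as a short sketch. First, relational operators are evaluated before |, so the combined expression needs no parentheses. Second, apart from legibility there is another reason to avoid T and F: unlike the reserved words TRUE and FALSE, they are ordinary variables and can be overwritten.

```r
# Relational operators bind more tightly than |, so these lines are equivalent
3 > 3 | 3 >= 3        # TRUE
(3 > 3) | (3 >= 3)    # TRUE

# T and F are ordinary variables, so they can be overwritten by accident
T <- FALSE
T | FALSE             # FALSE, although it reads like TRUE | FALSE
rm(T)                 # remove our variable so that T means TRUE again
T | FALSE             # TRUE
```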

Next¶

  • Introduction to computation