Week 1: Introduction

Introduction to Computer Programming for Data Analysis I

Tom Paskhalis

27 April 2022

Overview

  • Module objectives
  • Prerequisites and software
  • Materials and books
  • Schedule

Source: Harvard Business Review

Source: Drew Conway

Source: Reddit

About me

  • Assistant Professor in Political Science and Data Science, Trinity College Dublin
    • Before: Postdoctoral Fellow, New York University
    • PhD in Social Research Methods, London School of Economics and Political Science
  • My research:
    • Political communication, social media, interest groups
    • Text analysis, machine learning, record linkage, data visualization
  • Contact
    • tom.paskhalis@tcd.ie
    • tom.paskhal.is
    • @tpaskhalis

Module Objectives

  • Introduce the fundamentals of computer programming
  • Get familiar with the R programming language
  • Develop understanding of core software design principles
  • Learn crucial data manipulation techniques
  • Practice these concepts using real-world datasets

Books

Some books that can be helpful (last 3 are available online!):

  • Matloff, Norman. 2011. The Art of R Programming: A Tour of Statistical Software Design. San Francisco, CA: No Starch Press.

  • Peng, Roger D. 2016. R Programming for Data Science. Leanpub.

  • Wickham, Hadley, and Garrett Grolemund. 2017. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. Sebastopol, CA: O'Reilly Media.

  • Wickham, Hadley. 2019. Advanced R. 2nd ed. Boca Raton, FL: Chapman and Hall/CRC.

Additional Online Materials

  • Git Book

  • R Manuals

  • R Documentation

  • R Inferno

Prerequisites and Software

  • Introductory module - no formal prerequisites
  • Laptop with Windows/Mac/Linux OS (no Chromebooks)
  • Software:
    • R (version 4+) - statistical programming language
    • RStudio - integrated development environment
    • Git - version control system
    • GitHub - git-based online platform for code hosting
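
One way to confirm that the command-line tools from the list above are installed is a short check in a Unix-like shell (Terminal on Mac/Linux, Git Bash on Windows). This is a minimal sketch, not an official setup step: the helper name `check_tool` is hypothetical, and it assumes R and Git were installed so that their commands are on the PATH, which is not always the default for R on Windows. RStudio and GitHub are not checked here.

```shell
#!/bin/sh
# Report whether a given command is available on the PATH.
# check_tool is a hypothetical helper name used only for this sketch.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
    fi
}

# Assumption: R and Git expose the commands `R` and `git`.
check_tool R
check_tool git
```

If a tool is reported as missing, it may still be installed but not on the PATH; `R --version` and `git --version` print the installed versions once the commands are reachable.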

Module Meetings

  • Wednesday 13:00-18:00 - 2041B, Arts Building
  • 5 weeks from 27 April to 25 May

Meeting Structure

  • Lecture (1 hour)
  • Exercise (1.25 hours)
  • Break (15 mins)
  • Lecture (1 hour)
  • Exercise (1.25 hours)
  • Break (15 mins)

Module Outline

Week  Topic
1     Introduction to Computation
2     R Fundamentals
3     Control Flow and Functions
4     Debugging and Testing
5     Data Wrangling

Any questions?

Next

  • Introduction to R