Week 1: Introduction¶

Python for Social Data Science¶

Tom Paskhalis¶

Overview¶

  • Module objectives
  • Prerequisites and software
  • Materials and books
  • Schedule

Data¶

Data Science¶

Source: Drew Conway

Data Scientist¶

Source: Harvard Business Review

Tools ...¶

Source: Reddit

Replicable Analysis¶

Source: Sidney Harris

About me¶

  • Assistant Professor in Political Science and Data Science, Trinity College Dublin
    • Before: Postdoctoral Fellow, New York University
    • PhD in Social Research Methods, London School of Economics and Political Science
  • My research:
    • Political communication, social media, interest groups
    • Text analysis, machine learning, record linkage, data visualization
  • Contact
    • tom.paskhalis@tcd.ie
    • tom.paskhal.is
    • @tpaskhalis

Module Objectives¶

  • Introduce the fundamentals of computer programming
  • Get familiar with the Python programming language
  • Develop an understanding of core software design principles
  • Learn crucial data manipulation techniques
  • Practice these concepts using real-world datasets

Books¶

Some books that can be helpful:

  • Guttag, John. 2021. Introduction to Computation and Programming Using Python: With Application to Computational Modeling and Understanding Data. 3rd ed. Cambridge, MA: The MIT Press

  • McKinney, Wes. 2022. Python for Data Analysis: Data Wrangling with pandas, NumPy, and Jupyter. 3rd ed. Sebastopol, CA: O'Reilly Media

  • Sweigart, Al. 2019. Automate the Boring Stuff with Python. 2nd ed. San Francisco, CA: No Starch Press

Additional Online Materials¶

  • Git Book

  • The Hitchhiker's Guide to Python

  • Python For You and Me

  • Python Wikibook

  • Python 3 Documentation (intermediate and advanced)

Prerequisites and Software¶

  • Introductory module - no formal prerequisites
  • Laptop with Windows/Mac/Linux OS
  • Software:
    • Python (version 3+) - versatile programming language
    • Jupyter - web-based interactive computational environment
    • Git - version control system
    • GitHub - Git-based online platform for code hosting
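
One way to confirm that an installed interpreter satisfies the "version 3+" requirement is a short check like the one below. This is a minimal sketch rather than part of the module materials: the helper name `meets_minimum` and its `minimum` parameter are hypothetical, chosen here only for illustration.

```python
import sys


def meets_minimum(version_info=sys.version_info, minimum=(3, 0)):
    """Return True if the (major, minor) version is at least `minimum`.

    `meets_minimum` is a hypothetical helper, not a standard-library function.
    """
    return tuple(version_info[:2]) >= tuple(minimum)


# Show the version of the interpreter that is running this script
print("Python version:", sys.version.split()[0])
print("Meets the 3+ requirement:", meets_minimum())
```

Running this from a terminal or in a Jupyter cell should work the same way, since both rely on the standard-library `sys` module.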

Module Meetings¶

  • Online:
    • Wednesday 18:00-20:00
    • Thursday 18:00-20:00
  • 6 weeks from 15 February to 30 March

Module Outline¶

Week  Topic
1     Introduction to Computation
2     Python Fundamentals
3     Control Flow and Functions
4     Debugging and Testing
5     Data Wrangling
6     Data Analysis & Communicating Results

Any questions?

Next¶

  • Introduction to Computation