Modelling Texts
Lecture Slides
- Lecture Slides (pdf)
- Lecture Slides (ipynb)
Tutorial Exercise
- Tutorial Exercise (pdf)
- Tutorial Exercise (ipynb)
This week we focus on the probabilistic aspects of modelling text. We look at some commonly assumed data-generating processes and the analytical approaches that follow from them.
Required Readings
- Grimmer, Roberts, and Stewart 2022, Chs. 6, 12–13
- David M. Blei. 2012. “Probabilistic Topic Models.” Communications of the ACM 55 (4): 77–84. https://www.cs.columbia.edu/~blei/papers/Blei2012.pdf
Additional Readings
- Kenneth Benoit, Michael Laver, and Slava Mikhaylov. 2009. “Treating Words as Data with Error: Uncertainty in Text Statements of Policy Positions.” American Journal of Political Science 53 (2): 495–513. https://kenbenoit.net/pdfs/blm2009ajps.pdf
- Margaret E. Roberts et al. 2014. “Structural Topic Models for Open-Ended Survey Responses.” American Journal of Political Science 58 (4): 1064–1082. https://scholar.harvard.edu/sites/scholar.harvard.edu/files/dtingley/files/topicmodelsopenendedexperiments.pdf
Tutorial
- Authorship prediction and topic modelling
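
As a preview of the authorship-prediction part, the sketch below shows one simple probabilistic approach: a multinomial naive Bayes classifier with add-one smoothing, which assumes each author generates words independently from their own word distribution. It is a minimal illustration only and not the tutorial's actual material; the toy corpus, the author labels "A" and "B", and the function names `tokenize`, `train`, and `predict` are all hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Deliberately naive: lowercase and split on whitespace.
    return text.lower().split()


def train(docs):
    """Count words per author from a list of (author, text) pairs."""
    word_counts = {}        # author -> Counter of word frequencies
    doc_counts = Counter()  # author -> number of training documents
    vocab = set()
    for author, text in docs:
        tokens = tokenize(text)
        word_counts.setdefault(author, Counter()).update(tokens)
        doc_counts[author] += 1
        vocab.update(tokens)
    return word_counts, doc_counts, vocab


def predict(text, word_counts, doc_counts, vocab):
    """Return the author with the highest log posterior, plus all scores."""
    n_docs = sum(doc_counts.values())
    tokens = [t for t in tokenize(text) if t in vocab]  # skip unseen words
    scores = {}
    for author, counts in word_counts.items():
        total = sum(counts.values())
        # Log prior: share of training documents written by this author.
        score = math.log(doc_counts[author] / n_docs)
        for tok in tokens:
            # Add-one (Laplace) smoothed word probability for this author.
            score += math.log((counts[tok] + 1) / (total + len(vocab)))
        scores[author] = score
    return max(scores, key=scores.get), scores


# Hypothetical toy corpus, invented purely for illustration.
docs = [
    ("A", "liberty and union now and forever"),
    ("A", "the union must be preserved"),
    ("B", "taxes on trade hurt commerce"),
    ("B", "commerce and trade need low taxes"),
]
model = train(docs)
guess_1, _ = predict("union and liberty", *model)
guess_2, _ = predict("low taxes on trade", *model)
print(guess_1, guess_2)
```

The independence assumption is rarely realistic for natural language, but it makes the data-generating process explicit: classification amounts to asking which author's word distribution makes the observed text most probable.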