Beyond Bag-of-Words

Lecture Slides: pdf, ipynb

Tutorial Exercise: pdf, ipynb


This week we move beyond the classic bag-of-words representation of text and look at how to account for word order and context.
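As a rough illustration of why word order matters, the sketch below compares two sentences that contain the same words in a different order. The helper names `bag_of_words` and `bigrams` and the example sentences are hypothetical, chosen only for this illustration; tokenization is a naive whitespace split, which a real pipeline would likely replace.

```python
from collections import Counter


def bag_of_words(text):
    """Count tokens after lowercasing and whitespace splitting (order is discarded)."""
    return Counter(text.lower().split())


def bigrams(text):
    """Count adjacent token pairs, which retain some local word order."""
    tokens = text.lower().split()
    return Counter(zip(tokens, tokens[1:]))


# Hypothetical example sentences: same words, opposite meaning.
a = "the dog bit the man"
b = "the man bit the dog"

same_bow = bag_of_words(a) == bag_of_words(b)
same_bigrams = bigrams(a) == bigrams(b)

print("Identical bag-of-words:", same_bow)
print("Identical bigram counts:", same_bigrams)
```

The two sentences are indistinguishable as bags of words but differ once adjacent pairs are counted. Bigrams are only one simple way to recover order; the readings below cover vector space and distributed representations.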

Required Readings

  • Grimmer, Roberts, and Stewart 2022, Ch. 7 “The Vector Space Model and Similarity Metrics” and Ch. 8 “Distributed Representations of Words.”
  • Turney, Peter D., and Patrick Pantel. 2010. “From Frequency to Meaning: Vector Space Models of Semantics.” Journal of Artificial Intelligence Research 37 (1): 141–88. https://doi.org/10.1613/jair.2934

Additional Readings

Tutorial

  • Working with word embeddings