POP77032 Quantitative Text Analysis for Social Scientists
.txt
.md, .Rmd, .qmd, .html
.csv, .tsv, .tab, .json, .xml
.svg, .eps
.py, .R, .tex, .sh
.ipynb (effectively, .json), .docx (effectively, zipped .xml)
.doc, .rtf, .pdf
.pickle, .rds, .feather
.png, .jpg, .gif
.docx, .pdf, .svg)
Windows-1252 or UTF-16).
Write out text using Python in ISO-8859-1 encoding.
Read in text in R using the default (UTF-8) encoding.
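The two steps above (write in ISO-8859-1, then read assuming UTF-8) can be sketched as follows. The slide does the reading step in R; this minimal sketch does both steps in Python so that it runs as one script. The sample string and the file name are hypothetical.

```python
import os
import tempfile

# Hypothetical sample text containing non-ASCII characters.
text = "café naïve Zürich"

# Write the text out in ISO-8859-1 (Latin-1): one byte per character.
path = os.path.join(tempfile.mkdtemp(), "latin1.txt")
with open(path, "w", encoding="iso-8859-1") as f:
    f.write(text)

# Reading the same bytes as UTF-8 fails at the first non-ASCII byte.
try:
    with open(path, encoding="utf-8") as f:
        f.read()
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# With errors="replace", undecodable bytes become U+FFFD instead of raising.
with open(path, encoding="utf-8", errors="replace") as f:
    garbled = f.read()

# Reading with the matching encoding recovers the original text.
with open(path, encoding="iso-8859-1") as f:
    recovered = f.read()

print(strict_ok, repr(garbled), repr(recovered))
```

The point of the exercise is that the bytes on disk carry no record of their encoding, so the reader has to know it or guess it.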
[1] "i" "me" "my" "myself" "we"
[6] "our" "ours" "ourselves" "you" "your"
[11] "yours" "yourself" "yourselves" "he" "him"
[16] "his" "himself" "she" "her" "hers"
[21] "herself" "it" "its" "itself" "they"
[26] "them" "their" "theirs" "themselves" "what"
[31] "which" "who" "whom" "this" "that"
[36] "these" "those" "am" "is" "are"
[41] "was" "were" "be" "been" "being"
[46] "have" "has" "had" "having" "do"
[51] "does" "did" "doing" "would" "should"
[56] "could" "ought" "i'm" "you're" "he's"
[61] "she's" "it's" "we're" "they're" "i've"
[66] "you've" "we've" "they've" "i'd" "you'd"
[71] "he'd" "she'd" "we'd" "they'd" "i'll"
[76] "you'll" "he'll" "she'll" "we'll" "they'll"
[81] "isn't" "aren't" "wasn't" "weren't" "hasn't"
[86] "haven't" "hadn't" "doesn't" "don't" "didn't"
[91] "won't" "wouldn't" "shan't" "shouldn't" "can't"
[96] "cannot" "couldn't" "mustn't" "let's" "that's"
[101] "who's" "what's" "here's" "there's" "when's"
[106] "where's" "why's" "how's" "a" "an"
[111] "the" "and" "but" "if" "or"
[116] "because" "as" "until" "while" "of"
[121] "at" "by" "for" "with" "about"
[126] "against" "between" "into" "through" "during"
[131] "before" "after" "above" "below" "to"
[136] "from" "up" "down" "in" "out"
[141] "on" "off" "over" "under" "again"
[146] "further" "then" "once" "here" "there"
[151] "when" "where" "why" "how" "all"
[156] "any" "both" "each" "few" "more"
[161] "most" "other" "some" "such" "no"
[166] "nor" "not" "only" "own" "same"
[171] "so" "than" "too" "very" "will"
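A list like the one above is typically used to filter tokens before counting. Below is a minimal sketch of that step in Python, assuming a simple regex tokenizer and only a small subset of the printed list. The function name `remove_stopwords` and the example sentence are hypothetical, not taken from the slides.

```python
import re

# A small subset of the stopword list printed above, for illustration only.
STOPWORDS = {"i", "me", "my", "we", "our", "the", "a", "an",
             "and", "of", "to", "in", "is", "it's"}

def remove_stopwords(text):
    """Lowercase, tokenize, and drop tokens found in STOPWORDS."""
    # Keep runs of letters, allowing internal apostrophes (e.g. "it's").
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    return [t for t in tokens if t not in STOPWORDS]

kept = remove_stopwords("It's the end of the world, and I ran to Ireland")
print(kept)
```

Because the list contains contractions such as "it's", the tokenizer must keep apostrophes inside words. Otherwise those entries would never match.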
Advantages:
Disadvantages:
Extra
{
"response": {
"status": "ok",
"userTier": "developer",
"total": 115725,
"startIndex": 1,
"pageSize": 1,
"currentPage": 1,
"pages": 115725,
"orderBy": "relevance",
"results": [
{
"id": "travel/2026/jan/05/i-ran-1400-miles-around-ireland",
"type": "article",
"sectionId": "travel",
"sectionName": "Travel",
"webPublicationDate": "2026-01-05T07:00:29Z",
"webTitle": "I ran 1,400 miles around Ireland",
"webUrl": "https://www.theguardian.com/travel/2026/jan/05/i-ran-1400-miles-around-ireland",
"apiUrl": "https://content.guardianapis.com/travel/2026/jan/05/i-ran-1400-miles-around-ireland",
"isHosted": false,
"pillarId": "pillar/lifestyle",
"pillarName": "Lifestyle"
}
]
}
}
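To show how a response of this shape can be navigated once parsed, here is a minimal Python sketch using only the standard library. The JSON string is an abridged copy of the response above, and no request is made to the API.

```python
import json
from datetime import datetime

# Abridged copy of the response shown above (some fields omitted).
raw = """
{
  "response": {
    "status": "ok",
    "total": 115725,
    "pageSize": 1,
    "results": [
      {
        "id": "travel/2026/jan/05/i-ran-1400-miles-around-ireland",
        "type": "article",
        "sectionName": "Travel",
        "webPublicationDate": "2026-01-05T07:00:29Z",
        "webTitle": "I ran 1,400 miles around Ireland"
      }
    ]
  }
}
"""

# JSON objects become dicts and JSON arrays become lists.
response = json.loads(raw)["response"]
first = response["results"][0]

# The publication date is an ISO 8601 string; parse it into a datetime.
published = datetime.strptime(first["webPublicationDate"], "%Y-%m-%dT%H:%M:%SZ")

print(response["status"], response["total"])
print(first["webTitle"], published.date())
```

The metadata (status, total, paging) sits one level above the actual records, which are in the "results" array.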
List of 1
$ response:List of 9
..$ status : chr "ok"
..$ userTier : chr "developer"
..$ total : int 115725
..$ startIndex : int 1
..$ pageSize : int 1
..$ currentPage: int 1
..$ pages : int 115725
..$ orderBy : chr "relevance"
..$ results :'data.frame': 1 obs. of 11 variables:
.. ..$ id : chr "travel/2026/jan/05/i-ran-1400-miles-around-ireland"
.. ..$ type : chr "article"
.. ..$ sectionId : chr "travel"
.. ..$ sectionName : chr "Travel"
.. ..$ webPublicationDate: chr "2026-01-05T07:00:29Z"
.. ..$ webTitle : chr "I ran 1,400 miles around Ireland"
.. ..$ webUrl : chr "https://www.theguardian.com/travel/2026/jan/05/i-ran-1400-miles-around-ireland"
.. ..$ apiUrl : chr "https://content.guardianapis.com/travel/2026/jan/05/i-ran-1400-miles-around-ireland"
.. ..$ isHosted : logi FALSE
.. ..$ pillarId : chr "pillar/lifestyle"
.. ..$ pillarName : chr "Lifestyle"
id type sectionId
1 travel/2026/jan/05/i-ran-1400-miles-around-ireland article travel
sectionName webPublicationDate webTitle
1 Travel 2026-01-05T07:00:29Z I ran 1,400 miles around Ireland
webUrl
1 https://www.theguardian.com/travel/2026/jan/05/i-ran-1400-miles-around-ireland
apiUrl
1 https://content.guardianapis.com/travel/2026/jan/05/i-ran-1400-miles-around-ireland
isHosted pillarId pillarName
1 FALSE pillar/lifestyle Lifestyle
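The output above shows the "results" array turned into a one-row table. A comparable flattening can be sketched in Python with the standard csv module. This assumes the records are already parsed into a list of dicts, and the choice of columns is illustrative.

```python
import csv
import io

# One record, as it would look after parsing the JSON response above.
results = [
    {
        "id": "travel/2026/jan/05/i-ran-1400-miles-around-ireland",
        "type": "article",
        "sectionId": "travel",
        "sectionName": "Travel",
        "webPublicationDate": "2026-01-05T07:00:29Z",
        "webTitle": "I ran 1,400 miles around Ireland",
        "isHosted": False,
    }
]

# Illustrative subset of columns; other keys are skipped via extrasaction.
columns = ["id", "type", "sectionName", "webPublicationDate", "webTitle"]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=columns, extrasaction="ignore")
writer.writeheader()
writer.writerows(results)
table = buf.getvalue()

print(table)
```

The title contains a comma, so the csv writer quotes that field. This is why delimited text is better written with a CSV library than by joining strings by hand.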