Perplexitaal 1

Perplexitaal 1: Can you guess the text from the keywords?

Keyword analysis has become a widely used tool in applied linguistics because it is relatively easy to do and often provides useful insights into the content and style of the target texts. Basically, keyword analysis compares the proportional frequency of each word in the target text with its proportional frequency in a benchmark corpus. Words with high keyness values are unusually frequent in the target text relative to the benchmark, so they are likely to be important in some way in the target text.

There are several free tools available for conducting keyword analyses (AntConc also includes this function). For this Perplexitaal, I used KeyBNC, which compares your text against the British National Corpus. I used the most common keyness statistic, log likelihood (LL), excluded words with a frequency of less than 3, and deleted proper nouns to make it a bit more challenging.
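For readers curious about what sits behind an LL score, here is a minimal Python sketch of the two-term log-likelihood statistic commonly used in corpus linguistics, together with the frequency cut-off of 3 described above. It is an illustration only: whether KeyBNC computes keyness in exactly this way is an assumption, and the function names (`log_likelihood`, `keywords`), the toy tokens, and the reference counts are all hypothetical.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """LL for a word seen `a` times in a target text of `c` tokens
    and `b` times in a reference corpus of `d` tokens."""
    # Expected counts if the word were equally common in both corpora.
    expected_target = c * (a + b) / (c + d)
    expected_ref = d * (a + b) / (c + d)
    total = 0.0
    if a > 0:  # a zero count contributes nothing (0 * ln 0 is taken as 0)
        total += a * math.log(a / expected_target)
    if b > 0:
        total += b * math.log(b / expected_ref)
    return 2 * total


def keywords(target_tokens, ref_counts, ref_size, min_freq=3, top_n=10):
    """Rank words that are over-represented in the target text by LL."""
    target_counts = Counter(target_tokens)
    target_size = len(target_tokens)
    scored = []
    for word, a in target_counts.items():
        if a < min_freq:  # frequency cut-off, as in the post
            continue
        b = ref_counts.get(word, 0)
        if a / target_size <= b / ref_size:  # skip under-represented words
            continue
        scored.append((word, log_likelihood(a, b, target_size, ref_size)))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Toy data, invented purely for illustration.
tokens = ["let"] * 5 + ["me"] * 4 + ["the"] * 10 + ["rare"] * 2
reference = {"let": 50, "me": 100, "the": 60_000}
for word, score in keywords(tokens, reference, ref_size=1_000_000):
    print(f"{word}\t{score:.2f}")
```

A word whose proportional frequency is identical in both corpora scores 0, and the score grows as the two proportions diverge. Removing proper nouns is not handled here; that step would need a tagger or a manual pass.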

Here are the top 10 keywords for 3 well-known texts. Can you guess the text from the keywords? Hint: LL values are affected by the size of the corpora so, in this case, they should give you some indication of how long the target text is.
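To see why the hint works, the short Python sketch below scores the same proportional frequency in a short text and in a long one against the same reference corpus. The LL formula is the common two-term version, assumed rather than confirmed to match KeyBNC, and every number is hypothetical: a reference corpus of roughly 100 million tokens in which the word makes up 0.1%, and target texts in which it makes up 1%.

```python
import math


def ll(a, b, c, d):
    # a, b: word counts in target and reference; c, d: corpus sizes in tokens
    e1 = c * (a + b) / (c + d)  # expected count in the target
    e2 = d * (a + b) / (c + d)  # expected count in the reference
    return 2 * sum(o * math.log(o / e) for o, e in ((a, e1), (b, e2)) if o > 0)


REF_SIZE = 100_000_000  # hypothetical reference corpus size
REF_COUNT = 100_000     # hypothetical: the word is 0.1% of the reference

short_text = ll(3, REF_COUNT, 300, REF_SIZE)         # 1% of a 300-token text
long_text = ll(1_000, REF_COUNT, 100_000, REF_SIZE)  # 1% of a 100,000-token text
print(f"short text: {short_text:.2f}")
print(f"long text:  {long_text:.2f}")
```

The word is equally distinctive in both texts, yet the longer text produces a far larger LL, because the statistic is built from raw counts rather than proportions. Top keywords with LL values in the hundreds or thousands therefore point to a much longer text than keywords with values in the tens.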

Text A
Top 10 keywords

Word Keyness (LL)
let 93.91
me 93.19
go 91.70
ooo 52.43
blows 36.45
matters 30.29
no 29.91
just 29.62
gotta 26.11
poor 26.10

Text B
Top 10 keywords

Word Keyness (LL)
here 52.81
nation 50.04
dedicated 45.29
we 24.75
dead 21.24
shall 18.48
cannot 17.92
great 13.22
that 10.26
people 8.03

Text C
Top 10 keywords

Word Keyness (LL)
he 1,616.93
was 1,510.00
had 1,209.20
his 690.33
proles 606.80
party 600.71
it 478.73
him 478.53
seemed 393.26
face 346.83

