Workset 5

This workset is about working with texts on the small scale. We’ll engage with larger-scale analysis and topic modeling next week.

Choose a small-ish set of texts to work with. This may be an individual work, several smaller ones, or something in between; it should be between 10,000 and 1 million words, but you could go down to 3,000 if there’s something you really want to look at. Choose them based on the first questions below.
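
If you want to check whether your texts fall in that range before uploading anything, a rough count is easy to sketch in Python. This is only a minimal sketch, not part of the assignment: the names `tokenize` and `summarize` are made up for illustration, and the definition of a "word" here (lowercased runs of ASCII letters and straight apostrophes) is an assumption that will miscount accented characters and curly apostrophes. Voyant will tokenize differently, so treat the numbers as ballpark.

```python
import re
from collections import Counter


def tokenize(text):
    # Assumption: a "word" is a run of ASCII letters or straight apostrophes,
    # after lowercasing. Real tools use more careful rules.
    return re.findall(r"[a-z']+", text.lower())


def summarize(text, top=10):
    # Return the total word count and the `top` most frequent words.
    words = tokenize(text)
    return len(words), Counter(words).most_common(top)


# Tiny illustrative sample; replace with the contents of your own files.
sample = "The whale, the whale! Call me Ishmael."
total, common = summarize(sample, top=2)
print(total)   # 7
print(common)  # [('the', 2), ('whale', 2)]
```

Guessing what `common` will contain before you run it on your own texts is good practice for the first question below.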


  1. Upload your texts as a single document to Voyant Tools. Just screw around for a little while with the results. Try to obey Stephen Ramsay’s suggestion to think about what the results will be before you look. What are the most common words going to be? Will there be trends?

  2. Upload your texts as multiple documents to Voyant Tools using some level of address, as described by Witmore. You might lump together a few corrected versions of scans versus uncorrected ones, compare two authors whose works you can find online, etc. What are some of the differences between the levels of text? The statistics will come out easily.

  3. What is one obvious functionality that an online portal like Voyant should have but doesn’t?