All posts by Ben Schmidt

KMW regular expressions

Finding all words that contain a K, M, and a W: there are a few ways to do this, but one is to use the so-called “lookahead” operator.
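
A minimal sketch of the lookahead approach, written in Python rather than whatever tool the original snippet used. The word list and the names `KMW_PATTERN` and `find_kmw_words` are hypothetical, chosen only for illustration:

```python
import re

# Three lookaheads anchored at the start of the word: each one scans ahead
# for its letter without consuming any characters, so the order in which
# K, M, and W appear in the word does not matter.
KMW_PATTERN = re.compile(r"^(?=.*k)(?=.*m)(?=.*w)", re.IGNORECASE)


def find_kmw_words(words):
    """Return the words that contain a K, an M, and a W, in any order."""
    return [word for word in words if KMW_PATTERN.search(word)]


# Hypothetical sample list, for illustration only.
sample_words = ["milkweed", "workman", "kiwi", "Hawkmoth", "window"]
print(find_kmw_words(sample_words))
```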


That gives:


Georectification follow-up

Here’s the MBTA’s map of the T, georectified on the individual stations. If you want to overlay it onto a real map of Boston to get a sense of how it maps onto the physical geography, you can explore it in QGIS.



But to share, you might want to put it on the web. Here’s a link to a georectified version of the MBTA T map that you can pan around. When you installed QGIS, you got enough software to build something like this yourself; but if you want to see your file online, you can also just send me the .tif and .vrt files you get by exporting a map in QGIS and I’ll put it online. Instructions on just what you need to do are in the first paragraph here.



Text Analysis.

In addition to Jockers’ Macroanalysis, which is a formidable work with a huge number of texts, we’re reading a short blog post by Cameron Blevins about topic modeling Martha Ballard’s diary.

This is an example of the sort of small-scale but potentially helpful intervention that humanists can do quite easily with some existing packages.

We’re going to do some topic modeling in class using R and Mallet; I’ll have some sets of text built up that you can work with, but it will be extra useful if, like Blevins, you’re able to find some text collection of your own that you can work with. If you have one or an idea for one, let me know and we can figure out how to get it ready for class.


As I said in class, this week before we meet you should take some time to participate in a crowdsourcing project to see how some institutions are digitizing their content. Everyone should take a different one so that we can compare notes about the possibilities and pitfalls of this sort of thing. You’ll probably be happiest if you can find something that matches your interests (try googling “Crowdsourced ___ history” or something similar as a last resort to find projects).

Spend enough time to make a contribution to the archive, but also browse around and be ready to report to the rest of us how well the project is working, what sort of contributions it seems to be getting, and whether it’s a model extensible to other projects. Would you be able to apply these methods to a project yourself? Could you go about digitizing your own research artifacts in these same ways?

Some possibilities:


You don’t have to use Twitter, but it can be a good way to find out what’s going on in the field. We’ll expand this list, but just to start:

Think about setting up a Twitter account to follow some of your historical colleagues at other institutions.

Here at Northeastern, you might want to follow:


Some historians and humanists outside of the university:

@dancohen (Digital Public Library of America)
@DavidRArmitage (Harvard)

Finding Blogs to Follow

You should definitely follow Digital Humanities Now.

One way to find other blogs is just to do some searches for your field: “Russian History Blog,” for instance, will take you straight to this good group blog. Try to find blogs written by academics so you can see how professional historians write. If you can’t tell who the author is from the blog, that’s usually a bad sign nowadays. (Academics blogged anonymously quite frequently in the early 2000s, but now they tend to do it under their own names.)

Group blogs are particularly good. Some group blogs with a particular subject area you might be interested in are:

The blog of the Society for US Intellectual History.
The blog of the Forum for the History of Science in America.
has a blog: you could follow it, or even better go through some of their posts to see who they’ve been following themselves.