You can find my most formal DH work in my CV.

I have two major digital projects in progress. One is Bookworm, which I co-direct with Erez Lieberman Aiden of Rice University and which provides an open-source platform for data analysis and visualization of large libraries. It is used by a number of other organizations, including Yale University, the Medical Heritage Library, and the HathiTrust.

The other is a monograph-sized critical engagement with digital sources as mediated through the crucible of the American state, 1850 to 1950. Elements of that project are online in my work on whaling logs and on data visualization in the Census Bureau.

Much of my ongoing research in text mining or digital humanities is posted in some form at my personal research blog, Sapping Attention. Here are some highlights:

Several posts engage with new research on a database of approximately 1 million out-of-copyright texts from the Open Library. Practice, the Periphery, and Pittsburg(h) (January 2012) shows how simple textual cues that are nearly meaningless at the individual scale can help us to think about historical patterns of cultural dissemination and state power when we look at larger aggregates. Age Cohort and Vocabulary Change introduces a method for thinking about the effects of generational cohorts on language change which I have been developing further offline. Women in the Libraries uses census data to delve into questions of author gender and representation in academic libraries.

Others offer introductions to the non-technical aspects of using digital libraries for historical research. What historians don’t know about database design… and Stopwords to the wise both treat the influence of database design decisions in shaping the work possible for traditional humanistic researchers who venture online. In search of the great white whale describes the curious omission of most of the canon of British and American literature from our digital archives. Digital History and the Copyright Black Hole provides a more polemical perspective on the distorting effects of copyright law on research possibilities.

I also ran a seven-part serial study on visualizing the locations of the American whaling fleet. The previous link takes you to the overview page; the central post is here. This built on methods from the single most widely circulated post on the blog, which uses ships’ logs originally digitized for climate research to show the paths of European ships in the 18th and 19th centuries.

Finally, the blog regularly engages with ongoing debates over thorny questions about the theoretical uses and abuses of digital methodologies in the humanities. Among these are What’s New (one response to the common question, “What do numbers tell us we didn’t know before?”), Theory First, which urges Digital Humanists to put theoretical grounding ahead of digital methodology (later republished in the Journal of Digital Humanities), and a post on the confusion of the physical book with cultural systems of academic reproduction. My whaling series also includes a post on the missing category of individual experience in data-driven narratives.

