Working with Data

All enrolled students should be in the course Slack; I will place a link here.

Data analysis in the humanities presents challenges of scale, interpretation, and communication distinct from those in the social sciences and sciences. In recent years, a number of new practices in this sphere have begun to cohere: “cultural analytics,” “distant reading,” “macroanalysis,” and “data feminism.” But it can be hard for humanists to learn how to apply these practices, not just talk intelligently about them.

This graduate seminar will develop skills to read and create scholarship in these computationalist traditions of the digital humanities. We’ll do so through more traditional seminar readings and a series of programming worksets. The worksets will teach you several kinds of data analysis and visualization that are actually useful for humanists and others communicating in areas where the data is messy and culturally contingent, and where the reading audience is not necessarily trained or interested in statistics.

This course is designed for graduate students in the humanities and related fields like journalism, media studies, sociology, and digital media. It intentionally does not cover most of the statistical techniques that a comparable course in economics, statistics, or data science might spend time on; instead, it develops skills in geocomputation, data architecture, and information visualization.

We will primarily use SQL databases and the R language, but the programming in this class is a means to an end: understanding the many formal languages that computers let us use and develop to describe culture. Programming for humanists tends to be an overwhelming grab-bag of tools and methods, so we focus on a few classic methods and the basic skills of data manipulation you need to connect them together. If you have previous experience with Python or JavaScript and wish to use one of those languages, you may do so. The lessons for this course are tied to an online text with exercises.

In 2022, we will be focusing some project work on a collection of New York City directories containing labor and ethnic information for each year, which recent (fall 2021!) work at Columbia has made it possible to locate geographically.

You can read a preliminary syllabus on this site; the syllabus will change.

This course will meet on Monday evenings at about 6:20 at New York University in Spring 2022.