Data Visualization

While exploring the world of “humanities computing,” that is, the use of digital methods in research, two common themes I’ve noticed are data visualization and text mining. By “visualization” I mean the use of computers to generate charts and models that show elements of, or associations between, data (usually quantitative in nature). Text mining, on the other hand, is the scanning of digitized texts for word occurrences and word patterns, and it can be open-ended or delimited by researchers according to their interests.

Text mining was covered heavily by Matthew Jockers in Macroanalysis. His main concern is to demonstrate the viability of applying text mining to large sets of data; in his case, a corpus of principally Irish and British literature. By using computer software, Jockers is able to rapidly complete an analytical process that would take weeks if left to human mental faculties. He compiles data about the writing style, themes, and subjects in the corpus of novels to show their relationships with, for example, nationality and gender. He also represents this data visually, using charts and word clusters (a model in which the size of a word or phrase and its position relative to the center of the cluster indicate its significance) to show trends or hidden aspects of the data, whether by year, region, nationality, or some other variable.
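To make the mechanics of this kind of text mining a bit more concrete, here is a minimal sketch in Python of the most basic step: counting word frequencies in a small corpus and grouping the counts by a metadata field such as nationality. This is my own illustration with made-up texts and labels, not Jockers’s code, corpus, or full method (which involves far more sophisticated stylometric and thematic analysis).

```python
# Minimal word-frequency sketch (illustration only, not Jockers's method).
# The corpus entries and nationality labels below are hypothetical.
import re
from collections import Counter

corpus = [
    {"nationality": "Irish",   "text": "The grey road wound past the grey stone cottage."},
    {"nationality": "British", "text": "The carriage rolled through London in the evening fog."},
]

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

# Tally word frequencies separately for each nationality label.
counts_by_group = {}
for doc in corpus:
    group = counts_by_group.setdefault(doc["nationality"], Counter())
    group.update(tokenize(doc["text"]))

for nationality, counts in counts_by_group.items():
    print(nationality, counts.most_common(3))
```

Scaled up to thousands of novels and far richer features, tallies like these are the raw material behind the charts and word clusters described above.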

This week I’ve encountered some essays that use visualization in a slightly different way. Shin-Kap Han’s “The Other Ride of Paul Revere” introduced me to network modeling. Han uses models to visualize membership data, illustrating connections between Paul Revere, Joseph Warren, and political groups that otherwise appear disjointed at best. Caroline Winterer, for her part, uses network modeling to map the letter correspondence of Benjamin Franklin and Voltaire. A significant difference between these scholars and Jockers is that they use modeling to represent not aspects of text but more concrete historical social realities.
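To show what a membership-based network model looks like in practice, here is a small sketch using Python and the networkx library. The membership list is a made-up placeholder (only the two names come from the essay), and this is not Han’s actual dataset or procedure; it simply illustrates the idea of linking people who belong to the same organizations.

```python
# Minimal membership-network sketch (illustration only, not Han's data or code).
# Group names and the extra member below are hypothetical placeholders.
import networkx as nx
from networkx.algorithms import bipartite

# Hypothetical membership list: (person, organization) pairs.
memberships = [
    ("Paul Revere",   "Group A"),
    ("Paul Revere",   "Group B"),
    ("Joseph Warren", "Group A"),
    ("Joseph Warren", "Group B"),
    ("Other Member",  "Group B"),
]

# Build a bipartite graph with people on one side and organizations on the other.
B = nx.Graph()
B.add_edges_from(memberships)
people = {person for person, _ in memberships}

# Project onto the people: two people are connected when they share a group,
# and the edge weight records how many groups they share.
person_network = bipartite.weighted_projected_graph(B, people)

for a, b, data in person_network.edges(data=True):
    print(f"{a} -- {b}: {data['weight']} shared group(s)")
```

Plotted rather than printed, a projection like this is the kind of picture that can reveal a figure such as Revere sitting at the intersection of otherwise separate groups.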

Whatever the end result of modeling and/or text mining, there is a notable degree of cautious skepticism among these writers. Jockers suggests that the methods of the digital humanities can only accompany, or are even secondary to, the traditional humanistic method of close reading and interpretation. Winterer, too, states plainly that her study bumps up against the boundaries of digital mapping and that we have “good reasons to be wary of what digitization and visualization can offer us” (Winterer, 598).

While I think it’s too early for me to put in a good word for text mining, especially after an in-class experiment last week returned some peculiar results, the visualization side of the digital humanities seems, based on my inchoate impressions at least, to be a useful tool for interpreting data. Certainly there are real questions about how to properly interpret the meaning of the data being visualized in its wider historical context, an end which, it could be argued, still depends on the traditional “close reading” approach. But it is more striking, more eye-catching, I think, to visualize the flow of letters than to weed out their individual destinations one by one, for example. We should be cautious about how we interpret models, but we can certainly accept them as an aid to our interpretation.
