Monthly Archives: March 2015

Great Expectations & Hard Times: Dickens, Twitter, and R

This is a post about the experience of banging one’s head against the wall. Or, no. This is a post about trial and error. About Great Expectations and Hard Times. About learning. Really, really slowly.

If I had written this post two weeks ago, it would have begun differently. I was feeling good about my progress with R back then. We’d gone from writing code to making that code show things in the form of visualizations. To this end, I was able to confirm Moretti’s observation that book titles shrank in length over the course of the 18th and 19th centuries. That graph is below:


Then, I became interested in what the Library of Congress classifications could tell us about the books in our data frame. I made a rather colorful graph showing the growth in the number of books published per year, separated out by classification. Of particular interest to me is the way that the literature classification, P, grew at a more dramatic rate than any of the others. Here’s the graph:


After working with the State of the Union Addresses last week, I was eager to do some textual work myself. I looked around for some texts that might be more relevant to my interest in literary journalism, but, eager to get my hands dirty, I decided to just work with the provided Dickens texts. First, I read in the Dickens corpus and managed to tokenize it by word. From there, obviously, I went straight to the random walk generator, since we had so much fun with it in class and I wanted to see what it might look like in Dickens’ “voice.” The result was pretty cool, and really dialogue-heavy. I added some punctuation for effect, but I think it sounds like Dickens:

“It was the first clear indication, Sir Leicester. If I had better be a comfortable home.”

“The slight noise they made me wery cold, I tell you,” said Mr Phunky Serjeant.

Snubbin replied, “The little mother had been a post at the door.”

He held his peace.

“Come,” cried the boy’s face with an air of Fleet Street amidst the loud screams of ladies and gentlemen. “Here but humiliation that I suffered it to the past history her present position as will outlive this danger and your manners.”

My friend Dombey with his disengaged arm and yard: “By his name I heard the old gentleman returned, the Captains said.”
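For anyone wondering what a generator like this does under the hood, here is a minimal sketch in base R. It is not the code from class: the one-line sample text, the `followers` lookup, and the `random_walk` function are illustrative assumptions, and a real run would read in the Dickens files rather than a single sentence. The idea is simply to tokenize by word, note which words follow which, and then hop from word to word at random.

```r
# Minimal sketch of a word-level random walk (an assumption, not the class code).
# The sample text is a stand-in; a real run would read in the Dickens corpus.
text <- "It was the best of times, it was the worst of times, it was the age of wisdom"

# Tokenize by word: lowercase, then pull out runs of letters and apostrophes.
lowered <- tolower(text)
words <- regmatches(lowered, gregexpr("[a-z']+", lowered))[[1]]

# For each word, record every word that follows it somewhere in the text.
followers <- split(words[-1], words[-length(words)])

# Walk from a starting word, picking a random follower at each step.
# Stop after n_words words, or earlier if a word has no recorded follower.
random_walk <- function(start, n_words = 20) {
  out <- start
  current <- start
  while (length(out) < n_words) {
    candidates <- followers[[current]]
    if (is.null(candidates)) break
    # Index with sample.int() so a single candidate is handled correctly.
    current <- candidates[sample.int(length(candidates), 1)]
    out <- c(out, current)
  }
  paste(out, collapse = " ")
}

set.seed(1)
random_walk("it", n_words = 8)
```

Because each step looks only at the current word, the output is locally plausible and globally nonsensical, which is more or less what the quotes above show.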

But now we come to the frustrating part. Undoubtedly related to my own inflated sense of confidence, I wanted to actually make something. I decided to take up the suggestion in the syllabus to create a Twitterbot, something I’d thought of trying to learn in the past anyway. My idea was to tweet text generated from the Dickens random walk generator.

This presented a few challenges. First, I needed to figure out how to limit the amount of text the generator produces. As we’ve seen, the code most of us were using would run forever. So, I figured out how to set a limit of 20 words (if I were being really precise, I might have used a character count to limit it). Then, following some directions I found online, I registered as a developer with Twitter and connected RStudio to Twitter using the twitteR package. I sent my first tweet from R last night:

Now that I knew I could tweet from R, how to make the content of the tweet some text from the random walk generator? This is where the banging my head against the wall came in. I spent hours, a lot of hours, trying to figure this out. The problem was that rather than “print” the text from the function, I needed to store it as a value. Finally, around 9pm last night, I tossed out the Hail Mary to Prof. Schmidt, who, even at that late hour, was kind enough to fire back a bit of code. Another couple of hours passed trying to adapt what he provided to what I had set up, until finally, at exactly 11:20pm, I published my first auto-generated Dickens quote. Note that I appended an ellipsis and a hashtag so that the people who follow me on Twitter wouldn’t worry that I had been hacked or lost my mind. Here’s the tweet:

And then, for good measure, and just to assure myself that I hadn’t dreamt this success, I sent another one this morning:

The next step will be to learn how to automate this, but for now, it’s fun to think that whenever I get stuck working in RStudio, rather than bang my head against the wall, I can launch an original, auto-generated Dickens quote out into the world.
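For the curious, the print-versus-value problem that ate up those hours can be sketched in a few lines of base R. This is a rough illustration under stated assumptions, not the code Prof. Schmidt sent: `generate_quote`, `make_tweet`, and the `#dickensbot` hashtag are hypothetical names, and the generator here is a fixed stand-in rather than a real random walk. The 140-character limit is Twitter's limit at the time of writing.

```r
# A function that only prints with cat() hands back NULL,
# so there is nothing to pass along to a tweet.
print_quote <- function() cat("the little mother had been a post\n")
printed <- print_quote()
is.null(printed)

# Hypothetical stand-in for the random walk generator. The last expression
# in a function is its return value, so the text can be stored and reused.
generate_quote <- function() {
  words <- c("the", "little", "mother", "had", "been", "a", "post", "at", "the", "door")
  paste(words, collapse = " ")
}

# Append an ellipsis and a hashtag (the hashtag is a made-up example),
# trimming the quote first so the whole tweet fits the character limit.
make_tweet <- function(quote, hashtag = "#dickensbot", limit = 140) {
  suffix <- paste0("... ", hashtag)
  room <- limit - nchar(suffix)
  paste0(substr(quote, 1, room), suffix)
}

tweet_text <- make_tweet(generate_quote())  # stored as a value, not printed
nchar(tweet_text) <= 140

# With the twitteR package loaded and authenticated, the stored string could
# then be posted. The credential variables below are placeholders:
# library(twitteR)
# setup_twitter_oauth(consumer_key, consumer_secret, access_token, access_secret)
# updateStatus(tweet_text)
```

Trimming by character count rather than word count is the more precise option, since the limit on a tweet is measured in characters.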