Monthly Archives: February 2015

Language Contributions to Subjects

For my first data exploration of the booklists, I started by looking at subject trends in books from 1850–1922. Unsurprisingly, there is a general increase in publications for all subjects due to advances in printing technologies. I measured each subject’s presence in print culture in two ways: by the total number of words and by the number of publications. The two graphs are nearly identical. I could’ve made a mistake in the code, but the graphs do have different scales on the y-axis, which suggests they aren’t simply the same data plotted twice.

Because that graph wasn’t interesting, I focused on which languages were producing texts for each subject. I limited the languages to French, German, Italian, and Spanish. German overtakes French as the most published language for Art, Science, Technology, Music, Medicine, and Economics during the span of this graph. I wouldn’t have guessed, though, that French was a more common language for Military Science publications right before WWI.
[Figure: Publication_Lang]
I’ve added a part of my Faulknerian random walk generator because I really liked it.  I added some punctuation and ellipses to make it look modern.  The modernists loved ellipses.
the head of one mule appears, its eyes roll with soft, fleet, wild opaline fire; its muscles bunch and run at it, because jewel is quiet now. “up your…” i said, “thought it would take a rawhiding for thinking they meant it.” but the courthouse lifts among the pine clumps blotched up the ford, used to be enclosed in a cage in jackson where,his grimed hands lying light in a greek frieze, isolated out of him, trying to catch her. “darl catch her darl catch her” darl says. pa says, “reckon i better do.” pa says, “cash does not look back when she finds me watching her, her eyes and face kind of… kind of lived.” one part used no more than you can ride down. dewey dell says, “leaning above the edge of the minds of the…” cash says, “kind of pop eyes like she says she…

A Taste of Italy (Part 1 of ???)

My final project for the class will look into the specific political and literary works of Gabriele D’Annunzio. This post serves as a preface to get the gist of the state of Italian literature when he began publishing around 1880. I looked at the full bibliographic data we had been using in class, and found the 5 most frequently published Italian authors (in the Italian language) from 1880-1922. The list is as follows:

  1. Gabriele D’Annunzio (80 books published in Italian)
  2. Giosuè Carducci (53 books)
  3. Alessandro Manzoni (38 books)
  4. Edmondo De Amicis (38 books)
  5. Antonio Fogazzaro (37 books)
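
A ranking like this comes down to filtering the bibliographic records by language and date range, then counting rows per author. The following is a minimal sketch in Python, not the code used for this post: the record layout `(author, language, year)`, the language code `"ita"`, and the placeholder authors are all assumptions made up for illustration.

```python
from collections import Counter

# Invented toy records in an assumed (author, language, year) layout.
# These are placeholders, not rows from the class dataset.
records = [
    ("Author A", "ita", 1885),
    ("Author A", "ita", 1900),
    ("Author A", "fre", 1901),  # not Italian, so not counted
    ("Author B", "ita", 1890),
    ("Author B", "ita", 1895),
    ("Author B", "ita", 1870),  # before 1880, so not counted
    ("Author C", "ita", 1910),
    ("Author A", "ita", 1915),
]

def top_authors(records, language, start, end, n):
    """Return the n authors with the most books in one language and period."""
    counts = Counter(
        author
        for author, lang, year in records
        if lang == language and start <= year <= end
    )
    return counts.most_common(n)

top = top_authors(records, "ita", 1880, 1922, 2)
print(top)
```

Passing a different language code to the same function would give the counts for books in other languages.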

I created a simple yet nauseatingly colorful bar chart with this list:

Most Books Published in Italian 1880-1922

Then I looked at these same 5 authors’ published books in other languages. (I am hesitant to say books “translated” into other languages, because I did not distinguish between the same book appearing in several languages and a book appearing only once in a non-Italian language.) As you will see, D’Annunzio is still the most prolific, but it is interesting to note that De Amicis’s books have been published in the greatest variety of languages.

Popular Italian Authors in Other Languages


Finally, breaking away from the author-specific, I looked at the total number of books published per language for English, Italian, French, and German over this time frame. This is mainly to get a sense of the number of Italian books relative to other languages in the set. The one interesting thing I noticed is that in the early part of World War I, around 1914-1915, there seems to be a steep drop in German, French, and Italian books, whereas the English-language books remain steady.

Books by Language 1880-1922

Wiggly Tales: A Random Walk Generator

We’ve been reading a lot of fairy tales around my house recently, so I wanted to see how well-spun of a tale I could create by walking randomly through a collection of fairy tales. I selected four fairy-tale collections from Project Gutenberg to test this idea on. Code is on GitHub.
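
For readers who want to try something similar, here is a minimal sketch in Python. It is not the code from the GitHub repository. It assumes the "random walk" is a word-level chain in which each next word is drawn at random from the words that follow the current word somewhere in the corpus, and it uses a tiny invented corpus in place of the Project Gutenberg files.

```python
import random
from collections import defaultdict

def build_transitions(text):
    """Map each word to the list of words that follow it in the text."""
    words = text.split()
    transitions = defaultdict(list)
    for current, following in zip(words, words[1:]):
        transitions[current].append(following)
    return transitions

def random_walk(transitions, start, length, rng):
    """Walk from word to word, choosing each next word at random."""
    word = start
    walk = [word]
    for _ in range(length - 1):
        followers = transitions.get(word)
        if not followers:  # dead end: nothing ever follows this word
            break
        word = rng.choice(followers)
        walk.append(word)
    return " ".join(walk)

# Toy corpus standing in for the fairy-tale collections (invented text).
corpus = (
    "once upon a time a man sat by the river "
    "once upon a great hill a fox sat by the fire"
)
transitions = build_transitions(corpus)
rng = random.Random(2015)  # fixed seed so a run can be repeated
tale = random_walk(transitions, "once", 12, rng)
print(tale)
```

Because repeated followers stay in each list, common word pairs are chosen more often than rare ones, and mixing several collections into one corpus lets a walk cross from one book into another wherever they share a word.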

I selected these four collections:

The addition of the Arabian Nights stories to Western European fairy tales makes the random generator more interesting, sometimes throwing the geographical sense of the tale into a different place and a different world.

This version generated my favorite beginning: “once upon a time a man by the river yes he was looking straight into the deep waters skeletons of walruses.”

But other versions of the generator took an even darker turn. Here’s the raw text:

“once upon a great procession which was conscious of pain And sore regret of which she said nothing but torment and affliction that He sniffed about to give the ants were always running to and when he approached her they did not really birds but she bore thee Thou hast nothing to me Only tell me something Why this is what you say What is the news O my sister relate to me Art thou she whom he found it impossible to think of The old rough doll You are learned and wise men assembled together in his age and to nail up my mind every earthly care and sorrow with soft turf From the narrow walks and the Wezeer the father of Is both of you should care so much that renders men sinful and impure He fully realized the true the speaker s hand saying to each other till the morning following I have with me from first to last and then burst and fell fast asleep”


And here’s the story, with some punctuation that I added for “clarity”:

Once upon a great procession–which was conscious of pain and sore regret, of which she said nothing but torment and affliction that He sniffed about to give. The ants were always running to, and when he approached her, they did not really birds but she bore thee: “Thou hast nothing to me. Only tell me something: Why this is what you say? What is the news? O my sister relate to me! Art thou she whom he found it impossible to think of? The old rough doll? You are learned, and wise men assembled together in his age and to nail up my mind every earthly care and sorrow with.” Soft turf from the narrow walks and the Wezeer the father of Is, both of you should care so much! That renders men sinful and impure. He fully realized the true the speaker’s hand, saying to each other till the morning following, “I have with me from first to last,” and then burst and fell fast asleep.

And sometimes it’s important to be reminded of where your texts come from. I didn’t remove any text at all from the Project Gutenberg texts, which means that the copyright and distribution information could appear in our stories too. For example:

“The two grand annual festivals are observed with public domain eBooks Redistribution is subject to particular laws or rules with respect to our beetle to himself but the observance of this Wezeer So the porter approached the Distracted Slave of Love when his boat or playing in the lap of prosperity and the fear of him said the Fire drum Peter has gone away I ll do something in me.”
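
If you would rather keep the license text out of your tales, one option is to cut each file down to the part between the Project Gutenberg start and end marker lines. This is a minimal sketch in Python, not part of the generator described here. It assumes the file has marker lines beginning with `*** START OF` and `*** END OF`; the exact wording varies between files, so the patterns may need adjusting, and the sample text is invented.

```python
import re

# Assumed marker patterns; real files differ slightly in wording.
START = re.compile(
    r"^\*\*\* ?START OF (THIS|THE) PROJECT GUTENBERG EBOOK.*$", re.MULTILINE
)
END = re.compile(
    r"^\*\*\* ?END OF (THIS|THE) PROJECT GUTENBERG EBOOK.*$", re.MULTILINE
)

def strip_gutenberg_boilerplate(text):
    """Keep only the text between the start and end marker lines."""
    start = START.search(text)
    end = END.search(text)
    begin = start.end() if start else 0        # no marker: keep from the top
    finish = end.start() if end else len(text)  # no marker: keep to the end
    return text[begin:finish].strip()

# Invented sample in the general shape of a Project Gutenberg file.
sample = """License header text
*** START OF THIS PROJECT GUTENBERG EBOOK SAMPLE TALES ***
Once upon a time there was a beetle.
*** END OF THIS PROJECT GUTENBERG EBOOK SAMPLE TALES ***
License footer text
"""
story = strip_gutenberg_boilerplate(sample)
print(story)
```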

All this generator proves is that tales can be wiggly indeed.


Installing Git

Using git will make it easier to access the course files in RStudio.

Here’s how to get it.

On Linux, it’s probably already installed. Any package manager will include a git install.

On Windows, just follow the official instructions.

On OS X, you’ll need to install the “Xcode command line tools.”

There are instructions online for doing this: the precise mechanism varies by version of OS X. You can always upgrade to Yosemite, the latest version, and follow these instructions. But don’t feel the need to upgrade if you’re on an old machine: it may slow you down.

On some versions, that will install git directly. But if you want to install more command-line tools, it may also be worthwhile (on a Mac) to install a program called Homebrew.

To install it, open the application “terminal,” and paste the following:

ruby -e "$(curl -fsSL"

Follow the prompts. It will require your password.

Afterwards, you can install the latest version of git. The way to do this is to type at the command line:

brew install git

This works for all sorts of programs: you can also, for example, upgrade to the latest version of R by typing:

brew install R

Some R packages you may encounter on your own have complicated “dependencies”: that is, they may need some other set of programs installed. (For example, to do advanced mapping in R, you may need the gdal toolset.) `brew install XXX` will frequently let you install a program without even having to find its website.