Markdown, Historical Writing, and Killer Apps

Like many technically inclined historians (for instance, Caleb McDaniel, Jason Heppler, and Lincoln Mullen) I find that I’ve increasingly been using the plain-text format Markdown for almost all of my writing.

The core idea of Markdown is that rather than use Microsoft Word, Scrivener, or any of the other pretty-looking tools out there, you type in “plain text” using formatting conventions that should be familiar to anyone who’s ever written or read an e-mail. (Click on Mullen’s or Heppler’s name for a better introduction than this, or see the Chronicle’s wrapup of approaches).

The benefits are many, but they’re mostly subtle:

  • A simple format like Markdown creates documents you’ll have no trouble reading in twenty years. I’ve been teaching a survey course this semester and had a hell of a time reading my old notes from generals, which I took using EndNote; with Markdown, any web browser, text editor, or Microsoft Word descendant will have no trouble opening them.
  • It’s very easy to produce content that will look good in multiple media: I can make a course syllabus or personal CV that formats nicely on a website and produces a clean-looking PDF at the same time.
  • It becomes much easier to do things to a bunch of notes at the same time: bundle them into PDFs, search through all of your notes simultaneously, and so forth.

None of these, though, is a particularly strong sell for those who use a computer instrumentally: in reality, your Microsoft Word documents aren’t about to disappear, either. And there are disadvantages to giving up Word.

  • Things like footnotes with a citation manager are not very easy, even for the technically competent. 1 Even footnotes without a citation manager are fairly clumsy.
  • The best tool for making your Markdown documents into attractive web pages, Pandoc, is not especially easy to install or configure if you don’t use the command line on a regular basis.
  • The core definition of Markdown is a little unclear: particularly in the last week, there have been some conflicts over the definition that will be confusing to newcomers. (Although the proposal that sparked them, “Common Markdown,” is likely to be a good thing in the long run.)

The heart of Markdown’s appeal is its flexibility: to drive any adoption outside the hard core of people, you need a killer app built off of it that solves a problem. In the technology sector, that has been Markdown’s ability to easily handle links and snippets of computer code for those writing on two widely used sites, GitHub and Stack Overflow.

Among historians, neither of those is very important. And the footnote problem is big enough that I generally wouldn’t recommend that anyone use Markdown, right now, unless they enjoy banging their head against the wall.

Lectures and Notes: the killer apps.

There are two places, though, where even historians don’t tend to use footnotes: lectures, and notes. And in both of these, Markdown makes some amazing things possible.

If there’s any reason for historians to use Markdown, it’s in these two spheres. The reason I keep using Markdown is that it makes it possible for me to personally solve two problems that have driven me crazy:

  1. Quickly making slide decks to go alongside a lecture, and borrowing and reusing chunks of slides from one talk in another;
  2. Making heads or tails of the thousands of pictures you take while on an archival trip.

Markdown and lectures: multimedia and transposability.

First, lectures. With Markdown, I’m able to write my own notes and create a slide deck at the same time. An example will help. Here’s a snippet from my lecture notes on the memory of the Civil War:

With some ancillary code I wrote, that does two things at once: builds a slide showing the Wikimedia copy of Sherman’s grizzled mug, and creates a set of notes for me under the header “Abolitionist memory of the war” to go on the paper notes I’ll read from.

Later on, I’ll write another script that will pull every phrase in boldface (like “Field Order 15”) from all my notes and put them onto a list of possible IDs for the midterm that I can hand out. Another script could strip out just the section headers and print outlines for the lectures to hand out before class.
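Neither of those scripts exists yet, but a minimal Python sketch gives the flavor. It assumes boldface is written as `**phrase**` and headers as `#` lines; the function names and the sample notes are made up for illustration.

```python
import re

# Assumptions: bold is written **like this**, headers start with 1-6 "#".
BOLD = re.compile(r"\*\*(.+?)\*\*")
HEADER = re.compile(r"^(#{1,6})\s+(.+?)\s*$")


def bold_phrases(markdown_text):
    """Return every bolded phrase in order of appearance, without repeats."""
    seen = []
    for match in BOLD.finditer(markdown_text):
        phrase = match.group(1).strip()
        if phrase not in seen:
            seen.append(phrase)
    return seen


def outline(markdown_text):
    """Return the section headers, indented two spaces per header level."""
    lines = []
    for line in markdown_text.splitlines():
        match = HEADER.match(line)
        if match:
            depth = len(match.group(1)) - 1
            lines.append("  " * depth + match.group(2))
    return lines


# Invented sample notes, not the real lecture file.
sample = """# Memory of the Civil War
## Abolitionist memory of the war
**Sherman** and **Field Order 15**.
## White supremacist memory of the war
"""

print(bold_phrases(sample))
print("\n".join(outline(sample)))
```

Run over a whole folder of lecture files, the first function would produce the ID list and the second the handout outlines.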

This is writing documents for multiple uses, and it can be incredibly useful. If, two minutes before class, I decide I want to switch the order I talk about the abolitionist memory of the war and the white supremacist memory of the war, I can just cut and paste the chunks of text, and all the slides associated with each will have their order switched.

Something like this could provide a really useful way to integrate and share resources, and free up some of the tedium with prepping lectures. But:

  • That syntax for including an image as a slide is my own, not standard Markdown. I’ve defined scripts for dropping in YouTube videos, images, captions, and some other predefined formats: but it would take a lot of work to define a set of them that make sense for anyone but me.
  • There are a lot of standards out there for working with HTML slides. None is winning, in part because none is anywhere near as good as Keynote or PowerPoint for the average user. My code works with deck.js, one of the only HTML formats not supported by Pandoc; but there’s no obvious other standard to switch to.
  • Constructing slides that are more complicated than a single image with a title, or a numbered list, requires some serious HTML/CSS expertise. My scripts support that, but not in a pretty way.

Modern HTML allows some beautiful things: I can easily imagine a GUI for one of the standards that would make it easy to create slides for re-use in one of the competing platforms. But I think the standards are still evolving too rapidly in this sphere to make the way forward obvious.

Pull out the slide deck, and you still might have a useful tool here: something that generates lecture notes for me, outlines for the students/course web page, and IDs for the test prep sessions. But I think there’s something even more valuable possible for archive notes.

Markdown and the Archives: integrating notes and photos

Markdown is a great language for taking archival notes. Archives are all about hierarchy: and Markdown easily lets you tag multiple levels of headers (Series, Box, Collection, file…). But so is Microsoft Word: and there are plenty of outlining programs out there that are even better.

There are a few things that Markdown notes might do more easily than normal ones. Build a good enough web interface, and you could even click on a photo or quote in your notes and instantly get back a string that ascends the various headers to tell you where it is: Series 3a, Box 13, Folder 4, Letter on 4/18. But the place where there’s really an opportunity lies in digital photos.
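The header-ascending string is the easy part, even without the web interface. Here is a rough Python sketch, assuming the hierarchy is marked with `#` headers and that you know the line number of the quote or photo; the function name and the sample notes are hypothetical.

```python
import re

# Assumption: archival hierarchy is marked with "#"-style headers.
HEADER = re.compile(r"^(#{1,6})\s+(.+?)\s*$")


def location(markdown_text, line_number):
    """Return the chain of headers that encloses a 1-indexed line."""
    trail = []  # (level, title) pairs, outermost first
    for line in markdown_text.splitlines()[:line_number]:
        match = HEADER.match(line)
        if match:
            level = len(match.group(1))
            # A new header closes any open header at the same or deeper level.
            while trail and trail[-1][0] >= level:
                trail.pop()
            trail.append((level, match.group(2)))
    return ", ".join(title for _, title in trail)


# Invented sample notes.
sample = """# Series 3a
## Box 12
Nothing of interest.
## Box 13
### Folder 4
#### Letter on 4/18
A quote worth keeping.
"""

print(location(sample, 7))
```

For line 7 of the sample, the stale "Box 12" header is dropped when "Box 13" opens, so the string names only the headers that actually contain the quote.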

Digital cameras have completely changed historians’ relations to archives in the last 15 years. (That is, in the subset of archives where cameras are allowed). We used to take notes: now, a massive part of our archival practice involves taking pictures, which have to be sorted through on our return.

When I’m wading through boxes, I tend to type the name of the box, and then some information about each folder followed by descriptions of the documents: if it’s especially useful or especially visual, I take a picture (or a series of several pictures). I think this is pretty similar to what most people do. It means that I end up with two separate timelines to sort through when I get home. 1) A bunch of textual notes that contain my impressions of the works and the rationales for why I copied them and what they are. 2) A stream of pictures with little context but their order to patch together their origin, sometimes with a close-up of a box or folder label thrown in to help.

The tough question is: how can you insert pictures into your notes? Unless you want to physically pick up your laptop and use the webcam for your pictures, it’s not obvious what the best way would be. And if you try to put more than a couple pictures into a Word document, it will crash right away.

Unlike the systems most historians use for notes, Markdown is plain text and has an easy method for inserting multimedia. That means that you can use it to integrate your archival photos directly into your notes; and that unlike Word, it can handle hundreds of images or thumbnails with aplomb.

The last challenge is knowing which parts of your notes go with which pictures. This is a surprisingly hard thing to solve: but there’s an existing answer in a second technology much beloved by the technology industry: version control.

Version control can get complicated, but in its simplest form it’s much like a Wikipedia edit history: not just the current state of a file but every previous revision is stored.

So for archival notes, we just need to save the state of your archival notes every 10 or 15 seconds; match those markers against the timestamps of the photos from a digital camera; and insert the pictures into the text in the right place.
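The matching step can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the code in my scripts: each saved state is reduced to a (time, number of lines) pair, notes are assumed to grow by appending, and a photo is attached to the last state saved at or before the moment it was taken.

```python
from bisect import bisect_right


def place_photos(snapshots, photos):
    """Map each photo to the number of note lines that existed when it was taken.

    snapshots: (unix_time, line_count) pairs, sorted by time
    photos:    (unix_time, filename) pairs, in any order
    Returns (line_count, filename) pairs in the order the photos were taken.
    """
    times = [saved for saved, _ in snapshots]
    placed = []
    for taken, filename in sorted(photos):
        # Index of the first snapshot saved strictly after the photo.
        i = bisect_right(times, taken)
        # Photos older than every snapshot go to the top of the notes.
        line_count = snapshots[i - 1][1] if i > 0 else 0
        placed.append((line_count, filename))
    return placed


# Hypothetical data: three saved states and three photos.
snapshots = [(100, 2), (115, 5), (130, 9)]
photos = [(120, "b.jpg"), (90, "a.jpg"), (130, "c.jpg")]
print(place_photos(snapshots, photos))
```

Choosing the snapshot just before the photo fits a habit of typing a description and then shooting; someone who shoots first and types afterward would want the snapshot just after instead.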

When you want to review your notes, you just open them up in HTML format: thumbnails of every picture will appear in place, and you can click on them to get the full version.
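The clickable thumbnails need nothing beyond standard Markdown: an image wrapped in a link. A small Python sketch, assuming you already know how many lines of notes precede each photo and that thumbnails live in a hypothetical `thumbs` folder beside the full-size files:

```python
def insert_thumbnails(note_lines, placements, thumb_dir="thumbs"):
    """Return the notes with a linked thumbnail after the given line counts.

    note_lines: the notes, one string per line
    placements: (line_count, filename) pairs; each image is placed after the
                first `line_count` lines of the notes
    thumb_dir:  assumed folder of thumbnails with the same file names
    """
    by_position = {}
    for line_count, filename in placements:
        # Clamp out-of-range positions to the top or bottom of the notes.
        position = max(0, min(line_count, len(note_lines)))
        by_position.setdefault(position, []).append(filename)

    out = []
    for position in range(len(note_lines) + 1):
        if position > 0:
            out.append(note_lines[position - 1])
        for filename in by_position.get(position, []):
            # [![alt](thumbnail)](full image): a thumbnail that links to the original.
            out.append(f"[![{filename}]({thumb_dir}/{filename})]({filename})")
    return out


print("\n".join(insert_thumbnails(["# Box 13", "Folder 4"], [(1, "IMG_1.jpg")])))
```

Pandoc or any other Markdown converter then turns each of those lines into an `<a>` around an `<img>` in the HTML output.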

For the technically savvy, I’ve put a set of scripts online that do just this. I use gitit to view the notes themselves so I can interlink between pages. A daemon handles the git commits: but that only works because I have always been a compulsive, several-times-a-minute saver of my documents.

What would a user-friendly platform look like?

My repo might be useful for those who are already comfortable with tools like version control: but those are the people who are already using Markdown anyway.

To make this useful for anyone else, we’d need a system with three easy, non-command line steps:

1. Installation

Puts Pandoc, Git, and a good Markdown editor on your computer at once.

2. Writing (in the archives)

This should resemble existing note taking as closely as possible: the user will need to make sure their camera’s clock is well-calibrated, but other than that it should look only like using a new text editor.

Whenever you type in the editor, it saves the files and runs git commit at close intervals. (Git experts may find the idea of automatic commits without a clear commit message cringe-inducing. Insofar as they have a point, edits should probably take place on a separate branch that is merged back into the main one periodically.)
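Outside of any particular editor, the commit half of this is a short loop. The Python sketch below is an assumption-laden illustration: it presumes the editor has already written the file to disk, that the file sits inside an existing Git repository, and that a timestamp is an acceptable commit message. The function names are invented, and it does nothing about the separate-branch idea.

```python
import subprocess
import time


def commit_commands(path, timestamp):
    """Build the two git invocations that snapshot one notes file."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(timestamp))
    return [
        ["git", "add", "--", path],
        ["git", "commit", "--quiet", "-m", f"autosave {stamp}", "--", path],
    ]


def autosave_loop(path, interval=15):
    """Commit `path` every `interval` seconds until the process is stopped."""
    while True:
        for command in commit_commands(path, time.time()):
            # check=False: git exits non-zero when nothing has changed,
            # which is normal here and should not stop the loop.
            subprocess.run(command, check=False)
        time.sleep(interval)
```

Calling `autosave_loop("notes.md")` from the folder that holds the notes would leave a commit for every interval in which the file changed.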

3. Compilation (loading your pictures)

Imports photos from an SD card or photo library, finds the version control files and matches photo times against them, and builds an HTML file for each document of notes.
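Getting the two timelines out of Git and the camera might look roughly like this in Python. It is a sketch, not the code in my repo: the function names are invented, the notes path is assumed to be relative to the repository root, and the file's modification time stands in for the camera's own timestamp (reading the EXIF date would be more reliable but needs a third-party library).

```python
import os
import subprocess


def parse_log(log_output):
    """Turn `git log --format="%H %ct"` output into (unix_time, commit) pairs, oldest first."""
    pairs = []
    for line in log_output.splitlines():
        if not line.strip():
            continue
        commit, seconds = line.split()
        pairs.append((int(seconds), commit))
    return sorted(pairs)


def commit_times(path):
    """Ask git for every commit that touched `path`."""
    result = subprocess.run(
        ["git", "log", "--format=%H %ct", "--", path],
        capture_output=True, text=True, check=True,
    )
    return parse_log(result.stdout)


def line_count_at(commit, path):
    """Count the lines `path` had at `commit` (path relative to the repository root)."""
    result = subprocess.run(
        ["git", "show", f"{commit}:{path}"],
        capture_output=True, text=True, check=True,
    )
    return len(result.stdout.splitlines())


def photo_time(photo_path):
    """Stand-in for the capture time: the file's modification time in Unix seconds."""
    return int(os.path.getmtime(photo_path))
```

Because `%ct` and the modification time are both Unix seconds, the two lists can be compared directly once the camera clock and the laptop clock agree.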

What’s the platform?

Some of the technical components are obvious. I can’t imagine using anything other than git for version control; and though I use gitit to view files, I think that standalone HTML files are the only sensible way for most people to view their files. The scripting language for step three, as well, isn’t very important: I’ve used Python, but anything with a set of hooks into git would do.

The big question is: what’s the text editor to be? I use Emacs, and get the impression that most people writing in Markdown are using Vim. Both of these are clearly bad choices for the ordinary historian. For all that Markdown can be written in any editor, the writing function also must support auto-save and auto-git-commit, so anything without a scripting interface is out. Sublime Text has its selling points, but free’s probably the way to go.

That means, unless I’m missing a central player in the ecosystem, that the natural choice is the new Atom editor from GitHub. But perhaps there’s a more lightweight alternative?

Platform will also be an issue. The Mac is the obvious platform to capture a majority of historians: but a surprising number of people seem to take their notes with an iPad-keyboard array, which would call the whole stack into question.

Infrastructure

So that’s the proposal. Once historians see how great Markdown is for notes, maybe they’ll think about it for lectures; once they use it for lectures, maybe the footnote ecosystem will start to improve, and we’ll finally be able to distribute historical papers as text, making them more portable, more easily structured, and more lasting.

So, anyone want to try?


  1. It took me a few hours of mucking about in Emacs Lisp to make inserting a link to something in my Zotero library almost as easy as it is under Microsoft Word; and if you want to configure the core behavior of Pandoc, it’s best to use Haskell. Even the “programming historian” may not have heard of either of these languages. Both (well, at least Haskell) have their strengths: but suffice it to say that neither has ever been anyone’s answer to the question “If I should only learn one computer language, which should it be?”↩

6 thoughts on “Markdown, Historical Writing, and Killer Apps”

  1. Caleb McDaniel

    Great post! I’m delighted (in my geeky way) to hear that you are also using Gitit, which is the platform I’m using for all of my research notes, paired with an Omeka installation to host archival photos.

    As a point of clarification, I think that footnotes themselves are not too difficult in most current implementations of Markdown. But bibliographic citations in those footnotes are, I admit, more complicated.

    I’m not sure whether the goal should be to create a killer app, or just to work on better educating historians on how to use the various parts of this workflow that each do One Thing Well. Once upon a time, Lincoln and I wanted to write a Plain Text workflow paper for humanists similar to the one Kieran Healy has for social science graduate students, but we haven’t gotten far. There are more and more such tutorials appearing, though, like this one on Programming Historian. I’ve toyed with the idea of writing a sort of Pandoc Primer that explains how its templating features can be used to do various practical things like writing recommendation letters and creating slideshows/lecture notes.

    I don’t rule out a killer app, you understand, but there be dragons down that path as more and more users ask for more and more features …

    1. ben Post author

      You’re right that it’s bibliographic citations that are really bad: I think that pandoc-citeproc and csl files are good enough for anyone (maybe not Andrew Goldstone?), but they’re a pain. But I do think the whole ^[footnote] syntax doesn’t feel as natural to me as the rest of the Markdown spec, to the point that I avoid writing them so as not to use them. Maybe I’m in a minority. (Plus, you see what happens to pandoc’s nice footnote return elements without too much care for encoding at the bottom of this post).

      I guess the reason I’d say there needs to be a killer app is that the benefits are just so subtle right now that it almost comes down to a certain sort of philosophy that makes Markdown appealing. If you don’t *already* know what a plain text document is, I’m not sure the gut appeal is built in, and that’s so much of what makes it nice…

  2. Caleb McDaniel

    I do see your point. I think the best opportunity for converting those who are not already believers is the moment when Microsoft Word crashes and loses stuff you care about, and I mean “moment” in the Pocockian sense rather than a specific time. That’s what drew me into finding out more about plain text in the first place. I predict that future historians will see that moment between a Word crash and the reopening of the file as the Anxious Bench that powered the plain-text Great Awakening.

    1. ben Post author

      I should also say that I never would have started using gitit if I hadn’t seen you and a couple others making some use of it: I’ve experimented with some other personal wikis in the past, but it has some definite advantages.

      Although I can’t stand that you need a commit message to save a draft, so I edit all the .page files straight in emacs.

      1. Caleb McDaniel

        I do most of my editing of Gitit files in Vim, too. I’ve also been learning Haskell (I know, I know) for basically the sole purpose of being able to help with development on Pandoc and Gitit. Especially Gitit, which doesn’t have nearly as robust a user-developer community as Pandoc.

        In some ways, I think Gitit (or Gitit2) could be close to the killer app you’re thinking of, especially if one could have a hosted option like with Omeka.net so that the average user wouldn’t have to get it installed and running. Since you can convert to any format Pandoc converts to with the drop-down “export” menu, it can also serve as a kind of GUI for Pandoc. And it would take out the need to learn an editor like Vim or Emacs since you could edit in browser. (Lincoln even has a way of using the Ace editor with Gitit.)

        Of course, one can’t get very far with Pandoc without command-line options and templates, which Gitit doesn’t allow as of yet. So maybe, with some of the features you describe added, it could be a good gateway drug, if not a killer app.

  3. Simon Gadbois

    I think the Mac platform is where Markdown is blooming now. With MultiMarkdown, most of your worries would be disappearing. MultiMarkdown Composer 3 (to be released soon) has many features that will make it very appealing to academics.

