Ben Schmidt
I am Vice President of Information Design at Nomic, where I am working on new interfaces for interpreting and visualizing embedding models. For several years before that, I was a professor in the history departments at Northeastern and then NYU, where I worked with and led digital humanities groups deploying new approaches to data analysis and data visualization to help people think about the past. I have also written about higher education (teaching evaluations and humanities policy), narrative anachronism and plot structure, and political history.
I live in Manhattan.
For a third-person bio or photo, click here.
Recent Blog Posts
This is a Twitter thread from March 14 that I’m cross-posting here. Nothing massively original below. It went viral because I was one of the first to extract the ridiculous paragraph below from the GPT-4 release, and because it expresses some widely shared concerns.
I think we can call it shut on ‘Open’ AI: the 98-page paper introducing GPT-4 proudly declares that they’re disclosing nothing about the contents of their training set.
Recently, Marymount, a small Catholic university in Arlington, Virginia, has been in the news for a draconian plan to eliminate a number of majors, ostensibly to better meet student demand. I learned that the university leadership has been circulating one of my charts to justify the decision, so I thought I’d chime in on the context a bit. My understanding of the situation, primarily informed by the coverage in ARLNow, is that this seems like a bad plan,1
I sure don’t fully understand how large language models work, but in that I’m not alone. Still, in the discourse over the last week about the Bing/Sydney chatbot, there’s one pretty basic category error I’ve noticed a lot of people making. It’s thinking that there’s some entity that you’re talking to when you chat with a chatbot. Blake Lemoine, the Google employee who torched his career over the misguided belief that a Google chatbot was sentient, was the first but surely not the last of what will be an increasing number of people thinking that they’ve talked to a ghost in the machine.1
I attended the American Historical Association’s conference last week, possibly for the last time since I’ve given up history professoring. Since then, the collapse of hiring prospects in history has been on my mind more. See Erin Bartram, Kathryn Otrofsky, and Daniel Bessner on the way that this AHA was haunted by a sense of terminal decline in the history profession. I was motivated to look a bit at something I’ve thought about several times over the years: what happens to people after they receive a PhD in history?
The collapse of Twitter under Elon Musk over the last few months feels, in my corner of the universe, like something potentially a little more germinal; unlike in the various Facebook exoduses of the 2010s, I see people grasping towards different models of the architecture of the Web. Mastodon itself (I’ve ended up at @benmschmidt@vis.social for the time being) seems so obviously imperfect that its imperfections become a selling point; it’s so hard to imagine social media staying on a Rails application for the next decade that using it feels like a bet on the future, because everyone now knows they need to be prepared to migrate again.
I’m excited to finally share some news: I’ve resigned my position on the NYU faculty and started working full time as Vice President of Information Design at Nomic, a startup helping people explore, visualize, and interact with massive vector datasets in their browser.
When you teach programming skills to people with the goal that they’ll be able to use them, the most important obligation is not to waste their time or make things seem more complicated than they are. This should be obvious. But when I’m helping humanists decide what workshops to take, reviewing introductory materials for classes, or browsing tutorials to adapt for teaching, I see the same violation of the principle again and again. Introductory tutorials waste enormous amounts of time vainly covering ways of accomplishing tasks that not only have absolutely no use for beginners, but will confuse learners by making them…
It’s not very hard to get individual texts in digital form. But working with grad students in the humanities looking for large sets of texts to do analysis across, I find that larger corpora are such a hodgepodge as to be almost completely unusable. For humanists and ordinary people to work with large textual collections, they need to be distributed in ways that are actually accessible, not just open access.