The idea
Ultimately I suppose the inspiration for the idea of SOTU-db goes back to my own high school career, when I certainly would not have found working with the text of a State of the Union address to be an engaging activity. More recently, though, my experience teaching high school social studies really made me treasure well-made online learning tools. In a not-uncommon experience, I would get excited by an online resource and plan to take my class to the computer lab with a specific activity in mind. Once in the lab with computer access, students immediately set to work logging into Facebook, Instagram, Twitter, etc.
I understand the appeal of these platforms, and often found myself thankful that these services were not readily available via pocket-sized devices when I was a high school student -- I'm sure they would not have helped with my own engagement in my schoolwork. Though connecting with friends and being part of that social milieu is a key appeal of these products, it's also undeniable that the web interfaces for interacting with these platforms are far superior to many of the digital tools I was trying to push onto my students. The New York Times's 2010 census map tool could have been an amazing learning opportunity, but the Flash interface made loading it on different devices unreliable and inconsistent, the page was heavy and often crashed or stopped working under load, and the interface was simply too different from the expected Google Maps-like experience to hold students' interest.
This was discouraging to me. What if students could log in to the census map, track their progress, see notification badges letting them know they have tasks left undone, insights left unexplored? What if the map offered suggestions or nudges for users to explore in a particular way, or try out a certain feature? Could student engagement be improved? Questions like this are a big part of how I found myself as a graduate student in Digital Humanities. Creating a tool that teachers can effectively use in diverse classrooms is a major goal for me, because it's something I would have appreciated more of when I was teaching.
The New York Times 2010 Census Map online tool.
The idea to work with State of the Union addresses in particular came about as I was watching President Trump's 2018 "State of the Union" address. As Head of State, President Trump's words were being symbolically decoded and analyzed all over the world. Particularly during an administration marked by instability, President Trump's ability to conform to the expectations of office and to deliver an address with clear, coherent messaging seemed urgent. At the conclusion of the address, I felt Trump had successfully conformed to the aesthetic and ceremonial expectations of the evening, without saying much in the way of substantive policy or ideological goals. This was far from satisfying as someone looking for clues about the trajectory of this administration. Was there a way to cut through the noise and see what set Trump's speech apart from other State of the Union addresses? Could we compare his word choices and topic selections with previous presidents to learn more about the priorities of this administration?
From my perspective as a student in Digital Humanities, these questions seemed perfectly suited to the application of digital tools for scholarship and criticism. On a basic level, it seemed obvious to use tools for textual analysis (Voyant Tools, R) to analyze the specific word choices of the January 30, 2018 address. In fact, interesting and insightful analyses of this type had already been conducted and are available online:
- "Analysis of Trump's State of the Union Speech, with R" on Revolutions blog
- "'I Have The Best Words.' How Trump’s First SOTU Compares To All The Others" from BuzzFeed, with links to source code.
- "The state of our union is … dumber: How the linguistic standard of the presidential address has declined" from The Guardian
I envisioned a project that could put some of this together into one platform and open up a kind of exploratory, playful engagement with the texts. Inspired by the simple, fun digital tools I've explored already in my career as a Digital Humanist (see the list of DH tools below), I wanted to create my own platform for searching, comparing, analyzing, and visualizing the text of these highly symbolic acts of communication: the annual address (or State of the Union). This, too, has already been done, at a site called "SOTU."
DH Tools:
How is my own project distinct from SOTU? At the most basic level, SOTU-db is different because I am creating it. As I outlined in my blog post about project goals, the goals for the SOTU-db product align nicely with my own goals as a developer and researcher. Therefore, even if SOTU-db offered no additional or superior functionality to "SOTU," it would still be a worthwhile project for me. But there are real differences between what I envision for SOTU-db and the "SOTU" site.
First, our goals and audiences appear different. "SOTU" dedicates one of its five major tabs to an "essay" entitled "The {Sorry} State We Are In." Provocative and subjective, the essay strikes a tone I hope to avoid on SOTU-db. I hope SOTU-db is equally useful as a classroom tool and as a resource for professional and academic research; I don't feel an essay of this nature would help SOTU-db achieve that goal.
Secondly, "SOTU" relies for its analysis on frequency counts of words ("Statistical Methods" appendix). I am certainly interested in this approach and plan to use it for major parts of SOTU-db. But my interest goes far beyond this. I am interested not only in which words are unique, but also in which common words have been used in which contexts, how adjectives and connotations are used, and what patterns might be visible among and within presidential terms. For example, when presidents have used the word "Americans," what adjectives or actions have they ascribed to Americans? Does the answer change depending on historical period, political party affiliation, or whether the US was at war when the speech was delivered? These are the types of questions I hope to enable users of SOTU-db to answer -- and, importantly, the types of questions I want to encourage them to ask.
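To make the "Americans" example a little more concrete, here is a minimal Python sketch of one way such a question could be approached: counting the words that appear immediately before a keyword. This is only an illustration, not how SOTU-db itself works. The function name `words_before` and the sample sentences are made up for this example, and the approach counts neighboring words of any kind; picking out adjectives specifically would need a part-of-speech tagger.

```python
import re
from collections import Counter


def words_before(text, keyword, window=2):
    """Count the words found within `window` tokens before each use of `keyword`."""
    # Lowercase the text and split it into simple word tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    target = keyword.lower()
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == target:
            # Collect up to `window` tokens preceding this occurrence.
            counts.update(tokens[max(0, i - window):i])
    return counts


# Hypothetical sample sentences, not quotations from any real address.
sample = (
    "Hardworking Americans deserve better. "
    "We honor brave Americans who serve. "
    "Many hardworking Americans agree."
)

neighbors = words_before(sample, "Americans", window=1)
print(neighbors.most_common(3))  # [('hardworking', 2), ('brave', 1)]
```

Running the same count separately on groups of addresses (by era, party, or wartime status) would give a rough first look at whether the answer changes between them.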
A mobile version mockup of the main SOTU-db landing page
The Addresses
"State of the Union Addresses and Messages" at The American Presidency Project by John Woolley and Gerhard Peters appears to be the authoritative online resource for State of the Union addresses by US presidents. As they explain there, the tradition for delivering the "State of the Union" to Congress have changed over time, such that referring to them all as "speeches" or "addresses" is probably technically incorrect. Additionally, in recent decades, American presidents have often delivered an annual address at the beginning of their terms. As Peters explains,
"For research purposes, it is probably harmless to categorize these as State of the Union messages. The impact of such a speech on public, media, and congressional perceptions of presidential leadership and power should be the same as if the address was an official State of the Union."
I concur, in part, and plan to include these "non-official" SOTU addresses in the project (and to continue referring to them as SOTUs - at this phase, if the American Presidency Project lists it on its "State of the Union Addresses and Messages," I include it in SOTU-db and call it a SOTU). But that nagging phrase, "should be the same," raises precisely the type of question SOTU-db should be able to help answer.
The rise of incoming presidents delivering a pseudo-SOTU at the beginning of their term is relatively new (only since Reagan) and also coincides with an end to the tradition of presidents delivering a SOTU early in the year after elections that voted a new president into office. It would not be surprising to find material differences between the word choices of new presidents only weeks into the office and those of addresses given years into a presidency. Likewise, it is not inconceivable that presidents about to leave office within weeks would speak differently than those just beginning or in the midst of their terms. Can we isolate these speeches and see what words, topics, and styles stand out? Questions like these help to motivate me and to structure the project in a way that encourages users to ask, and find answers to, similar questions.