Problematizing the Idea of Data in the Beehive

by David Ragnar Nelson

Did Pastorius think about data like we think about data?

Pastorius’s Beehive manuscript, with its countless cross-references and indices, is a clumsy physical book. To follow the connections he makes between entries, the reader must laboriously page through the three-volume work in search of the correct reference. A digital edition that maps these connections automatically for the user thus feels like an ideal format for the Beehive.

Many scholars have invoked digital metaphors when discussing the work of reference in the Beehive. For example, Anthony Grafton compares the associations Pastorius draws between entries to the hyperlink:

“Associations served Pastorius very much as hyperlinks served us, both tempting and enabling him to leap from one text or subject to another. Every small tag from a great text had associations for him, which spurred him to call up and write down passages from other texts.” (6)

In creating annotations for each entry in the Beehive, we are essentially creating hyperlinks that will allow the user to navigate quickly between two related, but physically different, ideas. For example, Pastorius’s entry for “Cross” in the alphabetical section references “adversity” and “Sufferings” as well as numerical entries 498 (Sufferings) and 1505 (Cross). Ideally, the user will be able to quickly navigate between these various entries, as when using hyperlinks on a website.

Pastorius's entry for "Cross"

Here, the digital form of the hyperlink works well for the manuscript because we can navigate between entries. Pastorius’s cross reference “Sufferings” directs the user to a discrete entity in the book, namely an entry. The user will be able to seamlessly call up this entry and compare.

Pastorius's entry for "Sufferings"

As we can see, “Sufferings” comes with its own set of cross references that pull the reader off in other directions. Being able to navigate these quickly and easily will be a huge aid to researchers.

However, in creating these annotations, we are not only creating hyperlinks. A hyperlink is a good model for the work of the Beehive because it links the user directly from a distinct word on one page to another page. This means a hyperlink does not create a reference between two discrete points. Rather, it creates a reference between one discrete point (the single word on one page) and one larger unit (an entire entry). In creating annotations, we are simultaneously creating data, and the data we create depart more significantly from the form of Pastorius’s Beehive.

Sticking with the example of “Cross,” the entry for “Cross” would have four data points: “adversity,” “Sufferings,” “Sufferings,” and “Cross.” One can see how mapping out the data in this tagged way easily shows the semantic resonances Pastorius tries to draw out in the Beehive. But let’s look more closely at entry 1505.

Pastorius's entry "1505"

We see entry 1505 is actually three entries in one. Here, we need to decide whether Pastorius meant “Cross,” “Conspiracy,” or “Popish plots.” As “Cross” is identical to the reference we are looking for and the other two have no obvious connection to “Cross,” we can safely assume Pastorius meant to refer to “Cross.” In our data set, we’ll have a node connecting the alphabetical entry “Cross” with the numerical entry “1505 [Cross].”
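To make the idea of data points concrete, the four references in “Cross” could be recorded roughly as follows. This is only a minimal sketch: the `Reference` class, its field names, the `targets` helper, and the bracketed labels are hypothetical choices made for illustration, not the schema the project actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reference:
    """One cross-reference, treated as a single data point (hypothetical model)."""
    source: str  # the entry in which the reference appears
    target: str  # the entry the reference points to
    kind: str    # "word" if given as a word, "number" if given as an entry number


# The four data points for the alphabetical entry "Cross" discussed above.
cross_refs = [
    Reference("Cross", "adversity", "word"),
    Reference("Cross", "Sufferings", "word"),
    Reference("Cross", "498 [Sufferings]", "number"),
    Reference("Cross", "1505 [Cross]", "number"),
]


def targets(refs, source):
    """Return every target that a given source entry points to, in order."""
    return [r.target for r in refs if r.source == source]


print(targets(cross_refs, "Cross"))
```

Note that the record “1505 [Cross]” already encodes our editorial decision: nothing in it shows that entry 1505 also contains “Conspiracy” and “Popish plots.”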

In some numerical entries, however, Pastorius does not provide us with entries that create neat nodes connecting one entry to the other. Let’s take, for example, the index entry for “insurance”:

Pastorius's index entry for "insurance"

Here, Pastorius directs us to entry 914, which we see below:

Pastorius's entry "914"

Here, the choice is not so obvious. Pastorius lists “Security,” “Security, he that promises for another,” and “Safety.” All of the words show a tight semantic coherence among themselves, and any of the three could conceivably have something to do with “insurance.” Since we see the word “insurance” in the reference for “Security, he that promises for another,” we assume Pastorius meant to refer specifically to this entry and create a node between that entry and “insurance” in the index. But in the physical book, the reader could easily let her eyes wander between the various entries and see the connections between them. With the idea of the hyperlink, where we link to a whole entry, this is still possible. But the discrete data points do not allow for this complexity. We could create separate nodes for all three entries, but then we might be creating a stronger connection than Pastorius intended.
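The two options for “insurance” can be sketched as a small modeling choice. The code below is an illustration under stated assumptions: `SubEntry`, `mentions`, and `link_targets` are hypothetical names, and `mentions` records only the one fact noted above (that the word “insurance” appears under “Security, he that promises for another”), not a transcription of the entry.

```python
from dataclasses import dataclass, field


@dataclass
class SubEntry:
    """One headword within a numbered entry (hypothetical model)."""
    number: int
    headword: str
    # Words known to appear under this headword; left empty where not recorded.
    mentions: set = field(default_factory=set)


entry_914 = [
    SubEntry(914, "Security"),
    SubEntry(914, "Security, he that promises for another", {"insurance"}),
    SubEntry(914, "Safety"),
]


def link_targets(term, sub_entries, strict=True):
    """Choose which headwords an index term should be linked to.

    strict=True links only to headwords that mention the term (one node);
    strict=False, or no match at all, links to every headword in the entry.
    """
    matches = [s for s in sub_entries if term in s.mentions]
    if strict and matches:
        return matches
    return list(sub_entries)


one_node = link_targets("insurance", entry_914)
all_nodes = link_targets("insurance", entry_914, strict=False)
print([s.headword for s in one_node])
print([s.headword for s in all_nodes])
```

Either setting forces a decision the page itself leaves open. The strict version hides “Security” and “Safety” from the link, while the loose version asserts three equally strong connections.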

The index entry for “Kropf” is similarly complex.

Pastorius's entry "Kropf"

The index entry for “Kropf,” meaning goiter or canker in German, leads to entry 681, which we see below:

Pastorius's entry "681"

Here, our choices are “Neck,” “Throat,” “Wen,” and “Rickets.” Again, we see the word “Kropf” in the entry “Wen,” so we use that for our annotation. Nonetheless, it is impossible to say that Pastorius did not mean for the reader to see the possible connections between all the words listed here.

As Tobias Blanke, Elena Pierazzo, and Peter Stokes have remarked, a digital manuscript is a representation of the manuscript, not the manuscript itself. In creating these representations, we allow the user a different way to experience the work. Some of these interfaces may make using the manuscript easier. But our representations are simultaneously interpretations of the work. In making decisions about how best to display the work or what metadata to create, we interpret and potentially modify the work. Thus we cannot mistake the data our annotations create for the manuscript itself. The nature of reference in the Beehive is more complex than current digital models can account for. This does not mean we cannot use the data or that the data do not tell us valuable information about the manuscript. However, it is important to remember that the nature of reference and association in the physical manuscript remains more complex than whatever model the data create. The experience of viewing digital annotations, spreadsheets of data, or data visualizations of the Beehive will never completely substitute for the experience of working with the manuscript itself.

Works Cited:

Blanke, Tobias, Elena Pierazzo, and Peter Stokes. “Digital Publishing Seen from the Digital Humanities.” Logos 25, no. 2 (2014): 16-27.

Grafton, Anthony. “The Republic of Letters in the American Colonies: Francis Daniel Pastorius Makes a Notebook.” American Historical Review 117, no. 1 (February 2012): 1-39.