Diagnostic tools – or – the pretty visualization is not the end

As the semester and my first graduate digital history class wind down, I’ve been thinking a lot about building DH things for investigation vs. argument.  There’s a lot of good work on tools-as-theory, and whether a digital thing can be a satisfying argument, and an upcoming conference on argumentation in the digital humanities – so I’m not the only one.

I also just finished writing 1-2 pages – maybe 1,000 words – based on a diagnostic tool that took me over a month to build.  I’m hoping to spin what it tells me out into a longer article in the future, but for now I thought I’d share it here, with some commentary on how I made it, what it told me, and why it is not an effective argument.

One of my book chapters is on a group of enslaved and free people in Richmond who raised funds for victims of famine in Ireland.  The First African Baptist Church of Richmond raised just under $35 in 1847. While the amount per congregant was low (the church listed thousands of active members, though many were not able to attend regularly because of their enslavement), the donation itself was unusual in the church’s history: this was one of the first times that the congregation raised funds for people not connected with the church.  I have a much longer argument about the political work that this donation did, but I wanted to be able to make some concrete statements about congregants’ experiences in the 1840s.

This work was helped by the church minute books, which recorded the names of baptized, excluded and restored members (there were a lot of exclusions for adultery in the 1840s), as well as the names of the men and women who owned the congregants who were enslaved.  So I built a network – using Gephi, which benefits tremendously from the recent update – showing only relationships characterized by slavery, to see if any white Richmonders were particularly over-represented. (The interactive version was made with sigma.js and the Gephi plugin created by OII.)
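(For the technically curious: the network itself was built in Gephi and published with sigma.js, but the same kind of ownership edge list can be assembled and sanity-checked in R, which is where my behind-the-scenes analysis lives.  The sketch below is illustrative only – the file and column names are placeholders, not my actual data.)

library(igraph)

# Hypothetical edge list transcribed from the minute books:
# one row per (congregant, owner) pair.
edges <- read.csv("fabc_ownership_edges.csv", stringsAsFactors = FALSE)

g <- graph_from_data_frame(edges[, c("congregant", "owner")], directed = FALSE)

# Which slaveholders appear alongside the most congregants?
owner_degree <- sort(degree(g, v = unique(edges$owner)), decreasing = TRUE)
head(owner_degree)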

While some men and women owned more than one congregant, by and large this network was fairly diffuse.  Congregants obviously shared the religious and physical space of the church, but their relationships outside of the church did not seem to be conditioned by their enslavement by particular men and women. (There is an excellent and robust literature on enslaved people in urban spaces, resistance and community building, which I won’t recap here – but suffice it to say that scholars have charted many other ways of relating beyond ownership by the same person, and I assume those modes were at play in 1840s Richmond).

As I put together the database of congregants, I realized that a number of unusual names (Chamberlayne, Poindexter, Frayzer, Polland, among others) recurred among both slaveholding and enslaved people.  So I made another network, this one assuming that people who shared a surname had some kind of relationship (this is not a 100% defensible assumption – some of the more common shared names might have been happenstance).  With those kinds of connections, the network (which includes all of the same people as above) becomes much denser, with clusters that signify relationships based both in slavery and in (most often coerced) sex.
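(Again as an illustration only: the shared-surname assumption amounts to a self-join on surname.  A minimal R sketch – with a hypothetical people table standing in for my actual database – looks like this.)

library(igraph)

# Hypothetical table of congregants and slaveholders: columns name, surname.
people <- read.csv("fabc_people.csv", stringsAsFactors = FALSE)

# Connect every pair of people who share a surname (self-join, dropping
# self-pairs and duplicate orderings).
pairs <- merge(people, people, by = "surname")
pairs <- subset(pairs, name.x < name.y)

g2 <- graph_from_data_frame(pairs[, c("name.x", "name.y")], directed = FALSE)

# Surname clusters show up as connected components.
table(components(g2)$csize)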

It’s interactive!  It’s dynamic!  It’s a network!

It is not an argument.

At best, this is a tool that lets me locate an individual and see connections.  It relies on two kinds of relationships (and likely overstates the certainty of genetic relationships or previous ownership based on shared surnames).  It helped me to write two pages about the density of connections among black and white Richmonders, and to bolster claims about the broader relationships that the First African Baptist Church was embedded in.  It remains an investigative tool.

I think it could be helpful, which is why I am putting it on the internet, but it does not constitute argument.  It does not even constitute analysis (that happened behind the scenes in R).  It did take – from the start of transcription to now – over a month to build.

Was it worth it?  Well, I was able to see connections among the 800+ congregants mentioned in the minute books from 1845-1847 that I would not have been able to see just by reading the names.  I was able to place individuals in a broader social context.  I wrote two pages.  I think that work like this can be tremendously generative, but either happens behind the scenes and only lives on a researcher’s computer, or is presented as the end of an investigative process. This is firmly in the middle of the investigation, but I suppose that has value too.

Quick note: Timeline of famine philanthropy

I’m sitting down to tackle my introduction, and wanted to say something specific about the timeline of famine philanthropy. Tableau helped me track the total number of donors by organization.  This is a better measure than the total amount donated – at least until I go back and standardize British pounds and U.S. dollars – and it gives a good sense of the timeline of relief.
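(The aggregation itself is simple; a rough equivalent of the Tableau view in R, with hypothetical column names rather than my real file, would be something like the following.)

library(dplyr)

# Hypothetical donation table: columns donor, organization, date.
donations <- read.csv("famine_donations.csv", stringsAsFactors = FALSE)

donations %>%
  mutate(month = format(as.Date(date), "%Y-%m")) %>%
  group_by(organization, month) %>%
  summarise(donors = n_distinct(donor)) %>%
  arrange(organization, month)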

 

Streamlined Grading with Linked Documents

First comes the start of the semester; then comes grading; then comes the inevitable wondering about how to make grading less of a chore. I realized a few years ago that much of my dislike of grading came not from an aversion to reading student work, or even to writing comments. Rather, for me, it came from navigating the systems that meant that I was spending more time collecting, archiving and returning student projects than I was on giving thoughtful feedback.

I’d tried a number of tools meant to streamline grading, but rather than making things easier, each seemed to involve an ever-increasing number of steps:
* Course Management Systems: students upload papers, I download them, comment on them, grade them, upload the graded/commented document, and enter grades into a digital gradebook.
* E-mail submissions: students e-mail me papers, I save them, comment on them, grade them, enter their grades into a gradebook, calculate other graded components of the course, and send an e-mail back with course component grades as well as the commented and graded final project.
* Paper submissions: students give me papers, I mark them (with increasingly cramped handwriting), grade them, hand papers back to the students who are in class the day grading is finished, and subsequently field e-mails about course grades and arrange meetings with those who weren’t in class to pick up their papers.

Last semester I tried something new. I created a series of linked folders and documents that allowed students to easily submit work, and allowed me to grade papers quickly, leave comments, and give students ready access to their course work and grades. As a result, my grading went faster, papers were returned more promptly, and I felt like I was spending less time uploading, e-mailing and returning work, and more time crafting actual feedback.

This took some set-up time on the front end, but the result streamlined the acquiring and returning of papers. I used Dropbox and Excel, in large part because my institution gave me access to them, but I think that something similar could be developed using open-source tools.

Here’s how it worked:
* At the beginning of the semester I created one folder for the class and saved it to my personal Dropbox.
* Within that folder, I created one folder for each student.
* Within the class folder, I also created a master grading spreadsheet, which could track the grading categories (participation, short papers, long papers etc.) for each student.
* I then linked this master grading spreadsheet to individual spreadsheets for each student. (All of this can be accomplished in Excel by highlighting a cell in the individual student sheet, typing “=” and then navigating to the main gradebook and selecting the matching cell – see the example formula just after this list.) Setting these sheets up was the most time-consuming part of the process, and took about an hour for a fifteen-person class.
* Each student’s spreadsheet was saved in that student’s folder, along with Word documents in which I could enter comments on weekly reflection pieces.
* Finally, I shared each folder with the student to whom it belonged.
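For anyone setting this up, the cell link that Excel writes when you point an individual student sheet at the master gradebook looks something like the line below; the workbook, sheet and cell names here are placeholders for illustration only.

='[HIST101-Gradebook.xlsx]Master'!$C$4

Because each individual sheet simply references the master, a grade only has to be entered once – the linked cells pick it up, which is what keeps each student’s copy current.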

While this took some time to set up, by the time the semester started it was possible for students to upload work, and for me to grade papers, take attendance and track participation without ever having to log into a CMS, and without ever having to send students e-mails with their grades. When I updated the master spreadsheet, the individual spreadsheets updated as well. Students could check in on their grades and comments whenever they wanted, and my grading process was condensed to three steps:
i. Students upload papers.
ii. I grade and comment.
iii. I enter the grades in the master grading spreadsheet.

This system, like any other, is not perfect. One of the biggest concerns raised by my colleagues was that students could change the grades in their individual sheets. This didn’t happen this semester, and even if it did, the ability to track changes in shared folders would make it pretty easy to catch, but it still presents a possible administrative hurdle. My system also forced students to learn new tools, on top of those already required by the college to do similar tasks (for example, Moodle or Blackboard). Finally, it required considerable setup time up front to get each of the student folders and spreadsheets working properly.

At the end of the semester, I found that I preferred this approach to the grading systems I’d tried before. Students were able to access feedback more quickly. Because everything happened within students’ individual folders, it was easy to be sure that feedback was going to the right person, and I never had to worry about accidentally sending the wrong paper or the wrong grade to the wrong student. While this system didn’t make grading any more fun, it certainly mitigated one of my biggest stumbling blocks to getting grades done on time.

Digital History “From Below”: a call to action (and an abstract)

I’ll be heading to Kraków this summer for DH2016 – here’s the paper I’ll be giving.


 

Humanists – inclusive of digital humanists – are preoccupied with telling stories. Some of our most interesting subjects, however, have left only the barest of marks on historical records. Their stories are among the most captivating, but also some of the most difficult to access. This paper knits together recent trends in digital humanities practices that have helped us to elevate unrepresented voices with a discussion of how to elevate the marginalized within the DH community. It showcases select projects that undermine archival silences.[1]   It then argues that digital humanities practitioners should add these theories to the collection of tools currently used to forward social justice projects in DH spaces.

 

Elevating the Archivally Silenced

Various methodologies have been adopted to address the problem of how to tell stories about people who left behind few records.   In the 1970s and 1980s, practitioners of “history from below” worked to elevate narratives about “people with no history,” by chronicling the everyday lives of peasants and non-elites.   At the same time, practitioners of the “new social history” turned to cliometrics – and adopted methods that would be familiar to those who work with “big data” today – to highlight trends about marginalized peoples from historical data like censuses, probate records and financial documents.

 

There have been various resurgences and developments in these methods in the intervening four decades. These include practices of reading archives “against the grain” to get at the unstated assumptions that historical actors made about those they held power over.   They also include theoretical approaches that advocate the reading of silences to understand those whose voices were intentionally obscured by official recorders and gatekeepers.

 

Marginalizations Within DH

Questions about whose voices are elevated and whose are silenced have also long been a theme in DH scholarship and discourse. These questions seek to unpack the ways in which DH as a field is exclusionary. This is a much (though still not sufficiently) referenced problem, taken up in panels at past DH conferences that have asked how DH research can address (and remedy) social problems.

 

Digital humanities scholarship has also begun to address problems of access within the broader DH community, and the barriers erected to women and people of color in particular. For example, Adeline Koh has argued that we need to examine the ways in which DH publics are constituted, in order to better understand the creation of “limits of the discourse that defines the idea of a digital humanities ‘citizen.’”   Similarly, Tara McPherson has argued that we must see the evolution of DH as a field shaped by structural inequalities – of race, class and gender – which accompanied the rise of computation technologies.

 

A Knitted View

These are much-needed interventions, and they help us to understand the evolution of our field as one in which certain groups have been marginalized and others have been centered. These conversations also mirror methodological debates within history about whose voices to elevate, and under what circumstances. This paper complements extant work by arguing that theoretical interventions concerning current structural inequalities must be brought to bear on the past, and that digital methodologies are ideally suited to elevating subsumed voices in the present. It further demonstrates that these projects, the theories that underlie them, and current work to make DH more equitable should be read together to further the practice of digital history and humanities “from below.”

 

Bastian, J. (2003). Owning Memory: How a Caribbean Community Lost Its Archives and Found Its History. Westport, Conn.: Libraries Unlimited.

Bhattacharya, S. (1983). ‘History from Below.’ Social Scientist, 3–20.

Farge, A. (2015). The Allure of the Archives. New Haven: Yale University Press.

Fuentes, M. (2010). Power and Historical Figuring: Rachael Pringle Polgreen’s Troubled Archive. Gender & History 22, no. 3: 564–84.

Gallman, R. (1977). Some Notes on the New Social History. The Journal of Economic History 37, no. 1: 3–12.

Koh, A. (2014). Niceness, Building, and Opening the Genealogy of the Digital Humanities: Beyond the Social Contract of Humanities Computing. Differences 25, no. 1: 93–106. doi:10.1215/10407391-2420015.

McPherson, T. (2012). Why Are the Digital Humanities So White? Or Thinking the Histories of Race and Computation. in Gold, M (ed) Debates in the Digital Humanities. Minneapolis, MN: University of Minnesota Press.

Trouillot, M. (1995). Silencing the Past: Power and the Production of History. Beacon Press.

 

 

[1] These might include work like Ben Schmidt’s elaboration upon late twentieth-century cliometrics and his use of “big data” methods to explore historical sources (http://benschmidt.org/projects/digital-humanities-research/); maps like Vincent Brown’s “Slave Revolt in Jamaica,” which uses sources produced by slaveholders to argue for the agency and tactical prowess of enslaved people (http://revolt.axismaps.com/map/); and Michelle Moravec’s use of metadata to “unghost” lesbian women in the past (http://michellemoravec.com/).

d3.js + R > Gephi (or, why network analysis helps with history)

Gephi is a very useful tool.  I’m very much looking forward to the new release that seems always on the horizon.  In the meantime, though, every time I open Gephi it crashes, and then I dive down a long rabbit hole of trying to re-write the program code, and then I get angry and go home.  So I’ve been delighted to find that a combination of R (for manipulating and analyzing the data) and d3.js (for visualizing the data) does most of the work of Gephi with much less frustration.
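(For anyone curious about the hand-off: once a network lives in R, getting it into d3.js is mostly a matter of writing a nodes/links list as JSON in the shape d3’s force layout expects.  A minimal sketch, with placeholder data rather than my own, is below.)

library(jsonlite)

# Placeholder nodes and links standing in for the real network.
nodes <- data.frame(id = c("person_a", "person_b", "person_c"), stringsAsFactors = FALSE)
links <- data.frame(source = c("person_a", "person_a"),
                    target = c("person_b", "person_c"),
                    value  = c(2, 1),
                    stringsAsFactors = FALSE)

# d3's force layout reads an object with "nodes" and "links" arrays.
write_json(list(nodes = nodes, links = links), "network.json")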

I’ve been taking Kieran Healy’s work on Paul Revere and network centrality and applying it to a cohort of men who served on the boards of philanthropic organizations in New York in the 1840s. I am particularly interested in the officers of the General Relief Committee for the Relief of Irish Distress of the City of New York. These men – Myndert Van Schiack, John Jay, Jacob Harvey, George Griffin, Theodore Sedgewick, Robert B. Minturn, George Barclay, Alfred Pell, James Reyburn, William Redmond and George McBride Jr. – were deeply politically connected, but don’t seem to have had much of a relationship to one another.

Healy’s script and Mike Bostock’s d3 blocks helped me to build a matrix that tracked relationships between philanthropists via organizations, noting the number of organizational connections that different pairs of men shared, and another matrix that tracked relationships between philanthropic organizations and social clubs via philanthropists, noting the number of men that each pair of organizations shared.  I used the former to build a force-directed network diagram, which, in combination with some R-based analysis, suggests that while the New York Famine Relief Committee officers didn’t often serve on other committees together, they shared other social connections.
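(Following Healy’s example, both matrices come from a single person-by-organization incidence table: multiply it by its transpose one way and you get men linked by shared boards, the other way and you get organizations linked by shared men.  The sketch below uses a hypothetical file name – it shows the idea, not my exact script.)

# Hypothetical 0/1 incidence matrix: rows are men, columns are organizations.
affiliations <- as.matrix(read.csv("ny_board_memberships.csv", row.names = 1))

# Men connected by the number of boards they shared.
person_by_person <- affiliations %*% t(affiliations)
diag(person_by_person) <- 0  # drop each man's count of his own boards

# Organizations connected by the number of men they shared.
org_by_org <- t(affiliations) %*% affiliations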

For example, Jonathan Goodhue was not a member of the famine relief committee, but served on other committees with nearly every General Relief Committee officer.  Of the New York famine relief committee officers, Jacob Harvey was the most centrally connected.  This data has pointed me in some new archival directions, but it also gives a much better sense of the ways in which people were connected to one another than a comparable textual description could.
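(For readers who want to poke at something similar, the R side of that analysis can be as simple as loading the person-by-person matrix from the sketch above into igraph and comparing a few centrality scores; the measures shown here are illustrative rather than a record of exactly what I ran.)

library(igraph)

g <- graph_from_adjacency_matrix(person_by_person, mode = "undirected",
                                 weighted = TRUE, diag = FALSE)

sort(degree(g), decreasing = TRUE)[1:5]               # most connected men
sort(betweenness(g), decreasing = TRUE)[1:5]          # men who bridge otherwise separate boards
sort(eigen_centrality(g)$vector, decreasing = TRUE)[1:5]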

 

 

I also built a network diagram showing relationships among the different newspapers reporting on the famine, which clusters newspapers that were more inclined to cite each other.
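(The clustering here is just community detection on a citation edge list; a hypothetical sketch in R, with placeholder file and column names, would look something like this.)

library(igraph)

# Hypothetical edge list: columns citing, cited, n (number of citations).
cites <- read.csv("newspaper_citations.csv", stringsAsFactors = FALSE)

gn <- graph_from_data_frame(cites, directed = FALSE)
E(gn)$weight <- cites$n

communities <- cluster_louvain(gn)
membership(communities)  # which cluster each newspaper falls into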

 

Digital Projects at the AHA (now with projects from THATCamp)

As many others have pointed out – on Twitter, in blog posts, and in person – this was a good year for digital at the AHA, and a great year for some wonderful and innovative digital projects.  I’ve been compiling a list of projects mentioned on Twitter and in panels, and I thought I’d share them here (in no particular order).  I’m sure it’s not complete – and that other wonderful digital projects were debuted and mentioned at the AHA.  I’d be happy to add to the list if folks want to tweet their projects.

These are a few of my favorite maps

I’m putting together an aspirational syllabus for a digital humanities/mapping course, and have been thinking about my favorite maps, and why they work so well.  Here is a very-not-complete list of my current greatest hits:

Slave Revolt in Jamaica, 1760-1761: a cartographic narrative.

This is, by far, my favorite digital mapping project.  I’ve seen Vincent Brown speak on it, and I was quite impressed by his articulation of why we need a map like this to understand enslaved rebellion.  Because records of these uprisings tend to have been produced by ruling elites who were actively opposed to representing enslaved resistance as anything other than barbarous and futile, it would be easy to think that this uprising – and many others like it – were haphazard and poorly planned.  Brown’s map, on the other hand, reads the colonial archives against the grain to show us the strategy that underlay this revolt.  I love that he uses sources in which obscuring enslaved agency is a feature rather than a bug to highlight that agency.

Touring the Fire

A little less high tech, but still a great example of how a geospatial perspective can give us new, or at least different information about an historical event.  One of the persistent fictions about the Chicago fire is the culpability of Mrs. O’Leary’s cow, so it’s interesting to see how the fire spread, but also to treat the path of the fire like a walking tour, and to map it onto Chicago’s geography today.

London Soundmap

This is just ridiculously cool (and reminds me of a book I just finished about London’s underground rivers).  It borrows aesthetically from the iconic tube maps, but instead of information about subways gives us the sound of underground waterways.  There are some other great soundmaps on this site, including ambient London noise, the sound of the Thames estuary, and a handy map of the most common sounds in different parts of the city.  The whole thing is worth exploring.

While we’re talking about aural mapping…

Here’s a project which uses immigration data to create a true aural map of changes in American demography over time.

And finally, everything NOAA does, but especially their geospatial services.

Now it’s all about convincing the undergraduates that maps are cool…

On the embargo question

I’d been waiting to start this until my dissertation abstract was actually available through Proquest, but I’ve recently learned that it might take as few as 8 and as many as 20 weeks (roughly 2-5 months) for the abstract to appear, counting from the time NYU submitted the darned thing – which was itself three months after I submitted final revisions.  That’s between 5 and 8 months between my final version and the world of accessible-via-Proquest.  This, frankly, seems like something worth throwing into the debate about embargoing.  I embargoed my dissertation – largely because of some truly horrible stories I’d heard from people who found their work used by more senior scholars, or, in more cases, found that the archival road-map laid out in their dissertation had been used by someone else to publish more or less the same argument before they’d been able to get a book contract.  I know that there are many and varied reasons not to embargo, and I’m toying with the idea of asking Proquest to lift it, but as a junior, untenured scholar, the risks seem to outweigh the reward.

That being said, as an advocate for the digital humanities generally, and as someone who has benefited from open-source dissertations in particular, I’d like to make some of my work available to a wider audience.  (An aside: what I’d really like to see is for the AHA to have some mechanism for young scholars to make their dissertations widely available – in part because I find it frustrating that my only venue for dissertation publishing is a for-profit company – and to work with acquisitions editors to create a culture in which an available dissertation is almost never an impediment to a book deal.)  It seems like conference papers are a good place to start.  For one, this is intellectual work that has already been put out in a public space (though attendance at conferences and particular panels obviously varies), and for another, a lot of the material I’ve presented at conferences over the years has been excised from the dissertation, or changed so much as to make it truly different work.  In that spirit, I’ve created a new page here – one at which I’ll post selected conference papers, as delivered, that aren’t part of anything currently in process or out for review.

I’d also like to say – both in this post and at that new page – that I’m happy to share my dissertation with anyone who wants to see it.  Just e-mail me and ask.  I know that this is far from the spirit of true open source access to academic work – but for me it’s a start.

(Live(ish))blogging the survey

First, a long digression:

I got to sit down with reps from Gale today for about five hours to talk about all of the tools they have for teaching.  Among some other fun things, we were able to test drive Artemis, which will eventually aggregate some? many? of Gale’s primary sources (or, what Gale markets as primary sources – a lot of things that aren’t sold as primary sources, like literary criticism, could still be useful in 20th century U.S. history classrooms, for instance), and what they’re calling “term clusters,” which is basically an interactive pie graph that shows the frequency of words that abut your search term.  It looks like it will be a pretty useful and robust search engine once everything is integrated, though like any archive it’s limited by what Gale’s editors acquire, how they subject index what they have, and (particular to digital archives) how well it’s been OCRed.
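(To be clear about what a “term cluster” is doing under the hood – or at least a toy version of it, since I don’t know Gale’s implementation – it is essentially counting the words that fall within a few words of the search term.  A bare-bones base R illustration, with a two-line placeholder corpus:)

corpus <- c("famine relief committee met in new york",
            "the relief committee raised funds for famine victims")
term <- "relief"
window <- 3  # how many words on either side to count

neighbors <- unlist(lapply(strsplit(corpus, "\\s+"), function(words) {
  hits <- which(words == term)
  idx <- unlist(lapply(hits, function(i) max(1, i - window):min(length(words), i + window)))
  words[setdiff(unique(idx), hits)]
}))

sort(table(neighbors), decreasing = TRUE)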

We were challenged to think about how we’d use Gale resources in the classroom, and there was a lot of talk about how having a not-infinite-but-still-pretty-vast universe of possible primary sources would challenge students to think more creatively about their topics, and how the analytic tools like term cluster will help students identify trends that they might not otherwise have seen.

Two thoughts:

1) Sometimes an infinite, or seemingly infinite universe can be a great thing.  When a student is working on a year-long senior thesis, having millions of pages of documents to draw from could be really productive.  But, there’s also something to be said for well curated small collections of primary sources, especially for introductory courses, where students aren’t sure how to even approach analyzing a primary source, let alone picking one about which they can make an argument that will sustain them throughout the paper writing process.

2) I’m always struck by the ways in which online databases or search engines try to replicate the functionality of a physical library.  I can’t count the number of times I’ve been told that looking in the metadata for a book’s subject, and then looking for the subject headings that immediately follow and precede the one you’re interested in, is like browsing the shelves at a library.  I’ve heard similar things about the Serendip-o-matic tool, as a way to replicate the lucky happenstance of coming across an unforeseen or mis-filed document in a brick and mortar archive.  I love Serendip-o-matic, and I’ve been doing the proximity subject searches since college, so I’m not saying that these are bad tools or workflows, but I wonder about how effective it is to try to replicate the research experience of a library online.  On the other hand, we know how to research in libraries and archives, and it doesn’t seem so wise to reinvent the research wheel if we don’t have to – but browsing by proximate subject heading, or looking for high frequencies of words that cluster around any given search term, will never be the same as browsing the stacks.  Finding a document through Serendip-o-matic (which, again, I think is a fantastic tool, and I’m not sure its makers would characterize it as replicating the eureka-in-the-archive experience; I’ve just seen it described that way) will never be the same as coming across a mis-filed pouch of heroin, say – or, perhaps more likely, an archivist who knows you’ve been pulling material on one subject bringing you something related that you hadn’t thought to ask for.

At any rate, I’m not sure if, or how I’m going to be using Gale’s, or anyone else’s online databases for teaching in the future.  For this imminent semester, I’ve settled on the Major Problems in American History reader, because I really like the interpretive essays, and find the transcription and gobbeting of documents by experts in the field, of say, colonial American history, to be far superior to anything I might do on my own, even drawing from a near-infinite corpus.

But as to the point of this post, I’ve been thinking that it might be a useful exercise in my own pedagogical development, and possibly a useful contribution to the conversation started on the Junto blog last week about teaching the survey, to check in periodically about this, my first time teaching U.S. history to 1877 on my own.  Consider this the first post of that project, and if I’m really systematic, perhaps I’ll go the digital document reader route next semester and compare notes.