Friday, January 18, 2008

Chris Wehner's SoldierStudies.org

This is the first of two reviews of similar transcription projects I wrote in correspondence with Brian Cafferelli, an undergraduate working on the WPI Manuscript Transcription Assistant. In this correspondence, I reviewed systems by their support for collaboration, automation, and analysis.

SoldierStudies.org is a non-academic/non-commercial effort like my own. It's a combined production-presentation system with simple but effective analysis tools. If you sign up for the site, you can transcribe letters you possess, entering metadata (name and unit of the soldier involved) and the transcription of the letter's text. You may also flag a letter's contents as belonging to N of about 30 subjects using a simple checkbox mechanism. The UI is a bit clunky in my opinion, but it actually has users (unlike my own program), so perhaps I shouldn't cast stones.

Nevertheless, SoldierStudies has some limitations. Most surprisingly, they are doing no image-based transcription whatsoever, even though they allow uploads of scans. Apparently those uploaded photos of letters are merely to authenticate that the user hasn't just created the letter out of nothing, and only a single page of a letter may be uploaded. Other problems seem inherent to the site's broad focus. SoldierStudies hosts some WebQuest modules intended for K-12 pedagogy. It also keeps copies of some letters transcribed in other projects, like letter from James Booker digitized as part of the Booker Letters Project at the University of Virginia. Neither of these seem part of the site's core goal to "to rescue Civil War letters before they are lost to future generations".

Unlike the pure-production systems like IATH MTD or WPI MTA, SoldierStudies transcriptions are presented dynamically. This allows full-text searching and browsing the database by metadata. Very cool.

So they've got automation mostly down (save the requirement that a scribe be in the same room as a text), analysis is pretty good, and there's a stab at collaboration, although texts cannot be revised by anybody but the original editor. Most importantly, they're online, actively engaged in preserving primary sources and making them accessible to the public via the web.

No comments: