LIS432 Cultural Heritage Informatics SPR16: Crowdsourcing: The Royal British Columbia Museum and NYPL Labs' What's On the Menu?

Introduction

The Royal British Columbia Museum

The Royal British Columbia Museum is a natural and cultural historical museum and archive in Victoria, British Columbia, Canada. It houses over 7 million artifacts (Royal British Columbia Museum, 2015).

Selection

I chose the Royal British Columbia Museum [RBCM] Transcribe project because it gave me the opportunity to transcribe personal journals. I had done this type of transcription before as part of the Helen Keller archive, but that project had ended before I started work on this assignment. So the RBCM it was!

I selected the personal journal of Martha Douglas Harris (1854-1922), the youngest daughter of Sir James Douglas, first governor of British Columbia. This journal, kept by Martha Douglas from 1872 to 1873, covers her trip from her home in Victoria, through the United States, and on to England and continental Europe (“Royal BC Museum | Transcribe,” n.d.).

Interface and Usability

The first thing a user sees is a series of thumbnails for various collections of documents, with an image and title giving an idea of what sort of documents each collection contains. This encourages users to choose the type of documents they are most interested in working with, which I imagine makes the process more enjoyable and could cause users to remain on the site longer.

The interface allowed users to see a thumbnail of the pages prior to selecting one to work on, which permitted me to choose something I had adequate time to finish in one sitting. The interface was fairly typical, consisting of a document window with arrow keys for moving around the page and a magnifying glass icon for zooming in on the image. I found that I was able to navigate these pretty easily, though the site response was slow enough that it was hard to determine when I had moved or zoomed enough, because the page would catch up several seconds after I stopped. The next section of the page was for entering my transcription, which consisted of nothing but a big white box.

Here's what I entered:

Receipts of Money

1872 $          cts

Aug     11        Received from Galen                                     75        .

"           18 [struck] Recieved from Bassenger 144      00

"           28 Bassenger Ja wth Received f B.H.A $100 00

"           " Received from GR Patton $5 00

Sept     17 Received from [Struck] L S "

Joseph the sum of 50        "

28

144

100

5

------

277

   5

------

282

I clicked a “?” icon, which led me to a page describing wiki-style markup, so that gave me the idea that I should attempt to format my work using that markup. It was very useful for mimicking strikeouts on the original page, but the more complex formatting options from the help page did not render as expected. Not only were the data items not aligned at all, but the dashes that should have created a horizontal rule instead created text boxes around some of the numbers.

This was frustrating, since the page I was transcribing was not formatted like a journal, with sentences and paragraphs. I needed to be able to use columns, and even depict an arithmetic problem. The current implementation of transcription formatting just does not handle table or column organization at all. This problem is likely to have affected only a few pages in the journal, but I imagine other types of historical documents might contain many tables, calculations, or other non-block formats. For those instances, the transcription experience is both confusing and frustrating.

The project could benefit from the addition of more scaffolding, which librarian and blogger Trevor Owens asserts “allows interested amateurs to participate…through their actions on the website, without needing the skills and background of a professional” (Owens, 2013). This would probably mean creating a new web form that breaks the task of transcription down into smaller chunks, in order to take formatting worries away. It probably wouldn’t be worthwhile for the small number of pages in this database that are tabular in nature, but the idea is very important to think about when planning a crowdsourced project. Owens puts it this way: “What expertise can we embed inside the design of our tools to magnify user efforts? How can our tools put a potential user in exactly the right position, with the right knowledge, just at the moment he or she needs it, to accomplish a given activity?” (Owens, 2013).
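To make the scaffolding idea concrete, here is a minimal sketch of what a field-by-field entry for a tabular page might capture, and how the layout and arithmetic could then be handled by the tool rather than the transcriber. The `LedgerRow` fields and both function names are hypothetical; nothing here reflects how the RBCM site actually works. The sample rows and the column of figures are taken from my transcription above.

```python
from dataclasses import dataclass


@dataclass
class LedgerRow:
    """One line of a tabular page, entered field by field (hypothetical schema)."""
    date: str
    description: str
    dollars: int
    cents: int = 0


def render(rows):
    """Lay the rows out in aligned plain-text columns with a ruled total."""
    width = max(len(r.description) for r in rows)
    lines = [
        f"{r.date:<8} {r.description:<{width}} {r.dollars:>5} {r.cents:02d}"
        for r in rows
    ]
    total_cents = sum(r.dollars * 100 + r.cents for r in rows)
    lines.append("-" * len(lines[0]))
    lines.append(
        f"{'':<8} {'Total':<{width}} {total_cents // 100:>5} {total_cents % 100:02d}"
    )
    return "\n".join(lines)


def column_total(values):
    """Add up a column of figures so a transcribed sum can be checked."""
    return sum(values)


# Illustrative rows based on the page I transcribed.
rows = [
    LedgerRow("Aug 18", "Received from Bassenger", 144),
    LedgerRow("Aug 28", "Received f B.H.A", 100),
    LedgerRow("Aug 28", "Received from GR Patton", 5),
]
table = render(rows)
print(table)

# The sum written on the page: 28 + 144 + 100 + 5, then 5 more.
subtotal = column_total([28, 144, 100, 5])
grand_total = subtotal + 5
print(subtotal, grand_total)
```

With a form like this, the transcriber only types values into labeled boxes; alignment and the horizontal rule are generated, and a mismatch between the entered figures and the written total could be flagged for a second look.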

There are no points, no head-to-head competition between transcribers, and no other overt "game" elements in the interface.

Data and Analysis

One interesting thing for me about this site was a total lack of apparent tracking data. No login was required, so users could not be tracked that way to determine how long they stayed on the site or what level of contribution they made to transcribing. Perhaps this tracking is being done behind the scenes using IP addresses, but no amount of searching their site, the RBCM blog, or scholarly sources turned up any tracking or mention of how the project was going. It launched in 2015, so it's possible that the figures just haven't been released yet.

There is no publicly available evidence that RBCM plans to make the dataset public; this doesn't really surprise me, since the tradition in the museum world is not so free and open as that of the library world. It will be interesting to see what they do release, since the project is reaching its anniversary right now. After the grant year has finished and the project either closes or becomes supported through other means, it seems likely that more information will come to light, either as formal published work, or in the blog and other publications released by the museum.

Goals and Sustainability

The goal of the project is very simple: "to improve the Royal BC Museum and Archives’ public accessibility by turning handwritten, audio, and video records into searchable data" (“Royal BC Museum | Transcribe,” n.d.). At first glance it seems that the plan consists solely of leveraging volunteers for the purpose of transcribing the materials faster than could be accomplished in-house. It seems unlikely to be the only real goal, so future reports may reveal more.

NYPL Labs: What's on the Menu?

The What's on the Menu? project by NYPL Labs extends an earlier project to digitize the now 45,000-item collection of menus begun by library volunteer and schoolteacher Miss Frank E. Buttolph (1850-1924). This digitization project asks users to transcribe and verify the contents of many vintage and historical menus from all around the city and from other places, too.

Selection

I selected it because in the area of user engagement and interaction it is very similar to RBCM's Transcribe, while it is very different in transparency, data availability, and maturity.

Interface and Usability

The interface was very similar to the RBCM site, but the experience of working with it was more pleasant because the server was more responsive. Another factor that made transcribing easier for this project is the fact that menus are typeset rather than handwritten; I am getting better at reading historical handwritten documents, but it's still a challenge compared to printed text.

Because the approach of this project is more granular in nature, asking for transcription of individual dishes rather than entire menus, the process of transcription is much less frustrating. I don't need to attempt to format the transcription at all, so I'm free to focus on the content. I recognize this now, having read Owens, as a “scaffolded” interface (Owens, 2013).

The menu project is much older than the RBCM one, so many of the menus have been transcribed at least partially. Most of the work with which users are initially presented consists of verifying existing transcriptions, or pages that have very little text to transcribe. Perhaps verification is presented first because it is somewhat easier to do than transcription, allowing new users to ease into the process. The web page design here is more retro-styled, and uses cute old-fashioned clip-art hands to point out the best place to begin.

Again there is no overt gamification in the project. Transcribers work with the menus because they are interesting or because they want to help the library, and not for an artificial competition or reward.

Data and Analysis

Because this is a more mature project, there is information regarding usage numbers and the overall sense of progress. I was surprised to discover that there are "approximately 45,000 items" in the physical menu collection. Clearly there is a lot of transcription still to come! The top of the project's home page shows "1,332,229 dishes transcribed from 17,545 menus" to date, making it very clear to every visitor that the collection is big and that a lot of work has already been done (“Whats on the menu?,” n.d.).

The data set generated by the "crowd" is also open for download and use by the public, either in CSV format for use with spreadsheets, or through an application programming interface [API] for developers who may wish to use the data set from within their own projects (“Whats on the menu? Data,” n.d.). This is a very sophisticated approach that web developers will be comfortable working with, positioning the project to get outside the library even more than it already has.

The spreadsheet data focuses exclusively on the menus themselves; no user data is present. This shows a focus on the outcomes — more digitized and verified menus — instead of who is using the project and how much time they are spending. True to the bragging on the project’s home page, the spreadsheet data comprises 4 separate tables and includes almost 1.5 million individual menu items, from over 17,500 menus.
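For readers curious how the separate tables fit together, here is a small sketch of joining two of them with Python's standard `csv` module to count transcribed items per menu. The column names (`id`, `menu_id`, `menu_page_id`, `dish_id`) and the function name are assumptions for illustration, and the sample rows are made up; check the header rows of the actual download before relying on any of this.

```python
import csv
import io
from collections import Counter


def dishes_per_menu(menu_page_csv, menu_item_csv):
    """Count transcribed items per menu by joining items to pages to menus."""
    # Map each page id to the menu it belongs to (assumed column names).
    page_to_menu = {
        row["id"]: row["menu_id"] for row in csv.DictReader(menu_page_csv)
    }
    counts = Counter()
    for item in csv.DictReader(menu_item_csv):
        menu_id = page_to_menu.get(item["menu_page_id"])
        if menu_id is not None:  # skip items whose page is not in the table
            counts[menu_id] += 1
    return counts


# Tiny made-up sample standing in for the downloaded files.
pages = io.StringIO("id,menu_id\n1,100\n2,100\n3,200\n")
items = io.StringIO(
    "id,menu_page_id,dish_id\n10,1,7\n11,2,8\n12,3,7\n13,9,7\n"
)
counts = dishes_per_menu(pages, items)
print(dict(counts))
```

To run this against the real export, the two `io.StringIO` objects would be replaced with files opened from the downloaded CSVs.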

Goals and Sustainability

The project launched in 2011 with grant support and is still running today, demonstrating strong sustainability. I wonder if this is a function of being part of a large and well-funded library system, or the press attention the project received, or simply the fact that people find the menus entertaining and useful. The WOTM about page speaks of the collection being used by chefs, historians, and novelists in their research (“Whats on the menu? About Page,” n.d.).

Conclusion

Each of these transcription projects is a classic case of not needing to "gamify" the process, because the documents themselves engage the volunteers. Accordingly, neither project awards "points" or offers explicit rewards for completing transcriptions. I have played with other projects that do use things like points and levels to support user motivation, and I imagine that might increase the length of time a user spends there, especially for people who really enjoy gaming.

References

Owens, T. (2013). Digital Cultural Heritage and the Crowd. Curator: The Museum Journal, 56(1), 121–130. http://doi.org/10.1111/cura.12012

Royal BC Museum | Transcribe. (n.d.). Retrieved April 12, 2016, from http://transcribe.royalbcmuseum.bc.ca

Royal British Columbia Museum. (2015). Royal BC Museum Corporation Statement of Financial Information for the Year Ended March 31st, 2015. Retrieved from http://royalbcmuseum.bc.ca/assets/CSCD-RBCM-2014-15-FINAL-MINISTER-APPROVED-ASPR.pdf

Whats on the menu? About Page. (n.d.). Retrieved April 12, 2016, from http://menus.nypl.org/about

Whats on the menu? Data. (n.d.). Retrieved April 12, 2016, from http://menus.nypl.org/data

Whats on the menu? (n.d.). Retrieved April 12, 2016, from http://menus.nypl.org/

Tuesday, April 12, 2016
