LIS432 Cultural Heritage Informatics SPR16: Crowdsourcing, Archives and Museums--Two Examples: NARA's Citizen Archivists and the V&A's "Search the Collections"

Crowdsourcing, Archives and Museums—Two Examples:

NARA's Citizen Archivists and the V&A's "Search the Collections"

The trickiest part of this assignment (to me, at least) was finding crowdsourcing projects that were (a) current, and (b) workable within the context of the assignment; perhaps I was looking in the wrong places, but I seemed to find many more examples of how to set up crowdsourcing projects for your LAM institution than I did projects that were ongoing, did not require a large investment of time, and were not primarily curatorial in nature. While the Museum of Fine Arts asking museumgoers to select the most representative Impressionist painting from its collection through process-of-elimination voting is one way to get the community involved in the museum’s activities, I’m not convinced it really counts as making an active contribution. Another project that attracted a great deal of attention is the New York Public Library’s “What’s on the Menu?”; unfortunately, it appeared that most if not all of the menus had already been transcribed, and for some unknown reason I found navigating the site to help with proofreading (the only step left for most, if not all, of the menus) all but impossible. I finally settled on the following two projects: the National Archives’ (NARA’s) Citizen Archivist Project for the archival example, and the Victoria and Albert (V&A) Museum in London’s “Search the Collections” Project for the museum example.

1. The National Archives’ Citizen Archivist Project

From the front page of NARA’s web site at www.archives.gov, click on the link “Information For… Citizen Archivists,” which brings you to http://www.archives.gov/citizen-archivist, the front page of the Citizen Archivist Dashboard. The mission statement, or at least part of it, is this: “One day…All of our records will be online. You can help make it happen,”[1] and depending on your time, talents, and inclination, there are a number of ways in which you can aid them in this pursuit. Once you’ve set up an account, which is very easy, you can check out the list of user projects, or “missions,” pick out one that seems appealing, and then tag, transcribe, subtitle videos, or upload and share documents and photos (you’ll need to set up a Flickr account to do this last one). If, like me, you choose to do transcription, you’ll find some helpful hints at https://www.archives.gov/citizen-archivist/transcribe/tips.html; another link that perhaps applies more to those contributing tags and comments, but which should be read by everyone, is the Citizen Contribution Policy at https://www.archives.gov/social-media/policies/tagging-policy.html, which is clearly aimed at keeping the project free of spam and threatening, harassing, or offensive language. One more interesting aspect of the project is the History Hub, a six-month crowdsourcing platform pilot program for American history where citizen archivists, academics, archival professionals, amateur historians, and other like-minded people can meet, work together, and share information; as NARA puts it, “[T]hink of it as a one-stop shop for crowdsourcing information related to your research subject.”[2] NARA goes on to say:

“The National Archives aims to connect with and better serve customers interested in the historic records we hold. We are launching the History Hub as a limited 6 month pilot project so that we can test its usefulness as a crowdsourcing platform. We hope to apply what we learn to a longer-term solution that can be used by federal government agencies and other interested organizations looking to expand public participation.”[3]

The interface for getting to the missions once you’ve logged in via the “Transcribe” page isn’t especially intuitive; you have to go to the “What’s New?” section of the menu, scroll down and find the link referencing citizen archivists, and click on it. (To add to the confusion, the page says “Research Our Records” at the top, but it’s not the same “Research Our Records” page that you get when you click on “Research” using the menu; it really shouldn’t be this complicated.) Once you’ve jumped through that hoop, however, it’s reasonably easy to click a few more buttons to find a mission that looks interesting and get down to work.

I decided to transcribe captions on National Forest Photographs, went to “Historical Trees—California,” and immediately ran into a problem: you can’t tell whether a photograph’s caption has already been transcribed, or whether it has already been tagged, until you’re inside a particular mission, looking at an individual photograph, and have clicked on the “View/Add Contributions” button. If a photograph is missing one or the other (usually they seem to need tagging rather than transcribing), it’ll still be listed, waiting to fool unsuspecting transcriptionists. After going through this same routine with several other missions, and becoming progressively more frustrated, I finally landed on “Historical Buildings—Wisconsin,” and was able to transcribe several captions successfully.

Overall, I found the NARA Citizen Archivist Project interesting, and, once I worked through the few roadblocks, the format was easy to use. While one’s individual account keeps track of how many transcriptions, tags, and comments one has made, broken down by month, year, and all time, I was unable to find any such program-wide overview of the citizen archivists’ contributions to NARA’s mission of being the nation’s archive of record. Surely it’s been a significant contribution over the years, particularly with the rise of online citizen archivists.

2. The Victoria and Albert (V&A) Museum’s “Search the Collections” Crowdsourcing Project

According to its website, “[a]s the world’s leading museum of art and design, the V&A enriches people’s lives by promoting the practice of design and increasing knowledge, understanding and enjoyment of the designed world,”[4] so it’s hardly surprising that a major component of the site’s search function would be visual. After all, many people might not remember the name of a given artist, but show them an image of a particular work and they may well recognize it.

The new version of the V&A’s “Search the Collections” function contains over 140,000 images of works made in almost every imaginable medium, all reduced to small 2-D images to be viewed on a computer, tablet, or even cell phone screen. Since the images in question were chosen automatically, the feeling is that many of them may not be the best possible view that could be displayed within certain parameters (all images must be cropped into a square in order to fit the “Search the Collections” homepage format), and that all of them should be re-examined with an eye toward selecting the ones with the best details that are most likely to help the searcher find the artwork in question. This, of course, is where crowdsourcing comes in.

The first step is to go to http://collections.vam.ac.uk/crowdsourcing, set up an account, and log in; this brings you back to the main crowdsourcing page, but with a slight difference: you can now see at the bottom of the page a bar graph showing how many images have been processed out of the total that require processing. Click on “Click here to begin,” and a new page comes up with five slightly different “crops,” or views, of the same image; usually the first one leaves the most space at the top of the image, cutting off part of the bottom, whereas the fifth image is the reverse, cutting off the top and leaving room at the bottom. Select the one you think best shows off the work (it’s usually, but not always, the one in the middle), and click on it. You then go to another screen with five different versions of the same image, but this time the difference is the zoom level: it ranges from a normal view to a tight close-up. Again, select the one you think best illustrates the image (I generally tried to strike a balance between including as much of the original image as possible and showing a fair amount of close-up detail) and click on it, and a different set of images of a new item pops up.

In addition to the images for clicking, there are two bar graphs near the top of the page. One is labeled “Our progress,” and for some odd reason it doesn’t seem to change (I suspect there may be a bug or badly written code at work here); the other is “Your contribution,” and it shows how many individual objects you have processed, both in this particular session and overall. (Try as I might, I haven’t been able to find anything similar for other volunteers, other than the general progress bar chart on the main crowdsourcing page; it also doesn’t appear that there’s been a great deal of work as of late, since I seem to be the only person to actually process any images over the past week or so.) While there is generally a very brief identification of the object in the images (“Postcard”), if you click on a button labeled “Need to know more about this object?” it will take you to another page which describes it in considerable detail; I found this to be quite helpful in certain cases, especially those involving photographs of buildings, as it helped give context to the images. If you feel that none of the images are quite right, you can click on “No good image” or “Skip this object,” but I found that if you do that, it’s impossible to go back for a second shot at it: it disappears, never to be seen (at least by you) ever again.

While in many ways this was a much easier undertaking than transcribing photo captions for NARA, I would have to argue that it’s not quite the child’s play it might at first appear. For one thing, you need to have a reasonably good eye for detail and sense of balance and perspective. (One of the aspects of this project that attracted me was that I like to think I do have a good visual sense, and having a minor in Art History doesn’t hurt, either.) Deciding which parts of an image should be highlighted, and which can safely be downplayed or even cropped out, also takes a certain amount of consideration; I spent several minutes studying a caricature of an actor which had been signed by the performer himself, debating whether including his entire head or his entire autograph (which was rather large) was more important, and cursing silently that the V&A had insisted on going with a square format instead of a rectangular one. (I went with the head in the end, since he would have looked rather foolish missing the top of it, and the description of the item did include his name and the fact that it was autographed.) Another item was a heavily embroidered jacket: was the important part the jacket as a whole, or the embroidery, and which should be highlighted? I tried to strike a balance, and think I succeeded, but it’s definitely trickier than it first appears.

3. Final thoughts

Looking at the two projects, I see both being sustainable for some time. I would say “until the material that needs processing runs out,” but somehow I don’t see that happening for a very long time in NARA’s case; it may happen eventually for the V&A (although, given the current rate of progress, not for quite a while), but since both institutions will doubtless continue to acquire new items, they could keep going for quite a long time. Inviting civilians (as it were) to pitch in and help with processing is certainly one way to help create closer ties with their institutions; I know from firsthand experience that seeing a project succeed that you played a role in, no matter how small, is a nice little ego boost. My main concern here, at the risk of being cynical, is that crowdsourcing could all too easily slide from “welcoming people into the hallowed halls” to “hey, free labor to exploit!”; some companies have already displayed leanings in that direction, and human nature being what it is, I’m not naïve enough to believe that no one else would do likewise. (I’m looking at you, Jeff Bezos and your “Mechanical Turk”…) I also didn’t see any indication that anyone else is reviewing the crowdsourcers’ work; presumably someone is at NARA, and I hope so at the V&A, but I didn’t see any specific references to this. While I know that proofreading normally takes a lot less time than doing it all from scratch yourself (then again, you haven’t seen some of the work I’ve seen), I can easily see a situation that ties in to my “free labor” concern, where corners are cut for the sake of saving money and quality eventually suffers. If we keep these potential problems in mind, however, and work to avoid them, then I think there’s no reason that crowdsourcing couldn’t continue to be as successful in the future as it is today.

[1] National Archives. Retrieved from http://www.archives.gov

[2] “History Hub,” National Archives. Retrieved from http://www.archives.gov/citizen-archivist/history-hub/

[3] Ibid.

[4] “About Us,” Victoria and Albert Museum. Retrieved from http://www.vam.ac.uk/page/a/about-us/

Sunday, April 10, 2016
