Saturday, April 16, 2016

Crowdsourcing: Building Inspector and Shakespeare's World

Building Inspector

http://buildinginspector.nypl.org
"Kill Time. Make History." is the tagline of New York Public Library's Building Inspector, a joint project of NYPL Labs and the library's Lionel Pincus and Princess Firyal Map Division.  The public, crowdsourcing side of the project is a simple, engaging, and somewhat game-like website that encourages human input in harvesting data from digitized historical maps of New York City.  For someone like myself who loves public history and adores Sanborn Insurance Maps (and could happily spend all day with them), Building Inspector is an engagingly addictive and absolutely gratifying time-suck. 
The site was launched in late 2013 and, although it is apparently more robust today than it was at its inception, it is remarkably simple, beautifully streamlined, and compellingly easy to use.  The foundation of the project is NYPL's Map Division's collection of maps and atlases of the world and of New York City in particular.  Counting in the thousands, spanning decades and centuries, and detailing untold changes in the city's landscape over time, the collection offers an opportunity to bring it all together as a part of an even greater vision: that of the NYC Space/Time Directory and a goal of "turning historical maps and other geographic sources into a digital time-travel service for New York City."  The simple objective is "to extract, correct and analyze data from historical maps." (1) 
The process (along with the incredible vision of what will be) involves a series of enormous and crucial steps.  Behind the scenes, maps and atlases are first scanned, page by page, at high resolution in order to be "rectified." The process of rectifying a map also includes a largely crowdsourced element: NYPL's Map Warper tool (http://maps.nypl.org/warper/) enlists volunteers to identify and normalize the scale, orientation, and pagination of maps that vary widely.  This normalizing process aligns every map to precise points of latitude and longitude and ultimately allows (or will allow) the maps to be layered, enabling a journey through time and a view of the changes to the city from any location or vantage point.  
The next step is "Vectorizing," which basically consists of computer identification of recognizable demarcations.  This step both eliminates an astounding number of hours of human labor in the project AND ensures that humans must participate in the rest of the process for many years before it is complete.  The Vectorizer software is incredibly powerful and time-saving.  Analyzing fire insurance maps, it can identify approximately 150,000 building footprints per day -- astronomically faster than humans, as "it took NYPL staff and volunteers two years to extract 170,000 footprints (and some additional metadata) from four atlases in the library’s collection."  However, the software is no match for the human brain.  Thus comes the final stage in the process and the Building Inspector site itself.  Inviting human interaction with the data, Building Inspector both checks and cleans data and trains the computers and software to read the maps more effectively. (2)
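To put those two figures side by side, here is a quick back-of-the-envelope calculation in Python. It is only a rough sketch: I am assuming "two years" means 730 calendar days of steady work, and the human effort also captured additional metadata, so the comparison is approximate at best.

```python
# Figures quoted from the Wired article (2); the 730-day span is my assumption.
SOFTWARE_PER_DAY = 150_000   # footprints the Vectorizer can identify per day
HUMAN_TOTAL = 170_000        # footprints extracted by staff and volunteers
HUMAN_DAYS = 2 * 365         # "two years", treated as calendar days

human_per_day = HUMAN_TOTAL / HUMAN_DAYS         # average human pace
speedup = SOFTWARE_PER_DAY / human_per_day       # how many times faster
software_days = HUMAN_TOTAL / SOFTWARE_PER_DAY   # time for the same workload

print(f"Human rate: about {human_per_day:.0f} footprints per day")
print(f"Software is roughly {speedup:.0f} times faster")
print(f"Software would need about {software_days:.1f} days for 170,000 footprints")
```

By this rough measure the software works several hundred times faster than the two-year manual effort, which is why the remaining human work shifts to checking and correcting.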
So how does Building Inspector work?  The gist of contributors' tasks is cleaning up computer software mistakes and identifying as much data as possible on each map.  The tasks are simple and easy to learn.  Currently, there are five task options that, even for someone unfamiliar with insurance maps, NYC history, or basic word processing, can be completed with absolute ease.  Additionally, quick training videos are available, and new users are prompted to watch them when first entering a task page.
Tasks include Checking Footprints, Fixing Footprints, identifying colors that indicate building structure, and transcribing either location addresses or noted location place names or attributions.   The site allows anyone to contribute, but in order to track one's mounting contributions (as well as rank) one must register.  

Although Building Inspector is described as a game by NYPL Labs, an Inspector really only competes for a ranking and has no idea how much competition there is, who they are, or how much they have done to accomplish their ranking.  Not being anything of a gamer myself, I didn't much mind -- although something still lit a little competitive fire in me.  I wanted to see my ranking go up, but in reality I was mostly driven by my passion for the task and the connection to historic material.

How NYPL Labs measures success is not entirely clear.  Nor is it evident how many Inspectors are contributing.  My earliest ranks (or those when I first thought to check) indicated that I was likely one of a few thousand, although there was also no indication of just how much or how frequently contributors participated.  However, just last week NYPL tweeted that the crowd had recently hit the milestone of completing 1.5 million tasks, so clearly something is being accomplished.


Accuracy is achieved by majority rule.  Each image task is presented to multiple Building Inspectors who each cast a "vote" until consensus is reached.  The site summarizes the process with the following example: 
"For Footprint Inspection, it’s coming to consensus on whether the computer-detected footprint is a Yes, No, or a Fix. We show the same footprint to at least 3 different people, and every 10 minutes we tally up the votes. If 75% or more agree, that footprint has reached “consensus” and the system removes it from the inspection queue. If the jury's still out, we keep the footprint in circulation until consensus is reached, focusing our collective efforts on the buildings most in need." (3)
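The rule described in that quote can be sketched in a few lines of Python. This is my own illustration, not NYPL's actual code: the function name, the vote labels, and the choice to compare the single most common vote against the 75% threshold are all assumptions based only on the description above.

```python
from collections import Counter

MIN_VOTES = 3      # "at least 3 different people"
THRESHOLD = 0.75   # "If 75% or more agree"

def consensus(votes):
    """Return the agreed label, or None if the footprint stays in the queue.

    `votes` is a list of labels such as "yes", "no", or "fix"
    (hypothetical labels chosen for this sketch).
    """
    if len(votes) < MIN_VOTES:
        return None  # not enough people have seen this footprint yet
    label, count = Counter(votes).most_common(1)[0]
    if count / len(votes) >= THRESHOLD:
        return label  # consensus reached; remove from the inspection queue
    return None       # the jury is still out; keep it in circulation

print(consensus(["yes", "yes", "yes"]))         # unanimous
print(consensus(["yes", "yes", "fix"]))         # 2 of 3 is below 75%
print(consensus(["yes", "yes", "fix", "yes"]))  # 3 of 4 is exactly 75%
```

Notice that with only three votes, anything short of unanimity falls below 75%, so a single dissenting Inspector is enough to keep a footprint in circulation until more votes arrive.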
Skillful completion of tasks is, of course, the best measure of success and hitting milestones helps to track that completion.  
You have reached the end of the internet.
When all tasks have been completed in one area (for the moment) the notice above pops up: "No unprocessed data found for this task.  Good news!! This seems to be complete.  Maybe try another task?"  If one waits a little while and comes back, however, more tasks quickly rise to the surface.  It is presumable that this pause may be occurring between sessions of result tallying, but reaching this point felt more like being denied something I wanted than something to celebrate.  They hooked me!

NYPL's Mission Statement is lengthy, detailed, and (like the institution itself) encompasses much.  Its three primary foci are to "inspire lifelong learning by creating more able learners and researchers...; advance knowledge by providing free and open access to materials and information that reflect New York’s global perspective;" and "strengthen our communities by promoting full citizenship and participation in society."  Within the fleshed out language of their mission, there are many highlights that are supported by the Building Inspector project.  Among them are the goals to:
  • Support creativity, research, and problem-solving
  • Bring people together to spark creative synergies and learn from each other
  • Inspire interest, expand horizons, and enrich perspectives
  • Build tools that allow us to connect with the world in our areas of expertise
  • Provide dynamic resources to help patrons understand and engage in society
  • Create safe and reliable places where we and our patrons can enjoy, honor, celebrate, and engage with our communities
  • Offer unique and authoritative materials of historical importance. (4)


Although NYPL Labs calls Building Inspector a game, I have doubts that many will share that view.  They have scarcely built up any gaming aspects of the site beyond announcing an Inspector's rank.  I am not a gamer by any stretch of the imagination and I found it quite lacking in what I would consider gaming allure.  Rather, I found it a well-designed tool that I enjoyed using, and my greatest incentive, by far, was the opportunity to interact with the wonderful maps. 
Of greater concern (and only just encountered at the very last minute when I was (once again) enticed to rectify one more (really, just one more, I swear -- well maybe 3 so I can bring my completion record to a nice square number -- and then I'll just really quickly check my ranking and be done -- okay, maybe there is some game element...) is -- I'll begin again.  Of greater concern (based on my recent revelation) is the true development of consensus among different sets of eyes.  Despite the wonderfully illustrated example of how consensus is reached (and, therefore, knowledge is gained and maps are made wholly searchable and interactive through the power of many volunteers and great algorithms) I found myself (in a very brief visit back to the site for further research and not at all because I was compelled to go back and fix a few more footprints) faced with an exact same footprint I had already re-vectored.  Perhaps it was an anomaly; perhaps I have been "playing" more often than others on this task recently; perhaps, perhaps, perhaps.... I am not sure what to make of it, but I will say (after playing with the site far longer than was ever envisioned for this assignment) this was the first and only hiccup I encountered.  That said, from the standpoint of accuracy and consensus, I do not find asking the same person twice about the same property is a best practice in true crowdsourcing.  Though, indeed, I have much to learn.

Shakespeare's World

 https://www.shakespearesworld.org/#/

Shakespeare's World is a collaboration between the Folger Shakespeare Library in Washington, D.C., and Oxford University Press's Oxford English Dictionary.  The transcription project is hosted by Zooniverse.org at Oxford University.  Zooniverse hosts a variety of crowdsourcing projects in many fields and describes itself as "the world’s largest and most popular platform for people-powered research."  Shakespeare's World is among their latest projects, having launched only a few months ago in December 2015.  Although Shakespeare gives the project its name, he has very little to do with its focus or content.  His lifetime (1564-1616) and 2016 being the 400th anniversary of his death provide touchpoints, but the documents people are asked to transcribe relate more to day-to-day life in Shakespeare's time than to anything pertaining to him properly. 
The Folger Shakespeare Library does not list a mission statement, as such, on their website but does discuss their focus and actions on their "About the Folger" page. They state that, "By promoting understanding of Shakespeare and his world, the Folger reminds us of the enduring influence of his works, the formative effects of the Renaissance on our own time, and the power of the written and spoken word." (5)  This project clearly supports that aim.

The involvement of the Oxford English Dictionary, however, is what most intrigued me about this project.  As one of the best examples of crowdsourcing in history and the authority on the English language and the origins of words and their use, the OED's involvement is focused on the hope of discovering new word usages, earlier instances of word usage, and building a greater understanding of the English language and its growth.  The OED is produced by Oxford University Press, which is a Department of Oxford University.  The Press, itself, states no so-titled "mission" on its site but does frequently express that "It furthers the University's objective of excellence in research, scholarship, and education by publishing worldwide." (6) Toward meeting these goals, it is noteworthy that within the first week of the launch of Shakespeare's World the OED had already identified new material -- an early and unrecorded variant of taffeta, "taffytie" -- and more has steadily followed. (7)
Shakespeare's World offers transcribers an easy to use keypad, complete with shortcuts for words commonly abbreviated in Shakespeare's time.
So, how does Shakespeare's World work?  Contributors are invited to register with Zooniverse -- a simple and straightforward user name, password, email address process -- and then can begin transcribing individual pages of manuscripts from Early Modern England.  There is a brief slideshow of directions, tips, and advice (all the slides of which are easily referenced at any time), as well as a Guide page with additional and more in-depth assistance on topics such as paleography, numbers, and county names -- all of which are immensely informative but equally easy to use and access.  A handy keypad with common abbreviated terms helps to maintain consistent data.  Users also have quick access to a crib sheet of common letter shapes and can also compile their own crib sheets as they go.  Currently the project is focusing on recipe books and letters, but additional materials will be added over time.  
Transcribers work in isolation on single images from various materials and do not see the work of other contributors.  Any amount of text may be transcribed, from a single word to parts of phrases to every line on the page.  Pages are not sequential; after saving work on a page from one manuscript, transcribers may next be assigned something quite different, written by someone else and in another hand entirely.  The idea, according to the blog page for the site, is that this design encourages greater focus on, or "close reading" of, the page at hand: "The Shakespeare’s World system, which only presents one decontextualized page at a time, forces the transcriber to read closely. The handwriting can be difficult and, without an entire manuscript for reference, the only clues to decode an excerpt are on that page. Every letter, every mark, every word is important. And the transcriber is compelled to linger, focusing on the details and considering the possibilities." (8)  Although they may work in isolation on a single page, transcribers can interact, share, and ponder dilemmas on the Talk page of the site. 
Unlike Building Inspector, Shakespeare's World offers some steady and engaging feedback on accomplishments, progress, and questions.  The Talk page is well organized and active with discussions and comments ranging from tips and questions about spellings and penmanship to alerts of new OED discoveries and modernized updates on the old recipes in the manuscripts.  The blog page is less active (the latest post was nearly a month ago), but it does offer deeper insight, images, and in-depth discussion of the project and its highlights periodically.  Few posts receive any comments and, if they do, rarely is there more than one or two.   
There is no way to measure or visualize just what impact you have had as a volunteer, nor just how much remains to be completed.  In January, just five weeks into the project, a blog post announced some data: 13,876 unique visitors had logged over 23,000 sessions and submitted 24,252 transcriptions.  Most transcribers were from the greater London area, but the post noted growth in many other regions and even offered an infographic of the site's activity. (9)  However, I have found no updates on this data or any additional statistics on progress or growth to date.  It would be interesting to follow such changes and I am hopeful that they will offer updates along the way.
Geographic locations of Shakespeare's World contributors, as of January 2016. 
Summary
Although Building Inspector and Shakespeare's World both focus on crowdsourcing efforts involving libraries and their desire for greater metadata and searchable content, their collections, methods of collecting data, and how they engage contributors differ a fair amount.  Both are driven to enrich our information worlds and each has a robust site that is both easy to use and compelling (for the contributors they seek).  Building Inspector offers a taste of instant feedback and a tiny element of competition to urge continued contribution, while Shakespeare's World offers a communal opportunity for discourse and intellectual stimulation.  Both could take a page from the other and jump from great to fantastic.

  1. NYC Space/Time Directory, http://spacetime.nypl.org.
  2. Miller, Greg. "Help Bring New York City’s Past Back to Life From Your Phone." Wired. Oct. 21, 2013. http://www.wired.com/2013/10/phone-map-game-new-york-city/.
  3. "What's this all about?" Building Inspector. http://buildinginspector.nypl.org/about.
  4. "NYPL's Mission Statement," New York Public Library, http://www.nypl.org/help/about-nypl/mission.
  5. "About the Folger," Folger Shakespeare Library, http://www.folger.edu/about.
  6. "About Us," Oxford University Press, http://global.oup.com/about/?cc=us.
  7. Durkin, Philip. "Our First Discovery! And a brief history of the Oxford English Dictionary," shakespearesworldzoo (blog), Shakespeare's World, December 17, 2015, https://blog.shakespearesworld.org/2015/12/17/our-first-discovery-and-a-brief-history-of-the-oxford-english-dictionary/.
  8. LWSmith. "On Close Reading and Teamwork," shakespearesworldzoo (blog), Shakespeare's World, February 3, 2016, https://blog.shakespearesworld.org/2016/02/03/on-close-reading-and-teamwork/.
  9. Snakeweight. "Progress Update," shakespearesworldzoo (blog), Shakespeare's World, January 13, 2016, https://blog.shakespearesworld.org/category/stats/.




