InfoScribe helps investigative journalists unlock the stories trapped in PDFs. Specifically, InfoScribe is a generalized, web-based, crowdsourced document transcription platform that invites the public to participate in the journalistic process by transcribing specified data fields from documents. What does that mean in plain English? We are building a platform where journalists can upload image-based documents (such as PDFs) and a community transcribes them.

Although the volume of digital data is growing exponentially, newsrooms aren’t getting any larger and OCR technology isn’t advancing fast enough to keep up. On its surface, greater availability of digital public records should be a boon to investigative journalism. In reality, these records are often published as unstructured, image-based documents or without essential metadata.

While providing journalists with access to data sources that would otherwise be beyond their reach, InfoScribe seeks to cultivate meaningful, long-term personal investment in the journalistic process by giving transcribers access to the journalists who are doing work they care about, as well as publication credit for their contribution. We want to invite community participation to increase the transparency of, and the public’s confidence in, the journalistic process.
