Technology

The Librarians of the Future Will Be AI Archivists

In July 1848, L'Illustration, a French weekly, published the first photo to appear alongside a story. It depicted the Parisian barricades set up during the June Days Uprising. Almost two centuries later, photojournalism has provided libraries with vast collections of archival photographs that tell stories of our history. Yet without a methodical approach to curating them, these historical images may be lost in endless quantities of data.

Image credit: Library of Congress/Newspaper Navigator

That's why researchers at the Library of Congress in Washington, D.C. are using sophisticated algorithms to retrieve digitized photographs from newspapers. Digital scans can already compile images, but these algorithms can also process, catalog, and archive them. The result is 16 million newspaper pages' worth of photographs that archivists can sift through with a simple search.

The project, Newspaper Navigator, is led by Ben Lee, an innovator in residence at the Library of Congress and a graduate student in computer science at the University of Washington. His dataset comes from the existing Chronicling America project, which compiles digitized newspaper pages from 1789 to 1963.

He found that a crowdsourcing effort was already underway at the library to transform some of those newspaper pages into a searchable database, with a focus on World War I content. Volunteers could mark up and transcribe the digitized newspaper pages, a task that computers are not always good at. What they had created was, in effect, a perfect set of training data for an algorithm that could automate all of this difficult, laborious work.





