Marciano, students use computational thinking to reframe digital curation practices

ISR-affiliated Professor Richard Marciano and 20 student researchers recently completed an eight-week project that demonstrated the value of reframing digital curation practices through a computational thinking (CT) framework.

Marciano directs the Digital Curation Innovation Center within the University of Maryland College of Information Studies, which is developing a digital curation agenda exploring the computational move towards “Big Cultural Data.”

In a case study involving World War II Japanese American Incarceration Camp records, the researchers applied CT methods to detecting personally identifiable information, developing name registries, integrating vital records, designing controlled vocabularies, mapping events and people, and connecting events and people through networks.

Computational thinking and archives

Emerging technologies have profoundly altered the nature of libraries and archives of all sizes by disrupting how information is created, recorded, captured, encoded, curated, shared, made available, and used. Digital curation increasingly extends archiving and preservation by adding value to digital objects through indexing, metadata, annotations, and markup, which enhances discovery and access and facilitates integration.

CT is a form of problem solving that uses modeling, decomposition, pattern recognition, abstraction, algorithm design and scale. Its practices cover data, modeling and simulation, computational problem solving and systems thinking, and are being applied to large-scale records and archives processing, analysis, storage, long-term preservation, and access.

About the project

In the project, Marciano and the students used CT practices to explore what new stories and connections might emerge from WWII historical archival datasets related to the network of 10 incarceration camps (particularly the one at Tule Lake) that held more than 120,000 civilians of Japanese ancestry between 1942 and 1946. The students worked with camp intake and outtake records, records from the War Relocation Authority (WRA), and more than 25,000 WRA “Internal Security Case Reports” index cards in the records of the National Archives.

By applying CT practices to these records, the students were able to draw relationships between conditions in the Tule Lake Camp and the resistance that occurred there. The researchers datafied and integrated these records and compared them with what was already publicly known about Tule Lake. This enabled them to put together a more complete story of the people incarcerated there and their networks of resistance.
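As a rough illustration of how datafied records can surface networks, the sketch below links people who appear in the same event and then groups everyone who is connected, directly or indirectly. It is not the researchers' method; the record layout (person, event identifier), the function names, and the placeholder values are all assumptions made for this example.

```python
from collections import defaultdict, deque


def build_person_graph(records):
    """Link two people whenever they appear in the same event.

    `records` is an iterable of (person, event_id) pairs, a hypothetical
    layout chosen for this sketch.
    """
    people_by_event = defaultdict(set)
    for person, event_id in records:
        people_by_event[event_id].add(person)

    graph = defaultdict(set)
    for people in people_by_event.values():
        for person in people:
            # Also creates an empty entry for people who appear alone.
            graph[person] |= people - {person}
    return graph


def connected_groups(graph):
    """Return groups of people connected through shared events (breadth-first search)."""
    seen = set()
    groups = []
    for start in sorted(graph):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        group = []
        while queue:
            node = queue.popleft()
            group.append(node)
            for neighbor in sorted(graph[node]):
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        groups.append(sorted(group))
    return groups


# Placeholder identifiers, not real people or case reports.
sample_records = [
    ("person_a", "event_1"),
    ("person_b", "event_1"),
    ("person_b", "event_2"),
    ("person_c", "event_2"),
    ("person_d", "event_3"),
]
groups = connected_groups(build_person_graph(sample_records))
```

Here person_a and person_c never share an event, yet they land in the same group because both are linked to person_b.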

In the future, the researchers hope to develop deeper matching strategies for name registries; resolve discrepancies in the way death is reported in vital records; better automate classification; develop a spatio-temporal model; and connect events and people through network analysis using graph algorithms.
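One simple starting point for matching names across registries is to normalize each name and compare the results with a string-similarity score. The sketch below uses Python's standard `difflib` for that purpose. It is only an assumption about what a baseline might look like: the function names, the 0.85 threshold, and the sample names are hypothetical and do not come from the project.

```python
import unicodedata
from difflib import SequenceMatcher


def normalize_name(name):
    """Lowercase, strip accents and punctuation, and sort the name tokens."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(
        ch if ch.isalnum() or ch.isspace() else " " for ch in no_accents.lower()
    )
    # Sorting tokens makes "Last, First" and "First Last" compare equal.
    return " ".join(sorted(cleaned.split()))


def name_similarity(a, b):
    """Similarity between two normalized names, from 0.0 to 1.0."""
    return SequenceMatcher(None, normalize_name(a), normalize_name(b)).ratio()


def match_names(registry_a, registry_b, threshold=0.85):
    """Pair each name in registry_a with its best match in registry_b, if close enough."""
    matches = []
    for name in registry_a:
        best = max(registry_b, key=lambda other: name_similarity(name, other), default=None)
        if best is not None and name_similarity(name, best) >= threshold:
            matches.append((name, best))
    return matches


# Invented sample names, for illustration only.
pairs = match_names(["Tanaka, Hiroshi"], ["Hiroshi Tanaka", "Kenji Sato"])
```

A deeper strategy would likely need to handle transliteration variants and clerical abbreviations, which plain character similarity does not capture.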

The work was conducted in fall 2019, culminating on Oct. 30 with an on-campus public event sponsored by the University of Maryland College of Information Studies. The event featured a poster presentation of the work, a showing of the film "Resistance at Tule Lake," and a discussion with the filmmaker Konrad Aderer.

A paper about the research project is available.

Published November 19, 2019