Digital preservation and big data researcher Richard Marciano becomes ISR affiliate


ISR welcomes a new affiliate, Professor Richard Marciano (iSchool), director of the Digital Curation Innovation Center (DCIC) in the University of Maryland’s College of Information Studies. He also directs the Sustainable Archives and Leveraging Technologies (SALT) lab.

The DCIC explores how to integrate archived information with new data and how to improve access to these records to foster engagement with historical research. At the DCIC, faculty and students are developing interdisciplinary partnerships with the public, industry, and government to increase the integration of, and access to, big data, archival records, and databases. Students can work on research projects that address real archival and information management challenges faced by organizations, including government agencies, academic institutions, and corporations.

Marciano’s research interests center on digital preservation, sustainable archives, cyberinfrastructure and big data. Before coming to Maryland, Marciano was part of the University of California, San Diego’s San Diego Supercomputer Center, with an affiliation in the Division of Social Sciences in the Urban Studies and Planning program.

Marciano and Brown Dog

Marciano is the Maryland lead for Brown Dog, a 5-year, $10.5M National Science Foundation Data Infrastructure Building Blocks (DIBBs) implementation grant. The lead institution is the University of Illinois Urbana-Champaign’s National Center for Supercomputing Applications (NCSA).

Brown Dog is a component of a national research cyberinfrastructure. Its goal is to prototype a highly distributed and extensible science-driven Data Transformation Service (DTS). Brown Dog makes past and present research data more accessible and more useful to scientists while enabling novel science and scholarship on top of such data.

Rather than attempting to construct a single piece of software that magically understands all data, Brown Dog leverages and coordinates sources of automatable help already in existence, such as software, tools, libraries, and even other services. Brown Dog does so in a robust and provenance-preserving way, creating a service that unites their capabilities and can deal with as much of this data as possible.
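The coordination idea described above can be illustrated with a minimal sketch. It is not Brown Dog's actual code or API: the registry, the function names (register, find_chain, transform), and the toy converters are all hypothetical, chosen only to show how existing tools might be chained into a conversion path while recording provenance.

```python
from collections import deque

# Hypothetical registry: (source_format, target_format) -> (tool_name, converter).
# In a real service these would be existing tools, libraries, or remote services.
REGISTRY = {}


def register(src, dst, tool_name, func):
    """Record that tool_name can convert data from src format to dst format."""
    REGISTRY[(src, dst)] = (tool_name, func)


def find_chain(src, dst):
    """Breadth-first search over formats; returns the shortest list of hops."""
    if src == dst:
        return []
    parent = {src: None}
    queue = deque([src])
    while queue:
        fmt = queue.popleft()
        for (a, b) in REGISTRY:
            if a == fmt and b not in parent:
                parent[b] = a
                if b == dst:
                    # Walk back from the target to the source to rebuild the path.
                    hops = []
                    node = b
                    while parent[node] is not None:
                        hops.append((parent[node], node))
                        node = parent[node]
                    return hops[::-1]
                queue.append(b)
    raise LookupError(f"no conversion path from {src} to {dst}")


def transform(data, src, dst):
    """Apply each converter along the path and keep a provenance trail."""
    provenance = []
    for hop in find_chain(src, dst):
        tool_name, func = REGISTRY[hop]
        data = func(data)
        provenance.append(f"{tool_name}: {hop[0]} -> {hop[1]}")
    return data, provenance


# Toy converters standing in for real tools (assumptions for illustration only).
register("csv", "tsv", "comma_to_tab", lambda s: s.replace(",", "\t"))
register("tsv", "psv", "tab_to_pipe", lambda s: s.replace("\t", "|"))

result, trail = transform("a,b,c", "csv", "psv")
print(result)
print(trail)
```

No single converter here handles csv to psv directly; the path is assembled from the tools that are registered, and the trail records which tool performed each step.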

Brown Dog is the “super mutt” of software: a low-level data infrastructure that coordinates software capabilities with a user’s data needs to facilitate data reuse and enable a new era of science and applications. Brown Dog could serve not just the scientific community but also the general public as a “DNS for data,” transforming data on the fly into more accessible forms through a distributed and extensible collection of data manipulation tools. The aim is an era in which a user’s access to data is not limited by a file’s format or by uncurated collections.

Brown Dog is part of NSF’s DataNet/DIBBs program, which began in 2008. DataNet was conceived to address the increasingly digital and data-intensive nature of science and engineering research and education. Digital data are not only the output of research but provide input to new hypotheses, enabling new scientific insights and driving innovation. The challenge is how to develop new methods, management structures and technologies to manage the diversity, size, and complexity of current and future data sets and data streams. DataNet is creating a set of exemplar national and global data research infrastructure organizations that provide unique opportunities to communities of researchers to advance science and/or engineering research and learning.

Brown Dog is part of the DIBBs follow-on effort focused on building software cyberinfrastructure to support current and foreseen scientific data needs. DIBBs projects provide complementary services, each building on capabilities of the others. Visit the Brown Dog website.

Published July 9, 2018