Recognizing Excellence in Records & Information Management
Thank you all for coming tonight.
I would like to:
- start with a few acknowledgements
- discuss the impact of Emmett Leahy Award scholars
- describe some of my own research paths, while also providing you with a few visual examples
- and conclude with a few reflections on archival collaborations & the enduring legacy of Emmett Leahy
I was very surprised to be nominated for, and to receive, this prestigious award.
I would like to thank the Emmett Leahy Award Committee for this honor, including two former recipients who contributed today: Jason Baron (2011) and Christine Ardern (2002).
I would also like to thank our distinguished speakers (Laurence Brewer – NARA, Paul Wester – USDA/NAL, Jane Greenberg – Drexel University, and Michael Kurtz & Dean Keith Marzullo from the Maryland iSchool).
… and all my colleagues, students, and family for attending today.
The Emmett Leahy Award has for over 50 years recognized some of the most accomplished scholars in the field.
The last 10 years have seen truly international recipients from the UK, Italy, Australia, Canada, and the US, with incredible backgrounds including:
- national archives digital preservation directors,
- Vice-President of IT Risk Management,
- audit and certification of trustworthy digital repositories leads,
- cyber security CEOs,
- litigation directors and e-discovery experts,
- national heritage and culture state archivists,
- renowned records management educators,
- and an international records management trust director.
I wanted to acknowledge the impact of all these leaders and how much my own work has benefited from their advances. I’m very honored to be in such good company.
The research paths I have taken over the last three decades have been highly collaborative and interdisciplinary and have spanned four main areas:
1. developing transparency and trustworthiness in algorithms and software (“digital trust”)
- I contributed to the development of algebraic methodology and software technologies and to computational geography and the management of spatial records at scale
2. developing records management software for handling large amounts of information
- while at the San Diego Supercomputer Center (SDSC), with my colleague Reagan Moore, we coined the concept of data grids and launched a data-intensive computing group, which led to the development of the Storage Resource Broker (SRB)
3. pioneering policy-enabled infrastructure
- the SRB led to the development of the open-source integrated Rule-Oriented Data System (iRODS), where data grids and business rules were combined to tackle long-term digital preservation challenges
4. pioneering Big Data / Big Records Curation
- In particular over the last 3 years at the Maryland iSchool,
i. We launched a Digital Curation Innovation Center (DCIC),
ii. We developed a next generation of scalable repositories called DRAS-TIC (Digital Repository At Scale That Invites Computation), now owned by UMD, and based on so-called NoSQL database technologies, the kind used by some 1,800 companies including eBay, GitHub, Hulu, Instagram, Netflix, and Twitter.
iii. We are establishing a new discipline called Computational Archival Science (CAS)
iv. We are now working with 70 students on 8 hands-on team-based CAS projects with clients and deliverables.
My current focus is on developing services and methodologies that enhance digital trust, with an emphasis on cyberinfrastructure development.
In the process, I have conducted dozens of projects including:
- building distributed testbeds,
- exploring persistent archives based on data grids,
- researching infrastructure independence,
- preserving records,
- comparing versions of records,
- preserving video workflows and GIS records,
- designing trusted digital repositories,
- developing cloud preservation services,
- and training the next generation of digital archivists in cloud computing and digital curation.
My approach has been based on bringing together diverse practitioners and researchers and building broad partnerships.
- For instance, on the 2008 DCAPE project (Distributed Custodial Archival Preservation Environments), the goal was to develop a cooperative service provider network.
- I am very proud to say that I published a paper with 56 co-authors representing an interdisciplinary project coalition, which included:
- technologists (repository developers, researchers, graduate students),
- state archives and library partners (Michigan, North Carolina, Kentucky, Kansas, New York, and California),
- university archives partners (Tufts, UNC Chapel Hill),
- cultural institution partners (Getty Research Institute, and Smithsonian Institution Archives),
- and iSchool partners (UNC Chapel Hill, and U. Wisconsin-Madison).
I have also experimented with novel records interfaces over the years. Here are three examples:
a. Tactile Maps with 3D Lightboxes (1997)
Using USGS and US Forest Service records:
- this is an early example of 3-D printing of topography,
- a type of environmental braille
- where rear-projection of environmental data lights up the landscape membrane from within.
b. Mapping Inequality portal (2006-present)
- This project was part of last year’s National Geographic Best Maps Gallery award.
- It is a partnership with the University of Richmond, Virginia Tech, and Johns Hopkins, where:
- we are using 1930s New Deal records from NARA (HOLC),
- we are creating a national neighborhood racial redlining map and database for 250 cities and 10,000 neighborhoods.
c. Creation of a DRAS-TIC Archive with experimental interfaces (2010-present)
- This involved the creation of a testbed of 100 million records & 100 TB from NARA
- spanning over 150 Federal Agencies
- and new open-source archival software based on NoSQL backend databases, a long-running collaboration with Mark Conrad at NARA
I would like to conclude with a few observations on Archival Collaborations and acknowledge how indebted I am to a number of the individuals I have worked with over the years and who are present here tonight:
- Mark Conrad from the National Archives 20 years ago, whose group had presciently established that the digital winter was coming (or what they called the “digital tsunami”) and that digital preservation needed to refocus on automation and scale. It is this single initial encounter that got me hooked on records and drew me to the archival dark side… and frankly changed the course of my career.
- Bruce Ambacher and Bob Horton in 2000, who had the patience to guide my initial efforts in San Diego and be part of an advisory board for my first NHPRC-funded project.
- Bill Underwood in 2002, a computer scientist who has focused on the application of advanced computer technologies to the tasks of managing, archiving and curating e-records.
- Greg Jansen in 2008, who built the first digital preservation repository based on the integration of Fedora and iRODS.
- Bill Regli in 2009, who partnered on a cutting-edge NSF DataNet proposal.
- Michael Kurtz in 2010, who oversaw the first digital release of Census data with the 1940 US Census.
- Anne Weeks in 2014, who is a leader in professional education and partnered in the CurateCloud digital curation training project.
- Jane Milosch & Andrea Hall in 2016, for joint work in connecting provenance research at Art Museums with Big Data.
- Jane Greenberg in 2017, who is enriching library education with data science.
And the many colleagues at the Maryland iSchool, whose regular support, conversations, and insights are contributing to the development of new modes of archival education.
In conclusion, the legacy of Emmett Leahy is alive and well.
- This pioneer and innovator in records management, some 80 years ago, helped frame the notion of enduring archival value in records, sounded the alarm on the increasing volume of records (a harbinger of big data), and identified early opportunities brought on by technology…
- These themes are remarkably aligned with the work conducted at the Maryland iSchool by our faculty and students and continue to guide what lies ahead.
Thank you all for sharing your time today. I’m truly honored by your presence.