Executive Summary of Findings
“Scholarship is built on the cumulative record of the past and the well-tended, authentic, and readily accessible data of the present. Current federal efforts to build a digital information preservation infrastructure at the Library of Congress and the National Archives assume that research institutions responsible for producing large quantities of research data, such as the University of California, will take responsibility for ensuring its long-term access. Is that a reasonable expectation? What is at risk if they do not?”
Abby Smith, “Academic Amnesia: Who is Preserving Our Data?” – Center for Studies in Higher Education, UC Berkeley, November 28, 2006 – http://cshe.berkeley.edu/events/index.php?id=208
The principal finding of the Media Vault Program is that it is essential to have services that make research data safe and easy to share for our campus. What was true in 2006 (when we began the Media Vault) remains true today, although the texture of the challenge is now understood at a much finer grain. Our findings show that obstacles to the development, adoption and sustainability of services can be described in economic, technical, political/organizational and social terms, as corroborated by the excellent work from several leading reports, including:
Use and Users of Digital Resources: A Focus on Undergraduate Education in the Humanities and Social Sciences – Harley et al.
Sustaining the Digital Investment: Issues and Challenges of Economically Sustainable Digital Preservation – Berman et al. [BRTF]
Sustaining Digital Resources: An On-the-Ground View of Projects Today: Ithaka Case Studies in Sustainability – Maron et al.
A Multi-Dimensional Framework for Academic Support: A Final Report – Lougee et al.
Scholarly Communication: Academic Values and Sustainable Models – King et al.
A Report on the Range of Policies Required For and Related To Digital Curation – Jones
Before delving into the obstacles, let’s look at several findings that point to an opportune moment to launch a campus-wide program like the Media Vault:
- The problem is large, but solutions are essential – Our findings indicate that we need to own the problem coherently. We (the service providers and technical experts) need to work together and harmonize efforts to the greatest degree possible.
- The problem is manageable – It is possible to make progress incrementally. There are pragmatic and relatively inexpensive measures we can put in place that will provide excellent benefits. See functions and requirements below.
- Some needs are basic – A safe place to put things, an easy way to share things. The principal need for most users is a safe place to put their research data, and the peace of mind this brings. Easy access to primary content is an essential requirement.
- Some needs are complex – Long-term digital preservation and permanent access are tricky. The shift of responsibilities from creator to curator brings with it considerable complexity, because of the requirements that are typically introduced in order to affirm this transition. We need to be patient and accommodating with our user community and recognize that the complexities of this domain are impediments to adoption.
- There are few incentives to do the right thing – We need to encourage good thinking and best practices. “In many environments, there are few incentives to develop the persistent collaborations and uniform approaches needed to support access and preservation efforts over the long-term.” Incentives need not be financial; they can take the form of convenience, competitive advantage, ease of use, or novelty.
- There is a desire to learn and share – Participants are engaged, interested, and willing to learn. One of the key strengths of working in an academic environment is the general desire to try things and experiment, and a tolerance for imperfection.
- WE are the platform – As much as technical services are needed, consulting and problem solving are also desperately needed, and go a long way. Our participants are innovative and motivated. The Media Vault Program is potentially a remarkable resource of support for the research endeavor.
- Media Vault is a good brand – Especially if co-owned and operated by our selected partners. For some of us, the brand may seem too constraining, limited to media – data supporting the research endeavor. Our findings indicate that the majority of the research enterprise is dependent on binary files, defined in the simple terms of Office documents, PDF, images, and video. If we can make progress on making these types of media safe and easy to share, we will have made significant gains.
- Common solutions are possible – By focusing on workflow and lifecycle, common pain points are revealed for most users – collections, researchers, departments. There are individual researchers with tens or thousands of images, and departments that need to share fewer files, but broadly. Scale is relative.
- We need enterprise solutions in order to support an enterprise like Berkeley – We need services that scale. We cannot and need not own every service, but we need to own the service catalog. We need to position ourselves to make recommendations, have opinions, make assertions, and be helpful.
- Full service to self-service – Different users have different needs and different abilities to pay or contribute. There is not a sliding scale between the haves, who can afford the full services, and the have-nots, who cannot. In fact, self-service, meaning self-empowerment, should be a goal. As much as possible, the research enterprise should be both self-reliant and fully supported. Self-service is key to addressing human scalability issues for the suppliers, which translates to lower costs and greater responsiveness.
All major studies and reports on the sustainability of digital resources point to a multitude of barriers that can be clustered into four factors:
Economic: Who owns the problem, and who benefits from the solutions? Who pays for the services, long-term preservation, development, and curation? From the [BRTF]: While there is “general agreement that digital information is fundamental to the conduct of modern research, education, business, commerce, and government,” there is “no general agreement, however, about who is responsible and who should pay for the access to, and preservation of, valuable present and future digital information.”
Technical: Simple services are needed, but they are not simple to build, implement, integrate and support in our complex environment. Successful structures that can support digital scholarship must account for user needs, emerging technologies/file formats, adverse working contexts (fieldwork, offline, multi-platform), and should be supported at the enterprise scale. Commercial/proprietary offerings can provide a lot of functionality out of the box, but with potentially high licensing costs. Open source solutions are prevalent and freely available, but often require significant financial, development and support investment.
Political/Organizational: We think the Media Vault Program community approach to making research data safe and easy to share puts a spotlight on both the urgency of the problem and the structural challenges that must be overcome in order to make progress on solutions. For example, there are good reasons for the various service provider organizations to innovate on their own, but there is much to gain from working together on common goals and milestones. In fact, where communities have succeeded in softening the boundaries between content producers and consumers, supporters and beneficiaries, significant successes have been achieved. Conversely, where misalignment around roles, goals and responsibilities persists, so do the barriers to sustainable stewardship.
Social: We live in interesting times, where disruptive technologies such as Facebook and Google are transforming how we communicate culturally, and the prevalence of cheap/stolen media has produced an expectation that things should be always available, conveniently packaged, and free. Where some organizations, such as the Long Now Foundation, are hoping to “provide counterpoint to today’s ‘faster/cheaper’ mind set and promote ‘slower/better’ thinking,” it may be up to those of us who care deeply about the persistence of research data to step up as the seas continue to change.
Sometimes simple is good enough, as is evidenced by many technologies that have solved complex problems adequately: MP3, RSS, PGP, Skype, Twitter, TinyURL, WordPress blogs and Gmail. What all of these technologies have in common is that their developers took on a problem and tried to solve an essential part that would have maximal benefits for most, but not all, users. If we can devise solutions that will help 80% of our research community, will that be a reasonable and desirable outcome? Will it be a good enough start?
The Media Vault Program represents an opportunity to overcome the barriers to development, adoption and sustainability of services through its community-driven approach. Our community understands the urgency of the problem and faces the challenges posed by these barriers in their everyday work. Furthermore, we foresee that “access to data tomorrow requires decisions concerning preservation today.” Our campus needs a thriving, well-governed, effective program to address what is recognized as one of the “most urgent” and essential problems facing research organizations today.
We believe that in order to make major progress for the community we need three things:
- Program: A supported, sustainable community of participants, providers and sponsors.
- Platform: A next-generation Media Vault platform that is enterprise strength in terms of reliability and scalability.
- Pledge: A statement of support from the campus executive.