December 7, 2009
Greetings MVP Stakeholders,
The past eight weeks have been busy ones. This update provides an overview of MVP partner accomplishments, next steps taken in response to the September meetings and progress towards developing a “Gen 2” platform and service.
CONFERENCES AND CAMPUS EVENTS
Three October conferences and events featured Media Vault Program partners:
- iPres2009 (hosted by the California Digital Library)
- PASIG-SF (the fall meeting of the Sun Microsystems-sponsored Preservation and Access Special Interest Group)
- eScholarship presentation at the UC Berkeley Archaeological Research Facility marking the launch of the enhanced eScholarship service
Congratulations to the University of California Curation Center (née CDL Digital Preservation Program) director Patricia Cruse, conference organizer Perry Willett and their colleagues for hosting a very successful conference, and to Stephen Abrams and John Kunze for their well-received presentations. Kudos, too, to eScholarship program director Catherine Mitchell and her team.
Following up on September’s service partners workshop, IST staff have held a series of meetings with Bernie Hurley, Lynne Grigsby and staff of the Library Systems Office to explore requirements for a program that would expand access to the library’s archiving repository and associated services. We will work with the library in the upcoming months to design our approach, one that can also serve as an example of how research content might be transferred to other partner services. Meanwhile, the library’s gracious offer to provide short-term storage to existing customers helps us extend support for the current Media Vault service.
PLANNING FOR A CAMPUS EVENT
Building on the September community workshop, MVP staff began planning for a campus event designed to educate Berkeley scholars on the range of services available for “making research data safe and easy to share.” We envision an opportunity for campus researchers to roll up their sleeves and get familiar with services they can use in their work. At the same time, the event would promote individual services and foster a larger campus discussion. After giving consideration to holding the event this fall, we’ve decided to delay it until next semester to allow ourselves more time to target participants and develop demonstrations of next-generation platform components and workflows. Stay tuned for more news in the New Year.
SERVICE DESIGN AND PLATFORM ANALYSIS
A major goal of these past two months has been to further decision-making around the selection of a “Gen 2” Media Vault platform and service. Our efforts since the last update have resulted in a recommendation that we feel provides a good frame for the next round of feedback and discussions with our sponsors, advisors and user community. We’d like to take a few moments to detail that process.
Over the last year, the MVP team has analyzed a broad framework of needs that center on research collections, including upload of digital assets, metadata assignment, sharing and archiving/preservation. This framework has allowed us to envision ways to leverage the strengths of existing service providers, such as bSpace, Library Systems and the CDL, who have a long history of focus and expertise in their respective domains. It has also allowed us to identify functional gaps that prevent researchers from managing collections in ways that enable them to make use of those available services.
We started October armed with key findings from earlier phases of the program:
- Connect to existing library collection services, University of California Curation Center (UC3) curation and preservation services, eScholarship publishing services and Web-based archive and access services (think ArtStor, Internet Archive, Flickr)
- Concentrate our efforts on gaps not covered by these services, such as tools to organize materials collaboratively prior to submission and to share materials among trusted colleagues
- Focus on research use cases and workflows
- Build on a platform that can be scaled to support thousands of disparate uses and requires minimal technical support to provision accounts or submit items to the library and other services
Evaluation of options for a second-generation service accelerated in the following month and a half. To guide our platform analysis, we compiled a detailed list of functional, technical, and business requirements that build upon the digital lifecycle analysis, user scenarios and workflows developed to date. Some examples:
- Provides easy-to-use tool(s) to upload assets
- Supports bulk application of metadata to assets already in the repository
- Accepts standard submission metadata fields that also get sent to the archive
- Integrates with campus authentication and authorization services (CAS, LDAP, Active Directory)
- System must run well (and equivalently) across a range of client platforms
- System should run with a web-client and adhere to general web portability and accessibility standards
- System has been in use in production environments, with a good track record
- Supplier and community can respond satisfactorily to UC Berkeley needs
- Project has a multi-year funding model that is comprehensive and sustainable
- Program can leverage existing specialized services delivered by key partners. We don’t want to redevelop or compete with valuable campus offerings
Media Vault Program partners offer a number of specialized tools to help campus researchers manage their materials. These include:
- WebGenDL (UCB Library Systems) – the library’s internal system for managing, creating, preserving and discovering digital library content. These tools are aimed primarily at mature, publishable sets of materials, rather than the broader context of research data
- UC3 Curation Micro-services – a set of low-barrier tools for full lifecycle enrichment of objects (e.g., identity, fixity, replication, annotation). The first few will be rolled out publicly in January 2010. These are presented not as a user interface, but rather as behind-the-scenes services
- Sakai 3 – the next-generation version of the platform that powers the Berkeley campus’s bSpace application. Due in 2011, Sakai 3 will include a range of social tools to help users extend and disseminate their materials
To augment these services, and to handle use cases beyond their scope, the MVP team examined a number of potential platforms:
- ePrints – a 10-year-old, open-source digital repository platform, primarily used for print publications, from the University of Southampton
- Islandora – an integration of Drupal, Fedora and additional services, developed by the University of Prince Edward Island Library. Islandora lets researchers exhibit, access and archive their materials
- Drupal – a popular open source content management system, primarily used for web content, with a growing base of users on the Berkeley campus
- Open Source Enterprise Content Management (ECM) platforms – ECM platforms are used to manage collections of Web content, documents and records. In our case, we have been interested in document-centered functionality and how workflow-enabled content management can enable collaboration between researchers and partnered services. These capabilities may require only modest customization to serve a wide audience and can be offered as part of an interoperable solution stack
- Alfresco – a Java-based platform that would immediately provide a web-accessible place for researchers to store their materials yet can be further built out over time to meet the needs of campus. Alfresco offers versioning and transformation of documents; ability to add workflows, rules and aspects to objects; and customizable content models (to support, for example, multiple metadata schemas)
- Nuxeo – a platform similar to Alfresco that is used as the repository layer of IST’s CollectionSpace development project. Though Nuxeo is a very powerful platform, Alfresco has a more functional, richer end-user experience and is a closer match to the program’s near-term goals
(See the links below for more information about these platforms.)
Detailed analysis, prototyping and discussion of these platforms in terms of fit/gap continued over these six weeks. Team members conducted informational interviews with developers and end-users of each of these platforms.
Of these candidates, Alfresco stands out as the most functional, out-of-the-box solution. With a little customization, it can be readied for user testing. Therefore, the MVP team has selected it as the basis of its next round of discussions with stakeholders, partners and prospective users.
At the time of this writing, the MVP team is configuring a prototype service that will provide a safe place to store, share, and prepare digital media files for archiving and publication. Through a pilot implementation based on this platform, we can address the technical and policy issues involved in creating data connections with partner services; prepare deployment and adoption strategies (which use cases do we target first? On what timeline do we introduce particular features, workflows, schemas and other localization?); and develop the financial projections and metrics necessary to make the business case for the service.
NEXT STEPS: DEMONSTRATOR PROJECT; CONSULTATIONS WITH SPONSORS, ADVISORS & PARTNERS
Our next step is to develop a demonstration of an ECM-based service that we can bring to our community of users and other campus scholars for review.
Conferences and campus events:
• iPres2009, the International Conference on Preservation of Digital Objects, organized and hosted by the California Digital Library
• PASIG-SF, the fall meeting of the Sun Microsystems-sponsored Preservation and Access Special Interest Group
Slide presentations: http://lib.stanford.edu/pasig
• eScholarship: http://escholarship.org/
UC3 Micro-services: http://www.cdlib.org/inside/diglib/
Sakai 3: http://sakaiproject.org/future-directions
ePrints demonstration site: http://demoprints3.eprints.org/
Islandora: http://islandora.org (or http://islandora.ca)