The Dartmouth Digital Library Program
A Report from the Digital Projects and Infrastructure Group (DPIG)
12 September 2011
- Digitized Library Collections
- Digital Publishing
- Digital Exhibits
- Proposal, Evaluation, and Selection
- Discovery and Planning
- Digital Conversion
- Metadata and Markup
- Packaging and Archiving
Draft Policies
- Selection Policy for Digitization Projects
- Digital Preservation Policy
- Rights Management Policy
Appendices
Appendix I: Comments on The Dartmouth Digital Library: Program, Priorities, and Policies (April 2010) Recommendations
Appendix II: Detailed Workflows
Proposal, Evaluation, and Selection
Discovery and Planning
Metadata and Markup
Packaging and Archiving
Appendix III: Technical Infrastructure
Recommendations for Infrastructure Development
The Major Components of the Infrastructure and Their Behaviors
Appendix IV: Web Form: Proposals for the Digitizing of Library Collections
Appendix V: Proposal Guidelines
Appendix VI: Project Management Worksheet
Appendix VII: Digital Project Item Management Worksheet
Appendix VIII: Example of Shared Staffing Model
Pamela Bagley, Research and Education Librarian
Peter Carini, College Archivist
Barbara DeFelice, Director, Digital Resources
Anthony Helm, Head of Digital Media and Library Technologies
Wess Jolley, College Records Manager
Eliz Kirk, Associate Librarian for Information Resources
Paul Merchant, Senior Programmer, Digital Library Technologies Group
Barb Sagraves, Head, Preservation Services
David Seaman, Associate Librarian for Information Management (Chair)
Cecilia Tittemore, Head, Cataloging and Metadata Services
Helen Bailey, Preservation Specialist
Eric Bivona, Senior Programmer, Digital Library Technologies Group
Bill Ghezzi, Cataloging and Metadata Services Librarian
Laura Graveline, Visual Arts Librarian
Roberto Hoyle, Senior Programmer/Analyst, Digital Library Technologies Group
Mark Mounts, Business & Engineering Reference Librarian
Mina Rakhra, Cataloging and Metadata Services Librarian
Jennifer Taxman, Head, Access Services
The Dartmouth Digital Library Program promotes innovative research and teaching through the strategic digitization of Library holdings and the open access publication of new digital scholarship. The Program responds to increasing demand for global access to our collections, and focuses the expertise of our Library staff to serve these needs. Dartmouth's world-class library is committed to exploiting fully the new opportunities for discovery that the digital era provides, and to being as engaged in the working lives of our users in the 21st century as we have been in past centuries. Planning has been underway across the Library for several years, including a 2010 Report that set our current scope and priorities. Now we are using our deep subject expertise, collection-building skills, and technical knowledge to create workflows, policies, and infrastructures to shape, describe, and preserve our online collections and publications.
Scholarship is transitioning from print to electronic access in many fields, and we are thinking strategically about which of our Library holdings would serve Dartmouth best if they were available online. Our physical collections are still heavily used, but we want to realize new teaching and research potential through selective digitization of them. Dartmouth faculty and students are increasingly sophisticated users and creators of digital content, ambitious to teach, research, curate, and collaborate in new ways. To enable these activities, it is time for more of our library materials to be accessible online.
Digital content has been part of Dartmouth's service offerings since the Dartmouth Dante Project began in the 1980s. In the early 1990s we were among the first American institutions to make available online the full text of literary, religious, and philosophical works; later, in 2001, the Library founded the Digital Publishing Program "to manage scholarly information produced by our own faculty and students and to communicate it to the rest of the world" (Dartmouth College, 2002).
More recently, we have embarked on a number of digital projects, to serve immediate needs and to inform our planning with the practical lessons learned from undertaking the work of digitization, preservation, delivery, and assessment. These projects include the digitization of rare books such as Il Clarissimo Poeta Ovidio De Arte Amandí Libro Primo Chominza (Rauner Incunabula 89), manuscripts such as the Brut Chronicle (Rauner Codex 003183), and the papers of Samson Occom and his contemporaries. The digitization of selected compositions of composer Jon Appleton provided experience working with audio materials, while the digitization of historic films related to Dartmouth in the 1940s and 1950s furnished experience with film and video preservation, digitization, and delivery. The Library has also produced digital collections of photographic materials including the Stefansson arctic images (Rauner Stefansson Mss-225 & 229), while a project to digitize the Sanborn Fire Insurance Maps of New Hampshire Towns, 1880s-1920s, provided experience with large format materials. And the digital and print publication of William Scott's The Artistry of the Homeric Simile created a partnership with the University Press of New England (UPNE) as well as experience with the publication of "born digital" scholarship.
The practical experience gleaned from these early projects is of great value as we create the policies, processes, and priorities that make up this Program, and we are confident that we can move forwards with a set of services that will make a difference in the lives of our users. To achieve this, the Program articulated here is built on a foundation of selectivity, standards, and sustainability, and driven by the twin goals of access and assessed value:
Select: The Program makes it easy for Library staff to propose projects through a web form, but has a rigorous evaluation process to select which proposals we take through to completion. We aim to invest our time and funding in areas that have a high return relative to the investment. "High return" is not simply a matter of popularity on the web: while we have published items that get heavy general use (Who's Who and What's What in Dr. Seuss gets thousands of accesses a month), a body of content that is immediately useful to a particular Dartmouth class or a single researcher is also often a high-yielding, high-priority project.
The Program will work closely with the Collections Management and Planning Group (CMPG) to ensure that the selection process meets the Library's overarching goals for the growth of its collections. A subcommittee of CMPG will review and prioritize proposals through that lens.
Standardize: Throughout this Program we are identifying and following international best practices and standards for the digitization, open access delivery, and preservation of our online materials, from XML encoding and high-quality master files, to best practices for managing long-term digital repositories.
Sustain: We seek to ensure the sustainability of this Program by combining newly-endowed staff positions in digital production and program leadership with existing staff members in a range of Library departments. This distributed staffing model draws expertise from across the Library, and embeds the Program in the day-to-day work of multiple departments.
Access: From a rights management policy to the attention we give to how best to deliver, describe, and publicize our content, we are driven by an urge to ensure the widest practical accessibility for our digitized collections and open access publications.
Assess: The Library has worked diligently in recent years to instill a culture of assessment throughout the organization, and this Digital Library Program is no exception. As part of the planning of a project we will set assessment markers to measure its impact, and the Program as a whole will be assessed annually.
We believe that this new Program is of strategic importance for the Library, aligned with our mission to foster intellectual growth by supporting excellence and innovation in education and research 1 and our vision of "inspiring ideas for personal transformation and global impact."
We look forward to the day, not far hence, when students analyze medieval manuscripts from Rauner on their iPads, or view a century of geo-tagged Dartmouth photographs on their smart phones; when users from all over the world tap into online dissertations and new kinds of digital Dartmouth scholarship; and when our alumni enjoy archival films showcasing Dartmouth College life from decades gone by.
1 Dartmouth College Library Mission Statement: “The Dartmouth College Library fosters intellectual growth and advances the mission of Dartmouth College and affiliated communities by supporting excellence and innovation in education and research, managing and delivering information, and partnering to develop and disseminate new scholarship.”
This document outlines a Digital Library Program that allows us to provide local digitization and open access digital publication services in an orderly manner as an ongoing Library endeavor. This report incorporates and expands on 2010's The Dartmouth Digital Library: Program, Priorities, and Policies, which laid out the scope and purpose of the Program, and offered recommendations for moving us forwards.
A number of the recommendations outlined in the 2010 report have been achieved or are in active development, including the adoption of the Program's scope and purpose; ongoing infrastructure design; a developing expertise in the Digital Projects and Infrastructure Group (DPIG) for evaluating project proposals and designing workflows; fund-raising priorities given to the Digital Library Program; and a willingness on the part of multiple departments to take on new collaborative roles as part of a shared staffing model. Other 2010 recommendations are carried over here, after review to ensure their continued validity. See Appendix I for a full report.
Over the last two years DPIG has drafted several policies to support the Library's Digital Program, including a selection policy for digitization projects, a digital preservation policy, and a rights management policy. We believe these statements provide the essential intellectual infrastructure needed for the implementation of a successful digital library program.
In FY11/12, DPIG is designing "a robust and adaptable digital production process for local collections that includes selection and production policies and a prioritization process to maximize the impact on education and scholarship" [Dartmouth College Library FY11/FY12 strategic objective]. This document is the result of that work and an identification of essential next steps.
The Dartmouth Digital Library Program comprises the following services, as a complement to our purchasing of ebooks, electronic journals, and databases.
Digitized Library Collections
This category contains digitized surrogates of physical items already in the Dartmouth College Library collections. A good digital collection facilitates discovery, access, analysis, interoperability, and re-use; it combines objects, metadata, and user interfaces to create a satisfying user experience, and it is "born-archival"; that is, it is created to preservation standards that help ensure its permanence in the collection. All media types in which the Library collects are in scope for the digital program, including but not limited to: film, video, audio, manuscripts, books, maps, dissertations, theses, photographs, and realia. Digitization at Dartmouth is designed to be a highly strategic endeavor, however, as demand will always outstrip our digitization capacity. While the scope is wide, the work is bounded by the priorities, policies, and procedures outlined in this document.
Our open access digital publishing endeavors focus on scholarly publications that are "born digital" and created by Dartmouth faculty: that is, books, journals, or editions which have their original publication in a digital format, rather than items published in paper form (or other analog media) and converted through digitization.
As a practical qualification to this scope, however, the Library has a limited capacity to support new journals, given the ongoing commitments to produce and archive them. We are better able to support original scholarship that requires a finite span of intensive work in its creation followed by the lower-energy phase of archiving. Therefore, preference will be given to "encapsulated" publications such as monographs or edited items (letters, a video, a collection of visual objects accompanied by text). Recent activity has included a critical edition and transcription of a rare manuscript, several journals, and a monograph in partnership with the University Press of New England (UPNE).
Dartmouth College Library has a growing ambition to deliver digital versions of selected Library exhibits, and this Program aims to incorporate and support these developing needs. Discussions are underway initially with Baker-Berry exhibit staff to determine how best to decide when to provide online versions of physical exhibits. Exhibits from all parts of the Library system are in scope, however, and we recognize that the exhibits program encompasses multiple libraries. The digital production capacity that the Dartmouth Digital Library Program brings us can also help with the creation of digital objects for online exhibits.
2 This definition is based on the NISO Framework of Guidance for Building Good Digital Collections (NISO, 2007, p. 4).
3 This concept is borrowed from Smith (2007), p. 19, and from the Sixth International Conference on Preservation of Digital Objects (2009).
4 Kirk (2008), p. 3.
Dartmouth College Library has adopted the following priorities in developing its digital program (listed in order of importance):
- Selective digitization of library holdings. Priority will be given to projects that have a clear teaching or research impact, and which will make a difference to a known user community. Project selection should support the Library's overall collection development goals.
- Support for publication of original scholarly works in digital form.
- Transfer of existing electronic journals into the Dartmouth publication space.
- Support for publication of newly created electronic journals by Dartmouth editors.
These priorities are informed by our budget realities, our staffing resources, our existing and developing infrastructure, and our experience with ongoing digitization and e-journal projects.
NISO's Framework of Guidance for Building Good Digital Collections (2007) promotes nine characteristics, which we are applying to our online collections, publications, and exhibits:
- Intentional: created according to an explicit policy.
- Clear: described in a manner to allow one to determine the authenticity, integrity, and interpretation (scope, format, restrictions on access, ownership) of the item.
- Curated: actively managed during its lifecycle.
- Accessible: avoiding unnecessary impediments to use, and accessible to persons with disabilities.
- Respectful: conscious of the intellectual property rights of all parties.
- Useful: supplies data that allows standardized measures of usefulness.
- Interoperable with other systems, both local and international.
- Integrated into the user's teaching and research workflows.
- Sustainable over time.
These principles inform our thinking in the formation of the program definitions, policies, workflows, infrastructure, and recommendations contained in this document.
The Digitization Workflow outlined here spans many Library departments. In this regard, it is similar to the workflows that are applied to newly acquired materials, where items move through a number of departmental processes depending on their type and condition. This Digital Library Program, therefore, like so much of our daily work, is built on and will be sustained by a shared staffing model, even after full-time digitization program staff members are in place. Moving forwards, we continue to see key roles being undertaken by existing staff in a variety of departments, although this model is unlikely to succeed without the addition of a full-time Digital Production Manager, who will manage, document, and evaluate the digitization process for multiple simultaneous projects, and a Director of Digital Initiatives, who will coordinate and develop the Program and its policies as a whole, promote our services, liaise with stakeholders, and provide ongoing assessment (see Resources, below). The following sections outline in summary form the main steps we see as necessary in the Program, and begin with a graphical summary of the workflow. More detailed descriptions of these stages, with color-coded flow diagrams, can be found in Appendix II: Detailed Workflows.
The workflow articulated here is a generalized one that can be applied to manuscripts, photographs, films, sound recordings, born-digital items, or files that are the result of previous digitization. 5 While the focus in designing these workflows has been chiefly on the digitizing of materials from our Library collections, the processes also explicitly support the creation of digital content for exhibits and born-digital publishing, as well as its delivery and long-term archiving.
The stages discussed here are the result of planning sessions drawing on expertise from many departments, and tempered by what we have learned in our recent digitization and publishing projects.
These workflows require the development of a largely new technical infrastructure to support the Program. Appendix III offers recommendations on building out that technical infrastructure and lists the characteristics, or “behaviors,” of its five major components:
- Metadata Workspace System
- Long-Term Object and Metadata Storage System
- Item Tracking Worksheet
- Project Management Worksheet
- User Delivery System
In FY11’s planning we have deliberately avoided recommending specific software packages or hardware configurations, but have concentrated on what we want the infrastructure to do for us. In FY12, members of DPIG will refine these technical infrastructure elements, and will work with DLTG to recommend the hardware and software that can instantiate these desired repository behaviors.
The first step in the process is to gather recommendations for desirable and impactful items and collections to digitize. To achieve this, any Library staff member or committee can propose a collection or item to be digitized by completing the Project Proposal Form that is available in the Library Staff Web [see Appendix IV].
The Collection Management and Planning Group (CMPG) will develop a database for proposals that it will then prioritize in terms of how the individual projects support the Library’s overarching goals for the growth and management of its collections. The desired outcome of this process is to continue to develop the Library’s collections in a manner that is intellectually coherent and responsive to the resource needs of the Dartmouth community and our extended collecting partners (especially BorrowDirect).
Good proposals describe collections and items that have a high impact relative to the effort and cost of digitizing them and support the Library’s overall collection development goals. The proposer uses the guidelines for filling out the form [see Appendix V], and proposers and reviewers use the Selection Policy for Digitization Projects as a guide to developing and reviewing the proposal [see Draft Policies section]. The evaluation therefore considers the impact, timing, cost and complexity of the project. The proposal is shared with DPIG and bibliographers and the information is made available on Staff Web. A designated DPIG member reads the proposal and consults with the proposer if more information is needed.
DPIG then reviews the revised proposal to determine whether to move forward with the discovery and planning for the project, return to the proposer for more information, or decide that the project is out of scope. The result is made known to bibliographers and DPIG. If the project advances, a Project Manager is appointed.
This method of selection will inevitably result in some projects that digitize only a small portion of a given collection. For example, we already have two 19th-century comic books online (The Adventures of Mr. Obadiah Oldbuck and The Fortunes of Ferdinand Flipper) from a larger collection of 89 caricatures and cartoons listed for Special Collections in the Library Catalog. From a collections standpoint this may be undesirable – if our comic book collections are a priority we would rather have all of them digitized – but the impact on a given class of scanning the two titles it was concentrating on was sufficiently high-yield and high-service to justify these “one-offs”. As the Program matures and capacity increases, we would hope to be able to balance these service-driven requests for items with our ability to decide as librarians what collections as a whole should be digitized, and our ability to liaise with faculty and students on larger-scale selection priorities. DCAL, the Library’s student advisory group, and the Council on Libraries are all potential partners in this discussion.
Discovery and Planning
When a project passes the initial selection and evaluation process, a Project Manager is identified, and a Project Team is formed to conduct a thorough review of the project and create an action plan based on input from all relevant areas of expertise. The Team may often include (or consult with) members who are not part of DPIG, including other Library staff and faculty. This team may discover information at this stage that leads to the rejection of the proposal, or at least its postponement pending rights, resources, or other questions being successfully resolved. Appendix VI gives an example of the Project Management Worksheet we use to manage this process and record our decisions.
If DPIG decides that the project should move forwards after the discovery phase, the Project Team develops a success statement for the project, and plans the digitization workflow for that collection of objects. The Project Team typically includes individuals from Preservation, Cataloging and Metadata Services, and other stakeholders or subject experts as needed for a given project. The Project Team reports back to DPIG the results of the workflow planning for the collection, including delivery and curation of the digital objects, metadata needed, and a description of how users will access the product.
The Digital Production Unit provides an ongoing, professional production capacity within the Dartmouth College Library. All work undertaken by the Library will meet current digital library best practices for digital preservation, digitization metadata, and digital capture benchmarks. 6 This Unit is responsible for documenting the production workflows it manages, and will collaborate with other parts of the Library that have production expertise and equipment where they exist, especially the Jones Media Center for audio and video re-formatting.
Digitization may take place in the Library or it may be contracted out to a commercial vendor when budgets allow and the material in question can safely travel. In both cases the Digital Production Unit is responsible for making sure that the material is digitized to agreed-upon standards, technical metadata sufficiently describes the digitization processes, our file-naming conventions are applied, the correct derivative files are generated for the subsequent delivery stage, and the materials are packaged up and ingested into the long-term repository.
Appendix VII is a draft of the item management worksheet that will be filled in as an item travels through the various digitization stages. This information will be included in the future electronic project management tracking and workflow database, to be created as part of the Digital Production Infrastructure.
Metadata and Markup
Descriptive metadata and text markup standards, design, and activities are determined by the nature of the desired user discovery experience. During Discovery and Planning, the Project Team describes how library users will query and interact with the digital item(s), and a metadata plan emerges from this description. Metadata and markup activities include defining the appropriate schema(s) to use, the “level” of objects to be described, controlled vocabulary development, and the extent of the description to be created.
Metadata is created and/or transformed to meet the definitions in the project plan. This work can take place before, during, or after digital conversion, whether that conversion is done in-house or is outsourced. Pre-existing descriptive metadata for the physical item(s) are transformed to describe the digitized item(s), and/or new descriptive metadata are created. When needed, XML text markup is created for textual materials.
Through quality assurance tests and sample metadata sets, the metadata specialist assesses the outcome to ensure that the project’s success statement will be met, and that user discovery is supported. In some cases, the metadata specialist will plan for future enhancement of descriptive metadata or markup as new uses of a collection emerge.
Packaging and Archiving
At this stage, the master files, access files (“derivatives”), and associated technical, descriptive, and rights metadata are packaged up and ingested into the repository infrastructure. Today, this consists of secure SAN storage in the College’s computing system, and an agreed-upon layout of files and directories manually managed by DLTG and the Digital Production Unit. The infrastructure we need moving forwards, whose behaviors are outlined in Appendix III, will be more automated and will include a formal data-packaging model (probably METS, used widely in other libraries); data integrity checks through frequent audits and validation tests; and redundant storage. Disaster recovery processes will be established and annually reviewed. If a digital object no longer contributes to the collecting policies of the Library it may be removed from the collection following established procedures for review and de-accessioning.
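The "data integrity checks through frequent audits" mentioned above are commonly implemented as fixity checks: a checksum is recorded for every file at ingest and recomputed later to detect loss or silent corruption. The following is a minimal sketch of that idea only, not a description of the Library's infrastructure. The function names, the choice of SHA-256, and the in-memory manifest are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(package_dir: Path) -> dict[str, str]:
    """Map each file's path (relative to the package) to its checksum."""
    return {
        p.relative_to(package_dir).as_posix(): sha256_of(p)
        for p in sorted(package_dir.rglob("*"))
        if p.is_file()
    }


def audit(package_dir: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Compare the package on disk against the manifest recorded at ingest."""
    current = build_manifest(package_dir)
    shared = manifest.keys() & current.keys()
    return {
        "missing": sorted(manifest.keys() - current.keys()),
        "unexpected": sorted(current.keys() - manifest.keys()),
        "changed": sorted(k for k in shared if manifest[k] != current[k]),
    }
```

In this sketch, a manifest would be built once when a package is ingested and stored alongside it; a scheduled audit then reports any file that is missing, unexpected, or changed, and an empty report means the package is intact.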
User Interface and Platform
Whenever possible, all content will be accessible from a central digital library web space; items are also part of the Library collection and will be reflected in the Library catalog and in Special Collections finding aids/collections web pages. Whenever possible, all content will also be made available for harvesting into external systems such as Summon and OAIster, as an expression of our ambition to share our holdings widely. In occasional cases, we may digitize materials for local use that cannot legally be shared globally.
Content will be matched to the most appropriate online delivery platform extant in the Dartmouth Digital Library system, and delivered through that interface. In some cases, content will be delivered through multiple tools – a general digital collections interface and a more specialized image manipulation tool, for example. 7
Delivery tools and desired uses for digital content will change over time. In anticipation of this, master files are created to ensure maximum flexibility for future use. The file formats chosen, metadata recorded, and storage practices used for master files are carefully selected from existing standards (TEI, TIFF, OAIS, PREMIS, etc.) that have been developed with long-term accessibility and curation in mind.
The Library does not typically charge fees for access to its resources, and this will hold true for materials that are produced through the Digital Library Program. When legally possible, materials are made available to a global audience through policies, practices and licenses that enable open access to the materials, and re-use of the materials for scholarly purposes (while recognizing the copyrights of all parties involved).
To this end, the Library has developed "Rights Policies for the Publication of and Access to Digital Works" [see Draft Policies section]. The key provisions are:
- Respect for and adherence to U.S. copyright law;
- Assertion of rights to use copyrighted information to the fullest extent provided by law;
- Author retention of copyright;
- Assertion of Dartmouth College copyright or trademark (where appropriate) for work produced by the Library, work done to create the digital “home” or web site on which a resource resides, and the logos and marks of the College;
- Licenses for use that enable limited permissible uses by other parties without case-by-case intervention by the Library.
Assessment is integrated throughout the Digital Library Program and includes these components:
- Assessment of the value of the project: A needs assessment is built into the project proposal review process, where the demand for the project, how the project supports the Library’s overall collection development goals, and the kinds of use made of the project are integral parts of the evaluation of the project.
- Assessment of each major step in the project: Before the project can move from one major step in the workflow to the next, the success of that step is evaluated by DPIG, after reports from the Project Team. The Digital Production Manager will take a leading role in this assessment process.
- Assessment of the success of each project: Each project has a “success statement” which will be used to determine when the project has completed its goals.
- Assessment of the impact of each project over time: Each project will develop its own impact assessment, in coordination with the key stakeholders, which may include the subject specialist, faculty members involved, intended users, and grant funders. This assessment will draw on expertise in the Library’s Assessment Committee and User Assessment Group. If a project is deemed by a stakeholder to be no longer relevant or useful, it will be considered for withdrawal.
- Assessment of the Program as a whole: DPIG will report on the progress of the Program as a whole through its annual report, which will include sections on the sustainability of the current plans, the number of projects, staff involved, and impact of the projects. The Director of Digital Initiatives will take a leading role in this process.
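The step-by-step assessment described above, in which a project cannot move to the next major step until DPIG has evaluated the current one, can be sketched as a simple gated sequence. This is an illustrative model only: the stage names are taken from this report's workflow headings, while the `Project` class, its methods, and the strictly linear ordering are assumptions (the report notes, for instance, that metadata work may happen before, during, or after conversion).

```python
from dataclasses import dataclass, field

# Stage names follow the workflow headings in this report; treating
# them as a strict linear sequence is a simplifying assumption.
STAGES = (
    "Proposal, Evaluation, and Selection",
    "Discovery and Planning",
    "Digital Conversion",
    "Metadata and Markup",
    "Packaging and Archiving",
)


@dataclass
class Project:
    """Hypothetical tracker for one project's position in the workflow."""
    title: str
    stage_index: int = 0
    approved: set = field(default_factory=set)

    @property
    def stage(self) -> str:
        return STAGES[self.stage_index]

    def approve_current_stage(self) -> None:
        """Record that the current stage has been evaluated as successful."""
        self.approved.add(self.stage)

    def advance(self) -> str:
        """Move to the next stage, but only after the current one is approved."""
        if self.stage not in self.approved:
            raise RuntimeError(f"'{self.stage}' has not been approved yet")
        if self.stage_index == len(STAGES) - 1:
            raise RuntimeError("project is already at the final stage")
        self.stage_index += 1
        return self.stage
```

Calling `advance()` on a new project raises an error until `approve_current_stage()` has been called, which mirrors the requirement that each step be evaluated before the next begins.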
5 An instance of the latter case could be one in which text pages were originally scanned to generate images of the pages, and now those images are being subjected to OCR and the resulting text is marked up so as to produce a searchable corpus of the original content.
6 A local document outlines these digitization standards (Dartmouth College Library, 2006).
7 The Brut Chronicle is a good example: /~library/digital/collections/manuscripts/ocn312771386/index.html
To make this new service sustainable will require significant financial and intellectual capital, and the collective desire to invest in this Program over other worthy service-oriented goals. We need policies to provide clarity and rigor in scope, selection, and priority setting, and we need funding and staffing commensurate to this activity’s importance to the Library.
A Program of this scope and priority needs both full-time and shared positions dedicated to it. Minimum suggested staffing includes the following:
Director of Digital Initiatives, whose job is to make sure the Program moves forward effectively. The Director of Digital Initiatives will coordinate the many groups involved in the various portions of this Program, manage the budget of the Program, promote its work, assess its success, and seek outside resources and partnerships to extend its reach. This person will be the public face of the Program, reaching out to users and managing selected major projects.
Digital Production Manager, whose job is to undertake the creation of digital content efficiently and in accordance with the policies of this Program. The Production Manager will make sure that the Digital Production Unit runs smoothly, both coordinating local production processes and using outside vendors when cost-effective. This position will also be responsible for the design and documentation of production workflows, evaluation of equipment, project tracking, and assessment of the production piece of the Program. Filling a key role in the middle of a workflow that ranges from proposal to preservation, this person will have to be able to coordinate with members of the Project Teams working on the pre- and post-production stages. Management of digital conversion should be attached to a department with existing digitizing and workflow expertise; Preservation Services is a good candidate.
In addition, existing staff members from a variety of departments have already committed time and expertise in a growing collaborative team, and the continuation of this is vital to the ongoing Program. Appendix VIII illustrates how this shared staffing model has worked in a recent project.
The statements below also represent a commitment to strategic growth within the departments themselves, which are eager to align themselves with our shared digital future.
Subject specialists/liaisons maintain a unique role in the digitizing process due to their rich relationships with their faculty and with the collections supporting the curriculum and research in their subject areas. Subject specialists/liaisons collaborate with faculty in their teaching and in growing the library’s collections to support both the faculty’s research and curricular needs. Due to their sustained development and maintenance of the library’s collections they retain a rich understanding of its strengths and weaknesses over time as well as which elements of the collections may be significant candidates for digitization. http://researchguides.dartmouth.edu/subjectlibrarians/
CMPG is responsible for setting the Library’s overarching collection development goals and for monitoring how the collections grow and are managed. CMPG provides expertise in the selection, acquisition, bibliographic description, and preservation of collections in all formats, as well as the user services created to support the Dartmouth community’s effective use of the collections. CMPG will appoint a subcommittee to review and prioritize proposals for digitization from a collection development point of view.
Cataloging and Metadata Services is responsible for the design of intellectual access, descriptive metadata, and text mark-up for the Digital Library Program. It is also responsible for the creation, organization and management of all descriptive metadata and text mark-up for materials in the Program. This responsibility includes the design of user access models and the application of appropriate schemas, standards and vocabularies to ensure successful discovery by library users. It also includes the creation of detailed inventories of materials to be reformatted, and analysis of the availability of descriptive metadata. At the core of the department’s contributions to the Program is the intellectual process of analyzing the content of materials and creating descriptions, applying mark-up to text, and applying vocabularies to enhance discovery and scholarship. Through careful management of metadata, the department’s activities ensure that the metadata can be shared with external systems, repurposed for alternative discovery platforms, and preserved within the context of a digital preservation plan. Cataloging and Metadata Services staff members are uniquely qualified to organize and manage complex production throughput processes, and can assume responsibility for project set-up and throughput management for the Program at any time. The department assumes an initial level of work for the Program at approximately five FTE, in an anticipated “start-up” range from three to seven FTE depending on the needs of the Program at any given time. It is deeply embedded in the management culture of the department to absorb new projects, adjust priorities and staff resources as needed, and move from project to project in a rapid, flexible, continuous manner.
Preservation Services has assigned .25 FTE of the Collection Conservator's time to reviewing and conserving items before and after digitization. Additional time of conservation technicians or students is available as needed. Preservation Services is also building expertise in digital preservation and curation; the Head of Preservation Services dedicates about .10 FTE and the Preservation Specialist .25 FTE to these areas. Furthermore, the Readex Conservation Technician is assigned to Digital Production for .5 FTE, and the Preservation Specialist could take on .25 FTE in this area if needed. There is a clear need to have a digital preservation librarian in the future.
The Digital Production Unit, soon to be headed by the Manton Digital Production Manager, provides the orderly digitization of Library collections, whether the work is conducted in house on our local scanning equipment or sent out to a vendor. This unit typically sits in the middle of a project workflow, after selection and preservation of physical objects, and prior to delivery.
The Digital Library Technologies Group will provide significant dedicated programming time, especially during the development phase. DLTG members have been invaluable during this year’s workflow and infrastructure planning and we expect to see similar consultative roles continue into the future. DLTG staff will be primarily involved in implementing the electronic versions of the Project Management Worksheet (PMW) and the Item Tracking Worksheet (ITW) and will design the long-term storage infrastructure. Based on specifications that have been developed thus far, DLTG expects at least 1 FTE-year to get these systems to a level of general usability, with ongoing development at a reduced staff load following. Current staffing levels in DLTG will be a challenge to these ambitions.
Special Collections has proven to be a significant participant in digital projects to date and, due to the nature of its holdings, Special Collections expects to continue to support the Library’s digital program. The Department is working to reposition its staff to support digital projects and collection building. The newly filled Manuscript Processing Specialist position will spend a portion of her time assisting with collection preparation and delivery for digital projects. In addition, Special Collections expects to have other members of the staff participating in, or leading projects, in the future.
Digital Resources Director: 25% of this position is now dedicated to the digital publishing program.
Web Team: the Web Team’s role to date has been limited by current staffing levels, but the Team is ambitious to provide HTML support and design expertise for the public-facing websites that deliver and contextualize the Digital Library content.
General Library staff: in addition to the departments named above, we will look out for opportunities to cross train staff elsewhere in the Library system, as needs and interests dictate. For example, staff who already possess scanning expertise in Access Services may be able to contribute to digital production processes if they have downtime.
The Digital Library Program requires the implementation of a technology infrastructure to support the work of the staff through tools to support project and metadata management, safeguard the materials produced over the long term, and deliver a good user experience.
- The Digital Library Program requires the implementation of a long-term storage repository in order to proceed further. Highest priority should be given to the articulation of specifications for this storage repository, and subsequently to its development and implementation, including retrospective ingest of existing digital files and accompanying metadata and application of digital preservation policies. If new staff positions are created and filled, these new staff members should be directed to contribute to this effort until it is accomplished.
- The Digital Library Program requires the implementation of project management, item management, and metadata management support systems. The workflows described in this report depend on these support systems to enable coordinated contributions from all project staff and simultaneous progress on many projects. These support systems must be available to and usable by everyone who contributes to the Program, from any workstation on the network. High priority should be given to the articulation of specifications for these systems, and subsequently to their development and implementation, in order for the workflows in this report to be implemented.
- Program workflows require the creation of unique object identifiers (i.e., semantic identifiers that become the basis for URLs) to ensure persistent identification and access throughout the process. The Library should adopt a scheme for the creation of these identifiers and apply it retrospectively to pre-existing digital collections.
- Documentation of policies and practices for preservation, metadata, long-term storage, rights policy, preferred user delivery systems for various data formats, and generalized workflows for common object types needs to continue to be developed.
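As an illustration of the identifier recommendation above, the following is a minimal sketch of how a semantic identifier might become the basis for a URL. It is not the Library's adopted scheme, since the report does not specify one. The URL pattern is modeled on the Brut Chronicle address cited in the footnotes; the function names, the `BASE_PATH` constant, and the assumption that an identifier is a short lowercase prefix followed by a record number are all hypothetical.

```python
import re

# Hypothetical base path, modeled on the Brut Chronicle URL cited in the report.
BASE_PATH = "/~library/digital/collections"

# Assumed identifier shape: lowercase letters followed by digits (e.g. "ocn312771386").
ID_PATTERN = re.compile(r"^[a-z]+[0-9]+$")


def make_object_id(prefix: str, record_number: int) -> str:
    """Combine a prefix and a record number into a single identifier string."""
    object_id = f"{prefix.lower()}{record_number}"
    if record_number <= 0 or not ID_PATTERN.match(object_id):
        raise ValueError(f"not a valid identifier: {object_id!r}")
    return object_id


def slugify(name: str) -> str:
    """Reduce a collection name to lowercase, URL-safe characters."""
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    if not slug:
        raise ValueError("collection name has no usable characters")
    return slug


def object_url(collection: str, object_id: str) -> str:
    """Derive the delivery URL path from the collection name and identifier."""
    if not ID_PATTERN.match(object_id):
        raise ValueError(f"not a valid identifier: {object_id!r}")
    return f"{BASE_PATH}/{slugify(collection)}/{object_id}/index.html"


if __name__ == "__main__":
    print(object_url("Manuscripts", make_object_id("ocn", 312771386)))
```

Because the URL is derived entirely from the identifier and the collection name, applying the scheme retrospectively to existing collections would amount to assigning each object an identifier and regenerating its address.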
The ambitions, workflows, and procedures outlined here are the work of many individuals across the Dartmouth College Library and represent, we believe, the firm foundation for a new area of strategic concentration for the Library. In addition, we have the opportunity to develop these policies, services, and content in partnership with other peer institutions with similar ambitions, perhaps within the BorrowDirect consortium as part of its evolving collaborative work around collections. We feel confident that we have the main issues and challenges mapped out and are ready to implement a Program whose potential, seen in miniature through the projects we have undertaken during the planning stage, is to offer exciting, even transformative, new services and ways of working with our collections for research, teaching, and life-long learning.
Blue Ribbon Task Force on Sustainable Digital Preservation and Access. Sustainable Economics for a Digital Planet: Ensuring Long-Term Access to Digital Information. http://brtf.sdsc.edu/biblio/BRTF_Final_Report.pdf
Consultative Committee for Space Data Systems. Reference Model for an Open Archival Information System (OAIS). http://public.ccsds.org/publications/archive/650x0b1.pdf
Dartmouth College Library (2002, March 3). Dartmouth library builds 'born digital' capability. Press release. Retrieved January 11, 2010, from http://www.dartmouth.edu/~news/releases/2002/march/030302b.html
Dartmouth College Library (2006). Digital publishing standards. Retrieved January 11, 2010, from /~library/home/about/digpub_standards.html
Dartmouth College Library (2009). Baker-Berry Library exhibits policies & guidelines. Retrieved January 13, 2010 from /~library/leo/exhibits_policy_06_2009.pdf
Digital Preservation Coalition: Digital Preservation Handbook. http://www.dpconline.org/advice/preservationhandbook/introduction/definitions-and-concepts
Digital Projects and Infrastructure Group. The Dartmouth Digital Library: Program, Priorities, and Policies. /~library/admin/docs/DDL-policy.pdf
iPres (2009). Sixth international conference on preservation of digital objects. Retrieved January 11, 2010, from http://www.cdlib.org/iPres/call_for_abstracts.html
JISC, DPC, Digital Archives Department of the University of London Computer Centre, Portico. JISC Project Report: Digitisation Programme. http://www.jisc.ac.uk/media/documents/programmes/digitisation/jisc_dpp_final_public_report.pdf
Kirk, E. (2008, July). Dartmouth digital library services: Refining our structures and direction. Unpublished white paper, Dartmouth College Library. Retrieved January 11, 2010, from /~library/col/0809/docs/DC_Lib_Dig_Pub_Directions.pdf
NISO (2007, December). A framework of guidance for building good digital collections (3rd edition). NISO: Baltimore, MD. Retrieved January 11, 2010, from http://www.niso.org/publications/rp/framework3.pdf
Oxford Digital Library. Metadata in the Oxford Digital Library. http://www.odl.ox.ac.uk/metadata.htm
Smith, A. (2007). "Valuing preservation." Library Trends, 56(1), 4-25.
Archival Information Package (AIP) - An information package consisting of all production content information and its complete associated descriptive metadata (DMD) and preservation description information (PDI), which is deposited into long-term storage.
Archive - Place where objects are deposited with the expectation that they may be accessed for use long into the future.
Archiving - Activities that enable long-term retention of digital materials. Together with curation, often referred to as stewardship.
Backup - Duplication of data either on-site or at a location removed from the original data. Assumes no managed activity to ensure data is accessible in the future.
"Cooked" Digitized Files - Files produced from the production steps immediately after digitization, including such processes as image cropping and straightening, OCR generation and correction, XML markup, and quality assurance. Depending on the type of digitization process used, "cooked" digitized files may include Master files (such as images that have only been cropped and straightened) and Derivative files (such as low-quality delivery images or XML markup). In certain cases files may be "cooked" all or in part by a vendor.
Curation - Activities that enable use and long-term accessibility. In digital preservation, curation and archiving together comprise stewardship.
Dark Archive - A digital repository that is not publicly accessible; often used for secure storage and backup, and for materials embargoed for one reason or another.
Derivative Content Files - Files that are created as part of the "cooking" processes of production. Derivative content files may or may not be kept in long-term storage as part of the archival information package (AIP), depending on the specifics of a project.
Descriptive Metadata (DMD) - Information describing the intellectual content of the object, such as MARC cataloguing records, finding aids or similar schemes.
Digital Archiving - This term is used very differently within sectors. The library and archiving communities often use it interchangeably with digital preservation. Computing professionals tend to use digital archiving to mean the process of backup and ongoing maintenance as opposed to strategies for long-term digital preservation.
Digital Preservation - The series of management policies and activities necessary to ensure the enduring usability, authenticity, discoverability, and accessibility of content over the very long-term.
Digital Repository - A place where digital assets are deposited and stored.
Lifecycle - A series of stages through which something, in this case digital information, passes during its lifetime. The lifecycle for digital information includes creation, use and reuse, migration or emulation, and storage.
Lifecycle Management - Records management practices have established lifecycle management for many years, for both paper and electronic records. The major implications for lifecycle management of digital resources... is the need actively to manage the resource at each stage of its lifecycle and to recognize the inter-dependencies between each stage and commence preservation activities as early as practicable. This represents a major difference with most traditional preservation, where management is largely passive until detailed conservation work is required, typically, many years after creation and rarely, if ever, involving the creator.
Long-term Storage - A conscious decision to retain an object in perpetuity or until agreements or selection policies change. Also implies management of the object to migrate data as necessary to keep it accessible and understandable.
Master Content Files - Files that are created as part of the initial digitization steps of content capture. Master files may have undergone minor "cooking" processes such as cropping and rotating, so long as there is no loss of original content information, such as through image compression. Master files may be produced in-house or by a vendor. Master files will always be deposited in long-term storage as part of the archival information package (AIP).
Open Access - Material that is accessible at no charge to the world on the Internet. Open Access is supported by rights policies that encourage a wide variety of non-commercial uses. "Open Access is compatible with copyright, peer review, revenue (even profit), print, preservation, prestige, quality, career-advancement, indexing, and other features and supportive services associated with conventional scholarly literature." http://www.earlham.edu/~peters/fos/overview.htm
Preservation - Activities that enable the use and long-term accessibility of information; often used interchangeably with stewardship.
Preservation Description Information (PDI) - Information that is necessary for the preservation of information content, including descriptive, administrative, structural, and technical metadata.
Preservation Strategy - The series of decisions taken over the course of the digital lifecycle to ensure long-term accessibility and usability, and to reduce outstanding risks to loss and degradation of the materials.
Production Information Package (PIP) - An information package consisting of all outputs of the pre-production stage, including the original project proposal, the project management worksheet (PMW) completed through pre-production, a project title and unique ID number, identification of the project team, an object inventory, and identification of existing descriptive metadata. This package must be complete prior to beginning the production stage of a project.
Project Management Worksheet (PMW) - Record of decisions and other information about a digital production project. This will be updated as part of the workflow during each stage of the project. Eventually this will be in the form of a database/web form, but an outline version currently exists in the appendix of the Dartmouth Digital Library Policies document.
"Raw" Digitized Files - Files produced during the initial digitization (scanning, image capture, etc.) and quality assurance processes. These are the first digital files produced from the original item(s), and may or may not be saved as the Master files, depending on the type of digitization process used. "Raw" files may be produced either in-house or by a vendor.
Submission Information Package (SIP) - An information package consisting of all outputs of the production stage, including the outputs of pre-production, the project management worksheet (PMW) completed through production, the master content files, derivative content files, descriptive metadata (DMD), and preservation description information (PDI). This package must be complete prior to beginning the post-production stage of a project.
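The glossary states that a PIP must be complete before production begins and a SIP must be complete before post-production begins. A minimal sketch of how such a completeness gate might be checked follows. The component labels are hypothetical names for the items listed in the PIP and SIP definitions above, and the functions are illustrative only; the report does not describe how the planned support systems would enforce this rule.

```python
# Hypothetical labels for the PIP contents listed in the glossary.
PIP_COMPONENTS = frozenset({
    "project_proposal",
    "pmw_preproduction",        # PMW completed through pre-production
    "project_title",
    "project_id",
    "project_team",
    "object_inventory",
    "existing_dmd_identified",  # identification of existing descriptive metadata
})

# A SIP includes the outputs of pre-production (the PIP) plus production outputs.
SIP_COMPONENTS = PIP_COMPONENTS | frozenset({
    "pmw_production",           # PMW completed through production
    "master_content_files",
    "derivative_content_files",
    "dmd",                      # descriptive metadata
    "pdi",                      # preservation description information
})


def missing_components(present, required):
    """Return the required components that are not yet present, sorted by name."""
    return sorted(set(required) - set(present))


def ready_for_next_stage(present, required):
    """A package is complete only when no required component is missing."""
    return not missing_components(present, required)


if __name__ == "__main__":
    so_far = set(PIP_COMPONENTS) - {"object_inventory"}
    print(missing_components(so_far, PIP_COMPONENTS))
    print(ready_for_next_stage(so_far, PIP_COMPONENTS))
```

Defining the SIP as a superset of the PIP mirrors the glossary's wording that the SIP includes "the outputs of pre-production", so a project cannot pass the production gate without also having passed the pre-production gate.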