NSF Data Management Plan Checklist

What will you be producing? - Types of Data

Observational data captured around the time of the event

Examples: Sensor readings, telemetry, survey results, neuroimages
Usually irreplaceable

Experimental data from lab equipment

Examples: gene sequences, chromatograms, toroid magnetic field readings
Often reproducible, but can be lengthy and expensive

Simulation data generated from test models

Examples: climate models, economic models
Models and metadata (inputs) are more important than output data
Reproducible, but possibly expensive

Derived or compiled data

Examples: text and data mining, compiled database, 3D models
Reproducible, but possibly expensive

Samples and other non-digital data forms

Samples, physical collections, notebooks
All may be considered data for the purposes of presenting a data management plan

Other Data Examples

Digital Data
Software
Samples
Curricular Materials
Physical Collections

File Types

Text: e.g. ASCII, Word, PDF

Numerical: e.g. ASCII, SAS, Stata, Excel, netCDF, HDF

Database: e.g. MySQL, MS Access, Oracle

Multimedia: e.g. JPEG, TIFF, DICOM, MPEG, QuickTime

Models: e.g. 3D VRML, X3D

Software: e.g. Java, C

Domain Specific: e.g. FITS in Astronomy, CIF in Chemistry

Vendor Specific: e.g. Varian NMR data format, LeCroy digital oscilloscope format.

Where will the data be stored? - Data Storage

Personal computer
Cloud storage
Lab server
ThayerFS
Webserver
rSTor

Data Backup

Frequency - How often?

Location(s) of backups or file copies - Office, building, off-site

What kind of system or software - College backup (NetBackup), Retrospect, Online: Mozy or Carbonite

Testing procedures - Will you test the restore process to make sure backups are working correctly?
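
One way to test a restore is to restore a backup into a scratch directory and compare it against the live files. The sketch below is a minimal illustration using only the Python standard library; the function name verify_restore and both directory arguments are hypothetical, and it assumes the live files have not changed since the backup was taken.

```python
import filecmp
from pathlib import Path


def verify_restore(original_dir, restored_dir):
    """Return relative paths that are missing or differ in the restored copy."""
    original_dir = Path(original_dir)
    restored_dir = Path(restored_dir)
    problems = []
    for src in sorted(original_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(original_dir)
        restored = restored_dir / rel
        # shallow=False forces a byte-by-byte comparison instead of
        # trusting file size and modification time alone.
        if not restored.is_file() or not filecmp.cmp(src, restored, shallow=False):
            problems.append(rel.as_posix())
    return problems
```

An empty list means every file in the original directory was found, byte for byte, in the restored copy. Files that exist only in the restored copy are not reported.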

Levels of Data

What are levels of data?

Raw data -> Cleaned data -> Processed data -> Summary Level data -> Publication data
Metadata: Information about the data

How long will you keep the data?

What procedures are envisioned for long-term archiving and preservation of the data, including succession plans for the data should the expected archiving entity go out of existence?

How will you document your data?

Is there good project and data documentation?

What directory and file naming convention will be used?

Will you be using versioning controls?
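
A file naming convention is easiest to follow when names are generated rather than typed by hand. The sketch below assumes a purely illustrative pattern, project_description_YYYYMMDD_vNN.ext; the pattern, the function name build_filename, and its parameters are hypothetical and should be adapted to whatever convention the project adopts.

```python
import re
from datetime import date


def slug(text):
    """Lowercase the text and collapse runs of other characters to hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_filename(project, description, day, version, ext):
    """Build a name following the assumed project_description_date_version pattern."""
    stamp = day.strftime("%Y%m%d")   # sortable date stamp
    suffix = ext.lstrip(".").lower()  # accept "csv", ".csv" or ".CSV"
    return f"{slug(project)}_{slug(description)}_{stamp}_v{version:02d}.{suffix}"


print(build_filename("NW Palace", "Soil Moisture", date(2024, 3, 15), 2, ".CSV"))
# prints: nw-palace_soil-moisture_20240315_v02.csv
```

Putting the date in year-month-day order and zero-padding the version number keeps files in chronological order when a directory is sorted by name.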

Metadata

Title: Name of the dataset or research project that produced it

Creator: Names and addresses of the organization or people who created the data

Identifier: Number used to identify the data, even if it is just an internal project reference number

Subject: Keywords or phrases describing the subject or content of the data

Funders: Organizations or agencies who funded the research

Dates: Key dates associated with the data, including: project start and end date; release date; time period covered by the data; and other dates associated with the data lifespan, e.g., maintenance cycle, update schedule

Location: Where the data relates to a physical location, record information about its spatial coverage

Methodology: How the data was generated, including equipment or software used, experimental protocol, other things one might include in a lab notebook

Sources: Citations to material for data derived from other sources, including details of where the source data is held and how it was accessed

List of file names: List of all data files associated with the project, with their names and file extensions (e.g. 'NWPalaceTR.WRL', 'stone.mov')

File formats: Format(s) of the data, e.g. FITS, SPSS, HTML, JPEG, and any software required to read the data

File Structure: Organization of the data file(s) and the layout of the variables, when applicable

Variable List: List of variables in the data files, when applicable

Code Lists: Explanation of codes or abbreviations used in either the file names or the variables in the data files (e.g. '999 indicates a missing value in the data')

Versions: Date/time stamp for each file, with a separate ID for each version (see organizing your files)

Checksums: Values computed from each file's contents, used to test whether the file has changed over time
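
Several of the fields above (title, creator, subject, list of file names, checksums) can be recorded in a machine-readable file kept alongside the data. The sketch below is one possible approach, not a prescribed format: the JSON layout, the field names, and the functions build_metadata, write_metadata and changed_files are all assumptions made for illustration. It uses SHA-256 from the Python standard library as the checksum.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Compute a file's SHA-256 digest, reading in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_metadata(data_dir, title, creator, subject):
    """Collect descriptive fields plus each file's name, size and checksum."""
    data_dir = Path(data_dir)
    files = [
        {
            "name": path.relative_to(data_dir).as_posix(),
            "bytes": path.stat().st_size,
            "sha256": sha256_of(path),
        }
        for path in sorted(data_dir.rglob("*"))
        if path.is_file()
    ]
    return {"title": title, "creator": creator, "subject": subject, "files": files}


def write_metadata(metadata, out_path):
    """Save the record as indented JSON so it stays human-readable."""
    Path(out_path).write_text(json.dumps(metadata, indent=2), encoding="utf-8")


def changed_files(data_dir, metadata):
    """Return recorded names that are now missing or whose checksum differs."""
    data_dir = Path(data_dir)
    changed = []
    for entry in metadata["files"]:
        path = data_dir / entry["name"]
        if not path.is_file() or sha256_of(path) != entry["sha256"]:
            changed.append(entry["name"])
    return changed
```

Write the record to a location outside the data directory, or it will be listed as a data file the next time the record is rebuilt. Running changed_files periodically against the stored record shows whether any file has been altered or lost since the record was made.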

What are my options for sharing? - Data Sharing

Self-dissemination
Discipline based repositories
Institutional repositories
Websites - www.dartmouth.edu account, departmental server, hosted server space
Cloud (Amazon, Rackspace, Google, etc.)
Restricted use collections

Privacy & Security

Protected personal information: medical (HIPAA), student information (FERPA), other?
National security?
Patent related
Other confidentiality concerns
Informed consent

Other

How will the data management plan maximize the value of the data?

IMPACT: What is the possible impact of the data within the immediate field, in other fields, and any broader, societal impact?

What about transfer of people or data?
