Policies and Guidelines

Dataverse Deposit Guidelines

Purpose

The purpose of this document is to provide researchers and data creators with best practices for depositing data for preservation and storage into the MacEwan University Dataverse. For questions or assistance with Dataverse, please contact Library Data Services at data@macewan.ca.

Introduction

Dataverse is an open-source, web-based data repository tool developed by the Institute for Quantitative Social Science (IQSS) at Harvard University. Dataverse is used by institutions all over the world, and the Canadian instance is hosted by Scholars Portal. Dataverse is used to manage, store, share, preserve, explore, and analyse research data.

MacEwan University manages its own Dataverse within the Scholars Portal Dataverse, and MacEwan researchers can create and manage their own Dataverses within the MacEwan Dataverse. Researchers can also deposit individual Datasets into the MacEwan Dataverse. Dataverse accepts a wide variety of file types, including tabular data, text, and images.

A Dataset is a container for an individual data set. Every Dataset has an associated metadata record that is created when the Dataset is deposited.

Dataverse has many benefits including:

  • Provides a venue for secure data management.
  • Allows researchers to share Datasets with collaborators or to make their data publicly available.
  • Provides versioning of Datasets.
  • Provides long-term, stable access to Datasets.
  • Increases research visibility.
  • Enables researchers to meet grant or publishing requirements to deposit data related to a publication or research project.

Expectations for Depositors

Data deposited into Dataverse must support the teaching and scholarly activity of MacEwan University. Data must be generated through the course of a scholarly pursuit, teaching activity, or research project and/or deposited with the expectation of future use for those purposes. The depositor must be the creator of the data or have permission from the creator to deposit data into Dataverse.

Many files may be created over the course of a research project. An important step for Depositors in preparing their data for deposit is the selection or appraisal of which files should be deposited and preserved. In making these decisions, Depositors should provide enough supporting documentation for other researchers to understand how data were created, reproduce methods and findings, and reuse the data files.

When deciding what documentation to include with your data, consider what someone (or your future self) would need to know to understand, evaluate, analyze, or replicate your data without having to ask you.

When possible, Depositors are strongly encouraged to include a version of their raw data, in addition to any processed data used in published analyses and figures.

Dataverse is intended for final data resulting from a project and is not intended for in-progress research data. Data can remain unpublished in Dataverse; however, there is an expectation that it will eventually be published.

Dataverse does not encrypt data and is not intended as a solution to store confidential or sensitive data. Access to files can be restricted; however, complete data security cannot be guaranteed. If raw data contain any confidential or sensitive information, Depositors should follow best practices for de-identification and ensure they have permission to share before depositing. Alternatively, a metadata record for a sensitive Dataset can be deposited to Dataverse with a note to contact the Depositor for further access.

Once a Dataset has been published, it will automatically be assigned a DOI, and the expectation is that it will remain published and accessible. Under extraordinary circumstances, Depositors can request the removal of a Dataset from Dataverse. The MacEwan University Library will consider such requests and make a determination regarding the removal of a Dataset. If a Dataset is linked to a publication or if removing a Dataset would violate the terms of a grant or publishing agreement, the Dataset will not be deleted. However, an explanatory note may be appended to the Dataset. If a Dataset is deleted, a landing page with basic citation metadata will remain accessible to the public through the DOI assigned to the Dataset. It is not possible to delete this metadata record.

For more information about using Dataverse, consult the Scholars Portal Dataverse Terms of Use: https://learn.scholarsportal.info/all-guides/dataverse/terms-of-use/

Standards for Deposit

Dataverse limits individual file uploads to a maximum size of 3 GB. Before depositing in Dataverse, Depositors must ensure their Dataset(s) meet the following standards.

1. Use consistent and comprehensible file names and file structures.

Following proper file naming conventions makes it easier to navigate and find specific files, and allows other researchers to understand and reuse your Dataset.

  • Name files consistently. Create a system for naming files and stick to it.
  • Keep file names short (25 characters or fewer) but meaningful.
  • Do not use spaces to delimit words. Use capital letters, hyphens, or underscores.
  • Do not use non-alphanumeric characters other than hyphens and underscores.
  • Denote dates using the ISO 8601 standard, YYYY-MM-DD (e.g. 2019-01-10).
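
The naming rules above can be checked before deposit with a short script. The following is a minimal sketch, not an official tool: the function name check_file_name is hypothetical, and it assumes that the 25-character limit applies to the name without its extension and that hyphens and underscores are the only permitted non-alphanumeric characters. The date check is a simple heuristic that only catches day-first or month-first dates such as 10-01-2019.

```python
import re
from pathlib import PurePath

# Assumption: the 25-character limit excludes the file extension.
MAX_STEM_LENGTH = 25
# Assumption: letters, digits, hyphens, and underscores are the allowed characters.
ALLOWED_STEM = re.compile(r"[A-Za-z0-9_-]+")
# Heuristic: dates that end with a four-digit year, e.g. 10-01-2019.
NON_ISO_DATE = re.compile(r"(?<!\d)\d{1,2}[-_]\d{1,2}[-_]\d{4}(?!\d)")


def check_file_name(name: str) -> list[str]:
    """Return a list of guideline problems found in a single file name."""
    stem = PurePath(name).stem  # file name without its final extension
    problems = []
    if len(stem) > MAX_STEM_LENGTH:
        problems.append("name is longer than 25 characters")
    if " " in stem:
        problems.append("name contains spaces")
    elif not ALLOWED_STEM.fullmatch(stem):
        problems.append(
            "name contains characters other than letters, digits, "
            "hyphens, or underscores"
        )
    if NON_ISO_DATE.search(stem):
        problems.append("date is not in YYYY-MM-DD order")
    return problems


if __name__ == "__main__":
    for candidate in ["survey_2019-01-10.csv", "final data 10-01-2019.csv"]:
        print(candidate, check_file_name(candidate))
```

A name that passes returns an empty list. The script cannot judge whether a name is meaningful or consistent with the rest of the Dataset, so it supplements, rather than replaces, a manual review.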

A well-structured file hierarchy will make it easier to locate and share your files. Recommended practices include:

  • Restricting the level of folders to a maximum of four deep
  • Limiting the number of folders within each folder to ten
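
The two folder recommendations above can also be checked automatically. The sketch below is illustrative only: check_folder_structure is a hypothetical helper, and it assumes that the top-level Dataset folder itself does not count as one of the four levels.

```python
import os

MAX_DEPTH = 4        # assumption: folder levels counted below the Dataset root
MAX_SUBFOLDERS = 10  # folders directly inside any one folder


def check_folder_structure(root: str) -> list[str]:
    """Walk the tree under root and report folders that exceed either limit."""
    root = os.path.abspath(root)
    problems = []
    for current, subfolders, _files in os.walk(root):
        relative = os.path.relpath(current, root)
        # The root itself is depth 0; each nested folder adds one level.
        depth = 0 if relative == "." else len(relative.split(os.sep))
        if depth > MAX_DEPTH:
            problems.append(f"{relative}: nested {depth} levels deep")
        if len(subfolders) > MAX_SUBFOLDERS:
            problems.append(f"{relative}: contains {len(subfolders)} folders")
    return problems


if __name__ == "__main__":
    for problem in check_folder_structure("."):
        print(problem)
```

Run from the top-level folder of a Dataset, the script prints one line per folder that exceeds a limit and prints nothing when the structure is within both limits.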

2. Deposit your files in preferred file formats to support preservation and reuse.

The use of preferred file formats is important to support the long-term preservation of your research data. Consult the following resources for a non-exhaustive list of preferred file formats.

If appropriate, files may be deposited in their original file format, in addition to a preferred format. If you have any questions about preferred file formats for your research data, contact Library Data Services at data@macewan.ca.

3. Describe your Dataset with rich metadata to facilitate discovery.

Depositors must complete all required fields in the descriptive metadata. Depositors are strongly encouraged to complete geospatial metadata fields and subject-specific metadata fields, as appropriate. Consult the following resource for guidance on Dataverse metadata fields.

MacEwan University Library data services staff may suggest changes to the descriptive metadata for the purposes of discovery, reuse, and preservation.

4. Include a ReadMe file to support correct interpretation and reuse of your Dataset.

Research data require sufficient documentation to be read and interpreted correctly. All deposited Datasets must include a “ReadMe” file that includes the following information:

  • Details about Dataset creation
  • Description of files contained in the Dataset
  • Information about Dataset completeness
  • Limitations on reuse

ReadMe files must conform to the following:

  • ReadMe files must be saved as a Unicode UTF-8 plain text file (.txt). Alternatively, ReadMe files may be saved in PDF/A format.
  • ReadMe files should use forced numbering in the filename (e.g. 00_ReadMe.txt) so that they appear at the top of the file overview.
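
The plain-text requirements above can be verified with a short script. This is a rough sketch under stated assumptions: check_readme is a hypothetical helper, the default file name is taken from the example above, and the file overview is assumed to sort files by name without regard to case. The script does not validate PDF/A files.

```python
from pathlib import Path


def check_readme(dataset_dir: str, readme_name: str = "00_ReadMe.txt") -> list[str]:
    """Report problems with the ReadMe file in a Dataset folder."""
    folder = Path(dataset_dir)
    readme = folder / readme_name
    if not readme.is_file():
        return [f"{readme_name} not found"]
    problems = []
    if readme.suffix.lower() == ".txt":
        try:
            # Decoding fails if the file is not valid UTF-8.
            readme.read_bytes().decode("utf-8")
        except UnicodeDecodeError:
            problems.append("ReadMe is not valid UTF-8")
    # Assumption: files are listed in case-insensitive name order.
    names = [p.name for p in folder.iterdir() if p.is_file()]
    first = min(names, key=str.lower)
    if first != readme_name:
        problems.append(f"{first} sorts before {readme_name}")
    return problems


if __name__ == "__main__":
    print(check_readme("."))
```

An empty list means the ReadMe file was found, decodes as UTF-8, and sorts ahead of the other files in the same folder.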

Consult the following resource for a basic ReadMe file template:

Expectations of MacEwan University Library

MacEwan University Library data services staff will review all deposited Datasets for alignment with the Deposit Guidelines. In the case of non-compliance, the Depositor will be asked to make necessary modifications. Once your submission is approved, you will be notified, and the Dataset will be published. All published Datasets will receive a Digital Object Identifier (DOI) to allow your Dataset to be cited.

The MacEwan University Library commits to preserving published Datasets for the life of the repository. Beyond that commitment, our objective remains continued access to, and preservation of, deposited Datasets for the long term. To support this objective, the MacEwan University Library reserves the right to convert deposited files to any medium or format and make multiple copies for the purposes of security, backup, and preservation. The MacEwan University Library will never modify file contents and will only make changes to file formats in the interest of long-term access and reuse.

Please note, the MacEwan University Library does not attempt to judge the scholarly quality of deposited Datasets and trusts the judgement and research expertise of those who created and deposited the Dataset. Thus, a determination of a Dataset’s research quality is at the sole discretion of the Contact Person as named in descriptive metadata.

The MacEwan University Library reserves the right to remove any submission at any time for any reason, including, but not limited to, upon receipt of claims of copyright infringement or privacy or confidentiality breach.

References

The information above has been adapted from several sources, including:

Authorship and Approval History

2021-04-20 Drafted by Tara Stieglitz
2021-08-31 Approved by Library Council