NBI Digital Repository Documentation

Instructions to curate and upload research output supported by NBI

Specifications & Technical Details

Licenses

Many license options exist, and a nice review is here (scroll to the Standard License section). Another repository, figshare, also has a nice summary. When applying a license, balance the desire or need to restrict reuse rights against the need to keep the work reusable far into the future. This is especially true for datasets. Data may need to be aggregated with other datasets, and if a license is too restrictive this becomes a monumental task (e.g., if the aggregated datasets all have different licenses, there may be no easy way to license the combined dataset).

We suggest two licenses for the NBI Digital Repository:

Besides the CC0 license, there are six main Creative Commons licenses, each with various restrictions, and most do not meet Open Access requirements:

Here is what some of the terms mean:

Adding derivative or commercial restrictions to a license will make the material (the dataset) less reusable and less interoperable. The Share Alike addition can also cause problems. Rathmann (2018) provides a great overview of licensing data and the paper is on Zenodo! The following quotes from a variety of sources summarize the impact of restrictions:

“This means the Share Alike and No Derivatives conditions might have further reaching consequences than intended.” (Digital Curation Centre)

“the No Derivatives condition would likely disallow most substantive types of reuse” (Digital Curation Centre)

“Similar to how a non-commercial licence might restrict meaningful reuse of your dataset, a ND [No Derivative] license can have the same effect: it may prevent someone from recombining and reusing your data for new research.” (OpenAIRE)

“…using a non-commercial licence may prevent researchers from using your data in work destined for publication. This can subsequently affect the dissemination, recognition, and impact of your dataset. And it is definitively NOT open access.” (OpenAIRE)

“…the prohibition of monetary compensation may also prevent non-governmental organisations (NGOs) from re-using the data.” (Rathmann, 2018)
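To make the restrictions discussed above concrete, here is a minimal sketch that flags which restrictive conditions a Creative Commons license carries. It assumes licenses are written as SPDX-style identifiers such as `CC-BY-NC-SA-4.0`; the function name `restrictive_conditions` is hypothetical and is not part of Zenodo or any NBI tool.

```python
# Conditions that the quotes above describe as limiting reuse.
RESTRICTIONS = {
    "NC": "Non-Commercial",
    "ND": "No Derivatives",
    "SA": "Share Alike",
}


def restrictive_conditions(license_id):
    """Return the restrictive conditions found in an SPDX-style CC identifier.

    Assumption: the identifier uses hyphen-separated elements,
    for example "CC-BY-4.0" or "CC-BY-NC-ND-4.0".
    """
    parts = license_id.upper().split("-")
    return [RESTRICTIONS[part] for part in parts if part in RESTRICTIONS]


# Example: an attribution-only license carries none of these conditions.
print(restrictive_conditions("CC-BY-4.0"))
print(restrictive_conditions("CC-BY-NC-SA-4.0"))
```

An empty result means the license adds none of the three conditions; it does not by itself confirm that a license meets a particular Open Access policy.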

There is a data-specific license available, the Open Data Commons Attribution License, but the Creative Commons licenses seem to be a better option for now.

For Creative Commons licenses, use version 4.0 because it allows a better attribution format (read more at the Digital Curation Centre).

This interactive tool will help you choose a license.

Controlled Vocabularies

These lists are either specified by Zenodo or created specifically for NBI. To add a term to an NBI controlled vocabulary (Geographic Location, Specific Location, Study Type, or Method), open the Controlled-Vocabulary Google sheet in the NBI Google Drive. You must add the term to each applicable list. For example, here are the steps to add “Lily Pond” to the Specific Location lists:

  1. In the Controlled Vocabulary sheet, add “Lily Pond” to the bottom of Specific Location1.
  2. Highlight the Specific Location1 terms and sort them alphabetically.
  3. Copy and paste that full list into each of the other five Specific Location columns.
  4. Add any other terms to other columns as needed.
  5. Scroll to the right and click the button that says “Update Form”. (This can take up to a minute, so be patient.)

The last step puts all your updates into the Metadata Form dropdowns.
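The list edits in steps 1 to 3 can be sketched in code as a way to see what the sheet should look like afterward. This is only an illustration, not a script that talks to Google Sheets: the function name `add_term_to_columns`, the column names other than Specific Location1, and the sample terms “Beaver Pond” and “Upper Meadow” are all assumptions made up for the example.

```python
def add_term_to_columns(columns, term, first="Specific Location1"):
    """Mimic steps 1-3: append a term, sort, and copy the list to every column.

    `columns` maps a column name to its list of terms. Column names other
    than "Specific Location1" are assumed for this sketch.
    """
    # Step 1: add the term to the bottom of the first column (skip duplicates).
    terms = list(columns[first])
    if term not in terms:
        terms.append(term)
    # Step 2: sort alphabetically, ignoring upper/lower case.
    terms.sort(key=str.casefold)
    # Step 3: copy the full sorted list into each Specific Location column.
    return {name: list(terms) for name in columns}


# Six columns with made-up placeholder terms.
columns = {
    f"Specific Location{i}": ["Beaver Pond", "Upper Meadow"] for i in range(1, 7)
}
updated = add_term_to_columns(columns, "Lily Pond")
print(updated["Specific Location6"])
```

The sketch stops before step 5. Pushing the terms into the Metadata Form dropdowns still requires the “Update Form” button in the sheet.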

Study Type

Group

Geographic Location

Specific Locations

This is a very long list and is available here.

Methods

This is also a very long list and is available here.

Contributor Relationships

Google Drive Organization

The following diagram shows the folder structure used to store research outputs. Files are stored by category (Ready, Uploaded, Embargoed) and then by year (the year to upload or the year uploaded). The Metadata folder contains the files used to collect metadata from the NBI Wranglers.

A diagram of the Google Drive folder structure
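As a rough illustration of the category-then-year layout described above, the sketch below builds the folder path for a research output. The root folder name `NBI Digital Repository` and the function name `output_folder` are assumptions for the example; check the actual Drive for the real top-level folder name and for where the Metadata folder sits.

```python
from pathlib import PurePosixPath

# Categories named in the text above.
CATEGORIES = ("Ready", "Uploaded", "Embargoed")


def output_folder(category, year, root="NBI Digital Repository"):
    """Return the folder path <root>/<category>/<year> for a research output.

    `root` is a hypothetical top-level folder name used only in this sketch.
    """
    if category not in CATEGORIES:
        raise ValueError(f"Unknown category: {category!r}")
    return PurePosixPath(root) / category / str(year)


print(output_folder("Ready", 2024))
```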

Citations