How to publish a dataset

1. Register/log in to data.npolar.no.

Select ‘Login’ in the upper right corner.

  • NPI users: Enter your NPI login details and press ‘Login’ again.

  • Other users: Request an account by e-mail to data@npolar.no, await the response and follow the instructions.

2. Before you start:

  • Check that you have all your data files available in their final form, along with all required information about the dataset, including (if applicable):

    • Field log

    • Methods of collection, quality assurance and validation

    • Information on software use and documentation

    • Instrument description, sources of error, measurement standards etc.

  • Technical information that is too verbose or specific for the general dataset description should be entered in a readme file and uploaded together with the data files.

  • Please check the consistency of your data, i.e. that all dates, coordinates and values use one (accepted) format only, that files have adequate headers, and that appropriate formatting standards and domain vocabularies are applied. The dataset description should be available in English. A Norwegian description may optionally be provided as well.

  • It is a good idea to have a supervisor or a colleague check your dataset before publishing.
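
The date part of the consistency check above can be partly automated before upload. The following is a minimal sketch, assuming that the single accepted date format is YYYY-MM-DD. The function names and sample values are hypothetical and are not part of the data.npolar.no service.

```python
from datetime import datetime

DATE_FORMAT = "%Y-%m-%d"  # assumed single accepted format (YYYY-MM-DD)


def is_accepted_date(value):
    """Return True only if value is written exactly as YYYY-MM-DD."""
    try:
        parsed = datetime.strptime(value, DATE_FORMAT)
    except ValueError:
        return False
    # strptime tolerates missing zero padding (e.g. "2020-7-1"),
    # so compare the re-formatted date with the original text.
    return parsed.strftime(DATE_FORMAT) == value


def find_inconsistent_dates(values):
    """Return the values that do not follow the accepted format."""
    return [value for value in values if not is_accepted_date(value)]


# Hypothetical date column taken from a data file
sample = ["2020-07-01", "01.07.2020", "2020-07-03", "2020/07/04"]
print(find_inconsistent_dates(sample))
```

Any value that is returned should be corrected in the data file before it is uploaded. Coordinates and measured values can be checked in a similar way against whichever format you have chosen.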

3. Describe, upload and publish the dataset

Press ‘New dataset’ (upper right corner) to open the metadata entry form. You will be prompted to enter a dataset name (see below for naming recommendations). Enter a suitable name, press ‘Create’, and then follow the guidelines below to enter your dataset description (metadata). The guidelines are ordered by the tabs in the entry form. After completing the dataset description, you can upload your data files with ancillary information and specify their release dates. (Files can be uploaded as soon as a dataset title has been saved.)

A DOI will be reserved for the dataset as soon as a title has been entered and saved. The DOI will not be activated until the dataset has been published, but remains reserved while the dataset is in draft state and can be used for future reference, e.g. to publishers.

When the data files have been uploaded and all information entered and proofread, you can complete the process by pressing the yellow ‘Publish’ button on the dataset landing page. This will lock the dataset, make it public (except for non-released data files), and activate the DOI.

A dataset can no longer be deleted once it has been published and its DOI activated. Please do not upload dummy files to obtain a DOI, as the DOI will remain attached to the dummy files. Updated versions of the data can, however, be published later.

Guidelines for dataset documentation

  • When describing your data, take the perspective of future users, who will be asking: “What is this dataset about, and what can I use it for?”

  • All items having an “Add..” button can have additional entries appended by clicking the button. Items can be deleted or moved up and down with the handles to the right of each entry.

  • Mandatory fields are highlighted and will remain highlighted along with their tabs until completed.

General

Title

The title should be concise and descriptive, giving users an instant impression of what the data might be relevant for. The recommended format is “what has been measured, where, when”. Occasionally other elements might be useful to include, such as instrument type or project/programme name.

Good example:

“Carbon and nitrogen stable isotopes from marine biota in Svalbard 2005-2020”

Bad example:

“Stable isotopes”

This example does not provide enough descriptive information to guide the user.

Summary

The summary should give a brief description of the dataset that allows potential users to determine if it is useful for their needs. The key information is what has been measured, where, when, and for what purpose.

Please do not copy your project description or paper abstract here. Links to such documents should be added under the “Links” tab, and can be updated at a later stage if necessary. Please write out acronyms in full, and use proper capitalisation.

Supplemental information about your dataset should be added as appropriate. Where applicable, the summary should include brief statements on the following (in order of priority):

  • Details on parameters measured and instruments used

  • Explanation of parameter names and encoding (if not provided in the data)

  • Units and unit resolution

  • Methods, analytical tools

  • Data processing (gridded, binned, swath, raw, algorithms used, necessary ancillary datasets)

  • Dataset organisation (how data are organised within and across files)

  • Quality: flags, indicators or other information about the data quality or any quality control procedures

  • Similarities and differences of these data to other closely related datasets

Links to online documentation may be provided under the “Links” tab.

Keywords

Select one or more scientific keywords to identify the relevant topic(s) of research. Start typing a term and then select the appropriate keyword from the dropdown list that appears.

Keywords come from a controlled vocabulary curated by NASA's Global Change Master Directory (GCMD), so only terms from the dropdown list can be applied. The GCMD Keyword Viewer can help you find the appropriate keywords.

Geographical coverage

Use the drawing tools in the map window to outline the area(s) from which your data have been collected. Select the appropriate tool on the left to draw lines, polygons, squares or point markers. Double-click to end a line or close a polygon. Multiple areas can be added if necessary. Pan or zoom in/out as appropriate.

The controls on the upper right can be used to switch map projections. Mercator and north/south polar stereographic are available.

Time frames

Press “Add” and enter the start and end dates of the data collection period(s), either by using the pop-up calendar in the date entry window or by typing the date in YYYY-MM-DD format (e.g. 2023-01-31). Multiple time intervals may be entered, if applicable.

If the period is open-ended, select “is ongoing” and enter only the start date. If data were collected during a single day, enter the same date as both start and end date.
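
The rules for time frames can be summarised in a short sketch. It assumes dates are given as YYYY-MM-DD text; the function name `make_time_frame` and its parameters are hypothetical and only illustrate the rules, not how the entry form itself validates input.

```python
from datetime import date


def make_time_frame(start, end=None, ongoing=False):
    """Build a (start, end) pair following the time frame rules.

    - Ongoing collection: only a start date is given, end is None.
    - Single-day collection: start and end are the same date.
    """
    start_date = date.fromisoformat(start)
    if ongoing:
        if end is not None:
            raise ValueError("An ongoing period must not have an end date")
        return (start_date, None)
    if end is None:
        raise ValueError("A closed period needs an end date")
    end_date = date.fromisoformat(end)
    if end_date < start_date:
        raise ValueError("The end date must not precede the start date")
    return (start_date, end_date)


# Single-day collection: the same date is entered twice
print(make_time_frame("2023-01-31", "2023-01-31"))

# Open-ended collection: only the start date is entered
print(make_time_frame("2023-01-31", ongoing=True))
```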

Contributors

Press “Add” and enter the details for each person. Identify their roles by checking the appropriate box(es) in the dropdown list:

  • Author is an originator of the dataset

  • Editor is the person entering the metadata for the dataset

  • Point of contact is a person who can be contacted for more information about the dataset

  • Principal investigator is the person responsible for the research project under which the dataset has been collected

  • Processor is a person who has processed the data in a manner such that the resource has been modified

A valid e-mail address is required for the point(s) of contact.

All persons identified as authors will appear in the dataset citation string. The sequence of authors can be altered by using the handles on the right to move persons up or down on the list.

To edit a person's details, select the person using the checkbox and press “Edit”.

Organisation

Fill in the official name (in English) of the organisation where the person works, not the acronym. Please write Norwegian Polar Institute for NPI.

ORCID

ORCID is a persistent digital identifier for researchers; see https://orcid.org/.

Organisations

Please provide the name, principal e-mail address and webpage of the relevant institutions involved in or supporting the collection of the dataset, and identify their roles:

  • Author: Use this role only when no individuals are to be credited in dataset citations, i.e. when the institution(s) alone should be credited

  • Originator is any institution instrumental in the creation of the dataset

  • Owner is the legal owner of the dataset

  • Point of contact is the organisation which can be contacted for more information about the dataset

  • Processor is an institution where the dataset has been processed in a manner such that the resource has been modified

  • Resource provider is any institution contributing resources involved in the creation of the dataset

Harvesters

The Norwegian Polar Data Centre (NPDC) data catalogue is continuously harvested by various external data catalogues, so datasets published here will also be visible and accessible in national and international data portals. Please tick the check boxes if your dataset is relevant to the Norwegian Marine Data Centre (NMDC) and/or Svalbard Integrated Arctic Earth Observing System (SIOS) data portals. ‘NMDC’ should be selected for all marine datasets.

Upload data files

Press ‘Add’ to select files from your computer, one at a time. Each file is uploaded as soon as it has been selected.

A release date can be set individually for each file, in accordance with the NPI data policy.

Publish your dataset

The dataset, including all metadata and uploaded files, will remain in ‘draft’ state until published. While in draft state, the dataset remains invisible to public users and can still be edited freely. Uploaded files can still be deleted.

When ready to publish your dataset, press the yellow ‘Publish’ button on the top line. Read the warnings in the pop-up window and press ‘Publish’ again when done. The following happens:

  • The dataset description becomes visible and searchable on the public NPDC website.

  • Data files can no longer be deleted.

  • The pre-reserved DOI will be activated and can be used as a permanent, persistent identifier of the dataset.

  • Only limited editing of the dataset description remains possible.

Licence and citations

By default all NPDC datasets are published under the CC0 public domain dedication. All external parties will be unconditionally free to reuse the dataset for any purpose, while the dataset originators are cleared of any liability or responsibility associated with reuse. However, by standard scientific practice and ethical norms the dataset should be properly cited and author(s) formally acknowledged whenever the dataset is reused. For this purpose a standard citation string will be generated when the dataset is published.

If a more restrictive licence is required, please contact NPDC staff.

Citation string

The automatically generated citation string will appear at the top of the dataset landing page once the dataset is published. The string follows the APA style: ‘Author(s) (Publication year). Dataset title. Publisher. DOI’. Up to 7 authors are listed individually; if there are more, the style is ‘Lead author & al.’ In our case, the publisher will be ‘Norwegian Polar Institute’.
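
To preview roughly what the citation string will look like, the format can be sketched as below. This is an illustration only, not the code used by the service. The function name, the author names and the DOI are hypothetical placeholders, and the way individual authors are joined (commas, with ‘&’ before the last author) is an assumption based on common APA usage.

```python
def citation_string(authors, year, title, doi,
                    publisher="Norwegian Polar Institute"):
    """Compose 'Author(s) (Publication year). Dataset title. Publisher. DOI'."""
    if not authors:
        raise ValueError("At least one author is required")
    if len(authors) > 7:
        # More than 7 authors: only the lead author is shown
        author_part = f"{authors[0]} & al."
    elif len(authors) > 1:
        # Assumed joining style for 2 to 7 authors
        author_part = ", ".join(authors[:-1]) + f", & {authors[-1]}"
    else:
        author_part = authors[0]
    return f"{author_part} ({year}). {title}. {publisher}. {doi}"


# Hypothetical authors and placeholder DOI
print(citation_string(
    ["Hansen, A.", "Berg, B."],
    2021,
    "Carbon and nitrogen stable isotopes from marine biota in Svalbard 2005-2020",
    "https://doi.org/10.0000/example",
))
```

Because only persons with the Author role appear in the citation, the order of the list passed in corresponds to the order set under the Contributors tab.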

For further information, please see the Joint Declaration of Data Citation Principles, and the DataCite guidelines.