How to publish a dataset
1. Register/log in to data.npolar.no.
Select ‘Login’ in the upper right corner.
NPI users: Enter your NPI login details and press ‘Login’ again.
Other users: Request an account by e-mail to data@npolar.no, await the response and follow the instructions.
2. Before you start:
Check that you have all your data files available in their final form, along with all required information about the dataset, including (if applicable):
Field log
Methods of collection, quality assurance and validation
Information on software use and documentation
Instrument description, sources of error, measurement standards etc.
Technical information that is too verbose or specific for the general dataset description should be entered in a readme file and uploaded with the data files.
Please check the consistency of your data, i.e. that all dates, coordinates and values are in one (accepted) format only, files have adequate headers, appropriate formatting standards and domain vocabularies are applied, etc. The dataset description should be available in English. Optionally, a Norwegian description is acceptable.
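The date part of this consistency check can be sketched in a few lines of Python. This is only an illustration, not an NPDC tool: the list of candidate formats and the function names are assumptions and should be adapted to your own data.

```python
from datetime import datetime

# Hypothetical set of candidate formats; extend it to match your own data.
CANDIDATE_FORMATS = {
    "ISO (YYYY-MM-DD)": "%Y-%m-%d",
    "DD.MM.YYYY": "%d.%m.%Y",
    "MM/DD/YYYY": "%m/%d/%Y",
}

def detect_date_format(value):
    """Return the name of the first candidate format that parses value, or None."""
    for name, pattern in CANDIDATE_FORMATS.items():
        try:
            datetime.strptime(value.strip(), pattern)
            return name
        except ValueError:
            continue
    return None

def date_formats_used(values):
    """Return the set of format names found in values (None marks unparseable entries)."""
    return {detect_date_format(v) for v in values}

# Example column with one entry that deviates from the others.
dates = ["2020-07-01", "2020-07-02", "03.07.2020"]
formats = date_formats_used(dates)
if len(formats) > 1:
    print("Mixed date formats found:", formats)
```

If the set returned by date_formats_used contains more than one entry, the column mixes formats (or contains unparseable values) and should be cleaned before upload.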
It is a good idea to have a supervisor or a colleague check your dataset before publishing.
3. Describe, upload and publish the dataset
Press ‘New dataset’ (upper right corner) to open the metadata entry form. You will be prompted to enter a dataset name (see below for naming recommendations). Enter a suitable name, press ‘Create’, and then follow the guidelines below to enter your dataset description (metadata). The guidelines are ordered by the tabs in the entry form. After completing the dataset description, you can upload your data files with ancillary information and specify their release dates. (Files can be uploaded as soon as a dataset title has been saved.)
A DOI will be reserved for the dataset as soon as a title has been entered and saved. The DOI will not be activated until the dataset has been published, but remains reserved while the dataset is in draft state and can be used for future reference, e.g. to publishers.
When the data files have been uploaded and all information entered and proofread, you can complete the process by pressing the yellow ‘Publish’ button on the dataset landing page. This will lock the dataset, make it public (except for non-released data files), and activate the DOI.
A dataset can no longer be deleted once it has been published and assigned a DOI. Please do not upload dummy files to obtain a DOI, as the DOI will remain permanently attached to the dummy files. However, updated data versions can be published.
Guidelines for dataset documentation
When describing your data you should take the perspective of future users, who will be asking: “What is this dataset about and what can I use it for?”
All items having an “Add..” button can have additional entries appended by clicking the button. Items can be deleted or moved up and down with the handles to the right of each entry.
Mandatory fields are highlighted and will remain highlighted along with their tabs until completed.
General
Title
The title should be concise and descriptive, giving users an instant impression of what the data might be relevant for. The recommended format is “what has been measured, where, when”. Occasionally other elements might be useful to include, such as instrument type or project/programme name.
Good example:
“Carbon and nitrogen stable isotopes from marine biota in Svalbard 2005-2020”
Bad example:
“Stable isotopes”
This example does not provide enough descriptive information to guide the user.
Summary
The summary should give a brief description of the dataset that allows potential users to determine if it is useful for their needs. The key information is what has been measured, where, when, and for what purpose.
Please do not copy your project description or paper abstract here. Links to such documents should be added under the “Links” tab, and can be updated at a later stage if necessary. Please write out acronyms in full, and use proper capitalisation.
Supplemental information about your dataset should be added as appropriate. Where applicable, the summary should include brief statements on the following information (in order of priority):
Details on parameters measured and instruments used
Explanation of parameter names and encoding (if not provided in the data)
Units and unit resolution
Methods, analytical tools
Data processing (gridded, binned, swath, raw, algorithms used, necessary ancillary datasets)
Dataset organization (how data are organized within and by files)
Quality: flags, indicators or other information about the data quality or any quality control procedures
Similarities and differences of these data to other closely-related datasets
Links to online documentation may be provided under the “Links” tab.
Keywords
Select one or more scientific keywords to properly identify the relevant topic(s) of research. Start typing a term and then select the appropriate keywords from the dropdown list that appears. Multiple keywords can be applied.
Keywords are selected from a controlled vocabulary curated by NASA's Global Change Master Directory (GCMD). Thus, only terms from the dropdown list can be applied. The GCMD Keyword Viewer can be helpful when looking for the appropriate keywords.
Geographical coverage
Use the drawing tools in the map window to outline the area(s) from which your data have been collected. Select the appropriate tool on the left to draw lines, polygons, squares or point markers. Double-click to end a line or close a polygon. Multiple areas can be added if necessary. Pan or zoom in/out as appropriate.
The controls on the upper right can be used to switch map projections. Mercator and north/south polar stereographic are available.
Time frames
Press “Add” and enter the start and end dates of the data collection period(s) here, using the pop-up calendar in the date entry window or by typing the dates in YYYY-MM-DD format (e.g. 2023-01-31). Multiple time intervals may be entered, if applicable.
If the period is open-ended, select “is ongoing” and enter only the start date. If data were collected during a single day, please choose the same date twice.
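If you prepare your time frames in a script before typing them in, the rules above can be checked with a minimal Python sketch like the one below. The function names are hypothetical and the code does not talk to the entry form; it only mirrors the rules stated here (YYYY-MM-DD format, start date only for ongoing periods, same date twice for a single day).

```python
import re
from datetime import date

# Strict YYYY-MM-DD pattern: four digits, two digits, two digits.
ISO_DAY = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")

def parse_day(text):
    """Parse a strict YYYY-MM-DD string into a date, or raise ValueError."""
    if not ISO_DAY.fullmatch(text):
        raise ValueError(f"expected YYYY-MM-DD, got {text!r}")
    return date.fromisoformat(text)  # also rejects impossible dates such as 2023-02-30

def make_time_frame(start, end=None, ongoing=False):
    """Return (start, end) as dates; end is None for an ongoing period."""
    start_date = parse_day(start)
    if ongoing:
        if end is not None:
            raise ValueError("an ongoing period takes a start date only")
        return (start_date, None)
    if end is None:
        raise ValueError("a closed period needs an end date")
    end_date = parse_day(end)
    if end_date < start_date:
        raise ValueError("end date precedes start date")
    return (start_date, end_date)

print(make_time_frame("2023-01-31", "2023-01-31"))  # data collected on a single day
print(make_time_frame("2005-06-01", ongoing=True))  # open-ended collection period
```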
Contributors
Press “Add” and enter the details for each person. Identify their roles by checking the appropriate box(es) in the dropdown list:
Author is an originator of the dataset
Editor is the person entering the metadata for the dataset
Point of contact is a person who can be contacted for more information about the dataset
Principal investigator is the person responsible for the research project under which the dataset has been collected
Processor is a person who has processed the data in a manner such that the resource has been modified.
A valid e-mail address is required for the point(s) of contact.
All persons identified as authors will appear in the dataset citation string. The sequence of authors can be altered by using the handles on the right to move persons up or down on the list.
To edit personal details, select the person by ticking the checkbox and press “Edit”.
Organisation
Fill in the official name (in English) of the organisation where the person works, not the acronym. Please write ‘Norwegian Polar Institute’ for NPI.
ORCID
The ORCID is a persistent digital identifier for researchers; see https://orcid.org/
Organisations
Please provide the name, principal e-mail address and webpage of the relevant institutions involved in or supporting the collection of the dataset, and identify their roles:
Author: Use this role only when no individuals are to be credited in dataset citations, i.e. when the institution(s) alone should be credited
Originator is any institution instrumental in the creation of the dataset
Owner is the legal owner of the dataset
Point of contact is the organisation which can be contacted for more information about the dataset
Processor is an institution where the dataset has been processed in a manner such that the resource has been modified
Resource provider is any institution contributing resources involved in the creation of the dataset.
Links
Under this tab you can add links to any external resource related to the dataset, such as project web pages, online documentation, publications, etc.
Press ‘Add’ and add the URL (incl. the protocol, e.g. “https://”) and a suitable title, and select the appropriate relation. The following relations are available:
Parent can be used if the current dataset is a subset or “offspring” of a larger dataset
Project links to a related project, typically the project under which the dataset has been collected
Publication can link to associated publications that use or describe the dataset
Related can be used to link any other online resource that is somehow related to the dataset
Service should be used for an API or other digital service where the dataset is accessible
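When many links are to be added, it can help to check the entries in advance. The Python sketch below is an illustration only: the function name and the lowercase spelling of the relation names are assumptions, and the form itself may apply different or additional checks.

```python
from urllib.parse import urlparse

# Relation names as listed above; the lowercase spelling is an assumption.
RELATIONS = {"parent", "project", "publication", "related", "service"}

def check_link(url, title, relation):
    """Return a list of problems found in one link entry (empty if none)."""
    problems = []
    parsed = urlparse(url)
    # A URL typed without a protocol has no scheme and no network location.
    if not parsed.scheme or not parsed.netloc:
        problems.append("URL must include the protocol, e.g. https://")
    if not title.strip():
        problems.append("title is missing")
    if relation.lower() not in RELATIONS:
        problems.append(f"unknown relation: {relation}")
    return problems

print(check_link("https://www.npolar.no/en/", "Norwegian Polar Institute", "Related"))
print(check_link("www.npolar.no", "Norwegian Polar Institute", "Related"))
```

An empty list means the entry passed all three checks; otherwise each string names one problem to fix before entering the link.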
Harvesters
The NPDC data catalogue is continuously harvested by various external data catalogues, which means that datasets published here will also be visible and accessible in national and international data portals. Please tick the check boxes if your dataset is relevant to the NMDC and/or SIOS data portals. ‘NMDC’ should be selected for ALL marine datasets.
Upload data files
Press ‘Add’ to select files from your computer (one by one). The files will be uploaded instantly when selected.
Data release date can be selected individually per file, in accordance with the NPI data policy.
Publish your dataset
The dataset, including all metadata and uploaded files, will remain in ‘draft’ state until published. While in draft state, the dataset remains invisible to public users and can still be edited freely. Uploaded files can still be deleted.
When ready to publish your dataset, press the yellow ‘Publish’ button on the top line. Read the warnings in the pop-up window and press ‘Publish’ again when done. The following happens:
The dataset description becomes visible and searchable on the public NPDC website.
Data files can no longer be deleted.
The pre-reserved DOI will be activated and can be used as a permanent, persistent identifier of the dataset.
Only limited editing of the dataset description is permissible.
Licence and citations
By default all NPDC datasets are published under the CC0 public domain dedication. All external parties will be unconditionally free to reuse the dataset for any purpose, while the dataset originators are cleared of any liability or responsibility associated with reuse. However, by standard scientific practice and ethical norms the dataset should be properly cited and author(s) formally acknowledged whenever the dataset is reused. For this purpose a standard citation string will be generated when the dataset is published.
If a more restrictive licence is required, please contact NPDC staff.
Citation string
The automatically generated citation string will appear at the top of the dataset landing page once the dataset is published. The string will be in the APA style: ‘Author(s) (Publication year). Dataset title. Publisher. DOI’. Up to 7 authors will be shown individually; if there are more, the style will be ‘Lead author & al.’ In our case, the publisher will be ‘Norwegian Polar Institute’.
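To preview roughly what the citation will look like, the pattern above can be reproduced with a short Python sketch. This is not the code used by the data centre: the function name, the comma used to separate authors, and the author names and DOI in the example are all hypothetical placeholders.

```python
def citation_string(authors, year, title, doi,
                    publisher="Norwegian Polar Institute", max_authors=7):
    """Build 'Author(s) (Publication year). Dataset title. Publisher. DOI'."""
    if not authors:
        raise ValueError("at least one author is required")
    if len(authors) > max_authors:
        # More than max_authors: show the lead author only.
        author_part = f"{authors[0]} & al."
    else:
        # Assumption: authors are separated by commas.
        author_part = ", ".join(authors)
    return f"{author_part} ({year}). {title}. {publisher}. {doi}"

# Hypothetical authors and placeholder DOI, for illustration only.
print(citation_string(
    ["Hansen, A.", "Olsen, B."],
    2021,
    "Carbon and nitrogen stable isotopes from marine biota in Svalbard 2005-2020",
    "https://doi.org/10.xxxx/placeholder",
))
```

The order of the authors list corresponds to the sequence set under the Contributors tab, so it is worth checking that sequence before publishing.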
For further information, please see the Joint Declaration of Data Citation Principles, and the DataCite guidelines.