Ecotox (contaminant), biomarkers databases
The Ecotox database is renamed the contaminant database in NPI's new data storage.
This repository contains the following databases:
Contact person(s): Env Data Section: siri.uldal@npolar.no, Ecotox biology: igor.eulaers@npolar.no
1. Contribution
1.1 Use standardized Excel schemas
Using standardized Excel schemas for data storage saves NPI time and money. Please ask if you need help or a schema modification. To get the schemas from GitLab, choose the “LDAP” pane on the login page and log in with the username “firstname.lastname” and your usual NPI password.
NB! Plankton data goes into the plankton database, contact biology: anette.wold@npolar.no.
Biology fieldwork: https://gitlab.npolar.no/npdc/other/npdc-docs-public-repo/templates/fieldworkbiology/-/tree/main?ref_type=heads
Geoscience fieldwork: https://gitlab.npolar.no/npdc/other/npdc-docs-public-repo/templates/fieldworkgeoscience
Contaminant (prev. ecotox) lab: https://gitlab.npolar.no/npdc/other/npdc-docs-public-repo/templates/contaminant
Previous storage: See Teams, the NILU-NPI data pipeline (Nd) - under General - Files - Templates - Transfer template
Biomarker lab: https://gitlab.npolar.no/npdc/other/npdc-docs-public-repo/templates/biomarker The biomarker database holds the remains of earlier lab work and currently uses an old schema. Its future is uncertain.
Stable isotopes lab: https://gitlab.npolar.no/npdc/other/npdc-docs-public-repo/templates/stabiso Note: For the stable isotope schema you need to enable the macros that Microsoft blocks by default. See https://support.microsoft.com/nb-no/topic/en-potensielt-farlig-makro-er-blokkert-0952faa0-37e7-4316-b61d-5b5ed6024216 (in Norwegian).
Fatty acids lab: https://gitlab.npolar.no/npdc/other/npdc-docs-public-repo/templates/fattyacid Under development.
1.2 Use standardized IDs
Good ID naming helps when samples are stored in the freezer, and also when the same sample is used for several analyses or lab procedures.
For birds, use the metal ring number if applicable. If not, use the following ID: date + species code (the first 2 letters of the Latin genus name followed by the first 2 letters of the Latin species name) + three-letter matrix name. For example, 230628-URLO-EGG-02 for a Uria lomvia egg sample. This ID currently suffers from the lack of matrix standardization. GBIF is working on standards, but until then, please contact the technical responsible at the Environmental Data Section.
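The ID pattern above can be sketched as a small helper. This is only an illustration, not an NPI tool: the function name build_sample_id is hypothetical, and the YYMMDD date format and the two-digit running number are assumptions read off the example 230628-URLO-EGG-02. Matrix codes are not standardized yet, so the sketch simply truncates the matrix name to three letters.

```python
from datetime import date


def build_sample_id(sampling_date: date, scientific_name: str, matrix: str, number: int) -> str:
    """Build an ID such as 230628-URLO-EGG-02 (hypothetical helper)."""
    genus, species = scientific_name.split()[:2]
    # Two letters from the genus and two from the species, upper case.
    taxon_code = (genus[:2] + species[:2]).upper()
    # Three-letter matrix code; not standardized, so this just truncates (assumption).
    matrix_code = matrix[:3].upper()
    # Date as YYMMDD and a zero-padded running number (both assumed from the example).
    return f"{sampling_date:%y%m%d}-{taxon_code}-{matrix_code}-{number:02d}"


print(build_sample_id(date(2023, 6, 28), "Uria lomvia", "egg", 2))
```

Keeping the date first means IDs sort chronologically within a decade, which may help when browsing freezer inventories.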
Polar bears have their own IDs taken from the polar bear database.
1.3 Put your data on a backed up disk
You will need access to the disk area \\NPDATA\PROJECT\Ecotox\. Ask ruben.dens@npolar.no or mikhail.itkin@npolar.no.
\\NPDATA\PROJECT\Ecotox\RAW
new quality-checked research data from field trip(s), and lab data.
\\NPDATA\PROJECT\Ecotox\RAW
new lab data from old biology field samples.
All older samples must include the database eventID, so that your new lab data are linked to the correct samples.
Write what you have done in a README file. Mark all new results with a yellow marker.
\\NPDATA\PROJECT\Ecotox\RAW\Corrected_older_research\
old research data from other researchers. Use directory naming as described below, or alternatively a title that is easy to understand, such as species-your-name, e.g. “arctic_fox_old_data_Heli_Routti”.
Write what you have done in the README file. Mark all updated results with a yellow marker.
For older samples you must include the database eventID, and for older lab results the OccurrenceID; otherwise it will not be possible to include your updates.
Remember to include the eventIDs of all deleted duplicates; otherwise it will not be possible to find the rows to delete. Only new entries should be without eventID and OccurrenceID.
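Before delivering corrected data, it can help to list the rows that lack these IDs and confirm that each one really is a new entry. The following is a minimal sketch under several assumptions: the sheet has been exported to CSV, the column headers are spelled eventID and occurrenceID (check the actual schema), and the function name rows_missing_ids and the sample values are made up for illustration.

```python
import csv
import io


def rows_missing_ids(csv_text, id_columns=("eventID", "occurrenceID")):
    """Return (row_number, missing_columns) for rows lacking any of the IDs."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    # Row 1 is the header, so data rows start at 2, as in a spreadsheet.
    for row_number, row in enumerate(reader, start=2):
        missing = [col for col in id_columns if not (row.get(col) or "").strip()]
        if missing:
            problems.append((row_number, missing))
    return problems


# Made-up sample data: row 3 has no IDs at all, row 4 lacks the occurrenceID.
sample = (
    "eventID,occurrenceID,analyte\n"
    "EVENT-1,OCC-1,PCB-153\n"
    ",,PCB-153\n"
    "EVENT-2,,PCB-153\n"
)
for row_number, missing in rows_missing_ids(sample):
    print(f"row {row_number}: missing {', '.join(missing)}")
```

Rows reported here are fine if they are new entries; anything else needs its IDs filled in before delivery.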
\\NPDATA\PROJECT\Ecotox\PROCESSED\ECOTOX_2018\
Unsorted old Excel sheets from a previous study, which should be in the database but might not be.
\\NPDATA\PROJECT\Ecotox\PROCESSED\ECOTOX_2018\Data_ecotox_WORK\innleste filer_2018\ferdig
The original data files for most of the research transferred to the ecotox database.
\\NPDATA\PROJECT\Ecotox\PROCESSED\ECOTOX_2023\
Files read and processed in 2023. RAW contains the Excel files received from the researchers.
\\NPDATA\PROJECT\Ecotox\WORKSPACE\Igor Eulars\
contains an ordered collection of the RAW Excel files.
\\NPDATA\PROJECT\Ecotox\WORKSPACE\READ-TO-DATABASE\
contains the files ready to be read into the database.
1.4 Directory name and structure
The directory name should be XXXXX-project name, where XXXXX is the NPI project number. If this is not feasible, you may use your own name as the directory name.
Analyte names should be standardized according to the list found at https://data.npolar.no/dataset/b31ace10-2eae-4ec1-be1d-f15176c18c27
1.5 README file
If information is not in the data file(s), please include a README file with:
1.6 Collaboration with NILU
NILU and NPI share a common schema; see the contaminant (ecotox) lab link above. Check with igor.eulaers@npolar.no.
There is also a contract attachment between NILU and NPI.
1.7 Other lab data
For people delivering to the marine/plankton database, NPI has a collaboration with IOPAN. At the moment, NPI does not collaborate with other labs on data format or storage.
2. Getting data back
The data is behind a login. This means that you cannot access the database through the browser; instead, use tools like curl, Python or R. The login is the same as for all data accessed from https://data.npolar.no.
Get the data, using https (Curl and Rlang descriptions below)
Getting lab data from NILU
2.1 Using Kibana to get data (obsolete)
NPI has a version of the Kibana visualization software set up with the ecotox database, available only through NPI’s network:
NOTE! The server uses the http protocol, not the secure https. Some browsers, for example Firefox, may have problems showing pages served over http without additional configuration. Edge or Chrome usually shows a warning asking whether you want to continue loading over an insecure connection. Say yes, as it is an NPI web page.
This server can be used to search the ecotox database for data. Go to the “hamburger” menu in the upper left corner and select Analytics - Discover.
What to do next depends on your question. If it is of the type “give me all data managed by researcher x”, “give me all data from Bjørnøya between 2001-2004” or “give me all data on Larus hyperboreus from Jan Mayen”, choose biology-fielddata on the left side. To get data back you also need to set the time frame in the upper right corner to cover years back rather than the last 15 minutes. Choose Absolute and set the calendar dates.
You should now see that the number of Available fields on the left side has changed from zero to about 41. Choose the fields that apply to your query, for example “locality”. If you want the locality to be “Bjørnøya”, use the search field in the upper right corner: write locality.keyword : "Bjørnøya" and press the blue refresh button on the upper right side. Any matching entries should now turn up. You can filter on other fields in the same way.
If your search goes the other way, like “give me all data on PCBs”, choose the biology-fielddata-ecotox database on the upper left side. Set the dates and pick the fields as described above to see what the database contains.
2.2 Get data back by using curl/https
From a Windows PC, download and install Curl.
Fieldwork database example:
curl "https://v2-api.npolar.no/biology/fielddata/_search?and=scientificName:Larus+hyperboreus&and=dynamicProperties.matrix:plasma&verbose=true&page=.." -u your.email.address@npolar.no > fieldworkdata
Ecotox database example:
curl "https://v2-api.npolar.no/biology/fielddata/_all_/ecotox/_search?and=scientificName:Larus+hyperboreus&and=dynamicProperties.matrix:plasma&verbose=true&page=.." -u your.email.address@npolar.no > ecotoxdata
Remember to replace your.email.address@npolar.no with your own npolar.no email address.
You now have two downloaded JSON files. They can be converted into Excel by following the description for Windows.
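The same requests can be made from Python. The sketch below uses only the standard library and assumes that the API accepts HTTP Basic authentication, as the curl -u option implies; the helper names build_search_url and fetch_json are hypothetical. Only the URL is built when the script runs; call fetch_json with your own credentials to download.

```python
import base64
import json
import urllib.parse
import urllib.request

API_ROOT = "https://v2-api.npolar.no/biology/fielddata"


def build_search_url(filters, ecotox=False, verbose=True):
    """Build a _search URL with one and=field:value pair per filter."""
    path = "/_all_/ecotox/_search" if ecotox else "/_search"
    query = [("and", f"{field}:{value}") for field, value in filters]
    if verbose:
        query.append(("verbose", "true"))
    query.append(("page", ".."))  # all pages, not just the first
    # Keep ":" unescaped so the URL matches the curl examples.
    return API_ROOT + path + "?" + urllib.parse.urlencode(query, safe=":")


def fetch_json(url, email, password):
    """Download and parse a response, assuming HTTP Basic auth as with curl -u."""
    token = base64.b64encode(f"{email}:{password}".encode()).decode()
    request = urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


url = build_search_url([("scientificName", "Larus hyperboreus"),
                        ("dynamicProperties.matrix", "plasma")])
print(url)
```

Passing ecotox=True switches to the ecotox lab endpoint with the same filters.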
2.3 Get data back using Rlang
Code snippet to get all data into R:
# Download two databases: fieldwork, with info about field trips and registered measurements,
# and lab, with ecotox lab results. The full download is too large to be merged in R; the merge
# has to be done afterwards in Excel.
# Load libraries. If you don't have them, they can be installed with e.g. install.packages('jsonlite').
library(jsonlite)
fieldwork_json = fromJSON("https://v2-api.npolar.no/biology/fielddata/?page=..&includeData=true")
lab_json = fromJSON("https://v2-api.npolar.no/biology/fielddata/_all_/ecotox/?page=..&includeData=true")
# Traverse JSON hierarchy
fieldwork_df = fieldwork_json$items$data
lab_df = lab_json$items$data
# View the data and list the column names
View(fieldwork_df)
head(fieldwork_df, 5)
names(fieldwork_df)
# Flatten nested data frames into ordinary columns
field_df_flat = flatten(fieldwork_df, recursive = TRUE)
lab_df_flat = flatten(lab_df, recursive = TRUE)
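For Python users, the traversal and flattening steps can be sketched as follows. The items/data layout is assumed from the R code above (fieldwork_json$items$data), the function names are hypothetical, and the payload is a made-up miniature response. Nested objects become dotted column names such as dynamicProperties.matrix, similar to the names produced by jsonlite's flatten.

```python
def flatten_record(record, parent_key="", sep="."):
    """Flatten nested dicts into one level with dotted keys."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the prefix along.
            flat.update(flatten_record(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


def records_from_response(payload):
    """Pick out items[].data (layout assumed from the R snippet) and flatten each."""
    return [flatten_record(item["data"]) for item in payload["items"]]


# Made-up miniature response, for illustration only.
payload = {
    "items": [
        {"data": {"scientificName": "Larus hyperboreus",
                  "dynamicProperties": {"matrix": "plasma"}}}
    ]
}
print(records_from_response(payload))
```

The resulting list of flat dicts can be written out with csv.DictWriter if a spreadsheet is needed.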
2.4 How to locate and search data in the database
To find what you are looking for, you need to know the parameter names.
The parameter names follow the Darwin Core standard; however, the ecotox database also has parameters under dynamicProperties that are tailored to NPI’s needs. Some of the most used fields in biology-fielddata:
For the dataset biology-fielddata-ecotox, many of the same variables exist, but also:
To merge biology-fielddata and biology-fielddata-ecotox, use the field:
Missing a variable? The JSON schemas for fieldwork and ecotox hold all variables with short descriptions (you need to log in with your NPI login using LDAP):
https://gitlab.npolar.no/npdc/other/npdc-docs-public-repo/schemas/-/blob/main/v2-jsonschema/biology/fielddata/fielddata.v1.0.5.json
https://gitlab.npolar.no/npdc/other/npdc-docs-public-repo/schemas/-/blob/main/v2-jsonschema/biology/fielddata/ecotox.v1.0.5.json
Examples (use curl to log in; the links are not available through the browser):
Ecotox fieldwork database, search for glaucous gulls and matrix plasma:
https://v2-api.npolar.no/biology/fielddata/_search?and=scientificName:Larus+hyperboreus&and=dynamicProperties.matrix:plasma&verbose=true&page=..&type=feed
https://v2-api.npolar.no/biology/fielddata/_all_/ecotox/_search?and=scientificName:Larus+hyperboreus&and=dynamicProperties.matrix:plasma&page=..
Get all data for project MOSJ (only the field database has the parameter projectName):
https://v2-api.npolar.no/biology/fielddata/_search?and=dynamicProperties.projectName:MOSJ&verbose=true&page=..
&page=.. means all data, not just the first page.
&verbose=true means include all metadata as well.
&type=feed means download as nd-json.
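An nd-json (newline-delimited JSON) download holds one JSON document per line, so it cannot be read with a single JSON parse. A minimal Python sketch for reading such text is shown here; the function name parse_ndjson and the two sample lines are made up for illustration.

```python
import json


def parse_ndjson(text):
    """Parse newline-delimited JSON: one JSON document per non-empty line."""
    # Split on "\n" only, since JSON strings may contain other Unicode line separators.
    return [json.loads(line) for line in text.split("\n") if line.strip()]


# Made-up sample with two records.
sample_feed = (
    '{"scientificName": "Larus hyperboreus"}\n'
    '{"scientificName": "Uria lomvia"}\n'
)
records = parse_ndjson(sample_feed)
print(len(records), records[0]["scientificName"])
```

For large downloads, the same loop can read a file line by line instead of holding the whole text in memory.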
2.5 Project repository
Project repository with scripts to transform raw Excel files (requires special access): https://gitlab.npolar.no/eds/other/ecotox