This file describes the steps required to map the data to the Darwin Core Description extension.

1 Setup

Load libraries:

2 Read data

Define data types
Read habitat data (raw data)
Read native range (raw data)
Read ecofunctional_group (interim)
Read interim_literature_references (interim)
Read remove_taxa (interim)
Read core_taxa
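The reading step can be illustrated with a minimal Python sketch. The code for this step is not shown here, so the sample content, the column names (`taxonID`, `habitat`) and the helper `read_typed` are all hypothetical. The idea is to declare the column types up front so that identifiers are not silently coerced while reading.

```python
import csv
import io

# Hypothetical raw habitat data; the real file and its columns are not shown here.
RAW_HABITAT = """taxonID,habitat
1,marine
2,
3,freshwater
"""

# Assumed column types: everything is read as text to avoid silent coercion.
COLUMN_TYPES = {"taxonID": str, "habitat": str}


def read_typed(handle, column_types):
    """Read CSV rows as dicts, casting each declared column with its type."""
    reader = csv.DictReader(handle)
    return [
        {column: cast(row[column]) for column, cast in column_types.items()}
        for row in reader
    ]


habitat = read_typed(io.StringIO(RAW_HABITAT), COLUMN_TYPES)
```

Empty cells are kept as empty strings at this stage; they are removed later by the transformation function.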

3 Create transformation functions

The description extension will be a combination of the following descriptors:

- habitat
- native range
- ecofunctional group

Each descriptor is imported as a separate dataset. We need to apply the same transformation steps to each dataset:

Remove all empty values (if applicable)
Keep distinct values only (if applicable)
Map description
Map type
Select relevant columns

As these steps are repeated for each of the description datasets, we write a function here.
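A minimal Python sketch of such a function follows. The function name `to_description`, its parameters and the sample rows are assumptions made for illustration; only the five steps come from the list above.

```python
def to_description(rows, value_column, description_type):
    """Turn one descriptor dataset into description-extension rows."""
    seen = set()
    result = []
    for row in rows:
        value = (row.get(value_column) or "").strip()
        if not value:
            # Step 1: remove empty values.
            continue
        key = (row["taxonID"], value)
        if key in seen:
            # Step 2: keep distinct values only.
            continue
        seen.add(key)
        # Steps 3 to 5: map description, map type, keep relevant columns only.
        result.append({
            "taxonID": row["taxonID"],
            "description": value,
            "type": description_type,
        })
    return result


# Hypothetical native range rows, including a duplicate and an empty value.
native_range = [
    {"taxonID": "1", "native_range": "Asia", "note": "x"},
    {"taxonID": "1", "native_range": "Asia", "note": "y"},
    {"taxonID": "2", "native_range": ""},
    {"taxonID": "3", "native_range": "Africa"},
]
descriptions = to_description(native_range, "native_range", "native range")
```

Passing the value column and the type as arguments is what lets the same function serve habitat, native range and ecofunctional group.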

4 Habitat

Generate eunis_habitat

Apply transformation function:

5 Native range

6 Ecofunctional group

7 Descriptors extension mapping

Combine habitat, native_range and ecofunctional_group
Link with sourceid from literature_references
Keep column names for description extension only and rename
Remove all taxonIDs that are not included in the taxon core
Scan for duplicated taxa (see this issue):
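The mapping steps in this section can be sketched in Python as follows. Several points are assumptions, not taken from the source: that references are linked by `taxonID`, that the source column is named `source`, and that a "duplicated taxon" means a taxon with more than one row for the same type. All sample values are invented.

```python
from collections import Counter

# Hypothetical, already transformed descriptor datasets.
habitat = [{"taxonID": "1", "description": "marine", "type": "habitat"}]
native_range = [
    {"taxonID": "1", "description": "Asia", "type": "native range"},
    {"taxonID": "1", "description": "Africa", "type": "native range"},
    {"taxonID": "9", "description": "Europe", "type": "native range"},
]
ecofunctional_group = [
    {"taxonID": "2", "description": "algae", "type": "ecofunctional group"},
]

# Assumed lookup from taxonID to sourceid, and the set of IDs in the taxon core.
literature_references = {"1": "src-a", "2": "src-b"}
core_taxon_ids = {"1", "2"}

# Combine the three descriptor datasets.
combined = habitat + native_range + ecofunctional_group

# Link the source, keep extension columns only, drop taxa outside the core.
extension = [
    {
        "taxonID": row["taxonID"],
        "description": row["description"],
        "type": row["type"],
        "source": literature_references.get(row["taxonID"], ""),
    }
    for row in combined
    if row["taxonID"] in core_taxon_ids
]

# Scan for taxa that occur more than once for the same type.
counts = Counter((row["taxonID"], row["type"]) for row in extension)
duplicated = sorted(key for key, count in counts.items() if count > 1)
```

Here `duplicated` lists the (taxonID, type) pairs that need a manual check; whether such pairs are errors depends on the dataset.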

Change data type of taxonid to numeric:
Sort on taxonid
Preview data:

Export as .csv
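A minimal Python sketch of the final steps (numeric conversion, sorting and export) is given below, with invented sample rows and an in-memory buffer standing in for the output file, whose path is not given here. Converting `taxonID` to a number before sorting matters because, sorted as text, "10" would come before "2".

```python
import csv
import io

# Hypothetical extension rows with taxonID still stored as text.
extension = [
    {"taxonID": "10", "description": "marine", "type": "habitat"},
    {"taxonID": "2", "description": "Asia", "type": "native range"},
]

# Change the data type of taxonID to numeric.
for row in extension:
    row["taxonID"] = int(row["taxonID"])

# Sort on taxonID (numerically, so 2 comes before 10).
extension.sort(key=lambda row: row["taxonID"])

# Export as CSV; a StringIO buffer is used here in place of a real file.
buffer = io.StringIO()
writer = csv.DictWriter(
    buffer,
    fieldnames=["taxonID", "description", "type"],
    lineterminator="\n",
)
writer.writeheader()
writer.writerows(extension)
csv_text = buffer.getvalue()
```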