4 Unify taxa
In this chapter we unify taxa on their verificationKey.
4.1 Read taxa
Read taxa from data/interim/taxa_with_verification.csv.
4.2 Unify taxa
Remove taxa without verificationKey.
Separate multiple verificationKeys (if any) for single taxa.
Group taxa by verificationKey, saving the datasetKey and taxonKey of the taxa that are bundled per key in datasetKeys and taxonKeys.
Extract verificationKey as a vector.
Number of unique taxa: 3905
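The four steps above can be sketched with dplyr and tidyr on a toy data frame. This is a minimal sketch, not the chapter's actual code: the example values, the comma separator inside verificationKey, the comma-joined format of datasetKeys and taxonKeys, and the object names taxa and verification_keys are all assumptions.

```r
library(dplyr)
library(tidyr)

# Toy input: column names follow the text, values are invented for illustration
taxa <- tibble(
  taxonKey        = c(1L, 2L, 3L, 4L),
  datasetKey      = c("a", "a", "b", "b"),
  verificationKey = c("100", "100,200", NA, "200")
)

taxa_unified <- taxa %>%
  # Remove taxa without verificationKey
  filter(!is.na(verificationKey)) %>%
  # One row per verificationKey (assumes a comma-separated field)
  separate_rows(verificationKey, sep = ",") %>%
  mutate(verificationKey = trimws(verificationKey)) %>%
  # Group by verificationKey and bundle the source keys per group
  group_by(verificationKey) %>%
  summarise(
    datasetKeys = paste(unique(datasetKey), collapse = ","),
    taxonKeys   = paste(unique(taxonKey), collapse = ",")
  )

# Extract verificationKey as a vector
verification_keys <- taxa_unified %>% pull(verificationKey)

message("Number of unique taxa: ", length(verification_keys))
```

In this toy example, taxon 2 carries two keys and therefore contributes to both unified taxa, while taxon 3 is dropped because it has no verificationKey.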
4.3 Get GBIF backbone taxonomy information
Even though we stored some backbone information for most of our taxa in the previous steps, we want to start from scratch here and retrieve it from GBIF again, as 1) some taxon keys in verificationKeys will be new and 2) we want to store more attributes per taxon this time.
Get GBIF backbone taxonomy information.
Rename accepted to acceptedName.
Select columns of interest.
Join backbone information with our unified taxa, so we keep datasetKeys and taxonKeys.
Move columns datasetKeys and taxonKeys to the end.
Preview merged information:
Number of taxa: 3905
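The rename, select, join and reorder steps can be illustrated as follows. This is a hedged sketch on toy data: the backbone data frame stands in for whatever the GBIF lookup returns (one option, assumed here and not confirmed by the text, is calling rgbif::name_usage(key = ...) per key and binding the results), and the selected columns are only an example of "columns of interest".

```r
library(dplyr)

# Toy stand-in for the unified taxa from the previous section
taxa_unified <- tibble(
  verificationKey = c(100L, 200L),
  datasetKeys     = c("a", "a,b"),
  taxonKeys       = c("1,2", "2,4")
)

# Toy stand-in for the backbone lookup result (all values invented)
backbone <- tibble(
  key             = c(100L, 200L),
  scientificName  = c("Aus bus L.", "Cus dus L."),
  taxonomicStatus = c("ACCEPTED", "SYNONYM"),
  accepted        = c(NA, "Cus eus L."),
  issues          = c("", "")
)

backbone_selected <- backbone %>%
  # Rename accepted to acceptedName
  rename(acceptedName = accepted) %>%
  # Select columns of interest (an illustrative subset)
  select(key, scientificName, taxonomicStatus, acceptedName)

taxa_backbone <- taxa_unified %>%
  # Join on the key so datasetKeys and taxonKeys are kept
  left_join(backbone_selected, by = c("verificationKey" = "key")) %>%
  # Move datasetKeys and taxonKeys to the end
  relocate(datasetKeys, taxonKeys, .after = last_col())

message("Number of taxa: ", nrow(taxa_backbone))
```

Joining from the unified taxa keeps one row per verificationKey even if the lookup returns nothing for a key, which makes missing backbone matches visible as NA values.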
4.4 Explicitly remove incorrect taxa
Some taxa are purposely excluded from a source checklist (e.g. see this issue for Alien birds), but still end up in the unified checklist because they are incorrectly included in another source checklist that is no longer updated (e.g. RINSE pathways). Here we explicitly remove those taxa:
# Store the number of taxa before filtering, to report how many are removed
ntaxa <- nrow(taxa_unified)
taxa_unified <-
  taxa_unified %>% filter(
    scientificName != "Anser fabalis (Latham, 1787)",
    scientificName != "Anser anser (Linnaeus, 1758)",
    scientificName != "Branta leucopsis (Bechstein, 1803)"
  )
Number of removed taxa: 3
Total number of taxa: 3902
Save to CSV.
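The counts reported above and the final save could be produced along these lines. This is a sketch under assumptions: the toy taxa_unified replaces the real data, n_removed and out_file are hypothetical names, and the output location is a temporary file because the actual output path is not given in this chapter.

```r
library(dplyr)
library(readr)

# Toy stand-in for the unified taxa (the second name is invented)
taxa_unified <- tibble(
  scientificName = c("Anser fabalis (Latham, 1787)", "Aus bus L.")
)

ntaxa <- nrow(taxa_unified)
taxa_unified <- taxa_unified %>%
  filter(scientificName != "Anser fabalis (Latham, 1787)")

# Removed taxa = rows before filtering minus rows after filtering
n_removed <- ntaxa - nrow(taxa_unified)
message("Number of removed taxa: ", n_removed)
message("Total number of taxa: ", nrow(taxa_unified))

# Hypothetical output location; replace with the project's own path
out_file <- tempfile(fileext = ".csv")
write_csv(taxa_unified, out_file, na = "")
```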