The large databases of EMODnet Biology only store confirmed presences of taxa. However, when mapping taxon distributions, it is also important to know where a taxon did not occur: there is at least as much information in absences as in presences. Inferring absences from presence-only databases is difficult and always involves some guesswork. In this product we have used as much meta-information as possible to guide the inference of absences. Important meta-information is available at two different levels: the level of the dataset and the level of the taxon.
Datasets can contain implicit information on absences when they have uniformly searched for the same taxa over a number of sample locations. Normally, if a taxon had been present at such a location, it would have been recorded. Other datasets, however, are not informative at all about absences. Typical examples are museum collections. The fact that a specimen was found at a particular place confirms that the taxon lived there, but gives no information on the presence or absence of any other taxon at the same spot. A complication is that some datasets have searched for only a restricted part of the total community, e.g. they sampled shellfish but not worms. In this case, the absence of a shellfish taxon is relevant, but the absence of a worm is not. The dataset can only be used to infer absence for the taxa it has targeted. Here we implicitly assume that a dataset inventorying the endomacrobenthos targets all taxa belonging to this functional group. Usually, the distinction can be made on the basis of the metadata. It is also helpful to plot the total number of taxa against the total number of samples for each dataset. Incomplete datasets have far fewer taxa than expected for their size, compared to 'complete' datasets.
At the taxon level, taxonomic registers such as WoRMS (WoRMS Editorial Board, 2021) give information on the functional group a taxon belongs to. This information is present for many taxa, but it is most likely incomplete. The size of the register precludes any easy test of the completeness of the traits. However, even if incomplete, the register trait data can be used to select the most useful datasets. Using an incomplete register directly to restrict the taxa to be mapped would cause a loss of interesting information. Therefore, the present workflow contains additional steps that start from the datasets identified as promising, rather than from a taxon list based on the register's traits.
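The dataset-level reasoning can be illustrated with a minimal Python sketch. It is not the workflow's actual implementation: the record layout, the dataset names, the `targeted_taxa` lookup and both function names are assumptions made for illustration only. A zero is written only for taxa that a dataset is assumed to have searched for systematically. Incidental records of other taxa are kept as presences, and datasets without targeting metadata (such as museum collections) contribute presences only.

```python
from collections import defaultdict

# Hypothetical presence-only records: (dataset_id, sample_id, taxon).
records = [
    ("shellfish_survey", "s1", "Abra alba"),
    ("shellfish_survey", "s1", "Nephtys hombergii"),  # incidental worm record
    ("shellfish_survey", "s2", "Cerastoderma edule"),
    ("museum_collection", "m1", "Abra alba"),
]

# Assumed metadata: the taxa each dataset searched for in every sample.
# Datasets missing from this lookup are treated as uninformative on absences.
targeted_taxa = {
    "shellfish_survey": {"Abra alba", "Cerastoderma edule"},
}


def infer_presence_absence(records, targeted_taxa):
    """Return sorted rows (dataset_id, sample_id, taxon, present), present in {0, 1}."""
    seen_per_sample = defaultdict(set)
    for dataset_id, sample_id, taxon in records:
        seen_per_sample[(dataset_id, sample_id)].add(taxon)

    rows = []
    for (dataset_id, sample_id), seen in sorted(seen_per_sample.items()):
        targeted = targeted_taxa.get(dataset_id, set())
        # Targeted taxa get a 0 when not seen; non-targeted taxa appear
        # only when they were actually recorded (always as a presence).
        for taxon in sorted(targeted | seen):
            rows.append((dataset_id, sample_id, taxon, int(taxon in seen)))
    return rows


def taxa_and_sample_counts(records):
    """Per dataset: (number of distinct taxa, number of distinct samples)."""
    taxa, samples = defaultdict(set), defaultdict(set)
    for dataset_id, sample_id, taxon in records:
        taxa[dataset_id].add(taxon)
        samples[dataset_id].add(sample_id)
    return {d: (len(taxa[d]), len(samples[d])) for d in taxa}


table = infer_presence_absence(records, targeted_taxa)
counts = taxa_and_sample_counts(records)
for row in table:
    print(row)
print(counts)
```

In this toy example the shellfish survey yields an inferred absence of Abra alba in sample `s2`, but no absence of the worm Nephtys hombergii, because worms were not targeted. The pairs returned by `taxa_and_sample_counts` are the quantities one would plot against each other to spot datasets with unexpectedly few taxa for their size. Note that a sample in which nothing at all was recorded cannot be recovered from presence-only records and would have to come from the sampling-event metadata.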