Validation rules and messages

Introduction

The automated validation tool checks the submitted data against a set of defined rules. Each rule has a code, which you can look up in the tables below to find an explanation of the problem and the action to take.

Each validation rule generates one of three outcomes:

  1. Error: The data is invalid and cannot be submitted for manual validation or to GBIF until it is corrected.
  2. Warning: The data may be correct but looks suspicious. Check it and confirm that it is correct before submission.
  3. Alert: A special warning about vector distribution. Check that the data is correct; if it is, the record suggests that the species has been detected in an area where it was previously considered absent.

Errors

Structural error

This error relates to the overall structure of the submitted worksheet.

| Code | Message | Explanation / actions | Rule |
| --- | --- | --- | --- |
| s01 | Missing required variables/columns | The workbook is missing some required columns. Generate a new customised Excel template using the link in your email to ensure that all columns are present. | The 1.DATA INPUT worksheet header row contains all columns in the list of required columns |
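
As an illustration of rule s01, the following minimal sketch compares a worksheet header row against a list of required columns. The `REQUIRED_COLUMNS` list and the function name are assumptions made for this example; the list is only a subset built from column names that appear on this page, not the tool's actual definition.

```python
# Hypothetical subset of required columns, taken from column names on this page.
REQUIRED_COLUMNS = [
    "projectID",
    "country",
    "higherGeographyID",
    "decimalLatitude",
    "decimalLongitude",
]


def missing_required_columns(header_row, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from the header row."""
    present = {str(cell).strip() for cell in header_row if cell is not None}
    return [name for name in required if name not in present]


# Example header row, as it might be read from the 1.DATA INPUT worksheet.
header = ["projectID", "country", "decimalLatitude", "decimalLongitude", "scientificName"]
missing = missing_required_columns(header)
if missing:
    print("s01 Missing required variables/columns:", ", ".join(missing))
```

Reporting the names of the missing columns, rather than only the code, makes it easier to see whether regenerating the template is needed.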

Data errors

These errors relate to the content of specific columns.

| Code | Column | Message | Explanation / actions | Rule |
| --- | --- | --- | --- | --- |
| e01 | projectID | Missing required value | Include a value for projectID | projectID cannot be empty or blank |
| e02 | country | Missing required value | Include a value for country | country cannot be empty or blank |
| e03 | higherGeographyID | Missing required value | Include a value for higherGeographyID | higherGeographyID cannot be empty or blank |
| e04 | decimalLatitude | Missing required value | Include a value for decimalLatitude | decimalLatitude cannot be empty or blank |
| e05 | decimalLongitude | Missing required value | Include a value for decimalLongitude | decimalLongitude cannot be empty or blank |
| e06 | coordinatePrecision | Missing required value | Include a value for coordinatePrecision | coordinatePrecision cannot be empty or blank |
| e07 | CollectionEffortStartDate | Missing required value | Include a value for CollectionEffortStartDate | CollectionEffortStartDate cannot be empty or blank |
| e08 | CollectionEffortEndDate | Missing required value | Include a value for CollectionEffortEndDate | CollectionEffortEndDate cannot be empty or blank |
| e09 | samplingProtocol | Missing required value | Include a value for samplingProtocol | samplingProtocol cannot be empty or blank |
| e10 | individualCount | Missing required value | Include a value for individualCount | individualCount cannot be empty or blank |
| e11 | sampleSizeValue | Missing required value | Include a value for sampleSizeValue | sampleSizeValue cannot be empty or blank |
| e12 | scientificName | Missing required value: a vector name should be provided | Include a value for scientificName | scientificName cannot be empty or blank |
| e13 | decimalLatitude | The data type is not valid. Expected numeric value. | Provide a valid numeric value for decimalLatitude. Ensure that the format is recognised by Excel as a decimal value (try changing the number of decimal places to check). | decimalLatitude must be numeric. |
| e14 | decimalLongitude | The data type is not valid. Expected numeric value. | Provide a valid numeric value for decimalLongitude. Ensure that the format is recognised by Excel as a decimal value (try changing the number of decimal places to check). | decimalLongitude must be numeric. |
| e15 | coordinatePrecision | The data type is not valid. Expected numeric value. | Provide a valid numeric value for coordinatePrecision. Ensure that the format is recognised by Excel as a decimal value (try changing the number of decimal places to check). | coordinatePrecision must be numeric. |
| e16 | sampleSizeValue | The data type is not valid. Expected numeric value. | Provide a valid numeric value for sampleSizeValue. This should be an integer. Check that Excel is not interpreting the data as text. | sampleSizeValue must be numeric. |
| e17 | individualCount | The data type is not valid. Expected integer value (no decimals). | Provide a valid integer value for individualCount. Check that Excel is not interpreting the data as text. | individualCount must be an integer. |
| e18 | CollectionEffortStartDate | The data type is not valid. Expected date value in Excel date format (yyyy-mm-dd). Check for dates stored as text. | Format CollectionEffortStartDate as a valid Excel date. Try changing the date format (e.g. short to long) to check that Excel is recognising the value as a date. | CollectionEffortStartDate must be an Excel date. |
| e19 | CollectionEffortEndDate | The data type is not valid. Expected date value in Excel date format (yyyy-mm-dd). Check for dates stored as text. | Format CollectionEffortEndDate as a valid Excel date. Try changing the date format (e.g. short to long) to check that Excel is recognising the value as a date. | CollectionEffortEndDate must be an Excel date. |
| e20 | identifiedByID | The data type is not valid. The entered value doesn’t fit the defined ORCID format, starting with https://orcid.org. Also note that multiple values should be separated by \| | Provide a valid ORCID iD starting with https://orcid.org. | identifiedByID must match the ORCID format. |
| e21 | verbatimSiteNames | The data type is not valid. The length of the text shouldn’t be longer than three characters. | Shorten the text length to three characters or fewer. | verbatimSiteNames length must be <= 3 characters. |
| e23 | country | The submitted value is not valid. Please use only proposed values. | Choose a valid country from the standard list. | country must belong to the approved list. |
| e24 | higherGeographyID | The submitted value is not valid. Please use only proposed values. | Choose a valid geography identifier from the proposed list. Look at the interactive map interface to confirm the correct location and NUTS code. | higherGeographyID must belong to the approved NUTS list. |
| e25 | coordinatePrecision | The submitted value is not valid. Please use only proposed values. | Choose a precision value from the proposed options (0.01, 0.001, etc.). | coordinatePrecision must belong to the approved list. |
| e26 | samplingProtocol | The submitted value is not valid. Please use only proposed values. | Choose a valid sampling protocol from the proposed list. | samplingProtocol must belong to the approved list. |
| e27 | sampleSizeUnit | The submitted value is not valid. Please use only proposed values. | Choose a valid sample size unit from the proposed list. | sampleSizeUnit must belong to the approved list. |
| e28 | CollectionEffortStartDate | Invalid date value. The parsed date is in the future. Please check that the provided date is correct and in yyyy-mm-dd format. | Ensure the collection start date is not in the future. | CollectionEffortStartDate cannot be a future date. |
| e29 | CollectionEffortEndDate | Invalid date value. The parsed date is in the future. Please check that the provided date is correct and in yyyy-mm-dd format. | Ensure the collection end date is not in the future. | CollectionEffortEndDate cannot be a future date. |
| e30 | CollectionEffortStartDate | Please confirm that the date value is correct and in Excel date format (yyyy-mm-dd). The parsed date is before the year 1920. | Verify that the start date is correct and after 1920. | CollectionEffortStartDate must be after 1920-01-01. |
| e31 | CollectionEffortEndDate | Please confirm that the date value is correct and in Excel date format (yyyy-mm-dd). The parsed date is before the year 1920. | Verify that the end date is correct and after 1920. | CollectionEffortEndDate must be after 1920-01-01. |
| e32 | CollectionEffortStartDate | Invalid date value. The ‘CollectionEffortEndDate’ should be greater than ‘CollectionEffortStartDate’ (and vice-versa). | Adjust dates so that the end date is greater than or equal to the start date. | CollectionEffortEndDate must be >= CollectionEffortStartDate. |
| e33 | scientificName | The submitted vector is not referenced in the GBIF vocabulary or is not from the correct vector group for this spreadsheet. Please check the species. If the reference is correct, please contact your VectorNet contact point to request that the vector be added. | Verify the species name is valid according to the Darwin Core Taxonomy or contact VectorNet to add the vector. | scientificName must match GBIF/VectorNet vocabulary. |
| e34 | associatedTaxa | The submitted host is not referenced in the GBIF vocabulary. Please check the species. If the reference is correct, please email your VectorNet contact point. | Verify the host species or contact VectorNet to add the host. The host should be given as a scientific name. | associatedTaxa must match GBIF vocabulary. |
| e35 | country | The submitted coordinates (longitude and latitude) are not in the specified country. See the interactive maps for more details. Please verify that the coordinates use the WGS84 reference system. | Ensure coordinates fall within the specified country boundary and use the WGS84 reference system. Use the interactive map interface to check for problems. | Coordinates must match the specified country. |
| e36 | sex | The submitted value is not valid. Please use only proposed values. | Choose a valid sex value from the proposed list. | sex must match the approved list. |
| e37 | lifeStage | The submitted value is not valid. Please use only proposed values. | Choose a valid life stage from the proposed list. | lifeStage must match the approved list. |
| e38 | occurrenceRemarks | The submitted value is not valid. Please use only proposed values. | Choose a valid occurrence remark from the proposed list. | occurrenceRemarks must match the approved list. |
| e39 | associatedTaxa | A host value was provided for a non-tick vector and a sampling protocol not using a bait. | Remove the host or update the protocol/vector group. | associatedTaxa value is only allowed for ticks or baited protocols. |
| e40 | associatedTaxa | Missing required value: a host should have been provided based on sampling protocol. | Provide an associated host value. | associatedTaxa value is required for this sampling protocol. |
| e41 | decimalLatitude | The value is out of range. Latitude must be between -90 and 90. | Enter a latitude value between -90 and 90. | decimalLatitude must be within [-90, 90]. |
| e42 | decimalLongitude | The value is out of range. Longitude must be between -180 and 180. | Enter a longitude value between -180 and 180. | decimalLongitude must be within [-180, 180]. |
| e43 | sampleSizeUnit | Missing required value | Include a value for sampleSizeUnit. | sampleSizeUnit cannot be empty or blank. |
| e44 | occurrenceRemarks | Missing required value: an occurrence status should be provided | Include an occurrence status in occurrenceRemarks. | occurrenceRemarks cannot be empty or blank. |
| e45 | locationAccordingTo | Missing required value: information about the source of the location information should be provided | Include a source for the location information. | locationAccordingTo cannot be empty or blank. |
| e46 | locationAccordingTo | The submitted value is not valid. Please use only proposed values. | Choose a proposed value such as ‘GPS Coordinates’ or ‘Centroid of NUTS3/GAUL’. | locationAccordingTo must use approved values. |
| e47 | bibliographicCitation | Missing required value | Include a value for bibliographicCitation. | bibliographicCitation cannot be empty or blank. |
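
To make a few of these rules concrete, here is a minimal sketch of how the coordinate checks (e04/e05, e13/e14, e41/e42) and the date checks (e28 to e32) could be expressed for a single row. It is an illustration under stated assumptions, not the validation tool's code: the row is assumed to be a dictionary keyed by column name, dates are assumed to arrive as Python `date` or `datetime` objects, and the function names are hypothetical. Treating 1920-01-01 itself as acceptable is also an assumption.

```python
from datetime import date, datetime

EARLIEST_DATE = date(1920, 1, 1)  # boundary handling is an assumption


def is_blank(value):
    """True for None, empty strings and whitespace-only strings."""
    return value is None or str(value).strip() == ""


def as_date(value):
    """Normalise datetime to date; return None for anything that is not a date."""
    if isinstance(value, datetime):
        return value.date()
    return value if isinstance(value, date) else None


def check_row(row, today=None):
    """Return (code, column) pairs for a subset of the error rules."""
    today = today or date.today()
    findings = []

    # Coordinates: required value, then numeric type, then range.
    for missing, bad_type, out_of_range, column, limit in (
        ("e04", "e13", "e41", "decimalLatitude", 90),
        ("e05", "e14", "e42", "decimalLongitude", 180),
    ):
        value = row.get(column)
        if is_blank(value):
            findings.append((missing, column))
        elif isinstance(value, bool) or not isinstance(value, (int, float)):
            findings.append((bad_type, column))
        elif not -limit <= value <= limit:
            findings.append((out_of_range, column))

    # Dates: not in the future, not before 1920, end not before start.
    start = as_date(row.get("CollectionEffortStartDate"))
    end = as_date(row.get("CollectionEffortEndDate"))
    for future, too_old, column, value in (
        ("e28", "e30", "CollectionEffortStartDate", start),
        ("e29", "e31", "CollectionEffortEndDate", end),
    ):
        if value is None:
            continue  # missing or non-date values fall under e07/e08 and e18/e19
        if value > today:
            findings.append((future, column))
        elif value < EARLIEST_DATE:
            findings.append((too_old, column))
    if start is not None and end is not None and end < start:
        findings.append(("e32", "CollectionEffortStartDate"))

    return findings


example = {
    "decimalLatitude": 95.2,
    "decimalLongitude": "",
    "CollectionEffortStartDate": date(2024, 5, 10),
    "CollectionEffortEndDate": date(2024, 5, 1),
}
for code, column in check_row(example, today=date(2025, 1, 1)):
    print(code, column)
```

Checking in the order required value, type, range means each cell reports at most one of these errors, which keeps the feedback for a single cell short.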

Warnings

These warnings relate to the content of specific columns.

| Code | Column | Message | Explanation / actions | Rule |
| --- | --- | --- | --- | --- |
| w01 | decimalLatitude | A numerical sequence of more than 5 elements was identified. Please confirm that this is not an input error due to value dragging. | Confirm that this sequence is correct and not a result of Excel auto-fill dragging. | Check for sequence pattern errors in decimalLatitude. |
| w02 | decimalLongitude | A numerical sequence of more than 5 elements was identified. Please confirm that this is not an input error due to value dragging. | Confirm that this sequence is correct and not a result of Excel auto-fill dragging. | Check for sequence pattern errors in decimalLongitude. |
| w03 | sampleSizeValue | A numerical sequence of more than 5 elements was identified. Please confirm that this is not an input error due to value dragging. | Confirm that this sequence is correct and not a result of Excel auto-fill dragging. | Check for sequence pattern errors in sampleSizeValue. |
| w04 | individualCount | A numerical sequence of more than 5 elements was identified. Please confirm that this is not an input error due to value dragging. | Confirm that this sequence is correct and not a result of Excel auto-fill dragging. | Check for sequence pattern errors in individualCount. |
| w05 | CollectionEffortStartDate | A numerical sequence of more than 5 elements was identified. Please confirm that this is not an input error due to value dragging. | Confirm that this sequence is correct and not a result of Excel auto-fill dragging. | Check for sequence pattern errors in CollectionEffortStartDate. |
| w06 | CollectionEffortEndDate | A numerical sequence of more than 5 elements was identified. Please confirm that this is not an input error due to value dragging. | Confirm that this sequence is correct and not a result of Excel auto-fill dragging. | Check for sequence pattern errors in CollectionEffortEndDate. |
| w07 | scientificName | The submitted vector is not a usual species identified through vector surveillance in the VectorNet area (EU and neighbouring countries). | Double-check that the species name is accurate. If it is correct, request that the species be added to the standard list. | Check that scientificName is part of the standard list of VectorNet vector species. |
| w08 | associatedTaxa | The submitted host is not a usual species identified through vector surveillance in the VectorNet area (EU and neighbouring countries). | Double-check that the host name is accurate and intended for this region. | Check that associatedTaxa is part of the VectorNet standard list. |
| w09 | higherGeographyID | The submitted coordinates (longitude and latitude) are not in the specified NUTS unit. See the interactive map for more details. | Verify that the coordinates align with the designated NUTS region. | Coordinates should match the specified higherGeographyID. |
| w10 | verbatimIdentification | The submitted verbatimIdentification is not in the standard list or the scientificName is not a GBIF-compliant version (e.g. genus name when a species complex or s.l. is used) | Verify the compliance of the scientific name or standard listing of verbatim identification. | Validate taxonomy compliance for verbatimIdentification. |
| w11 | projectID | The projectID does not include a value from the standard list. | Verify that the project identifier is valid according to standard project designations. This field can include other codes, separated by \| | projectID should be in the standard list. |
| c01 | 1.DATA INPUT | Some columns are not part of the expected columns (required or optional). They are ignored. If you would like them to be imported into the GBIF data, please contact EFSA to ask for a new optional column to be implemented. | Remove unexpected columns or reach out to EFSA to add them as standard optional columns. | Worksheet columns must match the schema definitions for 1.DATA INPUT. |
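
The sequence warnings (w01 to w06) can be pictured with the sketch below. It assumes that a "numerical sequence" means consecutive rows whose values change by the same non-zero step, which is what dragging a cell with Excel auto-fill typically produces; the page does not define the exact test, so this interpretation, the function name and the `min_length` parameter are all assumptions. The sketch uses exact comparison and is written for numeric columns; decimal coordinates would likely need a tolerance, and date columns would need their own handling.

```python
def find_dragged_runs(values, min_length=6):
    """Return (first_index, last_index) pairs for runs of at least
    min_length consecutive values that change by the same non-zero step."""
    runs = []
    start = 0     # index of the first value in the current run
    step = None   # common difference of the current run, None if no run yet
    for i in range(1, len(values)):
        diff = values[i] - values[i - 1]
        if diff != 0 and diff == step:
            continue  # the current run keeps going
        # The run (if any) ended at index i - 1 and holds i - start values.
        if step is not None and i - start >= min_length:
            runs.append((start, i - 1))
        if diff != 0:
            start, step = i - 1, diff   # a new run starts at the previous value
        else:
            start, step = i, None       # repeated value: no run in progress
    if step is not None and len(values) - start >= min_length:
        runs.append((start, len(values) - 1))
    return runs


# individualCount values where the first six rows look like a dragged series.
counts = [10, 11, 12, 13, 14, 15, 15, 3]
for first, last in find_dragged_runs(counts):
    print(f"w04 possible dragged sequence in rows {first} to {last}")
```

With `min_length=6`, a run must contain more than 5 elements before it is reported, matching the wording of the message. Repeated identical values are deliberately not treated as a sequence here.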

Alerts

This alert relates to the distribution of vectors.

| Code | Column | Message | Explanation / actions | Rule |
| --- | --- | --- | --- | --- |
| a01 | scientificName | The species has been identified in an area in which it has previously been considered absent. Check the species and location. | Confirm the combination of species and location coordinates to ensure accuracy for potential new distribution records. Check the interactive map to see the distribution of vector species. | If the verbatimIdentification or scientificName is present in the VectorNet species distribution dataset, an alert is raised if the species is recorded in a NUTS3/GAUL administrative unit in which the species has been defined as absent. |
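
The a01 rule amounts to a lookup of species and administrative unit against known distribution statuses. The sketch below shows that idea only. The `KNOWN_STATUS` table, its status labels and the unit codes `XX001`/`XX002` are made-up placeholders, not values from the VectorNet species distribution dataset, and the function name is hypothetical.

```python
# Made-up placeholder data: (species name, administrative unit code) -> status.
KNOWN_STATUS = {
    ("Aedes albopictus", "XX001"): "absent",
    ("Aedes albopictus", "XX002"): "established",
}


def raises_a01(names, admin_unit, known_status=KNOWN_STATUS):
    """True if any supplied name is defined as absent in the administrative unit.

    `names` holds the verbatimIdentification and scientificName of one record;
    names that do not appear in the lookup never raise the alert.
    """
    return any(
        known_status.get((name, admin_unit)) == "absent"
        for name in names
        if name
    )


record_names = ["Aedes albopictus", None]
if raises_a01(record_names, "XX001"):
    print("a01 species recorded in a unit where it has been defined as absent")
```

Because the alert may indicate a genuine new distribution record, a result of `True` calls for checking the species and coordinates rather than for discarding the row.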