Validation rules and messages
Introduction
The automated validation tool checks the submitted data against a set of defined rules. Each rule has a code, which you can look up below for an explanation of the problem.
Validation rules generate three outcomes:
- Error: The data is invalid and cannot be submitted for manual validation or to GBIF until it is corrected.
- Warning: The data may be correct, but looks suspicious. It needs to be checked to confirm that it is correct before submission.
- Alert: This is a special warning for vector distribution. The data needs to be checked to ensure it is correct. If it is correct, it suggests that the species has been detected in an area where it was previously considered absent.
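The three outcomes can be tied to rule codes with a small lookup. The sketch below assumes the outcome can be inferred from the first letter of the code, following how the codes are grouped in the tables on this page; the names `OUTCOME_BY_PREFIX`, `outcome_for` and `blocks_submission` are hypothetical and not part of the validation tool.

```python
# Assumption: the outcome type is inferred from the first letter of the rule
# code, mirroring how codes are grouped in the tables on this page.
OUTCOME_BY_PREFIX = {
    "s": "Error",    # structural error (s01)
    "e": "Error",    # data errors (e01, e02, ...)
    "w": "Warning",  # data warnings (w01, w02, ...)
    "c": "Warning",  # column warning (c01), listed with the warnings
    "a": "Alert",    # vector distribution alert (a01)
}


def outcome_for(code: str) -> str:
    """Return the outcome type for a rule code such as 'e13' or 'w02'."""
    prefix = code.strip().lower()[:1]
    if prefix not in OUTCOME_BY_PREFIX:
        raise ValueError(f"Unknown rule code: {code!r}")
    return OUTCOME_BY_PREFIX[prefix]


def blocks_submission(codes: list[str]) -> bool:
    """Only errors block submission; warnings and alerts need a manual check."""
    return any(outcome_for(code) == "Error" for code in codes)
```

For example, `blocks_submission(["w01", "a01"])` is false because neither code is an error, while adding `"s01"` to the list makes it true.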
Errors
Structural error
This error relates to the overall structure of the submitted worksheet.
| Code | Message | Explanation / actions | Rule |
|---|---|---|---|
| s01 | Missing required variables/columns | The workbook is missing some required columns. Generate a new customised Excel template using the link in your email to ensure that all columns are present. | The 1.DATA INPUT worksheet header row contains all columns in the list of required columns |
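
As an illustration of rule s01, the following sketch compares a worksheet header row with a list of required columns. The list shown is only an illustrative subset taken from the column names on this page; the authoritative list is defined by the validation tool and the customised template. The function names are hypothetical.

```python
# Illustrative subset only: the real list of required columns is defined by
# the validation tool, not by this sketch.
REQUIRED_COLUMNS = [
    "projectID",
    "country",
    "higherGeographyID",
    "decimalLatitude",
    "decimalLongitude",
]


def missing_required_columns(header_row, required=REQUIRED_COLUMNS):
    """Return the required column names that do not appear in the header row."""
    present = {str(cell).strip() for cell in header_row if cell is not None}
    return [name for name in required if name not in present]


def check_s01(header_row, required=REQUIRED_COLUMNS):
    """Return an s01 finding if any required column is missing, else None."""
    missing = missing_required_columns(header_row, required)
    if missing:
        return {
            "code": "s01",
            "message": "Missing required variables/columns",
            "columns": missing,
        }
    return None
```

A header row that lacks `decimalLongitude` would produce a finding whose `columns` entry is `["decimalLongitude"]`.
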
Data errors
These errors relate to the content of specific columns.
| Code | Column | Message | Explanation / actions | Rule |
|---|---|---|---|---|
| e01 | projectID | Missing required value | Include a value for projectID | projectID cannot be empty or blank |
| e02 | country | Missing required value | Include a value for country | country cannot be empty or blank |
| e03 | higherGeographyID | Missing required value | Include a value for higherGeographyID | higherGeographyID cannot be empty or blank |
| e04 | decimalLatitude | Missing required value | Include a value for decimalLatitude | decimalLatitude cannot be empty or blank |
| e05 | decimalLongitude | Missing required value | Include a value for decimalLongitude | decimalLongitude cannot be empty or blank |
| e06 | coordinatePrecision | Missing required value | Include a value for coordinatePrecision | coordinatePrecision cannot be empty or blank |
| e07 | CollectionEffortStartDate | Missing required value | Include a value for CollectionEffortStartDate | CollectionEffortStartDate cannot be empty or blank |
| e08 | CollectionEffortEndDate | Missing required value | Include a value for CollectionEffortEndDate | CollectionEffortEndDate cannot be empty or blank |
| e09 | samplingProtocol | Missing required value | Include a value for samplingProtocol | samplingProtocol cannot be empty or blank |
| e10 | individualCount | Missing required value | Include a value for individualCount | individualCount cannot be empty or blank |
| e11 | sampleSizeValue | Missing required value | Include a value for sampleSizeValue | sampleSizeValue cannot be empty or blank |
| e12 | scientificName | Missing required value: a vector name should be provided | Include a value for scientificName | scientificName cannot be empty or blank |
| e13 | decimalLatitude | The data type is not valid. Expected numeric value. | Provide a valid numeric value for decimalLatitude. Ensure that the format is recognised by Excel as a decimal value (try changing the number of decimal places to check). | decimalLatitude must be numeric. |
| e14 | decimalLongitude | The data type is not valid. Expected numeric value. | Provide a valid numeric value for decimalLongitude. Ensure that the format is recognised by Excel as a decimal value (try changing the number of decimal places to check). | decimalLongitude must be numeric. |
| e15 | coordinatePrecision | The data type is not valid. Expected numeric value. | Provide a valid numeric value for coordinatePrecision. Ensure that the format is recognised by Excel as a decimal value (try changing the number of decimal places to check). | coordinatePrecision must be numeric. |
| e16 | sampleSizeValue | The data type is not valid. Expected numeric value. | Provide a valid numeric value for sampleSizeValue. This should be an integer. Check that Excel is not interpreting the data as text. | sampleSizeValue must be numeric. |
| e17 | individualCount | The data type is not valid. Expected integer value (no decimals). | Provide a valid integer value for individualCount. Check that Excel is not interpreting the data as text. | individualCount must be an integer. |
| e18 | CollectionEffortStartDate | The data type is not valid. Expected date value in Excel date format (yyyy-mm-dd). Check for dates stored as text. | Format CollectionEffortStartDate as a valid Excel date. Try changing the date format (e.g. short to long) to check that Excel is recognising the value as a date. | CollectionEffortStartDate must be an Excel date. |
| e19 | CollectionEffortEndDate | The data type is not valid. Expected date value in Excel date format (yyyy-mm-dd). Check for dates stored as text. | Format CollectionEffortEndDate as a valid Excel date. Try changing the date format (e.g. short to long) to check that Excel is recognising the value as a date. | CollectionEffortEndDate must be an Excel date. |
| e20 | identifiedByID | The data type is not valid. The entered value doesn’t fit the defined OrcID format, starting with https://orcid.org. Also note that multiple DOIs should be separated by \| | Provide a value in valid ORCID format, starting with https://orcid.org. | identifiedByID must match the ORCID format. |
| e21 | verbatimSiteNames | The data type is not valid. The length of the text shouldn’t be longer than three characters. | Shorten the text length to three characters or fewer. | verbatimSiteNames length must be <= 3 characters. |
| e23 | country | The submitted value is not valid. Please use only proposed values. | Choose a valid country from the standard list. | country must belong to the approved list. |
| e24 | higherGeographyID | The submitted value is not valid. Please use only proposed values. | Choose a valid geography identifier from the proposed list. Look at the interactive map interface to confirm the correct location and NUTS code. | higherGeographyID must belong to the approved NUTS list. |
| e25 | coordinatePrecision | The submitted value is not valid. Please use only proposed values. | Choose a precision value from the proposed options (0.01, 0.001, etc.). | coordinatePrecision must belong to the approved list. |
| e26 | samplingProtocol | The submitted value is not valid. Please use only proposed values. | Choose a valid sampling protocol from the proposed list. | samplingProtocol must belong to the approved list. |
| e27 | sampleSizeUnit | The submitted value is not valid. Please use only proposed values. | Choose a valid sample size unit from the proposed list. | sampleSizeUnit must belong to the approved list. |
| e28 | CollectionEffortStartDate | Invalid date value. The parsed date is in the future. Please check that the provided date is correct and in yyyy-mm-dd format. | Ensure the collection start date is not in the future. | CollectionEffortStartDate cannot be a future date. |
| e29 | CollectionEffortEndDate | Invalid date value. The parsed date is in the future. Please check that the provided date is correct and in yyyy-mm-dd format. | Ensure the collection end date is not in the future. | CollectionEffortEndDate cannot be a future date. |
| e30 | CollectionEffortStartDate | Please confirm that the date value is correct and in Excel date format (yyyy-mm-dd). The parsed date is before the year 1920. | Verify that the start date is correct and after 1920. | CollectionEffortStartDate must be after 1920-01-01. |
| e31 | CollectionEffortEndDate | Please confirm that the date value is correct and in Excel date format (yyyy-mm-dd). The parsed date is before the year 1920. | Verify that the end date is correct and after 1920. | CollectionEffortEndDate must be after 1920-01-01. |
| e32 | CollectionEffortStartDate | Invalid date value. The ‘CollectionEffortEndDate’ should be greater than ‘CollectionEffortStartDate’ (and vice-versa). | Adjust dates so that the end date is greater than or equal to the start date. | CollectionEffortEndDate must be >= CollectionEffortStartDate. |
| e33 | scientificName | The submitted vector is not referenced in the GBIF vocabulary or is not from the correct vector group for this spreadsheet. Please check the species. If the reference is correct, please contact your VectorNet contact point to request that the vector be added. | Verify the species name is valid according to the Darwin Core Taxonomy or contact VectorNet to add the vector. | scientificName must match GBIF/VectorNet vocabulary. |
| e34 | associatedTaxa | The submitted host is not referenced in the GBIF vocabulary. Please check the species. If the reference is correct, please email your VectorNet contact point. | Verify the host species or contact VectorNet to add the host. The species should be the scientific name. | associatedTaxa must match GBIF vocabulary. |
| e35 | country | The submitted coordinates (longitude and latitude) are not in the specified country. See the interactive maps for more details. Please verify that the coordinates use the WGS84 reference system. | Ensure that the coordinates fall within the boundary of the specified country and use the WGS84 reference system. Use the interactive map interface to look for problems. | Coordinates must match the specified country. |
| e36 | sex | The submitted value is not valid. Please use only proposed values. | Choose a valid sex value from the proposed list. | sex must match the approved list. |
| e37 | lifeStage | The submitted value is not valid. Please use only proposed values. | Choose a valid life stage from the proposed list. | lifeStage must match the approved list. |
| e38 | occurrenceRemarks | The submitted value is not valid. Please use only proposed values. | Choose a valid occurrence remark from the proposed list. | occurrenceRemarks must match the approved list. |
| e39 | associatedTaxa | A host value was provided for a non-tick vector and a sampling protocol not using a bait. | Remove the host or update the protocol/vector group. | associatedTaxa value is only allowed for ticks or baited protocols. |
| e40 | associatedTaxa | Missing required value: a host should have been provided based on sampling protocol. | Provide an associated host value. | associatedTaxa value is required for this sampling protocol. |
| e41 | decimalLatitude | The value is out of range. Latitude must be between -90 and 90. | Enter a latitude value between -90 and 90. | decimalLatitude must be within [-90, 90]. |
| e42 | decimalLongitude | The value is out of range. Longitude must be between -180 and 180. | Enter a longitude value between -180 and 180. | decimalLongitude must be within [-180, 180]. |
| e43 | sampleSizeUnit | Missing required value | Include a value for sampleSizeUnit. | sampleSizeUnit cannot be empty or blank. |
| e44 | occurrenceRemarks | Missing required value: an occurrence status should be provided | Include an occurrence status in occurrenceRemarks. | occurrenceRemarks cannot be empty or blank. |
| e45 | locationAccordingTo | Missing required value: information about the source of the location information should be provided | Include a source for the location information. | locationAccordingTo cannot be empty or blank. |
| e46 | locationAccordingTo | The submitted value is not valid. Please use only proposed values. | Choose a proposed value like ‘GPS Coordinates’ or ‘Centroid of NUTS3/GAUL’. | locationAccordingTo must use approved values. |
| e47 | bibliographicCitation | Missing required value | Include a value for bibliographicCitation. | bibliographicCitation cannot be empty or blank. |
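
Several of the data errors above are simple range and ordering checks. The sketch below illustrates the coordinate rules (e41, e42) and the date rules (e28 to e32) for a single row, assuming the values have already been parsed into numbers and dates. The function names are hypothetical, and the treatment of the exact boundary date 1920-01-01 is an assumption, since the messages and rules describe it slightly differently.

```python
from datetime import date
from typing import List, Optional

# Taken from rules e30/e31. Assumption: dates strictly before this day fail;
# how the tool treats the boundary day itself is not stated unambiguously.
EARLIEST_DATE = date(1920, 1, 1)


def check_coordinates(lat: float, lon: float) -> List[str]:
    """Return the codes of the range rules (e41, e42) that the row violates."""
    codes = []
    if not -90 <= lat <= 90:
        codes.append("e41")
    if not -180 <= lon <= 180:
        codes.append("e42")
    return codes


def check_dates(start: date, end: date, today: Optional[date] = None) -> List[str]:
    """Return the codes of the date rules (e28 to e32) that the row violates."""
    today = today or date.today()
    codes = []
    if start > today:
        codes.append("e28")  # start date in the future
    if end > today:
        codes.append("e29")  # end date in the future
    if start < EARLIEST_DATE:
        codes.append("e30")  # start date before 1920
    if end < EARLIEST_DATE:
        codes.append("e31")  # end date before 1920
    if end < start:
        codes.append("e32")  # end date must be >= start date
    return codes
```

Passing `today` explicitly keeps the check reproducible, for example `check_dates(date(2024, 5, 3), date(2024, 5, 2), today=date(2024, 6, 1))` reports only `e32`.
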
Warnings
These warnings relate to the content of specific columns.
| Code | Column | Message | Explanation / actions | Rule |
|---|---|---|---|---|
| w01 | decimalLatitude | A numerical sequence of more than 5 elements was identified. Please confirm that this is not an input error due to value dragging. | Confirm that this sequence is correct and not a result of Excel auto-fill dragging. | Check for sequence pattern errors in decimalLatitude. |
| w02 | decimalLongitude | A numerical sequence of more than 5 elements was identified. Please confirm that this is not an input error due to value dragging. | Confirm that this sequence is correct and not a result of Excel auto-fill dragging. | Check for sequence pattern errors in decimalLongitude. |
| w03 | sampleSizeValue | A numerical sequence of more than 5 elements was identified. Please confirm that this is not an input error due to value dragging. | Confirm that this sequence is correct and not a result of Excel auto-fill dragging. | Check for sequence pattern errors in sampleSizeValue. |
| w04 | individualCount | A numerical sequence of more than 5 elements was identified. Please confirm that this is not an input error due to value dragging. | Confirm that this sequence is correct and not a result of Excel auto-fill dragging. | Check for sequence pattern errors in individualCount. |
| w05 | CollectionEffortStartDate | A numerical sequence of more than 5 elements was identified. Please confirm that this is not an input error due to value dragging. | Confirm that this sequence is correct and not a result of Excel auto-fill dragging. | Check for sequence pattern errors in CollectionEffortStartDate. |
| w06 | CollectionEffortEndDate | A numerical sequence of more than 5 elements was identified. Please confirm that this is not an input error due to value dragging. | Confirm that this sequence is correct and not a result of Excel auto-fill dragging. | Check for sequence pattern errors in CollectionEffortEndDate. |
| w07 | scientificName | The submitted vector is not a usual species identified through vector surveillance in the VectorNet area (EU and neighbouring countries). | Double-check that the species name is accurate. If it is correct, request that the species be added to the standard list. | Check that scientificName is part of the standard list of VectorNet vector species. |
| w08 | associatedTaxa | The submitted host is not a usual species identified through vector surveillance in the VectorNet area (EU and neighbouring countries). | Double-check that the host name is accurate and intended for this region. | Check that associatedTaxa is part of the VectorNet standard list. |
| w09 | higherGeographyID | The submitted coordinates (longitude and latitude) are not in the specified NUTS unit. See the interactive map for more details. | Verify that the coordinates align with the designated NUTS region. | Coordinates should match the specified higherGeographyID. |
| w10 | verbatimIdentification | The submitted verbatimIdentification is not in the standard list or the scientificName is not a GBIF-compliant version (e.g. genus name when a species complex or s.l. is used) | Verify the compliance of the scientific name or standard listing of verbatim identification. | Validate taxonomy compliance for verbatimIdentification. |
| w11 | projectID | The projectID does not include a value from the standard list. | Verify that the project identifier is valid according to standard project designations. This field can include other codes, separated by \|. | projectID should be in the standard list. |
| c01 | 1.DATA INPUT | Some columns are not part of the expected columns (required or optional). They are ignored. If you would like it to be imported into the GBIF data, please contact EFSA to ask for a new optional column to be implemented. | Remove unexpected columns or reach out to EFSA to add them as standard optional columns. | Worksheet columns must match the schema definitions for 1.DATA INPUT. |
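
The sequence warnings (w01 to w06) can be illustrated with a short run detector. This page does not define what counts as a "numerical sequence", so the sketch assumes it means consecutive rows whose values change by the same non-zero step, as happens when a cell is dragged in Excel, and it flags runs of more than 5 elements. The function name and the tolerance parameter are hypothetical.

```python
import math


def find_sequences(values, min_length=6, tol=1e-9):
    """Find runs of consecutive values that change by a constant non-zero step.

    Returns (first_index, last_index) pairs, inclusive, for every run with at
    least `min_length` elements (6 by default, i.e. "more than 5").
    Assumption: this constant-step definition is illustrative only.
    """
    runs = []
    i = 0
    last = len(values) - 1
    while i < last:
        step = values[i + 1] - values[i]
        j = i + 1
        # Extend the run while the next difference equals the current step.
        while j < last and math.isclose(values[j + 1] - values[j], step, abs_tol=tol):
            j += 1
        # Repeated identical values (step of zero) are not treated as a sequence.
        if abs(step) > tol and (j - i + 1) >= min_length:
            runs.append((i, j))
        i = j  # the next run may start at the last element of this one
    return runs
```

For date columns, the same function could be applied after converting each date to a number, for example with `date.toordinal()`.
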
Alerts
This alert relates to the distribution of vectors.
| Code | Column | Message | Explanation / actions | Rule |
|---|---|---|---|---|
| a01 | scientificName | The species has been identified in an area in which it has previously been considered absent. Check the species and location. | Confirm the combination of species and location coordinates to ensure accuracy for potential new distribution records. Check the interactive map to see the distribution of vector species. | If the verbatimIdentification or scientificName is present in the VectorNet species distribution dataset, an alert is raised if the species is recorded in a NUTS3/GAUL administrative unit in which the species has been defined as absent. |
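
The a01 rule amounts to a lookup of the species and administrative unit in a distribution dataset. The sketch below uses a small in-memory dictionary as a hypothetical stand-in for the VectorNet species distribution dataset; the unit codes, the statuses and the function name are invented for illustration, and checking verbatimIdentification before scientificName is an assumption.

```python
# Hypothetical stand-in for the VectorNet species distribution dataset:
# (species name, administrative unit code) -> recorded status.
# The unit codes and statuses here are placeholders, not real data.
DISTRIBUTION = {
    ("Aedes albopictus", "UNIT-A"): "present",
    ("Aedes albopictus", "UNIT-B"): "absent",
}


def check_a01(scientific_name, verbatim_identification, admin_unit,
              distribution=DISTRIBUTION):
    """Return an a01 alert if either name is recorded as absent in the unit."""
    for name in (verbatim_identification, scientific_name):
        if name and distribution.get((name, admin_unit)) == "absent":
            return {
                "code": "a01",
                "column": "scientificName",
                "species": name,
                "unit": admin_unit,
            }
    # No alert: the species is not in the dataset for this unit, or it is not
    # defined as absent there.
    return None
```

With this placeholder data, a record of the species in `UNIT-B` raises the alert, while records in `UNIT-A` or in a unit missing from the dataset do not.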