Literal values for a given property must have datatypes that are mutually compatible. Suppose a dataset contains the
dct:created property. If all corresponding literals have datatype IRI
xsd:date, or datatype IRIs that can be cast to it (e.g.,
xsd:gYear), then creation dates can be uniformly filtered and aggregated over. However, if only some literals have datatype IRI
xsd:date while others have, for example, datatype IRI
xsd:string, values cannot be uniformly filtered or aggregated over.
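The compatibility check described above can be sketched in a few lines of Python. This is only an illustration: the sample literals are made up, the names `datatype_counts`, `is_uniform`, and `DATE_COMPATIBLE` are hypothetical, and the choice of which datatype IRIs count as compatible with xsd:date is an assumption taken from the example in the text.

```python
from collections import Counter

XSD = "http://www.w3.org/2001/XMLSchema#"

# Assumption for this sketch: datatype IRIs treated as compatible with xsd:date.
DATE_COMPATIBLE = {XSD + "date", XSD + "gYear"}


def datatype_counts(literals):
    """Count literals per datatype IRI.

    `literals` is an iterable of (lexical form, datatype IRI) pairs.
    """
    return Counter(datatype for _, datatype in literals)


def is_uniform(literals, compatible=DATE_COMPATIBLE):
    """Return True if every literal has a datatype IRI in `compatible`."""
    return all(datatype in compatible for _, datatype in literals)


# Made-up sample values for a single property such as dct:created.
sample = [
    ("2001-05-17", XSD + "date"),
    ("1987", XSD + "gYear"),
    ("ca. 1900", XSD + "string"),
]

counts = datatype_counts(sample)
uniform = is_uniform(sample)
```

Here `counts` holds one entry per datatype IRI, which is the kind of tally a pie chart of datatypes would display, and `uniform` is false because one literal is typed as xsd:string.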
The following pie charts give an overview of the datatype compatibility for the property
iisg:dateOfManufacturing. They quantify the number of literals for each datatype IRI. The left-hand side shows the results for the 2017 version of the IISG Knowledge Graph, while the right-hand side shows the results for the 2018-09 version.
The following diagram shows the 100 most common string values for the predicate
iisg:dateOfPublication. The diagram shows that cleaning only the 10 most common strings will significantly improve the data quality of this property.
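One way to quantify how much cleaning the most common strings would help is to compute the share of all string literals covered by the k most frequent values. The following is a minimal sketch under stated assumptions: the function name `top_k_coverage` is hypothetical, and the sample strings are invented placeholders, not values from the IISG Knowledge Graph.

```python
from collections import Counter


def top_k_coverage(values, k=10):
    """Return the fraction of `values` covered by the k most frequent distinct values."""
    values = list(values)
    if not values:
        return 0.0
    counts = Counter(values)
    covered = sum(n for _, n in counts.most_common(k))
    return covered / len(values)


# Invented placeholder strings standing in for xsd:string date values.
sample = ["unknown"] * 6 + ["n.d."] * 3 + ["ca. 1900"]

coverage = top_k_coverage(sample, k=2)
```

A coverage close to 1 for a small k indicates that a handful of cleaning rules would repair most of the string-typed values.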
The following pie charts give an overview of the datatype compatibility for the property
iisg:dateOfManufacturing. Notice that many values have datatype IRI
xsd:string, i.e., they cannot be interpreted as dates.
The following table shows the first 100 types of misclassified data values for the predicate
iisg:dateOfManufacturing. Changing the first 10 elements will lower the number of misclassifications significantly.