Data Quality Overview of the IISG Knowledge Graph

Improving the data quality of the IISG Knowledge Graph makes it easier to process and analyze the data. High-quality data must be consistent and unambiguous. Because of differences in schema and format, data can become inconsistent and lose quality. To improve data quality, the data needs to undergo data cleansing, but cleaning the data effectively requires insight into the data.

This Data Story shows the improvements that have already been made to the IISG Knowledge Graph by comparing the quality of the current live version with a previous version from 2017. The pages below show visualizations for various data quality aspects. Results are generated on the fly and cover the whole Knowledge Graph.

Quality Comparison: Person entities
Provides an overview of some of the quality improvements that have been made to person entities that appear in the IISG Knowledge Graph.
Quality Indicator 1: Dates & Times
Overview of data quality issues with dates & times.
Quality Indicator 2: Persons & locations with multiple labels
Overview of person and location authorities that have alternative spellings.
Quality Indicator 3: Persons & locations with multiple identifiers
Overview of persons and locations with the same name, but different identifiers.
Quality Indicator 4: Encoding issues
Overview of encoding issues in various strings.
Quality Indicator 5: Size of collection items
Overview of numeric errors with respect to the size (width & height) of collection items.
Quality Indicator 6: Languages
Overview of language authorities.
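
To give an idea of what such an indicator measures, here is a minimal sketch of the check behind Quality Indicator 3 (the same name with different identifiers). It is an illustration only and not the query used by this Data Story: the sample records, the identifier format, and the function name `labels_with_multiple_identifiers` are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical sample records as (identifier, label) pairs.
# These are illustrative values, not data taken from the Knowledge Graph.
records = [
    ("person/1", "Marx, Karl"),
    ("person/2", "Marx, Karl"),
    ("person/3", "Luxemburg, Rosa"),
]


def labels_with_multiple_identifiers(pairs):
    """Return each normalized label that is attached to more than one identifier."""
    ids_by_label = defaultdict(set)
    for identifier, label in pairs:
        # Normalize whitespace and case so trivial variants are grouped together.
        ids_by_label[label.strip().casefold()].add(identifier)
    return {
        label: sorted(ids)
        for label, ids in ids_by_label.items()
        if len(ids) > 1
    }


duplicates = labels_with_multiple_identifiers(records)
print(duplicates)
```

Grouping on a normalized label keeps the check simple. A stricter or looser normalization (for example, ignoring punctuation) would change which entries are reported, so the result should be read as a list of candidates for review rather than confirmed duplicates.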