Data Quality

Poor data quality makes the results of any kind of data processing unreliable and has a profound economic impact. Although tools exist to help users with data cleansing, support for the specifics of time-oriented data is rather poor. Yet the time dimension has characteristics that introduce quality problems different from those found in other kinds of data. We therefore tackle this topic with Visual Analytics methods.

Data quality control can be divided into three tasks:

  1. Data Profiling: identifying and communicating quality problems (e.g., with respect to specific Data Quality Metrics)
  2. Data Wrangling: transforming table formats or merging different sources
  3. Data Cleansing: correcting the quality problems that were found

Data

Since data quality is a problem in any domain, we consider multidimensional, time-oriented data in general.

Tasks

Data Profiling: identifying and communicating quality problems within the data

Data Wrangling: transforming data into another format that is suitable for further processing -- this may include merging and splitting of data entries, changing the formatting of data entries, merging two or more data tables, or augmenting the data with information from different sources

Data Cleansing: handling and correcting the identified quality problems
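
The three tasks above can be illustrated with a minimal Python sketch on a small time series. This is not the tooling described on this page; the sample records, the function names (`wrangle`, `profile`, `cleanse`), and the chosen checks (missing values, duplicate timestamps, out-of-order entries, gaps larger than an expected step) are assumptions made only for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical raw records: (timestamp string, measured value or None)
RAW = [
    ("2018-01-01 00:00", 1.0),
    ("2018-01-01 01:00", None),   # missing value
    ("2018-01-01 01:00", 2.0),    # duplicate timestamp
    ("2018-01-01 04:00", 4.0),    # arrives before the 03:00 entry
    ("2018-01-01 03:00", 3.0),    # out of order; 02:00 is absent (gap)
]


def wrangle(rows):
    """Wrangling: parse timestamp strings into datetime objects."""
    return [(datetime.strptime(ts, "%Y-%m-%d %H:%M"), v) for ts, v in rows]


def profile(rows, expected_step):
    """Profiling: count quality problems without changing the data."""
    times = [t for t, _ in rows]
    ordered = sorted(set(times))
    return {
        "missing_values": sum(1 for _, v in rows if v is None),
        "duplicate_timestamps": len(times) - len(set(times)),
        "out_of_order": sum(1 for a, b in zip(times, times[1:]) if b < a),
        "gaps": sum(
            1 for a, b in zip(ordered, ordered[1:]) if b - a > expected_step
        ),
    }


def cleanse(rows):
    """Cleansing: drop missing values, keep the first value per timestamp,
    and return the entries sorted by time."""
    seen = {}
    for t, v in rows:
        if v is not None and t not in seen:
            seen[t] = v
    return sorted(seen.items())


rows = wrangle(RAW)
report = profile(rows, expected_step=timedelta(hours=1))
clean = cleanse(rows)
```

In this sketch profiling only reports problems, while cleansing decides how to resolve them. Keeping the first value per timestamp is one possible policy among several; gaps are reported but deliberately left unfilled.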

Users

Data analysts of any domain

Scale: ordinal, discrete, continuous
Scope: point-based, interval-based
Arrangement: linear, cyclic
Granularity & Calendars: single, multiple
Time Primitives: instant, interval, span
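
As a rough illustration of the three time primitives, the following Python sketch models an instant as a single point in time, an interval as the portion of time between two instants, and a span as a duration that is not anchored to a position in time. The class names and the `is_valid` check are hypothetical and only show how a primitive-specific quality problem (an interval that ends before it starts) could be detected.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Instant:
    """A single point in time."""
    at: datetime


@dataclass(frozen=True)
class Span:
    """A duration with no fixed position in time."""
    length: timedelta


@dataclass(frozen=True)
class Interval:
    """The portion of time between two instants."""
    start: Instant
    end: Instant

    def is_valid(self) -> bool:
        # An interval that ends before it starts is a quality problem.
        return self.start.at <= self.end.at

    def span(self) -> Span:
        # The unanchored duration covered by this interval.
        return Span(self.end.at - self.start.at)

    def contains(self, instant: Instant) -> bool:
        return self.start.at <= instant.at <= self.end.at


morning = Interval(
    Instant(datetime(2018, 1, 1, 8, 0)),
    Instant(datetime(2018, 1, 1, 12, 0)),
)
reversed_interval = Interval(morning.end, morning.start)
```

Here `morning.span()` yields a four-hour `Span`, while `reversed_interval.is_valid()` is false and would be flagged during profiling.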

Publications

Christian Bors, Theresia Gschwandtner, Simone Kriglstein, Silvia Miksch, Margit Pohl, "Visual Interactive Creation, Customization, and Analysis of Data Quality Metrics", Journal of Data and Information Quality (JDIQ), vol. 10, pp. 3:1–3:26, 2018.
Christian Bors, Theresia Gschwandtner, Silvia Miksch, "Visually Exploring Data Provenance and Quality of Open Data", EuroVis 2018 - Posters, pp. 9–11, 2018.
Christian Bors, Markus Bögl, Theresia Gschwandtner, Silvia Miksch, "Visual Support for Rastering of Unequally Spaced Time Series", 10th International Symposium on Visual Information Communication and Interaction (VINCI), pp. 53–57, 2017.
Christian Bors, Theresia Gschwandtner, Silvia Miksch, "QualityFlow: Provenance Generation from Data Quality", Proceedings of the Eurographics Conference on Visualization (EuroVis) - Posters 2015, p. 3, 2015.
Simone Kriglstein, Margit Pohl, Nikolaus Suchy, Johannes Gärtner, Theresia Gschwandtner, Silvia Miksch, "Experiences and Challenges with Evaluation Methods in Practice: A Case Study", Fifth Workshop on Beyond Time and Errors: Novel Evaluation Methods for Visualization (BELIV '14), pp. 118–125, 2014.
Christian Bors, Theresia Gschwandtner, Silvia Miksch, Johannes Gärtner, "QualityTrails: Data Quality Provenance as a Basis for Sensemaking", Proceedings of the IEEE VIS Workshop on Provenance for Sensemaking, pp. 1–2, 2014.