Grooviler: A Visual Analytics Approach to Communicate and Identify Time-Oriented Data Quality Problems

Master Thesis
Abstract

Scientific disciplines such as climate research and high-energy physics build up large repositories of time series data. Before such large data sets can be processed for data mining or exploratory analysis, data integrity and quality have to be ensured. To increase data quality, Data Cleansing is performed to remove unwanted and misleading data anomalies before the analytical process is executed. Data Profiling supports this step by helping to identify and communicate data quality problems, which can then be cleansed. Existing Data Profiling tools provide mechanisms to identify invalid data, but without a specific focus on time-oriented data. These tools are often limited to providing simple statistics about a given data set and do not offer interactive operations. A prototype that detects, partially cleanses, and annotates erroneous data entries is already being developed at the Technical University of Vienna. This thesis covers the design, implementation, and evaluation of a prototypical module for this prototype, which visualizes detected errors and provides visualizations for finding further data problems, with a focus on time-oriented quality checks.

Year of Publication
2017
Secondary Title
Institute of Visual Computing and Human-Centered Technology
Number of Pages
103
Publisher
TU Wien
Place Published
Vienna
DOI
10.34726/hss.2017.32206
Paper
TU Wien Library AC14475173