Data cleansing / data wrangling

Problem

Despite advances in technologies for working with data, analysts still spend an inordinate amount of time diagnosing data quality issues and manipulating data into a usable form. This process of ‘data wrangling’ is often the most tedious and time-consuming part of analysis. Because data cleaning and integration are longstanding issues in the database community, a number of software tools already address this problem. In recent years, the Information Visualization community has also presented ideas on how interactive visualization can facilitate data wrangling.

Aim

A comprehensive description of existing approaches to dealing with data quality problems (identifying problems, visualizing problems, cleansing the data set, and preparing the data set for further processing steps).
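
To make two of these steps concrete, the following is a minimal sketch of problem identification and cleansing, written in plain Python. It is only an illustration under stated assumptions, not one of the existing approaches to be surveyed: the sample records, the field names (id, city, temp_c), the set of missing-value markers, and the helper functions are all hypothetical.

```python
from collections import Counter

# Hypothetical raw records; field names and values are invented for illustration.
RAW_ROWS = [
    {"id": "1", "city": " Vienna ", "temp_c": "21.5"},
    {"id": "2", "city": "vienna", "temp_c": ""},
    {"id": "2", "city": "vienna", "temp_c": ""},
    {"id": "3", "city": "Graz", "temp_c": "n/a"},
]

# Assumed spellings of "no value"; real data sets will use others.
MISSING_MARKERS = {"", "n/a", "na", "null"}


def is_missing(value):
    """Return True if a raw string value stands for a missing entry."""
    return value.strip().lower() in MISSING_MARKERS


def identify_problems(rows):
    """Count missing values per field and exact duplicate rows."""
    missing = Counter()
    seen = set()
    duplicates = 0
    for row in rows:
        key = tuple(sorted(row.items()))
        if key in seen:
            duplicates += 1
        seen.add(key)
        for field, value in row.items():
            if is_missing(value):
                missing[field] += 1
    return {"missing": dict(missing), "duplicates": duplicates}


def cleanse(rows):
    """Drop exact duplicates, normalise text, and parse numeric fields."""
    cleaned = []
    seen = set()
    for row in rows:
        key = tuple(sorted(row.items()))
        if key in seen:
            continue  # skip rows identical to one already kept
        seen.add(key)
        temp = row["temp_c"]
        cleaned.append({
            "id": int(row["id"]),
            "city": row["city"].strip().title(),
            "temp_c": None if is_missing(temp) else float(temp),
        })
    return cleaned


if __name__ == "__main__":
    print(identify_problems(RAW_ROWS))
    for record in cleanse(RAW_ROWS):
        print(record)
```

The report produced by identify_problems is the kind of summary a visualization could present to the analyst, and the output of cleanse is a data set prepared for further processing steps.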

Other information

Starting point(s) for research (contact person listed below for details):

Research Directions in Data Wrangling: Visualizations and Transformations for Usable and Credible Data

Contact

Further information

Area
Information Visualization (IV)
Not specified
Scope
SE
Status
in progress