Retrieving data from databases and loading it into MS Excel is a common task nowadays. The retrieved data usually forms part of the information used in decision processes.
The data stored in databases is collected either automatically or manually. Manual collection can be either dictating or free. By ‘dictating’ I mean that no fields may be left empty upon registration, while by ‘free’ I mean the opposite, i.e. fields can be left empty.
One of the more critical figures for all kinds of companies is the volume of quotations, which is a key performance indicator (KPI). This type of data is usually entered manually into the business system.
Measuring the quality of the data means trying to answer the question of to what degree the data is accurate and reliable. Different approaches are possible, including rather sophisticated models, but here I prefer to keep it simple. The reason is that the result must be easy to communicate within the company if it is to help improve the quality. In my experience it therefore pays to keep the measurement straightforward.
A dictating system may seem attractive, as it makes sure that users enter all required data by filling in all fields. But with regard to data quality, it leaves us in a situation where the quality is very difficult to actually measure, because every field is filled in whether or not the user had a meaningful value to enter. A free system can be a better approach: it allows fields to be left empty, which makes it possible, at least, to measure which fields are filled in and which are not.
The following is output from a free system, where the data has been retrieved in order to present the volume of quotations for a given time period:
As the table above shows, some fields are left empty. To measure the data quality we need to decide which fields are important as indicators. In this example the fields Expiring Month (EM), Sales Type (ST) and Sales Value (SV) have been selected.
Here we only count the records that have one or more of these fields empty and the records that have none of them empty. The following table shows the output for this case:
In some companies presentations are based on charts rather than tables, so here is the same output presented as a chart:
The output of this simple quality measurement indicates how accurate and reliable the information about the volume of quotations actually is.
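For readers who would rather script the count than build it in a worksheet, here is a minimal sketch in Python of the measurement described above. It is only an illustration under my own assumptions: the sample records are made up, the dictionary keys EM, ST and SV simply mirror the field abbreviations used above, and the helper names is_empty and completeness_summary are hypothetical.

```python
# Hypothetical quotation records (made-up sample data, not from a real system).
# None or a blank string marks a field the user left empty.
quotations = [
    {"id": 1, "EM": "March", "ST": "Project", "SV": 12000},
    {"id": 2, "EM": None, "ST": "Service", "SV": 8000},
    {"id": 3, "EM": "April", "ST": "", "SV": None},
    {"id": 4, "EM": "April", "ST": "Project", "SV": 0},
]

# The fields selected as indicators of data quality.
KEY_FIELDS = ("EM", "ST", "SV")


def is_empty(value):
    """Treat None and blank strings as empty; a numeric 0 counts as filled in."""
    return value is None or (isinstance(value, str) and value.strip() == "")


def completeness_summary(rows, fields=KEY_FIELDS):
    """Count records with no key field empty versus one or more key fields empty."""
    total = len(rows)
    incomplete = sum(
        1 for row in rows if any(is_empty(row.get(field)) for field in fields)
    )
    complete = total - incomplete
    share_complete = complete / total if total else 0.0
    return {
        "total": total,
        "complete": complete,
        "incomplete": incomplete,
        "share_complete": share_complete,
    }


summary = completeness_summary(quotations)
print(summary)
```

Whether a value of zero should count as filled in is a design choice. In this sketch it does, which may or may not suit a given business system.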
In my experience, when some indicators of data quality are available, decision makers get a better understanding of the presented information. The indicators also provide input for improving the quality of the data.