Data cleaning: the solution for data quality control

Data quality control is one of the most difficult challenges for today’s large and medium-sized enterprises. Just managing the names of suppliers, or of customers who have contacted customer care, is overwhelming, not to mention the contact lists that have emerged in recent years, such as newsletter subscribers and followers of the company blog. These are different groups, yet they are united by the same characteristic: their importance to the company. The supply of spare parts, materials and products depends on suppliers. Likewise, the brand’s reputation, and therefore the company’s image in the public eye, depends on customer service. The same goes for the many people who follow the company and are registered on the official website: these people represent a company’s most important customer base. This set of contacts must be organised with great care, so as to obtain top quality data.

There are many risks for a company that does not respect certain data quality standards: slow decision making, resources wasted on activities that could be carried out with fewer staff, and errors in sending communications (replies to customers, delivery of promotional brochures, etc.). In short: an increase in unnecessary costs. We talked about this recently in our article “5 Problems Avoided Thanks to Address Validation”, which summarises some of the main obstacles that can be avoided with correct administration of postal address data. Address validation, however, is only one of the available solutions for handling data properly. If we look at the issue from a broader perspective, we find that many of the operations useful for ensuring optimal data quality fall within the more general process of data cleaning. Let’s take a more detailed look at what this is and what the advantages are for a company.

In the past, data cleaning was simply considered to be the “cleansing” of data in databases, data warehouses, datasets, and so on. With the evolution of technology and the development of increasingly powerful computer programs, however, this definition has become obsolete: today, data cleaning involves a set of techniques aimed at verifying, correcting, updating, deduplicating and, where necessary, enriching data archived or stored in a single database or in several different databases. Let us assume, for example, that you have a database in which, over the years, the data were entered by hand by several different people taking calls at a telephone exchange. The callers are customers who asked for assistance with a certain product or service purchased in a store or online. In order to receive the necessary assistance, these customers gave their full name and address. Manual data entry, as we well know, is subject to typing errors, omissions and repetitions, so the quality of these data is likely to be poor.
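To make the example more concrete, here is a minimal sketch in Python of two of the techniques listed above: verifying that records are complete and deduplicating them. The record layout, the sample names and the helper functions `normalize`, `deduplicate` and `incomplete` are hypothetical and written only for illustration; they are not part of any Address4 product. The sketch matches records exactly after a simple normalisation, which is far less than what dedicated data cleaning software does.

```python
import re
import unicodedata


def normalize(text: str) -> str:
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


def deduplicate(records):
    """Keep the first record for each normalised (name, address) pair."""
    seen = set()
    unique = []
    for record in records:
        key = (normalize(record["name"]), normalize(record["address"]))
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


def incomplete(records):
    """Return the records with an empty name or address (an omission)."""
    return [r for r in records
            if not r["name"].strip() or not r["address"].strip()]


# Hypothetical hand-entered records: a repetition and an omission.
records = [
    {"name": "Mario Rossi", "address": "Via Roma 1, Milano"},
    {"name": "mario  rossi", "address": "via Roma, 1 - Milano"},
    {"name": "Lucia Bianchi", "address": ""},
]

unique_records = deduplicate(records)      # the two "Mario Rossi" rows collapse into one
records_to_review = incomplete(records)    # the row with the missing address
```

Matching on normalised text only catches differences in case, spacing, accents and punctuation. Typing errors such as a misspelt street name would need approximate matching or validation against a reference address source, which this sketch does not attempt.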

What, then, can we do with this valuable resource consisting of thousands of names? Let us assume that we want to inform these customers of the opportunity to activate an annual assistance programme for the product or service in question (a boiler, a personal computer, a washing machine and so on) for €99 per year. If we blindly use the above database, the results will most likely be well below expectations. Yet this database is the only place where the names of customers who have contacted customer care in the past are listed: over 4,000 names scattered throughout Italy which, if every customer signed up, could generate an income of around €396,000 per year (4,000 × €99). Is it worth simply sending out a notification by mail? No, not if you haven’t first recovered the data with dedicated data cleaning programs. By verifying, correcting, updating, deduplicating and enriching the data, we can obtain a safe and reliable database and, more importantly, we can be sure that the postal addresses and names are correctly formatted for use in analysis programs, for shipping, or for other actions that are useful to our business.


Given the benefits of data cleaning, a key question remains: what tools can we use to recover the data in our possession effectively? Our answer at Address4 is advanced data cleaning software, able to carry out sophisticated operations in Address Validation, Address Geocoding, Name Validation, Email Validation and Phone Validation. It is a complete suite, designed for easy use by developers, data entry personnel, data quality experts, data warehouse managers, and anyone who has to manage large volumes of names, addresses and data. Its many applications in web forms, CRM, e-commerce and much more provide the flexibility to work in any context and for any need. If you also want to improve the data quality of your databases, you just have to try the free Address4 demo.

Register for free and try out our data cleaning software now