What does Data Cleansing & Enhancement mean and how can I justify it?
A how-to-do-it guide and a cost-justification methodology
Celsius International, 2012
Every B2B marketer knows that marketing databases need to be clean and kept that way. But it's not always easy to explain to budget-holding senior management exactly what data cleansing means and what benefits it delivers. This White Paper lists the major elements of data cleansing and enhancement, together with why they are necessary, and then provides a framework for return on investment (RoI) cost-benefit analysis.
DATA CLEANSING
Let's start with the basics. You are using a marketing database but it doesn't deliver what you want. Or, very commonly, you have several different files with different layouts and content which need putting together to get real value from them. Or perhaps you are dealing with a situation where two companies have merged and you have to combine their marketing databases.
The fundamental building bricks of data cleansing will certainly cover at least the following:
CONSOLIDATION
What it means
If you have more than one file or database, you need to combine records from all of them into one database, probably consisting of more than one table: generally, at a minimum, tables of organisations/addresses and of contact names.
The art of consolidation is not to lose anything. Make sure that you have captured all the fields from each file into your combined database. If you don't, you will probably suddenly discover that information you really needed was 'hidden' in some unconsidered and discarded field.
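As a minimal sketch of "not losing anything", the Python below combines records from files with different layouts by taking the union of every field seen in any source. The function name `consolidate`, the `_source` column and the sample records are illustrative assumptions, not part of any particular product.

```python
def consolidate(sources):
    """Combine records from several sources, keeping every field seen in any of them.

    `sources` maps a (hypothetical) source name to a list of dict records.
    """
    # Collect the union of all field names, preserving first-seen order.
    all_fields = []
    for records in sources.values():
        for record in records:
            for field in record:
                if field not in all_fields:
                    all_fields.append(field)

    combined = []
    for source_name, records in sources.items():
        for record in records:
            # Fields a source lacks are filled with None rather than dropped.
            row = {field: record.get(field) for field in all_fields}
            row["_source"] = source_name  # remember where each record came from
            combined.append(row)
    return all_fields, combined


# Illustrative data only: two files with different layouts.
crm = [{"company": "Acme Ltd", "postcode": "AB1 2CD"}]
events = [{"company": "Acme Limited", "contact": "J Smith", "notes": "Met at trade show"}]
fields, rows = consolidate({"crm": crm, "events": events})
```

Recording the source of each row at this stage also makes it easier to apply source-based rules later, when duplicates are resolved.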
Why do it?
Working with a single database is a necessity if you want to maintain consistent quality, avoid duplicates and have a single source of data for analysis.
MATCHING AND DE-DUPLICATION
What it means
Whether you are dealing with a single database or with one which has been consolidated from several sources, the next step is to find and deal with duplicates, at both address and contact-name level. By far the most efficient way to do this is to use commercially available software: why try to do it yourself when there is a huge body of expertise easily at hand? But to get the best from such software, our advice is to format your address and contact data first, for example by extracting the postcodes and putting them into a single field and by placing contact first names, initials and last names into three separate columns.
This doesn't mean that good software won't find duplicates without this preliminary work, but it will work much better if you prepare the data first.
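The preparation described above might be sketched as follows. This is an illustration under stated assumptions: the postcode pattern is a simplified UK-style expression (real validation needs postal reference data), and the helper names `extract_postcode` and `split_contact_name` are hypothetical.

```python
import re

# Simplified UK-style postcode pattern: an assumption for illustration only.
POSTCODE_RE = re.compile(r"\b([A-Z]{1,2}\d[A-Z\d]?)\s*(\d[A-Z]{2})\b", re.IGNORECASE)


def extract_postcode(address):
    """Return (address_without_postcode, postcode), or (address, None) if none is found."""
    match = POSTCODE_RE.search(address)
    if not match:
        return address, None
    postcode = f"{match.group(1).upper()} {match.group(2).upper()}"
    remainder = (address[:match.start()] + address[match.end():]).strip(" ,")
    return remainder, postcode


def split_contact_name(name):
    """Split a free-text name into (first_name, initials, last_name)."""
    tokens = [t.strip(".") for t in name.split()]
    tokens = [t for t in tokens if t]
    if not tokens:
        return "", "", ""
    if len(tokens) == 1:
        return "", "", tokens[0]
    last = tokens[-1]
    # A single-letter leading token is treated as an initial, not a first name.
    first = tokens[0] if len(tokens[0]) > 1 else ""
    middle = tokens[1:-1] if first else tokens[:-1]
    initials = "".join(t[0].upper() for t in middle)
    return first, initials, last


address, postcode = extract_postcode("10 High Street, London sw1a1aa")
first, initials, last = split_contact_name("John A. Smith")
```

Simple rules like these will not handle titles, suffixes or multi-word surnames, which is one reason to leave the matching itself to specialist software.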
Then establish 'survivorship' rules. If two or more duplicated records have conflicting data in the same field, you will need rules which define which data source is to be regarded as correct for which data fields.
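One possible way to express survivorship rules is as a per-field ranking of data sources, sketched below. The source names (`crm`, `web_form`), the field names and the `merge_duplicates` function are assumptions chosen for illustration; commercial de-duplication tools have their own ways of configuring this.

```python
def merge_duplicates(records, field_priority, default_priority):
    """Merge duplicate records into one, field by field.

    records:          list of dicts, each carrying a 'source' key
    field_priority:   maps a field name to an ordered list of trusted sources
    default_priority: source order used for fields without a specific rule
    """
    merged = {}
    fields = {f for r in records for f in r if f != "source"}
    for field in fields:
        order = field_priority.get(field, default_priority)

        def rank(record):
            # Sources missing from the list rank last.
            source = record["source"]
            return order.index(source) if source in order else len(order)

        # Take the first non-empty value from the most trusted source.
        for record in sorted(records, key=rank):
            value = record.get(field)
            if value:
                merged[field] = value
                break
    return merged


# Illustrative duplicates: the same contact held in two sources.
duplicates = [
    {"source": "crm", "phone": "01234 567890", "email": "old@example.com"},
    {"source": "web_form", "phone": "", "email": "new@example.com"},
]
survivor = merge_duplicates(
    duplicates,
    field_priority={"email": ["web_form", "crm"]},  # trust the web form for e-mail
    default_priority=["crm", "web_form"],           # trust the CRM for everything else
)
```

Here the e-mail address survives from the web form and the phone number from the CRM, because each field follows its own rule.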
Why do it?
For two principal reasons:
- You will save money directly by not mailing or calling the same company or contact twice.
- You will avoid annoying the customer or prospect who will otherwise question your competence as an actual or potential supplier.
DATA AND ADDRESS CLEANSING AND CODING
What it means
Data cleansing means ensuring that address and other data are valid, standardised and fit for purpose, namely ensuring delivery to the customer. Typical activity includes:
- Standardising addresses to national postal standards (especially town and postcode).
- Standardising phone numbers and identifying those which are invalid.
- Removing invalid strings and characters, such as 'Mickey Mouse' or '¥'.
- Validating e-mail addresses (do they contain '@' and '.'?).
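Some of the checks listed above can be sketched in a few lines of Python. The list of placeholder names, the 7 to 15 digit range for phone numbers and the set of permitted characters are all assumptions to adjust for your own data; address standardisation against national postal standards needs reference data and is not attempted here.

```python
import re

# Hypothetical list of junk entries; extend it from what you find in your own data.
PLACEHOLDER_NAMES = {"mickey mouse", "test", "n/a"}


def standardise_phone(raw):
    """Keep digits (and a leading '+'); return None if the length is implausible."""
    cleaned = raw.strip()
    prefix = "+" if cleaned.startswith("+") else ""
    digits = re.sub(r"\D", "", cleaned)
    if not 7 <= len(digits) <= 15:  # assumed plausible range
        return None
    return prefix + digits


def strip_invalid_characters(value):
    """Remove characters outside letters, digits, spaces and common punctuation."""
    cleaned = re.sub(r"[^\w\s.,'&/()\-]", "", value)
    return " ".join(cleaned.split())  # collapse any doubled spaces left behind


def is_placeholder(name):
    """True if the value is a known junk entry such as 'Mickey Mouse'."""
    return name.strip().lower() in PLACEHOLDER_NAMES


def looks_like_email(value):
    """Basic shape check only: one '@', a local part, and a dotted domain."""
    value = value.strip()
    if " " in value or value.count("@") != 1:
        return False
    local, domain = value.split("@")
    labels = domain.split(".")
    return bool(local) and len(labels) >= 2 and all(labels)
```

A shape check like `looks_like_email` only catches obviously malformed addresses; it cannot tell you whether a mailbox actually exists.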
You may also want to consider using low-cost resources to add missing addresses and telephone numbers through web research.
Coding typically means rendering the data usable for targeting and analysis. Examples include looking up first names to derive missing genders so that salutations are correct; coding job titles into standard responsibilities so that you can select, say, financial directors; and placing number of employees into coded bands of your choice so that you can target, for instance, all smaller companies.
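The three kinds of coding mentioned above could look something like this in practice. The lookup tables, keywords, band boundaries and function names are all hypothetical and deliberately tiny; a real exercise would use much larger reference lists and bands chosen to suit your market.

```python
import bisect

# Hypothetical lookup tables, far smaller than real ones would be.
GENDER_BY_FIRST_NAME = {"john": "M", "mary": "F"}
JOB_KEYWORDS = [
    ("financ", "Finance"),
    ("cfo", "Finance"),
    ("marketing", "Marketing"),
]
# Bands of your choice: each bound is the lowest employee count of the next band.
BAND_BOUNDS = [10, 50, 250]
BAND_LABELS = ["A (1-9)", "B (10-49)", "C (50-249)", "D (250+)"]


def derive_gender(first_name):
    """Look up a gender code from the first name; None when unknown."""
    return GENDER_BY_FIRST_NAME.get(first_name.strip().lower())


def code_job_title(title):
    """Map a free-text job title to a standard responsibility; first keyword match wins."""
    lowered = title.lower()
    for keyword, responsibility in JOB_KEYWORDS:
        if keyword in lowered:
            return responsibility
    return "Other"


def employee_band(employees):
    """Place a number of employees into one of the coded bands."""
    return BAND_LABELS[bisect.bisect_right(BAND_BOUNDS, employees)]


# Selecting "all smaller companies" then becomes a simple filter on the band code.
companies = [{"name": "Acme Ltd", "employees": 35}, {"name": "Bigco plc", "employees": 4000}]
smaller = [c["name"] for c in companies if employee_band(c["employees"]) in BAND_LABELS[:2]]
```

Leaving unknown first names as `None`, rather than guessing, lets you fall back to a neutral salutation for those contacts.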
Why do it?
Principally to:
- Maximise delivery of postal mail and e-mail, and minimise telemarketing costs.
- Allow the identification of productive segments in your data, which can be used to drive smaller and cheaper marketing campaigns without reduction in effectiveness.