The Science of Proper Data Hygiene

You don’t have to be a rocket scientist to understand that having a wrong phone number for a person can keep you from reaching that person.

Now look at what happens in the business world: every year thousands of people change jobs, get new phone numbers, get married or divorced, or move to a new city, and their lives take all kinds of different directions.

30% of people change jobs every year, which means that you can reach out to a prospect only to find out that they no longer work for the company you want to do business with. Or perhaps they changed their last name after a divorce. 43% of people change their phone numbers in a year, and 25% to 33% of email addresses become outdated within 12 months.

All this has a significant impact on your delivery and response rates, not to mention how much time you will waste trying to trace a prospect.

As you can see, B2B data deteriorates at a very fast pace, and if you fail to keep abreast of these changes, you’ll find yourself among the 62% of organizations that are infected with inaccurate customer data.

What is Data Hygiene Exactly?

Data hygiene refers to the collective processes conducted to ensure the cleanliness of data.

Data is considered clean if it is relatively error-free. Dirty data is a term used to describe inaccurate, incomplete, and inconsistent data.

What Are Some of the Most Common Errors to Watch Out For?

Dirty data can be caused by any number of factors. Here are some of the most common errors:

  • Missing data/missing fields
    (the data record might not have the right data elements; there may be important information that isn’t being tracked)
  • Inconsistent mapping
  • Data that is not validated at the time of entry
  • Separate overlapping systems with duplicate or conflicting data
  • Generalized, badly defined values
  • Confusing naming schema
  • Out of date, invalid, inaccurate data
  • Sloppy formatting/invalid syntax that could break the new system

The Trillion-Dollar Cost of Bad Data


According to Forbes, poor data quality costs the U.S. economy approximately $3.1 trillion annually. But where does that expense come from?

Let’s look at the 1-10-100 quality rule, as formulated by George Labovitz and Yu Sang Chang. It states that it costs $1 to verify a data record as it’s entered, at the point of data capture (the prevention cost); $10 to cleanse and deduplicate the record later (the correction cost); and $100 to keep working with a record that is never cleansed (the failure cost).

While this is expressed in US dollars in the above description, it can be understood to mean any number of ‘units’, measured in financial terms. We can also use 1-10-100 to measure ‘cost’ in terms of resources or time.
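
To make the arithmetic concrete, here is a minimal Python sketch of the 1-10-100 rule. Only the unit costs of 1, 10 and 100 come from the rule itself; the function name data_quality_cost and the batch of 1,000 records are hypothetical and chosen purely for illustration.

```python
# Unit costs per record from the 1-10-100 rule (any currency or unit).
PREVENTION_COST = 1    # verify the record at the point of entry
CORRECTION_COST = 10   # cleanse and deduplicate it later
FAILURE_COST = 100     # keep working with a record that is never cleansed


def data_quality_cost(verified: int, corrected: int, never_cleansed: int) -> int:
    """Total cost in units for records, split by how each one was handled."""
    return (
        verified * PREVENTION_COST
        + corrected * CORRECTION_COST
        + never_cleansed * FAILURE_COST
    )


# Hypothetical example: the same 1,000 records handled three different ways.
records = 1_000
print(data_quality_cost(records, 0, 0))  # all verified at entry: 1000
print(data_quality_cost(0, records, 0))  # all cleansed after the fact: 10000
print(data_quality_cost(0, 0, records))  # none ever cleansed: 100000
```

Under these assumptions, the same batch of records costs ten times more to fix later than to verify up front, and a hundred times more if it is never fixed at all.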

What this means in practical terms is that if you don’t invest in quality data hygiene, you will end up losing a lot of money in the long run.

That’s not to say that all companies are being short-sighted about data hygiene. As described above, data can go bad rapidly. B2B data decays at a rate of 70% per year, and the average company loses 12% of its revenue as a result. When you look at these percentages, the $3.1 trillion lost in the US economy in a year comes into clearer focus.

What is the Best Approach for a Business or a Nonprofit to Stay Out in Front of Deteriorating Data?

Select a robust ERP system with functions that make it easy to require that certain fields be filled in. Build in validation functionality and logic checks. Establish standardization rules and add constraints. As much as possible, make the system validate the data at the point of entry.
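
As a rough illustration of validation at the point of entry, the Python sketch below checks a single contact record before it is saved. It is not taken from any particular ERP product: the field names, the validate_record function, the loose email pattern and the 7 to 15 digit phone constraint are all assumptions made for the example.

```python
from __future__ import annotations

import re

REQUIRED_FIELDS = ("name", "email", "phone")  # hypothetical mandatory fields
# Deliberately loose syntax check: something@something.something
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate_record(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record may be saved."""
    errors = []
    # Mandatory fields: reject missing or blank values.
    for field in REQUIRED_FIELDS:
        if not str(record.get(field) or "").strip():
            errors.append(f"missing required field: {field}")
    # Syntax check: only runs when an email was actually supplied.
    email = str(record.get("email") or "").strip()
    if email and not EMAIL_PATTERN.match(email):
        errors.append("email is not in a valid format")
    # Constraint (assumed): 7 to 15 digits once punctuation is stripped.
    phone = str(record.get("phone") or "")
    digits = re.sub(r"\D", "", phone)
    if phone.strip() and not 7 <= len(digits) <= 15:
        errors.append("phone must contain 7 to 15 digits")
    return errors
```

A data entry form could call validate_record on submit and refuse to save until the returned list is empty, which keeps the cost of each record at the prevention stage rather than the correction stage.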

There are ERP tools and integrations for keeping records updated in real time. There are also ERP tools that automate data cleansing.

If you’re infected with bad data, it will take a dedicated project to parse, standardize, cleanse and validate the data. We here at ERP Advisors Group can provide guidance in designing such a project.
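
To give a feel for what the standardizing and cleansing steps involve, here is a small Python sketch. It is a simplified illustration, not a description of any specific tool: the record layout, the standardize and cleanse function names, and the choice to treat a matching email address as a duplicate are all assumptions.

```python
from __future__ import annotations

import re


def standardize(record: dict) -> dict:
    """Apply simple standardization rules to one contact record."""
    return {
        "name": " ".join(record.get("name", "").split()),  # collapse extra spaces
        "email": record.get("email", "").strip().lower(),
        "phone": re.sub(r"\D", "", record.get("phone", "")),  # keep digits only
    }


def cleanse(records: list[dict]) -> tuple[list[dict], list[dict]]:
    """Standardize records, drop duplicate emails, set aside records to review."""
    seen_emails = set()
    clean, needs_review = [], []
    for record in records:
        std = standardize(record)
        if not std["email"]:
            needs_review.append(std)  # no email to match on: a person decides
        elif std["email"] not in seen_emails:
            seen_emails.add(std["email"])
            clean.append(std)  # first record seen for this email is kept
    return clean, needs_review


raw = [
    {"name": "  Ada   Lovelace ", "email": "Ada@Example.com ", "phone": "(555) 010-0199"},
    {"name": "Ada Lovelace", "email": "ada@example.com", "phone": "555.010.0199"},
    {"name": "Grace Hopper", "email": "", "phone": "555 010 0123"},
]
clean, needs_review = cleanse(raw)
print(len(clean), len(needs_review))  # 1 1
```

The two Ada Lovelace rows only match after standardization, which is why standardizing comes before deduplicating. A real project would likely match on more than one field and merge conflicting values instead of simply keeping the first record.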

Once you’re disinfected, you can put in systems and personnel to keep your data healthy.

Be watchful for indicators that dirty data has started to impact your business. Poor data hygiene can be an underlying reason why your business performance is struggling. Dirty data might be the hidden driver behind why you’re not closing deals, why sales deals are taking longer, why accounting isn’t collecting faster, why manufacturing costs are going up, or why engineering is buying too much product. There are endless examples.

Here at ERP Advisors Group, we specialize in providing expert guidance on data hygiene. We help our clients assess the quality of the data, whether it is good or bad, and how much effort is required to clean it. We can provide direction as to what ERP tools to select for data cleansing and for keeping data updated, as well as guidance in designing projects to disinfect dirty data.

Contact us for a free consultation today.