To solve the data quality challenges that plague organizations today, businesses must understand the most prevalent data quality misconceptions. New research from data management firm Infogix debunks seven popular data quality myths that are doing businesses more harm than good—and reveals the truth behind them.
According to a recent survey from O’Reilly Radar, “Organizations are dealing with multiple, simultaneous data quality issues. They have too many different data sources and too much inconsistent data. They don’t have the resources they need to clean up data quality problems. And that’s just the beginning.”
Myth #1: Data governance and data quality are two different initiatives
Data governance ensures that the people, processes and technology involved in managing data also establish trust in information, generating reliable insights that benefit the business. This is impossible to do without data quality.
The truth: Data governance without data quality is data governance light
When data travels through the data supply chain, it is exposed to new processes, uses and transformations, which impact its integrity. By scoring and monitoring data quality and enacting data integrity controls within a data governance program, businesses establish data trust by preventing downstream data issues that could adversely affect audit, risk and compliance reporting, management presentations and general decision making.
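The article doesn't describe how quality scoring works in practice; as a minimal sketch, a score can be computed as the fraction of records passing a set of completeness and validity checks. The field names, rules and thresholds below are illustrative assumptions, not from the source.

```python
# Hypothetical sketch: scoring records on two quality dimensions
# (completeness and validity) as data moves through a pipeline.
# Field names and rules are invented for illustration.

import re

REQUIRED_FIELDS = ["customer_id", "email", "amount"]
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def quality_score(records):
    """Return the fraction of records passing all checks (0.0-1.0)."""
    if not records:
        return 0.0
    passing = 0
    for rec in records:
        # Completeness: every required field is present and non-empty.
        complete = all(rec.get(f) not in (None, "") for f in REQUIRED_FIELDS)
        # Validity: email looks well-formed, amount is a non-negative number.
        valid_email = bool(EMAIL_RE.match(rec.get("email") or ""))
        amount = rec.get("amount")
        valid_amount = isinstance(amount, (int, float)) and amount >= 0
        if complete and valid_email and valid_amount:
            passing += 1
    return passing / len(records)

records = [
    {"customer_id": 1, "email": "a@example.com", "amount": 10.0},
    {"customer_id": 2, "email": "not-an-email", "amount": 5.0},
    {"customer_id": 3, "email": "b@example.com", "amount": -1},
    {"customer_id": 4, "email": "c@example.com", "amount": 2.5},
]
print(f"quality score: {quality_score(records):.2f}")  # 2 of 4 pass -> 0.50
```

Tracking a score like this over time is what lets a governance program detect quality drift before downstream reports are affected.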
Myth #2: Solving data quality issues is extremely costly and time-consuming
Machine learning and analytics capabilities continuously monitor data integrity, automate data quality tasks that would otherwise require large teams of people, and help ensure data is error-free.
The truth: Advanced data quality checks and integrated machine learning capabilities allow companies to automatically monitor and improve enterprise data trust
When business users know data is accurate, they trust the information to help drive better business decisions.
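The article doesn't specify which techniques such automated monitoring uses. As a minimal illustration, here is a simple statistical baseline check—a stand-in for the ML-driven monitoring mentioned above—that flags a daily record count deviating sharply from its historical norm. The data and threshold are invented for illustration.

```python
# Hypothetical sketch of automated data monitoring: flag a day's record
# count that deviates sharply from the historical baseline. A simple
# z-score check, standing in for more sophisticated ML-based monitoring.

from statistics import mean, stdev

def is_anomalous(history, today, threshold=3.0):
    """Flag `today` if it lies more than `threshold` standard
    deviations from the mean of `history`."""
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return today != mu
    return abs(today - mu) / sigma > threshold

daily_counts = [10_120, 9_980, 10_050, 10_200, 9_900, 10_075]
print(is_anomalous(daily_counts, 10_040))  # within normal range -> False
print(is_anomalous(daily_counts, 2_500))   # sharp drop -> True
```

A check this small can run on every load, turning "large teams of people eyeballing reports" into an automated alert.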
Myth #3: Knowing whether data quality is high or low is all that matters
It doesn’t matter if data is somewhat inaccurate or very inaccurate. For those consuming the data, if it is inaccurate, at any level, it is useless. However, there could also be times that data is accurate, but not exactly useful. For example, if an organization has accurate data that is six years old, business users will likely consider that information unreliable because it isn’t timely, even though it may be useful to different departments in other ways.
The truth: Data quality is a moving target and simply categorizing it as low or high won’t cut it
Being able to characterize, catalog, and provide lineage for the data might be more important to a user of the data than actually knowing if it’s high or low, depending on how the consumers intend to use it.
Myth #4: Third parties are solely responsible for the quality of their own data
Third-party data creates opportunities to exchange information with outside sources to uncover insights and improve the customer experience. But there is no clear way for one organization to help a partner organization get a handle on its own data quality.
The truth: Whether it was created internally or not, it is critical to verify and maintain data quality across all data sources
When organizations ingest information from various outside sources, they must ensure the data’s quality by applying integrity checks as it enters a company’s data supply chain.
Administering regular checks for accuracy and completeness upon receipt of outside information—across every system and process—helps monitor data immediately as it enters the enterprise to ensure quality is always maintained from source to system.
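One common form of ingestion-time integrity control is reconciling a received batch against control totals the sender declared, before the data enters downstream systems. The manifest format and field names below are invented for illustration; the article does not prescribe a specific mechanism.

```python
# Hypothetical sketch of an ingestion-time integrity control: reconcile
# a received batch against the sender's declared control totals (record
# count and amount sum) before accepting it into the data supply chain.

def verify_batch(manifest, rows, tolerance=0.01):
    """Return a list of integrity failures (empty means the batch passes)."""
    failures = []
    # Completeness: did we receive as many records as the sender declared?
    if len(rows) != manifest["record_count"]:
        failures.append(
            f"record count {len(rows)} != declared {manifest['record_count']}"
        )
    # Accuracy: do the amounts sum to the declared control total?
    total = sum(r["amount"] for r in rows)
    if abs(total - manifest["amount_total"]) > tolerance:
        failures.append(
            f"amount total {total} != declared {manifest['amount_total']}"
        )
    return failures

manifest = {"record_count": 3, "amount_total": 61.5}
rows = [{"amount": 20.0}, {"amount": 21.5}, {"amount": 20.0}]
print(verify_batch(manifest, rows))  # [] -> batch accepted
```

Rejecting or quarantining a batch at this point is far cheaper than tracing a bad number through reports months later.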
Myth #5: It’s easy for employees to understand why data quality is important
Often, the same data exists in a different format, in a different department, with different information. This siloed environment very likely hinders employees' ability to understand why data quality matters at all.
The truth: Complex siloed data environments allow little visibility into the full process of how a specific dataset or transaction is configured
Today, the speed and scale of data and the sheer number of data platforms and applications utilized is staggering. As a result, the risk to data quality is steadily increasing. Companies must establish appropriate information quality oversight to navigate changes within convoluted data environments.
Myth #6: Data quality and data integrity are different concepts
Historically, the term data integrity referred to the validity of data and data quality meant the completeness, accuracy and timeliness of data. However, to understand the validity of data, businesses must be aware of its completeness, accuracy and timeliness.
The truth: Data quality and data integrity are terms that can almost be used interchangeably
Data integrity also implies the movement of data throughout an organization. Integrity must be ensured across the entire data supply chain, in combination with data quality. Therefore, either data integrity or data quality can describe the validity, completeness, accuracy, timeliness, etc. of data. The two terms are tightly coupled in their mission: provide trustworthy data.
Myth #7: Implementing data quality is a technology initiative
In today’s data-driven world, with over-burdened IT teams, businesses can no longer assume IT always understands the business requirements against which quality needs to be validated, nor should they be required to rely on their data integration tools to address quality.
The truth: Every single person in the organization is responsible for doing their part in addressing data quality
With more individuals across the organization requiring high-quality data to do their jobs effectively, everyone has a stake in determining how best to validate the information. Looking ahead, business users need to start implementing their own quality routines, since they know the requirements they are after; getting more involved in actually implementing quality checks simplifies things all around.
“The technology and applications that businesses use to manage data and the different ways they utilize information is always evolving,” said Emily Washington, executive vice president of product management at Infogix, in a news release. “The one constant that always remains is the unparalleled need for accurate, consistent and reliable information to drive business decisions.”