A new online survey, conducted by SourceMedia Research and commissioned by intelligent data firm Paxata, finds that many businesses are behind the curve when it comes to data quality (DQ), and reveals how the barriers to data quality can be lowered.
According to the study, fewer than half of the organizations surveyed (40 percent) have developed a mature data quality model, and even fewer have deployed one.
The study defines a mature DQ approach as one in which organizations reach higher levels of data quality satisfaction as they implement, or plan to implement, strategies such as high data lake usage; high public cloud usage; the use of data profiling, preparation and quality tools; high-value data prep activities; and a high level of CIO involvement in data quality.
Based on responses from 290 executives and IT professionals at enterprises with $100M or more in annual revenue, The State of Data Quality in the Enterprise, 2018 shows that organizations face a variety of challenges as they strive to turn data into valuable business insights. Data complexity and variety are growing as companies continue to ingest data from first-, second-, and third-party sources, which creates a complex mix of data types.
While data lake and public cloud usage are growing and helping organizations meet the data storage challenge, users continue to wrestle with data preparation and processing issues:
- Just 40 percent of organizations have developed a mature data quality model, and only 15 percent have actually deployed one.
- Companies face two major obstacles, significant data variety and a complex mix of data types:
  - Data variety: 37 percent of any organization’s data comes from external, second-party and third-party sources.
  - Data types: 8 percent reported using all structured data, 64 percent mostly structured data with little unstructured data, 21 percent a mix of structured and unstructured data, and 6 percent all unstructured data.
- Within the data preparation process, data ingest takes the largest share of time (30 percent), followed by data profiling (21 percent) and data remediation (21 percent).
- 84 percent of organizations already use the public cloud to store at least some portion of their data. And while just 14 percent currently store 61 to 80 percent of their data in a data lake, nearly a quarter (23 percent) will be storing that share of their data in a data lake within 12 months.
According to the report, organizations that report satisfaction with their data quality are significantly more likely to be using data profiling, preparation and quality tools. Indeed, 56 percent of organizations that have deployed a mature data quality model are using these types of solutions.
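For readers unfamiliar with the term, data profiling generally means summarizing a dataset before it is used: how complete each field is, how many distinct values it holds, and what kinds of values appear. The Python sketch below is only an illustration of that idea, not a description of Paxata's product or of any tool named in the survey. The function names (`profile_column`, `profile_records`) and the sample records are hypothetical and were made up for this example.

```python
from collections import Counter


def _infer_type(value):
    """Classify a single value as 'integer', 'number', or 'text'."""
    text = str(value).strip()
    try:
        int(text)
        return "integer"
    except ValueError:
        pass
    try:
        float(text)
        return "number"
    except ValueError:
        return "text"


def profile_column(values):
    """Summarize one column: completeness, distinct values, inferred types."""
    total = len(values)
    # Assumption: None and blank strings both count as missing.
    present = [v for v in values if v is not None and str(v).strip() != ""]
    type_counts = Counter(_infer_type(v) for v in present)
    return {
        "rows": total,
        "missing": total - len(present),
        "completeness": len(present) / total if total else 0.0,
        "distinct": len({str(v).strip() for v in present}),
        "types": dict(type_counts),
    }


def profile_records(records):
    """Profile every field that appears in a list of dict records."""
    fields = []
    for record in records:
        for key in record:
            if key not in fields:
                fields.append(key)
    return {f: profile_column([r.get(f) for r in records]) for f in fields}


# Hypothetical sample data, invented purely for illustration.
records = [
    {"customer_id": "1001", "revenue": "250.5", "region": "EMEA"},
    {"customer_id": "1002", "revenue": "", "region": "emea"},
    {"customer_id": "1002", "revenue": "n/a", "region": None},
]

profile = profile_records(records)
for field, stats in profile.items():
    print(field, stats)
```

On this made-up sample, the profile flags the kinds of issues a remediation step would then address: a repeated `customer_id`, a `revenue` field that mixes numbers with free text, and a `region` field with a missing value and inconsistent capitalization.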
“Businesses need data quality solutions that can support interactivity with both structured and unstructured data. It also must ingest and prepare large volumes of data and allow business users, as well as technical staff members, to become more fully engaged in data quality initiatives,” said Nenshad Bardoliwalla, co-founder and chief product officer at Paxata, in a news release. “We purposefully designed Paxata to address the most time-consuming part of data quality projects, providing our customers with an intuitive, visual, and interactive application for business users to onboard, profile, and create quality information.”
The online survey was conducted by SourceMedia Research in November 2017. The results analyzed in this report were gathered from 290 executives and IT professionals at enterprises with $100M or more in annual revenue.