The Past, Present, and Future of Data Quality Management: Understanding Testing, Monitoring, and Data Observability in 2024
Data quality monitoring. Data testing. Data observability. Say that five times fast. Are they different words for the same thing? Unique approaches to the same problem? Something else entirely? And more importantly, do you really need all three? Like everything in data engineering, data quality management is evolving at lightning speed. The meteoric rise of data and AI in …
big data quality
Your Data’s (Finally) In The Cloud. Now, Stop Acting So On-Prem
Imagine you've been building houses with a hammer and nails for most of your career, and I gave you a nail gun. But instead of pressing it to the wood and pulling the trigger, you turn it sideways and hit the nail with the gun as if it were a hammer. You would probably think it's expensive and not overly effective, while the site's inspector is going to rightly view it as a …
How to Build a 5-Layer Data Stack
Like bean dip and ogres, layers are the building blocks of the modern data stack. Its powerful selection of tooling components combines to create a single synchronized and extensible data platform, with each layer serving a unique function in the data pipeline. Unlike ogres, however, the cloud data platform isn't a fairy tale. New tooling and integrations are created almost daily …
All I Want To Know Is What’s Different – But Also Why and Can You Fix It ASAP?
I link to Benn Stancil in my posts more than any other data thought leader. I might not always agree with his answers, but I almost always agree with his questions. True to form, last week he tackled one of the most important questions data leaders need to ask: "How do we empower data consumers to assess the credibility of MDS-generated data products?" The question …
Data Freshness Explained: Making Data Consumers Wildly Happy
What is data freshness and why is it important? Data freshness, sometimes referred to as data timeliness, is the frequency with which data is updated for consumption. It is an important data quality dimension and a pillar of data observability because recently refreshed data is more accurate, and thus more valuable. Since it is impractical and expensive to have all data refreshed …