I Don’t Care How Big Your Data Is
At some point in the last two decades, the size of our data became inextricably linked to our ego. The bigger the better. We watched enviously as FAANG companies talked about optimizing hundreds of petabytes in their data lakes or data warehouses. We imagined what it would be like to engineer at that scale. We started humblebragging at conferences, like weight lifters talking …
big data quality
Autonomous data observability and quality within AWS Glue Data Pipeline
Data operations and engineering teams spend 30-40% of their time firefighting data issues raised by business stakeholders. A large percentage of these data errors can be attributed to the errors present in the source system or errors that occurred or could have been detected in the data pipeline. Current data validation approaches for the data pipeline are rule-based - designed …
Webinar: Driving Data Literacy in the Public Sector
AI in Government is hosting guest speaker Jeremy Golant, Government Solutions Consultant at Coursera, presenting 'Driving Data Literacy in the Public Sector' on Wednesday, August 10, 2022, from 12 PM to 1 PM ET! As data becomes more seamlessly integrated within decision-making processes, public sector agencies must ensure individuals across the workforce can read, work with, …
The Right Way to Measure ROI on Data Quality
Last week, I was on a Zoom call with Lina, a Data Product Manager who oversees a data quality program for her organization. Her team is responsible for maintaining thousands of data pipelines that populate many of the company's most business-critical tables. Reliable and trustworthy data is foundational to the success of their product, yet Lina was struggling to find a clear way …
Detect Data Errors in Snowflake Data in 60 Seconds
Without effective and comprehensive validation, a data warehouse becomes a data swamp. With the accelerating adoption of Snowflake as the cloud data warehouse of choice, the need for autonomously validating data has become critical. While existing data quality solutions provide the ability to validate Snowflake data, these solutions rely on rule-based approaches that are not …