Data lakes store data in its native format, so data experts can explore relationships in the data without first committing to a structure. It’s a flexible storage strategy for combining structured and unstructured data, and it works best as a sandbox alongside a data warehouse.
So, What’s Damming Data Lakes?
The promise of combining unstructured and structured data in one place is alluring, but it creates one very serious problem.
When too much data is dumped into a data lake, it risks becoming a data swamp instead. This term was coined by Michael Stonebraker to describe the murkiness of data curation. If the data isn’t curated before analysis, it’s impossible to gain valuable insights. Curation demands meticulous attention at each of these four steps:
- Ingestion
- Transformation
- Cleansing
- Consolidation
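To make the four steps concrete, here is a minimal Python sketch of a curation pass over two small sources. It is an illustration only: the sample records, the field names (`id`, `email`, `amount`), and the function names are all hypothetical, and a real data lake pipeline would involve far more than this.

```python
import csv
import io
import json

# Hypothetical raw inputs, kept in their native formats as a lake would store them.
RAW_CSV = "id,email,amount\n1,A@Example.com,10.5\n2,,7\n"
RAW_JSON_LINES = (
    '{"id": 1, "email": "a@example.com", "amount": "4.5"}\n'
    '{"id": 3, "email": "c@example.com", "amount": "2"}\n'
)


def ingest(csv_text, json_text):
    """Ingestion: read each source as-is, without imposing a schema."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    rows += [json.loads(line) for line in json_text.splitlines() if line.strip()]
    return rows


def transform(rows):
    """Transformation: coerce every record into one common shape and set of types."""
    return [
        {"id": int(r["id"]), "email": r["email"] or None, "amount": float(r["amount"])}
        for r in rows
    ]


def cleanse(rows):
    """Cleansing: drop records with no email and normalize the rest."""
    return [
        {**r, "email": r["email"].strip().lower()}
        for r in rows
        if r["email"]
    ]


def consolidate(rows):
    """Consolidation: merge records that share an id, summing their amounts."""
    merged = {}
    for r in rows:
        if r["id"] in merged:
            merged[r["id"]]["amount"] += r["amount"]
        else:
            merged[r["id"]] = dict(r)
    return sorted(merged.values(), key=lambda r: r["id"])


curated = consolidate(cleanse(transform(ingest(RAW_CSV, RAW_JSON_LINES))))
print(curated)
```

In this sketch the record with no email is dropped during cleansing, and the two records for id 1 are merged during consolidation. Skipping either step would leave incomplete or duplicated rows in the analysis, which is the kind of murkiness the "data swamp" label describes.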
While investigating the growth of the data lake phenomenon, we heard from experts at Tableau, Logi Analytics and Qlik. We’ve also included an opinion from Gartner on this 2016 business intelligence trend.
Infographic developed by Better Buys

