Datafloq


How is Data Quality Measured?

Tom Wilson / 4 min read.
February 15, 2021

In the age of information, we have to deal with ever more data. For a contemporary business, data is as necessary as any physical material it uses, if not more so.

And as with anything else that businesses use or produce, what matters is quality. However, data quality might seem a little trickier to grasp than the quality of more traditional products and tools. After all, it is not something tangible or visible at first glance. So, how would someone go about evaluating the quality of the datasets they possess?

What value does high-quality data bring to the business?

Data quality, like the quality of any product or service, can be recognized by how well the data does what we expect of it. Therefore, the first step in defining high-quality data is looking at what we want from it.

Generally, data is used, on the one hand, as part of daily operations of a business, and, on the other hand, as part of business intelligence. We use data for business intelligence when we want to make informed decisions about what strategic actions need to be taken to ensure that the company successfully moves forward.

This means that high-quality data will enhance the daily workflow of the business, allowing everything to run smoothly. It will also provide valuable insights into the current situation relevant to the business, which will allow for better management decisions.

Dimensions of data quality

Having this in mind, we can look at the criteria for measuring data quality. A good way to start is the list of data quality dimensions provided by DAMA UK, a not-for-profit data management community.

1. Completeness

When evaluating the quality of a particular dataset, one thing to check is whether important elements are missing. For example, if you have requested some information about a customer, but values are missing in the telephone number field, the data is incomplete. However, if the value is missing in a field that you deemed optional, for example, middle name, the data is still considered complete as long as all the mandatory information is there.
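The completeness check described above can be sketched in a few lines of Python. This is only an illustration: the field names, the choice of which fields are mandatory, and the sample records are all assumptions, and the rule that blank strings count as missing is one possible convention.

```python
# Hypothetical customer records; field names are assumptions for illustration.
MANDATORY_FIELDS = ("name", "telephone")  # "middle_name" is treated as optional


def is_missing(value):
    """Treat None and blank strings as missing."""
    return value is None or (isinstance(value, str) and value.strip() == "")


def completeness(records, mandatory_fields=MANDATORY_FIELDS):
    """Return the share of records whose mandatory fields are all filled in."""
    if not records:
        return 1.0  # an empty dataset contains no incomplete records
    complete = sum(
        1
        for record in records
        if not any(is_missing(record.get(field)) for field in mandatory_fields)
    )
    return complete / len(records)


customers = [
    {"name": "Ada Smith", "middle_name": None, "telephone": "555 0100"},
    {"name": "Bo Chen", "middle_name": "L.", "telephone": ""},  # incomplete
    {"name": "Cy Diaz", "middle_name": "", "telephone": "555 0102"},
    {"name": "Di Evans", "middle_name": None, "telephone": "555 0103"},
]

print(completeness(customers))
```

Only the record with an empty telephone field lowers the score; the missing middle names do not, because that field is not in the mandatory list.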

2. Uniqueness

This dimension concerns whether there are duplicate data units in your database; the process of finding and removing them is often called deduplication. Duplicates mean lower quality, as they take up valuable storage space and give you the wrong impression of how much data you actually have.
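As a rough sketch, duplicates can be counted by reducing each record to a comparison key. The key fields and the normalization used here (ignoring case and surrounding whitespace) are assumptions; what counts as "the same" record depends on the dataset.

```python
from collections import Counter


def normalize(record, key_fields):
    """Build a comparison key, ignoring case and surrounding whitespace."""
    return tuple(str(record.get(f, "")).strip().lower() for f in key_fields)


def uniqueness(records, key_fields):
    """Return (share of distinct records, keys that appear more than once)."""
    counts = Counter(normalize(r, key_fields) for r in records)
    duplicates = [key for key, n in counts.items() if n > 1]
    ratio = len(counts) / len(records) if records else 1.0
    return ratio, duplicates


# Hypothetical contact list with one near-identical duplicate.
contacts = [
    {"name": "Ada Smith", "email": "ada@example.com"},
    {"name": "ada smith ", "email": "ADA@example.com"},
    {"name": "Bo Chen", "email": "bo@example.com"},
]

ratio, duplicates = uniqueness(contacts, ("name", "email"))
print(ratio, duplicates)
```

The first two records collapse to the same key, so three stored rows represent only two distinct contacts.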



3. Timeliness

This criterion refers to whether your data is recent enough to still be as useful as expected. The utility of information might decrease gradually with time, or it might drop dramatically once a certain deadline has passed. Therefore, data that is available within the required time frame is of higher quality.
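One simple way to put a number on timeliness is to count how many records were updated within an acceptable window. The sketch below assumes each record carries an `updated_at` timestamp and that a hard cutoff of 30 days is appropriate; both are illustrative choices, and a gradual decay in usefulness would need a different formula.

```python
from datetime import datetime, timedelta


def timeliness(records, now, max_age):
    """Return the share of records updated within max_age of now."""
    if not records:
        return 1.0
    fresh = sum(1 for r in records if now - r["updated_at"] <= max_age)
    return fresh / len(records)


# Hypothetical price list; "updated_at" is an assumed field name.
now = datetime(2021, 2, 15)
prices = [
    {"sku": "A1", "updated_at": datetime(2021, 2, 14)},  # 1 day old
    {"sku": "B2", "updated_at": datetime(2021, 1, 1)},   # 45 days old
]

score = timeliness(prices, now, timedelta(days=30))
print(score)
```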

4. Validity

Data is valid when it is of the correct type and format for what was requested. For example, if a field in the dataset is meant for telephone numbers, an entry typed in letters instead of digits is invalid. The letters might be meaningful and correct data, for example, the name of the client typed in place of the number. But as it is not what was requested, it is not valid for that purpose.
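The telephone number example can be expressed as a format check. The rule below (an optional leading "+", then 7 to 15 digits once spaces and hyphens are removed) is an assumption made for illustration; real validation rules vary by country and by business.

```python
import re


def is_valid_phone(value):
    """Check a value against an assumed, simplified telephone number format."""
    if not isinstance(value, str):
        return False
    compact = re.sub(r"[\s-]", "", value)  # drop spaces and hyphens
    return re.fullmatch(r"\+?\d{7,15}", compact) is not None


# Hypothetical entries from a telephone number field.
entries = ["+44 20 7946 0958", "555-0100", "Acme Ltd", ""]
invalid = [e for e in entries if not is_valid_phone(e)]
print(invalid)
```

Note that "Acme Ltd" may well be a correct client name, yet it still fails the check because it is not the kind of value the field asks for.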

5. Accuracy

This is the most obvious standard for measuring data quality. Data has to be accurate. No matter how much information you possess, if it is incorrect, it cannot be considered good information, and it cannot provide much insight for business strategy.
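Accuracy can only be measured against something known to be correct. The sketch below assumes that a trusted reference source exists and that records can be matched to it by an `id` field; neither assumption comes from the article, and in practice obtaining such a reference is often the hard part.

```python
def accuracy(records, reference, key, fields):
    """Return the share of checked field values that match the reference.

    Records with no entry in the reference are skipped, because their
    accuracy cannot be judged. Returns None if nothing could be checked.
    """
    checked = matched = 0
    for record in records:
        truth = reference.get(record[key])
        if truth is None:
            continue
        for field in fields:
            checked += 1
            matched += record.get(field) == truth.get(field)
    return matched / checked if checked else None


# Hypothetical trusted source and the dataset being assessed.
reference = {
    1: {"city": "Leeds", "postcode": "LS1 4AP"},
    2: {"city": "York", "postcode": "YO1 7HH"},
}
records = [
    {"id": 1, "city": "Leeds", "postcode": "LS1 4AP"},
    {"id": 2, "city": "Yorkshire", "postcode": "YO1 7HH"},  # wrong city
    {"id": 3, "city": "Hull", "postcode": "HU1 1AA"},       # not in reference
]

print(accuracy(records, reference, "id", ("city", "postcode")))
```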

6. Consistency

Information throughout the database should be consistent. This means that there should not be data units that seem to be in conflict with each other. That could happen if, for example, a certain telephone number is linked to one company name in one dataset within the enterprise, but to a different company name in another. The information is not necessarily inaccurate: it could be the same company under a new name. Even so, inconsistencies in the way data is recorded might cause confusion and obstruct workflow. Therefore, information that has been updated to avoid such clashes is of higher quality.
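The telephone number example lends itself to a simple cross-dataset check: collect every company name recorded against each number and flag the numbers that have more than one. The dataset names and field names below are hypothetical.

```python
from collections import defaultdict


def find_conflicts(*datasets, key="telephone", attribute="company"):
    """Return keys that are linked to more than one attribute value."""
    seen = defaultdict(set)
    for dataset in datasets:
        for record in dataset:
            seen[record[key]].add(record[attribute])
    return {k: sorted(v) for k, v in seen.items() if len(v) > 1}


# Two hypothetical datasets held by the same enterprise.
crm = [
    {"telephone": "5550100", "company": "Acme Ltd"},
    {"telephone": "5550101", "company": "Birch & Co"},
]
billing = [
    {"telephone": "5550100", "company": "Acme Global"},
    {"telephone": "5550101", "company": "Birch & Co"},
]

conflicts = find_conflicts(crm, billing)
print(conflicts)
```

A flagged number is a prompt for review rather than proof of an error, since both names may refer to the same company.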

There is always more to quality

The set of standards above provides an introduction to how data quality could be measured. However, it is crucial to understand that what constitutes high-quality data may be different in various situations.

This means that, firstly, the list is by no means exhaustive. Other criteria may be added to, substituted for, or derived from those above. And, secondly, the definitions of the dimensions are not set in stone. For different purposes, what counts as qualifying for any of the dimensions may be specified or redefined.

At the end of the day, every business needs to figure out what features of data make it most suitable for its purposes. Then the initial set of criteria can be adjusted to reflect what data quality means specifically for this business.
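One possible way to adjust the criteria to a particular business is to weight the dimensions by how much each matters. The weighted average below is an assumption about how scores might be combined, not a method taken from DAMA UK, and the scores and weights are made-up numbers.

```python
def overall_score(scores, weights):
    """Combine per-dimension scores (0 to 1) into a weighted average."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[dim] * w for dim, w in weights.items()) / total


# Hypothetical scores and weights; this business treats completeness as
# twice as important as the other two dimensions.
scores = {"completeness": 0.9, "uniqueness": 1.0, "timeliness": 0.5}
weights = {"completeness": 2, "uniqueness": 1, "timeliness": 1}

print(round(overall_score(scores, weights), 3))
```

Changing the weights, or dropping and adding dimensions, changes what "high quality" means without changing how the individual checks are computed.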

Categories: Big Data
Tags: Big Data, big data analyst, data quality, data science

About Tom Wilson

I am a content editor at Coresignal. Coresignal is a leading provider of fresh alternative data, offering access to a continuously updated database of 14TB of data.
