Datafloq


Data Science Governance: Don’t Reinvent The Wheel

Bill Franks / 4 min read.
January 13, 2021

As data science processes become increasingly operationalized and embedded within business processes, the importance of governing those processes continues to rise. While governance has been a major focus for many years when it comes to managing data, governance focused on data science processes is still far less mature. That needs to change. This post discusses a few distinct areas of governance that organizations should consider.

Governance and Ethics Are Inextricably Linked

When defining governance procedures and guidelines, it is necessary to account for ethical considerations up front. The reason is that once governance policies are put in place, they will incentivize and disincentivize various behaviors. Without accounting for the ethics of those behaviors, there is a risk of creating a terrifically managed and tightly governed process that does horribly unethical things.

Imagine that a company creates a process 1) using well-governed data on people’s behavior that has been 2) prepared with a well-defined and consistent set of computations to 3) generate summary metrics to feed into a model. Furthermore, the company monitors the performance, bias, and consistency of the model while also tightly controlling who has access to the output and what it is used for. Sounds like a very well-governed process, doesn’t it? Now imagine that the behavioral data comes from social media and that the model uses it to predict who is likely to commit a crime so that law enforcement can intervene, as in the movie Minority Report.

Such a process may be well-governed, but it is horribly unethical. That is why I said that you can’t separate ethics from governance. To be truly effective, governance must be ethically sound as well as technically rigorous.

Auditing A Process Doesn’t Mean Revealing All Its Secrets

It is often necessary to perform audits to prove that a data science process is working appropriately. A common concern is that a complete audit requires revealing the secret sauce behind the process. While this concern is especially common if a third party will be performing the audit, it does not have to be the case.

Consider beverage giant Coca-Cola. Only a couple of people in the entire world know the full recipe for a bottle of Coke, and none of those people have a regulatory oversight role. Yet, people are still comfortable that Coke products are safe to enjoy. Why is that? First, while the exact mix of ingredients in the recipe may not be known, they are all standard food products. So, both the company and oversight agencies can confirm that any given ingredient going into a Coke is safe and approved. Similarly, the final product can be checked for toxins, chemical composition, etc. to ensure that the ingredients were not somehow mixed in a way that caused unforeseen problems. In other words, it is possible to audit that a Coke is safe to drink without having to know the secret formula.


The same is true with machine learning and artificial intelligence. To validate that a process accurately predicts what it is attempting to predict, that its predictions are free from bias, and that they are stable over time, it is not necessary to unveil the exact formulation of the underlying model. By passing a wide range of data to the model, we can demonstrate accuracy, consistency, and bias level while still maintaining the confidentiality of the secret sauce behind the model. It is possible to have algorithms that provide a competitive advantage, while providing strong governance and auditing of the process, without revealing the core IP that has been developed. Therefore, there is no reason to argue against auditing. I’m actually a fan of the idea of having third-party auditors involved, much as is done in accounting. We may soon see a company rise to prominence by providing such services.
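To make the black-box idea concrete, here is a minimal Python sketch of what such an audit might look like. It is only an illustration under stated assumptions: the auditor can call a prediction function but cannot see inside it, the model returns 0 or 1, and the metric choices (plain accuracy, the gap in positive-prediction rates between groups, and the spread in accuracy across time-ordered batches) are just one reasonable option among many. All names, including `secret_model`, are hypothetical.

```python
from typing import Callable, Hashable, Sequence, Tuple

# Assumption: the auditor only has a callable that maps features to 0 or 1.
Row = Sequence[float]
Predict = Callable[[Row], int]


def accuracy(predict: Predict, rows: Sequence[Row], labels: Sequence[int]) -> float:
    """Share of rows where the black-box prediction matches the known label."""
    correct = sum(1 for x, y in zip(rows, labels) if predict(x) == y)
    return correct / len(rows)


def positive_rate_gap(
    predict: Predict, rows: Sequence[Row], groups: Sequence[Hashable]
) -> float:
    """Largest difference in positive-prediction rate between any two groups."""
    totals: dict = {}
    positives: dict = {}
    for x, g in zip(rows, groups):
        totals[g] = totals.get(g, 0) + 1
        positives[g] = positives.get(g, 0) + predict(x)
    rates = [positives[g] / totals[g] for g in totals]
    return max(rates) - min(rates)


def accuracy_drift(
    predict: Predict, batches: Sequence[Tuple[Sequence[Row], Sequence[int]]]
) -> float:
    """Spread between the best and worst accuracy across time-ordered batches."""
    scores = [accuracy(predict, rows, labels) for rows, labels in batches]
    return max(scores) - min(scores)


def secret_model(x: Row) -> int:
    # Hypothetical stand-in for a confidential model; the audit functions
    # above never inspect this logic, they only call it.
    return 1 if x[0] + x[1] > 1.0 else 0


# Toy audit data (made up for illustration only).
rows = [(0.2, 0.3), (0.9, 0.4), (0.6, 0.7), (0.1, 0.5)]
labels = [0, 1, 1, 0]
groups = ["a", "a", "b", "b"]
batches = [(rows[:2], labels[:2]), (rows[2:], labels[2:])]

print("accuracy:", accuracy(secret_model, rows, labels))
print("positive-rate gap:", positive_rate_gap(secret_model, rows, groups))
print("accuracy drift:", accuracy_drift(secret_model, batches))
```

The point of the design is that every function takes the model as an opaque callable, so a third party could run the same checks against a vendor's prediction endpoint without ever seeing the formula behind it.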

Borrow from Other Fields Liberally!

One thing those of us in the data science field are often guilty of is trying to build things ourselves, even if there might be something close to what we need already available. Rather than tweaking the existing approach to our needs, we start from scratch. The urge to do this should be resisted!

When it comes to governance as it relates to safety, quality, and audits, there are highly mature approaches in other disciplines that can be borrowed. Traditional product development and engineering teams have strong protocols that have been developed over many decades. While it is certainly true that engineering protocols for safety assurance will not translate directly to data science processes, it is also true that tweaking an engineering approach to fit within a data science context is probably a faster path to progress than developing and testing protocols from scratch.

One terrific example of a set of protocols that data science teams have adapted successfully is in the area of agile software development. While the agile protocols originally developed for software developers do not translate exactly to a data science context, many require little change. Data science teams now follow agile analytics protocols that take full advantage of the principles originally produced to support agile software development. Sure, there are some differences and additions, but the data science community is certainly better off for borrowing from a proven approach in a related discipline than if we tried to start a new grassroots approach on our own.

Don’t Make Governance Harder Than It Needs to Be

Governance is not nearly as interesting and engaging as creating awesome data science processes, but it is necessary. Do not assume that tackling data science governance has to be long and painful, with a lot of totally new protocols needing to be developed. The data science community can borrow and adapt much of what has been done by others in the areas of data governance, quality control, safety, and auditability. By resisting our urges to create bespoke approaches from scratch, we will not only accelerate our efforts, but we will also avoid relearning the hard lessons that others learned as they built the governance processes we are borrowing from.

Originally published by the International Institute for Analytics

Categories: Big Data, Strategy
Tags: big data strategy, data science, governance

About Bill Franks

Bill Franks is an internationally recognized chief analytics officer who is a thought leader, speaker, consultant, and author focused on analytics and data science. Franks is also the author of Winning The Room, 97 Things About Ethics Everyone In Data Science Should Know, Taming The Big Data Tidal Wave, and The Analytics Revolution. His work has spanned clients in a variety of industries for companies ranging in size from Fortune 100 companies to small non-profit organizations. You can learn more at https://www.bill-franks.com.
