Why Monitoring Machine Learning Models Matters

Jessica Bruce / 4 min read.
May 10, 2021

Monitoring machine learning models is crucial for any business that operates ML models in production. In some ways it matters even more than monitoring the software systems DevOps teams are accustomed to, for several reasons. Firstly, unmonitored ML models can fail silently, so errors may go undetected for a long time. Additionally, ML models are often among the most essential components of larger software systems: they are responsible for making intelligent decisions, and we rely heavily on their predictions.

Despite the importance of monitoring machine learning models in production, there is not yet a standard practice or framework for doing so, and many models therefore go into production without proper monitoring and testing. This is because ML technology is only starting to mature, and MLOps, the intersection of DevOps and ML, is still a young field. In this post, we will discuss the importance of machine learning model monitoring and the issues that may arise with your model in the production environment.

Illustration of the full lifecycle of a machine learning model from conception to production. In practice, many models do not go through a proper monitoring stage. (source)

What Can Go Wrong?

The short answer is: a lot. ML models are often treated as black boxes. Many software engineers don’t understand how these models are constructed or how to evaluate their performance, so as long as the model endpoint is alive, they assume it is performing as expected.

ML models are often treated as a black box, which could lead to costly undetected errors and loss of control (source)

However, even a frozen model does not live in a frozen environment, and thus we must continuously evaluate whether our model is still fit for the task. We will discuss some of the reasons your model may not perform as expected.

Data Drift and Concept Drift

One of the common causes of model degradation over time is data drift. Any change in the distribution of the input data over time is considered data drift. It can be caused either by a shift in the real world (e.g. a new competitor affects the market, a pandemic changes user behavior) or by more technical changes to the data pipeline (e.g. incorporating data from additional sources, introducing new categories to categorical data). Similarly, concept drift occurs when there is a shift in the relation between the features and the target.

More specifically, during training the model learns to fit the patterns in the training data. Usually a test set is set aside for evaluation, to simulate performance on real-world data. However, if there is a significant shift in the makeup of the data, we cannot expect our model to perform flawlessly, and it may become stale.

Taxonomy of types of concept drift (source)
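As an illustration, drift in a numeric input feature can be quantified with a simple statistic such as the Population Stability Index (PSI), which compares the binned distribution of production data against a training baseline. The sketch below is a minimal pure-Python example on synthetic data; the thresholds mentioned in the comment (0.1 and 0.25) are common rules of thumb, not universal constants.

```python
import math
import random

def psi(expected, actual, bins=10):
    """Population Stability Index between two numeric samples.
    Rule of thumb: PSI < 0.1 stable, 0.1-0.25 moderate drift, > 0.25 major drift."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0

    def proportions(xs):
        counts = [0] * bins
        for x in xs:
            i = min(max(int((x - lo) / width), 0), bins - 1)
            counts[i] += 1
        # substitute a small count for empty buckets to avoid log(0)
        return [(c or 0.5) / len(xs) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

random.seed(0)
train = [random.gauss(0, 1) for _ in range(5000)]       # training baseline
prod_ok = [random.gauss(0, 1) for _ in range(5000)]     # same distribution
prod_drift = [random.gauss(1.5, 1) for _ in range(5000)]  # shifted inputs

print(psi(train, prod_ok))     # small: no drift detected
print(psi(train, prod_drift))  # large: significant drift
```

In practice you would compute such a statistic per feature on a schedule, and alert when it crosses a threshold chosen for your use case.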

Data Integrity Problems

The data pipeline can be very complex, involving different data sources that may be owned by different bodies. ML models can be extremely intelligent when it comes to identifying subtle patterns in data, but when two columns have been swapped, or an input attribute has changed scale, they are completely blind. Constantly monitoring the data fed to your model will help you identify any change to the data schema early on and fix the issue before significant damage has been done.
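A basic defence against such integrity problems is to validate every incoming record against the schema the model was trained on. The sketch below assumes a hypothetical `EXPECTED_SCHEMA` of field names, types, and value ranges; in a real system this would typically be derived automatically from the training data.

```python
# Hypothetical schema captured at training time: field -> (type, min, max)
EXPECTED_SCHEMA = {
    "age": (int, 0, 120),
    "income": (float, 0.0, 1e7),
    "country": (str, None, None),
}

def validate_record(record, schema=EXPECTED_SCHEMA):
    """Return a list of integrity problems found in one input record."""
    problems = []
    missing = set(schema) - set(record)
    extra = set(record) - set(schema)
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")
    if extra:
        problems.append(f"unexpected fields: {sorted(extra)}")
    for name, (typ, lo, hi) in schema.items():
        if name not in record:
            continue
        value = record[name]
        if not isinstance(value, typ):
            problems.append(f"{name}: expected {typ.__name__}, got {type(value).__name__}")
        elif lo is not None and not (lo <= value <= hi):
            problems.append(f"{name}: value {value} outside [{lo}, {hi}]")
    return problems

print(validate_record({"age": 35, "income": 52000.0, "country": "NL"}))  # valid record
print(validate_record({"age": 35.0, "income": -5.0}))  # type, range, and missing-field issues
```

Running such a check at the model endpoint lets you reject or flag malformed inputs instead of silently producing predictions on corrupted data.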

Mismatch Between Dev and Production Environments

Deploying an ML model in production is a very different setting from development. This difference can manifest in the structure of the data fed to the model from the moment it is deployed. Without proper monitoring, you may wrongly assume your model is performing exactly as it did in the development environment.



Serving Issues

Since ML models can be relatively hidden in the production environment, your model may not even be providing its basic functionality of generating predictions for a given input, and you may not know about it. Such issues can be caused by high latency, misconfiguration of the production environment or the model endpoint, and so on. Monitoring metrics such as the number of requests processed per endpoint can assure you that your model is fully operational.
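A minimal serving-health check might track request counts, error counts, and tail latency per endpoint. The `EndpointMonitor` class below is an illustrative sketch, not part of any particular serving framework; production systems usually delegate this to a metrics library and dashboard.

```python
from collections import deque

class EndpointMonitor:
    """Track request volume, errors, and latency for one model endpoint (sketch)."""

    def __init__(self, window=1000):
        self.latencies = deque(maxlen=window)  # rolling window of recent latencies
        self.requests = 0
        self.errors = 0

    def record(self, latency_s, ok=True):
        self.requests += 1
        self.latencies.append(latency_s)
        if not ok:
            self.errors += 1

    def stats(self):
        n = len(self.latencies)
        ordered = sorted(self.latencies)
        p95 = ordered[int(0.95 * (n - 1))] if n else None
        return {"requests": self.requests, "errors": self.errors, "p95_latency_s": p95}

monitor = EndpointMonitor()
for i in range(100):
    # simulate 100 requests with varying latency and an occasional failure
    monitor.record(latency_s=0.05 + (i % 10) * 0.01, ok=(i % 50 != 0))
print(monitor.stats())
```

Alerting on a rising error count or p95 latency catches the "endpoint is alive but not really serving" failures described above.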

Control Your Model

Proper implementation of machine learning monitoring will enable you to detect all these issues early on and notify you when it’s time to retrain your model on up-to-date data, update the way a feature is calculated, or fix the data pipeline. You will thus stay in control of your ML models and be able to answer any of the following questions instantly:

– Is my model starting to degrade?

– Does the current input data have a similar distribution to the data used for training?

– Are the data pipeline and the schema of the input data intact?

– Is there any increase in biases with respect to attributes such as race or gender?

– Is my model handling the number of requests it receives as required? Do we need more resources? Would less be sufficient?

– Are there any subsets of the data where my model performs better or worse?

Conclusion

Monitoring and testing machine learning models is an area that’s often overlooked. The data science team may be content with achieving good results in the experimental environment, while the DevOps team does not always understand the terminology and concepts of machine learning. If you want to ensure that your ML models are not just theoretically useful but a product that continuously adds value to your business, it is worth investing in a proper monitoring system.


About Jessica Bruce

I'm an eCommerce business expert, influencer, and business advisor. I have also written several posts in the eCommerce niche that have attracted readers.
