What if Machine Learning as a Service Does Not Work?

In our previous article, Data Science Industry Perspectives in the Cloud, we discussed why evolution is key if you plan to grow your business. You can start with ready-made solutions and, in time, switch to in-house ones built by a team of data scientists.

This time we’ll talk about what to do when ML as a Service doesn’t work. When this happens, your company shouldn’t start by hiring a Data Scientist. The best option is to invest in a custom-made solution that solves your urgent business needs. Only once you have a workable solution should you dive deep into Data Science and build a proper team that can create an in-house Data Science solution.

Key Takeaways

Know when you don’t need to hire data scientists and when you should start investing in data science resources.
Start building a comprehensive Data Science product not from research, but from an end-to-end solution that solves business problems.
People envisage a Data Scientist with balanced knowledge across all the relevant subject areas, but in real life you can barely find such an ideal candidate.
Make your Data Scientists successful and productive. Any team that deals with Data Science has to be cross-functional with adjacent roles contributing to the end solution.
Deliver end-to-end solutions that solve business problems rather than research papers.

Custom-made end-to-end solution as a start

Here we are talking about classical Data Science, where you have data and goals and need to build models to solve a pressing issue. The best way to jumpstart the process is to assemble a few ready-made services into a single workable product and show your customer a clear-cut result quickly. This can be done without any complex or far-reaching research. You can then formulate specifications comfortably, taking the customer’s feedback into account, and create a more advanced Data Science product in the long run.

One of the biggest issues for any Data Science project is formulating its specifications. The usual request is: “Create something for my company using Data Science. Analyze data for me.” This type of job involves a lot of trial and error. Having a basic custom end-to-end solution from the start lets you plug in the extra services you need as you go, which makes it much easier to deliver insights and predictions into a specific business workflow.

I believe that you should build your system from the ground up not from Data Science research, but from the point of view of an end-to-end solution. Then, bit by bit, you can take out and swap in the required services.
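The idea of starting end-to-end and swapping services later can be sketched in Python. This is only an illustration under stated assumptions: the churn-style use case, the `Scorer` interface, both scorer classes, the `days_since_last_order` feature, and the 0.5 threshold are all hypothetical and do not come from any specific product.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Protocol


class Scorer(Protocol):
    """Hypothetical interface: any service that maps features to a risk score in [0, 1]."""

    def score(self, features: Dict[str, float]) -> float: ...


class RuleBasedScorer:
    """Stand-in for a ready-made service: a fixed business rule, no model training."""

    def score(self, features: Dict[str, float]) -> float:
        return 1.0 if features["days_since_last_order"] > 30 else 0.0


class GradualScorer:
    """Stand-in for a later in-house model: risk grows linearly up to 60 days."""

    def score(self, features: Dict[str, float]) -> float:
        return min(1.0, features["days_since_last_order"] / 60.0)


def extract_features(raw: dict) -> Dict[str, float]:
    """Turn a raw customer record into the numeric features the scorer expects."""
    return {"days_since_last_order": float(raw["days_since_last_order"])}


@dataclass
class Pipeline:
    """End-to-end flow: raw record -> features -> score -> business decision."""

    extract: Callable[[dict], Dict[str, float]]
    scorer: Scorer
    threshold: float = 0.5  # assumed cut-off, chosen only for illustration

    def run(self, raw: dict) -> str:
        risk = self.scorer.score(self.extract(raw))
        return "contact customer" if risk >= self.threshold else "no action"


# Day one: ship the whole flow with the simplest possible scorer.
pipeline = Pipeline(extract=extract_features, scorer=RuleBasedScorer())
first_decision = pipeline.run({"days_since_last_order": 45})

# Later: swap in a better scorer without touching the rest of the pipeline.
pipeline.scorer = GradualScorer()
second_decision = pipeline.run({"days_since_last_order": 20})
```

Because the pipeline depends only on the small `Scorer` interface, the customer sees a working result from the first version, and the scoring step can be replaced once feedback shows what the model actually needs to do.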

How to do Data Science research?

You start by employing a Data Scientist who matches your company’s needs. You can use the standard Data Scientist chart: this employee should have a firm footing in the application domain, mathematics, and programming. Their business skills are key to success. I believe that an understanding of the subject area matters most here, because we are solving business issues. A Data Scientist who solves more academic problems will be more focused on winning a Kaggle competition than on addressing business needs. It is also important that this person understands the product development cycle, so that they build models and analyse data in a way that can be deployed to production.

For example, if a person is using R (a language and environment popular among Data Scientists), we should note that it is aimed more at research than at production use. As a result, the output of such research often cannot be dropped directly into an end-to-end solution. The Data Scientist should take note of this and pair with a programmer. Although the chart says that a Data Scientist should have hacking skills, in reality this is often not the case: the vast majority of Data Scientists are not able to write production-quality code. If you need not just research but a solution, then process-wise you need a Data Scientist working together with a Data Engineer.

CAP theorem as an analogy

According to the CAP theorem, a distributed data store cannot guarantee all three properties at once: consistency, availability, and partition tolerance. This is a basic rule known to most developers. Something similar is true for a Data Scientist: people envisage a Data Scientist at the overlap of domain knowledge, mathematics, and programming, but in real life you can rarely find such an ideal candidate. Usually, people lean one way or another in their work, and keeping a balance is not always a priority.

One of the solutions that we use ourselves is that a Data Scientist should have business insight, understand the math behind it, and work in tandem with a skilled developer. Of course, a proper Data Scientist should be able to write at least some semblance of code. But a Data Scientist should work with a Data Engineer to write and ship that code, with both of them responsible for the quality and sustainability of the solution.

The classic tragedy of a company that decides to start Data Science research comes when a Data Scientist says: “I’ve got 40000 lines of Python code on my PC, can you make it work in production?” Of course, this is virtually impossible to do, so you end up with research that is simply wasted.
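One way a Data Scientist and a Data Engineer might avoid this outcome is to agree early on a small, testable hand-off format instead of a pile of scripts. The sketch below is an assumption for illustration, not a description of any particular team’s process: the `ModelArtifact` class, its fields, and the linear `predict` function are all hypothetical.

```python
import json
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ModelArtifact:
    """Hypothetical hand-off format: model parameters only, no notebook state."""

    version: str
    intercept: float
    coefficients: Dict[str, float]

    def to_json(self) -> str:
        """Serialize so the research side can hand the model to production."""
        return json.dumps(
            {
                "version": self.version,
                "intercept": self.intercept,
                "coefficients": self.coefficients,
            }
        )

    @staticmethod
    def from_json(payload: str) -> "ModelArtifact":
        """Load the artifact on the production side."""
        data = json.loads(payload)
        return ModelArtifact(
            version=data["version"],
            intercept=float(data["intercept"]),
            coefficients={k: float(v) for k, v in data["coefficients"].items()},
        )


def predict(artifact: ModelArtifact, features: Dict[str, float]) -> float:
    """Linear prediction that fails loudly when an expected feature is missing."""
    missing = sorted(set(artifact.coefficients) - set(features))
    if missing:
        raise ValueError(f"missing features: {missing}")
    return artifact.intercept + sum(
        coef * features[name] for name, coef in artifact.coefficients.items()
    )


# Research side exports; production side loads and predicts.
exported = ModelArtifact("0.1", 1.0, {"x": 2.0}).to_json()
restored = ModelArtifact.from_json(exported)
prediction = predict(restored, {"x": 3.0})
```

The point of the design is that both roles own the same narrow contract: the research can change freely as long as the artifact and `predict` keep passing the shared tests.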

Cross-functional teams

Any team that deals with Data Science has to be cross-functional, i.e. it has to cover the whole stack of the solutions it builds. A normal setup should include a DevOps Engineer, a Data Scientist, a Data Engineer, and a Product Developer who writes the web and/or mobile app. This is a single team that is responsible for the result. Its members work together and solve tasks that are closely interconnected.

All of this means that the whole team is responsible for the business result. This also applies to the intermediate research done by a Data Scientist, which cannot be used in production on its own.

Old-school vs. Vertical teams

To dig deeper, let’s take a classic old-school layered organisation structure, where you have separate departments for Data Scientists, Operations, UI Developers, Big Data, QA Engineers and so on. In this case, every project cuts across most of these teams. The classic problem is that tickets and tasks get thrown from one team to another, and the real business goals are watered down along the way and never solved. So instead of this horizontal division, we divided our teams vertically. This allowed us to create teams that see a clear-cut goal they need to achieve. At the same time, team members can build cross-functional skills and take on more responsibility.

As a result, such teams began to deliver, and Scrum and Agile began to work properly. This is not specific to Data Science, but there is a standard mistake many companies make: their Data Scientists work somewhere at a university and write mostly academic papers. That is a topic for a whole new article, but for now you need to distinguish between a Data Scientist and a Production Data Scientist. Aim to employ the latter within your teams, and do not let a Data Scientist work alone and remotely.

About Maxim Tereschenko
