Essential Data Science Job Skills Every Data Scientist Should Know

Nathan Piccini / 7 min read.
September 27, 2019

What are Data Scientists?

How do you distinguish a genuine data scientist from a dressed-up business analyst, BI professional, or someone in another related role?

Truth be told, the industry does not have a standard definition of a data scientist. You’ve probably heard jokes like “a data scientist is a data analyst living in Silicon Valley.” Just for fun, take a look at the cartoon below.

[Cartoon: Data Scientist living in California]

Finding an effective data scientist is difficult. Finding people in the role of a data scientist can be equally difficult. Note the use of “effective” here. I use this word to highlight the fact that some people possess some of these data science skills yet may not be the best fit for a data science role. The irony is that even the people looking to hire data scientists might not fully understand data science. Some job advertisements still describe traditional data analyst or business analyst roles while labeling them Data Scientist positions.

Instead of giving a list of data science skills with bullet points, I will highlight the difference between some of the data-related roles.

Consider the following scenario:

Shop-Mart and Bulk-Mart are two competitors in the retail setting. Someone high up in the management chain asks this question: How many Shop-Mart customers also go to Bulk-Mart? Replace Shop-Mart and Bulk-Mart with Walmart, Target, Safeway, or any retail outlets that you know of. The question might be of interest to the management of one of these stores or even to a third party. The third party could be a market research or consumer behavior company interested in gathering actionable insights about consumer behavior.
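
To make the question concrete, here is a minimal Python sketch of the raw computation behind it. It assumes that customers can be matched across the two stores by a shared identifier (a loyalty card, for example), which is often the hard part in practice. The customer IDs and the overlap_summary helper are hypothetical and exist only for illustration.

```python
# Hypothetical customer IDs observed at each store (illustrative data only).
shop_mart_customers = {"c01", "c02", "c03", "c04", "c05"}
bulk_mart_customers = {"c03", "c05", "c06", "c07"}


def overlap_summary(store_a, store_b):
    """Return (number of shared customers, their share of store_a's customers)."""
    shared = store_a & store_b  # set intersection: customers seen at both stores
    share = len(shared) / len(store_a) if store_a else 0.0
    return len(shared), share


shared_count, shared_share = overlap_summary(shop_mart_customers, bulk_mart_customers)
print(
    f"{shared_count} of {len(shop_mart_customers)} Shop-Mart customers "
    f"({shared_share:.0%}) also shop at Bulk-Mart"
)
```

Each of the roles described below starts from a number like this one and differs in what it does next.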

How Professionals in Different Data-Related Roles will Approach the Problem

Traditional BI/Reporting Professional: The BI professional generates reports from structured data using SQL and some kind of reporting service (SSRS, for example) and sends the data back to management. Management asks more questions based on the data that was sent, and the cycle continues. Insights about the data are most likely not included in the reports. A person in this role will be experienced mostly in database-related skills.

Data Analyst: In addition to doing what the BI professional does, a data analyst will also keep other factors like seasonality, segmentation, and visualization in mind. What if certain trends in shopping behavior are tied to seasonality? What if the trends are different across gender, demographics, geography, or product category? A data analyst will slice and dice the data to understand and annotate the report. Aside from database skills, a data analyst will have an understanding of some of the common visualization tools.
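
As a rough illustration of that slicing and dicing, the sketch below breaks the overlap question down by season using only the Python standard library. The visit records, the season labels, and the overlap_by_season helper are all hypothetical; a real analysis would more likely use a dataframe library and many more dimensions.

```python
from collections import defaultdict

# Hypothetical visit records: (customer_id, store, season). Illustrative data only.
visits = [
    ("c01", "Shop-Mart", "summer"),
    ("c01", "Bulk-Mart", "summer"),
    ("c02", "Shop-Mart", "summer"),
    ("c03", "Shop-Mart", "winter"),
    ("c03", "Bulk-Mart", "winter"),
    ("c04", "Shop-Mart", "winter"),
    ("c04", "Bulk-Mart", "winter"),
    ("c05", "Bulk-Mart", "winter"),
]


def overlap_by_season(records):
    """Map each season to the share of Shop-Mart customers also seen at Bulk-Mart."""
    stores_seen = defaultdict(set)  # (season, customer) -> stores visited that season
    for customer, store, season in records:
        stores_seen[(season, customer)].add(store)

    shop_counts = defaultdict(int)  # Shop-Mart customers per season
    both_counts = defaultdict(int)  # of those, how many also visited Bulk-Mart
    for (season, _customer), stores in stores_seen.items():
        if "Shop-Mart" in stores:
            shop_counts[season] += 1
            if "Bulk-Mart" in stores:
                both_counts[season] += 1
    return {season: both_counts[season] / shop_counts[season] for season in shop_counts}


season_overlap = overlap_by_season(visits)
for season, share in sorted(season_overlap.items()):
    print(f"{season}: {share:.0%} of Shop-Mart customers also visited Bulk-Mart")
```

In this toy data the overlap is much higher in winter than in summer, which is exactly the kind of pattern a single headline number would hide.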

Business Analyst: A business analyst possesses the skills of both the BI professional and the data analyst, plus domain knowledge and an understanding of the business. A business analyst may also have some basic skills in forecasting.

Data Mining or Big Data Engineer: A data miner does the job of the data analyst, working from unstructured data if needed, and also has MapReduce and other big data skills. This role requires an understanding of the common issues in running jobs on large-scale data and in debugging MapReduce jobs.

Statistician (a Traditional One): A statistician pulls data from a database or obtains it from any of the roles mentioned above and performs statistical analysis. This person ensures the quality of data and correctness of the conclusions by using standard practices like choosing the right sample size, confidence level, level of significance, type of test, and so on.

In the past, statisticians did not traditionally come from a computer science background, which is needed for writing code to implement statistical models. The situation has changed: statistics students now graduate with strong programming skills and a decent foundation in CS. This enables them to perform tasks that previous statisticians were not traditionally trained for.
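
One of those standard practices, choosing a sample size, can be sketched in a few lines. The example below uses the textbook normal-approximation formula for estimating a proportion, n = z^2 * p * (1 - p) / e^2, and assumes simple random sampling from a large population. The function name and the default values are my own choices for illustration, not a prescription.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(margin_of_error, confidence=0.95, expected_proportion=0.5):
    """Smallest n that estimates a proportion within margin_of_error at the given confidence.

    Normal approximation: n = z^2 * p * (1 - p) / e^2.
    Assumes simple random sampling from a large population.
    """
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = (z ** 2) * expected_proportion * (1 - expected_proportion) / margin_of_error ** 2
    return math.ceil(n)


# How many Shop-Mart customers would we need to survey to estimate the share
# that also shops at Bulk-Mart to within 5 percentage points?
n_needed = sample_size_for_proportion(margin_of_error=0.05, confidence=0.95)
print(n_needed)
```

Using expected_proportion=0.5 is the conservative choice, since it maximizes p * (1 - p) and therefore the required sample size.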

Program/Project Manager: The program or project manager looks at all the data provided by the professionals mentioned so far, aligns these findings with the business, and influences the leadership to take appropriate action. This person has strong communication and presentation skills and can influence without authority.

Ironically, a PM is influencing business decisions using the data and insights provided by others. If the person does not have a knack for understanding data, chances are that they will not be able to influence others to make the best decisions.



Putting It All Together

The rise of online services has brought a paradigm shift in the software development life cycle and in how businesses iterate over successive features and products. Having a separate data puller, analyst, statistician, and project manager is just not possible anymore. Now the mantra is: ship, experiment, learn, and adapt; then ship, experiment, and learn again. This situation has resulted in the birth of a new role, the data scientist.

A data scientist should have the skills of all the individuals mentioned so far. In addition to the skills mentioned above, a data scientist should have rapid prototyping and programming, machine learning, visualization, and hacking skills.

Domain Knowledge and Soft Skills Are Equally Important As Technical Skills

The importance of domain knowledge and soft skills, like communication and influencing without authority, is severely underestimated by both hiring managers and aspiring data scientists. Insights without domain knowledge can potentially mislead the consumers of these insights. Correct insights without the ability to influence decision making are just as bad as having no insights.

All of what I have said above is based on my own tenure as a data scientist at a major search engine and later with the advertising platform within the same company. I learned that sometimes the people asking the question may not understand what they want to know. This sounds preposterous, yet it happens way too often. Very often a bozo will start digging into something that is not related to the issue at hand just to prove that he/she is relevant. A data scientist encounters such HiPPOs (Highest Paid Person’s Opinions), which are somewhat unrelated to the problem and are very often a big distraction from it.

A data scientist should possess the right soft skills to manage situations such as people asking irrelevant, distracting questions that are outside the scope of the task at hand. This is hard, especially in situations where the person asking the question is several levels up the corporate ladder and is known to have an ego. It is a data scientist’s responsibility to manage up and around while presenting and communicating insights.

Suggested Skills a Data Scientist Should Possess

Curiosity About Data and Passion For Domain: If you are not passionate about the domain or business, and if you are not curious about data, then it is unlikely that you will succeed in a data scientist role. If you are working as a data scientist with an online retailer, you should be hungry to crunch and munch from the smorgasbord (of data, of course) to know more. If your curiosity does not keep you awake, no skill in the world can help you succeed.

Soft Skills: Communication and influencing without authority are necessary skills. Understand the minimum action that has the maximum impact. Too many findings are as bad as no findings at all. The ability to scoop information out of partners and customers, even from the unwilling ones, is extremely important. The data you are looking for may not be sitting in one single place. You may have to beg, borrow, steal, and do whatever it takes to get the data.

Being a good storyteller is also something that helps. Sometimes the insights obtained from data are counter-intuitive. If you’re not a good storyteller, it will be difficult to convince your audience.

Math/Theory: Machine Learning algorithms, statistics, and probability 101 are fundamental to data science. This includes understanding probability distributions, linear regression, statistical inference, hypothesis testing, and confidence intervals. Learning optimization, such as gradient descent, would be the icing on the cake.
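
To show how two of these topics fit together, here is a small sketch of gradient descent fitting a one-variable linear regression by minimizing mean squared error. The data, the learning rate, and the step count are assumptions picked so that this toy example converges; they are not general-purpose settings.

```python
def fit_line_gradient_descent(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of MSE = (1/n) * sum((w*x + b - y)^2) with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step against the gradient to reduce the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Illustrative data generated from y = 2x + 1 with no noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = fit_line_gradient_descent(xs, ys)
print(f"slope={w:.3f}, intercept={b:.3f}")
```

For a problem this small, the closed-form least-squares solution is simpler. Gradient descent earns its keep when models are too large or too non-linear for a closed form.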

Computer Science/Programming: You should know at least one scripting language or a statistical tool such as R. There are plenty of resources to get started. Data Science Dojo provides numerous free tutorials on getting started with Python and R to go along with its data science bootcamp. You can also learn programming basics from sites like Codecademy and LearnPython.

It’s necessary to possess decent algorithms and data structures skills in order to write code that can analyze a lot of data efficiently. You may not be a production code developer, but you should be able to write decent code.

Database management and SQL skills are also helpful, as the database is where you will be fetching your data from to build models. It also doesn’t hurt to understand Microsoft Excel or other spreadsheet software.
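
As a sketch of the kind of SQL this involves, the example below answers the earlier Shop-Mart and Bulk-Mart question with a single query, run against an in-memory SQLite database from Python. The visits table, its columns, and its rows are hypothetical; a real schema would look different.

```python
import sqlite3

# In-memory database with a hypothetical visits table (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (customer_id TEXT, store TEXT)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?)",
    [
        ("c01", "Shop-Mart"),
        ("c02", "Shop-Mart"),
        ("c03", "Shop-Mart"),
        ("c03", "Bulk-Mart"),
        ("c03", "Shop-Mart"),  # repeat visit; must not be double counted
        ("c04", "Bulk-Mart"),
    ],
)

# Count customers who appear at both stores at least once.
query = """
SELECT COUNT(*)
FROM (
    SELECT customer_id
    FROM visits
    WHERE store IN ('Shop-Mart', 'Bulk-Mart')
    GROUP BY customer_id
    HAVING COUNT(DISTINCT store) = 2
) AS shared
"""
(shared_customers,) = conn.execute(query).fetchone()
conn.close()
print(shared_customers)
```

The COUNT(DISTINCT store) in the HAVING clause is what keeps repeat visits to the same store from being mistaken for visits to both.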

Big Data and Distributed Systems: You need to understand basic MapReduce concepts, Hadoop and the Hadoop Distributed File System (HDFS), and at least one language like Hive or Pig. Some companies have their own proprietary implementations of these languages. Knowledge of tools like Mahout and of any of the XaaS offerings, like Azure and AWS, would be helpful. Once again, big companies have their own XaaS, so you may be working on variants of any of these.
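
If MapReduce is new to you, the basic idea fits in a few lines of plain Python. The sketch below runs the map, shuffle, and reduce phases in a single process to count visits per store. It is only a teaching toy with hypothetical data and function names: it does not use Hadoop or any real cluster API, and a real job would distribute each phase across many machines.

```python
from collections import defaultdict

# Hypothetical visit log: (customer_id, store). Illustrative data only.
records = [
    ("c01", "Shop-Mart"),
    ("c02", "Shop-Mart"),
    ("c02", "Bulk-Mart"),
    ("c03", "Bulk-Mart"),
    ("c01", "Shop-Mart"),
]


def map_phase(record):
    """Emit a (store, 1) pair for every visit record."""
    _customer_id, store = record
    yield store, 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(store, counts):
    """Combine all values for one key into a single result."""
    return store, sum(counts)


mapped = [pair for record in records for pair in map_phase(record)]
grouped = shuffle(mapped)
visits_per_store = dict(reduce_phase(store, counts) for store, counts in grouped.items())
print(visits_per_store)
```

Because each map call looks at one record and each reduce call looks at one key, both phases can be spread across machines without the workers needing to talk to each other.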

Data Visualization: You should be able to create simple yet elegant and meaningful visualizations. Personally, R packages like ggplot2 and lattice have helped me in most cases, but there are other packages that you can use. In some cases, you might want to use D3.

In Summary

There are many skills needed to become a full-fledged data scientist. In reality, a data scientist should be a well-rounded data machine with the skills to take on just about any project. It may take years for you to learn all the concepts, and even longer to master them. Make sure you are able to check off each of the skills listed above, and you’ll be well on your way to data science stardom.


About Nathan Piccini

I'm a marketing manager at Data Science Dojo (DSD) where we teach the fundamentals of data science in 5 days. I went to school at Montana State University and received my bachelor's in business (accounting and marketing).
