Datafloq

Data and Technology Insights

6 Benefits of Data Modeling in the Age of Big Data

Jason Parms / 9 min read.
October 8, 2015

If you work with Big Data, you've probably heard the term data modeling used as a core concept in handling large amounts of data. It's a term you should be familiar with, and in this column, we'll explore what it is and, more critically, why it's important.

Data modeling is a branch of specialization in and of itself, and while this article will introduce you to the basic concepts, the practice and methodologies are mature, well-developed technologies and processes that are generally executed by experts in the field.

There are a number of tools that can assist in data modeling, and we would encourage you to explore these options. However, the goal of this column is not to compare data modeling tools but to give you the concepts and vocabulary to understand the process.

What is Data Modeling?

Data modeling is a process that will help you make sense of your data by defining and categorizing it, and establishing standard definitions and descriptors so that your data can be consumed by all information systems in your organization.

There are two primary reasons for performing data modeling.

  1. There is strategic data modeling, which you might do as part of creating an overall information systems strategy.
  2. You also may need to do data modeling during systems or data analysis as part of the development of new databases.

Generally, data modeling for strategic planning means determining what kind of data you will need for your business processes, while modeling in the context of analysis is more focused on describing data that exists and finding ways to categorize it.

In the case of Big Data, that process probably requires finding similarities between data from disparate sources, and confirming that they in fact describe the same thing. In either case, the end goal is to create a representation of your data that can be replicated in your database architecture.


How is Data Modeling Done?

In order to build a database that accurately classifies your data, it is important to have an in-depth understanding of your data types and descriptors, particularly when the data is drawn from multiple sources. This is how you arrive at interoperability in an environment that involves many systems and many sources of data.

In arriving at a data model that can be used for architecting your database, you must first choose a methodology; then you typically will produce a series of data models that lead from business-oriented requirements to technical requirements.

Two common methodologies are used in data modeling, and before beginning the process, your data modeling team will decide on which method is more appropriate, based on your existing data, your business goals, and the architectural requirements of the database.

  • Bottom-up models are more appropriate when re-engineering your database than when you are developing a new strategic approach. These models usually start with existing data structures, whether from an existing database, or from forms, reports, spreadsheets, or other existing material. Data may be coming from proprietary systems and data tools that probably were not originally engineered with the idea of data-sharing or Big Data analytics in mind. This is often what data engineers refer to as siloed data. Your existing data structures are foundational, and you build your data model on that existing foundation.
  • Top-down models can be used for strategic data modeling, without having to reference previously-existing systems. Data experts and subject-matter experts can work together to define the business needs of the organization and create logical data models that support those needs.
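
As a rough illustration of the bottom-up starting point, the sketch below profiles an existing spreadsheet-style export to propose candidate attributes and types. The sample data, the column names, and the helper names infer_type and profile_export are all hypothetical, and the type inference is deliberately crude; it only shows the idea of letting existing structures seed the model.

```python
import csv
import io

# Hypothetical export from a legacy system; column names are illustrative only.
LEGACY_EXPORT = """cust_name,email,units,order_date
Alice,alice@example.com,3,2015-01-05
Bob,bob@example.com,,2015-02-10
"""


def infer_type(values):
    """Guess a candidate type from the non-empty values of one column."""
    non_empty = [v for v in values if v != ""]
    if non_empty and all(v.lstrip("-").isdigit() for v in non_empty):
        return "integer"
    return "text"


def profile_export(csv_text):
    """Return {column: {"type": ..., "nullable": ...}} for an existing export."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    result = {}
    for column in rows[0].keys():
        values = [row[column] for row in rows]
        result[column] = {
            "type": infer_type(values),
            "nullable": any(v == "" for v in values),
        }
    return result


candidate_attributes = profile_export(LEGACY_EXPORT)
for name, info in candidate_attributes.items():
    print(name, info)
```

A profile like this is only a draft: in a mixed-method approach, the business stakeholders would still review and rename the proposed attributes.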

Most often, data models must be created by mixing the two methods: by integrating the defined organizational needs with existing data structures, some of which are application-specific and which must be extended to encompass additional information systems. This could be viewed as a third methodology, which we will call the mixed-method approach.

Conceptual, Logical and Physical Models

Once a methodology has been determined, there are three key activities that will be performed, with three distinct types of outputs from each activity. Each output is a type of data model, and the process can be seen as a progression from a conceptual data model, to a logical data model, and finally the physical data model that is used to architect your database. These three types of data models were formally described by ANSI in 1975, and remain relevant, useful ways to describe the data-modeling process.

The Conceptual Data Model.

Your team will analyze the data requirements that support business processes and workflows. In this step, input is gathered from business stakeholders as well as from data experts, and the data is conceptualized non-technically; that is, in the context of business processes rather than database architecture. In this model, the semantics of your schema and the scope of the model are defined. If you consider the area of interest of a business as a domain, the conceptual data model describes areas of interest in the domain, relationships and associations between data entities within the domain, and classes of entities.

To use a simple example, if the domain is sales, customer information is an area of interest that will be consumed by the sales department. Entities within that domain may include contact information (addresses, emails, phone numbers) as well as personal information (birthdays, family members) and purchasing information (size and dates of orders). Included in the conceptual data model are the needs of the business stakeholders: what kind of queries need to be run against the data, what reports may be needed, and so on.
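
The sales example above can be sketched in code as a set of plain entity classes. This is only an illustration of the idea, not a prescribed notation: the class names (Customer, ContactInfo, Order), the field names, and the total_units helper are assumptions made for this sketch, and a conceptual model is normally drawn as a diagram rather than written as code.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class ContactInfo:
    """Contact information: address, email, phone number."""
    address: str
    email: str
    phone: str


@dataclass
class Order:
    """Purchasing information: date and size of an order."""
    order_date: date
    size: int  # number of units ordered (the unit is an assumption)


@dataclass
class Customer:
    """The customer entity and its associations within the sales domain."""
    name: str
    contact: ContactInfo
    birthday: Optional[date] = None
    family_members: List[str] = field(default_factory=list)
    orders: List[Order] = field(default_factory=list)


def total_units(customer: Customer) -> int:
    """A sample stakeholder question: how much has this customer ordered?"""
    return sum(order.size for order in customer.orders)


# Made-up sample data for illustration.
alice = Customer(
    name="Alice Example",
    contact=ContactInfo("1 Main St", "alice@example.com", "555-0100"),
    orders=[Order(date(2015, 1, 5), 3), Order(date(2015, 2, 9), 2)],
)
print(total_units(alice))  # prints 5
```

Note that nothing here mentions tables, keys, or storage; at this stage the model only names the entities, their attributes, and the questions the business wants answered.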


The Logical Data Model.

Your technical experts describe the data for your information systems in a standard way that supports those business processes and that can be used in your database architecture. This is the logical data model, arrived at by taking the conceptual data model and applying some technical definitions to it. It is no longer only a description of the data; it is now also a description of the structure of the data. At this point, primary keys are defined, XML entities and classes are described, and all attributes are elaborated. Relationships between data are defined and resolved, and normalization of data from different systems occurs at this point.
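
To make the normalization and key-definition step more concrete, here is a minimal sketch that splits flat, repetitive records into two entities linked by a key. The record layout, the choice of email as the natural identifier, and the normalize function are all assumptions for illustration; real normalization decisions depend on your own data.

```python
# Hypothetical flat records, as they might arrive from a source system.
flat_rows = [
    {"customer_email": "alice@example.com", "customer_name": "Alice",
     "order_date": "2015-01-05", "units": 3},
    {"customer_email": "alice@example.com", "customer_name": "Alice",
     "order_date": "2015-02-09", "units": 2},
    {"customer_email": "bob@example.com", "customer_name": "Bob",
     "order_date": "2015-02-10", "units": 1},
]


def normalize(rows):
    """Split flat rows into customers and orders, each with a primary key."""
    customers = {}  # email -> customer record (email assumed unique per customer)
    orders = []
    for row in rows:
        email = row["customer_email"]
        if email not in customers:
            customers[email] = {
                "customer_id": len(customers) + 1,  # surrogate primary key
                "name": row["customer_name"],
                "email": email,
            }
        orders.append({
            "order_id": len(orders) + 1,  # surrogate primary key
            "customer_id": customers[email]["customer_id"],  # relationship
            "order_date": row["order_date"],
            "units": row["units"],
        })
    return list(customers.values()), orders


customers, orders = normalize(flat_rows)
print(len(customers), len(orders))  # prints 2 3
```

Each customer is now described once, and every order refers to its customer by key instead of repeating the customer's name and email.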

The Physical Data Model.

The final stage of data modeling is to produce the physical data model, which describes the physical means used to store and retrieve data. Tables and columns are specified, XML tags are converted into column headings, and relationships are converted into foreign keys. The physical data model tries to adhere to the logical data model as closely as possible, but database architecture may require adjustment in the logic.

It is important to note that relational databases are structurally different from NoSQL databases, so the physical data model may require considerable adaptation depending on which type you choose. In either case, the physical data model is then architected into the database using a data definition language (DDL). The database, after all, is the usual intended outcome of data modeling.
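
As a small sketch of what this stage can look like for a relational database, the example below uses Python's built-in sqlite3 module to run DDL that turns two entities into tables, columns, and a foreign key. The table and column names (customer, customer_order, and so on) are hypothetical, and SQLite is used here only because it needs no setup.

```python
import sqlite3

# Hypothetical physical model: tables, columns, and a foreign key.
DDL = """
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    email       TEXT NOT NULL UNIQUE
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    order_date  TEXT NOT NULL,
    units       INTEGER NOT NULL CHECK (units > 0)
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(DDL)

conn.execute(
    "INSERT INTO customer (customer_id, name, email) VALUES (1, 'Alice', 'alice@example.com')"
)
conn.execute(
    "INSERT INTO customer_order (customer_id, order_date, units) VALUES (1, '2015-01-05', 3)"
)

# An order that points at a customer that does not exist should be rejected.
try:
    conn.execute(
        "INSERT INTO customer_order (customer_id, order_date, units) VALUES (99, '2015-01-06', 1)"
    )
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True

print(fk_enforced)
```

The relationship defined in the logical model is now something the database itself enforces, rather than a convention that each application has to remember.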

While the database is the ultimate deliverable for data modeling, effective modeling cannot be done independently of business process modeling, as the goal of data storage, retrieval, and manipulation is to support business processes. The process, therefore, must include an analysis not only of the physical structure and attributes of the database, but of the usage requirements as well. Not only must the data model include detailed attributes for every entity within it, it must also elaborate the forms and queries that will be used to interact with the database to supply business stakeholders with the information described in their conceptual requirements.

A data model can be thought of as a diagram that explains the relationships between data, derived by analyzing the data requirements needed to support business processes. Security requirements belong in that analysis too. Banks and insurance companies, for example, need to send files to other companies without exposing them to third parties. These files generally include payment details, credit card details, or social security numbers. To transfer such files smoothly and safely, encryption can be used to protect the data files from interception by a third party.

The most common challenge businesses face when implementing a data model is a failure to capture the business requirements at the beginning of the process. Business stakeholders rarely can provide requirements in technical language, and it takes a skilled analyst to know which questions to ask and how to ask them in order to produce requirements that are meaningful to database architects.

Besides requiring expertise, time is often a factor; it's important to allow enough time for requirements to be thoroughly explained and vetted before moving on to database design steps. Describing data is an iterative process, in which a data modeler takes information provided by business stakeholders, creates the name and a detailed definition of each entity and its attributes, and returns it to the business for revision and approval. Names and definitions must be meaningful to all parties concerned.

Benefits of Data Modeling

To Manage Data as a Resource

Data modeling allows you to normalize your data and to define it in terms of what it is and what attributes it can possess. Data modeling also gives you the tools to query the database and derive reports from it. Without a good data model, you can find yourself in possession of a great deal of data with no efficient way, or no way at all, to make use of it. With a good data model and a well-designed database, business users can have access to information that perhaps they didn't even realize was being collected.

To Integrate Existing Information Systems

Many businesses find themselves in the position of having data in a variety of systems that do not communicate with each other. By modeling the data in each of these systems, you can see relationships and redundancies, resolve discrepancies, and integrate disparate systems so they can work together.
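
A minimal sketch of that integration step is shown below: two systems describe the same customers under different field names, and a shared model lets you line them up and spot discrepancies. The system names (CRM, billing), the field mappings, the sample records, and the helper functions are all hypothetical; matching on a case-insensitive email address is an assumption that would need to be validated against real data.

```python
# Hypothetical mappings from each system's field names to the shared model.
CRM_TO_CANONICAL = {"email_addr": "email", "full_name": "name", "tel": "phone"}
BILLING_TO_CANONICAL = {"EmailAddress": "email", "CustomerName": "name", "Phone": "phone"}


def to_canonical(record, mapping):
    """Rename a source record's fields to the canonical names."""
    return {mapping[k]: v.strip() for k, v in record.items() if k in mapping}


def find_discrepancies(a, b, key="email"):
    """Return {key_value: {field: (a_value, b_value)}} for records found in both."""
    b_by_key = {rec[key].lower(): rec for rec in b}
    result = {}
    for rec in a:
        other = b_by_key.get(rec[key].lower())
        if other is None:
            continue  # customer exists in only one system
        diffs = {
            f: (rec[f], other[f])
            for f in rec
            if f != key and f in other and rec[f] != other[f]
        }
        if diffs:
            result[rec[key].lower()] = diffs
    return result


# Made-up sample records from the two systems.
crm = [{"email_addr": "alice@example.com", "full_name": "Alice Example", "tel": "555-0100"}]
billing = [
    {"EmailAddress": "Alice@Example.com", "CustomerName": "Alice Example", "Phone": "555-0199"},
    {"EmailAddress": "bob@example.com", "CustomerName": "Bob Example", "Phone": "555-0111"},
]

crm_records = [to_canonical(r, CRM_TO_CANONICAL) for r in crm]
billing_records = [to_canonical(r, BILLING_TO_CANONICAL) for r in billing]
discrepancies = find_discrepancies(crm_records, billing_records)
print(discrepancies)
```

Here the two systems agree on the customer's name but disagree on the phone number, which is exactly the kind of discrepancy the modeling exercise is meant to surface.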

To Design Databases and Repositories

Modeling your data is critical in designing a well-functioning database; as we have discussed, that is usually the primary outcome of embarking on a data-modeling project. However, by modeling your data, you can also drive better decisions about data warehousing and repositories. Having a clear view of your data can tell you whether you need a global warehouse, an independent data mart, or a series of interconnected data marts. It can help you decide whether you need a relational database or a NoSQL database. Describing your data is the best way to understand what your business needs in terms of data storage and service.

Understanding the Business

The process of data modeling requires you and your teams to understand in detail how the business works in order to define the data that drives it. In order to build a customer database, for instance, you need to understand what data is gathered on customers and how it is used. The data and relationships represented in a data model provide a foundation on which to build an understanding of business processes.

Business Intelligence

If your requirements gathering was complete and included merging of data from multiple sources, as well as query and reporting requirements, you have business intelligence opportunities that were nonexistent when your data existed in silos or in haphazardly designed databases. Using proper modeling and reporting, you can spot business trends and spending patterns, and make predictions that will help your business navigate challenges and opportunities.

Knowledge Transfer

Data modeling is a form of documentation, both for business stakeholders and technical experts. Starting with a common vocabulary that different job roles can share, and continuing on to a well-thought-out business glossary for newcomers, your ability to document and convey information about your business is greatly enhanced. As a training aid, a data dictionary built from a well-executed data modeling exercise can be irreplaceable.
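
One way to keep such a data dictionary close to the model is to store each attribute's definition alongside the attribute and generate the dictionary from it. The sketch below does this with Python dataclass metadata; the Customer entity, its definitions, and the data_dictionary helper are hypothetical examples, not a standard format.

```python
from dataclasses import dataclass, field, fields


@dataclass
class Customer:
    """Hypothetical entity; each attribute carries its agreed business definition."""
    customer_id: int = field(
        metadata={"definition": "Unique identifier assigned to each customer."}
    )
    email: str = field(
        metadata={"definition": "Primary email address used to contact the customer."}
    )
    birthday: str = field(
        default="",
        metadata={"definition": "Customer's date of birth in ISO format (YYYY-MM-DD)."},
    )


def data_dictionary(entity):
    """Build one dictionary row per attribute of the given dataclass."""
    return [
        {
            "entity": entity.__name__,
            "attribute": f.name,
            "type": getattr(f.type, "__name__", str(f.type)),
            "definition": f.metadata.get("definition", ""),
        }
        for f in fields(entity)
    ]


rows = data_dictionary(Customer)
for row in rows:
    print(f"{row['entity']}.{row['attribute']} ({row['type']}): {row['definition']}")
```

Because the definitions live next to the attributes, the glossary handed to newcomers stays in step with the model as it is revised.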

Data modeling may sound like nerd vocabulary, but if you are dealing with Big Data in any meaningful way, it is a subject you want and need to be literate in.

Image: Clarkson.edu


About Jason Parms

Jason Parms is customer service manager at SSL2BUY Inc. He helps build customer service policy, manages staff, and handles one-on-one inquiries. His key responsibilities include suggesting the right products and services, investigating problems and providing solutions, recruiting and assessing staff, handling major incidents such as security issues, developing procedures for collecting customer feedback and complaints quickly, and training the team to deliver a high standard of support.
