Datafloq

Consider this: Big Data and the Analytics Data Store

Martyn Jones / 8 min read.
January 21, 2015

To begin at the beginning

Hold this thought: If Data Warehousing was Tesco, then Big Data would be the "try something different".

Since the publication of the article "Aligning Big Data", which laid out a draft view of the DW 3.0 Information Supply Framework and placed Big Data within a larger framework, I have been asked on a number of occasions to go into a little more detail with regard to the Analytics Data Store (ADS) component. This is an initial response to those requests.

To recap, the overall architecture consists of three major components: Data Sources, Core Data Warehousing and Core Statistics.

Data Sources: This element covers all the current sources, varieties and volumes of data available which may be used to support processes of challenge identification, option definition and decision making, including statistical analysis and scenario generation.

Core Data Warehousing: This is a suggested evolution path of the DW 2.0 model. It faithfully extends the Inmon paradigm to include not only unstructured and complex data but also the information and outcomes derived from statistical analysis performed outside of the Core Data Warehousing landscape.

Core Statistics: This element covers the core body of statistical competence, especially but not only with regard to evolving data volumes, data velocity, data quality and data variety.

Fig. 1: The three components of the Information Supply Framework

This piece will focus on the Core Statistics segment and in particular the Analytics Data Store, which is specifically designed to support professional statistical analysis and at the same time to support the speculative use of data.

Fig. 2: Core Statistics and the Analytics Data Store

The Analytics Data Store

Daniel Keys Moran once stated that "You can have data without information, but you cannot have information without data." Well, we'll deal with that nonsense at another time.

The Analytics Data Store is the reference data store collection for the entire Core Statistics segment.

The following is a high-level diagram of the Analytics Data Store, together with some of its major optional features:

Fig. 3: Inside the Analytics Data Store

Operating System Platform: Typically the operating system platform will be a flavour of UNIX (Linux or some other flavour).

The standard UNIX distributions can support parallel file manipulation commands for mapping and reducing data in files that can, theoretically, be in the order of zebibytes.

Additionally, the Hadoop Distributed File System (HDFS) can be overlaid on the UNIX platform, leveraging the underlying UNIX primitives to give it access to and control over the underlying devices, whether that device is a file, disk, cluster, node or anything else. Files held in HDFS cannot, however, be manipulated using regular UNIX primitives unless something like FUSE is used.
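To make the "mapping and reducing" idea a little more concrete, here is a minimal sketch in Python. It is not how UNIX or Hadoop does it; it simply assumes that a large file has already been split into chunks of lines, counts words in each chunk independently (the map step) and then merges the partial counts (the reduce step). The function names and the sample data are hypothetical, and a thread pool stands in for real parallel workers.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def map_chunk(lines):
    """Map step: count the words in one chunk of lines."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def reduce_counts(left, right):
    """Reduce step: merge two partial word counts into one."""
    return left + right


def word_count(chunks, workers=4):
    """Map every chunk concurrently, then fold the partial results together."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_chunk, chunks))
    return reduce(reduce_counts, partials, Counter())


# Hypothetical sample data standing in for chunks of a much larger file.
chunks = [
    ["big data and analytics", "data store"],
    ["analytics data store", "big data"],
]
totals = word_count(chunks)
print(totals.most_common(3))
```

The point of the design is that each chunk can be processed without knowledge of any other chunk, so the map step can be spread over as many workers (or machines) as are available.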

Hadoop: Hadoop is an open-source software framework (an organised collection of code) for the distributed storage and processing of data sets on clusters of commodity computer hardware. The modules in Hadoop are designed with the idea that hardware failures are commonplace and should be handled automatically by the software. This is not, however, unique to Hadoop, as there are UNIX distributions that also fulfil these functions, and then some. Nevertheless, the attraction of open source software running on commodity hardware cannot be dismissed lightly.

Relational DBMS: This is the database model that most people who know anything about databases are familiar with. An RDBMS is based on the relational data model, which provides an uncomplicated view of data to all users by representing data in two-dimensional tables of rows and columns. These tables are called relational tables. A relational database is a collection of relational tables, and the RDBMS is the data manager for relational databases.

Relational DBMS users interact with the databases using Structured Query Language (SQL), the industry-standard relational database management language, typically with some vendor extensions.
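As a small illustration of tables, rows, columns and SQL, the following sketch uses the sqlite3 module from the Python standard library. The table name, column names and figures are invented purely for the example.

```python
import sqlite3

# In-memory database; table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales (region, product, amount) VALUES (?, ?, ?)",
    [
        ("North", "Widget", 120.0),
        ("North", "Gadget", 80.0),
        ("South", "Widget", 200.0),
    ],
)

# A standard SQL aggregate: total sales per region.
rows = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(rows)
conn.close()
```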

Document DBMS: This is a class of database management system oriented towards the management of unstructured, semi-structured and complexly structured documents, primarily digital textual documents. Examples of what might be labelled Document-oriented DBMS include Documentum EDMS and MongoDB.
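A rough feel for the document model can be had with nothing more than Python dictionaries. The sketch below is not the API of MongoDB or of any other product; the `find` helper and the sample documents are hypothetical, and only serve to show that each document carries its own structure and can still be queried by field.

```python
import json

# Each document is self-describing and may have its own set of fields.
documents = [
    {"_id": 1, "title": "Design notes", "tags": ["big data", "dw"]},
    {"_id": 2, "title": "Quarterly report", "author": {"name": "A. N. Other"}},
    {"_id": 3, "title": "Meeting notes", "tags": ["dw"], "pages": 4},
]


def find(docs, field, value):
    """Return documents whose field equals value or, for lists, contains it."""
    matches = []
    for doc in docs:
        stored = doc.get(field)
        if stored == value or (isinstance(stored, list) and value in stored):
            matches.append(doc)
    return matches


dw_docs = find(documents, "tags", "dw")
print(json.dumps(dw_docs, indent=2))
```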

Graph DBMS: Also known as Semantic Data Model Databases (back in the day). According to Wikipedia, a graph database is "a database that uses graph structures for semantic queries with nodes, edges, and properties to represent and store data". One of the features of some of the early Graph DBMS (my first contact with this technology was at Unisys in the late eighties, with a product called InfoExec) was that the query languages allowed structured queries to be stated in more business-like terms.
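The nodes, edges and properties of that definition can be sketched in a few lines of Python. This is an in-memory toy, not the interface of any graph DBMS; the node identifiers, relation names and helper functions are all assumptions made for the example.

```python
# Nodes and edges both carry properties; all names here are hypothetical.
nodes = {
    "alice": {"label": "Person", "name": "Alice"},
    "bob": {"label": "Person", "name": "Bob"},
    "acme": {"label": "Company", "name": "Acme"},
}
edges = [
    ("alice", "WORKS_FOR", "acme", {"since": 2010}),
    ("bob", "WORKS_FOR", "acme", {"since": 2013}),
    ("alice", "KNOWS", "bob", {}),
]


def neighbours(node_id, relation):
    """Follow outgoing edges of one relation type from a node."""
    return [target for source, rel, target, _ in edges
            if source == node_id and rel == relation]


def employees_of(company_id):
    """Answer 'who works for this company?' by following edges backwards."""
    return sorted(nodes[source]["name"] for source, rel, target, _ in edges
                  if rel == "WORKS_FOR" and target == company_id)


print(employees_of("acme"))
print(neighbours("alice", "KNOWS"))
```

A question such as "who works for Acme?" maps directly onto following a named relationship, which is roughly what is meant by queries stated in more business-like terms.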

Key-Value DBMS: One can either view this type of database as an innovative reuse of the design of simple programmatic collections (trust Microsoft to be the only ones to name a simple thing with a simple name) used to structure data, then applied to the realm of database management, or as a mental aberration invented by bodgers and hackers. At the end of the day, a Key-Value DBMS simply provides a means of persisting what would otherwise be in-memory associative arrays on disk. If there is more to it than that, then please let me know.
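The "associative array on disk" view can be shown with the shelve module from the Python standard library, which behaves like a dictionary whose contents are persisted to a file. This is only a sketch of the idea, not a stand-in for a production key-value store; the keys and values are made up.

```python
import os
import shelve
import tempfile

# A temporary directory keeps the example self-contained; the file name is arbitrary.
with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "kv_store")

    # Write: the shelf behaves like a dict, but its contents live on disk.
    with shelve.open(path) as store:
        store["customer:42"] = {"name": "Example Ltd", "segment": "retail"}
        store["customer:43"] = {"name": "Sample plc", "segment": "banking"}

    # Read back in a separate session, looking values up by key only.
    with shelve.open(path) as store:
        fetched = store["customer:42"]
        key_count = len(store)

print(fetched, key_count)
```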

Object DBMS: An object-oriented database management system stores information in the form of objects, as used in object-oriented programming.

Object-relational databases are a hybrid of both the object oriented and relational approaches. I have found use for object-relational in operational applications, but never in MIS reporting, OLAP, Data Warehousing, Business Intelligence or Statistics. Does anyone have an alternative perspective?

Column Oriented DBMS: This refers to how data is stored. Typically we view data as being stored in rows or records, but it's not the only way of storing data.

A Column Oriented DBMS stores data first by the values in columns, hence the name.

Examples of this type of database implementation range from Apache HBase, a distributed NoSQL column-oriented store built on top of HDFS, to EXASOL, billed at the time of writing as the world's fastest in-memory database management system.
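The difference between row and column storage is easiest to see side by side. The following sketch pivots a handful of invented records from a row layout into a column layout; it says nothing about how HBase or EXASOL actually organise data on disk, and the `to_columns` helper is hypothetical.

```python
# The same three records, first row by row, then column by column.
rows = [
    {"region": "North", "product": "Widget", "amount": 120.0},
    {"region": "North", "product": "Gadget", "amount": 80.0},
    {"region": "South", "product": "Widget", "amount": 200.0},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# An aggregate over one attribute only has to read that attribute's values.
total_amount = sum(columns["amount"])
print(columns["region"], total_amount)
```

Analytical queries tend to touch a few attributes across many records, which is why keeping each attribute's values together can pay off.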

As you can see, the Analytics Data Store is fast becoming a super-fantastic mix of artefacts, gadgets and toys which should satisfy everyone: from the most experienced and knowledgeable statisticians, by way of the data creatives, the data scientists and the data users, to the most game-oriented of data plumbers and punters.

The ADS is above all about quality over quantity, the "now" over the "mañana", and the "just do it" over the "can we?".

But also remember these words from Colin Powell: "Experts often possess more data than judgment." So, be forewarned and forearmed.

Using the Analytics Data Store

What are the applications that the Analytics Data Store might be used to support?

Here is a non-exhaustive list (first described in the mid-eighties) of the potential applications:

Interpretation: Inferring situation descriptions from the analysis of a variety of data.

Prediction: Inferring likely consequences based on situational data.

Diagnosis: Inferring deviations and malfunctions from observable data.

Design: Analysing data and configuring objects under constraints.

Planning: Designing actions based on data feedback and analysis.

Monitoring: Comparing observations to known plan vulnerabilities.

Debugging: Prescribing remedies for malfunctions based on the analysis of data.

Repair: Devising and executing a plan to administer a prescribed remedy.

Instruction: Diagnosing, debugging and repairing behavioural patterns captured in data.

Control: Interpreting, predicting, repairing and monitoring systems behaviour.

Given the availability and quality of data to support the activities listed above, the Analytics Data Store can provide a sound source of data for a wide range of statistical analysis, forensic and speculative activities.

The Analytics Data Store is developed iteratively to support the data needs of a range of activities, from mainstream statistical analysis and formal data mining to creative and eclectic exercises in speculative analytics and non-traditional data correlation. This will ensure that business value can be assessed sooner rather than later.

The Analytics Data Store is essentially agnostic about technology implementation, and it has a clear mission and clear business objectives within an overall Information Supply Framework.

Technology products are chosen on best-fit criteria. The use of technology should therefore not be driven by the old commercial approach of "solutions in search of problems", which has failed so miserably time after time, but by the question: what are the most appropriate artefacts, resources and technologies to use in approaching this problem or testing this hypothesis?

That's all folks

The description of the Analytics Data Store has been necessarily terse. But I hope that it gives a flavour of where the ADS fits into an overall Information Supply Framework that extends the enterprise Data Warehousing paradigm (DW 2.0) without disrupting business as usual or destructively distorting the purpose, architecture and management principles of Data Warehousing.

The Analytics Data Store and the much broader DW 3.0 Information Supply Framework are also aligned with the much longer term objective of addressing the knowledge, information and data needs of organisations.

What follows is a highly synthesised view of the long term, which I will leave for now without further comment.

Fig. 4: The Iniciativa Knowledge Management Pyramid

Thank you so much for reading.


About Martyn Jones

Martyn's range of knowledge, skills and experience spans executive management, organisational strategy, strategic business performance and information management, leadership, business analysis, business and data architectures, data management, and executive and team coaching.

Martyn has worked with and advised many of the world's best-known organisations including Adidas, Banco Santander, Bank of China, BBVA, Boston Consulting Group, British Telecom, La Caixa, Central Statistical Office (UK), Central Statistical Office of Poland, Citco, Citigroup, Credit Suisse, E.On, Eroski, European Union, Fnac, France Telecom, Hewlett Packard, Iberdrola, IBM, Iberia, Infineon, T rkiye ' , Metropolitan Police, Movistar, NCR, National Health Service (UK), Office of the Governor - State of California, Oracle, The Home Office (UK), Rolls-Royce Marine Power Operations, the Royal Navy, Shell, Swiss Life, TSB, UBS, Unisys, the United Nations and Xerox, among many others.

He currently focuses on helping clients to:

- Create relevant, understandable and actionable information
- Plan, manage, design, develop and deliver information supply frameworks for the timely, appropriate and adequate supply of information
- Design, develop and deliver beneficial, tangible and usable strategic performance and information frameworks
- Design, develop and deliver relevant and coherent performance models, indicators and metrics
- Plan, manage, design, develop and deliver information and data analytic strategies
- Design, develop and deliver management informational insight and dynamic feedback solutions
- Coach teams in measuring and managing performance
- Align people, competencies, processes and practices with strategy
- Prepare clients for the next big thing in Information Management and Analytics
- Help IT suppliers to better align with the needs and nature of clients and prospects
- Help clients capitalise on tangible benefits derived from advanced information architectures and management
