Preparing for Disaster: The Importance of Big Data Disaster Recovery

Limor Wainstein / 4 min read.
August 22, 2018

Disaster recovery encompasses the procedures, policies, and techniques aimed at ensuring the swift and smooth recovery of organizational data and infrastructure after a disaster, whether human-caused or natural.

The best way to gain a true understanding of the importance of disaster recovery is to read about some real-life stories in which the lack of a clear strategy and regular testing caused serious problems:

  • In May 2017, an IT worker accidentally switched off British Airways’ power supply, causing a prolonged outage. The outage resulted in the cancellation of at least 800 flights and compensation costs of over $70 million.

  • Pixar almost lost all of its files for hit movie Toy Story 2 when an employee accidentally deleted them, and the company realized its backups hadn’t functioned for two months because they hadn’t been properly tested. The only reason the project was retrieved was that an employee had taken home a copy of the movie’s files completely by chance the previous week.

Cloud computing has made disaster recovery more affordable and accessible for businesses with the availability of disaster recovery-as-a-service. Major offerings include Azure Site Recovery, IBM Resilience Services, and Veritas. Other service providers, such as N2WS, recognize that cloud environments can also be vulnerable, and they offer a service that backs up cloud-based workloads with disaster recovery solutions for AWS and more. After all, Amazon Web Services (AWS) and other cloud service providers have experienced major outages before.

Find out in this post why Big Data disaster recovery is becoming increasingly important for businesses.

How Disaster Recovery Is Relevant For Big Data

Big Data analytics tools, software, and workloads are now mission-critical for enterprises looking to derive insights and a competitive edge from the large stores of information that inundate their systems daily. As far back as 2013, 70 percent of IT executives deemed Big Data mission-critical.

Businesses typically use a range of tools to analyze and process Big Data, such as Apache Spark, Hadoop, Kafka, and Apache Storm. The mission-critical nature of these workloads means it's vital to include Big Data environments in any disaster recovery plan.

Since what qualifies as Big Data is often high-velocity data gathered in real time (IoT sensor readings, web clicks), the speedy restoration of environments after a disaster is arguably even more important for this type of data. Lose an hour of streaming information and you might be okay; lose half a day or more and you'll likely miss out on critical business insights.

Disaster recovery provides insurance against unexpected outages, and that insurance is invaluable for the data-driven businesses of the modern world.

Big Data Disaster Recovery Tips

Establish and Agree Upon An RPO

In the context of disaster recovery, the recovery point objective (RPO) is the maximum acceptable amount of data loss after an unplanned outage, expressed as an interval of time. The RPO is crucial because it determines the frequency at which you need to back up your data.



There is often confusion about what the RPO actually means, mostly because definitions vary considerably. To clarify your RPO, ask yourself how many hours of data loss you can afford if your Big Data systems go down. If your RPO is two hours of data, then you perform backups every two hours.
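To make this concrete, below is a minimal sketch in Python of an RPO compliance check; the two-hour figure, names, and timestamps are illustrative assumptions, not part of any particular tool.

    from datetime import datetime, timedelta, timezone
    from typing import Optional

    RPO = timedelta(hours=2)  # hypothetical agreed maximum acceptable data loss

    def rpo_at_risk(last_backup: datetime, now: Optional[datetime] = None) -> bool:
        """Return True if the newest backup is already older than the RPO,
        meaning a failure right now would lose more data than agreed."""
        now = now or datetime.now(timezone.utc)
        return now - last_backup > RPO

    # Illustrative check: with a 2-hour RPO, a 3-hour-old backup is a breach.
    last_backup = datetime(2018, 8, 22, 6, 0, tzinfo=timezone.utc)
    if rpo_at_risk(last_backup, now=datetime(2018, 8, 22, 9, 0, tzinfo=timezone.utc)):
        print("WARNING: backup age exceeds the 2-hour RPO")

A check like this can run alongside the backup scheduler itself, so a silently failing backup job (as in the Pixar story) is caught before a disaster exposes it.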

Make sure all key stakeholders, including IT executives, management, and IT operators, are clear on what the RPO means and agree on the same figure.

Conduct Regular Disaster Recovery Tests

You can't have confidence in any disaster recovery plan without testing it regularly. The Pixar story mentioned earlier in this article is a case in point: the team neglected to test their backup procedures, which form a pivotal part of any disaster recovery strategy, and they almost paid a high price.

Only you can decide on a testing frequency that works for your business, but a safe approach is to test newly implemented disaster recovery plans at least semi-annually and then yearly going forward.

The tests you conduct should verify that your disaster recovery procedures and strategies can restore Big Data workloads within your pre-defined RPO and RTO (recovery time objective). Your RTO is the target maximum time for resuming your Big Data workloads. The mission-critical nature of Big Data points to a short RTO, and the shorter the RTO, the more resources are required to achieve it.
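As a minimal sketch of what a timed recovery test might look like, assuming Python and a placeholder restore routine (the four-hour RTO and the function names are illustrative):

    import time

    RTO_SECONDS = 4 * 3600  # hypothetical four-hour recovery time objective

    def timed_restore(restore_fn) -> float:
        """Run a restore procedure and return the elapsed wall-clock time,
        so the measured recovery time can be compared against the RTO."""
        start = time.monotonic()
        restore_fn()
        return time.monotonic() - start

    def sample_restore():
        """Stand-in for a real restore procedure, e.g. rebuilding a Hadoop
        or Spark cluster from backups; sleeps briefly purely for illustration."""
        time.sleep(1)

    elapsed = timed_restore(sample_restore)
    status = "met" if elapsed <= RTO_SECONDS else "MISSED"
    print(f"Restore took {elapsed:.0f}s; RTO {status}")

Recording the measured recovery time after every test also gives you a trend line, so you can spot when growing data volumes start pushing restores toward the RTO limit.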

Back up Off-Premise

Whether in low-cost cloud storage services or in a second physical location, it’s always prudent to back up any data to an off-premise location. While backup is not synonymous with disaster recovery, creating data backups at specified intervals is a huge part of any successful disaster recovery strategy.

For Big Data, cloud-based backup is probably the best option: backing up to the cloud is cheap and easy, particularly for batch data, which is large and static.
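As a minimal sketch of a cloud-based backup step, assuming AWS (mentioned earlier) and the boto3 SDK, with hypothetical bucket and file names and credentials already configured in the environment:

    import boto3  # AWS SDK for Python; assumes credentials are configured

    def backup_to_s3(local_path: str, bucket: str, key: str) -> None:
        """Copy a local backup archive to S3, i.e. an off-premise location."""
        s3 = boto3.client("s3")
        s3.upload_file(local_path, bucket, key)

    # Hypothetical bucket and file names; substitute your own.
    backup_to_s3(
        "/backups/warehouse-2018-08-22.tar.gz",
        bucket="example-dr-backups",
        key="warehouse/warehouse-2018-08-22.tar.gz",
    )

In practice you would schedule a call like this at the interval dictated by your RPO and pair it with lifecycle rules that move older archives to cheaper storage tiers.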

Seek Out a Converged Data Platform

Backups alone aren’t enough for disaster recovery, and it makes sense to seek out some kind of converged data platform for your Big Data disaster recovery. On a converged platform, you can manage multiple Big Data clusters across several locations and infrastructure types (cloud/on-premise) independent of service provider, ensuring data remains consistent and up-to-date between all clusters.

For example, IBM Big Replicate is one such platform: it replicates data across multiple clusters and unifies Hadoop clusters running on different vendor distributions and versions. The active replication of IBM Big Replicate drives both RTO and RPO to near-zero durations.

Conclusion

The first step in ensuring you’re prepared for disaster with Big Data workloads is to recognize the mission-critical nature of the insights these workloads can deliver.

Adapt your disaster recovery plan to include a section on Big Data, and make sure to implement some of the tips in this article to increase the chances of returning Big Data environments to an acceptable service level as quickly as possible after a disaster.

Categories: Big Data
Tags: Big Data, cloud computing, disaster, recovery

About Limor Wainstein

I'm a technical writer and editor with over 10 years' experience writing technical articles and documentation for various audiences, including technical on-site content, software documentation, and dev guides. I specialize in big data analytics, computer/network security, middleware, software development and APIs. And I love coffee!

I've published my work in major publications such as DZone and Wordtracker, and I'm also a volunteer writer for some universities, where I write about data science, big data, data warehousing, and related topics. Here are some of my recent articles:

  • Soft Computing vs. Hard Computing (UoPeople)
  • An Overview of Amazon Redshift (DZone)
  • Managing Telehealth's Big Data with Data Warehousing (Arizona University)
  • Facial Features, Illnesses, and Computer Vision (Pompeu Fabra University)
  • Tools for a Deeper Understanding of User Data (Wordtracker)
