• Skip to main content
  • Skip to secondary menu
  • Skip to primary sidebar
  • Skip to footer
  • Articles
  • News
  • Events
  • Advertize
  • Jobs
  • Courses
  • Contact
  • (0)
  • LoginRegister
    • Facebook
    • LinkedIn
    • RSS
      Articles
      News
      Events
      Job Posts
    • Twitter
Datafloq

Datafloq

Data and Technology Insights

  • Categories
    • Big Data
    • Blockchain
    • Cloud
    • Internet Of Things
    • Metaverse
    • Robotics
    • Cybersecurity
    • Startups
    • Strategy
    • Technical
  • Big Data
  • Blockchain
  • Cloud
  • Metaverse
  • Internet Of Things
  • Robotics
  • Cybersecurity
  • Startups
  • Strategy
  • Technical

Managing High Availability in PostgreSQL Part I

Dharshan Rangegowda / 9 min read.
November 30, 2018
Datafloq AI Score
×

Datafloq AI Score: 88.33

Datafloq enables anyone to contribute articles, but we value high-quality content. This means that we do not accept SEO link building content, spammy articles, clickbait, articles written by bots and especially not misinformation. Therefore, we have developed an AI, built using multiple built open-source and proprietary tools to instantly define whether an article is written by a human or a bot and determine the level of bias, objectivity, whether it is fact-based or not, sentiment and overall quality.

Articles published on Datafloq need to have a minimum AI score of 60% and we provide this graph to give more detailed information on how we rate this article. Please note that this is a work in progress and if you have any suggestions, feel free to contact us.

floq.to/mKaNZ

Managing high availability in your PostgreSQL hosting is very important to ensuring your clusters maintain exceptional uptime and strong operational performance so your data is always available to your application. In an earlier blog post, we introduced you to configuring high availability for PostgreSQL using streaming replication, and now we’re going to show you how to best manage PostgreSQL high availability.

There are multiple tools available for managing the high availability of your PostgreSQL clusters using streaming replication. These solutions offer automatic failover capabilities, monitoring, replication, and other useful administrative tasks. Some of the prominent open source solutions include:

  1. PostgreSQL Automatic Failover by ClusterLabs

  2. Replication Manager for PostgreSQL Clusters by repmgr (2ndQuadrant)

  3. Patroni by Zalando

Each of these tools provides their own way of managing the clusters. In our three-part series of posts on high availability for PostgreSQL, we’ll share an overview, the prerequisites, and the working and test results for each of these three tools. Here in Part 1, we’ll deep dive into the PostgreSQL Automatic Failover (PAF) solution by ClusterLabs.

PostgreSQL Automatic Failover

PostgreSQL Automatic Failover (PAF) is a high availability management solution for PostgreSQL by ClusterLabs. PAF makes use of the popular, industry-standard Pacemaker and Corosync stack. With Pacemaker and Corosync together, you’ll be able to detect failures in the system and act accordingly.


Interested in what the future will bring? Download our 2023 Technology Trends eBook for free.

Consent

Pacemaker is capable of managing many resources, and does so with the help of their resource agents. Resource agents then have the responsibility of handling a specific resource, how they should behave, and inform Pacemaker of their results.

Your resource agent implementation must comply to the Open Cluster Framework (OCF) specification. This specification defines resource agents’ behavior and implementation of methods like stop, start, promote, demote and interaction with Pacemaker.

PAF is an OCF resource agent for PostgreSQL written in Perl. Once your PostgreSQL cluster is built using internal streaming replication, PAF is able to expose to Pacemaker the current status of the PostgreSQL instance on each node: master, slave, stopped, catching up, etc.

How it Works

PAF communicates with Pacemaker regarding the cluster status and monitors the PostgreSQL functioning. In the event of a failure, it informs Pacemaker, and if there’s no chance of the current master being recovered, it will trigger an election between the current standby servers. With the robust Pacemaker in place, PAF will perform management actions like start, stop, monitor, and failover on all the PostgreSQL nodes.

Are There Any Setup Requirements?

  • PAF supports PostgreSQL version 9.3 and higher.
  • PAF is not responsible for PostgreSQL master/standby creation or its setup – you must create and setup streaming replication before using PAF.
  • PAF doesn’t edit any configuration of Postgres. However, it requires users to follow a few prerequisites like:
    • Slave must be configured as hot standby.
    • A recovery template file (default: <postgresql_data_location>/recovery.conf.pcmk) has to be provided with below parameters:
      • standby_mode = on
      • recovery_target_timeline = latest’
      • primary_conninfo must have the application_name parameter defined and set to local node name as in Pacemaker.
  • PAF exposes multiple parameters related to the management of a PostgreSQL resource. This can be configured to suit one’s requirement. Below are the parameters:
    • bindir: location of the PostgreSQL binaries (default: /usr/bin)
    • pgdata: location of the PGDATA of your instance (default: /var/lib/pgsql/data)
    • datadir: path to the directory set in data_directory from your postgresql.conf file
    • pghost: the socket directory or IP address to use to connect to the local instance (default: /tmp)
    • pgport: the port to connect to the local instance (default: 5432)
    • recovery_template: the local template that will be copied as the PGDATA/recovery.conf file. This template file must exists on all node (default: $PGDATA/recovery.conf.pcmk)
    • start_opts: Additional arguments given to the Postgres process on startup. See postgres “help for available options. Useful when the postgresql.conf file is not in the data directory (PGDATA), eg.: –c config_file=/etc/postgresql/9.3/main/postgresql.conf
    • system_user: the system owner of your instance’s process (default: postgres)
    • maxlag: maximum lag allowed on a standby before we set a negative master score on it

PAF Pros

  • PAF provides the user a free hands-on configuration and setup of PostgreSQL.
  • PAF can handle node failures and trigger elections when the master goes down.
  • Quorum behavior can be enforced in PAF.
  • It will provide a complete high availability management solution for the resource, including start, stop, and monitor, and handle network isolation scenarios.
  • It’s a distributed solution, which enables the management of any node from another node.

PAF Cons

  • PAF doesn’t detect if a standby is misconfigured with an unknown or non-existent node in recovery configuration. Node will be shown as slave, even if standby is running without connecting to the master/cascading standby node.
  • Requires an extra port (Default 5405) to be opened for the Pacemaker and Corosync components’ communication using UDP.
  • Does not support NAT-based configuration.
  • No pg_rewind support.

High Availability Test Scenarios

We conducted a few tests to determine the capability of the PostgreSQL HA management using PAF. All of these tests were run while the application was running and inserting data to the PostgreSQL database. The application was written using PostgreSQL Java JDBC Driverleveraging the connection failover capability.

Standby Server Tests

Sl. No Test Scenario Observation
1 Kill the PostgreSQL process Pacemaker brought the PostgreSQL process back to running state. There was no disruption in writer application.
2 Stop the PostgreSQL process Pacemaker brought the PostgreSQL process back to running state. There was no disruption in writer application.
3 Reboot the server Standby server was marked offline initially. Once the server came up after reboot, PostgreSQL was started by Pacemaker and the server was marked as online. If fencing was enabled, the node wouldn’t have been added automatically to cluster. There was no disruption in writer application.
4 Stop the Pacemaker process It will stop the PostgreSQL process also, and the server will be marked offline. There was no disruption in writer application.

Master/Primary Server Tests

Sl. No Test Scenario Observation
1 Kill the PostgreSQL process Pacemaker brought the PostgreSQL process back to running state. Primary was recovered within the threshold time and, hence, election was not triggered. The writer application was down for about 26 seconds.
2 Stop the PostgreSQL process Pacemaker brought the PostgreSQL process back to running state. Primary was recovered within the threshold time and, hence, election was not triggered. There was a downtime in writer application for about 26 seconds.
3 Reboot the server Election was triggered by Pacemaker after the threshold time for which master was not available. The most eligible standby server was promoted as the new master. Once the old master came up after reboot, it was added back to the cluster as a standby. If fencing was enabled, the node wouldn’t have been added automatically to cluster. The writer application was down for about 26 seconds.
4 Stop the Pacemaker process It will stop the PostgreSQL process also and server will be marked offline. Election will be triggered and new master will be elected. There was downtime in writer application.

Network Isolation Tests

Sl. No Test Scenario Observation
1 Network isolate the standby server from other servers Corosync traffic was blocked on the standby server. The server was marked offline and PostgreSQL service was turned off due to quorum policy. There was no disruption in the writer application.
2 Network isolate the master server from other servers (split-brain scenario) Corosync traffic was blocked on the master server. PostgreSQL service was turned off and master server was marked offline due to quorum policy. A new master was elected in the majority partition. There was a downtime in the writer application.

Miscellaneous Tests

Sl. No Test Scenario Observation
1 Degrade the cluster by turning off all the standby servers. When all the standby servers went down, PostgreSQL service on master was stopped due to quorum policy. After this test, when all the standby servers was turned on, a new master was elected. There was a downtime in the writer application.
2 Randomly turn off all the servers one after the other, starting with the master, and bring them all back at same time All the servers came up and joined the cluster. New master was elected. There was a downtime in the writer application.

Inference

PostgreSQL Automatic Failover provides several advantages in handling PostgreSQL high availability. PAF uses IP address failover instead of rebooting the standby to connect to the new master during a failover event. This proves advantageous in scenarios where the user does not want to restart the standby nodes. PAF also needs very little manual intervention and manages the overall health of all the resources. The only case where manual intervention is a requirement is in the event of a timeline divergence where the user can elect to use pg_rewind.

In Part 1, we’ve discussed the capabilities and workings of PostgreSQL Automatic Failover (PAF) by ClusterLabs, and in Part 2, we’ll discuss the same high availability aspects using the Replication Manager for PostgreSQL clusters (repmgr) by 2ndQuadrant. Make sure to check back for Part 3, where we’ll also cover Patroni by Zalando and compare all three open source solutions to help you determine the best fit for your application.

Categories: Cloud, Technical
Tags: Cloud, database, database performance, Failover, open source, sql, structured data

About Dharshan Rangegowda

ScaleGrid offers a fully managed MongoDB, MySQL, PostgreSQL and Redis on AWS, Azure, and DigitalOcean. Bring your own cloud account or host with us. Deploy standalone, replica sets or shards.

Primary Sidebar

E-mail Newsletter

Sign up to receive email updates daily and to hear what's going on with us!

Publish
AN Article
Submit
a press release
List
AN Event
Create
A Job Post

Related Articles

Applications Of Data Science In Decision-Making

March 17, 2023 By vc454071

Workflow Automation For Small Business

March 17, 2023 By yanakhain

5 Key Components Of IT Automation

March 16, 2023 By Nikola Sekulic

Related Jobs

  • 100% Remote Senior Software Engineer – Python, C++ | Northlake, GA, USA - March 20, 2023
  • Principal Software Engineer | Durham, NC, USA - March 20, 2023
  • Senior Software Engineer | Westlake, TX, USA - March 20, 2023
More Jobs

Tags

AI Amazon analysis analytics application applications Artificial Intelligence benefits BI Big Data business China Cloud Companies company costs crypto Data design development digital engineer environment experience future Google+ government Group health information learning machine learning mobile news public research security share skills social social media software solutions strategy technology

Related Events

  • 6th Middle East Banking AI & Analytics Summit 2023 | Riyadh, Saudi Arabia - May 10, 2023
  • Data Science Salon NYC: AI & Machine Learning in Finance & Technology | The Theater Center - December 7, 2022
  • Big Data LDN 2023 | Olympia London - September 20, 2023
More events

Related Online Courses

  • Using Prometheus for Monitoring on Google Cloud: Qwik Start
  • Cloud Transformation
  • Cloud Technologies & Services
More courses

Footer


Datafloq is the one-stop source for big data, blockchain and artificial intelligence. We offer information, insights and opportunities to drive innovation with emerging technologies.

  • Facebook
  • LinkedIn
  • RSS
  • Twitter

Recent

  • How Data Analytics is Revolutionizing Talent Acquisition Leadership
  • Storing the World in a Sugar Cube: The DNA Data Revolution Unfolds
  • Optimizing Traditional Agricultural Practices with AI
  • Visual AI: The Shiny Technological Object That Glitters Like Gold
  • Applications Of Data Science In Decision-Making

Search

Tags

AI Amazon analysis analytics application applications Artificial Intelligence benefits BI Big Data business China Cloud Companies company costs crypto Data design development digital engineer environment experience future Google+ government Group health information learning machine learning mobile news public research security share skills social social media software solutions strategy technology

Copyright © 2023 Datafloq
HTML Sitemap| Privacy| Terms| Cookies

  • Facebook
  • Twitter
  • LinkedIn
  • WhatsApp

In order to optimize the website and to continuously improve Datafloq, we use cookies. For more information click here.

settings

Dear visitor,
Thank you for visiting Datafloq. If you find our content interesting, please subscribe to our weekly newsletter:

Did you know that you can publish job posts for free on Datafloq? You can start immediately and find the best candidates for free! Click here to get started.

Not Now Subscribe

Thanks for visiting Datafloq
If you enjoyed our content on emerging technologies, why not subscribe to our weekly newsletter to receive the latest news straight into your mailbox?

Subscribe

No thanks

Privacy Overview

This website uses cookies so that we can provide you with the best user experience possible. Cookie information is stored in your browser and performs functions such as recognising you when you return to our website and helping our team to understand which sections of the website you find most interesting and useful.

Necessary Cookies

Strictly Necessary Cookie should be enabled at all times so that we can save your preferences for cookie settings.

If you disable this cookie, we will not be able to save your preferences. This means that every time you visit this website you will need to enable or disable cookies again.

Marketing cookies

This website uses Google Analytics to collect anonymous information such as the number of visitors to the site, and the most popular pages.

Keeping this cookie enabled helps us to improve our website.

Please enable Strictly Necessary Cookies first so that we can save your preferences!