Setting up a PostgreSQL Cluster with Patroni using Spilo Image

In the world of database management, PostgreSQL stands tall as a powerful and reliable open-source relational database system. However, as database workloads grow, efficient connection management becomes crucial for maintaining performance and scalability. This is where connection poolers like Pgbouncer come into play.

In this blog post, we’ll explore the setup and configuration of Pgbouncer — a lightweight connection pooler—for PostgreSQL on Amazon EC2 instances. Pgbouncer acts as an intermediary between your application and the database, efficiently managing database connections.

Prerequisites

Make sure you have followed the steps in our previous blog post, Setting up a highly available PostgreSQL Cluster with Patroni using Spilo Image, to set up PostgreSQL on your EC2 instances. We’ll be using the same instances for this blog post.

Why Use Pgbouncer?

Managing database connections can be resource-intensive. Each connection consumes memory on the database server. When your application requires numerous connections, this can lead to performance bottlenecks. Pgbouncer addresses this issue by pooling and reusing database connections, resulting in significant improvements in efficiency and scalability.

Before we dive into the technical details, let’s take a look at the configuration file and systemd unit file that will enable us to set up Pgbouncer for PostgreSQL on Fedora CoreOS.

The Configuration File

The following is the systemd unit configuration for our Pgbouncer service, which can be added to the Butane configuration file from the previous blog post:

Copy to Clipboard

# ... Configurations from the previous blog post

- name: pgbouncer.service
  enabled: true
  contents: |
    [Unit]
    Description=Pgbouncer Connection pooling with Podman

[Service]
    User=patroni-user
    Restart=always
    RestartSec=5s
    TimeoutStartSec=0
    LimitNOFILE=40000

ExecStartPre=/usr/local/bin/prepare_pgbouncer_auth.sh patroni-container postgres postgres
    ExecStartPre=/usr/bin/podman rm -f pgbouncer-container
    ExecStart=/usr/bin/podman \
      run \
      --rm \
      --net=host \
      --name pgbouncer-container \
      --volume /etc/ssl/self-certs:/etc/ssl/self-certs \
      --volume /usr/local/share/userlist.txt:/etc/ssl/self-certs/userlist.txt \
      --env DB_HOST="localhost" \
      --env DB_NAME="postgres" \
      --env DB_PORT="5432" \
      --env AUTH_USER="pgbouncer" \
      --env AUTH_QUERY="SELECT username, password FROM pgbouncer.get_auth($1)" \
      --env STATS_USERS="stats_collector" \
      --env AUTH_FILE="/etc/ssl/self-certs/userlist.txt" \
      --env AUTH_TYPE="scram-sha-256" \
      --env LISTEN_PORT="6432" \
      --env CLIENT_TLS_SSLMODE="require" \
      --env CLIENT_TLS_KEY_FILE="/etc/ssl/self-certs/proventa-client-cert-key.pem" \
      --env CLIENT_TLS_CERT_FILE="/etc/ssl/self-certs/proventa-client-cert.pem" \
      --env CLIENT_TLS_CA_FILE="/etc/ssl/self-certs/proventa-root-ca.pem" \
      docker.io/edoburu/pgbouncer:1.20.1-p0

ExecStop=/usr/bin/podman rm -f pgbouncer-container

[Install]
    WantedBy=multi-user.target

# ... Configurations from the previous blog post

The above configuration will run the Pgbouncer container, and expose the Pgbouncer service on port 6432. The Pgbouncer itself will connect to the PostgreSQL DB with the parameters we specified in DB_HOST, DB_NAME, and DB_PORT. Connections from the end-user to the Pgbouncer will be secured using TLS with the self-signed certificates generated with a script that can be found in our Github repository here.

On the first step, we run a script called prepare_pgbouncer_auth.sh. Here, we are creating a database user that will be used by Pgbouncer to verify whether incoming connections are allowed or not. The script is as follows:

Copy to Clipboard

First it waits until the PostgreSQL instance is ready to accept connections. Then it creates a database user called pgbouncer. The user pgbouncer is granted with a very limited set of privileges. It can only connect to the database and execute the get_auth function. This function is used by Pgbouncer to verify whether incoming connections are allowed or not.

After the script is successfully executed, we can specify the pgbouncer user as an AUTH_USER in the systemd unit configuration.

In the systemd unit configuration, we are specifying the AUTH_QUERY with the value SELECT username, password FROM pgbouncer.get_auth($1). This means that Pgbouncer will execute the pgbouncer.get_auth function with the username provided by the client. If the function returns a row, then the connection is allowed. Otherwise, the connection is rejected.

Then we have the STATS_USERS parameter. This parameter specifies the users that are allowed to retrieve statistics about the Pgbouncer instance, e.g SHOW STATS command. The information returned will be useful and can be used to monitor the Pgbouncer service. In our case we are using a user called stats_collector.

Finally we have AUTH_FILE. This parameter specifies the location of the userlist.txt file. This file usually contains the list of users and their passwords that are allowed to connect to the database. In our setup (which is a very simple setup), the file contains the following:

Copy to Clipboard

Users in this file are allowed to connect to Pgbouncer’s internal database pgbouncer and will not be queried against the pgbouncer.get_auth function. In addition, the user pgbouncer is used to authenticate against the database.

Now that we have covered the systemd unit configuration, we are ready to deploy the EC2 instances with our new configuration file. However, we still need to change an entry in our Security Group task. Since we are now using Pgbouncer on port 6432, we wouldn’t want to expose port 5432 that is used by the Postgres database. Instead, we will expose the port 6432. Here’s the updated version of the Security Group task in the network_conf-task.yml file:

Copy to Clipboard

We now have everything set up and ready to go. Let’s deploy the instances by using the same Ansible script from the previous blog post to deploy the instances.

Verifying the Connection Pooler

Once the instances are up and running, we can SSH into one of the instances and check the status of the Pgbouncer service by running the following command:

Copy to Clipboard

You should see a response similar to the following:

Copy to Clipboard

Now that we have verified that the Pgbouncer service is running, we can test the connection by connecting to the database using the psql command with postgres as the username and zalando as the (default) password:

Copy to Clipboard

Or you can also connect to the database pgbouncer as the stats_collector user to monitor the Pgbouncer statistics. The password of this user is defined in the userlist.txt file (In our case now it’s collecter_stats). We can do this by running the following command:

Copy to Clipboard

After entering the correct password, you should see a response similar to the following:

Copy to Clipboard

With that we have verified that the Pgbouncer service is running and that we can connect to the database using the Pgbouncer service.

Wrapping Up

In this blog post, we’ve explored the essential concept of using Pgbouncer, a connection pooler, to enhance the efficiency and scalability of your PostgreSQL database on EC2 instances. We’ve learned how to set up and configure Pgbouncer to act as a intermediary layer between your application and the database server, managing connections seamlessly.

By following the detailed steps and configurations outlined in this blog post, you’ve taken a significant step towards optimizing your PostgreSQL setup for high-performance and scalability. With Pgbouncer in place, you can confidently manage your database connections while maintaining a secure and efficient PostgreSQL environment.

We hope this guide has been informative and that it simplifies your journey in harnessing the full potential of PostgreSQL!

Farouq Abdurrahman

Farouq Abdurrahman ist Praktikant bei der Proventa AG und studiert Informatik. Sein Schwerpunkt liegt auf Cloud Computing. Er hat großes Interesse an der Digitalen Transformation und Cloud Computing. Aktuell beschäftigt er sich mit PostgreSQL Datenbanken in der Cloud.

Setting up a Connection Pooler for PostgreSQL on EC2 Instances

Setting up a Connection Pooler for PostgreSQL on EC2 Instances

Prerequisites

Why Use Pgbouncer?

The Configuration File

Verifying the Connection Pooler

Wrapping Up

Farouq Abdurrahman