SQLAlchemy and PgBouncer

At work we use SQLAlchemy as our ORM, and we were having issues with some connections getting stuck in the pool. We use PgBouncer as our connection pooler, so I started looking into the issue and I found that there's not a lot of information out there about how to configure SQLAlchemy and PgBouncer together. I found a few blog posts and StackOverflow answers, but they were all incomplete or outdated. So I decided to write this post to document what I've learned.

The problem

By default SQLAlchemy uses a connection pool, which is a good thing. It means that you don't have to open a new connection to the database every time you need to execute a query. Instead, you can reuse an existing connection from the pool. This is much faster and more efficient than opening a new connection every time. This is fine and dandy if you are not using PgBouncer, but if you are, then you need to be careful about how you configure SQLAlchemy.

The problem is that if you use a connection pool with PgBouncer (another connection pooler), then you can end up with connections that are stuck in a funny state in SA pool and the next time you try to use them you'll get an error like sqlalchemy.exc.ResourceClosedError: This Connection is closed or psycopg2.OperationalError) server closed the connection unexpectedly. This is due to SQLAlchemy's default settings which make you use QueuePool as the default pool class. These settings are not compatible with PgBouncer, so you need to use a different pool class.

SQLAlchemy configuration

The first thing I would recommend is to set application_name in your SQLAlchemy engine. This will make it easier to debug issues in PgBouncer, as you'll be able to see which application is using which connection. You can do this by adding application_name to your connection string:

from sqlalchemy import create_engine

engine = create_engine("postgresql://user:pass@host:port/db?application_name=your-app-name")

Connection pooling

As you are using an external connection pooler (PgBouncer), you need to ensure that your SQLAlchemy's connection pool settings are compatible with it. The default settings are not, so you need to tweak them. The main setting you need to change is the pool class. The default pool class is QueuePool, which is not compatible with PgBouncer. You need to pick a different pool class, and you also need to tweak some of the settings depending on your workload.

The recommended approach is to disable connection pooling in SQLAlchemy altogether. For that you just need to configure NullPool as your pool class. This will disable the connection pool and open a new connection every time you need to execute a query:

from sqlalchemy import create_engine, pool

engine = create_engine("postgresql://user:pass@host:port/db", poolclass=pool.NullPool)

If you're using Flask-SQLAlchemy you can configure it like this:

from flask import Flask
from flask_sqlalchemy import SQLAlchemy

app = Flask(__name__)
app.config.update({
    "SQLALCHEMY_DATABASE_URI": "postgresql://user:pass@host:port/db",
    "SQLALCHEMY_ENGINE_OPTIONS": {"poolclass": "sqlalchemy.pool.NullPool"},
})
db = SQLAlchemy(app)

PgBouncer configuration

Despite PgBouncer supporting multiple pool modes, the recommended mode to use with SQLAlchemy is transaction. This mode might seem more resource hungry than session mode, but it's the only mode that works well with SQLAlchemy. Don't bother with query mode, it's not compatible with SQLAlchemy at all.

The problem

SQLAlchemy configuration

Connection pooling

PgBouncer configuration

Links