At work we use SQLAlchemy as our ORM, and we were having issues with some connections getting stuck in the pool. We use PgBouncer as our connection pooler, so I started looking into the issue and I found that there's not a lot of information out there about how to configure SQLAlchemy and PgBouncer together. I found a few blog posts and StackOverflow answers, but they were all incomplete or outdated. So I decided to write this post to document what I've learned.

The problem

By default SQLAlchemy uses a connection pool, which is a good thing. It means that you don't have to open a new connection to the database every time you need to execute a query. Instead, you can reuse an existing connection from the pool. This is much faster and more efficient than opening a new connection every time. This is fine and dandy if you are not using PgBouncer, but if you are, then you need to be careful about how you configure SQLAlchemy.

The problem is that if you use a connection pool with PgBouncer (another connection pooler), then you can end up with connections that are stuck in a funny state in SA pool and the next time you try to use them you'll get an error like sqlalchemy.exc.ResourceClosedError: This Connection is closed or psycopg2.OperationalError) server closed the connection unexpectedly. This is due to SQLAlchemy's default settings which make you use QueuePool as the default pool class. These settings are not compatible with PgBouncer, so you need to use a different pool class or tweak the settings.

SQLAlchemy configuration

The first thing I would recommend is to set application_name in your SQLAlchemy engine. This will make it easier to debug issues in PgBouncer, as you'll be able to see which application is using which connection. You can do this by adding application_name to your connection string:

from sqlalchemy import create_engine

engine = create_engine("postgresql://user:pass@host:port/db?application_name=your-app-name")

Connection pooling

As you are using an external connection pooler (PgBouncer), you need to ensure that your SQLAlchemy's connection pool settings are compatible with it. The default settings are not, so you need to tweak them. The main setting you need to change is the pool class. The default pool class is QueuePool, which is not compatible with PgBouncer. You need to pick a different pool class, and you also need to tweak some of the settings depending on your workload.

The optimal solution depends on your workload and the version of SQLAlchemy you are using. The simplest and safest approach is to disable connection pooling in SQLAlchemy altogether. For that you just need to configure NullPool as your pool class. This will disable the connection pool and open a new connection every time you need to execute a query. This is the safest option and easiest option, although slightly less performant than other alternatives:

from sqlalchemy import create_engine, pool

engine = create_engine("postgresql://user:pass@host:port/db", poolclass=pool.NullPool)

Depending on your workload you have more options though.

Web applications

For web applications just use NullPool, this is the safest option and there is no performance hit as your web server opens a connection once and then reuses it for the lifetime of the process. If you're using Flask-SQLAlchemy you can configure it like this:

from flask import Flask
from flask_sqlalchemy import SQLAlchemy

app = Flask(__name__)
app.config.update({
    "SQLALCHEMY_DATABASE_URI": "postgresql://user:pass@host:port/db",
    "SQLALCHEMY_ENGINE_OPTIONS": {"poolclass": "sqlalchemy.pool.NullPool"},
})
db = SQLAlchemy(app)

Long-running processes

If you have workers/background jobs using PgBouncer then you definitely want to use NullPool in your SQLAlchemy engine. This is the safest option and there is no performance hit as your worker opens a connection once and then reuses it for the lifetime of the process.

If you're using SQLAlchemy 1.4+ the new autobegin behavior is exactly what you need. If you are using SQLAlchemy 1.3 or earlier, I recommend you set autocommit=True in your engine. This will ensure that the connection is always in autocommit mode, avoiding leaving transactions stuck open.

The downside of using autocommit=True is that you need to explicitly start a transaction if you want to use one. This is what you want anyway, the 1.3 behavior of starting a transaction automatically under the hood is a bad idea that doesn't work well with connection pooling.

PgBouncer configuration

The PgBouncer configuration will again depend on your workload, but most likely you want to set pool_mode to transaction. If you're using QueuePool make sure that pool_recycle is set to a value lower than the PgBouncer server_lifetime setting. This will ensure that the connections are recycled before PgBouncer closes them.