SQLAlchemy and PgBouncer
At work we use SQLAlchemy as our ORM, and we were having issues with some connections getting stuck in the pool. We use PgBouncer as our connection pooler, so I started looking into the issue and I found that there's not a lot of information out there about how to configure SQLAlchemy and PgBouncer together. I found a few blog posts and StackOverflow answers, but they were all incomplete or outdated. So I decided to write this post to document what I've learned.
The problem
By default SQLAlchemy uses a connection pool, which is a good thing. It means that you don't have to open a new connection to the database every time you need to execute a query. Instead, you can reuse an existing connection from the pool. This is much faster and more efficient than opening a new connection every time. This is fine and dandy if you are not using PgBouncer, but if you are, then you need to be careful about how you configure SQLAlchemy.
The problem is that if you use a connection pool with PgBouncer (another connection pooler), then you can end up with
connections that are stuck in a funny state in SA pool and the next time you try to use them you'll get an error like
sqlalchemy.exc.ResourceClosedError: This Connection is closed
or
psycopg2.OperationalError) server closed the connection unexpectedly
. This is due to SQLAlchemy's default settings
which make you use QueuePool as the
default pool class. These settings are not compatible with PgBouncer, so you need to use a different pool class.
SQLAlchemy configuration
The first thing I would recommend is to set application_name
in your SQLAlchemy engine. This will make it easier to
debug issues in PgBouncer, as you'll be able to see which application is using which connection. You can do this by
adding application_name
to your connection string:
from sqlalchemy import create_engine
engine = create_engine("postgresql://user:pass@host:port/db?application_name=your-app-name")
Connection pooling
As you are using an external connection pooler (PgBouncer), you need to ensure that your SQLAlchemy's connection pool
settings are compatible with it. The default settings are not, so you need to tweak them. The main setting you need to
change is the pool class. The default pool class is QueuePool
, which is not compatible with PgBouncer. You need to
pick a different pool class, and you also need to tweak some of the settings depending on your workload.
The optimal solution depends on your workload and the version of SQLAlchemy you are using. The recommended approach is to disable connection pooling in SQLAlchemy altogether. For that you just need to configure NullPool as your pool class. This will disable the connection pool and open a new connection every time you need to execute a query:
from sqlalchemy import create_engine, pool
engine = create_engine("postgresql://user:pass@host:port/db", poolclass=pool.NullPool)
If you're using Flask-SQLAlchemy you can configure it like this:
from flask import Flask
from flask_sqlalchemy import SQLAlchemy
app = Flask(__name__)
app.config.update({
"SQLALCHEMY_DATABASE_URI": "postgresql://user:pass@host:port/db",
"SQLALCHEMY_ENGINE_OPTIONS": {"poolclass": "sqlalchemy.pool.NullPool"},
})
db = SQLAlchemy(app)
auto-begin and autocommit
If you're a web app, you are most likely using Flask-SQLAlchemy or something similar. The default configuration (open an implicit transaction at the beginning of the request) is correct and you shouldn't need to change it.
However, if you're using SQLAlchemy in other contexts (like workers, long running scripts, etc.) you need to tweak the
autobegin
or autocommit
settings.
If you're using SQLAlchemy 1.4+ the
new autobegin behavior
is exactly what you need. If you are using SQLAlchemy 1.3 or earlier, I recommend you set autocommit=True
in your
engine. This will ensure that the connection is always in autocommit mode, avoiding leaving transactions stuck open.
The downside of using autocommit=True
is that you need to explicitly start a transaction if you want to use one. This
is what you want anyway, the 1.3 behavior of starting a transaction automatically under the hood is a bad idea that
doesn't work well with connection pooling.
PgBouncer configuration
Despite PgBouncer supporting multiple pool modes, the recommended mode to use with SQLAlchemy is transaction
. This
mode might seem more resource hungry than session
mode, but it's the only mode that works well with SQLAlchemy. Don't
bother with query
mode, it's not compatible with SQLAlchemy.