Describe the bug
SQLAlchemy 2.0.42 changed the behavior regarding JSONB subscripting.
Unfortunately, this causes indexes on JSONB subscripts NOT to used anymore.
Optional link from https://docs.sqlalchemy.org which documents the behavior that is expected
No response
SQLAlchemy Version in Use
2.0.43
DBAPI (i.e. the database driver)
psycopg2
Database Vendor and Major Version
postgresql 17
Python Version
3.13
Operating system
linux
To Reproduce
We will use this example code with different PG versions:
#!/usr/bin/env python3
# imports
from sqlalchemy import Column, Index, Integer, func, select
from sqlalchemy.dialects import postgresql
from sqlalchemy.dialects.postgresql import JSONB
from sqlalchemy.engine import create_engine
from sqlalchemy.orm import declarative_base, sessionmaker
# model declaration
Base = declarative_base()
class Entity(Base):
__tablename__ = "entity"
id = Column(Integer, primary_key=True)
data = Column(JSONB)
Index("ix_entity_json_abc_text", data["a"]["b"]["c"].astext)
engine = create_engine("postgresql://")
engine = engine.execution_options(isolation_level="AUTOCOMMIT")
Session = sessionmaker()
Session.configure(bind=engine)
session = Session()
# --- create tables
Base.metadata.create_all(engine)
# --- add a few entities
if session.scalars(select(func.count()).select_from(Entity)).one() == 0:
for i in range(100000):
session.add(Entity(data={"a": {"b": {"c": f"something {i}"}}}))
session.flush()
# --- show query
q = select(Entity).where(Entity.data["a"]["b"]["c"].astext == "something 42")
print(q.compile(dialect=postgresql.dialect()))
Let's spawn an empty PG server:
% docker run --name pgsqlajsonindex --rm -e POSTGRES_PASSWORD=pg -d -p 3333:5432 docker.io/postgres:17
Start with a prior sqlalchemy version:
% pip install SQLAlchemy==2.0.41
Populate database and print the query filtering on the indexed data:
% PGDATABASE=postgres PGUSER=postgres PGPASSWORD=pg PGHOST=127.0.0.1 PGPORT=3333 python3 sqla-json-index.py
SELECT entity.id, entity.data
FROM entity
WHERE ((((entity.data -> %(data_1)s) -> %(param_1)s)) ->> %(param_2)s) = %(param_3)s
Let's see how our index was created, and also EXPLAIN the query:
% PGDATABASE=postgres PGUSER=postgres PGPASSWORD=pg PGHOST=127.0.0.1 PGPORT=3333 psql
postgres^# \d+ entity
Table "public.entity"
Column │ Type │ Collation │ Nullable │ Default │ Storage │ Compression │ Stats target │ Description
────────┼─────────┼───────────┼──────────┼────────────────────────────────────┼──────────┼─────────────┼──────────────┼─────────────
id │ integer │ │ not null │ nextval('entity_id_seq'::regclass) │ plain │ │ │
data │ jsonb │ │ │ │ extended │ │ │
Indexes:
"entity_pkey" PRIMARY KEY, btree (id)
"ix_entity_json_abc_text" btree ((((data -> 'a'::text) -> 'b'::text) ->> 'c'::text))
Access method: heap
postgres^# explain analyze SELECT entity.id, entity.data
FROM entity
WHERE ((((entity.data -> 'a') -> 'b')) ->> 'c') = 'something 99';
QUERY PLAN
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Index Scan using ix_entity_json_abc_text on entity (cost=0.42..8.44 rows=1 width=64) (actual time=0.032..0.033 rows=1 loops=1)
Index Cond: ((((data -> 'a'::text) -> 'b'::text) ->> 'c'::text) = 'something 99'::text)
Planning Time: 0.147 ms
Execution Time: 0.046 ms
(4 rows)
Upgrade sqlalchemy:
% pip install SQLAlchemy==2.0.43
Re-run the script (which won't populate database this time):
% PGDATABASE=postgres PGUSER=postgres PGPASSWORD=pg PGHOST=127.0.0.1 PGPORT=3333 python3 sqla-json-index.py
SELECT entity.id, entity.data
FROM entity
WHERE ((entity.data[%(data_1)s][%(param_1)s]) ->> %(param_2)s) = %(param_3)s
As expected by reading the changelog, the query changed.
But now the index isn't used anymore:
% PGDATABASE=postgres PGUSER=postgres PGPASSWORD=pg PGHOST=127.0.0.1 PGPORT=3333 psql
postgres^# explain analyze SELECT entity.id, entity.data
FROM entity
WHERE ((entity.data['a']['b']) ->> 'c') = 'something 99';
QUERY PLAN
───────────────────────────────────────────────────────────────────────────────────────────────────────
Seq Scan on entity (cost=0.00..2725.00 rows=500 width=64) (actual time=0.042..18.784 rows=1 loops=1)
Filter: ((data['a'::text]['b'::text] ->> 'c'::text) = 'something 99'::text)
Rows Removed by Filter: 99999
Planning Time: 0.234 ms
Execution Time: 18.810 ms
(5 rows)
I suspect there's a problem in postgres itself not using the index. Or maybe [] and -> aren't equivalent after all?
Anyway, the change in sqlalchemy sort of broke the index usage.
Error
postgres^# explain analyze SELECT entity.id, entity.data
FROM entity
WHERE ((entity.data['a']['b']) ->> 'c') = 'something 99';
QUERY PLAN
───────────────────────────────────────────────────────────────────────────────────────────────────────
Seq Scan on entity (cost=0.00..2725.00 rows=500 width=64) (actual time=0.042..18.784 rows=1 loops=1)
Filter: ((data['a'::text]['b'::text] ->> 'c'::text) = 'something 99'::text)
Rows Removed by Filter: 99999
Planning Time: 0.234 ms
Execution Time: 18.810 ms
(5 rows)
Additional context
No response
Describe the bug
SQLAlchemy 2.0.42 changed the behavior regarding JSONB subscripting.
Unfortunately, this causes indexes on JSONB subscripts NOT to used anymore.
Optional link from https://docs.sqlalchemy.org which documents the behavior that is expected
No response
SQLAlchemy Version in Use
2.0.43
DBAPI (i.e. the database driver)
psycopg2
Database Vendor and Major Version
postgresql 17
Python Version
3.13
Operating system
linux
To Reproduce
We will use this example code with different PG versions:
Let's spawn an empty PG server:
Start with a prior sqlalchemy version:
Populate database and print the query filtering on the indexed data:
Let's see how our index was created, and also
EXPLAINthe query:Upgrade sqlalchemy:
Re-run the script (which won't populate database this time):
As expected by reading the changelog, the query changed.
But now the index isn't used anymore:
I suspect there's a problem in postgres itself not using the index. Or maybe
[]and->aren't equivalent after all?Anyway, the change in sqlalchemy sort of broke the index usage.
Error
Additional context
No response