Hello,
Apologies for the wall of text, but I am hoping someone can help with this issue and, ideally, please release a fix for Hibernate 6.6 as soon as possible. Providing as much relevant information as I can.
Upgrading from 5.6 to 6.6 has resulted in some extremely slow running queries despite perfectly reasonable HQL or even simply fetching a class from the session by ID.
The problem is of this pattern, which occurs across many of our entities as a result of several different but even trivial HQL and entity patterns that all rendered efficient SQL in 5.6:
from ClassB cb1_0
join (select *
from AbstractClass t
where t.DTYPE in (āAAā, āABā, āACā)) t1_0
on t1_0.id=cb1_0.abstractClass_id
join ā¦
In some cases, itās a OneToOne relationship and I was able to work around the issue by moving the table with a discriminator column into the from clause instead of the join. For example, this produces good SQL:
from AbstractClass ac
join ac.oneToOneClass otoc
Whereas this produces the subselect:
from OneToOneClass otoc
join otoc.abstractClass
In some cases, I was able to work around the issue by setting a OneToOne or ManyToOne relationship to FetchType=LAZY, introducing otherwise unnecessary and undesirable n+1 problems that are nevertheless still 95-99.9% faster because the subselect is just that bad.
In some cases, I was able to change
select classC
from ClassA classA
join classA.classB.classC classC
where ...
to
select classA.classB.classC
from ClassA classA
where ...
and this removed the subselect.
In some cases, I think I have no choice but to rewrite numerous queries as native SQL where these subselects are being rendered from joins that I cannot move them from, such as where Iām retrieving a DTO projection, which feels awful and defeats the purpose of the otherwise simple projection.
Somewhat obfuscated explains below showing the difference in performance for a result set of <150 rows is well over a minute vs ~40 milliseconds. This is bad even on tables with 40k rows in the derived table where it actually selects a key because itās such that it will get loaded many many times and the relatively minor (to the 15 mil table) overhead adds up to be very significant.
Explain with the subselect:
+----+-------------+------------+------------+--------+------------------------------------------------------------------------------------+------------------------------+---------+--------------------+----------+----------+------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+--------+------------------------------------------------------------------------------------+------------------------------+---------+--------------------+----------+----------+------------------------------+
| 1 | PRIMARY | p1_0 | NULL | ref | PRIMARY,UK_asdfasdfasdfasdfasdfasdfa | UK_asdfasdfasdfasdfasdfasdfa | 4 | const | 1 | 100.00 | Using index; Using temporary |
| 1 | PRIMARY | ba1_0 | NULL | ref | PRIMARY,FK_1234aasdfasdfasdfasdfasdf | FK_1234aasdfasdfasdfasdfasdf | 4 | db.p1_0.id | 6 | 100.00 | Using index |
| 1 | PRIMARY | tt1_0 | NULL | ref | UK_87asdfasdfasdfasdfasdfff,IX_ClassA_field,FK_ClassA_field | FK_ClassA_field | 4 | db.ba1_0.id | 5678 | 23.32 | Using where |
| 1 | PRIMARY | <derived2> | NULL | ref | <auto_key1> | <auto_key1> | 4 | db.tt1_0.classc_id | 15 | 100.00 | NULL |
| 1 | PRIMARY | s1_0 | NULL | eq_ref | PRIMARY | PRIMARY | 4 | t1_0.classd_id | 1 | 100.00 | NULL |
| 2 | DERIVED | t | NULL | ALL | IX_ClassC_field,ClassC_DTYPE_index | NULL | NULL | NULL | 15420167 | 34.29 | Using where |
+----+-------------+------------+------------+--------+------------------------------------------------------------------------------------+------------------------------+---------+--------------------+----------+----------+------------------------------+
Explain with the same query but a logically simplified join in place of the subselect:
+----+-------------+-------+------------+--------+------------------------------------------------------------------------------------+------------------------------+---------+----------------------+------+----------+------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+--------+------------------------------------------------------------------------------------+------------------------------+---------+----------------------+------+----------+------------------------------+
| 1 | SIMPLE | p1_0 | NULL | ref | PRIMARY,UK_asdfasdfasdfasdfasdfasdfa | UK_asdfasdfasdfasdfasdfasdfa | 4 | const | 1 | 100.00 | Using index; Using temporary |
| 1 | SIMPLE | ba1_0 | NULL | ref | PRIMARY,FK_1234aasdfasdfasdfasdfasdf | FK_1234aasdfasdfasdfasdfasdf | 4 | db.p1_0.id | 6 | 100.00 | Using index |
| 1 | SIMPLE | tt1_0 | NULL | ref | UK_87asdfasdfasdfasdfasdfff,IX_ClassA_field,FK_ClassA_field | FK_ClassA_field | 4 | db.ba1_0.id | 5678 | 23.32 | Using where |
| 1 | SIMPLE | t1_0 | NULL | eq_ref | PRIMARY,IX_ClassC_field,FK_21yasdfakjsdfgasdhfghsajd | PRIMARY | 4 | db.tt1_0.classc_id | 1 | 50.00 | Using where |
| 1 | SIMPLE | s1_0 | NULL | eq_ref | PRIMARY | PRIMARY | 4 | db.t1_0.classd_id | 1 | 100.00 | NULL |
+----+-------------+-------+------------+--------+------------------------------------------------------------------------------------+------------------------------+---------+----------------------+------+----------+------------------------------+
Ref/related:
Another subselect instead of join issue
the subselect is needed to ensure correct join semantics when using discriminator-based inheritance
ā mbladel
The subselect simplifies logically, regardless of semantics that Hibernate would like to maintain, to:
join AbstractClass t1_0
On t1_0.id=cb1_0.abstractClass_id
Explicitly restricting by DTYPE only to then restrict that resulting derived table by ID makes absolutely zero sense to me. Hibernate has all the information it needs to render the correct and simplified form of SQL at the time it sends it to the database and is going out of its way to generate poor performing SQL. What it receives back is guaranteed to be of the correct DTYPE in both cases because itās restricted by a unique primary/foreign key pair supplied by the joined entities. MySQL doesnāt care about Hibernateās underlying/internal semantics.
⦠we need actual evidence that this makes a difference. From a pure relational perspective it does not make any sense that the performance is different, because filters can be pushed down into table scans across inner joins. Every database leverages this optimization, otherwise the performance of many queries, including ones that rely on SQL views would suffer dramatically.
ā beikov
I do not believe it is appropriate to assume that a database will optimize this unnecessary (and explicitly requested) derived table on Hibernateās behalf. In fact, MySQL 8.0.41 (at least in the environment that Iām testing against) is not doing so and the difference in performance is quite literally staggering. I need to upgrade in stages, so even if this is fixed in Mysql 8.4/9, Hibernate 7, or whatever, that wonāt help quite yet.
I donāt know if there are MySQL/Hibernate settings I can apply to help, or have applied historically that are causing this lack of down-pushing filters mentioned. But this should be irrelevant given that Hibernate is intentionally sending poorly rendered SQL to the database, explicitly requesting an unnecessary derived table, and should just stop doing that.
Workarounds aside, my general expectation of Hibernate is that it produces SQL in its most simplified/optimized form possible and not rely on a database to fix redundant/non-performant SQL. It has been generally great until this most unexpected migration blocker.
Is there something Iām missing, Hibernate fix aside, that could help get performance back as it was (and better, in some of the few cases I have already fixed by ātrickingā Hibernate as above)?
Any help with this would be greatly appreciated, thank you.