-
-
Notifications
You must be signed in to change notification settings - Fork 705
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[IMP] fill_stock_move_bom_line_id query performance #3569
base: 10.0
Are you sure you want to change the base?
Conversation
cab9f0f
to
d630007
Compare
@MiquelRForgeFlow fixed |
@MiquelRForgeFlow have you checked that the result query is the same? |
No, but it seems it has to be the same result by just comparing the code of the two queries :S |
I don't get why 2 subselect is more optimal than the previous one, and no stats or explain analyze is shared for comparing them. |
Me neither, but I suppose @thomaspaulb can give us some numbers, right? |
@pedrobaeza I sent a message to the person who actually ran the query, maybe he can share an EXPLAIN ANALYZE. The stats were that on roughly 1.8 million stock moves, the original query took 5+ hours, and the new query was done in a few minutes, but the person mentioned above can confirm also. |
I can confirm what @thomaspaulb described - original query took at least 5 hours before I manually aborted it (it never completed), new query took a few minutes. I will try to get an EXPLAIN ANALYZE if it's helpful - but might take a bit before I do. Will post here. |
The important thing also is the postgres version. Maybe the problem is a too old PG version. |
The version is:
|
I have tried on a v10 DB with PG 13 both queries removing the NULL condition for applying to all records (and with a decent number or records), with very similar results in time, and more convoluted with your query (but probably PG 13 optimizes correctly, not sure about older ones): WITH EXISTING QUERY
WITH YOUR QUERY
so there's still something weird in you statement about the big difference in times. Maybe a missing index when running one query or the other? |
To give some context, this is all part of our migration from V8.0 to (eventually) V14.0, and I just happen to be around to do it, but my knowledge here is very limited. So I can't say if anything is 'missing' - but here are the indexes that exist in V8 Original Indexes
V9 Indexes
V10 Indexes
I would actually love to hear that there's a missing index, as I'm currently running |
New query:
Old query:
In other words, there seems to be a speed difference for me here, but not with the same order of magnitude compared with thomas'/(gal)amit's database. In my personal test situation, it didn't seem too significant. |
Not really difference, as the time gained on execution is spent on planning. |
My postgres version is the same as @thomaspaulb mentioned:
Following a comment from @ddejong-therp, regarding an index that exists on his DB but not on mine, I've added the following index:
which dramatically improved the resulting query time, so good lead @pedrobaeza ! See below the detailed results. The new query still beats the existing one, being about 4 times as fast - but it becomes a non-issue in the overall process now, so I can imagine we might not want to change it to the new one as it is less tested, and I might have a missing index or a different discrepancy on this table or Happy to provide more info if needed for the decision. Existing Query
New Query
Indexes on mrp_production
|
Okay, I think while my new query does indeed lead to better performance, the real issue was the missing index. @ddejong-therp We are by default running |
I think it's better to do -u all |
AND mbl.product_id = sm.product_id | ||
) | ||
WHERE sm.raw_material_production_id = mp.id | ||
AND mbl.bom_line_id IS NOT NULL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty sure this bom_line_id IS NOT NULL
condition is void here. Is this where the original sm_update.bom_line_id IS NULL
landed? (I'm not sure why that was there in the previous version because the field is new in version 10 and here is the only place where it is populated).
-- edit -- oh, it's because the subquery that provides it is left-joined. Makes me think you can take the condition out and apply an inner join.
It's encouraging that this is still improving performance, but I am not sure that the readability is improved. Do you think a CTE version like this will have the same performance improvement and do you agree it is more readable? Warning: untested query. with bom_line_products as (
SELECT min(id) AS bom_line_id, bom_id, product_id
FROM mrp_bom_line
GROUP BY bom_id, product_id
),
move_bom_lines as (
SELECT sm.id as move_id, blp.bom_line_id
FROM stock_move sm
JOIN mrp_production mp
ON sm.raw_material_production_id = mp.id
JOIN bom_line_products blp
ON blp.bom_id = mp.bom_id
AND blp.product_id = sm.product_id
)
UPDATE stock_move sm
SET bom_line_id = mbl.bom_line_id
FROM move_bom_lines mbl
WHERE sm.id = mbl.move_id; |
We got performance problems on this query for a big
stock.move
table, the alternative is a lot faster.Needs a review to verify if the results are 100% the same, though.
@ddejong-therp