Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[IMP] fill_stock_move_bom_line_id query performance #3569

Open
wants to merge 1 commit into
base: 10.0
Choose a base branch
from

Conversation

thomaspaulb
Copy link

We got performance problems on this query for a big stock.move table, the alternative is a lot faster.

Needs a review to verify if the results are 100% the same, though.

@ddejong-therp

@MiquelRForgeFlow MiquelRForgeFlow added this to the 10.0 milestone Sep 28, 2022
@thomaspaulb thomaspaulb force-pushed the 10.0-mrp-optimize-query branch from cab9f0f to d630007 Compare September 28, 2022 10:43
@thomaspaulb
Copy link
Author

@MiquelRForgeFlow fixed

@pedrobaeza
Copy link
Member

@MiquelRForgeFlow have you checked that the result query is the same?

@MiquelRForgeFlow
Copy link
Contributor

No, but it seems it has to be the same result by just comparing the code of the two queries :S

@pedrobaeza
Copy link
Member

I don't get why 2 subselect is more optimal than the previous one, and no stats or explain analyze is shared for comparing them.

@MiquelRForgeFlow
Copy link
Contributor

MiquelRForgeFlow commented Sep 29, 2022

I don't get why 2 subselect is more optimal than the previous one

Me neither, but I suppose @thomaspaulb can give us some numbers, right?

@thomaspaulb
Copy link
Author

@pedrobaeza I sent a message to the person who actually ran the query, maybe he can share an EXPLAIN ANALYZE.

The stats were that on roughly 1.8 million stock moves, the original query took 5+ hours, and the new query was done in a few minutes, but the person mentioned above can confirm also.

@galamit86
Copy link

I can confirm what @thomaspaulb described - original query took at least 5 hours before I manually aborted it (it never completed), new query took a few minutes.

I will try to get an EXPLAIN ANALYZE if it's helpful - but might take a bit before I do. Will post here.

@pedrobaeza
Copy link
Member

The important thing also is the postgres version. Maybe the problem is a too old PG version.

@thomaspaulb
Copy link
Author

The version is:

/usr/lib/postgresql/13/bin/postgres --version
postgres (PostgreSQL) 13.8 (Ubuntu 13.8-1.pgdg18.04+1)

@pedrobaeza
Copy link
Member

I have tried on a v10 DB with PG 13 both queries removing the NULL condition for applying to all records (and with a decent number or records), with very similar results in time, and more convoluted with your query (but probably PG 13 optimizes correctly, not sure about older ones):

WITH EXISTING QUERY

EXPLAIN ANALYZE UPDATE stock_move sm_update
        SET bom_line_id = mbl.bom_line_id
        FROM stock_move sm
            INNER JOIN mrp_production mp
            ON sm.raw_material_production_id = mp.id
            LEFT JOIN (
                SELECT min(id) AS bom_line_id, bom_id, product_id
                FROM mrp_bom_line
                GROUP BY bom_id, product_id
                ) mbl ON (
                    mbl.bom_id = mp.bom_id
                    AND mbl.product_id = sm.product_id)
        WHERE sm_update.raw_material_production_id = mp.id
            AND sm_update.id = sm.id;
                                                                          QUERY PLAN                                                                           
---------------------------------------------------------------------------------------------------------------------------------------------------------------
 Update on stock_move sm_update  (cost=68615.23..112151.30 rows=23 width=380) (actual time=27933.209..27933.216 rows=0 loops=1)
   ->  Hash Left Join  (cost=68615.23..112151.30 rows=23 width=380) (actual time=1145.381..5529.130 rows=331251 loops=1)
         Hash Cond: ((mp.bom_id = mbl.bom_id) AND (sm.product_id = mbl.product_id))
         ->  Nested Loop  (cost=52906.67..96442.63 rows=23 width=348) (actual time=551.805..4635.343 rows=331251 loops=1)
               ->  Hash Join  (cost=52906.25..96432.12 rows=23 width=342) (actual time=551.721..1487.570 rows=331251 loops=1)
                     Hash Cond: ((sm_update.raw_material_production_id = sm.raw_material_production_id) AND (sm_update.id = sm.id))
                     ->  Seq Scan on stock_move sm_update  (cost=0.00..38474.90 rows=962090 width=328) (actual time=53.167..746.456 rows=615840 loops=1)
                     ->  Hash  (cost=38474.90..38474.90 rows=962090 width=18) (actual time=495.553..495.555 rows=331251 loops=1)
                           Buckets: 1048576  Batches: 1  Memory Usage: 25014kB
                           ->  Seq Scan on stock_move sm  (cost=0.00..38474.90 rows=962090 width=18) (actual time=0.034..387.277 rows=615840 loops=1)
               ->  Index Scan using mrp_production_pkey on mrp_production mp  (cost=0.42..0.46 rows=1 width=14) (actual time=0.009..0.009 rows=1 loops=331251)
                     Index Cond: (id = sm.raw_material_production_id)
         ->  Hash  (cost=14636.73..14636.73 rows=71455 width=48) (actual time=593.118..593.120 rows=430396 loops=1)
               Buckets: 524288 (originally 131072)  Batches: 1 (originally 1)  Memory Usage: 37721kB
               ->  Subquery Scan on mbl  (cost=13207.63..14636.73 rows=71455 width=48) (actual time=280.643..460.069 rows=430396 loops=1)
                     ->  HashAggregate  (cost=13207.63..13922.18 rows=71455 width=12) (actual time=280.630..386.938 rows=430396 loops=1)
                           Group Key: mrp_bom_line.bom_id, mrp_bom_line.product_id
                           Batches: 1  Memory Usage: 65553kB
                           ->  Seq Scan on mrp_bom_line  (cost=0.00..9775.79 rows=457579 width=12) (actual time=0.267..64.367 rows=457493 loops=1)
 Planning Time: 2.019 ms
 Trigger for constraint stock_move_bom_line_id_fkey: time=400.128 calls=10741
 JIT:
   Functions: 34
   Options: Inlining false, Optimization false, Expressions true, Deforming true
   Timing: Generation 4.723 ms, Inlining 0.000 ms, Optimization 3.774 ms, Emission 49.385 ms, Total 57.882 ms
 Execution Time: 28470.381 ms

WITH YOUR QUERY

EXPLAIN ANALYZE UPDATE stock_move sm_update
        SET bom_line_id = q.bom_line_id
        FROM (
            SELECT sm.id, mbl.bom_line_id
            FROM stock_move sm
            INNER JOIN mrp_production mp
            ON sm.raw_material_production_id = mp.id
            LEFT JOIN (
                SELECT min(id) AS bom_line_id, bom_id, product_id
                FROM mrp_bom_line
                GROUP BY bom_id, product_id
            ) mbl ON (
                mbl.bom_id = mp.bom_id
                AND mbl.product_id = sm.product_id
            )
            WHERE sm.raw_material_production_id = mp.id
        ) q
        WHERE sm_update.id = q.id;
                                                                       QUERY PLAN                                                                       
--------------------------------------------------------------------------------------------------------------------------------------------------------
 Update on stock_move sm_update  (cost=77582.77..130574.52 rows=684522 width=380) (actual time=25668.488..25668.506 rows=0 loops=1)
   ->  Hash Left Join  (cost=77582.77..130574.52 rows=684522 width=380) (actual time=1311.538..2827.397 rows=331251 loops=1)
         Hash Cond: ((mp.bom_id = mbl.bom_id) AND (sm.product_id = mbl.product_id))
         ->  Hash Join  (cost=61874.22..111272.22 rows=684522 width=348) (actual time=765.058..2011.946 rows=331251 loops=1)
               Hash Cond: (sm.raw_material_production_id = mp.id)
               ->  Hash Join  (cost=54360.75..101961.85 rows=684522 width=342) (actual time=635.989..1665.379 rows=615840 loops=1)
                     Hash Cond: (sm_update.id = sm.id)
                     ->  Seq Scan on stock_move sm_update  (cost=0.00..45804.22 rows=684522 width=328) (actual time=0.040..662.032 rows=615840 loops=1)
                     ->  Hash  (cost=45804.22..45804.22 rows=684522 width=18) (actual time=632.795..632.797 rows=615840 loops=1)
                           Buckets: 1048576  Batches: 1  Memory Usage: 38354kB
                           ->  Seq Scan on stock_move sm  (cost=0.00..45804.22 rows=684522 width=18) (actual time=0.027..438.542 rows=615840 loops=1)
               ->  Hash  (cost=5679.32..5679.32 rows=146732 width=14) (actual time=128.130..128.131 rows=146549 loops=1)
                     Buckets: 262144  Batches: 1  Memory Usage: 8917kB
                     ->  Seq Scan on mrp_production mp  (cost=0.00..5679.32 rows=146732 width=14) (actual time=42.048..83.243 rows=146549 loops=1)
         ->  Hash  (cost=14636.73..14636.73 rows=71455 width=48) (actual time=546.018..546.020 rows=430396 loops=1)
               Buckets: 524288 (originally 131072)  Batches: 1 (originally 1)  Memory Usage: 37721kB
               ->  Subquery Scan on mbl  (cost=13207.63..14636.73 rows=71455 width=48) (actual time=257.731..427.076 rows=430396 loops=1)
                     ->  HashAggregate  (cost=13207.63..13922.18 rows=71455 width=12) (actual time=257.720..357.283 rows=430396 loops=1)
                           Group Key: mrp_bom_line.bom_id, mrp_bom_line.product_id
                           Batches: 1  Memory Usage: 65553kB
                           ->  Seq Scan on mrp_bom_line  (cost=0.00..9775.79 rows=457579 width=12) (actual time=0.031..42.753 rows=457493 loops=1)
 Planning Time: 3.334 ms
 Trigger for constraint stock_move_bom_line_id_fkey: time=437.679 calls=10741
 JIT:
   Functions: 34
   Options: Inlining false, Optimization false, Expressions true, Deforming true
   Timing: Generation 5.496 ms, Inlining 0.000 ms, Optimization 3.364 ms, Emission 38.766 ms, Total 47.627 ms
 Execution Time: 26135.226 ms

so there's still something weird in you statement about the big difference in times. Maybe a missing index when running one query or the other?

@galamit86
Copy link

To give some context, this is all part of our migration from V8.0 to (eventually) V14.0, and I just happen to be around to do it, but my knowledge here is very limited. So I can't say if anything is 'missing' - but here are the indexes that exist in stock_move along the way:

V8 Original Indexes

Indexes:
    "stock_move_pkey" PRIMARY KEY, btree (id)
    "stock_move_company_id_index" btree (company_id)
    "stock_move_create_date_index" btree (create_date)
    "stock_move_date_expected_index" btree (date_expected)
    "stock_move_date_index" btree (date)
    "stock_move_invoice_state_index" btree (invoice_state)
    "stock_move_location_dest_id_index" btree (location_dest_id)
    "stock_move_location_id_index" btree (location_id)
    "stock_move_move_dest_id_index" btree (move_dest_id)
    "stock_move_name_index" btree (name)
    "stock_move_picking_id_index" btree (picking_id)
    "stock_move_product_id_index" btree (product_id)
    "stock_move_product_location_index" btree (product_id, location_id, location_dest_id, company_id, state)
    "stock_move_production_id_index" btree (production_id)
    "stock_move_purchase_line_id_index" btree (purchase_line_id)
    "stock_move_raw_material_production_id_index" btree (raw_material_production_id)
    "stock_move_state_index" btree (state)

V9 Indexes

Indexes:
    "stock_move_pkey" PRIMARY KEY, btree (id)
    "stock_move_company_id_index" btree (company_id)
    "stock_move_create_date_index" btree (create_date)
    "stock_move_date_expected_index" btree (date_expected)
    "stock_move_date_index" btree (date)
    "stock_move_location_dest_id_index" btree (location_dest_id)
    "stock_move_location_id_index" btree (location_id)
    "stock_move_move_dest_id_index" btree (move_dest_id)
    "stock_move_name_index" btree (name)
    "stock_move_openupgrade_legacy_9_0_invoice_state_index" btree (openupgrade_legacy_9_0_invoice_state)
    "stock_move_picking_id_index" btree (picking_id)
    "stock_move_product_id_index" btree (product_id)
    "stock_move_product_location_index" btree (product_id, location_id, location_dest_id, company_id, state)
    "stock_move_production_id_index" btree (production_id)
    "stock_move_purchase_line_id_index" btree (purchase_line_id)
    "stock_move_raw_material_production_id_index" btree (raw_material_production_id)
    "stock_move_state_index" btree (state)

V10 Indexes

Indexes:
    "stock_move_pkey" PRIMARY KEY, btree (id)
    "stock_move_company_id_index" btree (company_id)
    "stock_move_create_date_index" btree (create_date)
    "stock_move_date_expected_index" btree (date_expected)
    "stock_move_date_index" btree (date)
    "stock_move_location_dest_id_index" btree (location_dest_id)
    "stock_move_location_id_index" btree (location_id)
    "stock_move_move_dest_id_index" btree (move_dest_id)
    "stock_move_name_index" btree (name)
    "stock_move_openupgrade_legacy_9_0_invoice_state_index" btree (openupgrade_legacy_9_0_invoice_state)
    "stock_move_picking_id_index" btree (picking_id)
    "stock_move_product_id_index" btree (product_id)
    "stock_move_product_location_index" btree (product_id, location_id, location_dest_id, company_id, state)
    "stock_move_purchase_line_id_index" btree (purchase_line_id)
    "stock_move_state_index" btree (state)

I would actually love to hear that there's a missing index, as I'm currently running stock_account/migrations/13.0.1.1/post-migration.py (from this PR) for 24h+, on 1.8M stock_moves with no end in site.

@ddejong-therp
Copy link

ddejong-therp commented Oct 7, 2022

New query:

EXPLAIN ANALYSE UPDATE stock_move sm_update
        SET bom_line_id = q.bom_line_id
        FROM (
            SELECT sm.id, mbl.bom_line_id
            FROM stock_move sm
            INNER JOIN mrp_production mp
            ON sm.raw_material_production_id = mp.id
            LEFT JOIN (
                SELECT min(id) AS bom_line_id, bom_id, product_id
                FROM mrp_bom_line
                GROUP BY bom_id, product_id
            ) mbl ON (
                mbl.bom_id = mp.bom_id
                AND mbl.product_id = sm.product_id
            )
            WHERE sm.raw_material_production_id = mp.id
            AND mbl.bom_line_id IS NOT NULL
        ) q
        WHERE sm_update.id = q.id;
                                                                               QUERY PLAN                                                                       
--------------------------------------------------------------------------------------------------------------------------------------------------------
 Update on stock_move sm_update  (cost=507.11..17103.77 rows=493 width=386) (actual time=3661.297..3661.301 rows=0 loops=1)
   ->  Nested Loop  (cost=507.11..17103.77 rows=493 width=386) (actual time=15.258..638.819 rows=48895 loops=1)
         ->  Hash Join  (cost=506.69..16816.23 rows=493 width=56) (actual time=14.608..293.031 rows=48895 loops=1)
               Hash Cond: ((mp.bom_id = mbl.bom_id) AND (sm.product_id = mbl.product_id))
               ->  Hash Join  (cost=282.36..14910.94 rows=320182 width=24) (actual time=5.012..250.555 rows=163082 loops=1)
                     Hash Cond: (sm.raw_material_production_id = mp.id)
                     ->  Seq Scan on stock_move sm  (cost=0.00..13787.82 rows=320182 width=18) (actual time=0.247..193.144 rows=320182 loops=1)
                     ->  Hash  (cost=204.38..204.38 rows=6238 width=14) (actual time=4.720..4.721 rows=6238 loops=1)
                           Buckets: 8192  Batches: 1  Memory Usage: 355kB
                           ->  Seq Scan on mrp_production mp  (cost=0.00..204.38 rows=6238 width=14) (actual time=0.261..3.281 rows=6238 loops=1)
               ->  Hash  (cost=206.72..206.72 rows=1174 width=48) (actual time=9.580..9.581 rows=5332 loops=1)
                     Buckets: 8192 (originally 2048)  Batches: 1 (originally 1)  Memory Usage: 481kB
                     ->  Subquery Scan on mbl  (cost=183.18..206.72 rows=1174 width=48) (actual time=6.202..8.366 rows=5332 loops=1)
                           ->  HashAggregate  (cost=183.18..194.98 rows=1174 width=12) (actual time=6.190..7.223 rows=5332 loops=1)
                                 Group Key: mrp_bom_line.bom_id, mrp_bom_line.product_id
                                 Filter: (min(mrp_bom_line.id) IS NOT NULL)
                                 Batches: 1  Memory Usage: 721kB
                                 ->  Seq Scan on mrp_bom_line  (cost=0.00..129.59 rows=5359 width=12) (actual time=0.594..2.514 rows=5359 loops=1)
         ->  Index Scan using stock_move_pkey on stock_move sm_update  (cost=0.42..0.58 rows=1 width=334) (actual time=0.006..0.006 rows=1 loops=48895)
               Index Cond: (id = sm.id)
 Planning Time: 25.015 ms
 Trigger for constraint stock_move_bom_line_id_fkey: time=3.612 calls=308
 Execution Time: 3665.446 ms
(23 rows)

Old query:

EXPLAIN ANALYSE UPDATE stock_move sm_update
        SET bom_line_id = mbl.bom_line_id
        FROM stock_move sm
            INNER JOIN mrp_production mp
            ON sm.raw_material_production_id = mp.id
            LEFT JOIN (
                SELECT min(id) AS bom_line_id, bom_id, product_id
                FROM mrp_bom_line
                GROUP BY bom_id, product_id
                ) mbl ON (
                    mbl.bom_id = mp.bom_id
                    AND mbl.product_id = sm.product_id)
        WHERE sm_update.raw_material_production_id = mp.id
            AND sm_update.id = sm.id
            AND sm_update.bom_line_id IS NULL;
                                                                          QUERY PLAN                                                                           
---------------------------------------------------------------------------------------------------------------------------------------------------------------
 Update on stock_move sm_update  (cost=20678.92..60506.90 rows=26 width=386) (actual time=6599.045..6599.048 rows=0 loops=1)
   ->  Hash Left Join  (cost=20678.92..60506.90 rows=26 width=386) (actual time=490.282..1305.841 rows=102883 loops=1)
         Hash Cond: ((mp.bom_id = mbl.bom_id) AND (sm.product_id = mbl.product_id))
         ->  Nested Loop  (cost=20467.83..60295.68 rows=26 width=354) (actual time=471.067..1236.679 rows=102883 loops=1)
               ->  Hash Join  (cost=20467.55..60287.83 rows=26 width=348) (actual time=387.341..877.659 rows=102883 loops=1)
                     Hash Cond: ((sm_update.raw_material_production_id = sm.raw_material_production_id) AND (sm_update.id = sm.id))
                     ->  Seq Scan on stock_move sm_update  (cost=0.00..13787.82 rows=259326 width=334) (actual time=0.801..242.594 rows=259580 loops=1)
                           Filter: (bom_line_id IS NULL)
                           Rows Removed by Filter: 60602
                     ->  Hash  (cost=13787.82..13787.82 rows=320182 width=18) (actual time=385.398..385.399 rows=163082 loops=1)
                           Buckets: 65536  Batches: 8  Memory Usage: 1561kB
                           ->  Seq Scan on stock_move sm  (cost=0.00..13787.82 rows=320182 width=18) (actual time=0.010..340.770 rows=320182 loops=1)
               ->  Index Scan using mrp_production_pkey on mrp_production mp  (cost=0.28..0.30 rows=1 width=14) (actual time=0.003..0.003 rows=1 loops=102883)
                     Index Cond: (id = sm.raw_material_production_id)
         ->  Hash  (cost=193.38..193.38 rows=1180 width=48) (actual time=19.155..19.156 rows=5332 loops=1)
               Buckets: 8192 (originally 2048)  Batches: 1 (originally 1)  Memory Usage: 481kB
               ->  Subquery Scan on mbl  (cost=169.78..193.38 rows=1180 width=48) (actual time=10.956..16.098 rows=5332 loops=1)
                     ->  HashAggregate  (cost=169.78..181.58 rows=1180 width=12) (actual time=10.924..13.178 rows=5332 loops=1)
                           Group Key: mrp_bom_line.bom_id, mrp_bom_line.product_id
                           Batches: 1  Memory Usage: 721kB
                           ->  Seq Scan on mrp_bom_line  (cost=0.00..129.59 rows=5359 width=12) (actual time=0.326..3.181 rows=5359 loops=1)
 Planning Time: 35.661 ms
 Trigger for constraint stock_move_bom_line_id_fkey: time=3.780 calls=255
 Execution Time: 6603.923 ms

In other words, there seems to be a speed difference for me here, but not with the same order of magnitude compared with thomas'/(gal)amit's database. In my personal test situation, it didn't seem too significant.
I'm using postgres 13.8 myself. What version are you using @galamit86 ?

@pedrobaeza
Copy link
Member

Not really difference, as the time gained on execution is spent on planning.

@galamit86
Copy link

My postgres version is the same as @thomaspaulb mentioned:

/usr/lib/postgresql/13/bin/postgres --version
postgres (PostgreSQL) 13.8 (Ubuntu 13.8-1.pgdg18.04+1)

Following a comment from @ddejong-therp, regarding an index that exists on his DB but not on mine, I've added the following index:

"stock_move_location_id_location_dest_id_product_id_state" btree (product_id, state, location_id, location_dest_id)

which dramatically improved the resulting query time, so good lead @pedrobaeza !

See below the detailed results. The new query still beats the existing one, being about 4 times as fast - but it becomes a non-issue in the overall process now, so I can imagine we might not want to change it to the new one as it is less tested, and I might have a missing index or a different discrepancy on this table or mrp_production (see below those indexes for completeness). I leave it to the experts to decide :)

Happy to provide more info if needed for the decision.

Existing Query

EXPLAIN ANALYZE UPDATE stock_move sm_update
        SET bom_line_id = mbl.bom_line_id
        FROM stock_move sm
            INNER JOIN mrp_production mp
            ON sm.raw_material_production_id = mp.id
            LEFT JOIN (
                SELECT min(id) AS bom_line_id, bom_id, product_id
                FROM mrp_bom_line
                GROUP BY bom_id, product_id
                ) mbl ON (
                    mbl.bom_id = mp.bom_id
                    AND mbl.product_id = sm.product_id)
        WHERE sm_update.raw_material_production_id = mp.id
            AND sm_update.id = sm.id
            AND sm_update.bom_line_id IS NULL
                                                                           QUERY PLAN                                                                           
----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Update on stock_move sm_update  (cost=313252.00..725043.04 rows=164 width=358) (actual time=202893.685..202893.690 rows=0 loops=1)
   ->  Hash Left Join  (cost=313252.00..725043.04 rows=164 width=358) (actual time=3410.207..16275.697 rows=1400594 loops=1)
         Hash Cond: ((mp.bom_id = mbl.bom_id) AND (sm.product_id = mbl.product_id))
         ->  Nested Loop  (cost=302182.57..713972.74 rows=164 width=326) (actual time=3039.606..11192.597 rows=1400594 loops=1)
               ->  Hash Join  (cost=302182.28..713703.49 rows=164 width=320) (actual time=3039.363..8643.307 rows=1400594 loops=1)
                     Hash Cond: ((sm_update.raw_material_production_id = sm.raw_material_production_id) AND (sm_update.id = sm.id))
                     ->  Seq Scan on stock_move sm_update  (cost=0.00..265070.31 rows=1558459 width=306) (actual time=546.962..2687.483 rows=1582529 loops=1)
                           Filter: (bom_line_id IS NULL)
                           Rows Removed by Filter: 228189
                     ->  Hash  (cost=265070.31..265070.31 rows=1779131 width=18) (actual time=2488.035..2488.036 rows=1628783 loops=1)
                           Buckets: 32768  Batches: 128  Memory Usage: 903kB
                           ->  Seq Scan on stock_move sm  (cost=0.00..265070.31 rows=1779131 width=18) (actual time=0.011..2032.264 rows=1810718 loops=1)
               ->  Index Scan using mrp_production_pkey on mrp_production mp  (cost=0.29..1.64 rows=1 width=14) (actual time=0.001..0.001 rows=1 loops=1400594)
                     Index Cond: (id = sm.raw_material_production_id)
         ->  Hash  (cost=10853.56..10853.56 rows=14392 width=48) (actual time=370.000..370.001 rows=143883 loops=1)
               Buckets: 16384 (originally 16384)  Batches: 16 (originally 1)  Memory Usage: 1183kB
               ->  Subquery Scan on mbl  (cost=9160.21..10853.56 rows=14392 width=48) (actual time=147.276..319.263 rows=143883 loops=1)
                     ->  HashAggregate  (cost=9160.21..10709.64 rows=14392 width=12) (actual time=147.270..284.995 rows=143883 loops=1)
                           Group Key: mrp_bom_line.bom_id, mrp_bom_line.product_id
                           Batches: 33  Memory Usage: 1393kB  Disk Usage: 7248kB
                           ->  Seq Scan on mrp_bom_line  (cost=0.00..3656.24 rows=143924 width=12) (actual time=0.184..33.980 rows=143924 loops=1)
 Planning Time: 5.459 ms
 JIT:
   Functions: 40
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 5.572 ms, Inlining 79.877 ms, Optimization 317.485 ms, Emission 192.228 ms, Total 595.162 ms
 Execution Time: 202900.788 ms

New Query

EXPLAIN ANALYZE UPDATE stock_move sm_update
        SET bom_line_id = q.bom_line_id
        FROM (
            SELECT sm.id, mbl.bom_line_id
            FROM stock_move sm
            INNER JOIN mrp_production mp 
            ON sm.raw_material_production_id = mp.id
            LEFT JOIN (                           
                SELECT min(id) AS bom_line_id, bom_id, product_id
                FROM mrp_bom_line          
                GROUP BY bom_id, product_id                
            ) mbl ON (                               
                mbl.bom_id = mp.bom_id      
                AND mbl.product_id = sm.product_id
                AND mbl.bom_line_id IS NOT NULL
            )
            WHERE sm.raw_material_production_id = mp.id 
            AND mbl.bom_line_id IS NOT NULL
        ) q
        WHERE sm_update.id = q.id;                                                                       QUERY PLAN                                                                        
---------------------------------------------------------------------------------------------------------------------------------------------------------
 Update on stock_move sm_update  (cost=18488.14..322749.50 rows=5535 width=412) (actual time=47441.125..47441.129 rows=0 loops=1)
   ->  Nested Loop  (cost=18488.14..322749.50 rows=5535 width=412) (actual time=1971.757..29514.071 rows=227535 loops=1)
         ->  Hash Join  (cost=18487.71..312777.15 rows=5535 width=56) (actual time=1971.667..4286.512 rows=227535 loops=1)
               Hash Cond: ((mp.bom_id = mbl.bom_id) AND (sm.product_id = mbl.product_id))
               ->  Hash Join  (cost=6702.26..291480.92 rows=1811577 width=24) (actual time=107.229..3071.688 rows=1628783 loops=1)
                     Hash Cond: (sm.raw_material_production_id = mp.id)
                     ->  Seq Scan on stock_move sm  (cost=0.00..258612.77 rows=1811577 width=18) (actual time=32.917..2358.605 rows=1810718 loops=1)
                     ->  Hash  (cost=6063.78..6063.78 rows=36678 width=14) (actual time=74.158..74.159 rows=36678 loops=1)
                           Buckets: 32768  Batches: 2  Memory Usage: 1132kB
                           ->  Seq Scan on mrp_production mp  (cost=0.00..6063.78 rows=36678 width=14) (actual time=25.975..67.058 rows=36678 loops=1)
               ->  Hash  (cost=11571.74..11571.74 rows=14248 width=48) (actual time=246.002..246.003 rows=143883 loops=1)
                     Buckets: 16384 (originally 16384)  Batches: 16 (originally 1)  Memory Usage: 1183kB
                     ->  Subquery Scan on mbl  (cost=9879.83..11571.74 rows=14248 width=48) (actual time=79.662..206.653 rows=143883 loops=1)
                           ->  HashAggregate  (cost=9879.83..11429.26 rows=14248 width=12) (actual time=79.657..176.290 rows=143883 loops=1)
                                 Group Key: mrp_bom_line.bom_id, mrp_bom_line.product_id
                                 Filter: ((min(mrp_bom_line.id) IS NOT NULL) AND (min(mrp_bom_line.id) IS NOT NULL))
                                 Batches: 33  Memory Usage: 1393kB  Disk Usage: 7248kB
                                 ->  Seq Scan on mrp_bom_line  (cost=0.00..3656.24 rows=143924 width=12) (actual time=0.004..9.685 rows=143924 loops=1)
         ->  Index Scan using stock_move_pkey on stock_move sm_update  (cost=0.43..1.80 rows=1 width=360) (actual time=0.109..0.109 rows=1 loops=227535)
               Index Cond: (id = sm.id)
 Planning Time: 2.549 ms
 JIT:
   Functions: 37
   Options: Inlining false, Optimization false, Expressions true, Deforming true
   Timing: Generation 5.684 ms, Inlining 0.000 ms, Optimization 2.307 ms, Emission 33.151 ms, Total 41.142 ms
 Execution Time: 47446.693 ms

Indexes on mrp_production

Indexes:
    "mrp_production_pkey" PRIMARY KEY, btree (id)
    "mrp_production_company_id_index" btree (company_id)
    "mrp_production_date_deadline_index" btree (date_deadline)
    "mrp_production_date_finished_index" btree (date_finished)
    "mrp_production_date_planned_finished_index" btree (date_planned_finished)
    "mrp_production_date_planned_start_index" btree (date_planned_start)
    "mrp_production_date_start_index" btree (date_start)
    "mrp_production_message_main_attachment_id_index" btree (message_main_attachment_id)
    "mrp_production_name_uniq" UNIQUE CONSTRAINT, btree (name, company_id)
    "mrp_production_reservation_state_index" btree (reservation_state)
    "mrp_production_state_index" btree (state)

@thomaspaulb
Copy link
Author

Okay, I think while my new query does indeed lead to better performance, the real issue was the missing index.

@ddejong-therp We are by default running upgrade base on the pre-migration version, right? This should try to recreate any missing indexes, if I'm not mistaken. Could we perhaps add a script that checks the output of that for any error messages about why it could not create that index, and if so, error out? In that case, we would have spotted the missing index almost immediately and could have created it.

@pedrobaeza
Copy link
Member

I think it's better to do -u all

AND mbl.product_id = sm.product_id
)
WHERE sm.raw_material_production_id = mp.id
AND mbl.bom_line_id IS NOT NULL
Copy link
Member

@StefanRijnhart StefanRijnhart Oct 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure this bom_line_id IS NOT NULL condition is void here. Is this where the original sm_update.bom_line_id IS NULL landed? (I'm not sure why that was there in the previous version because the field is new in version 10 and here is the only place where it is populated).

-- edit -- oh, it's because the subquery that provides it is left-joined. Makes me think you can take the condition out and apply an inner join.

@StefanRijnhart
Copy link
Member

StefanRijnhart commented Oct 13, 2022

It's encouraging that this is still improving performance, but I am not sure that the readability is improved. Do you think a CTE version like this will have the same performance improvement and do you agree it is more readable? Warning: untested query.

with bom_line_products as (
    SELECT min(id) AS bom_line_id, bom_id, product_id
    FROM mrp_bom_line
    GROUP BY bom_id, product_id
),
move_bom_lines as (
     SELECT sm.id as move_id, blp.bom_line_id
     FROM stock_move sm
     JOIN mrp_production mp
         ON sm.raw_material_production_id = mp.id
     JOIN bom_line_products blp
         ON blp.bom_id = mp.bom_id
         AND blp.product_id = sm.product_id
)
UPDATE stock_move sm
SET bom_line_id = mbl.bom_line_id
FROM move_bom_lines mbl
WHERE sm.id = mbl.move_id;

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants