An alternative to creating a temporary table and then loading it into catalog and catalog_to_X is to COPY the query result to a temporary CSV file and then COPY it back to both tables.
This would probably require creating the temporary file with all the necessary columns and then selecting which ones to actually copy depending on the destination table.
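One caveat on the column selection: as far as I know, COPY ... FROM maps the fields of the file positionally onto the column list, so a file that carries extra columns cannot be loaded into a narrower table directly. A minimal sketch of one way around that, assuming the temporary file is CSV with a header row, is to project the wide file down to one CSV per destination table before loading. The function name and the column names in the usage below are hypothetical, not the real catalog schema.

```python
import csv
import io


def project_columns(wide_csv: str, columns: list[str]) -> str:
    """Return CSV text holding only `columns`, in the given order.

    `wide_csv` is the text of a CSV file with a header row that contains
    every column needed by any destination table.
    """
    reader = csv.DictReader(io.StringIO(wide_csv))
    missing = [c for c in columns if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"columns not present in the file: {missing}")

    out = io.StringIO()
    # extrasaction="ignore" drops the fields this destination does not need.
    writer = csv.DictWriter(
        out, fieldnames=columns, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()
```

For example, `project_columns(text, ["target_id", "catalogid", "best"])` would give the file for catalog_to_X, and a second call with the (hypothetical) catalog columns would give the file for catalog. For files of this size a streaming version that reads and writes file objects instead of strings would be the better choice.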
I think this would be useful for phases 1 and 3, probably not for 2.
To test this, we can compare it with an existing CREATE TABLE AS + INSERT pair. With enable_hashjoin=false, this query
CREATE TEMPORARY TABLE IF NOT EXISTS "6cbada9d" AS SELECT DISTINCT ON ("t1"."object_id") "t1"."object_id" AS "target_id", "t2"."catalogid", (first_value("t1"."object_id") OVER (PARTITION BY "t2"."catalogid" ORDER BY "t1"."object_id" ASC) = "t1"."object_id") AS "best" FROM "catalogdb"."skymapper_dr2" AS "t1" INNER JOIN "catalogdb"."tic_v8" AS "t3" ON ("t1"."gaia_dr2_id1" = "t3"."gaia_int") INNER JOIN "catalogdb"."catalog_to_tic_v8" AS "t2" ON ("t2"."target_id" = "t3"."id") WHERE ((("t2"."version_id" = 25) AND ("t2"."best" IS true)) AND NOT EXISTS(SELECT 1 FROM "catalogdb"."catalog_to_skymapper_dr2" AS "t4" WHERE (("t4"."version_id" = 25) AND (("t4"."target_id" = "t1"."object_id") OR ("t4"."catalogid" = "t2"."catalogid")))))
takes ~4.5 h, while
INSERT INTO "catalogdb"."catalog_to_skymapper_dr2" ("target_id", "catalogid", "version_id", "best") SELECT "t1"."target_id", "t1"."catalogid", 25, "t1"."best" FROM "6cbada9d" AS "t1"
takes 2.5 h. We can replace them with a
COPY (SELECT DISTINCT ON ("t1"."object_id") "t1"."object_id" AS "target_id", "t2"."catalogid", (first_value("t1"."object_id") OVER (PARTITION BY "t2"."catalogid" ORDER BY "t1"."object_id" ASC) = "t1"."object_id") AS "best" FROM "catalogdb"."skymapper_dr2" AS "t1" INNER JOIN "catalogdb"."tic_v8" AS "t3" ON ("t1"."gaia_dr2_id1" = "t3"."gaia_int") INNER JOIN "catalogdb"."catalog_to_tic_v8" AS "t2" ON ("t2"."target_id" = "t3"."id") WHERE ((("t2"."version_id" = 25) AND ("t2"."best" IS true)))) TO '<tempfile>'
and then a COPY of that file into an empty table. Note that we need to remove the NOT EXISTS for this test: the earlier run already inserted these entries, so repeating the query with the NOT EXISTS in place would return no rows.
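For running the comparison from Python, here is a rough sketch of the round trip. It swaps the server-side `'<tempfile>'` for a client-side temporary file and uses psycopg2's `cursor.copy_expert`, which is an assumption about the driver in use. The function name `copy_round_trip` is hypothetical. Also note that the SELECT above does not emit version_id, so for the COPY back to work the query would presumably have to add it (for example a constant 25 AS "version_id"), since COPY cannot inject it the way the INSERT ... SELECT does.

```python
import tempfile


def copy_round_trip(cur, select_sql, table, columns):
    """Dump `select_sql` to a temporary CSV file and load it into `table`.

    `cur` is expected to behave like a psycopg2 cursor, i.e. to provide
    copy_expert(sql, file). `table` and `columns` are interpolated as-is,
    so they must come from trusted code, not from user input.
    """
    cols = ", ".join(f'"{c}"' for c in columns)
    with tempfile.TemporaryFile(mode="w+") as tmp:
        # Step 1: stream the query result into the temporary file.
        cur.copy_expert(f"COPY ({select_sql}) TO STDOUT WITH (FORMAT csv)", tmp)
        # Step 2: rewind and stream the same file into the destination table.
        tmp.seek(0)
        cur.copy_expert(f"COPY {table} ({cols}) FROM STDIN WITH (FORMAT csv)", tmp)
```

Timing the two `copy_expert` calls separately would give numbers directly comparable to the ~4.5 h and 2.5 h above, although going through the client adds network transfer that the server-side file variant avoids.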