segmentation fault in st_contains spatial join #438
Comments
Hi! Happy to hear DuckDB spatial is useful for you and thanks for opening this issue! R-Tree indexes are not used for spatial joins yet, only when doing spatial filters, so it makes sense that you wouldn't see it used in the plan. Adding support for joins is of course high on the TODO list but I can't provide an exact timeline. All the functions you use, If you convert and store the geometries into a table first (so that you don't have to call ST_GeomFromHEXEWKB in the main query), does it still crash?
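A minimal sketch of what "convert and store the geometries into a table first" could look like. Only ST_GeomFromHEXEWKB comes from the thread; the file name townships.parquet, the table name townships, and the columns township_id and geom_hex are hypothetical placeholders, since the original query and data were not shared.

```sql
-- Assumes the spatial extension can be installed and loaded.
INSTALL spatial;
LOAD spatial;

-- Decode the hex EWKB strings once and store real GEOMETRY values,
-- so the join query no longer calls ST_GeomFromHEXEWKB per row.
-- 'townships.parquet', township_id and geom_hex are hypothetical names.
CREATE TABLE townships AS
SELECT
    township_id,
    ST_GeomFromHEXEWKB(geom_hex) AS geom
FROM read_parquet('townships.parquet');
```

The later join can then reference townships.geom directly, which also makes it easier to tell whether a crash comes from the decoding step or from the join predicate.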
@Maxxen, thank you for your reply. Unfortunately, I cannot share the proprietary data, but the suggestions were right on:
Yet, in another query, using the same data from the first query, I'm trying to perform a spatial join between a 142M row subset table and a 3M row subset table. Same geometries. Using the same Spatial extension functions as the first, passing query:
and it is using up to 57GiB RAM, but eventually hitting the initial
I then tried dialing down the precision of my spatial join with a:
Realizing I only really care how records match to the larger table, I changed the
I really appreciate you taking the time to answer my questions, and the work you're doing to improve the Spatial extension.
Looking at it more:
Problem
Good morning, again!
When performing an ST_ContainsProperly or ST_Within of a centroid in another boundary (the centroids of a 152M record table compared against the large township boundaries of only a 2M record table), I'm getting a segmentation fault.
As an aside, I want to say that we are very close to getting DuckDB to work for processing all our spatial data. This spatial join is the last blocker. DuckDB is performing faster than PostGIS. The work done here has been a blessing. I am grateful for everyone here at DuckDB!
Traceback
Steps to re-create
Running this query against a row table stored in a .duckdb file, locally.
geom column of the large, 152M record table, parcel_batch_1.
segmentation fault.
Initial Query Plan
Alternate Attempt (no segmentation fault, but query hangs at 50%)
centroid column on parcel_batch_1 so the st_centroid needn't be calculated in the join condition, and materialized the Parquet file as a table. I added an R-Tree index on both parcel_batch_1.centroid and the geom column of the township (formerly the Parquet file) table. Still, no usage in query plan.
ST_ContainsProperly to ST_Contains, as well.
Environment:
My Machine:
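For readers trying to reproduce the alternate attempt described above, a rough sketch follows. The names parcel_batch_1, geom, and centroid come from the issue; the township table name townships, the columns parcel_id and township_id, and both index names are hypothetical, and the township table is assumed to already hold GEOMETRY values in geom. This is not the original query, which was not preserved in the thread.

```sql
-- Assumes the spatial extension is already installed.
LOAD spatial;

-- Precompute centroids once so ST_Centroid is not evaluated
-- inside the join condition.
ALTER TABLE parcel_batch_1 ADD COLUMN centroid GEOMETRY;
UPDATE parcel_batch_1 SET centroid = ST_Centroid(geom);

-- R-Tree indexes on both sides (index names are hypothetical).
-- Per the maintainer's comment, these are used for spatial filters,
-- not for joins, so they may not appear in the join plan.
CREATE INDEX parcel_centroid_idx ON parcel_batch_1 USING RTREE (centroid);
CREATE INDEX township_geom_idx ON townships USING RTREE (geom);

-- Inspect the plan for the point-in-polygon join.
-- parcel_id and township_id are hypothetical column names.
EXPLAIN
SELECT p.parcel_id, t.township_id
FROM parcel_batch_1 AS p
JOIN townships AS t
  ON ST_Contains(t.geom, p.centroid);
```

Running EXPLAIN before the full query is a cheap way to confirm which join operator is chosen without committing to the full 152M row scan.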