`arrow2` does _not_ refcount schema metadata #1805
Labels: 🏹 arrow (concerning arrow), 🚀 performance (Optimization, memory use, etc), ⛃ re_datastore (affects the datastore itself)
All `arrow2` arrays are defined as roughly the same three parts: a `DataType`, a `values` buffer, and a `validity` bitmap.

When you clone/slice/index an `Array`, you get another `Array` in roughly `O(1)`, thanks to both the `values` and `validity` bitmaps being refcounted behind the scenes.

Well... not really. It turns out the `DataType` is not refcounted, and it can get huge: it's a massive heap-recursive enum potentially filled with strings and such.

Say you have a `ListArray` that contains a bunch of `StructArray`s (i.e. a column of component data) and you want to extract references to the individual `StructArray`s in that list (i.e. the individual `DataCell`s): each of these arrays is now going to carry a full copy of the `StructArray`'s schema.

For tiny `DataCell`s (which are very common in Rerun), the overhead is enormous.
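To make the asymmetry concrete, here is a minimal, std-only sketch. It does not use `arrow2` itself: `MockArray`, the cut-down `DataType`/`Field` types, `make_column` and `first_field_name_ptr` are all hypothetical stand-ins, and a single flat `f32` buffer stands in for the real child arrays. It only illustrates the pattern described above, where slicing shares the buffers through a refcount but deep-copies the schema.

```rust
use std::sync::Arc;

/// Hypothetical, cut-down stand-in for a heap-recursive schema enum.
#[derive(Clone, Debug, PartialEq)]
enum DataType {
    Float32,
    Struct(Vec<Field>),
}

#[derive(Clone, Debug, PartialEq)]
struct Field {
    name: String, // heap-allocated: cloned byte for byte
    data_type: DataType,
}

/// Hypothetical stand-in for an array: refcounted buffers, plain schema.
struct MockArray {
    data_type: DataType,           // NOT refcounted
    values: Arc<[f32]>,            // refcounted
    validity: Option<Arc<[bool]>>, // refcounted
    offset: usize,
    length: usize,
}

impl MockArray {
    /// Cheap for the buffers, but a full deep copy of the schema.
    fn slice(&self, offset: usize, length: usize) -> Self {
        assert!(offset + length <= self.length);
        Self {
            data_type: self.data_type.clone(), // allocates a new Vec and new Strings
            values: Arc::clone(&self.values),  // bumps a refcount only
            validity: self.validity.clone(),   // bumps a refcount only (if present)
            offset: self.offset + offset,
            length,
        }
    }

    fn values(&self) -> &[f32] {
        &self.values[self.offset..self.offset + self.length]
    }
}

/// Builds a column of `n_cells` tiny cells of three floats each.
fn make_column(n_cells: usize) -> MockArray {
    let fields = ["x", "y", "z"]
        .iter()
        .map(|n| Field { name: n.to_string(), data_type: DataType::Float32 })
        .collect();
    MockArray {
        data_type: DataType::Struct(fields),
        values: Arc::from(vec![0.0f32; n_cells * 3]),
        validity: None,
        offset: 0,
        length: n_cells * 3,
    }
}

/// Address of the first field name's heap buffer, used to detect copies.
fn first_field_name_ptr(dt: &DataType) -> Option<*const u8> {
    match dt {
        DataType::Struct(fields) => fields.first().map(|f| f.name.as_ptr()),
        DataType::Float32 => None,
    }
}

fn main() {
    let column = make_column(100);
    let cells: Vec<MockArray> = (0..100).map(|i| column.slice(i * 3, 3)).collect();

    // One shared values buffer: the column plus 100 cells hold a reference.
    assert_eq!(Arc::strong_count(&column.values), 101);

    // Every cell owns an equal but separately allocated copy of the schema.
    for cell in &cells {
        assert_eq!(cell.values().len(), 3);
        assert_eq!(cell.data_type, column.data_type);
        assert_ne!(
            first_field_name_ptr(&cell.data_type),
            first_field_name_ptr(&column.data_type)
        );
    }
    println!("{} cells, each with its own schema copy", cells.len());
}
```

In this sketch each cell holds only three floats of payload, yet carries a freshly allocated `Vec<Field>` and three `String`s, which is the kind of overhead the issue describes for tiny `DataCell`s.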