Revert to using arrow tables for full valid values grid in check_tbl_values_required() #37
Previously I had been using arrow tables, which seem more memory efficient and generally more performant, to optimise check_tbl_values_required(), which can be slow with larger files.

In d1e2861 I reverted this because I discovered that joins using arrow did not consider NA values as matches (as dplyr does by default), resulting in data being lost during inner joins that included NA values (see the issue reported here: apache/arrow#14907).

Hopefully, this will at some point be resolved. Once it is, the changes in d1e2861 will need reverting to make the function more performant again.
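
For reference, a minimal sketch of the difference in NA matching, using dplyr only. The data and the column names (site, valid, value) are made up for illustration and are not taken from check_tbl_values_required(). The na_matches = "never" call is used here only as a stand-in for the arrow join behaviour described above; it does not call arrow itself.

```r
library(dplyr)

# Hypothetical toy data: a grid of valid values and a table to check,
# each with a missing value in the join column.
valid_grid <- tibble(site = c("A", NA), valid = c(TRUE, TRUE))
tbl <- tibble(site = c("A", NA, "B"), value = c(1, 2, 3))

# dplyr default (na_matches = "na"): NA is treated as matching NA,
# so the row with a missing site is kept by the inner join.
matched_default <- inner_join(tbl, valid_grid, by = "site")

# na_matches = "never": NA never matches anything, so the row with a
# missing site is dropped. This mimics the data loss described above.
matched_never <- inner_join(tbl, valid_grid, by = "site", na_matches = "never")

nrow(matched_default) # 2 rows: "A" and NA
nrow(matched_never)   # 1 row: "A" only
```

Comparing the row counts of the two joins in this way may be a quick check of whether a given arrow version still drops NA rows before the changes in d1e2861 are reverted.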