We got more feedback on Matrix on our parquet export:
I also wonder: it looks to me like the last parquet file will keep growing content-wise until it's "full", is that right? If that's true, it would be nice to be able to avoid redownloading everything that didn't change, with something like hashes in the manifest (but there's probably a better way even).
Although I guess I could also just include the last file I downloaded and redownload that as well when updating... everything except the last one shouldn't change over time, I guess (short of schema changes :-))?
It would be nice to be able to see in the manifest if a file changed after the last download.
Possible additions to the manifest to achieve this:
schema version
timestamp of the last change of the file
parquet file hash
row count
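To make the proposal concrete, here is a minimal sketch of what a manifest entry with these fields could look like and how a client might use it to skip unchanged files. The field names (`file`, `schema_version`, `last_modified`, `sha256`, `row_count`) and both helper functions are assumptions for illustration, not the current manifest format. The row count is passed in by the caller, since the export job presumably already knows it.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large parquet files are not loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def manifest_entry(path: Path, schema_version: int, row_count: int) -> dict:
    """Build one hypothetical manifest entry for an exported parquet file."""
    mtime = datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)
    return {
        "file": path.name,
        "schema_version": schema_version,
        "last_modified": mtime.isoformat(),
        "sha256": file_sha256(path),
        "row_count": row_count,
    }


def files_to_download(remote_entries: list[dict], local_hashes: dict[str, str]) -> list[str]:
    """Return the names of files that are missing locally or whose hash differs."""
    return [
        entry["file"]
        for entry in remote_entries
        if local_hashes.get(entry["file"]) != entry["sha256"]
    ]
```

A client would keep a mapping of file name to the hash it last downloaded and pass it as `local_hashes`. With this scheme only new files and the still-growing last file would need to be fetched.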
I am also not sure whether the parquet files are generated incrementally and in order. It would be good to look into this and check whether rows are only ever added to the last file.
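One way to investigate this would be to compare two manifest snapshots taken at different times. The sketch below assumes a hypothetical manifest in which each entry has `file`, `sha256`, and `row_count` fields and entries are listed in generation order. It reports every deviation from the "only the last file grows" assumption.

```python
def check_incremental(old: list[dict], new: list[dict]) -> list[str]:
    """Compare two ordered manifest snapshots and list append-only violations.

    Expected behaviour (an assumption to be verified): every file except the
    last one of the old snapshot is unchanged, the old last file may only gain
    rows, and new files may only be appended at the end.
    """
    problems = []
    if len(new) < len(old):
        problems.append("files were removed from the manifest")
    for index, (before, after) in enumerate(zip(old, new)):
        is_old_last = index == len(old) - 1
        if before["file"] != after["file"]:
            problems.append(f"{before['file']}: replaced by {after['file']}")
        elif is_old_last:
            # The last file is allowed to change, but it should never shrink.
            if after["row_count"] < before["row_count"]:
                problems.append(f"{before['file']}: row count decreased")
        elif before["sha256"] != after["sha256"]:
            problems.append(f"{before['file']}: changed although it was not the last file")
    return problems
```

An empty result means the two snapshots are consistent with incremental, in-order generation. It does not prove that the rows already present in the last file were left untouched, which would require comparing the file contents.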
Tasks
Investigate if parquet files are only updated incrementally
Related
#1668