wip: smol datatypes #1847

Closed
wants to merge 3 commits into from

@teh-cmc teh-cmc force-pushed the cmc/smol_datatypes branch 3 times, most recently from 6ccfe18 to 48f06b1 Compare April 17, 2023 08:42
@teh-cmc teh-cmc force-pushed the cmc/smol_datatypes branch from 48f06b1 to 0e5138b Compare April 17, 2023 09:21
@teh-cmc
Member Author

teh-cmc commented Apr 17, 2023

Running the new extended memory_usage example:

Short EntityPath uses 78 bytes in RAM

--- 1 rows each containing 1 points (packed=false) ---
Arrow payload containing 1x Pos2 uses 2.9 kiB bytes in RAM
Arrow LogMsg containing 1x Pos2 uses 7.6 kiB-10.1 kiB bytes in RAM, and 794 B bytes encoded

--- 1 rows each containing 1 points (packed=true) ---
Arrow payload containing 1x Pos2 uses 2.5 kiB bytes in RAM
Arrow LogMsg containing 1x Pos2 uses 7.6 kiB-9.7 kiB bytes in RAM, and 746 B bytes encoded

--- 1 rows each containing 1000 points (packed=false) ---
Arrow payload containing 1000x Pos2 uses 10.3 kiB bytes in RAM
Arrow LogMsg containing 1000x Pos2 uses 15.4 kiB-17.5 kiB bytes in RAM, and 7.9 kiB bytes encoded

--- 1 rows each containing 1000 points (packed=true) ---
Arrow payload containing 1000x Pos2 uses 10.3 kiB bytes in RAM
Arrow LogMsg containing 1000x Pos2 uses 15.4 kiB-17.5 kiB bytes in RAM, and 7.8 kiB bytes encoded

--- 100000 rows each containing 1 points (packed=false) ---
Arrow payload containing 1x Pos2 uses 90.6 MiB bytes in RAM
Arrow LogMsg containing 1x Pos2 uses 4.8 MiB-95.8 MiB bytes in RAM, and 1.6 MiB bytes encoded

--- 100000 rows each containing 1 points (packed=true) ---
Arrow payload containing 1x Pos2 uses 54.0 MiB bytes in RAM
Arrow LogMsg containing 1x Pos2 uses 4.8 MiB-59.3 MiB bytes in RAM, and 1.6 MiB bytes encoded

--- 100000 rows each containing 1000 points (packed=false) ---
Arrow payload containing 1000x Pos2 uses 851 MiB bytes in RAM
Arrow LogMsg containing 1000x Pos2 uses 767 MiB-1.6 GiB bytes in RAM, and 683 MiB bytes encoded

--- 100000 rows each containing 1000 points (packed=true) ---
Arrow payload containing 1000x Pos2 uses 816 MiB bytes in RAM
Arrow LogMsg containing 1000x Pos2 uses 767 MiB-1.5 GiB bytes in RAM, and 683 MiB bytes encoded
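As a rough reading of the figures above, the per-row cost of the Arrow payload can be derived from the 100000-row runs. The sketch below only redoes that arithmetic; the helper name `bytes_per_row` is hypothetical, and the 8-byte size of a `Pos2` (two 32-bit floats) is an assumption, not something the output states.

```python
# Back-of-the-envelope arithmetic on the Arrow payload sizes reported above.
MIB = 1024 * 1024
ROWS = 100_000
POS2_BYTES = 8  # assumption: a Pos2 is two f32 values


def bytes_per_row(total_mib: float, rows: int = ROWS) -> float:
    """Average payload bytes per row, given a total reported in MiB."""
    return total_mib * MIB / rows


# 100000 rows x 1 point: 90.6 MiB (packed=false) vs 54.0 MiB (packed=true)
unpacked_1 = bytes_per_row(90.6)
packed_1 = bytes_per_row(54.0)
saving_1 = 1 - packed_1 / unpacked_1

# 100000 rows x 1000 points: 851 MiB (packed=false) vs 816 MiB (packed=true)
unpacked_1000 = bytes_per_row(851)
packed_1000 = bytes_per_row(816)
saving_1000 = 1 - packed_1000 / unpacked_1000

print(f"1 point/row:    {unpacked_1:.0f} B -> {packed_1:.0f} B per row "
      f"({saving_1:.0%} smaller, raw data is {POS2_BYTES} B)")
print(f"1000 points/row: {unpacked_1000:.0f} B -> {packed_1000:.0f} B per row "
      f"({saving_1000:.0%} smaller, raw data is {1000 * POS2_BYTES} B)")
```

Under these assumptions, packing matters most for many small rows, where fixed per-row overhead dominates the payload; with 1000 points per row the point data itself makes up most of the size and the relative saving is small.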

@teh-cmc
Member Author

teh-cmc commented Jun 9, 2023

Closing -- will get fixed automagically by #2354

@teh-cmc teh-cmc closed this Jun 9, 2023

Successfully merging this pull request may close these issues.

Implement minimal datatype registry for cross-batch deduplication
arrow2 does _not_ refcount schema metadata