Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Efficient storage of integer G3Map and G3Vector objects #140

Merged
merged 6 commits into from
Feb 20, 2024
Merged

Conversation

arahlin
Copy link
Member

@arahlin arahlin commented Feb 6, 2024

Use int64_t for in-memory representation of integer values, but store as 8-, 16-, 32- or 64-bit integers on disk, depending on the bit depth of the underlying data. Use a consistent ABI for G3Vector and G3Map serialization, and ensure backwards compatibility for v1 int32_t objects on disk.

Closes #122.

Use int64_t for in-memory representation of integer values, but store as 8-, 16-
or 32- bit integers on disk, depending on bit depth of the underlying data.
Backwards compatible with v1 int32 G3Map objects.

Closes #122.
@arahlin arahlin requested a review from nwhitehorn February 6, 2024 23:04
@arahlin arahlin self-assigned this Feb 6, 2024
@arahlin arahlin requested a review from mhasself February 6, 2024 23:37
@arahlin
Copy link
Member Author

arahlin commented Feb 6, 2024

Tagging @mhasself here, in case you have opinions on this implementation, since you wrote the G3VectorInt implementation this is copied from.

Copy link
Member

@mhasself mhasself left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the tag! Looks good to me. There's a lot of repetitive code now ... but if the alternative is template mazes then that's not always better.

Couple suggestions in line, but don't stop this train for me.

core/src/G3Map.cxx Outdated Show resolved Hide resolved
core/src/G3Map.cxx Outdated Show resolved Hide resolved
Compute store_bits separately for each vector in G3MapVectorInt
@arahlin arahlin changed the title Efficient storage of G3MapInt and G3MapVectorInt objects Efficient storage of integer G3Map and G3Vector objects Feb 8, 2024
Copy link
Member

@nwhitehorn nwhitehorn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor code comments in line. How much time does bit_count() take? Is there any meaningful run time impact from this change?

The code is a little spaghetti-ish, but I don't see a viable way to avoid that.

core/src/G3Map.cxx Outdated Show resolved Hide resolved
core/src/G3Map.cxx Outdated Show resolved Hide resolved
core/src/G3Map.cxx Show resolved Hide resolved
@arahlin
Copy link
Member Author

arahlin commented Feb 20, 2024

I haven't profiled this -- @mhasself do you know how efficient the bit_count function is?

@arahlin arahlin merged commit cc46e8d into master Feb 20, 2024
1 check passed
@arahlin arahlin deleted the map_int64 branch February 20, 2024 15:16
@mhasself
Copy link
Member

I haven't profiled this -- @mhasself do you know how efficient the bit_count function is?

For the record, no I do not.

But it's probably like, 3 assembly instructions (in the loop), and seems like the sort of thing that would pipeline pretty well (despite the ifs). There might be a faster version where you accumulate the +ve and -ve data into separate masks and combine them at the end. An interesting question for a less busy day.

@mhasself
Copy link
Member

Against my better schedule management ... here is a (relative) profiling. The bit_count function takes 10 ns / sample on my laptop. Tests done with 1k to 1M elements. It seems to take about 10% more time than this simple accumulation function:

int not_bit_count_v2(const std::vector<int64_t> &d) {
    int sum = 0;
    for (auto c: d)
        sum += c;
    return sum;
}

So I think it's close enough to maxed out.

@arahlin
Copy link
Member Author

arahlin commented Feb 20, 2024

Cool, good to know!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

G3MapVectorInt and G3MapInt should use int64
3 participants