[GSoC] Use a compact hash table for RubyHash instead of the buckets strategy #3172
Conversation
I did a first pass. Good start :)
int signMask = h >> 31;
// make positive
h = h ^ signMask;
Would h & ((1<<31)-1) do the same? (if so it seems more efficient)
And isn't h & (max - 1) already enough on its own to make it positive?
Maybe we can even use return h & (max - 2) to do all of it at once? And assert max >= 4 && isPowerOfTwo(4).
There is a definition of isPowerOfTwo() in IntegerNodes; you could move it to RubyGuards to reuse it.
I'm not sure we should discard the lowest bit of the hash though, it might be better to use (h & ((index.length>>1) - 1)) << 1 instead.
Would h & ((1<<31)-1) do the same?
Hmmm, ((1 << 31) - 1) is the bitmask with a zero sign bit followed by 31 ones, right? I had this idea too, although I thought of it more directly in terms of the mask clearing the sign bit of the two's complement representation.
I rejected the idea at first because I thought it would be "less fair" than the shifted absolute value function I'm using, meaning that it would result in some indices having more collisions than others. I don't remember my reasoning, but it was wrong! I checked now, and that '&' actually has the same property as the absolute value: it maps each negative number to exactly one positive number, just not its absolute value. Yup, checks out, it would work.
isn't h & (max - 1) already enough on its own to make it positive?
Also yes, thanks for pointing that out. Although I'm not sure that actually has the same semantics as "make positive, THEN take the mod", I don't think anybody cares about the mod operation per se: all we want is a number between 0 and max-1, and h & (max - 1) does that nicely.
return h & (max - 2)
Brilliant fusion of operations!
assert max >= 4 && isPowerOfTwo(4)
I think you meant putting max inside the isPowerOfTwo call, not 4. Also, 2 is technically a valid power of 2, right? It will always result in 0, which is the only valid index anyway. Regardless, max will always be >= 8 by the unstated precondition of the initialCapacity constructor, which is that the passed capacity be greater than or equal to the default initial capacity (8). I can assert that in the constructor itself.
I'm not sure we should discard the lowest bit of the hash though, it might be better to use (h & ((index.length>>1) - 1)) << 1 instead
Hmm, can you elaborate further? The current logic is not discarding the lowest bit of the hash, just the lowest bit of the resulting mod (or mod-like) remainder. As far as I can see, that's the simplest way to ensure an even array index. What are some bad consequences of this in your opinion?
I think you meant putting max inside the isPowerOfTwo call, not 4. Also, 2 is technically a valid power of 2, right? It will always result in 0, which is the only valid index anyway. Regardless, max will always be >= 8 by the unstated precondition of the initialCapacity constructor, which is that the passed capacity be greater than or equal to the default initial capacity (8). I can assert that in the constructor itself.
Yes, it needs to be a power of 2 >= 4, because if it's 2 then the mask would be 0 (h & 0) and only cause collisions.
I would assert it here because this is the place where it needs to hold. You can also assert it in other places of course, and e.g. create a helper method that asserts those 2 conditions.
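For illustration, a minimal sketch of what such a helper could look like (the names follow the isPowerOfTwo()/RubyGuards discussion above, but the exact location and signatures are assumptions):

// Sketch only: assumed location and signatures, mirroring the discussion above.
public abstract class RubyGuards {

    /** True for 1, 2, 4, 8, ... */
    public static boolean isPowerOfTwo(int value) {
        return value > 0 && (value & (value - 1)) == 0;
    }

    /** Combined precondition for the index mask: a power of two that is >= 4. */
    public static boolean isValidIndexSize(int max) {
        return max >= 4 && isPowerOfTwo(max);
    }
}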
Hmm, can you elaborate further? The current logic is not discarding the lowest bit of the hash, just the lowest bit of the resulting mod (or mod-like) remainder. As far as I can see, that's the simplest way to ensure an even array index. What are some bad consequences of this in your opinion?
It is keeping the mod remainder, what is above it is discarded. But if we do return h & (max - 2), then we also discard the lowest bit, which seems not ideal.
Suppose I have some hash values like 6 and 7: if we discard the lowest bit, those collide, but if we didn't discard it they wouldn't.
We can't take all 32 bits of the hash, because our array will never be that long, so I think the typical approach is to preserve all the lower bits, because the lower bits are the most likely to differ for objects that are "close" in value to each other (e.g. if an int's hash is just itself).
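To make the collision point concrete, here is an illustrative comparison of the two mappings (variable names are taken from the snippets above; this is not the actual CompactHashStore code):

// Illustration only: two ways to map a hash h to an even slot in the index array.
final class IndexSlotSketch {
    // h & (max - 2): the result is even, but it drops the lowest hash bit,
    // so hashes 6 and 7 land in the same slot (6 & 6 == 7 & 6 == 6 for max = 8).
    static int discardingLowestHashBit(int h, int max) {
        return h & (max - 2);
    }

    // (h & ((index.length >> 1) - 1)) << 1: the mask keeps the low hash bits and
    // the shift makes the slot even, so 6 -> 4 and 7 -> 6 (for index.length = 8).
    static int keepingLowestHashBit(int h, int indexLength) {
        return (h & ((indexLength >> 1) - 1)) << 1;
    }
}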
There's a lot of code to work through here. I haven't done a comprehensive pass on the core logic yet. I think if you apply some of the feedback I suggested, it'll be easier to follow the logic.
As a couple of other general notes:
- You have several branches without a profile. That's fine for a first pass, but will likely limit your performance, unless all branches are equally likely.
- You're missing some necessary boundaries.
An easy way to check if you're missing a necessary boundary is to run jt build --env native. It's slow, but if you end up with output like:
Deepest level call tree path (47 calls):
62 ; com.sun.org.apache.xerces.internal.impl.xpath.regex.ParserForXMLSchema.getRange(ParserForXMLSchema.java:381) ; com.sun.org.apache.xerces.internal.impl.xpath.regex.ParserForXMLSchema.setupRange(Token, String)
489 ; com.sun.org.apache.xerces.internal.impl.xpath.regex.ParserForXMLSchema.getTokenForShorthand(ParserForXMLSchema.java:321) ; com.sun.org.apache.xerces.internal.impl.xpath.regex.ParserForXMLSchema.getRange(String, boolean)
145 ; com.sun.org.apache.xerces.internal.impl.xpath.regex.RegexParser.parseAtom(RegexParser.java:783) ; com.sun.org.apache.xerces.internal.impl.xpath.regex.ParserForXMLSchema.getTokenForShorthand(int) ; com.sun.org.apache.xerces.internal.impl.xpath.regex.RegexParser.getTokenForShorthand(int)
440 ; com.sun.org.apache.xerces.internal.impl.xpath.regex.RegexParser.parseFactor(RegexParser.java:677) ; com.sun.org.apache.xerces.internal.impl.xpath.regex.RegexParser.parseAtom()
then you'll know you missed one. The output is letting you know the Native Image attempted to compile too many methods for runtime compilation. Now, that's a slow process, so you may find it more advantageous to go through the code and try to reason about it or add boundaries everywhere and work to eliminate them.
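For reference, cutting a method out of runtime compilation looks roughly like this (the method name is invented for illustration; only the annotation is the standard Truffle one):

import com.oracle.truffle.api.CompilerDirectives.TruffleBoundary;

// Hypothetical example: the annotation stops partial evaluation at this call,
// so the method's body no longer counts towards runtime-compiled methods.
@TruffleBoundary
private static void rebuildIndexSlowPath(CompactHashStore store) {
    // heavy, rarely-taken logic that should not be partially evaluated
}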
Another option available is to modify mx.truffleruby/native-image.properties to add the PrintRuntimeCompilationCallTree host option:
diff --git a/mx.truffleruby/native-image.properties b/mx.truffleruby/native-image.properties
index 5be3cebd3b..72ade59fda 100644
--- a/mx.truffleruby/native-image.properties
+++ b/mx.truffleruby/native-image.properties
@@ -4,6 +4,7 @@
Requires = language:nfi language:llvm language:regex
Args = -H:MaxRuntimeCompileMethods=5400 \
+ -H:+PrintRuntimeCompilationCallTree \
-H:+AddAllCharsets \
--initialize-at-build-time=org.truffleruby,org.jcodings,org.joni
That'll modify the output of jt build --env native to include the call tree for each method made available for runtime compilation. It's a ton of output, but if you see a deeply nested entry and work your way back to its root, you can identify areas that need boundaries that way. Alternatively, since all of this code lives in CompactHashStore, you can search for that and look for long call trees stemming from that entry.
// This tracks the next valid insertion position into kvStore, it starts at 0 and always increases by 2
// We can't use the size of the RubyHash for that, because deletion reduces its hash.size
// Whereas the insertion pos into kvStore can never decrease, we don't reuse empty slots
private int kvStoreInsertionPos;
This is fine for an initial implementation, but unbounded memory growth sounds bad. It might even be a DoS vector (I need to think on that a bit more).
I'm with you on that one. There is a method resizeKvIfNeeded() that is currently responsible for expanding the KV; I could also make it shrink the KV, triggering a compaction pass if hash.size is too small relative to the KV size.
The problem is that compaction probably implies both a rehash and touching the entire index array. There might be a clever way to avoid both, but I'm not sure I have all the details right.
It looks like this is something we did not handle in BucketsHashStore either, and java.util.HashMap does not seem to do it either.
But yes maybe we should have a lower threshold too, maybe if falling below 0.20 * capacity entries or maybe even a lower factor. It needs to not trigger though e.g. when we create a new Hash with capacity=8 and add the first entries.
I think something to revisit later.
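A hypothetical shape of that lower threshold, just to pin down the idea (the 0.20 factor and the capacity-8 caveat come from this thread; the names and everything else are assumptions):

// Sketch only: illustrative names, not the actual CompactHashStore fields.
private static final int DEFAULT_INITIAL_CAPACITY = 8;
private static final double SHRINK_FACTOR = 0.20;

private static boolean kvShrinkIsNeeded(int hashSize, int kvCapacityInEntries) {
    // Never shrink at or below the default capacity, so a fresh Hash with
    // capacity = 8 is not compacted while its first entries are inserted.
    return kvCapacityInEntries > DEFAULT_INITIAL_CAPACITY
            && hashSize < (int) (SHRINK_FACTOR * kvCapacityInEntries);
}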
private static void assertConstructorPreconditions(int capacity) {
    if (capacity < 1 || capacity > (1 << 28)) {
Why 28? I realize we need the number to be effectively half the max array size so we can store pairs of values, but this seems smaller than I'd expect to be supported.
Well, that number will be rounded upwards to the nearest power of 2. So 2^28 + 1 will be rounded upwards to 2^29. That will be the KV capacity; the index capacity is 2 times that, so 2^30. That's a problem, because the capacity is in entries and each index entry takes 2 array slots, so we allocate 2 * 2^30 = 2^31 array slots for the index, but that's negative!
The thing we can remove to mitigate that is the requirement that the KV size be a multiple of 2; there is no real reason for it, we technically only require that index.length is a power of 2 for an efficient mod. So we could use the supplied capacity as-is for the KV, and then round upwards for the index capacity.
But there's a pathological edge case there: what if the supplied capacity was already a power of 2? Rounding "upwards" will yield the same power of 2, and then you end up with a KV and an index array of the same size, which makes the index too crowded.
OK so 2^28 is correct then, could you add a comment explaining that?
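One possible wording for that comment, assembled from the reasoning above (sketch only; the constant name is illustrative):

// capacity is limited to 2^28 because, in the worst case:
//   - the requested capacity is rounded up to the next power of 2 (up to 2^29),
//   - the index array holds twice as many entries as the KV store (2^30),
//   - and each index entry takes 2 int slots, i.e. 2 * 2^30 = 2^31 array slots,
//     which overflows the maximum Java array length.
private static final int MAX_CAPACITY = 1 << 28;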
int[] oldIndex = index;
this.index = new int[oldIndex.length];

for (int i = 0; tillArrayEnd.inject(node, i < oldIndex.length); i += 2) {
This usage of InlinedLoopConditionProfile is not correct (it never reports the number of iterations done); it needs to be like 4ec8e4d, here and in other places. Could you fix all InlinedLoopConditionProfile usages to match that?
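The usual shape of that pattern looks roughly like the sketch below (this is not the code from commit 4ec8e4d; the wrapping method and the loop body are illustrative):

import com.oracle.truffle.api.TruffleSafepoint;
import com.oracle.truffle.api.nodes.LoopNode;
import com.oracle.truffle.api.nodes.Node;
import com.oracle.truffle.api.profiles.InlinedLoopConditionProfile;

final class LoopReportingSketch {
    // Walks the old index two slots at a time and reports how many iterations ran.
    static void rebuildIndex(Node node, InlinedLoopConditionProfile loopProfile, int[] oldIndex) {
        int i = 0;
        try {
            for (; loopProfile.inject(node, i < oldIndex.length); i += 2) {
                // ... rehash the entry stored at oldIndex[i], oldIndex[i + 1] ...
                TruffleSafepoint.poll(node);
            }
        } finally {
            // Report the iteration count so the compiler can use it.
            LoopNode.reportLoopCount(node, i >> 1);
        }
    }
}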
@TruffleBoundary
private static void resizeIndexIfNeeded(CompactHashStore store, int size,
The heavy logic with the loop below is good to have behind a TruffleBoundary: there is almost no value in partially evaluating it, and doing so would make warmup slower.
But the early check if (indexResizingIsNotNeeded.profile(node, size < store.numSlotsForIndexRebuild)) { must be moved to each caller; we don't want to make a boundary call if it's not needed.
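A hypothetical caller-side shape of that change (the names mirror the snippet above; this is not the actual code):

// Sketch: the cheap, profiled check stays in the caller...
if (indexResizingIsNotNeeded.profile(node, size < store.numSlotsForIndexRebuild)) {
    return; // fast path, no boundary call at all
}
// ...and only the heavy rebuild loop sits behind the @TruffleBoundary method.
resizeIndex(store, size);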
private static void relocateHashIfPossible(int currPos, int relocationPos, int[] index,
What does this do, could you document it? Is it necessary?
This just reuses deleted slots along the lookup chain. If we didn't do this, lots of deleted keys would slow down lookup considerably by lengthening the chain.
The way it works is: the hash lookup logic keeps track of the first deleted slot encountered, and returns it along with the position of the hash it's looking for (avoiding allocation via the IntPair clever hack, packing 2 ints in a long). The caller of the hash lookup logic is then responsible for calling the relocation logic after it uses the hash position to look up the hash.
Yes, it's not very pretty ¯\_(ツ)_/¯
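For reference, the two-ints-in-a-long trick looks roughly like this (IntPair is the name used above, but this exact implementation is only an illustration):

// Sketch of packing a pair of ints (e.g. hash position, first deleted slot) into one long.
final class IntPair {
    static long pair(int first, int second) {
        // first goes in the high 32 bits, second in the low 32 bits
        return ((long) first << 32) | (second & 0xFFFFFFFFL);
    }

    static int first(long pair) {
        return (int) (pair >>> 32);
    }

    static int second(long pair) {
        return (int) pair;
    }
}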
private static final boolean bigHashTypeIsCompact;
static {
    String hashType = System.getProperty("BigHashStrategy");
    bigHashTypeIsCompact = hashType != null && hashType.equalsIgnoreCase("compact");
}
Could you update this so it defaults to your new CompactHashStore? (i.e. use it unless BigHashStrategy == "buckets")
Then we'll be sure the new strategy passes the whole CI.
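One possible shape of that change, mirroring the static initializer above (a sketch, not the actual commit):

private static final boolean bigHashTypeIsBuckets;
static {
    String hashType = System.getProperty("BigHashStrategy");
    // Default to the new CompactHashStore; only fall back when buckets is explicitly requested.
    bigHashTypeIsBuckets = hashType != null && hashType.equalsIgnoreCase("buckets");
}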
The same logic is duplicated in src/main/java/org/truffleruby/core/hash/HashLiteralNode.java, could you deduplicate?
* 130k -> 145k for buckets-lookup.rb
…ire explicit config
…Store from a Hash literal
* Use a single loop&node instead of two nested loops&nodes. * Remove the need for IntPair, just return the indexPos instead. * Reuse deleted slots in the index array for []=. * Does not handle the index array being full.
* Notably frequent with Truffle::ThreadOperations.detect_pair_recursion.
* I tried making that a node but some Truffle DSL bug prevents using the node in PackedHashStoreLibrary.
The optimization to decrease kvStoreInsertionPos if deleting at the end is done.
…_from_java_jump()
I am sorry this is taking so long to merge.
This implements a "Compact" approach to hash tables, described at a high level here. TLDR: this is a representation strategy that efficiently (hopefully!) supports insertion-order-preserving hash tables. It aims to replace the BucketHashStorage strategy.
It currently appears to pass all the tests.
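For readers skimming the thread, the layout being described is roughly the following (field names follow the discussion in this PR; the exact representation is an assumption):

// Schematic sketch of the compact layout, not the actual CompactHashStore code.
final class CompactLayoutSketch {
    // Keys and values side by side, in insertion order:
    //   kvStore = [k0, v0, k1, v1, k2, v2, ...]
    Object[] kvStore;

    // Open-addressed index of (hash, kvStore position) int pairs; its length is a
    // power of two so a bit mask can replace the modulo when picking a slot.
    //   index = [hash0, pos0, hash1, pos1, ...]
    int[] index;

    // Next free slot in kvStore; starts at 0 and grows by 2 per insertion.
    int kvStoreInsertionPos;
}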