-
Notifications
You must be signed in to change notification settings - Fork 185
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve Ruby string interning #3185
Conversation
@@ -62,4 +91,51 @@ public String toString() { | |||
return TruffleString.fromByteArrayUncached(bytes, encoding, false).toString(); | |||
} | |||
|
|||
private static int hashCode(InternalByteArray byteArray) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be nice if we could upstream this into Truffle's ArrayUtils
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in InternalByteArray would be a better place, TruffleString has the byte[] intrinsics nowadays.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That'd be fine, too. There had been some concerns previously about exposing public APIs in something named "internal", but I don't really care where it goes.
|
||
public final class TBytesKey { | ||
|
||
private final byte[] bytes; | ||
private final int offset; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we cannot construct TruffleString's InternalByteArray
, we basically have to recreate it here. It'd be nice if there were a cleaner option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the plus side this actually consumes less memory (i.e., we don't keep the InternalByteArray
instance around).
94de08d
to
b8efd14
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right it makes sense to be able to lookup without copying bytes.
OTOH when adding an entry in the cache it should copy if not a perfect fit, to avoid holding onto extra hidden bytes "outside the substring". The PR already handles that, nice.
|
||
public final class TBytesKey { | ||
|
||
private final byte[] bytes; | ||
private final int offset; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the plus side this actually consumes less memory (i.e., we don't keep the InternalByteArray
instance around).
if (a.isPerfectFit() && b.isPerfectFit()) { | ||
return Arrays.equals(a.bytes, b.bytes); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There doesn't seem to be a big perf advantage here so I'd just remove this and use the general case below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will Graal optimize both cases? Array.equals(byte[], byte[])
is annotated as @IntrinsicCandidate
, while the variant with specified offsets and end points is not. I thought that would be useful for the interpreter, at least. Granted, I didn't benchmark the two.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They both end up in vectorizedMismatch
and that's intrinsified by Graal, yes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Arrays.equals(byte[], byte[])
is also intrinsified by Graal so maybe it's a tiny bit better.
I guess the only way to know for sure is to benchmark it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's fine as it is, no need to spend too much time on it, either way is fine.
} | ||
|
||
@TruffleBoundary | ||
public TruffleString getTString(byte[] bytes, RubyEncoding rubyEncoding) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should still be @TruffleBoundary
, it's counter-productive to allocate the TBytesKey in PE code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really wish we had a reason
field on @TruffleBoundary
because it's easy to look at this and say "there's no reason this can't be compiled, so lets do away with the boundary". I've been operating under the assumption that anything that could run without a boundary should and let Truffle's heuristics sort out the rest.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feel free to add a code comment about it. Here it's a case of "no value to PE this code, and worse due to forcing an extra allocation that might not be needed otherwise". We should only PE what can benefit from PE, there is a warmup cost to PE too much code.
@@ -62,4 +91,51 @@ public String toString() { | |||
return TruffleString.fromByteArrayUncached(bytes, encoding, false).toString(); | |||
} | |||
|
|||
private static int hashCode(InternalByteArray byteArray) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in InternalByteArray would be a better place, TruffleString has the byte[] intrinsics nowadays.
|
||
public final class TBytesKey { | ||
|
||
private final byte[] bytes; | ||
private final int offset; | ||
private final int length; | ||
private final boolean isImmutable; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be tracked externally and passed to makeCacheable()
, to save footprint.
I'm curious, did you notice the current approach was a bottleneck in some benchmark? |
Improving The savings wasn't as pronounced as I had hoped for. Looking at the SVG from
In MRI, the string interning process accounts for a trivial amount of total execution. We have more going on with the concurrent collection, but I wanted to try to reduce our overhead as much as possible. Reducing memory copies looked like a straightforward improvement. Setting aside the i18n YAML loading, I noticed that RubyGems will intern a few small substrings repeatedly. The impact of that is much harder to measure. I'm assuming doing less work will yield better results in this case. |
Previously, we always extracted the precise range of bytes we needed when looking up an entry in the frozen string table. If an entry already existed, we'd discard that extracted range in favor of what was already in the cache. This change defers making a copy of the string's bytes until we need to insert into the cache. For situations with many cache hits this approach can be much faster.
…ble. By restructuring to work with `InternalByteArray`, we can now support working with native strings as well. The `InternalByteArray` will make a copy of the native memory into a Java `byte[]`, which won't change behind our backs.
… cache key to save memory.
b8efd14
to
35817af
Compare
import org.truffleruby.core.encoding.RubyEncoding; | ||
import org.truffleruby.core.encoding.TStringUtils; | ||
|
||
public final class TBytesKey { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you document here or somewhere else why we support offset and length? (to avoid extra copying for substrings during lookup, but not when inserting a new entry, ref #makeCacheable)
return offset == 0 && length == bytes.length; | ||
} | ||
|
||
public TBytesKey makeCacheable(boolean isImmutable) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you document this? (it creates a perfect fit key, so it does not hold on any extra bytes that it does not need when inserting a new entry to avoid leaking bytes outside the substring)
Feedback address in #3216. I messed up the PR management and GitHub won't let me re-open this PR. |
This PR improves the Ruby string interning process by allowing look-ups using substrings without having to extract their bytes. It also switches to using TruffleString's
InternalByteArray
, allowing string interning to work on native strings (I haven't run into a case requiring this yet).