Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use SpanHelpers.SequenceCompareTo for String.CompareOrdinal #111586

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

huoyaoyuan
Copy link
Member

Helps reusing centralized code under SpanHelpers. There shouldn't be much change for performance, just ensuring no regression.

Benchmark:

        [Arguments("Foo", "FooBar")]
        [Arguments("Foo", "Bar")]
        [Benchmark]
        public int Compare(string left, string right) => string.Compare(left, right, StringComparison.Ordinal);

        [Arguments("Foo", "FooBar")]
        [Arguments("Foo", "Bar")]
        [Benchmark]
        public int CompareOridinal(string left, string right) => string.CompareOrdinal(left, right);
Method Job Toolchain left right Mean Error StdDev Ratio RatioSD
Compare Job-RZZOCA \PR\corerun.exe Foo Bar 0.5722 ns 0.0135 ns 0.0113 ns 0.78 0.02
Compare Job-YVLKAM \main\corerun.exe Foo Bar 0.7296 ns 0.0162 ns 0.0151 ns 1.00 0.03
CompareOridinal Job-RZZOCA \PR\corerun.exe Foo Bar 0.5390 ns 0.0155 ns 0.0145 ns 0.98 0.04
CompareOridinal Job-YVLKAM \main\corerun.exe Foo Bar 0.5521 ns 0.0192 ns 0.0180 ns 1.00 0.05
Compare Job-RZZOCA \PR\corerun.exe Foo FooBar 1.2961 ns 0.0053 ns 0.0050 ns 0.89 0.01
Compare Job-YVLKAM \main\corerun.exe Foo FooBar 1.4558 ns 0.0159 ns 0.0149 ns 1.00 0.01
CompareOridinal Job-RZZOCA \PR\corerun.exe Foo FooBar 1.0877 ns 0.0083 ns 0.0077 ns 0.99 0.02
CompareOridinal Job-YVLKAM \main\corerun.exe Foo FooBar 1.0951 ns 0.0194 ns 0.0182 ns 1.00 0.02

@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Jan 19, 2025
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Jan 19, 2025
@huoyaoyuan huoyaoyuan added area-System.Runtime and removed needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Jan 19, 2025
Copy link
Contributor

Tagging subscribers to this area: @dotnet/area-system-runtime
See info in area-owners.md if you want to be subscribed.

@jkotas
Copy link
Member

jkotas commented Jan 19, 2025

You should test perf for longer strings as well.

CompareOrdinalHelper implementation is specifically tuned for string alignment that is only going to show up on longer strings https://github.com/dotnet/runtime/pull/111586/files#diff-79247fffe0432073a5ddd0de692f4beaf015c0c483dd311a921393d05568a3e4L119-L122

@huoyaoyuan
Copy link
Member Author

Method Job Toolchain left right Mean Error StdDev Ratio RatioSD
Compare Job-HORPTE \PR\corerun.exe A qui(...)y dog [39] A qui(...)y hog [39] 2.6878 ns 0.0301 ns 0.0267 ns 1.13 0.01
Compare Job-BZQZHU \main\corerun.exe A qui(...)y dog [39] A qui(...)y hog [39] 2.3763 ns 0.0175 ns 0.0155 ns 1.00 0.01
CompareOridinal Job-HORPTE \PR\corerun.exe A qui(...)y dog [39] A qui(...)y hog [39] 2.5022 ns 0.0301 ns 0.0282 ns 1.32 0.02
CompareOridinal Job-BZQZHU \main\corerun.exe A qui(...)y dog [39] A qui(...)y hog [39] 1.8914 ns 0.0072 ns 0.0064 ns 1.00 0.00
Compare Job-HORPTE \PR\corerun.exe aaaa(...)aaa1 [1001] aaaa(...)aaa2 [1001] 29.3619 ns 0.1420 ns 0.1329 ns 0.89 0.01
Compare Job-BZQZHU \main\corerun.exe aaaa(...)aaa1 [1001] aaaa(...)aaa2 [1001] 33.0267 ns 0.4837 ns 0.4525 ns 1.00 0.02
CompareOridinal Job-HORPTE \PR\corerun.exe aaaa(...)aaa1 [1001] aaaa(...)aaa2 [1001] 28.1294 ns 0.2841 ns 0.2658 ns 0.72 0.01
CompareOridinal Job-BZQZHU \main\corerun.exe aaaa(...)aaa1 [1001] aaaa(...)aaa2 [1001] 39.2964 ns 0.2811 ns 0.2629 ns 1.00 0.01
Compare Job-HORPTE \PR\corerun.exe Foo Bar 0.5410 ns 0.0209 ns 0.0196 ns 0.72 0.03
Compare Job-BZQZHU \main\corerun.exe Foo Bar 0.7480 ns 0.0132 ns 0.0123 ns 1.00 0.02
CompareOridinal Job-HORPTE \PR\corerun.exe Foo Bar 0.3586 ns 0.0023 ns 0.0020 ns 0.65 0.01
CompareOridinal Job-BZQZHU \main\corerun.exe Foo Bar 0.5543 ns 0.0142 ns 0.0126 ns 1.00 0.03
Compare Job-HORPTE \PR\corerun.exe Foo FooBar 1.2951 ns 0.0055 ns 0.0052 ns 1.02 0.01
Compare Job-BZQZHU \main\corerun.exe Foo FooBar 1.2675 ns 0.0147 ns 0.0123 ns 1.00 0.01
CompareOridinal Job-HORPTE \PR\corerun.exe Foo FooBar 1.1112 ns 0.0083 ns 0.0070 ns 0.99 0.01
CompareOridinal Job-BZQZHU \main\corerun.exe Foo FooBar 1.1188 ns 0.0145 ns 0.0135 ns 1.00 0.02

Well there is regression at ~40 length, where SIMD should start to kick in.
Given SequenceCompareTo is already used by other overloads, I think it's better to investigate the performance gap here.

@huoyaoyuan
Copy link
Member Author

@EgorBot -intel -amd

using System;
using System.Collections.Generic;
using System.Linq;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

namespace BenchmarkGround
{
    public class Bench
    {
        public IEnumerable<object[]> Args =>
            [
                ["Foo", "FooBar"],
                ["Foo", "Bar"],
                ["A quick brown fox jumps over a lazy dog", "A quick brown fox jumps over a lazy hog"],
                [new string([.. Enumerable.Repeat('a', 1000), '1']), new string([.. Enumerable.Repeat('a', 1000), '2'])],
            ];

        [Benchmark]
        [ArgumentsSource(nameof(Args))]
        public int Compare(string left, string right) => string.Compare(left, right, StringComparison.Ordinal);

        [Benchmark]
        [ArgumentsSource(nameof(Args))]
        public int CompareOridinal(string left, string right) => string.CompareOrdinal(left, right);
    }
}

@EgorBo
Copy link
Member

EgorBo commented Jan 23, 2025

@huoyaoyuan I was wondering if I should add that ability to define simple benchmarks without classes too 🤔

@huoyaoyuan
Copy link
Member Author

@huoyaoyuan I was wondering if I should add that ability to define simple benchmarks without classes too 🤔

I was making more mistakes for example missing usings. It may not be trivial to solve. Having a solid requirement should be intuitive.

Actually I'm getting help from bot just because benchmark in Windows Terminal over RDP looks unstable.

@huoyaoyuan
Copy link
Member Author

The benchmark shows small regression on Cascade Lake/small length. It's similar to what I see on Raptor Lake.

What's the preference for the action here?

Copy link
Contributor

@xtqqczze xtqqczze left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: unsafe context can be removed from CompareOrdinalHelper

@EgorBo
Copy link
Member

EgorBo commented Jan 24, 2025

The benchmark shows small regression on Cascade Lake/small length. It's similar to what I see on Raptor Lake.

What's the preference for the action here?

Since it's Intel only it is most likely JCC Erratum

@@ -84,112 +84,43 @@ private static unsafe int CompareOrdinalHelper(string strA, string strB)
"For performance reasons, callers of this method should " +
"check/short-circuit beforehand if the first char is the same.");

int length = Math.Min(strA.Length, strB.Length);
// Check if the second chars are different here
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this implementation effectively depend on this small wrapper to be inlined into the few callers for good perf / no regressions variety of inputs?

I am wondering whether it would be better to mark this small wrapper with AggressiveInlining and move the comparison of the first char (that's manually inlined today) into it as well.

// takes up 8 bytes on 32-bit, 12 on x64. For empty strings the null
// terminator immediately follows, leaving us with an object
// 10/14 bytes in size. Since everything needs to be a multiple
// of 4/8, this will get padded and zeroed out.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is interesting that we are not doing any tricky casing like this for string Equals that is presumably much more common operation than string Compare.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

filed #111806 to experiment with microbenchmarks 🙂

Copy link
Member

@EgorBo EgorBo Jan 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW: is it actually legal to rely on padding after \0 in empty strings to be always zeroed? (the comment mentions it)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-System.Runtime community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants