Circular shifts #6

Closed
simonbyrne opened this issue Jun 3, 2016 · 11 comments

Comments

@simonbyrne

You had a comment about moving circular shifts to assembly. One alternative is to use llvmcall:

function llvm_rotr(x::UInt64, y::UInt64)
    Base.llvmcall("""
                %3 = lshr i64 %0, %1
                %4 = sub i64 64, %1
                %5 = shl i64 %0, %4
                %6 = or i64 %3, %5
                ret i64 %6
              """,UInt64, Tuple{UInt64,UInt64},x,y)
end

Then on Julia 0.4, or on 0.5 with -O3 (which is how packages are precompiled), this compiles to the appropriate rorq instruction.

@simonbyrne
Author

simonbyrne commented Jun 3, 2016

The following generic version seems to work as desired on 0.5 (with -O3), but not on 0.4:

function rotr(x, y)
    s = sizeof(x) << 3                  # bit width of x
    # reduce both shift amounts modulo the bit width, then combine the two halves
    (x >> (y % s)) | (x << (-y % s))
end
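As a rough sanity check of the generic version, here is a minimal sketch that compares it against a one-bit-at-a-time reference. `naive_rotr` and the test values are hypothetical, not from the package, and the `where` syntax assumes a current Julia release rather than 0.4/0.5. It also assumes `y` is unsigned: with a signed `y`, `-y % s` is negative and Julia turns the left shift into a right shift, so the result is no longer a rotation.

```julia
function rotr(x, y)
    s = sizeof(x) << 3                  # bit width of x
    (x >> (y % s)) | (x << (-y % s))
end

# Hypothetical reference implementation: rotate right one bit at a time.
function naive_rotr(x::T, y) where {T<:Unsigned}
    nbits = 8 * sizeof(T)
    for _ in 1:(y % nbits)
        x = (x >> 1) | (x << (nbits - 1))
    end
    return x
end

# Compare over several unsigned widths and rotation counts (all y are UInt8).
ok = all(rotr(x, y) == naive_rotr(x, y)
         for x in (0x81, 0x8001, 0x80000001, 0x8000000000000001)
         for y in (0x00, 0x01, 0x07, 0x08, 0x1f, 0x40, 0xff))
```

To see what the compiler actually emits on a given machine, `@code_native rotr(x, y)` at the REPL prints the generated assembly; whether a single rotate instruction appears depends on the Julia version, optimization level, and target architecture.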

@sunoru
Member

sunoru commented Jun 3, 2016

Thanks, Simon. The former seems very nice; I'll use it.

BTW, there's a typo in the code: it should be %4 = sub i64 63, %1

@simonbyrne
Author

Why would it be 63, not 64?
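For reference, a small worked example of why the complementary shift is 64 minus the rotation count: rotating right by y moves the low y bits to the top, which takes a left shift by 64 - y. `ref_rotr` and its width parameter `k` are hypothetical names used only for this comparison. The sketch assumes 1 <= y <= 63; in the llvmcall version a count of 0 would produce a shift by 64, which LLVM treats as a poison value.

```julia
# Hypothetical reference: right shift by y combined with a left shift by k - y.
ref_rotr(x::UInt64, y::Integer, k::Integer) = (x >> y) | (x << (k - y))

x = 0x0123456789abcdef
y = 8

with64 = ref_rotr(x, y, 64)   # low byte 0xef moves to the top: 0xef0123456789abcd
with63 = ref_rotr(x, y, 63)   # wrapped bits land one position too low
```

With 64 the result is a true rotation; with 63 the bits that wrap around are misplaced by one position and overlap the right-shifted part.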

@simonbyrne
Author

simonbyrne commented Jun 3, 2016

I would suggest the latter one: the generality is well worth it, and hopefully 0.5 will be released by the time we're performance tuning. llvmcall is usually a last resort.

@sunoru
Member

sunoru commented Jun 3, 2016

Oh, I just realized I was wrong.

Okay, it's reasonable to take the second function.

@sunoru
Member

sunoru commented Jun 3, 2016

Did you mean yesterday that I should just write

@inline function pcg_rotr(x::UIntTypes, y::UIntTypes)
    s = sizeof(x) << 3
    (x >> (y % s)) | (x << (-y % s))
end

instead of the current @eval form?

@simonbyrne
Author

Yes, exactly: @eval gets you nothing in this case (a specialized method will still be JIT-compiled for each combination of concrete UInt types).

You could even do this for a lot of the other functions, since most of the constants can be computed from the type (and so will be evaluated at compile time).

@simonbyrne
Author

simonbyrne commented Jun 3, 2016

E.g., for XSH RR, you could include the following in the function:

    s = sizeof(state) << 3
    t = trailing_zeros(s)-1 # log2(s)-1
    p1 = (s>>1 + t)>>1
    p2 = s>>1 - t
    p3 = s-t

which the JIT is smart enough to evaluate as constants.
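To make the values concrete, here is a minimal sketch that wraps the same formulas in a helper and evaluates them for two state widths. `xsh_rr_constants` is a hypothetical name, not part of the package, and the `where` syntax assumes a current Julia release.

```julia
# Hypothetical helper: the shift constants from the formulas above,
# computed from the bit width of the state type T.
function xsh_rr_constants(::Type{T}) where {T<:Unsigned}
    s = sizeof(T) << 3              # bit width of the state
    t = trailing_zeros(s) - 1       # log2(s) - 1
    p1 = (s >> 1 + t) >> 1          # >> binds tighter than +, so this is ((s>>1) + t) >> 1
    p2 = s >> 1 - t
    p3 = s - t
    return (p1, p2, p3)
end

c64 = xsh_rr_constants(UInt64)      # (18, 27, 59)
c128 = xsh_rr_constants(UInt128)    # (35, 58, 122)
```

Because every value depends only on `sizeof(T)`, the compiler can fold them to constants in each specialized method.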

@simonbyrne
Author

Though for 128 bits, this gives p1 = 35, not 29 as in your code. Where did you get the constants from?

@sunoru
Copy link
Member

sunoru commented Jun 3, 2016

I copied the constants from the PCG sources: http://www.pcg-random.org/download.html

@simonbyrne
Author

Ah, I see. I opened an issue here:
imneme/pcg-c-basic#7

I guess we'll see what the author says.

@sunoru sunoru closed this as completed Sep 14, 2016
sunoru pushed a commit that referenced this issue Jan 28, 2021
?-branching also for float32