Circular shifts #6

simonbyrne · 2016-06-03T07:37:47Z

You had a comment about moving circular shifts to assembly. One alternative is to use llvmcall:

function llvm_rotr(x::UInt64, y::UInt64)
    Base.llvmcall("""
                %3 = lshr i64 %0, %1
                %4 = sub i64 64, %1
                %5 = shl i64 %0, %4
                %6 = or i64 %3, %5
                ret i64 %6
              """,UInt64, Tuple{UInt64,UInt64},x,y)
end

Then, on Julia 0.4, or on 0.5 with -O3 (which is how packages are precompiled), this compiles to the appropriate rorq instruction.


simonbyrne · 2016-06-03T08:37:51Z

This seems to work as desired on 0.5 (with -O3), but not on 0.4

function rotr(x,y)
    s = sizeof(x) << 3
    (x >> (y % s)) | (x << (-y % s))
end
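
A quick sanity check of this definition, as a sketch only. It assumes the shift count y is unsigned: for a signed y, -y % s is negative, and << with a negative count shifts right instead, so the rotation would not come out as intended. The test values are illustrative.

```julia
# Generic rotate-right, copied from the comment above; meant for unsigned x and y.
function rotr(x, y)
    s = sizeof(x) << 3                  # bit width of x
    (x >> (y % s)) | (x << (-y % s))    # -y wraps around for unsigned y
end

# Rotating the lowest bit right by one moves it to the top bit.
@assert rotr(0x01, 0x01) === 0x80
# A zero shift count leaves the value unchanged.
@assert rotr(0x0123456789abcdef, UInt64(0)) === 0x0123456789abcdef
# Rotating a 64-bit value by 32 swaps its two halves.
@assert rotr(0x0123456789abcdef, UInt64(32)) === 0x89abcdef01234567
```

Unlike the llvmcall version, there is no shift by the full width when y is zero, because both shift counts are reduced modulo s.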

sunoru · 2016-06-03T10:53:54Z

Thanks, Simon. The former one seems very nice, I'll use this.

BTW, I think there's a typo in the code: it should be %4 = sub i64 63, %1

simonbyrne · 2016-06-03T11:17:04Z

Why would it be 63, not 64?

simonbyrne · 2016-06-03T11:18:20Z

I would suggest the latter one: the generality is well worth it, and hopefully 0.5 will be released by the time we're performance tuning. llvmcall is usually a last-resort sort of thing.

sunoru · 2016-06-03T11:43:35Z

Oh, I just realized I was wrong.

Okay, it's reasonable to take the second function.

sunoru · 2016-06-03T11:49:21Z

Did you mean yesterday that I should just write

@inline function pcg_rotr(x::UIntTypes, y::UIntTypes)
    s = sizeof(x) << 3
    (x >> (y % s)) | (x << (-y % s))
end

instead of the current @eval form?

simonbyrne · 2016-06-03T12:04:31Z

Yes, exactly: @eval gets you nothing in this case (each combination of concrete UInt types will still be JIT-ed).

You could even do this for a lot of the other functions, as it seems like the constants can mostly be computed (and so will be evaluated at compile time).
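
To illustrate the point about @eval, here is a minimal sketch. The definition of UIntTypes below is an assumption (a Union of the standard unsigned types); the package's actual definition is not shown in this thread.

```julia
# Assumed definition; the package's own UIntTypes may differ.
const UIntTypes = Union{UInt8, UInt16, UInt32, UInt64, UInt128}

@inline function pcg_rotr(x::UIntTypes, y::UIntTypes)
    s = sizeof(x) << 3   # a constant once the concrete type of x is known
    (x >> (y % s)) | (x << (-y % s))
end

# One generic definition covers every width: a separate specialization is
# compiled for each combination of concrete argument types on first call.
@assert pcg_rotr(0x01, 0x01) === 0x80
@assert pcg_rotr(0x0001, 0x0001) === 0x8000
@assert pcg_rotr(0x00000001, 0x00000001) === 0x80000000
@assert pcg_rotr(UInt64(1), UInt64(1)) === 0x8000000000000000
@assert pcg_rotr(UInt128(1), UInt128(1)) === UInt128(1) << 127
```

Since each call above is compiled for its own concrete types, generating one method per type with @eval should not be needed to get specialized code.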

simonbyrne · 2016-06-03T12:31:22Z

e.g. for XSH RR, you could include in the function

    s = sizeof(state) << 3
    t = trailing_zeros(s)-1 # log2(s)-1
    p1 = (s>>1 + t)>>1
    p2 = s>>1 - t
    p3 = s-t

which the JIT is smart enough to evaluate as constants.
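
As a sketch, the constants can be wrapped in a small helper to see what they evaluate to for each state width. The name xsh_rr_constants is hypothetical and used here only for illustration.

```julia
# Shift constants for XSH RR, derived from the state width as in the snippet above.
# `xsh_rr_constants` is a hypothetical helper name, not part of the package.
function xsh_rr_constants(state::Unsigned)
    s = sizeof(state) << 3       # bit width of the state
    t = trailing_zeros(s) - 1    # log2(s) - 1, since s is a power of two
    p1 = (s >> 1 + t) >> 1       # >> binds tighter than + in Julia
    p2 = s >> 1 - t
    p3 = s - t
    (p1, p2, p3)
end

# 64-bit state: the familiar 18 / 27 / 59 shifts.
@assert xsh_rr_constants(UInt64(0)) == (18, 27, 59)
# 128-bit state: the same formula gives 35 / 58 / 122.
@assert xsh_rr_constants(UInt128(0)) == (35, 58, 122)
```

Only the type of the argument matters here, so once the function is specialized on a concrete state type, all three values depend on nothing but compile-time information.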

simonbyrne · 2016-06-03T12:35:14Z

Though for 128 bits, this gives a p1 as 35, not 29 as in your code: where did you get the constants from?

sunoru · 2016-06-03T14:55:41Z

I copied the constants from the PCG sources: http://www.pcg-random.org/download.html

simonbyrne · 2016-06-03T14:58:39Z

Ah, I see. I opened an issue here:
imneme/pcg-c-basic#7

I guess we'll see what the author says.


sunoru closed this as completed Sep 14, 2016

sunoru pushed a commit that referenced this issue Jan 28, 2021

Merge pull request #6 from milankl/mk/randfloat

89595a0

?-branching also for float32
