Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement BaselineCompiler #224

Closed
GoogleCodeExporter opened this issue Aug 9, 2015 · 51 comments
Closed

Implement BaselineCompiler #224

GoogleCodeExporter opened this issue Aug 9, 2015 · 51 comments

Comments

@GoogleCodeExporter
Copy link
Contributor

Per jandem,

Subject: Re: off bug from 857845

Hi,

Baseline and IonMonkey share the same MacroAssembler and frame infrastructure. 
Baseline is an easier target mainly because it's a much simpler compiler. Many 
Ion instructions, Ion bailouts/invalidation, etc all require platform-dependent 
code. With Baseline we have *some* platform-dependent code (see 
ion/arm/Baseline*) and some files are shared with Ion, but most of it is shared 
(see BaselineCompiler.cpp, BaselineIC.cpp). Also unlike Ion and JM, Baseline 
code will never bailout or be invalidated.

So I think I'd concentrate on Baseline first (run the shell with --no-jm 
--no-ion). Ion will still be compiled since we don't have separate Ion/Baseline 
#ifdef's, so you will probably want to use JS_NOT_REACHED(...) for anything 
that's Ion-only so that you can implement it later.

What you need for Baseline is:

(*) MacroAssembler-*.{h, cpp}. The x86/x64 macro-assemblers are built on top of 
the JSC assemblers. We don't reuse JSC's macro-assemblers, but we do reuse the 
lower-level instruction-emitting code. The ARM assembler was written from 
scratch I think.

(*) Trampolines:

- EnterJIT is used to enter both Ion and Baseline code. Initially you're only 
interested in the case where type == EnterJitBaseline. For baseline code, the 
trampoline also handles OSR from the interpreter: it reserves size for the 
frame by updating the stack pointer, calls into the VM to initialize that frame 
based on the interpreter StackFrame, then jumps into the JIT code. If I were 
you I'd get the non-OSR case working first, because it's simpler. To do that 
you can run with --baseline-eager and add the following to CanEnterBaselineJIT:

        if (JSOp(*cx->regs().pc) == JSOP_LOOPENTRY)
            return Method_Skipped;

- Invalidator: Ion-only, just add JS_NOT_REACHED or something.

- Arguments rectifier: Used by both Ion and Baseline. If we have a function 
call where the callee has more formal arguments than the caller passes in, the 
JITs call the arguments-rectifier instead. It just pushes some |undefined| 
values and re-pushes the other arguments on top of it (arguments are pushed 
last-to-first).

- Bailout thunk: Ion-only.

- Bailout handler: Ion-only.

- generateVMWrapper: generates some wrapper code Baseline and Ion use when 
calling into C++.

- generatePreBarrier: used by both Baseline and Ion for incremental GC.

- generateDebugTrapHandler: used by Baseline only, in debug mode. It gets 
called when a bytecode op has a breakpoint, or step mode is enabled. Not too 
complicated, but initially I'd just add a check for 
cx->compartment->debugMode() to CanEnterBaselineJIT and implement it after 
everything else works.

(*) Bailouts-*.cpp, CodeGenerator-*.cpp, Lowering-*, MoveEmitter-*, LIR-*: all 
Ion-only.

(*) Architecture-*.h, BaselineRegisters-*.h: you will need to implement these.

(*) IonFrames-*.h you will need, I think you can just copy IonFrames-arm.h We 
should probably combine it with x86/x64 at some point since I don't think 
there's much/any difference.

That's the high-level view. Let me know if you want me to go into anything in 
particular. You can use IONFLAGS=help to see which spew channels are available 
for baseline code. You probably want bl-scripts at least to show which scripts 
are compiled.

Jan

The overwhelming good news is that most of this has already been written.

Original issue reported on code.google.com by [email protected] on 17 May 2013 at 4:38

@GoogleCodeExporter
Copy link
Contributor Author

Original comment by [email protected] on 17 May 2013 at 4:39

@GoogleCodeExporter
Copy link
Contributor Author

Original comment by [email protected] on 19 May 2013 at 9:30

@GoogleCodeExporter
Copy link
Contributor Author

All Baseline code imported into working set. Finished the register allocation. 
Next, ICs and helpers.

Original comment by [email protected] on 28 May 2013 at 3:55

@GoogleCodeExporter
Copy link
Contributor Author

ICs and helpers done.

Original comment by [email protected] on 31 May 2013 at 5:36

@GoogleCodeExporter
Copy link
Contributor Author

Based on https://bug831507.bugzilla.mozilla.org/attachment.cgi?id=755436 we 
need to find all the parts of BC that for ARM push LR because PPC needs to also 
save LR somewhere.

Original comment by [email protected] on 1 Jun 2013 at 3:00

@GoogleCodeExporter
Copy link
Contributor Author

Status update:

- Builds and links.
- Generates bailout thunks, labels sort of work, linking sort of works.
- Generates trampoline (don't know if the trampoline is *correct* but it does 
emit it to memory accurately).
- Asserts during codegen while converting JSOPs; seems to be something about 
how I have the registers set up.

Original comment by [email protected] on 16 Jul 2013 at 8:22

@GoogleCodeExporter
Copy link
Contributor Author

(incidentally we are running with --baseline-eager --no-ti --no-ion)

Original comment by [email protected] on 16 Jul 2013 at 8:22

@GoogleCodeExporter
Copy link
Contributor Author

Status update:

- Codegen succeeds.
- Trampoline appears correct (still not sure about nargs).
- Crashes during execution of the compiled script here:

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_PROTECTION_FAILURE at address: 0x000003e5
0x00f93ee0 in ?? ()
(gdb) disas 0xf93e80 0xf93f00
Dump of assembler code from 0xf93e80 to 0xf93f00:
0x00f93e80:     .long 0x0
0x00f93e84:     .long 0x0
0x00f93e88:     .long 0x0
0x00f93e8c:     .long 0x0
0x00f93e90:     .long 0x0
0x00f93e94:     .long 0x0
0x00f93e98:     .long 0x0
0x00f93e9c:     .long 0x2b38038
0x00f93ea0:     stwu    r6,-4(r1)
0x00f93ea4:     stwu    r5,-4(r1)
0x00f93ea8:     stwu    r15,-4(r1)
0x00f93eac:     stwu    r14,-4(r1)
0x00f93eb0:     stwu    r15,-4(r1)
0x00f93eb4:     stwu    r14,-4(r1)
0x00f93eb8:     stwu    r6,-4(r1)
0x00f93ebc:     stwu    r5,-4(r1)
0x00f93ec0:     stwu    r7,-4(r1)
0x00f93ec4:     mr      r5,r13
0x00f93ec8:     addi    r5,r5,-44
0x00f93ecc:     stwu    r5,-4(r1)
0x00f93ed0:     mr      r3,r1
0x00f93ed4:     addi    r3,r3,4
0x00f93ed8:     subf    r3,r1,r3
0x00f93edc:     addi    r4,r3,-24
0x00f93ee0:     stw     r4,-28(r13) <<< HERE
0x00f93ee4:     rlwinm  r3,r3,4,0,27
0x00f93ee8:     ori     r3,r3,1
0x00f93eec:     stwu    r3,-4(r1)
0x00f93ef0:     mflr    r8
0x00f93ef4:     stwu    r8,-4(r1)
0x00f93ef8:     b       0xf93f00
0x00f93efc:     li      r0,4

Annotated trace attached.

Original comment by [email protected] on 25 Jul 2013 at 3:44

Attachments:

@GoogleCodeExporter
Copy link
Contributor Author

- Fixed calls to link properly at the codegen and executableCopy levels.
- Fixed endian error in unboxing.

To do:
Our VM wrapper is calling a bogus address and crashing:

[Codegen] == linkExitFrame ==
[Codegen] bfffdf78 --- lis r12,198 (0xc60000)
[Codegen] bfffdf7c --- ori r12,r12,57384 (0xe028)
[Codegen] bfffdf80 --- stw sp,0(r12)
[Codegen] #label     ((12))
[Codegen] bfffdf84 --- lis r0,-1 (0xffff0000)
[Codegen] bfffdf88 --- ori r0,r0,65535 (0xffff)
[Codegen] bfffdf8c --- stwu r0,-4(sp)
[Codegen] == push(imm) ==
[Codegen] bfffdf90 --- lis r0,139 (0x8b0000)
[Codegen] bfffdf94 --- ori r0,r0,26216 (0x6668)
[Codegen] bfffdf98 --- stwu r0,-4(sp)
[Codegen] == movePtr(immw, reg) ==
[Codegen] bfffdf9c --- lis r3,198 (0xc60000)
[Codegen] bfffdfa0 --- ori r3,r3,57344 (0xe000)
[Codegen] == loadPtr(adr, reg) ==
[Codegen] == load32(adr, reg) ==
[Codegen] bfffdfa4 --- lwz r3,44(r3)
[Codegen] bfffdfa8 --- addi r10,sp,16 (0x10)
[Codegen] == setupUnalignedABICall ==
[Codegen] == movePtr(immw, reg) ==
[Codegen] bfffdfac --- lis r3,198 (0xc60000)
[Codegen] bfffdfb0 --- ori r3,r3,57344 (0xe000)
[Codegen] == loadPtr(adr, reg) ==
[Codegen] == load32(adr, reg) ==
[Codegen] bfffdfb4 --- lwz r3,44(r3)
[Codegen] == passABIArg(mo) ==
state, passABIArg: gprs 0 fprs 0
[Codegen] == passABIArg(mo) ==
state, passABIArg: gprs 1 fprs 0
[Codegen] == passABIArg(mo) ==
state, passABIArg: gprs 2 fprs 0
[Codegen] == passABIArg(mo) ==
state, passABIArg: gprs 3 fprs 0
[Codegen] == callWithABI ==
[Codegen] == add32(imm, reg, reg) ==
[Codegen] bfffdfb8 --- addi r6,r10,8 (0x8)
[Codegen] bfffdfbc --- lwz r5,4(r10)
[Codegen] == add32(imm, reg, reg) ==
[Codegen] bfffdfc0 --- addi r4,r10,0 (0x0)
[Codegen] == freeStack(u32) ==
[Codegen] bfffdfc4 --- andi. r0,sp,15 (0xf)
[Codegen] bfffdfc8 --- or r16,sp,sp
[Codegen] bfffdfcc --- subf sp,r0,sp
[Codegen] bfffdfd0 --- subi sp,sp,256
[Codegen] == checkStackAlignment() ==
[Codegen] bfffdfd4 --- andi. r0,sp,15 (0xf)
[Codegen] bfffdfd8 --- bc 12, 2, 8
[Codegen] ##setInt32 (bfffdfd8 -> ffffffff)

[Codegen] bfffdfe0 --- trap
[Codegen] #label     ((108))
[Codegen] ##linkPendedJump @ bfffdfd8 bc bo=12 bi=2

[Codegen] ##link2    ((0xbfffdfd8)) jumps to ((0xbfffdfe4))
[Codegen] #label     ((108))
[Codegen] bfffdfe4 --- bl .+8
[Codegen] ##link2    ((0xbfffdfe4)) jumps to ((0xc05887a8))

The 0xc... function is totally wrong. I think it's an absolute address that 
somehow got treated as an offset? It's the only thing that makes sense here.

Original comment by [email protected] on 26 Jul 2013 at 4:53

@GoogleCodeExporter
Copy link
Contributor Author

Fixed by changing EmitCallVM. The first VM wrapped call goes to

js::ion::DefVarOrConst (cx=0x1e097a0, dn=<incomplete type>, attrs=45236288, 
scopeChain=<incomplete type>) at 
/Volumes/BruceDeuce/src/mozilla-24a/js/src/ion/VMFunctions.cpp:112

scopeChain is in Baseline R0 (!= r0). I need to look at what carries the cx. It 
doesn't seem to need access to the stack directly.

Original comment by [email protected] on 26 Jul 2013 at 6:06

@GoogleCodeExporter
Copy link
Contributor Author

Status update:

Testing --baseline-eager --no-ti --no-ion -e 'var i=0'

- We pass the first VM call completely and correctly (DEFVAR).
- BINDGNAME and ZERO appear to be no-ops in the test case.
- We crash in the fourth (SETGNAME), which internally maps to SETPROP.
- We are not patching the branch in the stub, instead off by four ahead. This 
isn't the problem, but needs to be fixed.
- The crash occurs within the stub fallback:

static bool
DoSetPropFallback(JSContext *cx, BaselineFrame *frame, ICSetProp_Fallback 
*stub, HandleValue lhs,
                  HandleValue rhs, MutableHandleValue res)
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_PROTECTION_FAILURE at address: 0x00000069
js::ion::BaselineFrame::script (this=0x41) at BaselineFrame.h:320
320             return flags_ & EVAL;
1: x/i $pc  0x1680e0 <_ZNK2js3ion13BaselineFrame6scriptEv+20>:  lwz     
r0,40(r3)
(gdb) q


Original comment by [email protected] on 4 Aug 2013 at 11:56

@GoogleCodeExporter
Copy link
Contributor Author

Prologue looks like this:

6272    {
1: x/i $pc  0x3d4e0c <DoSetPropFallback>:       mflr    r0
(gdb) disas 0x3d4e0c 0x3d4f00
Dump of assembler code from 0x3d4e0c to 0x3d4f00:
0x003d4e0c <DoSetPropFallback+0>:       mflr    r0
0x003d4e10 <DoSetPropFallback+4>:       stw     r20,-48(r1)
0x003d4e14 <DoSetPropFallback+8>:       bcl-    20,4*cr7+so,0x3d4e18 
<DoSetPropFallback+12>
0x003d4e18 <DoSetPropFallback+12>:      mr      r20,r8
0x003d4e1c <DoSetPropFallback+16>:      mfcr    r12
0x003d4e20 <DoSetPropFallback+20>:      stw     r21,-44(r1)
0x003d4e24 <DoSetPropFallback+24>:      mr      r21,r7
0x003d4e28 <DoSetPropFallback+28>:      stw     r25,-28(r1)
0x003d4e2c <DoSetPropFallback+32>:      mr      r25,r6
0x003d4e30 <DoSetPropFallback+36>:      stw     r0,8(r1)
0x003d4e34 <DoSetPropFallback+40>:      stw     r27,-20(r1)
0x003d4e38 <DoSetPropFallback+44>:      mr      r27,r5
0x003d4e3c <DoSetPropFallback+48>:      stw     r29,-12(r1)
0x003d4e40 <DoSetPropFallback+52>:      mr      r29,r3
0x003d4e44 <DoSetPropFallback+56>:      mr      r3,r4
0x003d4e48 <DoSetPropFallback+60>:      stw     r30,-8(r1)
0x003d4e4c <DoSetPropFallback+64>:      stw     r31,-4(r1)
0x003d4e50 <DoSetPropFallback+68>:      mflr    r31
0x003d4e54 <DoSetPropFallback+72>:      stw     r18,-56(r1)
0x003d4e58 <DoSetPropFallback+76>:      stw     r19,-52(r1)
0x003d4e5c <DoSetPropFallback+80>:      stw     r22,-40(r1)
0x003d4e60 <DoSetPropFallback+84>:      stw     r23,-36(r1)
0x003d4e64 <DoSetPropFallback+88>:      stw     r24,-32(r1)
0x003d4e68 <DoSetPropFallback+92>:      stw     r26,-24(r1)
0x003d4e6c <DoSetPropFallback+96>:      stw     r28,-16(r1)
0x003d4e70 <DoSetPropFallback+100>:     stw     r12,4(r1)
0x003d4e74 <DoSetPropFallback+104>:     stwu    r1,-416(r1)

Original comment by [email protected] on 7 Aug 2013 at 4:55

@GoogleCodeExporter
Copy link
Contributor Author

- We now get almost to the end, but we end up looping to death because LR is 
not set right.

We will have to implement something like how ARM does it:

- masm.ret() should pop(lr) blr (something like retn() but change retn() to use 
LR, not CTR)
- masm.blr() should be only where we need a literal blr
- Create a masm.m_call_push and use it where ARM uses ma_callIonHalfPush (this 
pushes pc and then blx's to the register, setting LR; we should push pc and 
then use m_call())
- Note that EmitCallVM in ARM uses masm.call() not masm.ma_callIonHalfPush
- callWithABI does save LR but not to the stack (so keep using r18 like we are)
- generateVMWrapper does not push PC (it was probably pushed BEFORE the call)

Original comment by [email protected] on 10 Aug 2013 at 7:01

@GoogleCodeExporter
Copy link
Contributor Author

Also, we should look at their masm.call() and masm.branch()

Original comment by [email protected] on 10 Aug 2013 at 7:07

@GoogleCodeExporter
Copy link
Contributor Author

Reworked calls to simply use pushed return addresses on stack, except for 
Baseline, which uses LR (just like ARM), and adjusted the frame descriptor. 
This gets us through all the code and back into the ABI trampoline epilogue, 
where the routine exits successfully but asserts in the js shell. I think we 
have a bad context, which makes sense since we assert in an earlier debugging 
zone. We need to look at how the VMWrapper handles that.

Original comment by [email protected] on 13 Aug 2013 at 6:04

@GoogleCodeExporter
Copy link
Contributor Author

The context is correct, ionTop is correct. We're not recovering the script from 
the frame correctly. The script pointer is the callee token (r7 on entry) with 
the bottom two bits off (which should be 0x01 == script). 

IonFrame_Exit = 8
cx = 01e097a0

Program received signal SIGTRAP, Trace/breakpoint trap.
0x00f848a0 in ?? ()
(gdb) set $pc+=4
(gdb) i reg r7
r7             0x2b28099        45252761
(gdb) cont
Continuing.

Program received signal SIGTRAP, Trace/breakpoint trap.
0x00f9bca8 in ?? ()
(gdb) set $pc+=4
(gdb) cont
Continuing.

Program received signal SIGTRAP, Trace/breakpoint trap.
0x00f99f18 in ?? ()
(gdb) x/32w $r1
0xbfffeb20: //  0x00f9bd70      0x00000401 <<BC 0x029006a0      0x00000005
0xbfffeb30:     0x02b24040      0xbfffeb64      0x00625ba4      0x90105248
0xbfffeb40:     0xbfffeba0      0x00000002      0x00000030      0x02b24040
0xbfffeb50:     0xbfffeba0      0x00000000      0x01a44f7c      0x00000000
0xbfffeb60:     0x00000000      0xbfffeb80  //  0x00f84a50      0x00000103 
<<Entry
0xbfffeb70:   >>0x02b28099<<    0x00000001      0xffffff87      0x02b24040
0xbfffeb80: //  0xbfffebf0      0xbfffece8      0xbfffeb90      0x01a44da0
0xbfffeb90:     0xbfffebf0      0xbfffec64      0xbfffeba0      0x00165364
(gdb) q

The Ion frames are correct at least as far as size; the stack seems sensible. 
So is it not fetching the right value?

Original comment by [email protected] on 14 Aug 2013 at 4:14

@GoogleCodeExporter
Copy link
Contributor Author

Bad EmitTailCallVM: the descriptor was being computed wrong.
We had an off-by-one error in the trampoline epilogue.

And now ... BaselineCompiler built and run its first script successfully. Yay!

Original comment by [email protected] on 16 Aug 2013 at 4:27

  • Changed state: Started

@GoogleCodeExporter
Copy link
Contributor Author

Tests that work so far:

var i=0
print(3)
print("hello world")
var i=2;i=i+3;if(i==5)print("five");else print("not five");

However, the inline cache code is not working, as evidenced by
var i;for(i=0;i<5;i++){print("loop"); } print(i);

This prints
loop
1
but it should print
loop
loop
loop
loop
loop
5

Not sure if it's failing the test in the inline cache or needs a bitflip or 
what.


Original comment by [email protected] on 17 Aug 2013 at 6:09

@GoogleCodeExporter
Copy link
Contributor Author

Declaring stage 3. We're about halfway to passing V8, but it will be easier to 
simply attack the test suite.

Original comment by [email protected] on 26 Aug 2013 at 2:26

@GoogleCodeExporter
Copy link
Contributor Author

So far, passing about 75% of tests. From a partial run we are still failing 
these:

FAILURES:
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/asm.js/testExpressions.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/asm.js/testHeapAccess.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/asm.js/testZOOB.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/auto-regress/bug690650.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/baseline/bug857580.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/basic/doMath.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/basic/orNaNTest1.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/basic/orNaNTest2.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/basic/test586387.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/basic/testLogicalNotNaN.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/basic/testNot.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/basic/testTypedArrayClamping.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/basic/testTypedArrays.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/basic/truthies.js

690650 actually crashes. This is due to an unpatched branch, which is probably 
supposed to be patched by a code path we never actually get to (the branch is 
through setNextJump).

The asm.js failures may be endian related, so we will skip those. Let's get the 
basic/ tests to work, since those are straightforward.

Original comment by [email protected] on 12 Sep 2013 at 3:51

@GoogleCodeExporter
Copy link
Contributor Author

After some more work, including one endian bug on Mozilla's part that I worked 
around, we are now passing all but 30 tests:

FAILURES:
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/asm.js/testHeapAccess.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/auto-regress/bug690650.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/baseline/bug857580.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/debug/Environment-find-06.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/debug/Environment-identity-03.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/debug/Frame-onPop-04.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/debug/Frame-onPop-error-return.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/debug/Frame-onPop-throw-return.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/debug/Frame-onStep-resumption-01.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/debug/breakpoint-10.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/debug/breakpoint-resume-03.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/debug/onExceptionUnwind-resumption-01.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/debug/onExceptionUnwind-resumption-02.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/debug/onNewScript-03.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/ion/bug804064.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/ion/bug813784.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/ion/bug851792.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/ion/lookupswitch.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/ion/typed-arrays-1.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/ion/typed-arrays-3.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/jaeger/smallIntTypedArrays.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/jaeger/tableSwitchConst.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/jaeger/tableSwitchDouble.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/jaeger/tableSwitchFloat.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/jaeger/testSetTypedIntArray.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/jaeger/testTableSwitchX.js
    --debugjit /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/jaeger/bug563000/test-throwhook-2.js
    --debugjit /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/jaeger/bug563000/trap-force-return-1.js
    --debugjit /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/jaeger/bug563000/trap-force-return-2.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/v8-v5/check-regexp.js

The V8 test actually crashes, as do most of the tests/debug ones. Let's start 
on jaeger/ next. We seem to have some recurrent issues with typed arrays also.

Original comment by [email protected] on 15 Sep 2013 at 4:31

@GoogleCodeExporter
Copy link
Contributor Author

ion/lookupswitch is another problem with failing to map the return address to 
the JS VM PC.

Original comment by [email protected] on 15 Sep 2013 at 4:40

@GoogleCodeExporter
Copy link
Contributor Author

The problem with the debug versions appears to be a toggled call right before a 
constant pool, such as

0x028ce1a0:     addi    r1,r1,-8
0x028ce1a4:     stw     r5,4(r1)
0x028ce1a8:     stw     r6,0(r1)
0x028ce1ac:     b       0x28ce1b8 << call toggled to skip; jumps into constant 
pool and traps
0x028ce1b0:     b       0x28d01b0
0x028ce1b4:     xoris   r0,r0,0   << constant pool marker
0x028ce1b8:     trap
0x028ce1bc:     trap

We probably need an ensureSpace for those.

Original comment by [email protected] on 15 Sep 2013 at 4:47

@GoogleCodeExporter
Copy link
Contributor Author

ensureSpace fixes debug/Environment-* and onNewScript-03 but debug/Frame-* all 
crash with various forms of memory corruption except
tests/debug/Frame-onStep-resumption-01.js:14:0 Error: Assertion failed: got 
9.78947059482068e-301, expected "pass"
which might be the same error in a less critical section.
We'll fix that one first.

Similar problem with debug/breakpoint-*:
tests/debug/breakpoint-10.js:20:0 Error: Assertion failed: got 
9.78947059482068e-301, expected 42 (resume-03: 20:4 expected "ok")

check-regexp is also jumping into the constant pool, but the fix above didn't 
repair that.

Original comment by [email protected] on 15 Sep 2013 at 5:23

@GoogleCodeExporter
Copy link
Contributor Author

Here's the problem with check-regexp:

Program received signal SIGTRAP, Trace/breakpoint trap.
0x01dbea24 in ?? ()
(gdb) disas 0x1dbea00 0x1dbea30
Dump of assembler code from 0x1dbea00 to 0x1dbea30:
0x01dbea00:     stw     r0,-28(r13)
0x01dbea04:     li      r0,27457
0x01dbea08:     stwu    r0,-4(r1)
0x01dbea0c:     bl      0x1dbea10 <<< this sets LR. this is all part of a call.
0x01dbea10:     b       0x1dbffc0
0x01dbea14:     xoris   r0,r0,0
0x01dbea18:     trap
0x01dbea1c:     trap
0x01dbea20:     trap
0x01dbea24:     trap
0x01dbea28:     trap
0x01dbea2c:     trap
(gdb) disas 0x1dbffc0 0x1dbfff0
Dump of assembler code from 0x1dbffc0 to 0x1dbfff0:
0x01dbffc0:     mflr    r12       <<< LR is set to PPC_BRANCH_STANZA_LENGTH + 
x, inside the constant pool
0x01dbffc4:     addi    r12,r12,20
0x01dbffc8:     stwu    r12,-4(r1)
0x01dbffcc:     bl      0xf9ae50
0x01dbffd0:     li      r0,14520
0x01dbffd4:     lwz     r13,0(r1)
0x01dbffd8:     addi    r1,r1,4
0x01dbffdc:     li      r6,-121
0x01dbffe0:     mr      r5,r3
0x01dbffe4:     lwz     r3,-24(r13)
0x01dbffe8:     lwz     r14,8(r3)
0x01dbffec:     nop

So anything setting a LR-relative branch back needs to have an ensureSpace as 
well. A big one.

Original comment by [email protected] on 15 Sep 2013 at 5:32

@GoogleCodeExporter
Copy link
Contributor Author

Now passing V8 and SunSpider! We'll attack the memory corruption and typed 
array problems tomorrow.

Original comment by [email protected] on 15 Sep 2013 at 6:05

@GoogleCodeExporter
Copy link
Contributor Author

We were using the wrong register in generateDebugTrapHandler. That's what I get 
for simplemindedly translating ARM to PowerPC. After that, the LR fixes and 
some insurance around GCPtrs, we are down to 19 failures:

    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/asm.js/testHeapAccess.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/auto-regress/bug690650.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/debug/Frame-onPop-04.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/debug/Frame-onPop-error-return.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/debug/Frame-onPop-throw-return.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/debug/onExceptionUnwind-resumption-01.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/debug/onExceptionUnwind-resumption-02.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/ion/bug804064.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/ion/bug813784.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/ion/bug851792.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/ion/typed-arrays-1.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/ion/typed-arrays-3.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/jaeger/smallIntTypedArrays.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/jaeger/tableSwitchConst.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/jaeger/tableSwitchDouble.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/jaeger/tableSwitchFloat.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/jaeger/testSetTypedIntArray.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/jaeger/testTableSwitchX.js
    --debugjit /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/jaeger/bug563000/test-throwhook-2.js

Let's start with the typed array and table switch failures next.

Original comment by [email protected] on 15 Sep 2013 at 10:59

@GoogleCodeExporter
Copy link
Contributor Author

The problem with typed arrays was an intermittent error in clamping to uint8. 
Replaced with a fast branchless version; now all typed array tests pass. 15 
failures to go.

Original comment by [email protected] on 16 Sep 2013 at 3:16

@GoogleCodeExporter
Copy link
Contributor Author

Current failures:

    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/asm.js/testHeapAccess.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/auto-regress/bug690650.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/debug/Frame-onPop-04.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/debug/Frame-onPop-error-return.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/debug/Frame-onPop-throw-return.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/ion/bug813784.js
    --debugjit /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/jaeger/bug563000/test-throwhook-2.js

Original comment by [email protected] on 22 Sep 2013 at 4:26

@GoogleCodeExporter
Copy link
Contributor Author

The problem with the debug tests is that the debug handler throws to an *ABI 
compliant* subroutine. It happens to be another JIT prologue, but it has no 
parameters, so the prologue ends up committing a memory fault. We need to 
either find the actual JIT entry point and go there, or figure out a way to 
populate the registers as if the call were coming from ABI compliant code. 
We're looking for something like this in the stack:

(gdb) i reg r3 r4 r5 r6 r7 r8 r9 r10
r3             0xfa81d0 16417232
r4             0x1      1
r5             0xbffff128       3221221672
r6             0x0      0
r7             0x2b28099        45252761
r8             0x2b24040        45236288
r9             0x1      1
r10            0xbfffefa0       3221221280

Original comment by [email protected] on 22 Sep 2013 at 6:26

@GoogleCodeExporter
Copy link
Contributor Author

by JIT entry point I mean Ion code, so that the call becomes Ion ABI->Ion ABI 
instead of Ion ABI->(unprotected edge)->OS X ABI

Original comment by [email protected] on 22 Sep 2013 at 6:26

@GoogleCodeExporter
Copy link
Contributor Author

We were whacking our own stack frame (and clobbering the return address) in the 
forced return handler. All the debug tests now pass.

Original comment by [email protected] on 22 Sep 2013 at 9:18

@GoogleCodeExporter
Copy link
Contributor Author

We're now down to these.

    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/asm.js/testHeapAccess.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/auto-regress/bug690650.js
    /Volumes/BruceDeuce/src/mozilla-24a/js/src/jit-test/tests/ion/bug813784.js

testHeapAccess is pretty clearly an endian problem. methodjit doesn't pass it 
either, but we should make sure it fails in the same way.

I have no idea what the hell is going on with 690650 and no easy way to reduce 
it right now.

I have 813784 reduced to

/* Test funapply when argument vector has less args than callee and callee known
 */
function apply_fun1(a, b, c) {  }
function funapply3() { return apply_fun1.apply({}, arguments); }
funapply3(1,2);
print("apply_fun1");

/* Test funapply when argument vector has less args than callee and callee unkno
wn */
var fun;
function apply_fun2(a, b, c) { }
function funapply4() { return fun.apply({}, arguments); }

fun = apply_fun1;
funapply4(1,2);
print("apply_fun1-2");
fun = apply_fun2;
funapply4(1,2);
print("dundun");

It must be in this order (the print()s may be omitted). It gets a wrong callee 
token, so we must be doing something wrong with the stack somewhere.

Original comment by [email protected] on 22 Sep 2013 at 9:25

@GoogleCodeExporter
Copy link
Contributor Author

Minimized 690650 to

var tryCatch = " try{}catch(e){}";
for (var i = 0; i < 13; ++i) { // see below
  tryCatch += tryCatch;
}

eval("\
    (function() {\
" + tryCatch + "\
    })\
")()

It crashes with bad instruction at 13, and then overflows the branch system at 
14 (hits our offset assert with an offset > 4MB). 12 and below work fine. I 
think this is highly unlikely code but I don't know how to guard against it 
though yet. For 13 iterations, the problem is that the instruction doesn't 
actually get patched. In fact, it's waiting for a patch that never comes, so 
it's not an exploitable crash (the generated placeholder byte uses -1 offset 
for a sui generis patch and so is always 0xfffffc00).

Original comment by [email protected] on 23 Sep 2013 at 12:42

@GoogleCodeExporter
Copy link
Contributor Author

We're going to wontfix 690650 for the moment. If we get any illegal instruction 
crashes of that sort, we'll attribute it to that. I'll patch the test so that 
it fails safely with an assertEq("x",null) or something instead of crashing.

813784 passes if we prevent TryAttachFunApplyStub in BaselineIC.cpp from 
attaching a stub (i.e., return true immediately). This is not a great solution, 
but it implies that most of the code works, and it's shippable. The stub is not 
generated for the first call. The crashing sequence always has to be 1-2; 1-1 
or 2-1 or 2-2 works, but 1-1-2 will crash on 2.

Original comment by [email protected] on 23 Sep 2013 at 3:24

@GoogleCodeExporter
Copy link
Contributor Author

Actually, let's go with that. mxr indicates that this stub changes dramatically 
by Fx26. I can't find anything wrong with our generated stub a/t/m, and it 
might be an issue with Baseline codegen.

Original comment by [email protected] on 23 Sep 2013 at 3:49

@GoogleCodeExporter
Copy link
Contributor Author

Now for G5 branching. While running V8 in 970-generated code, we get

% ../../../obj-ff-dbg/dist/bin/js --no-ion -f run.js
Richards: 205
DeltaBlue: 184
Crypto: 318
Assertion failure: ((*where & 0xF4000000) == 0x40000000) || ((*where & 
0xFC1F0000) == 0x38000000), at 
/Volumes/BruceDeuce/src/mozilla-24a/js/src/ion/ppcosx/Assembler-ppc.cpp:308

Original comment by [email protected] on 23 Sep 2013 at 5:21

@GoogleCodeExporter
Copy link
Contributor Author

Also, these numbers despite being G5-optimized (in theory) are slightly worse 
than the 7450 baseline w/o mcrxr, which got 215, 193 and 354.


Original comment by [email protected] on 23 Sep 2013 at 5:25

@GoogleCodeExporter
Copy link
Contributor Author

Assertion clear. Interestingly, although those three tests are worse, the 
others are somewhat to significantly higher. The G5 on G5 tuned code is about 
5-10% faster overall, which is consistent with methodjit.

G5 passes all tests as well.

Original comment by [email protected] on 24 Sep 2013 at 1:46

@GoogleCodeExporter
Copy link
Contributor Author

I'm considering altering branch marking to put the branch options into a 
separate array that gets passed back. Then the offset can be a full 32-bits, 
the same as x86. That would solve 690650, though I think we can ship with it.

Original comment by [email protected] on 26 Sep 2013 at 9:30

@GoogleCodeExporter
Copy link
Contributor Author

--baseline-eager --no-ion --debugjit, this fails (tests/basic/functionnames.js):

var Foo = function (){
    print(displayName(arguments.callee));
    return function(){};
}();
print(displayName(Foo));

bruce:/home/spectre/src/bruce/mozilla-24b/js/src/jit-test/% 
../../../obj-ff-dbg/dist/bin/js --baseline-eager --no-ion --debugjit -f 
~/badbadd.js
Foo<
Foo
bruce:/home/spectre/src/bruce/mozilla-24b/js/src/jit-test/% 
../../../obj-ff-dbg/dist/bin/js --baseline-eager --no-ion -f ~/badbadd.js
Foo<
Foo</<

The latter is the correct output.

Original comment by [email protected] on 27 Sep 2013 at 5:55

@GoogleCodeExporter
Copy link
Contributor Author

The browser starts, but we have to fix 813784. Google causes the JIT to crash 
with a similar signature and it's probably the same underlying problem.

For 690650, I'm thinking modifying labels to carry the bi/bo bits for us.

Original comment by [email protected] on 7 Oct 2013 at 4:05

@GoogleCodeExporter
Copy link
Contributor Author

Analyzing 813784 a little more. Here is the minimized test case with debugging 
prints:

function apply_fun1(a, b, c) {  }
function funapply3() { return apply_fun1.apply({}, arguments); }
funapply3(1,2);
print("apply_fun1");

var fun;
function apply_fun2(a, b, c) { }
function funapply4() { return fun.apply({}, arguments); }

fun = apply_fun1;
funapply4(1,2);
print("apply_fun1-2");
fun = apply_fun2;
funapply4(1,2);
print("dundun");

We generate four JSFunctions. They look like this, with the native blocks 
marked:

(gdb) x/32w 0x2b33a00
< apply_fun1 >
0x2b33a00:      0x02b23118      0x02b21160      0x00000000      0x0089f360
0x2b33a10:      0x00030001    >>0x02b281a8      0x02b24040      0x0291afa0
< funapply3 >
0x2b33a20:      0x02b23118      0x02b21160      0x00000000      0x0089f360
0x2b33a30:      0x00000001    >>0x02b28120      0x02b24040      0x0291afc0
< apply_fun2 >
0x2b33a40:      0x02b23118      0x02b21160      0x00000000      0x0089f360
0x2b33a50:      0x00031000    >>0x02b38070      0x02b24040      0x0291afe0
< funapply4 >
0x2b33a60:      0x02b23118      0x02b21160      0x00000000      0x0089f360
0x2b33a70:      0x00000001    >>0x02b28230      0x02b24040      0x0291c020

apply_fun2's block is *way* off. I can't find where it even got compiled when 
the stub is run.

The main "function" has a native block at 0x2b28098.

Somehow the Call_ScriptedApplyArguments stub does not trigger the build of 
apply_fun2, and so it fails because the function was never fully constructed. 
We need to look at why it doesn't at least try to run it in the interpreter.

Original comment by [email protected] on 7 Oct 2013 at 4:22

@GoogleCodeExporter
Copy link
Contributor Author

Looking at each native block, all of them have a valid pointer to generated Ion 
code at block+128 except, of course, apply_fun2. That native block is 
incomplete and ends after 96 bytes. Comparison of apply_fun1 and apply_fun2:

(gdb) x/36w 0x2b281a8
0x2b281a8:      0x02b364f0      0x01e21d60      0x00030000      0x01e1a668
0x2b281b8:      0x01e21d60      0x01e1a670      0x0204d400      0x00000000
0x2b281c8:      0x01e22e10      0x02b250a0      0x02b33a00      0x00000000
0x2b281d8:      0x00000001      0x0000000c      0x00000002      0x00000013
0x2b281e8:      0x00000000      0x00000000      0x00000067      0x00000075
0x2b281f8:      0x00000001      0x00000003      0x00000000      0x000000b9
0x2b28208:      0x00000000      0x00000000      0x00010008      0x00000000
0x2b28218:      0x00000000      0x01e22e60      0x00000000      0x00000000
0x2b28228:      0x00ccb9c0      0x00ccb9c0      0x02b364f0      0x01e2317c

(gdb) x/36w 0x2b38070
0x2b38070:      0x00000000      0x02b33a40      0x00000000      0x02b250a0
0x2b38080:      0x00000000      0x00000000      0xb9000000      0x00000000
0x2b38090:      0x0000014f      0x0000015c      0x00000009      0x00000013
0x2b380a0:      0x02b28230      0x02b33a60      0x00000000      0x02b250a0
0x2b380b0:      0x01e18040      0x00000000      0xb9000002      0x00000002
0x2b380c0:      0x0000016f      0x00000196      0x0000000a      0x00000012
0x2b380d0:      0xdadadada      0xdadadada      0xdadadada      0xdadadada
0x2b380e0:      0xdadadada      0xdadadada      0xdadadada      0xdadadada
0x2b380f0:      0xdadadada      0xdadadada      0xdadadada      0xdadadada

Notice we end in uninitialized memory. In addition, the object format is 
totally different. We're not doing something right with this in the stub.

Original comment by [email protected] on 7 Oct 2013 at 4:25

@GoogleCodeExporter
Copy link
Contributor Author

And there it is. For the three "working" functions, their flag and nargs word 
is 0xXXXX0001 (XXXX is the nargs which is irrelevant). For apply_fun2, the 
flag-nargs is 0xXXXX1000. jsfun.h says this is a lazily interpreted script, 
which makes sense. So we need to look at the guard to find out why it passed 
this, and I suspect the reason is an endian problem.

Original comment by [email protected] on 7 Oct 2013 at 5:06

@GoogleCodeExporter
Copy link
Contributor Author

It sure is. We hacked around the two JS_STATIC_ASSERT(IS_LITTLE_ENDIAN);s in 
IonMacroAssembler.h. Now we're going to have to fix those, because they are 
indeed assuming the bit is in a little endian type position.

Original comment by [email protected] on 7 Oct 2013 at 5:11

@GoogleCodeExporter
Copy link
Contributor Author

Repaired by endian fix. This is being posted in TenFourFox 24.

Original comment by [email protected] on 8 Oct 2013 at 2:33

@GoogleCodeExporter
Copy link
Contributor Author

Fixed a problem with the wrong label for toggled calls getting pushed forward 
by ensureSpace().

The only remaining JIT issue I'm aware of is 690650. This crash has not been 
triggered by anything but we should armour ourselves against it.

Then we need to optimize the shizznick out of the guards and try to do 
straightline code such as testStringTruthy, etc.

Original comment by [email protected] on 9 Oct 2013 at 3:04

@GoogleCodeExporter
Copy link
Contributor Author

Well, LOL. Trying to post a blog post triggers 690650. That'll teach me.

So we need to fix that next. We will modify the label class to carry the BI/BO 
bits, and use a full 32-bit displacement like x86 does.

Original comment by [email protected] on 9 Oct 2013 at 3:09

@GoogleCodeExporter
Copy link
Contributor Author

Actually, did this using a hashtable where the jump branch offset is the key 
for the original branch instruction, and nextJump() restores it as it iterates 
through the targets for the label. 690650 passes. Let's see if we can fix the 
bug in comment 41 and we can consider this working.

Original comment by [email protected] on 10 Oct 2013 at 1:56

@GoogleCodeExporter
Copy link
Contributor Author

Ah, screw it.

Followup bugs if needed.

Original comment by [email protected] on 11 Oct 2013 at 4:25

  • Changed state: Done

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant