
Format "Load/Store + PtrAdd" more carefully. #1533

Merged 1 commit into ykjit:master from commenting_change on Jan 2, 2025

Conversation

@ltratt (Contributor) commented Dec 30, 2024

This commit does two things which might not seem immediately related:

  1. It only prints out instructions which aren't tombstones are equivalent.
  2. It formats loads/stores that inline ptradds as `load %x + y` and `*(%x + y) = z` respectively.

In essence, before this commit we printed out all the "not used" PtrAdd instructions, which meant that one got things like:

```
; %2: ...
...
; %3: ptr = ptr_add %2, 4
; %4: i8 = load %3
mov ...
```

In other words, such PtrAdds didn't generate machine code. That's fine, but it turns out that we also printed some non-PtrAdd-but-dead instructions that the x64 backend's DCE had worked out. That's actively misleading: it meant that some optimisations could be misread as not being as effective as expected, because they would appear in jit-post-opt even though they didn't generate code.

Step #2 above changes the formatting so that the example above is now shown as the following in jit-asm output:

```
; %2: ...
...
; %4: i8 = load %2 + 4
mov ...
```

This means that the x64 backend produces JIT IR that isn't exactly parseable -- at least not right now. It's also different than jit-post-opt.

This initially felt a bit surprising to me, but then I realised that jit-asm really means "backend specific output" and suddenly it made more sense to me. The x64 backend *is* manipulating the trace IR in a way that lets it generate better code, and once upon a time I briefly considered giving it a new IR before realising that I could repurpose the trace IR. So if one views the jit-asm output as "another IR that happens to mostly look the same as trace IR" it all makes sense -- at least to me!
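
As a rough illustration of the two rules above, here is a hypothetical Rust sketch. It is not the yk code: `Inst`, `Trace` and everything else in it are invented, and the hardcoded `i8` type and index-based operands are simplifications. It just shows the shape of the change: skip instructions that generate no machine code, and fold an inlined ptr_add into the load/store that consumes it.

```rust
// Hypothetical sketch, not the yk formatter.
enum Inst {
    Tombstone,                        // dead; the backend's DCE removed it
    PtrAdd { base: usize, off: i32 }, // inlined into its consumer, emits no code
    Load { ptr: usize },
    Store { ptr: usize, val: usize },
}

struct Trace {
    insts: Vec<Inst>,
}

impl Trace {
    // Render a pointer operand, folding in `%base + off` if it refers to an
    // inlined ptr_add (which will not generate machine code of its own).
    fn fmt_ptr(&self, ptr: usize) -> String {
        match &self.insts[ptr] {
            Inst::PtrAdd { base, off } => format!("%{base} + {off}"),
            _ => format!("%{ptr}"),
        }
    }

    // Produce the `; ...` comment lines interleaved with the assembler.
    fn comment_lines(&self) -> Vec<String> {
        self.insts
            .iter()
            .enumerate()
            .filter_map(|(i, inst)| match inst {
                // No machine code is generated for these, so don't print them.
                Inst::Tombstone | Inst::PtrAdd { .. } => None,
                Inst::Load { ptr } => {
                    Some(format!("; %{i}: i8 = load {}", self.fmt_ptr(*ptr)))
                }
                Inst::Store { ptr, val } => {
                    Some(format!("; *({}) = %{val}", self.fmt_ptr(*ptr)))
                }
            })
            .collect()
    }
}

fn main() {
    let trace = Trace {
        insts: vec![
            Inst::Tombstone,                  // %0: dead
            Inst::Tombstone,                  // %1: dead
            Inst::Tombstone,                  // %2: stand-in for the elided "%2: ..."
            Inst::PtrAdd { base: 2, off: 4 }, // %3: folded into %4 and %5
            Inst::Load { ptr: 3 },            // %4
            Inst::Store { ptr: 3, val: 4 },   // %5
        ],
    };
    for line in trace.comment_lines() {
        println!("{line}");
    }
}
```

Run as-is, this prints `; %4: i8 = load %2 + 4` and `; *(%2 + 4) = %4`, mirroring the second example above.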

@vext01 (Contributor) commented Jan 1, 2025

Happy new year!

> It only prints out instructions which aren't tombstones are equivalent.

I didn't understand this part. Perhaps a typo?

> jit-asm really means "backend specific output"

> So if one views the jit-asm output as "another IR that happens to mostly look the same as trace IR" it all makes sense

Shouldn't jit-asm just print the emitted assembler code? At least, that's what I initially designed it to do. I guess you can think of asm as an IR, or is there something deeper here?

The revised commit message (commit e0ebc85) reads:

This commit does two things which might not seem immediately related:

  1. It only prints out instructions which aren't tombstones or
     equivalent.
  2. It formats loads/store that inline ptradds as `load %x + y`
     and `*(%x + y) = z` respectively.

In essence, before this commit we printed out all the "not used"
`PtrAdd` instructions, which meant that one got things like:

```
; %2: ...
...
; %3: ptr = ptr_add %2, 4
; %4: i8 = load %3
mov ...
```

In other words, such `PtrAdd`s didn't generate machine code. That's fine
but it turns out that we also printed some non-`PtrAdd`-but-dead
instructions too that the x64 backend's DCE had worked out. That's
actively misleading: it meant that some optimisations could be misread
as not being as effective as expected, because they would appear in
`jit-post-opt` even though they didn't generate code.

Step #2 above changes the formatting so that the example above is now
shown as the following in `jit-asm` output:

```
; %2: ...
...
; %4: i8 = load %2 + 4
mov ...
```

This means that the x64 backend produces JIT IR that isn't exactly
parseable -- at least not right now. It's also different than
`jit-post-opt`.

This initially felt a bit surprising to me, but then I realised that
`jit-asm` really means "backend specific output" and suddenly it made
more sense to me. The x64 backend *is* manipulating the trace IR in a
way that lets it generate better code and once upon a time I briefly
considered giving it a new IR before realising that I could repurpose
the trace IR. So if one views the `jit-asm` output as "another IR that
happens to mostly look the same as trace IR" it all makes sense -- at
least to me!
ltratt force-pushed the commenting_change branch from 107f7da to e0ebc85 on January 1, 2025 at 10:58.
@ltratt (Contributor, Author) commented Jan 1, 2025

Force-pushed a commit message tweak: s/are/or/.

We should probably rename jit-asm to backend or something like that, because there's no need for the backend to even print assembly: it might want to tell us completely different things.

@vext01 (Contributor) commented Jan 2, 2025

Ok, I see what you are saying now.

I found this surprising too.

> The x64 backend is manipulating the trace IR

One question I have for the future: should there be a platform-specific optimisation "pass" that runs after generic optimisations, but before the backend is invoked? That way the platform-specific IR optimisations are explicit, rather than having the backend do covert optimisations and carry special IR knowledge.

Maybe we could introduce a new jit-post-backend-opt phase?

LLVM does something similar. The MIR starts generic and eventually becomes platform specific later in the pipeline.

(Obviously I'm not saying we should do any of this now, just ruminating)
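
For what it's worth, a minimal sketch of that shape, assuming a trait-style split between the generic optimiser and the backend. None of these names (`Backend`, `pre_codegen_opt`, a `jit-post-backend-opt` phase) exist in yk; they are invented for illustration only.

```rust
// Illustrative only: an explicit backend-specific optimisation step between
// the generic optimiser and code generation, which could be logged under a
// hypothetical `jit-post-backend-opt` debug phase.

struct Module; // stand-in for the trace IR

trait Backend {
    /// Optional, backend-specific IR rewrites (e.g. marking ptr_adds that
    /// x64 addressing modes can absorb so they generate no code).
    fn pre_codegen_opt(&self, _m: &mut Module) {}

    /// Emit machine code for the (possibly rewritten) IR.
    fn codegen(&self, m: &Module) -> Vec<u8>;
}

fn generic_opt(_m: &mut Module) {} // today's generic optimiser

fn compile<B: Backend>(backend: &B, mut m: Module) -> Vec<u8> {
    generic_opt(&mut m);             // logged as jit-post-opt
    backend.pre_codegen_opt(&mut m); // would be logged as jit-post-backend-opt
    backend.codegen(&m)              // logged as jit-asm
}

struct X64;

impl Backend for X64 {
    fn codegen(&self, _m: &Module) -> Vec<u8> {
        Vec::new() // machine code would go here
    }
}

fn main() {
    let _code = compile(&X64, Module);
}
```

The point is only that backend-specific rewrites would become a named, observable step rather than something the code generator does implicitly.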

@ltratt (Contributor, Author) commented Jan 2, 2025

Such a phase is inherently backend specific, so it can't be generic.

@vext01 (Contributor) commented Jan 2, 2025

I think you misunderstood.

I'll merge this and we can discuss on my return.

vext01 added this pull request to the merge queue on Jan 2, 2025.
Merged via the queue into ykjit:master with commit c82a074 on Jan 2, 2025; 2 checks passed.
@ltratt (Contributor, Author) commented Jan 2, 2025

I think I understood this bit:

> should there be a platform-specific optimisation "pass" that runs after generic optimisations, but before the backend is invoked

but what I'm saying is: that's kind of a contradiction in terms in our setup. Whatever happens after opt/mod.rs is the backend: if the backend wants to do optimisations of its own, it can do so, or not (and if it wants 27 phases, it can do that too). At a high level we don't gain anything (at least, not so far as I can currently see) by dictating to the backend what it can/should do.

FWIW, I wish we had a better term for this than "backend" or "codegen". I'm almost inclined to say something like
