Revisit optimizations in IntegerMultiplyDecomposer.cpp #7352

Open
0xdaryl opened this issue May 30, 2024 · 2 comments

Comments

@0xdaryl
Contributor

0xdaryl commented May 30, 2024

The IntegerMultiplyDecomposer [1] is an optimization in the x86 backend that strength-reduces integer multiplies into cheaper forms, leveraging LEA instructions as much as possible. This code has barely been updated in the past 20 years, despite significant evolution in Intel and AMD architectures and varying recommendations on the use of LEA instructions in certain circumstances.

This code should be revisited and its optimization decisions reconsidered in the context of modern Intel and AMD microarchitectures.

[1] https://github.com/eclipse/omr/blob/master/compiler/x/codegen/IntegerMultiplyDecomposer.cpp

@BradleyWood
Contributor

@0xdaryl I don't think this code is actually exercised that much because tree simplification will perform the decomposition first most of the time.

@0xdaryl
Contributor Author

0xdaryl commented Dec 9, 2024

The point of the integer multiply decomposer was to leverage sequences of LEAs, adds, shifts, etc. to decompose multiplications by constants into cheaper instruction sequences that are equivalent to the multiply.
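
For readers unfamiliar with the technique, here is a minimal sketch of what such decompositions look like, written as plain C++ with the x86 instructions they would typically map to shown in comments. The helper names are hypothetical and none of this is taken from IntegerMultiplyDecomposer.cpp; it relies only on the fact that LEA can compute `base + index*scale` with a scale of 1, 2, 4, or 8.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative helpers only (hypothetical names, not OMR code).
// Unsigned 32-bit arithmetic is used so wraparound matches a 32-bit multiply.

// x * 9 in one LEA:            lea eax, [rdi + rdi*8]
inline uint32_t mulBy9(uint32_t x) { return x + (x << 3); }

// x * 10 as LEA plus add:      lea eax, [rdi + rdi*4] ; add eax, eax
inline uint32_t mulBy10(uint32_t x) {
    uint32_t t = x + (x << 2);  // t = x * 5
    return t + t;               // t * 2
}

// x * 45 as two chained LEAs:  lea eax, [rdi + rdi*8] ; lea eax, [rax + rax*4]
inline uint32_t mulBy45(uint32_t x) {
    uint32_t t = x + (x << 3);  // t = x * 9
    return t + (t << 2);        // t * 5
}

// x * 7 as shift and subtract: shl by 3, then sub the original value
inline uint32_t mulBy7(uint32_t x) { return (x << 3) - x; }

// Compares each decomposition against a plain multiply for one input.
inline void checkDecompositions(uint32_t x) {
    assert(mulBy9(x) == x * 9u);
    assert(mulBy10(x) == x * 10u);
    assert(mulBy45(x) == x * 45u);
    assert(mulBy7(x) == x * 7u);
}
```

Whether any of these sequences actually beats a single IMUL depends on the target microarchitecture, which is the question this issue raises.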

Does tree simplification prevent decomposition into those sequences because it thinks it is doing the right thing, and is the code generated from the tree simplifier's output any better? Are the instruction sequences from the integer multiply decomposer still optimal on modern x86 hardware (they were determined years ago on early Intel hardware, perhaps P4)? If yes and yes, then perhaps the tree simplifier needs to ask the codegens for guidance before transforming.
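
To make the "ask the codegens for guidance" idea concrete, here is one possible shape for such a query, purely as a sketch. Every name below (`CodeGenGuidance`, `prefersOwnMultiplyDecomposition`, `chooseLowering`) is hypothetical and does not exist in OMR, and the policy in `ExampleX86Guidance` is a placeholder rather than a tuned heuristic.

```cpp
#include <cstdint>

// Hypothetical sketch: none of these types or functions exist in OMR.
enum class MultiplyLowering { KeepMultiply, DecomposeInSimplifier };

// Interface a code generator could implement to advise the simplifier.
struct CodeGenGuidance {
    virtual ~CodeGenGuidance() = default;
    // Returns true if the backend would rather receive the multiply intact
    // so that it can pick its own instruction sequence.
    virtual bool prefersOwnMultiplyDecomposition(int64_t multiplier) const = 0;
};

// Placeholder policy: claim constants that map to a single LEA (3, 5, 9).
struct ExampleX86Guidance : CodeGenGuidance {
    bool prefersOwnMultiplyDecomposition(int64_t multiplier) const override {
        return multiplier == 3 || multiplier == 5 || multiplier == 9;
    }
};

// What the simplifier might do before transforming a multiply by a constant.
inline MultiplyLowering chooseLowering(const CodeGenGuidance &cg,
                                       int64_t multiplier) {
    return cg.prefersOwnMultiplyDecomposition(multiplier)
               ? MultiplyLowering::KeepMultiply
               : MultiplyLowering::DecomposeInSimplifier;
}
```

The design choice here is that the simplifier stays target-independent and only consults a yes/no hook, leaving the actual sequence selection to the backend.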
