Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Copy-propagation for single-use memory load #115

Open
l-kent opened this issue Dec 11, 2024 · 4 comments
Open

Copy-propagation for single-use memory load #115

l-kent opened this issue Dec 11, 2024 · 4 comments

Comments

@l-kent
Copy link

l-kent commented Dec 11, 2024

#114 disabled copy propagation for all memory loads, so a temporary variable is now created for every single memory load. This creates a significant amount of redundant assignments. This also means that the BASIL translation will have to be modified to account for this to remove the extra layer of redundant assignments - we already ensure that loads are in their own statement to smooth things for the analyses, so having ASLp also do this now creates an extra set of temporary variable assignments.

before #114:

"address": "0x00000610 (1552)",
"assembly": "ldr x1, [sp]",
"semantics": [
  "Stmt_Assign(LExpr_Array(LExpr_Var(\"_R\"),1),Expr_TApply(\"Mem.read.0\",[8],[Expr_Var(\"SP_EL0\");8;0]))"
]

after #114:

"address": "0x00000610 (1552)",
"assembly": "ldr x1, [sp]",
"semantics": [
  "Stmt_ConstDecl(Type_Bits(64),\"Exp14__5\",Expr_TApply(\"Mem.read.0\",[8],[Expr_Var(\"SP_EL0\");8;0]))",
  "Stmt_Assign(LExpr_Array(LExpr_Var(\"_R\"),1),Expr_Var(\"Exp14__5\"))"
]

Fixing the case where a single load was copied multiple times was very beneficial, but it would be ideal if it were still possible to eliminate redundant temporary variables in the most common case where a load is only used a single time.

@ncough
Copy link
Collaborator

ncough commented Dec 11, 2024

Apologies for the unannounced change here, but we are increasingly interested in producing a clear representation of instruction semantics that facilitates analysis, rather than generating the most compact representation possible. As a result, we are enforcing the invariant that memory loads exist only in their own constant declarations. I'm not entirely sure how this impacts the BASIL translation but I'd hope this invariant simplifies things.

@l-kent
Copy link
Author

l-kent commented Dec 11, 2024

It ends up complicating things a little at present, but it's not too bad.

I understand the idea here, but I don't really understand the point of keeping temporary variables even in this case where they are completely redundant. I completely understand for cases where the result of the load has to be extended to fit the register size though.

Since you are now in effect enforcing memory loads as separate statements, would it make sense to enforce this syntactically too? This would be clearer for BASIL to deal with. Would it make sense to do something like making Mem.read a Stmt_TCall with the lhs as a parameter?

So instead of:

"Stmt_ConstDecl(Type_Bits(64),\"Exp14__5\",Expr_TApply(\"Mem.read.0\",[8],[Expr_Var(\"SP_EL0\");8;0]))",
"Stmt_Assign(LExpr_Array(LExpr_Var(\"_R\"),1),Expr_Var(\"Exp14__5\"))"

It would be something like

"Expr_TCall(\"Mem.read.0\",[8],LExpr_Array(LExpr_Var(\"_R\"),1), [Expr_Var(\"SP_EL0\");8;0]))"

or

"Stmt_VarDeclsNoInit(Type_Bits(64),[\"Exp14__5\"])"",
"Expr_TCall(\"Mem.read.0\",[8],Expr_Var(\"Exp14__5\"), [Expr_Var(\"SP_EL0\");8;0]))"
"Stmt_Assign(LExpr_Array(LExpr_Var(\"_R\"),1),Expr_Var(\"Exp14__5\"))"

but I would prefer not to have that extra temporary declaration unless it is needed because there is a zero extend of the load result or something.

The advantage as I see it from this sort of approach is that the invariant is enforced by the syntax which would particularly simplify things for BASIL.

@ncough
Copy link
Collaborator

ncough commented Dec 22, 2024

I don't really understand the point of keeping temporary variables even in this case where they are completely redundant

This is a fair point. The current outcome is simply the result of the partial evaluation process, which creates these constants for impure operations to simplify subsequent reasoning/transformations. Addressing this limited case will require additional logic in copy prop or perhaps an additional pass. I'll see what I can do.

would it make sense to enforce this syntactically too?

I understand the goal here and would like to enable this, but there is no nice way to represent this with valid ASL (without reintroducing In/Out arguments). We benefit greatly from producing valid ASL, as it keeps the representation executable and simplifies composition.

Alternatively, we could consider a new IR that is produced simply for the purposes of parsing. This could likely encode many of the invariants we currently enforce. But this will take some time to fully explore. Also, it's not clear if we would retain any of this through the BAP lifter.

@l-kent
Copy link
Author

l-kent commented Dec 23, 2024

That all makes sense, thanks for the explanation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants