-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support for parsing extended assembly code including asm goto
#161
base: develop
Are you sure you want to change the base?
Conversation
asm goto
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for this PR, it is nice to see CIL (and in turn Goblint) getting support for this feature!
On top of the inline comments, here's a few remarks:
- It would be good to include additional tests for the new features.
- We will internally discuss what to do with
zrapp
- We should leave a comment somewhere that for the purpose of computing the successors, we do not consider the
goto
s that could happen in inline assembly so this is clear to everyone. Probably even in the documentation.
src/ext/zrapp/availexps.ml
Outdated
@@ -243,11 +243,14 @@ let eh_handle_inst i eh = | |||
(eh_kill_mem eh; | |||
eh_kill_addrof_or_global eh; | |||
eh) | |||
(* todo: didn't find location where stmts are handled *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'll have to look up zrapp
again to decide what to do there.
|
The dummyInstr can be safely deleted from cil.mli I think, as after grepping for it inside gobling-analyzer and goblint-cil no matches were found. Otherwise I have no clue on how to implement a dummy instruction using either set | vardecl | call. |
src/frontc/cprint.ml
Outdated
print ":"; space (); | ||
print_commas false print_string clobs | ||
print_commas false print_string clobs; | ||
if labels_exist then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you still need the if here, the match should catch all cases.
I think has not been addressed yet, if I see correctly? |
I think it would still be useful to keep it, as there are other users of goblint-cil. Maybe you can add a vardecl instruction for a new dummy variable? I think this is the best option. |
Also, feel free to mark suggestions that you have implemented and there's nothing more to discuss as resolved, that makes it easier for us to keep track of what remains still TODO. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a lot of TODOs with commented out Asm
instr
cases that need to be moved to Asm
stmtkind
cases. It's probably that they're included in some default match cases, which is why addition of Asm
to stmtkind
didn't cause compilation errors there.
src/cil.mli
Outdated
(** A instr to serve as a placeholder *) | ||
(* todo: maybe this can be removed? *) | ||
(* | ||
val dummyInstr: instr | ||
*) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There isn't any no-op instr
though, that's why Asm
was abused for it. It's not needed by anyone in the OCaml ecosystem, so I would be fine with removing it until #145 can provide a replacement no-op instruction.
@@ -6517,6 +6550,7 @@ and succpred_stmt s fallthrough rlabels = | |||
| hd :: tl -> link s hd ; | |||
succpred_block b fallthrough rlabels | |||
end | |||
| Asm _ -> trylink s fallthrough |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If Asm
now includes labels, shouldn't CFG construction do something with those?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see below.
| Asm (_, _, outs, ins, _, _, _) -> | ||
List.iter (fun (_, _, exp) -> match exp with | ||
| Lval(Var v, _) -> varUsed := VS.add v !varUsed | ||
| _ -> () | ||
) ins; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this different than the Asm
code removed from above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It now only adds read variables, which was a request from our supervisors if I remember correctly.
src/cfg.ml
Outdated
@@ -225,6 +225,7 @@ and cfgStmt (s: stmt) (next:stmt option) (break:stmt option) (cont:stmt option) | |||
cfgBlock blk (Some s) next (Some s) nodeList rlabels | |||
(* Since all loops have terminating condition true, we don't put | |||
any direct successor to stmt following the loop *) | |||
| Asm _ -> addOptionSucc next (* the extra edges are added inside goblint's cfg *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be done here for consistency because everything else is done here. Otherwise the support appears broken by anyone else wanting to use CIL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is unclear how to insert such edges, as these are only edges that are taken potentially and such a ASM block has several successors and jumps to these locations may happen at arbitrary locations within the inline assembly.
I cannot think of a general one-size-fits-all approach to include edges here, it depends on what one wants to do. In the case of Goblint over-approximating by taking any of the edges after the block has executed is what we want -- other tools might to inspect what goes on inside the assembly block or may want to underapproximate reachability etc.
I think the solution is proposed here is the best we can do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't the handling of computed goto-s already that way, though? I think CIL overapproximates their targets as well, yielding more than maybe necessary.
Following in the same spirit, it would make sense to use the overapproximation here as well. That way CIL's CFG is at least sound, i.e. it includes all possible control flows. Otherwise, it's neither sound nor complete, which is very ambiguous and unreliable. Downstream tools can always be more precise about it and exclude some targets if they want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure? I can find no reference for this. I was under the impression that CIL assigns each label that has it's address taken an integer and then computes with these and goes to a jump table to decide where to jump to. The resulting CFG is still a "normal" CFG that corresponds to a C program.
Even if we insert these edges into the CFG, the resulting CFG would not be sound: Jumps from inline Assembly do not necessarily happen at the end or the beginning of an inline ask block. To obtain a CFG that is actually sound, we would have to look into the ASM block to insert edges appropriately, which I don't think we want to do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure? I can find no reference for this. I was under the impression that CIL assigns each label that has it's address taken an integer and then computes with these and goes to a jump table to decide where to jump to. The resulting CFG is still a "normal" CFG that corresponds to a C program.
But that is an overapproximation! Instead of computing which exact labels a computed goto might jump to, it overapproximates that to all the labels. Of course not all computed gotos can jump to all those targets in the concrete, but that's fine: some edges are simply never used.
It would be similar for ASM gotos: have all the possible targets (which are explicitly listed by the user) in the CFG, but some of those might never be used. Without analyzing the assembly, we just cannot be any more precise.
Which exact instruction does which exact jumps is also a matter of precision, as opposed to having all of them jump to everywhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was decided to have a flag to toggle this behavior on or off at Gobcon.
The currently still failing task is |
The error message is
|
/* Generated by Goblint-CIL v. %%VERSION_NUM%% */
/* print_CIL_Input is false */
//#line 3 "testharness.h"
extern int printf(char const *format , ...) ;
//#line 12
extern void exit(int ) ;
//#line 3 "asm_goto3.c"
void code(void)
{
{
//#line 5
__asm__ goto ("nop": : : : );
//#line 7
printf((char const *)"Success\n");
//#line 7
exit(0);
}
} This the produced C program. |
And the error message is
|
See #81