Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for parsing extended assembly code including asm goto #161

Open
wants to merge 27 commits into
base: develop
Choose a base branch
from

Conversation

WernerDrasche
Copy link

@WernerDrasche WernerDrasche commented Jan 11, 2024

See #81

@michael-schwarz michael-schwarz linked an issue Jan 12, 2024 that may be closed by this pull request
@michael-schwarz michael-schwarz changed the title practical-asm Support for parsing extended assembly code including asm goto Jan 13, 2024
Copy link
Member

@michael-schwarz michael-schwarz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this PR, it is nice to see CIL (and in turn Goblint) getting support for this feature!

On top of the inline comments, here's a few remarks:

  • It would be good to include additional tests for the new features.
  • We will internally discuss what to do with zrapp
  • We should leave a comment somewhere that for the purpose of computing the successors, we do not consider the gotos that could happen in inline assembly so this is clear to everyone. Probably even in the documentation.

src/main.ml Outdated Show resolved Hide resolved
src/frontc/frontc.mli Outdated Show resolved Hide resolved
src/frontc/cprint.ml Outdated Show resolved Hide resolved
src/frontc/cprint.ml Outdated Show resolved Hide resolved
src/frontc/cprint.ml Outdated Show resolved Hide resolved
src/ext/liveness/usedef.ml Outdated Show resolved Hide resolved
src/ext/liveness/usedef.ml Outdated Show resolved Hide resolved
src/ext/syntacticsearch/funcVar.ml Outdated Show resolved Hide resolved
src/frontc/cabs.ml Outdated Show resolved Hide resolved
@@ -243,11 +243,14 @@ let eh_handle_inst i eh =
(eh_kill_mem eh;
eh_kill_addrof_or_global eh;
eh)
(* todo: didn't find location where stmts are handled *)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll have to look up zrapp again to decide what to do there.

@michael-schwarz michael-schwarz added this to the 2.0.4 milestone Jan 13, 2024
@WernerDrasche
Copy link
Author

  • The main.ml changes were unintentionally forgotten testing utilities.
  • asmlabelslst_ne is not used in first step because we semantically distinguish between:
    • labels field was provided (Some [...])
    • labels field was omitted (None)
  • the check asm function ensures that goto attribute <=> labels field (otherwise we would need state in the parser).
    it doesn't return unit because for whatever reason when I write in cparser.mly inside asm rule:
    { let asm = ASM ($1 ....) in checkAsm asm; asm } dune build fails with a syntax error in cparser.ml (can't really debug that because it's generated.
  • as cil cfg is only used to make the goblint cfg, we don't do anything special for the cil one
  • ASM a does not work for some reason (replying to the comment in email)
  • I don't know what to do for the dummy instruction (do we need to introduce a new instruction type for that?)

@WernerDrasche
Copy link
Author

The dummyInstr can be safely deleted from cil.mli I think, as after grepping for it inside gobling-analyzer and goblint-cil no matches were found. Otherwise I have no clue on how to implement a dummy instruction using either set | vardecl | call.

print ":"; space ();
print_commas false print_string clobs
print_commas false print_string clobs;
if labels_exist then
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you still need the if here, the match should catch all cases.

@michael-schwarz
Copy link
Member

It would be good to include additional tests for the new features.

I think has not been addressed yet, if I see correctly?

src/cil.ml Outdated Show resolved Hide resolved
@michael-schwarz
Copy link
Member

The dummyInstr can be safely deleted from cil.mli I think, as after grepping for it inside gobling-analyzer and goblint-cil no matches were found. Otherwise I have no clue on how to implement a dummy instruction using either set | vardecl | call.

I think it would still be useful to keep it, as there are other users of goblint-cil. Maybe you can add a vardecl instruction for a new dummy variable? I think this is the best option.

@michael-schwarz
Copy link
Member

Also, feel free to mark suggestions that you have implemented and there's nothing more to discuss as resolved, that makes it easier for us to keep track of what remains still TODO.

@sim642 sim642 self-requested a review February 5, 2024 15:39
Copy link
Member

@sim642 sim642 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a lot of TODOs with commented out Asm instr cases that need to be moved to Asm stmtkind cases. It's probably that they're included in some default match cases, which is why addition of Asm to stmtkind didn't cause compilation errors there.

src/cil.mli Outdated
Comment on lines 1755 to 1759
(** A instr to serve as a placeholder *)
(* todo: maybe this can be removed? *)
(*
val dummyInstr: instr
*)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There isn't any no-op instr though, that's why Asm was abused for it. It's not needed by anyone in the OCaml ecosystem, so I would be fine with removing it until #145 can provide a replacement no-op instruction.

src/cil.ml Outdated Show resolved Hide resolved
@@ -6517,6 +6550,7 @@ and succpred_stmt s fallthrough rlabels =
| hd :: tl -> link s hd ;
succpred_block b fallthrough rlabels
end
| Asm _ -> trylink s fallthrough
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If Asm now includes labels, shouldn't CFG construction do something with those?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see below.

Comment on lines +197 to +201
| Asm (_, _, outs, ins, _, _, _) ->
List.iter (fun (_, _, exp) -> match exp with
| Lval(Var v, _) -> varUsed := VS.add v !varUsed
| _ -> ()
) ins;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this different than the Asm code removed from above?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It now only adds read variables, which was a request from our supervisors if I remember correctly.

src/cfg.ml Outdated
@@ -225,6 +225,7 @@ and cfgStmt (s: stmt) (next:stmt option) (break:stmt option) (cont:stmt option)
cfgBlock blk (Some s) next (Some s) nodeList rlabels
(* Since all loops have terminating condition true, we don't put
any direct successor to stmt following the loop *)
| Asm _ -> addOptionSucc next (* the extra edges are added inside goblint's cfg *)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be done here for consistency because everything else is done here. Otherwise the support appears broken by anyone else wanting to use CIL.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is unclear how to insert such edges, as these are only edges that are taken potentially and such a ASM block has several successors and jumps to these locations may happen at arbitrary locations within the inline assembly.

I cannot think of a general one-size-fits-all approach to include edges here, it depends on what one wants to do. In the case of Goblint over-approximating by taking any of the edges after the block has executed is what we want -- other tools might to inspect what goes on inside the assembly block or may want to underapproximate reachability etc.
I think the solution is proposed here is the best we can do.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the handling of computed goto-s already that way, though? I think CIL overapproximates their targets as well, yielding more than maybe necessary.

Following in the same spirit, it would make sense to use the overapproximation here as well. That way CIL's CFG is at least sound, i.e. it includes all possible control flows. Otherwise, it's neither sound nor complete, which is very ambiguous and unreliable. Downstream tools can always be more precise about it and exclude some targets if they want.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure? I can find no reference for this. I was under the impression that CIL assigns each label that has it's address taken an integer and then computes with these and goes to a jump table to decide where to jump to. The resulting CFG is still a "normal" CFG that corresponds to a C program.

Even if we insert these edges into the CFG, the resulting CFG would not be sound: Jumps from inline Assembly do not necessarily happen at the end or the beginning of an inline ask block. To obtain a CFG that is actually sound, we would have to look into the ASM block to insert edges appropriately, which I don't think we want to do.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure? I can find no reference for this. I was under the impression that CIL assigns each label that has it's address taken an integer and then computes with these and goes to a jump table to decide where to jump to. The resulting CFG is still a "normal" CFG that corresponds to a C program.

But that is an overapproximation! Instead of computing which exact labels a computed goto might jump to, it overapproximates that to all the labels. Of course not all computed gotos can jump to all those targets in the concrete, but that's fine: some edges are simply never used.

It would be similar for ASM gotos: have all the possible targets (which are explicitly listed by the user) in the CFG, but some of those might never be used. Without analyzing the assembly, we just cannot be any more precise.
Which exact instruction does which exact jumps is also a matter of precision, as opposed to having all of them jump to everywhere.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was decided to have a flag to toggle this behavior on or off at Gobcon.

@michael-schwarz
Copy link
Member

The currently still failing task is CC=gcc make test/asm_goto3.

@michael-schwarz
Copy link
Member

The error message is

Entering directory '/home/michael/Documents/goblint-cil/cil'
cd ./small1; /home/michael/Documents/goblint-cil/cil/_build/default/test/../bin/cilly --mode=GNUCC --decil --save-temps  --strictcheck --stats --nomerge --commPrintLn \
       -c -std=gnu90 -D_GNUCC -Wall -g -ggdb -D_DEBUG -S -o asm_goto3.s asm_goto3.c
First CIL check
Timings:
TOTAL                           0.001 s
  parse                           0.000 s
  convert to CIL                  0.000 s
  printCIL                        0.000 s
Timing used
Memory statistics: total=5.10MB, max=1.02MB, minor=4.93MB, major=0.91MB, promoted=0.74MB
    minor collections=4  major collections=0 compactions=0
./asm_goto3.cil.c: In function ‘code’:
./asm_goto3.cil.c:15:30: error: expected identifier before ‘)’ token
   __asm__  goto   ("nop": : : : );
                              ^
Makefile:175: recipe for target 'test/asm_goto3' failed
make: *** [test/asm_goto3] Error 1

@michael-schwarz
Copy link
Member

/* Generated by Goblint-CIL v. %%VERSION_NUM%% */
/* print_CIL_Input is false */

//#line  3 "testharness.h"
extern int printf(char const   *format  , ...) ;
//#line  12
extern void exit(int  ) ;
//#line  3 "asm_goto3.c"
void code(void) 
{ 


  {
//#line  5
  __asm__  goto   ("nop": : : : );
//#line  7
  printf((char const   *)"Success\n");
//#line  7
  exit(0);
}
}

This the produced C program.

@michael-schwarz
Copy link
Member

And the error message is

gcc small1/asm_goto3.cil.c
small1/asm_goto3.cil.c: In function ‘code’:
small1/asm_goto3.cil.c:15:33: error: expected identifier before ‘)’ token
   __asm__  goto   ("nop": : : : );
                                 ^

@sim642 sim642 modified the milestones: 2.0.4, 2.0.5 Jul 24, 2024
@jerhard jerhard self-assigned this Jul 30, 2024
@sim642 sim642 modified the milestones: 2.0.5, 2.0.6 Nov 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support for asm goto
4 participants