-
Notifications
You must be signed in to change notification settings - Fork 43
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merge local variables and parameter declarations when possible #188
Comments
How does the input LLVM IR look like? We need to remember that we're not in the business of roundtripping C code. Rellic produces C from LLVM IR in a way that captures the semantics of LLVM IR as closely as feasible. |
Uh I think we need to hear @pgoodman about this particular point: I was under the impression he wants to produce the "most beautiful" C code we can from the IR while keeping the semantics the same |
@surovic we are also in the business of roundtripping C code. That has always been an explicit goal for Rellic. We can talk on Slack about the use-cases. The future of anvill-produced bitcode may also include LLVM debug information metadata, and so gaining first-class support is key. |
Agreed. My point mainly here is that if we get input like the following: ; Function Attrs: noinline nounwind optnone uwtable
define dso_local i32 @foo(i32 %a, i32 %b, i32 %c) #0 {
entry:
%a.addr = alloca i32, align 4
%b.addr = alloca i32, align 4
%c.addr = alloca i32, align 4
store i32 %a, i32* %a.addr, align 4
store i32 %b, i32* %b.addr, align 4
store i32 %c, i32* %c.addr, align 4
%0 = load i32, i32* %a.addr, align 4
%mul = mul nsw i32 %0, 2
store i32 %mul, i32* %a.addr, align 4
%1 = load i32, i32* %a.addr, align 4
%2 = load i32, i32* %b.addr, align 4
%mul1 = mul nsw i32 %1, %2
%3 = load i32, i32* %c.addr, align 4
%add = add nsw i32 %mul1, %3
ret i32 %add
} I think we should emit C that's true to the bitcode, if we can, instead of doing transformations that the user might not expect or want. I also think that the suggested transformation is more easily done by running an unsigned int foo(unsigned int a, unsigned int b, unsigned int c);
unsigned int foo(unsigned int a, unsigned int b, unsigned int c) {
return (a << 1U) * b + c;
} This is rellic output for the above C source compiled with |
The way I see Rellic is similar but different. Instead of "true to the bitcode," what I want is "if compiled, could be proven equivalent to the bitcode." There is thus a spectrum of of outputs that meets this criteria. On one end, which would be truest to the bitcode, we have "literal translation of bitcode to stuff with gotos and labels and what not." Somewhere near the middle, we have what Rellic does, which frankly isn't true to the bitcode at all -- it has restructured things in a dramatic way, and via z3, possibly altered the complexity or nature of conditions and other expressions. z3's machinations are opaque. Then further toward the other end of the spectrum, we bring in more and more high-level features, be they names or types or actual language features/syntaxes. Then there's the bitcode itself. Why is the bitcode the way it is? Sometimes it is full of crap that spills out of LLVM optimization passes, e.g. naming variables with TL;DR: we should accept that we're not true to the bitcode already, and that the bitcode isn't necessarily something that we want to be true to -- just equivalent. Heck, in anvill, we often try to change the bitcode for Rellic's benefit. I think we should strive for "idiomatic" code, which is itself a wishy-washy target. How do we pin down what idiomatic code is? We look at the output, we make a decision saying "It'd be more comprehensible if it looked this way" and then we make a change. Sometimes those opinions on "what is more comprehensible" are going to raise conflicts -- conflicts that are the same depth as tabs vs. spaces. In those cases, we find a client and they decide. |
Rellic generates temporary local variables to store parameter values, instead of directly using the parameters themselves:
roundtrips to
It would be nice if we could detect when variables such as
a_var0
,b_var1
andc_var2
are redundant and could be substituted with the parameters instead.The text was updated successfully, but these errors were encountered: