LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filled before November 26, 2021. Use github to submit LLVM bugs

Bug 31652 - Loop unswitch and GVN interact badly and miscompile
Summary: Loop unswitch and GVN interact badly and miscompile
Status: NEW
Alias: None
Product: libraries
Classification: Unclassified
Component: Scalar Optimizations (show other bugs)
Version: trunk
Hardware: All All
: P normal
Assignee: Unassigned LLVM Bugs
URL:
Keywords: miscompilation
Depends on:
Blocks: 27506
  Show dependency tree
 
Reported: 2017-01-16 05:19 PST by Nuno Lopes
Modified: 2021-07-14 05:48 PDT (History)
9 users (show)

See Also:
Fixed By Commit(s):


Attachments
Reduced IR test case (1.51 KB, text/plain)
2017-01-16 05:19 PST, Nuno Lopes
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Nuno Lopes 2017-01-16 05:19:41 PST
Created attachment 17845 [details]
Reduced IR test case

Loop unswitch and GVN when taken together can miscompile code due to disagreeing semantics of branch on undef.
In our opinion it's loop unswitch that is incorrect.

Test case in C:
static volatile int always_false = 0;

void maybe_init(int *p) {}

int f(int limit0) {
  int maybe_undef_loc;
  maybe_init(&maybe_undef_loc);

  int limit = limit0;
  int total = 0;
  for (int i = 0; i < limit; i++) {
    total++;
    if (always_false) {
      if (maybe_undef_loc != (limit0 + 10)) {
        total++;
      }
    }
    limit = limit0 + 10;
  }
  return total;
}

int printf(const char *, ...);

int main(int argc, char **argv) {
  printf("f(10) = %d\n", f(10));
  return 0;
}


I believe this test case has no UB even at C level. It should print "f(10) = 20".

I've attached a reduced IR test case.


Running LICM, the comparison of maybe_undef_loc gets hoisted:

$ opt -S bug.ll -loop-rotate -licm
...
loop.body.lr.ph:
  %undef_loc = load i32, i32* %maybe_undef_loc, align 4
  %add = add nsw i32 %limit0, 10
  %cmp3 = icmp ne i32 %undef_loc, %add
  %limit2 = add nsw i32 %limit0, 10
  br label %loop.body
...


This is still ok. But then adding loop unswitch:

$ opt -S bug.ll -loop-rotate -licm -loop-unswitch
...
loop.body.lr.ph:
  %undef_loc = load i32, i32* %maybe_undef_loc, align 4
  %add = add nsw i32 %limit0, 10
  %cmp3 = icmp ne i32 %undef_loc, %add
  %limit2 = add nsw i32 %limit0, 10
  br i1 %cmp3, label %loop.body.lr.ph.split.us, label %loop.body.lr.ph.loop.body.lr.ph.split_crit_edge
...

Loop unswitch introduce a branch on %cmp3, which will be undef once we inline maybe_init(). Hence loop unswitch assumes branch on undef is a non-deterministic jump (so not UB).
By running GVN after loop unswitch, we clearly see that GVN has a different perspective on the semantics of branch on undef. It produces this diff:
-  %limit2 = add nsw i32 %limit0, 10
...
-  %cmp = icmp slt i32 %i2, %limit2
+  %cmp = icmp slt i32 %i2, %undef_loc

I'm not including the context here, but to be able to justify this transformation, branch on undef has to be UB, not a non-deterministic jump as assumed by loop unswitch above.

Putting all things together:
$ opt -S bug.ll -loop-rotate -licm -loop-unswitch -gvn | opt -S -O2 | llc -o x.S && clang x.S -o x && ./x
f(10) = 1

This is wrong. Should be 20, not 1.

This example shows how Loop unswitch and GVN can produce a miscompilation end-to-end. The bug can be fixed by using the proposed freeze instruction in loop unswitch.

(Example by Sanjoy)
Comment 1 Daniel Berlin 2017-01-16 16:44:32 PST
Are you sure it's actually GVN?
GVN uses a lot of the instruction simplification and CFG simplification utilities, my guess is it's really down there in one of them :)
Comment 2 Daniel Berlin 2017-01-16 16:44:50 PST
Which would be much worse of course, because it would mean it affected more passes
Comment 3 Sanjoy Das 2017-01-16 17:08:48 PST
(In reply to comment #1)
> Are you sure it's actually GVN?

The most aggressive thing we can say at this time is that the bug is in (GVN + LoopUnswitch).  It is difficult to narrow it down further today without formalizing undef. :)

In the new poison formalization, the bug is *not* in GVN, but in loop unswitch.

> GVN uses a lot of the instruction simplification and CFG simplification
> utilities, my guess is it's really down there in one of them :)

In brief (at the risk of potentially sounding misleading): if, in *today's* undef formalization we *had to decide* the bug was in GVN (i.e. change the definition such that the bug was not in LoopUnswitch), then the blame would fall on GVN::propagateEquality.
Comment 4 Daniel Berlin 2017-01-16 17:14:47 PST
Except propagateEquality does not do anything with undef.
Only utilities it calls do.
Otherwise, the only undef handling it has is to  treat values from unreachable blocks as undef.
Comment 5 Sanjoy Das 2017-01-16 17:39:34 PST
(In reply to comment #4)
> Except propagateEquality does not do anything with undef.

It does not have to do anything specifically with undef.  The bug is more about taking into account the *possibility* of some value being undef either at runtime, or discovered to be undef after optimization that happens *later*.

I believe GVN::propagateEquality does this transform:

  if (x == y) {
    use(x); // divides by x
  }

=>

  if (x == y) {
    use(y); // (now) divides by y
  }

Now if x was 1, and y was undef (at runtime or discovered to be undef later, during GVN it is a normal llvm::Value that you don't otherwise have information about), what did the transform just do?

One interpretation is that with x == 1 and y == undef, the original program,

  if (1 == undef) {
    use(1);
  }

is undefined because it branches on `1 == undef` == `undef`.  Under this interpretation GVN::propagateEquality was correct since the program was undefined originally and any transform on it is correct by definition.

However, under this interpretation of branching on undef, loop unswitch is incorrect, since it effectively "hoists" branches "out of control flow":

  for (;;) {
    if (c0) {
      if (c1) {
      }
    }
  }
 
to

  if (c1) {
    for (;;) {
      if (c0) {
      }
    }
  } else {
    // other version ...
  }

Given the semantic "br undef" is UB, we've miscompiled the program if c0 was false and c1 was undef since we transformed a program that would never branch on undef (no UB) to one that does (has UB).



The other interpretation is that LoopUnswitch is correct, and branch on undef just picks one of its successors at random.

However, this means GVN::propagateEquality is incorrect, since if you have:

  if (x == y) {
    use(x);  // divides by x
  }

=>

  if (x == y) {
    use(y);  // (now) divides by y
  }

If x was 1 and y was undef, the initial program would (at random) either do a division by 1 or not.  However, after the transform it would (at random) divide by undef or not.  But dividing by undef is undefined behavior, and by doing the transform you've introduced UB in a correct program.


> Only utilities it calls do.
> Otherwise, the only undef handling it has is to  treat values from
> unreachable blocks as undef.
Comment 6 Daniel Berlin 2017-01-16 19:33:07 PST
First, "I believe this test case has no UB even at C level. It should print "f(10) = 20".
"
This is, IMHO, pretty clearly abuse of the standard :)

You are simply taking the address to avoid this clause:

"If the lvalue designates an object of automatic storage duration that could have been declared with the register storage class (never had its address taken), and that object is uninitialized (not declared with an initializer and no assignment to it has been performed prior to use), the behavior is undefined."

This seems more a bug in the standard than the compiler ;)

Which means when i see programs like this, my main answer is "hey, that's super-interesting that you found this loophole. But i don't have interest ATM in trying to make the compiler do something sane here, you should just fix your variable".


I certainly realize your goal with the freeze proposal is to give meaning to the indeterminate value that maybe_undef_loc represents here.

But htere are plenty of cases where the compiler doesn't do a thing that the standard says because it is insane or broken.

Past that, if i had to choose, it's loop unswitch that's broken here.

I actually have pretty simple reasoning:

Any transformation that makes it so that 
if (x==y) 

requires extra work to try to handle "undef" or any other value, is broken.

Because it would be nonsensical.

That is, given if (x==y), it must always be legal to replace x with y inside any dominated part of the if condition.

If we've created semantics that make that untrue, we should fix them eventually
If transformations are currently doing things that make that untrue (like loop unswitch), we should fix them for now.

But i'm also of the mind that it's not worth losing sleep over this case at all.

IE i don't think it's really worth making loop unswitch try to prove the non-undefness of this variable just to fix this case.
Comment 7 Sanjoy Das 2017-01-16 19:57:49 PST
(In reply to comment #6)
> First, "I believe this test case has no UB even at C level. It should print
> "f(10) = 20".
> "
> This is, IMHO, pretty clearly abuse of the standard :)
> 
> You are simply taking the address to avoid this clause:
> 
> "If the lvalue designates an object of automatic storage duration that could
> have been declared with the register storage class (never had its address
> taken), and that object is uninitialized (not declared with an initializer
> and no assignment to it has been performed prior to use), the behavior is
> undefined."

Nope.

The maybe_init(&maybe_undef_loc) call is just to let me order optimizations in a way that I can trigger the micompile in an easily understood manner.

The program never actually reads from maybe_undef_loc; that read is guarded under a condition that is always false.  So even if the you drop the clause you quoted, as long as taking the address of an uninitialized stack variable is well defined this is a miscompile.

That is why I have the second "opt -S -O2".  That second run runs the inliner, and makes it "obvious" that maybe_undef_loc is uninitialized.  But the we've *already* miscompiled the program.
 

> Past that, if i had to choose, it's loop unswitch that's broken here.

That's what we went with.
Comment 8 Daniel Berlin 2017-01-16 20:24:15 PST
I'm kinda surprised, i would never have bet this is from real code.
In any case, i think loop unswitch should not unswitch variables unless it can prove they have defined values.


THis is what gcc does:

"
      /* Unswitching on undefined values would introduce undefined
	 behavior that the original program might never exercise.  */
      if (ssa_undefined_value_p (use, true))
	return NULL_TREE;"
"


Which in turn determines if it can prove it's initialized (by being an argument, a hard register variable, a returned reference, or has a defining statement).

(we could do better using IPA to see if it's always inited through IPA dataflow)
Comment 9 Sanjoy Das 2017-01-16 20:32:19 PST
(In reply to comment #8)
> I'm kinda surprised, i would never have bet this is from real code.

It's not "real real code" (i.e. it *is* synthetic, not reduced from some larger miscompiling test case), but it isn't relying on some dusty corner case of the standard either.  :)

> In any case, i think loop unswitch should not unswitch variables unless it
> can prove they have defined values.

That's kind of hard to do at the IR level -- unless you can know that a llvm::Value is a constant, or a some simple expression involving constants, it can always be undef/poison.

The "freeze" operator lets us "cast away" the potential undefined-ness in the cases where loop unswitch cannot prove the definedness of the value it is unswitching on, giving us an expression safe to switch on.

> Which in turn determines if it can prove it's initialized (by being an
> argument, a hard register variable, a returned reference, or has a defining
> statement).

In LLVM IR both arguments and defining statements (e.g. loads, shifts, adds etc.) can result in poison and undef.
Comment 10 Daniel Berlin 2017-01-16 20:34:54 PST
"
In LLVM IR both arguments and defining statements (e.g. loads, shifts, adds etc.) can result in poison and undef.
"
Sure, and i'd argue that until it can definitely prove that such a thing does not occur, through symbolic evaluation, whatever, it can't do unswitching on it.


I realize this limits the usefulness of loop unswitching ATM. I don't know how else you'd ever fix this testcase *right now*. :)

(since freeze/etc is a larger change)
Comment 11 Daniel Berlin 2017-01-16 20:40:02 PST
and as for "not a dusty corner caseæ argument, saying "well, the condition is false, so it has no undefined behavior", is technically true, but that's kind of pointless in a large way.  In the example, there is no useful case for the variable to be uninitialized, IMHO, and any case in which it is a bug.

Do you have an example case where, lacking the condition, the program is both usefully using it and it's not a bug?

It is definitely a bug if the condition is not false, and while it's true that it would be nice to not introduce this bug to the case where it is always false, i still have a lot of trouble saying "well, it's super important we get this right".

(But i'll admit I feel this way about most cases where people have hidden undefined behavior like this :P.  I think it would be just fine for the standard to say it's undefined if it ends up unitialized, whether or not it's ever evaluated)
Comment 12 Sanjoy Das 2017-01-16 20:42:07 PST
(In reply to comment #10)
> "
> In LLVM IR both arguments and defining statements (e.g. loads, shifts, adds
> etc.) can result in poison and undef.
> "
> Sure, and i'd argue that until it can definitely prove that such a thing
> does not occur, through symbolic evaluation, whatever, it can't do
> unswitching on it.
> 
> I realize this limits the usefulness of loop unswitching ATM. I don't know
> how else you'd ever fix this testcase *right now*. :)

Yes, there is no way we can truly fix the problem in today's IR.

However, I personally don't have the bandwidth at this time to take on pessimizing loop unswitch.  :)
Comment 13 Sanjoy Das 2017-01-16 20:51:22 PST
(In reply to comment #11)
> and as for "not a dusty corner caseæ argument, saying "well, the condition
> is false, so it has no undefined behavior", is technically true, but that's
> kind of pointless in a large way.  In the example, there is no useful case
> for the variable to be uninitialized, IMHO, and any case in which it is a
> bug.
> 
> Do you have an example case where, lacking the condition, the program is
> both usefully using it and it's not a bug?

I suppose you could have cases like:

SmallVector<int, 4> Vect;  // 4 uninitialized ints
Vect.push_back(1);

// After inlining inlined body

if (Vect.size() > 2 && Vect[1] != 400) {
  // Do something
}

Here `Vect.size() > 2` is the always_false part, and `Vect[1]` is the `maybe_undef` bit.

Another example is how we use PatternMatch.h.  One idiomatic use tends to be (with references desugared to pointers):

Value *V;  // Sometimes people will set V to nullptr, but sometimes they won't
if (match(Expr, &V)) {
  // Use V;
}
// V is not initialized down this path.


> (But i'll admit I feel this way about most cases where people have hidden
> undefined behavior like this :P.  I think it would be just fine for the
> standard to say it's undefined if it ends up unitialized, whether or not
> it's ever evaluated)

IIUC you're saying all stack variables have to be initialized on exit from the function.  If so, I don't think that is idiomatic in C++ but I will admit I'm only familiar with a handful of C++ codebases.
Comment 14 Daniel Berlin 2017-01-16 20:59:02 PST
(In reply to comment #13)
> (In reply to comment #11)
> > and as for "not a dusty corner caseæ argument, saying "well, the condition
> > is false, so it has no undefined behavior", is technically true, but that's
> > kind of pointless in a large way.  In the example, there is no useful case
> > for the variable to be uninitialized, IMHO, and any case in which it is a
> > bug.
> > 
> > Do you have an example case where, lacking the condition, the program is
> > both usefully using it and it's not a bug?
> 
> I suppose you could have cases like:
> 
> SmallVector<int, 4> Vect;  // 4 uninitialized ints
> Vect.push_back(1);
> 
> // After inlining inlined body
> 
> if (Vect.size() > 2 && Vect[1] != 400) {
>   // Do something
> }
> 
> Here `Vect.size() > 2` is the always_false part, and `Vect[1]` is the
> `maybe_undef` bit.
> 
> Another example is how we use PatternMatch.h.  One idiomatic use tends to be
> (with references desugared to pointers):
> 
> Value *V;  // Sometimes people will set V to nullptr, but sometimes they
> won't
> if (match(Expr, &V)) {
>   // Use V;
> }
> // V is not initialized down this path.
> 
> 
> > (But i'll admit I feel this way about most cases where people have hidden
> > undefined behavior like this :P.  I think it would be just fine for the
> > standard to say it's undefined if it ends up unitialized, whether or not
> > it's ever evaluated)
> 
> IIUC you're saying all stack variables have to be initialized on exit from
> the function.  If so, I don't think that is idiomatic in C++ but I will
> admit I'm only familiar with a handful of C++ codebases.


Well, actually, i'd just change the standard to definitely init them to zero, and then our only problem is proving that we don't have to init them to zero because it's overwritten by some other value :)

I doubt the performance consequences are really that high these days ...
Comment 15 Daniel Berlin 2017-01-16 20:59:25 PST
(In reply to comment #12)
> (In reply to comment #10)
> > "
> > In LLVM IR both arguments and defining statements (e.g. loads, shifts, adds
> > etc.) can result in poison and undef.
> > "
> > Sure, and i'd argue that until it can definitely prove that such a thing
> > does not occur, through symbolic evaluation, whatever, it can't do
> > unswitching on it.
> > 
> > I realize this limits the usefulness of loop unswitching ATM. I don't know
> > how else you'd ever fix this testcase *right now*. :)
> 
> Yes, there is no way we can truly fix the problem in today's IR.
> 
> However, I personally don't have the bandwidth at this time to take on
> pessimizing loop unswitch.  :)

Hey, it's your bug, if you want to leave it open, SGTM :)
Comment 16 Nuno Lopes 2017-01-17 02:58:03 PST
Thanks Daniel for your feedback (and for the gcc ref)! At least we agree that loop unswitch has a problem :)

Freeze is indeed the best fix for loop unswitching that we could come up with.
Also, please realize that LLVM/clang themselves introduce a lot of undefs (e.g., bitfields, padding, etc), so a program doesn't need to have UB for this miscompilation to happen; we just didn't look close enough to find those cases.

BTW, MSVC has an easy strategy to deal with undef values. Since their pipeline is (mostly) fixed, after inlining and all the inter-proc analyses they know there won't be any more undefs flowing through function arguments or global variables. Therefore they can mark those variables as defined.
In LLVM we've always assumed that we can have at arbitrary order, but this makes things more complicated. We could perhaps have a similar notion of "only-local-transformations-from-now-on" in the new PM?