Hi everyone,
I’ve got a problem that I would like some input on. The problem basically boils down to a program that I am compiling, whose source I don’t control, doing something like this:
p = (char*)0 + n
where ‘n’ is an intptr_t-sized value that the program knows is actually a valid address for a pointer.
clang translates this as
%p = getelementptr inbounds i8, i8* null, i64 %n
So far, so good. The problem is that while LLVM seems to consider the above IR to be valid, we officially do not allow dereferencing a pointer constructed in this way (if I’m reading the rules correctly). Consequently, if this GEP ever gets close enough to a load using the pointer, InstCombine will eliminate the GEP and the load.
I’ve been told that the ‘(char*)0 + n’ construct is invalid according to the C standard. However, this pattern appears in the glibc malloc implementation, so I’d like to be able to handle it anyway.
I’ve discussed this with a few co-workers and we’ve considered a few possible solutions:
-
Add a new transformation to InstCombine that will replace ‘getelementptr i8, i8* null, %n’ with ‘inttoptr %n to i8*’ when has the same size as a pointer for the target architecture.
-
Disable the existing transformation in InstCombine that eliminates the GEP+load.
-
Have the front end recognize this particular idiom and translate it directly as inttoptr.
We like the first solution best. The second “solution” is basically a punt. It does away with the immediate problem but leaves the code basically working by chance. I think the third solution is incomplete, because it relies on the front end being able to detect the use of a null pointer whereas that might not emerge until a few basic optimizations have been performed.
I was hoping to get some more input on this matter before proceeding.
What do you think?
Thanks,
Andy
Personally, I’d prefer #3 for two reasons:
- This is a very C specific weirdness, so putting it into the frontend makes sense.
- This is really about supporting a specific (horrible
idiom. It makes sense to recognize this in the frontend, which is close to the idiom truth, rather than in the optimizer, which is run multiple times and sees code after being transformed.
I see this as pretty similar to the analogous hacks we do to support broken offsetof idioms.
-Chris
glibc does accept patches...or are you talking about two separate instances
of this problem, both in glibc and something else?
glibc does accept patches…or are you talking about two separate instances of this problem, both in glibc and something else?
I originally saw this in a benchmark (which it may be possible to get changed) and only afterward found the glibc idiom.
The most recent glibc code is a bit more complicated than I represented below. If you look up obstack.h you can see what’s there now. Basically, they’re computing a pointer alignment that may be based on some non-null base pointer or may be based on null, depending on the target architecture. I’m sure it’s possible to make it more sensible, but I’ve never interacted with the glibc community so I’m not sure how open they would be to changing this. I suspect I’d need a more compelling argument than clang-compatibility (or maybe even standards compatibility).
I’m not entirely opposed to solution #3. As I said, my concern is that there are cases it would miss.
For instance, if I had some code like this:
char *get_ptr(char *base, intptr_t offset) {
return base + offset;
}
char *convert_to_ptr(intptr_t ptr_val) {
return get_ptr((char*)0, ptr_val);
}
There the idiom would only appear after inlining, so the front end couldn’t handle it. The current glibc code is implemented with a couple of layers of macros that have a logical branch that could theoretically result in the null coming in via a PHI, but some early investigation makes it look like the choice between null and something else is actually resolved in the front end in some way. If that holds up and it turns out that I don’t have any actual programs where the front end can’t spot this idiom (and I agree it’s horrible) maybe it would be acceptable to not handle the theoretical cases.
-Andy
FWIW and IIUC this same issue occurs in SPEC2017/gcc (with ref input) and I hear getting patches past the SPEC committee is incredibly difficult…
I’m not entirely opposed to solution #3. As I said, my concern is that there are cases it would miss.
For instance, if I had some code like this:
char *get_ptr(char *base, intptr_t offset) {
return base + offset;
}
char *convert_to_ptr(intptr_t ptr_val) {
return get_ptr((char*)0, ptr_val);
}
There the idiom would only appear after inlining, so the front end couldn’t handle it.
Agreed, but that is a good thing, not a bad thing. The whole point here is that we want to limit the impact to only occur for the cases that matter. This would be a source compatibility hack, not a general extension to the IR.
The current glibc code is implemented with a couple of layers of macros that have a logical branch that could theoretically result in the null coming in via a PHI, but some early investigation makes it look like the choice between null and something else is actually resolved in the front end in some way. If that holds up and it turns out that I don’t have any actual programs where the front end can’t spot this idiom (and I agree it’s horrible) maybe it would be acceptable to not handle the theoretical cases.
Macros shouldn’t pose a problem. Clang has several recursive AST walkers that could help you. For example, it has a “is a null pointer expression” predicate, which will walk through casts and other nonsense that may be in the way of detecting the zero.
-Chris
That works for me. The front end solution was actually the first one I tried out, so I can confirm that it fixes the problem for the benchmark in question (which Chad Rosier correctly identified). It feels like a hack, but it sounds like maybe you’d prefer that it feel like a hack (which makes sense since it is, in fact, a hack no matter where we put it).
Indeed, I agree it is a horrible hack. Such things are an important part of the life of a real world C compiler 
-Chris
* Andrew via llvm-dev Kaylor:
The most recent glibc code is a bit more complicated than I
represented below. If you look up obstack.h you can see what’s there
now. Basically, they’re computing a pointer alignment that may be
based on some non-null base pointer or may be based on null, depending
on the target architecture. I’m sure it’s possible to make it more
sensible, but I’ve never interacted with the glibc community so I’m
not sure how open they would be to changing this. I suspect I’d need
a more compelling argument than clang-compatibility (or maybe even
standards compatibility).
Not at all, we are fine with making installed headers more palatable
to a wider range of compilers.
That being said, I find obstacks rather horrible and wouldn't be
surprised if they can't be implemented in a standard-conforming
manner. Strictly speaking, you can't take a pointer, convert it to
(u)intptr_t, twiddle some bits for alignment purposes, and cast it
back to get another object pointer. But it seems to me that obstacks
go beyond that in terms of C language abuse.
The offset relative to NULL was probably needed for some old
compilers. We should rewrite this using uintptr_t and __alignof__.
If anyone wants to work on this and change the macros, we'd definitely
review a patch submission (if it is short enough or is backed by a
copyright assignment, sorry).
So, I actually think there’s two separate parts of this.
- Pointer math is required to stay within an object if the GEP has the “inbounds” attribute. Nothing is inbounds of the null pointer, so a GEP instruction with a null pointer, and non-zero offsets, is invalid on its face, even without the load.
Now, you can tell clang to stop putting inbounds on pointer arithmetic GEPs by using -fwrapv or -fno-strict-overflow. Doing that doesn’t fix the issue, because it’s not actually the GEP that’s being optimized in this particular case.
- Separately, LLVM’s pointer aliasing rules prohibit (even without the GEP marked inbounds) constructing a pointer to a random object with a GEP. (http://llvm.org/docs/LangRef.html#pointeraliasing). This permission is used in InstCombineLoadStoreAlloca.cpp canSimplifyNullLoadOrGEP to eliminate the load based on a null pointer GEP, despite the non-zero offset to a not-inbounds GEP.
To the first, asking people who do crazy stuff like this to compile with -fno-strict-overflow is reasonable…they’re invoking C-level undefined behavior by doing arbitrary pointer-math, so they should use the flag that tells the compiler to make arbitrary pointer-math not be UB.
And to the second, it seems to me that it would be fairly reasonable change to make the latter optimization dependent on the GEP being marked inbounds. (Even if the alias analysis still doesn’t promise you anything here.)
* James Y. Knight via llvm-dev:
To the first, asking people who do crazy stuff like this to compile with
-fno-strict-overflow is reasonable...they're invoking C-level undefined
behavior by doing arbitrary pointer-math, so they should use the flag that
tells the compiler to make arbitrary pointer-math not be UB.
Bit this comes from a glibc header, so ideally, it should be fixed to
work with all compiler modes (with the caveat that some things that
obstack does may never be valid C code).
Responded to this earlier but didn't get llvm-dev on the list:
This is the commit that caused the newest revision of the problem: https://github.com/lattera/glibc/commit/2fd4de4b15a66f821057af90714145d2c034a609
Note that it removes the OLDER version of the problem (__PTR_TO_INT/__INT_TO_PTR). The interesting part of the old version is the comment:
-/* We use subtraction of (char *) 0 instead of casting to int
- because on word-addressable machines a simple cast to int
- may ignore the byte-within-word field of the pointer. */
I'll note that I tried both the new and old version, and think Clang's emit for this pattern (GEP(i8,nullptr, %SOMETHING)) will correct this issue.
Yes, no matter what clang does, this header should be fixed to not align
pointers an extraneously-weird-and-broken way. On platforms where uintptr_t
exists, just do the straightforward obvious math that _everyone else_ uses.
On platforms where it doesn't exist (like I guess this IBM thing?), the
existing code, path where it subtracts an actual base object is fine...
Also, gotta love this code snippet below. It wants to avoid "polluting the
namespace" by including stddef.h (really???), and thus proceeds to define a
global macro in the user namespace called "PTR_INT_TYPE". Ha ha very funny!
/* We need the type of a pointer subtraction. If __PTRDIFF_TYPE__ is
defined, as with GNU C, use that; that way we don't pollute the
namespace with <stddef.h>'s symbols. Otherwise, include <stddef.h>
and use ptrdiff_t. */
#ifdef __PTRDIFF_TYPE__
# define PTR_INT_TYPE __PTRDIFF_TYPE__
#else
# include <stddef.h>
# define PTR_INT_TYPE ptrdiff_t
#endif
* Erich Keane:
Responded to this earlier but didn't get llvm-dev on the list:
This is the commit that caused the newest revision of the problem:
[BZ #321] · lattera/glibc@2fd4de4 · GitHub
For posterity:
commit 2fd4de4b15a66f821057af90714145d2c034a609
Author: Roland McGrath <[email protected]>
[BZ #321]