Create a new attribute for range.
From what I can see, this needs some new storage handling, since so far only int/enum/string/type values have been stored in attributes, but maybe I'm wrong.
One other thing: there is already handling for e.g. range metadata, so should this attribute return an MDNode so that code can be reused?
I don’t think we should add support for metadata attachments on arguments, as this will introduce two different ways to represent constraints on function arguments. The attribute system has the advantage of being more efficient, while the metadata system is more generic – but I don’t think we have need of its full generality here.
Create a new attribute for range.
This alternative would definitely be my preference.
From what I can see, this needs some new storage handling, since so far only int/enum/string/type values have been stored in attributes, but maybe I'm wrong.
Yes, this would require a new storage type. I would recommend storing it as a ConstantRange.
I think this storage type will also come in handy for other things. An obvious example is the existing vscale_range attribute (currently encoded as a packed integer, and gets away with it because the ranges it represents are somewhat constrained). Another example would be an extension to dereferenceable that specifies both a lower and an upper bound, rather than only a lower bound (this is useful for alias analysis).
One other thing: there is already handling for e.g. range metadata, so should this attribute return an MDNode so that code can be reused?
No, the API should return a ConstantRange. That’s what all the code later works on.
The !range MDNode representation is slightly more general in that it allows specifying multiple ranges, but I believe that this is an unnecessary over-generalization. I don’t think frontends make use of this feature, and the optimizer usually just unions all the constituent ranges together.
As functions have both metadata and attributes, I thought having two mechanisms was by design, and that metadata was only supposed to be used for optimization hints, since it can be dropped.
One downside of not using the same format as the metadata is that if the frontend creates multiple ranges in the range metadata, it now needs new functionality to create the data for the attribute.
My interpretation was that multiple ranges were meant to support enums with gaps in them, but as you say, most uses seem to combine them; ValueTracking was one exception.
Not many more comments for this.
Looks like I shall close my current PR that adds metadata to function arguments, and the proposal is to add a new attribute range(<min>, <max>) to Parameter Attributes by adding the possibility of ConstantRange attributes. @nikic, shall a new RFC be created for the attribute proposal, or shall it be the result of this one?
Hi again. I have been trying to create the ConstantRange attribute type, but have a problem with my previous suggestion of range(<min>,<max>): there is no information about the bit width of the APInt in the ConstantRange.
So my question is: should there be some information about the number of bits in the IR?
For MDNode, each field has a type, like (i32 42), so should the ConstantRange attribute type also have a type or a bit width? Alternatives:
- use one type: range(<ty>, <min>, <max>), e.g. range(i32, -1, 255)
- use a number of bits: range(<bits>, <min>, <max>), e.g. range(32, -1, 255)
- use a type per value (they need to be the same): range(<ty> <min>, <ty> <max>), e.g. range(i32 -1, i32 255)
Or maybe there is no need for a fixed bit width. In that case, when parsing, we could add 1 bit to the bit width of the lexed value for positive values and always write values as signed APInts, so that the asm parse/write round trip gives the same result. We would then also need to select the larger of the min and max bit widths for the ConstantRange bit width.
It feels a bit strange not to have a fixed bit width, though.
Limiting this to 64-bits is a bit unfortunate. This is fine for attributes like dereferenceable() where larger values truly make no sense, but for range() use with larger integer types is plausible. For example, Rust has a NonZeroU128 type, which could make use of the attribute.
I was going to suggest that we can make use of the type specified for the parameter, i.e. when we have i32 range(A,B) %x parse it as a 32-bit range. Unfortunately, this wouldn’t work straightforwardly for return values, because they use (why???) the reverse order range(A,B) i32 instead.
Generally, I would expect there to be a verifier requirement that the parameter type and the range() attribute have the same bitwidth. Otherwise the attribute becomes hard to interpret. If you always store a 64-bit range, then truncating it to the correct width will not produce the desired result (it will be a full range instead).
So overall I think we’ll have to live with something slightly ugly like i32 range(i32 A, B) %x for params and range(i32 A, B) i32 for returns.
What I was thinking about when I wrote that is that you’d take a 64-bit ConstantRange and then call truncate() on it to adjust it to the correct size. If you have a range like [1,0) and truncate() that to a smaller size you’ll get a full range back. What would probably work fine is to construct a new ConstantRange by explicitly truncating Lower and Upper.
But I still think that allowing the range to have a different bit-width from the parameter type is going to cause unnecessary complications…
There is currently no optimization that uses the range attribute from “assume operand bundles”.
I can try to add it, but it may be a while; I'm still working on adding handling for all the places that use the range metadata.
Also, I don't know how easy it is to add: from what I found, almost only value tracking has used “assume operand bundles” before, and I don't know if that is enough for the optimizations that you want.
I’d just like to be able to use fewer icmp+assumes in things like
I can’t say I know what “value tracking” is specifically, but from the name it sounds like it’d plausibly be sufficient.
Maybe even better than the assumes today, since I hit some places where the assume(ule(x, INT_MAX)) turns into assume(sgt(x, -1)) and that ends up confusing the optimizer into not being able to take advantage of it later.
Could you share an example for that? For range-based optimizations, we usually don’t care how exactly the comparison is written, so this seems unexpected…
(I don’t remember exactly what I was seeing originally, and of course back then I was looking at Rust output, which isn’t on trunk, but I think that issue is somewhat close at least.)