How to handle size_t in front ends?

Talin1 · May 22, 2008, 2:17am

Chris Lattner wrote:

On a related topic: The source-level debugging descriptors require you
to know up front what the sizeof pointer types are. Is there any hope of
the frontend remaining blissfully unaware of platform details?

I really don't know how to do this. The current debug info stuff depends on emitting size info into the IR. At this point, I don't think there is a good way around this. Improvements to the design are welcome of course.

-Chris

As I understand it, this issue and others like it all require a difficult step to be taken, which is to introduce the concept of a constant whose value is not known until code generation time, or at least until the compilation target is fully known. These "late bound constants" could then be used to implement "sizeof(type)" and other constants whose values differ between targets.

On the C++ side, you would have ConstantSize(Type), which would be usable anywhere that you could use ConstantInt, except that you can't actually inspect the integer value of the constant or use it in expressions. That part is fairly simple. It gets more complicated if you want to define operators for expressions such as sizeof(A) + sizeof(B), because that requires the IR to support arbitrary expressions, which I don't think you want. But most of the time, you don't want to add the sizes of A and B; you want sizeof({A, B}), which doesn't require any special syntax other than sizeof itself.

So in other words, the frontend sees the "sizeof" constant purely as a symbolic, opaque object, while the code generator simply converts it into a ConstantInt.
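
A rough sketch of that split, as a toy model only: the names ToyType, ToyTarget, SizeOfConstant and resolve are made up for illustration and are not LLVM APIs, and the layout rule (natural alignment, fields padded in order) is an assumption. The front end builds the symbolic constant without knowing the pointer size; only the resolve step, standing in for the code generator, needs a target.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical toy model; none of these names exist in LLVM.
enum class Kind { Int8, Int32, Pointer, Struct };

struct ToyType {
    Kind kind;
    std::vector<ToyType> fields;  // used only when kind == Kind::Struct
};

// Target facts that the front end never has to look at.
struct ToyTarget {
    std::uint64_t pointerBytes;
};

// Opaque to the front end: it can be stored and passed around,
// but it has no integer value until a target is supplied.
struct SizeOfConstant {
    ToyType measured;
};

std::uint64_t alignOf(const ToyType &t, const ToyTarget &target);

std::uint64_t roundUp(std::uint64_t n, std::uint64_t align) {
    assert(align != 0);
    return (n + align - 1) / align * align;
}

std::uint64_t sizeOf(const ToyType &t, const ToyTarget &target) {
    switch (t.kind) {
    case Kind::Int8:    return 1;
    case Kind::Int32:   return 4;
    case Kind::Pointer: return target.pointerBytes;
    case Kind::Struct: {
        std::uint64_t offset = 0;
        for (const ToyType &f : t.fields)
            offset = roundUp(offset, alignOf(f, target)) + sizeOf(f, target);
        return roundUp(offset, alignOf(t, target));  // tail padding
    }
    }
    return 0;
}

std::uint64_t alignOf(const ToyType &t, const ToyTarget &target) {
    if (t.kind != Kind::Struct)
        return sizeOf(t, target);  // assumption: natural alignment
    std::uint64_t a = 1;
    for (const ToyType &f : t.fields)
        a = std::max(a, alignOf(f, target));
    return a;
}

// The code generator's step: fold the symbolic constant to an integer.
std::uint64_t resolve(const SizeOfConstant &c, const ToyTarget &target) {
    return sizeOf(c.measured, target);
}
```

Under these assumptions, a SizeOfConstant measuring a struct of one Int8 and one Pointer resolves to different integers on a 4-byte-pointer target and an 8-byte-pointer target, while the front end's code is the same in both cases.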

This makes filling in the DWARF debugging structures relatively easy as long as you have an LLVM type reference to use as a measuring stick. In fact, I'd likely make the hypothetical "DebugBuilder" API such that most of the info was derived from an LLVM type given as a parameter, with just a few additional parameters to specify the things that cannot be determined just from looking at the LLVM type.

-- Talin

Gordon_Henriksen1 · May 22, 2008, 3:26am

LLVM already does this.

http://www.nondot.org/sabre/LLVMNotes/SizeOf-OffsetOf-VariableSizedStructs.txt

— Gordon

Talin1 · May 23, 2008, 1:40am

Gordon Henriksen wrote:

LLVM already does this.

Is there a similar technique that would allow calculation of the alignment? (which is also required by the DWARF derived-type descriptor.)

clattner · May 23, 2008, 3:16am

There is more than one form of alignment. To find the struct field alignment of something, you can do something like:

"sizeof({i8, T}) - sizeof(T)"

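The same arithmetic can be tried out in plain C++, as a sketch rather than anything LLVM-specific. Probe, fieldAlign and checkFieldAlign are hypothetical names, with a char member standing in for i8. The subtraction gives the field alignment because sizeof(T) is already a multiple of that alignment, so the struct carries no tail padding beyond what the leading byte forces.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper mirroring the {i8, T} struct.
template <typename T>
struct Probe {
    char pad;  // plays the role of i8
    T value;   // the type being measured
};

// sizeof({i8, T}) - sizeof(T): what the leading byte costs once padding
// is counted, which is T's alignment as a struct field.
template <typename T>
constexpr std::size_t fieldAlign() {
    return sizeof(Probe<T>) - sizeof(T);
}

// Runtime cross-check against where the compiler actually placed the field.
template <typename T>
void checkFieldAlign() {
    assert(fieldAlign<T>() == offsetof(Probe<T>, value));
}

static_assert(fieldAlign<char>() == 1, "a byte needs no padding");
```

Calling checkFieldAlign<int>() or checkFieldAlign<double>() compares the result with the real field offset, which is the "struct field" form of alignment and may differ from a type's preferred alignment on some targets.
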
-Chris

Talin1 · May 23, 2008, 5:00am

Chris Lattner wrote:

LLVM already does this.

http://www.nondot.org/sabre/LLVMNotes/SizeOf-OffsetOf-VariableSizedStructs.txt

— Gordon

Is there a similar technique that would allow calculation of the
alignment? (which is also required by the DWARF derived-type descriptor.)

There is more than one form of alignment. To find the struct field alignment of something, you can do something like:

"sizeof({i8, T}) - sizeof(T)"

Clever. I'll use that.

However, I feel that when a "trick" like this gets used enough times, that's a signal that it should be codified. Making sizeof() and alignmentof() first-class operations in the IR would have the advantage of making the generated IR clearer, and we already know that it can be done because the tricks exist.

A lot of this thinking comes out of my attempting to create (as I mentioned on the other thread) a generic "DebugBuilder", similar to IRBuilder, that pumps out source level debugging definitions. As much as possible, I want to hide details of the target machine from the user of the API. You ought to be able to hand it an LLVM type, plus a little sprinkling of source-derived metadata to go along with it, and it figures out all the metrics for you.

This is separate from the issue of size_t, which I realize is much more complex because it's not merely a machine-dependent constant, it's a machine-dependent *type*. And unlike constants, types cannot be the product of an expression in LLVM, so there's no handy trick that can be used.
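
To make the contrast concrete, here is a minimal sketch of what a front end ends up doing today. PointerModel and sizeTypeName are hypothetical names, and the rule that size_t is exactly as wide as a pointer is an assumption that holds on common targets but is not guaranteed. The point is that the choice of type has to be made in the front end, with the target already known, before any IR is emitted.

```cpp
#include <cassert>
#include <string>

// Hypothetical target description; not an LLVM API.
struct PointerModel {
    unsigned pointerBits;
};

// The front end must name a concrete integer type for size_t up front.
// Assumption: size_t has the same width as a pointer.
std::string sizeTypeName(const PointerModel &target) {
    assert(target.pointerBits % 8 == 0 && "whole bytes expected");
    return "i" + std::to_string(target.pointerBits);
}
```

With this sketch, a 32-bit pointer model yields "i32" and a 64-bit one yields "i64", so the IR the front end emits already differs per target.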

-- Talin

clattner · May 24, 2008, 12:07am

There is more than one form of alignment. To find the struct field
alignment of something, you can do something like:

"sizeof({i8, T}) - sizeof(T)"

Clever. I'll use that.

However, I feel that when a "trick" like this gets used enough times,
that's a signal that it should be codified. Making sizeof() and
alignmentof() first-class operations in the IR would have the advantage
of making the generated IR clearer, and we already know that it can be
done because the tricks exist.

Sure, I'd be fine with adding them as constant exprs. Go for it.

A lot of this thinking comes out of my attempting to create (as I
mentioned on the other thread) a generic "DebugBuilder", similar to
IRBuilder, that pumps out source level debugging definitions. As much as
possible, I want to hide details of the target machine from the user of
the API. You ought to be able to hand it an LLVM type, plus a little
sprinkling of source-derived metadata to go along with it, and it
figures out all the metrics for you.

Yep.

This is separate from the issue of size_t, which I realize is much more
complex because it's not merely a machine-dependent constant, it's a
machine-dependent *type*. And unlike constants, types cannot be the
product of an expression in LLVM, so there's no handy trick that can be
used.

Right.

-Chris
