LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filled before November 26, 2021. Use github to submit LLVM bugs

Bug 46463 - __attribute((noinline)) not respected
Summary: __attribute((noinline)) not respected
Status: CONFIRMED
Alias: None
Product: libraries
Classification: Unclassified
Component: Scalar Optimizations (show other bugs)
Version: trunk
Hardware: PC All
: P enhancement
Assignee: Unassigned LLVM Bugs
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-06-25 19:04 PDT by Jeff Muizelaar
Modified: 2021-11-09 02:55 PST (History)
12 users (show)

See Also:
Fixed By Commit(s):


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Jeff Muizelaar 2020-06-25 19:04:54 PDT
__attribute((noinline))
char *demo(char *s) {
    return s;
}

char *foo()  {
    return demo("input string");
}

compiles to:

demo(char*):                              # @demo(char*)
        movq    %rdi, %rax
        retq
foo():                                # @foo()
        movl    $.L.str, %eax
        retq
.L.str:
        .asciz  "input string"
Comment 1 Roman Lebedev 2020-06-26 00:09:56 PDT
noinline doesn't mean "block all optimizations", it only says "don't move code out of this function"
Comment 2 Dimitry Andric 2020-06-26 01:21:44 PDT
Which version of clang is this? With the latest 10.0.1 off the branch:

$ clang -fno-asynchronous-unwind-tables -fomit-frame-pointer -O2 -S pr46463.c -o -
        .text
        .file   "pr46463.c"
        .globl  demo                    # -- Begin function demo
        .p2align        4, 0x90
        .type   demo,@function
demo:                                   # @demo
# %bb.0:                                # %entry
        movq    %rdi, %rax
        retq
.Lfunc_end0:
        .size   demo, .Lfunc_end0-demo
                                        # -- End function
        .globl  foo                     # -- Begin function foo
        .p2align        4, 0x90
        .type   foo,@function
foo:                                    # @foo
# %bb.0:                                # %entry
        movl    $.L.str, %edi
        jmp     demo                    # TAILCALL
.Lfunc_end1:
        .size   foo, .Lfunc_end1-foo
                                        # -- End function
        .type   .L.str,@object          # @.str
        .section        .rodata.str1.1,"aMS",@progbits,1
.L.str:
        .asciz  "input string"
        .size   .L.str, 13

        .ident  "FreeBSD clang version 10.0.1 (git@github.com:llvm/llvm-project.git llvmorg-10.0.0-97-g6f71678ecd2)"
        .section        ".note.GNU-stack","",@progbits
        .addrsig

So this tail-calls the demo function.
Comment 3 Jeff Muizelaar 2020-06-26 05:47:07 PDT
clang trunk: https://gcc.godbolt.org/z/tpJsOm
It seems like this was a recent behaviour change as clang 10.0 doesn't show the problem.
Comment 4 Dimitry Andric 2020-06-26 06:43:11 PDT
(In reply to Jeff Muizelaar from comment #3)
> It seems like this was a recent behaviour change as clang 10.0 doesn't show
> the problem.

It appears to have regressed with https://reviews.llvm.org/rG8903e61b6611 ("[AsmPrinter][ELF] Define local aliases (.Lfoo$local) for GlobalObjects"), by Fangrui Song.

I guess it is an unintended side effect? Whatever the cause, it would be nice to have it fixed before 11 branches.
Comment 5 Dimitry Andric 2020-06-26 10:13:25 PDT
(In reply to Dimitry Andric from comment #4)
> It appears to have regressed with https://reviews.llvm.org/rG8903e61b6611
> ("[AsmPrinter][ELF] Define local aliases (.Lfoo$local) for GlobalObjects"),
> by Fangrui Song.

Eh sorry, this was a bisection mistake on my part. This commit introduces a .Lfoo$local label, but does not yet eliminate the tail call. Looking further.
Comment 6 Dimitry Andric 2020-06-26 10:47:40 PDT
Looked again, and it is definitely regressed with https://reviews.llvm.org/rG03727687766a ("[InstCombine] Simplify calls with "returned" attribute") by Nikita Popov.

With clang llvmorg-11-init-06295-g9cf920e64d1:
========================================================================
	.text
	.file	"pr46463.c"
	.globl	demo                    # -- Begin function demo
	.p2align	4, 0x90
	.type	demo,@function
demo:                                   # @demo
.Ldemo$local:
# %bb.0:                                # %entry
	movq	%rdi, %rax
	retq
.Lfunc_end0:
	.size	demo, .Lfunc_end0-demo
                                        # -- End function
	.globl	foo                     # -- Begin function foo
	.p2align	4, 0x90
	.type	foo,@function
foo:                                    # @foo
.Lfoo$local:
# %bb.0:                                # %entry
	movl	$.L.str, %edi
	jmp	.Ldemo$local            # TAILCALL
.Lfunc_end1:
	.size	foo, .Lfunc_end1-foo
                                        # -- End function
	.type	.L.str,@object          # @.str
	.section	.rodata.str1.1,"aMS",@progbits,1
.L.str:
	.asciz	"input string"
	.size	.L.str, 13

	.ident	"clang version 11.0.0 (https://github.com/llvm/llvm-project.git 9cf920e64d18e9c64706c8c8baf71a4919dcbb42)"
	.section	".note.GNU-stack","",@progbits
	.addrsig
========================================================================

With clang llvmorg-11-init-06296-g03727687766:
========================================================================
	.text
	.file	"pr46463.c"
	.globl	demo                    # -- Begin function demo
	.p2align	4, 0x90
	.type	demo,@function
demo:                                   # @demo
.Ldemo$local:
# %bb.0:                                # %entry
	movq	%rdi, %rax
	retq
.Lfunc_end0:
	.size	demo, .Lfunc_end0-demo
                                        # -- End function
	.globl	foo                     # -- Begin function foo
	.p2align	4, 0x90
	.type	foo,@function
foo:                                    # @foo
.Lfoo$local:
# %bb.0:                                # %entry
	movl	$.L.str, %eax
	retq
.Lfunc_end1:
	.size	foo, .Lfunc_end1-foo
                                        # -- End function
	.type	.L.str,@object          # @.str
	.section	.rodata.str1.1,"aMS",@progbits,1
.L.str:
	.asciz	"input string"
	.size	.L.str, 13

	.ident	"clang version 11.0.0 (https://github.com/llvm/llvm-project.git 03727687766a72504712861bf038f0be962527d0)"
	.section	".note.GNU-stack","",@progbits
	.addrsig
========================================================================
Comment 7 Nick Desaulniers 2020-06-26 10:50:23 PDT
Sounds like issues related to noinline were raised there: https://reviews.llvm.org/D75815#1941004
Comment 8 Johannes Doerfert 2020-06-26 11:40:46 PDT
I'd argue there is n obug. This function is not inlined, `noinline` is respected.

(In reply to Roman Lebedev from comment #1)
> noinline doesn't mean "block all optimizations", it only says "don't move
> code out of this function"

+1

See https://reviews.llvm.org/D75815#1976892


---

Some examples that might be interesting:

```
__attribute((noinline))
char *demo(char *s) {
    return s;
}

void foo()  {
    demo("input string");
}
```
would you argue there must be a call to demo in foo?
I would say no.


```
__attribute((noinline))
char *demo(char *s) {
    return NULL;
}

void foo()  {
    return demo("input string");
}
```
I would again not expect a call.


(Without testing it, I assume both of the above should have been optimized before the `returned` argument patch.)
Comment 9 Dimitry Andric 2020-06-26 11:53:18 PDT
From an end-user perspective, appearing to inline the function while it is marked as 'noinline' would be rather suprising. Is there any usefulness to the noinline attribute in that case at all?

That said, looking at gcc's documentation here:
https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-noinline-function-attribute

they are saying:
> This function attribute prevents a function from being considered for
> inlining. If the function does not have side effects, there are
> optimizations other than inlining that cause function calls to be
> optimized away, although the function call is live.

Unfortunately they don't really say what those other optimizations are, but they also specify a workaround to get the desired effect, which is (at least from my point of view) to force the function to always be called, and never inlined:

> To keep such calls from being optimized away, put
> 
> asm ("");
> 
> (see Extended Asm) in the called function, to serve as a special side effect.
Comment 10 Dimitry Andric 2020-06-26 11:57:34 PDT
(In reply to Dimitry Andric from comment #9)
> they also specify a workaround to get the desired effect, which is (at least
> from my point of view) to force the function to always be called, and never
> inlined

What I meant by this was that sometimes end-users can want this, for various reasons. And instead of using the rather horrid method of inserting barriers, an official attribute specifying this (reallydontinline?) would be nicer. Then again, if you need to build software that compiles with both clang and gcc, you will likely be forced to use the barriers anyway.
Comment 11 Nick Desaulniers 2020-06-26 12:01:14 PDT
This kind of thing is going to get Clang banned from use in the Linux kernel: https://godbolt.org/z/M2g4ZJ

We ***need*** the semantics of noinline to mean "explicit call" instruction.  Messing with that is going to cause all kinds of unexpected bugs.

For example, noinline is used extensively in the kernel to limit stack usage, as the kernel only has 2 pages worth of stack to work with.

The linux kernel has 1696 uses of noinline today.  I'm sure a vast majority of them have very good reasons to do so.
Comment 12 Dimitry Andric 2020-06-26 12:18:59 PDT
Hm reading jdoerfert's comments on D75815 and the surrounding text, it seems that indeed the 'noipa' attribute is more likely covering this use case. But TBH I had not heard of this attribute before, maybe it is not that well-known?

At least gcc also supports it, but in their documentation they mention "This attribute is supported mainly for the purpose of testing the compiler". So no idea how production-ready it is.
Comment 13 Johannes Doerfert 2020-06-26 12:29:34 PDT
>  We ***need*** the semantics of noinline to mean "explicit call" instruction.  > Messing with that is going to cause all kinds of unexpected bugs.

The thing is, that was never the case: https://godbolt.org/z/rix62q
If you want that, we need a different attribute (in IR).

I understand people want to be compatible with GCC or have certain expectations. However, we should not solve this by inconsistent descriptions/interpretations of the IR. As shown above, we always removed calls to `noinline` functions even if the result was used. Not to mention if the result is unused and the function is `const`.
That we now propagate arguments through the call is the natural evolution from this year long precedence.

In https://reviews.llvm.org/D75815#1976892 we talk about `noipa` for the IR, similarly, you could do `optnone` on the call instruction. Either way, if you want to prevent optimization across call edges, we need to provide a different/new attribute to represent that (or use `optnone` for the function).


> From an end-user perspective, appearing to inline the function while it is marked as 'noinline' would be rather suprising. Is there any usefulness to the noinline attribute in that case at all?

`noinline` will, as described by GCC, prevent inlining. If we now say it also does "imply" X and Y, people that want only to prevent inlining might complain. It is also not defined that way so we play whack-a-mole with optimizations and analysis that interpret the meaning based on the definition we use (https://llvm.org/docs/LangRef.html#function-attributes).
That said, if we need an attribute to describe something similar to `noinline` we should create one and use it.



> Then again, if you need to build software that compiles with both clang and gcc, you will likely be forced to use the barriers anyway.

The (particular) asm barrier might or might not prevent us from doing the IPO. For now it would probably work.
Comment 14 Johannes Doerfert 2020-06-26 12:30:58 PDT
(In reply to Dimitry Andric from comment #12)
> Hm reading jdoerfert's comments on D75815 and the surrounding text, it seems
> that indeed the 'noipa' attribute is more likely covering this use case. But
> TBH I had not heard of this attribute before, maybe it is not that
> well-known?
> 
> At least gcc also supports it, but in their documentation they mention "This
> attribute is supported mainly for the purpose of testing the compiler". So
> no idea how production-ready it is.

As far as I know, LLVM doesn't have it (yet). We could/should introduce it though.
Comment 15 Nick Desaulniers 2020-06-26 12:46:11 PDT
(In reply to Nick Desaulniers from comment #11)
> This kind of thing is going to get Clang banned from use in the Linux
> kernel

Sorry, that was hyperbolic.  I had lunch and a cup of coffee and now I less hangry and fully caffeinated.

That said, none of our alarm bells have gone off about anything being broken from this change.  I would just hate for something to be found broken due to this further down the road.

Similarly that there are cases where __attribute__((always_inline)) doesn't mean "always inline," I guess we'll have cases to point people to where __attribute__((noinline)) doesn't mean "no inline."

I much prefer people having function attributes that provide some actual semantics rather than empty asm statements though.
Comment 16 Johannes Doerfert 2020-06-26 12:59:00 PDT
(In reply to Nick Desaulniers from comment #15)
> (In reply to Nick Desaulniers from comment #11)
> > This kind of thing is going to get Clang banned from use in the Linux
> > kernel
> 
> Sorry, that was hyperbolic.  I had lunch and a cup of coffee and now I less
> hangry and fully caffeinated.

No worries. I also do not want this to happen. The idea would have been that clang adds "something extra" if it compiles `noinline` in "kernel mode". That is still an option FWIW.


> That said, none of our alarm bells have gone off about anything being broken
> from this change.  I would just hate for something to be found broken due to
> this further down the road.

Short of having a second attribute in the kernel source I strongly advocate for a "kernel mode/flag" that adds the second attribute in IR.


 
> Similarly that there are cases where __attribute__((always_inline)) doesn't
> mean "always inline," I guess we'll have cases to point people to where
> __attribute__((noinline)) doesn't mean "no inline."

The first is true, the second is not. `noinline` did and does mean "no inline".
I remember we discussed a new attribute to address the former, as they were again two camps, those who wanted it to mean **always** and those who wanted the current behavior. `noinline` is the same. It cannot mean two things. It always did mean "no inline" but if people also want "no inter-procedural stuff" we need a new IR attribute and probably a source one too.



> I much prefer people having function attributes that provide some actual
> semantics rather than empty asm statements though.

+2 yes, and that is why I emphasize we add new function attributes (on IR level) if there is a need for them. `noipa` seems like a uncontroversial candidate to be added.
Comment 17 Hans Wennborg 2020-07-27 07:54:54 PDT
Nick flagged this as a potential release blocker a while ago. Are there still concerns here?
Comment 18 Roman Lebedev 2020-07-27 07:59:53 PDT
(In reply to Hans Wennborg from comment #17)
> Nick flagged this as a potential release blocker a while ago. Are there
> still concerns here?

Fixing this would require adding new noipa attribute.
I'm not sure this can be viewed as a blocker.
Comment 19 Johannes Doerfert 2020-07-27 08:09:41 PDT
(In reply to Roman Lebedev from comment #18)
> (In reply to Hans Wennborg from comment #17)
> > Nick flagged this as a potential release blocker a while ago. Are there
> > still concerns here?
> 
> Fixing this would require adding new noipa attribute.
> I'm not sure this can be viewed as a blocker.

Agreed. And the behavior did not conceptually change since 3.0 (https://godbolt.org/z/rix62q)
Comment 20 Hans Wennborg 2020-07-28 01:54:42 PDT
Okay, let's not block on it then.