
[flang] fix ppc test broken after #74709 #74826


Merged: 1 commit merged into llvm:main on Dec 11, 2023

Conversation

tblah (Contributor) commented Dec 8, 2023

I don't have hardware to test this myself. Would anyone else be able to verify if it works?

tblah requested a review from DanielCChen on December 8, 2023 at 11:22
llvmbot added the flang (Flang issues not falling into any other category) and flang:fir-hlfir labels Dec 8, 2023
llvmbot (Member) commented Dec 8, 2023

@llvm/pr-subscribers-flang-fir-hlfir

Author: Tom Eccles (tblah)

Changes

I don't have hardware to test this myself. Would anyone else be able to verify if it works?


Full diff: https://github.com/llvm/llvm-project/pull/74826.diff

1 file affected:

  • (modified) flang/test/Lower/PowerPC/ppc-vec-store.f90 (+48-48)
diff --git a/flang/test/Lower/PowerPC/ppc-vec-store.f90 b/flang/test/Lower/PowerPC/ppc-vec-store.f90
index 8e20228d68259..c25cc8b07cf79 100644
--- a/flang/test/Lower/PowerPC/ppc-vec-store.f90
+++ b/flang/test/Lower/PowerPC/ppc-vec-store.f90
@@ -89,10 +89,10 @@ subroutine vec_st_vi4i4via4(arg1, arg2, arg3, i)
 
 ! LLVMIR: %[[i:.*]] = load i32, ptr %3, align 4
 ! LLVMIR: %[[iext:.*]] = sext i32 %[[i]] to i64
-! LLVMIR: %[[iextsub:.*]] = sub i64 %[[iext]], 1
-! LLVMIR: %[[iextmul:.*]] = mul i64 %[[iextsub]], 1
-! LLVMIR: %[[iextmul2:.*]] = mul i64 %[[iextmul]], 1
-! LLVMIR: %[[iextadd:.*]] = add i64 %[[iextmul2]], 0
+! LLVMIR: %[[iextsub:.*]] = sub nsw i64 %[[iext]], 1
+! LLVMIR: %[[iextmul:.*]] = mul nsw i64 %[[iextsub]], 1
+! LLVMIR: %[[iextmul2:.*]] = mul nsw i64 %[[iextmul]], 1
+! LLVMIR: %[[iextadd:.*]] = add nsw i64 %[[iextmul2]], 0
 ! LLVMIR: %[[gep1:.*]] = getelementptr <4 x i32>, ptr %2, i64 %[[iextadd]]
 ! LLVMIR: %[[arg1:.*]] = load <4 x i32>, ptr %0, align 16
 ! LLVMIR: %[[arg2:.*]] = load i32, ptr %1, align 4
@@ -206,10 +206,10 @@ subroutine vec_ste_vi4i4ia4(arg1, arg2, arg3, i)
 
 ! LLVMIR: %[[i:.*]] = load i32, ptr %3, align 4
 ! LLVMIR: %[[iext:.*]] = sext i32 %[[i]] to i64
-! LLVMIR: %[[isub:.*]] = sub i64 %[[iext]], 1
-! LLVMIR: %[[imul1:.*]] = mul i64 %[[isub]], 1
-! LLVMIR: %[[imul2:.*]] = mul i64 %[[imul1]], 1
-! LLVMIR: %[[iadd:.*]] = add i64 %[[imul2]], 0
+! LLVMIR: %[[isub:.*]] = sub nsw i64 %[[iext]], 1
+! LLVMIR: %[[imul1:.*]] = mul nsw i64 %[[isub]], 1
+! LLVMIR: %[[imul2:.*]] = mul nsw i64 %[[imul1]], 1
+! LLVMIR: %[[iadd:.*]] = add nsw i64 %[[imul2]], 0
 ! LLVMIR: %[[gep1:.*]] = getelementptr i32, ptr %2, i64 %[[iadd]]
 ! LLVMIR: %[[arg1:.*]] = load <4 x i32>, ptr %0, align 16
 ! LLVMIR: %[[arg2:.*]] = load i32, ptr %1, align 4
@@ -244,10 +244,10 @@ subroutine vec_stxv_test_vi4i8ia4(arg1, arg2, arg3, i)
 
 ! LLVMIR: %[[i:.*]] = load i32, ptr %3, align 4
 ! LLVMIR: %[[iext:.*]] = sext i32 %[[i]] to i64
-! LLVMIR: %[[isub:.*]] = sub i64 %[[iext]], 1
-! LLVMIR: %[[imul1:.*]] = mul i64 %[[isub]], 1
-! LLVMIR: %[[imul2:.*]] = mul i64 %[[imul1]], 1
-! LLVMIR: %[[iadd:.*]] = add i64 %[[imul2]], 0
+! LLVMIR: %[[isub:.*]] = sub nsw i64 %[[iext]], 1
+! LLVMIR: %[[imul1:.*]] = mul nsw i64 %[[isub]], 1
+! LLVMIR: %[[imul2:.*]] = mul nsw i64 %[[imul1]], 1
+! LLVMIR: %[[iadd:.*]] = add nsw i64 %[[imul2]], 0
 ! LLVMIR: %[[gep1:.*]] = getelementptr i32, ptr %2, i64 %[[iadd]]
 ! LLVMIR: %[[arg1:.*]] = load <4 x i32>, ptr %0, align 16
 ! LLVMIR: %[[arg2:.*]] = load i64, ptr %1, align 8
@@ -278,10 +278,10 @@ subroutine vec_stxv_test_vi4i4vai4(arg1, arg2, arg3, i)
 
 ! LLVMIR: %[[i:.*]] = load i32, ptr %3, align 4
 ! LLVMIR: %[[iext:.*]] = sext i32 %[[i]] to i64
-! LLVMIR: %[[isub:.*]] = sub i64 %[[iext]], 1
-! LLVMIR: %[[imul1:.*]] = mul i64 %[[isub]], 1
-! LLVMIR: %[[imul2:.*]] = mul i64 %[[imul1]], 1
-! LLVMIR: %[[iadd:.*]] = add i64 %[[imul2]], 0
+! LLVMIR: %[[isub:.*]] = sub nsw i64 %[[iext]], 1
+! LLVMIR: %[[imul1:.*]] = mul nsw i64 %[[isub]], 1
+! LLVMIR: %[[imul2:.*]] = mul nsw i64 %[[imul1]], 1
+! LLVMIR: %[[iadd:.*]] = add nsw i64 %[[imul2]], 0
 ! LLVMIR: %[[gep1:.*]] = getelementptr <4 x i32>, ptr %2, i64 %[[iadd]]
 ! LLVMIR: %[[arg1:.*]] = load <4 x i32>, ptr %0, align 16
 ! LLVMIR: %[[arg2:.*]] = load i32, ptr %1, align 4
@@ -317,10 +317,10 @@ subroutine vec_xst_test_vi4i8ia4(arg1, arg2, arg3, i)
 
 ! LLVMIR: %[[i:.*]] = load i32, ptr %3, align 4
 ! LLVMIR: %[[iext:.*]] = sext i32 %[[i]] to i64
-! LLVMIR: %[[isub:.*]] = sub i64 %[[iext]], 1
-! LLVMIR: %[[imul1:.*]] = mul i64 %[[isub]], 1
-! LLVMIR: %[[imul2:.*]] = mul i64 %[[imul1]], 1
-! LLVMIR: %[[iadd:.*]] = add i64 %[[imul2]], 0
+! LLVMIR: %[[isub:.*]] = sub nsw i64 %[[iext]], 1
+! LLVMIR: %[[imul1:.*]] = mul nsw i64 %[[isub]], 1
+! LLVMIR: %[[imul2:.*]] = mul nsw i64 %[[imul1]], 1
+! LLVMIR: %[[iadd:.*]] = add nsw i64 %[[imul2]], 0
 ! LLVMIR: %[[gep1:.*]] = getelementptr i32, ptr %2, i64 %[[iadd]]
 ! LLVMIR: %[[arg1:.*]] = load <4 x i32>, ptr %0, align 16
 ! LLVMIR: %[[arg2:.*]] = load i64, ptr %1, align 8
@@ -351,10 +351,10 @@ subroutine vec_xst_test_vi4i4vai4(arg1, arg2, arg3, i)
 
 ! LLVMIR: %[[i:.*]] = load i32, ptr %3, align 4
 ! LLVMIR: %[[iext:.*]] = sext i32 %[[i]] to i64
-! LLVMIR: %[[isub:.*]] = sub i64 %[[iext]], 1
-! LLVMIR: %[[imul1:.*]] = mul i64 %[[isub]], 1
-! LLVMIR: %[[imul2:.*]] = mul i64 %[[imul1]], 1
-! LLVMIR: %[[iadd:.*]] = add i64 %[[imul2]], 0
+! LLVMIR: %[[isub:.*]] = sub nsw i64 %[[iext]], 1
+! LLVMIR: %[[imul1:.*]] = mul nsw i64 %[[isub]], 1
+! LLVMIR: %[[imul2:.*]] = mul nsw i64 %[[imul1]], 1
+! LLVMIR: %[[iadd:.*]] = add nsw i64 %[[imul2]], 0
 ! LLVMIR: %[[gep1:.*]] = getelementptr <4 x i32>, ptr %2, i64 %[[iadd]]
 ! LLVMIR: %[[arg1:.*]] = load <4 x i32>, ptr %0, align 16
 ! LLVMIR: %[[arg2:.*]] = load i32, ptr %1, align 4
@@ -390,10 +390,10 @@ subroutine vec_xst_be_test_vi4i8ia4(arg1, arg2, arg3, i)
 
 ! LLVMIR: %[[i:.*]] = load i32, ptr %3, align 4
 ! LLVMIR: %[[iext:.*]] = sext i32 %[[i]] to i64
-! LLVMIR: %[[isub:.*]] = sub i64 %[[iext]], 1
-! LLVMIR: %[[imul1:.*]] = mul i64 %[[isub]], 1
-! LLVMIR: %[[imul2:.*]] = mul i64 %[[imul1]], 1
-! LLVMIR: %[[iadd:.*]] = add i64 %[[imul2]], 0
+! LLVMIR: %[[isub:.*]] = sub nsw i64 %[[iext]], 1
+! LLVMIR: %[[imul1:.*]] = mul nsw i64 %[[isub]], 1
+! LLVMIR: %[[imul2:.*]] = mul nsw i64 %[[imul1]], 1
+! LLVMIR: %[[iadd:.*]] = add nsw i64 %[[imul2]], 0
 ! LLVMIR: %[[gep1:.*]] = getelementptr i32, ptr %2, i64 %[[iadd]]
 ! LLVMIR: %[[arg1:.*]] = load <4 x i32>, ptr %0, align 16
 ! LLVMIR: %[[arg2:.*]] = load i64, ptr %1, align 8
@@ -426,10 +426,10 @@ subroutine vec_xst_be_test_vi4i4vai4(arg1, arg2, arg3, i)
 
 ! LLVMIR: %[[i:.*]] = load i32, ptr %3, align 4
 ! LLVMIR: %[[iext:.*]] = sext i32 %[[i]] to i64
-! LLVMIR: %[[isub:.*]] = sub i64 %[[iext]], 1
-! LLVMIR: %[[imul1:.*]] = mul i64 %[[isub]], 1
-! LLVMIR: %[[imul2:.*]] = mul i64 %[[imul1]], 1
-! LLVMIR: %[[iadd:.*]] = add i64 %[[imul2]], 0
+! LLVMIR: %[[isub:.*]] = sub nsw i64 %[[iext]], 1
+! LLVMIR: %[[imul1:.*]] = mul nsw i64 %[[isub]], 1
+! LLVMIR: %[[imul2:.*]] = mul nsw i64 %[[imul1]], 1
+! LLVMIR: %[[iadd:.*]] = add nsw i64 %[[imul2]], 0
 ! LLVMIR: %[[gep1:.*]] = getelementptr <4 x i32>, ptr %2, i64 %[[iadd]]
 ! LLVMIR: %[[arg1:.*]] = load <4 x i32>, ptr %0, align 16
 ! LLVMIR: %[[arg2:.*]] = load i32, ptr %1, align 4 
@@ -467,10 +467,10 @@ subroutine vec_xstd2_test_vi4i8ia4(arg1, arg2, arg3, i)
 
 ! LLVMIR: %[[i:.*]] = load i32, ptr %3, align 4
 ! LLVMIR: %[[iext:.*]] = sext i32 %[[i]] to i64
-! LLVMIR: %[[isub:.*]] = sub i64 %[[iext]], 1
-! LLVMIR: %[[imul1:.*]] = mul i64 %[[isub]], 1
-! LLVMIR: %[[imul2:.*]] = mul i64 %[[imul1]], 1
-! LLVMIR: %[[iadd:.*]] = add i64 %[[imul2]], 0
+! LLVMIR: %[[isub:.*]] = sub nsw i64 %[[iext]], 1
+! LLVMIR: %[[imul1:.*]] = mul nsw i64 %[[isub]], 1
+! LLVMIR: %[[imul2:.*]] = mul nsw i64 %[[imul1]], 1
+! LLVMIR: %[[iadd:.*]] = add nsw i64 %[[imul2]], 0
 ! LLVMIR: %[[gep1:.*]] = getelementptr i32, ptr %2, i64 %[[iadd]]
 ! LLVMIR: %[[arg1:.*]] = load <4 x i32>, ptr %0, align 16
 ! LLVMIR: %[[arg2:.*]] = load i64, ptr %1, align 8
@@ -503,10 +503,10 @@ subroutine vec_xstd2_test_vi4i4vai4(arg1, arg2, arg3, i)
 
 ! LLVMIR: %[[i:.*]] = load i32, ptr %3, align 4
 ! LLVMIR: %[[iext:.*]] = sext i32 %[[i]] to i64
-! LLVMIR: %[[isub:.*]] = sub i64 %[[iext]], 1
-! LLVMIR: %[[imul1:.*]] = mul i64 %[[isub]], 1
-! LLVMIR: %[[imul2:.*]] = mul i64 %[[imul1]], 1
-! LLVMIR: %[[iadd:.*]] = add i64 %[[imul2]], 0
+! LLVMIR: %[[isub:.*]] = sub nsw i64 %[[iext]], 1
+! LLVMIR: %[[imul1:.*]] = mul nsw i64 %[[isub]], 1
+! LLVMIR: %[[imul2:.*]] = mul nsw i64 %[[imul1]], 1
+! LLVMIR: %[[iadd:.*]] = add nsw i64 %[[imul2]], 0
 ! LLVMIR: %[[gep1:.*]] = getelementptr <4 x i32>, ptr %2, i64 %[[iadd]]
 ! LLVMIR: %[[arg1:.*]] = load <4 x i32>, ptr %0, align 16
 ! LLVMIR: %[[arg2:.*]] = load i32, ptr %1, align 4 
@@ -543,10 +543,10 @@ subroutine vec_xstw4_test_vi4i8ia4(arg1, arg2, arg3, i)
 
 ! LLVMIR: %[[i:.*]] = load i32, ptr %3, align 4
 ! LLVMIR: %[[iext:.*]] = sext i32 %[[i]] to i64
-! LLVMIR: %[[isub:.*]] = sub i64 %[[iext]], 1
-! LLVMIR: %[[imul1:.*]] = mul i64 %[[isub]], 1
-! LLVMIR: %[[imul2:.*]] = mul i64 %[[imul1]], 1
-! LLVMIR: %[[iadd:.*]] = add i64 %[[imul2]], 0
+! LLVMIR: %[[isub:.*]] = sub nsw i64 %[[iext]], 1
+! LLVMIR: %[[imul1:.*]] = mul nsw i64 %[[isub]], 1
+! LLVMIR: %[[imul2:.*]] = mul nsw i64 %[[imul1]], 1
+! LLVMIR: %[[iadd:.*]] = add nsw i64 %[[imul2]], 0
 ! LLVMIR: %[[gep1:.*]] = getelementptr i32, ptr %2, i64 %[[iadd]]
 ! LLVMIR: %[[arg1:.*]] = load <4 x i32>, ptr %0, align 16
 ! LLVMIR: %[[arg2:.*]] = load i64, ptr %1, align 8
@@ -578,10 +578,10 @@ subroutine vec_xstw4_test_vi4i4vai4(arg1, arg2, arg3, i)
 
 ! LLVMIR: %[[i:.*]] = load i32, ptr %3, align 4
 ! LLVMIR: %[[iext:.*]] = sext i32 %[[i]] to i64
-! LLVMIR: %[[isub:.*]] = sub i64 %[[iext]], 1
-! LLVMIR: %[[imul1:.*]] = mul i64 %[[isub]], 1
-! LLVMIR: %[[imul2:.*]] = mul i64 %[[imul1]], 1
-! LLVMIR: %[[iadd:.*]] = add i64 %[[imul2]], 0
+! LLVMIR: %[[isub:.*]] = sub nsw i64 %[[iext]], 1
+! LLVMIR: %[[imul1:.*]] = mul nsw i64 %[[isub]], 1
+! LLVMIR: %[[imul2:.*]] = mul nsw i64 %[[imul1]], 1
+! LLVMIR: %[[iadd:.*]] = add nsw i64 %[[imul2]], 0
 ! LLVMIR: %[[gep1:.*]] = getelementptr <4 x i32>, ptr %2, i64 %[[iadd]]
 ! LLVMIR: %[[arg1:.*]] = load <4 x i32>, ptr %0, align 16
 ! LLVMIR: %[[arg2:.*]] = load i32, ptr %1, align 4 

tblah requested reviews from kkwli and madanial0 on December 8, 2023 at 11:25
kkwli (Collaborator) commented Dec 8, 2023

Yes, the change works for PowerPC. Thanks.

kkwli (Collaborator) left a review comment:


LG. Thanks.

tblah merged commit a9adcef into llvm:main on Dec 11, 2023
tblah referenced this pull request on Dec 11, 2023; the description of the referenced change follows:
`nsw` is a flag for LLVM arithmetic operations meaning "no signed wrap".
If this keyword is present, the result of the operation is a poison
value if overflow occurs. Adding this keyword permits LLVM to re-order
integer arithmetic more aggressively.

In

https://discourse.llvm.org/t/rfc-changes-to-fircg-xarray-coor-codegen-to-allow-better-hoisting/75257/16
@vzakhari observed that adding nsw is useful to enable hoisting of
address calculations after some loops (or is at least a step in that
direction).

Classic flang also adds nsw to address calculations.
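For context, here is a minimal LLVM IR sketch of the address computation the updated CHECK lines match (the function name and value names are illustrative, not taken from the test): the one-based Fortran index is sign-extended, converted to a zero-based element offset with nsw arithmetic, and then fed into a getelementptr.

define void @store_vec_elem(<4 x i32> %val, ptr %base, i32 %i) {
  %iext = sext i32 %i to i64
  ; With nsw, signed overflow in the offset arithmetic is poison, which
  ; lets LLVM reassociate or hoist these computations more freely.
  %isub = sub nsw i64 %iext, 1      ; 1-based index -> 0-based index
  %imul = mul nsw i64 %isub, 1      ; stride (trivially 1 here)
  %iadd = add nsw i64 %imul, 0      ; final element offset
  %gep  = getelementptr <4 x i32>, ptr %base, i64 %iadd
  store <4 x i32> %val, ptr %gep, align 16
  ret void
}

Without the flag, a transformation that moves or reorders this arithmetic would have to preserve the wrapped result on overflow; marking it nsw removes that obligation, which is what the referenced change relies on.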
kkwli (Collaborator) commented Dec 11, 2023

Oops, sorry about that. I forgot that ppc-vec-store-elem-order.f90 essentially has the same issue.

kkwli (Collaborator) commented Dec 11, 2023

@tblah Let me create a PR to fix it up.

kkwli (Collaborator) commented Dec 11, 2023

#75064

Labels: flang:fir-hlfir, flang (Flang issues not falling into any other category)
3 participants