[RFC] Move execute_on_lane_0 from vector to gpu dialect

The vector.warp_execute_on_lane_0 op allows incremental transformation of vectorized IR to the GPU SIMT model. Currently, the op lives in the vector dialect and hence expects distributed types to always be vectors. This is an unnecessary restriction (there is nothing vector-specific about the op) that prevents other vector-like types from being distributed.
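For readers less familiar with the op, here is a minimal sketch of its current form (op names and shapes are illustrative; `test.some_def` is a placeholder producer):

```mlir
// The region runs only on lane 0 of a 32-lane warp. The yielded
// vector<128xf32> is distributed across the warp, so each lane
// receives a vector<4xf32> (128 / 32 = 4).
%r = vector.warp_execute_on_lane_0(%laneid)[32] -> (vector<4xf32>) {
  %v = "test.some_def"() : () -> (vector<128xf32>)
  vector.yield %v : vector<128xf32>
}
```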

This RFC proposes moving the op definition and some related distribution utilities (similar to [MLIR][Vector][NFC] Move helper functions to vector distribution utils by kurapov-peter · Pull Request #114208 · llvm/llvm-project · GitHub) to the gpu dialect. All existing distribution patterns would use the exposed utilities but still reside in the vector dialect.
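Assuming the op keeps its current syntax and semantics after the move, a distributed computation would be spelled along these lines (the `gpu.` spelling and the `gpu.yield` terminator are hypothetical here, pending the actual move):

```mlir
// Hypothetical post-move spelling: identical semantics, gpu dialect prefix.
%r = gpu.warp_execute_on_lane_0(%laneid)[32] -> (vector<4xf32>) {
  %v = "test.some_def"() : () -> (vector<128xf32>)
  gpu.yield %v : vector<128xf32>
}
```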

For more context see [RFC] Extending vector distribution to support other types.


I am supportive of moving execute_on_lane_0 to the GPU dialect, where it can be grounded in the GPU SIMT execution model, similar to other ops that assume the GPU execution model (reductions, various IDs, etc). This seems like a much better location than the vector dialect.

Currently, the op lives in the vector dialect and hence expects distributed types to always be vectors. This is an unnecessary restriction (there is nothing vector-specific about the op) that prevents other vector-like types from being distributed.

I don’t believe that the op currently being in the vector dialect restricts it to vector types only – that’s orthogonal to its location. I would not use it as the justification for the code motion you propose.

Alright, let’s just say gpu is a better place for it. Allowing other types is indeed a separate problem that is not directly resolved by the code move.

+1, thanks!
CC: @grypp?

The semantics of this op make it a natural fit for the GPU dialect.

I am not sure whether this op is used for any non-GPU codegen. If it isn’t, we can move it to the GPU dialect.

Looks like there are no objections. I’ll wait just a couple more days for people to react, and then move it.
