The vector.warp_execute_on_lane_0 op allows incremental transformation of vectorized IR to GPU SIMT. Currently, the op lives in the vector dialect and hence expects distributed types to always be vectors. This is an unnecessary restriction (there is nothing vector-specific about the op) that prevents other vector-like types from being distributed.
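For context, a minimal sketch of the op's distribution semantics (the `"some_def"` op is a placeholder for an arbitrary producer): the body conceptually executes on lane 0 of the warp, and the yielded value is distributed across the warp's lanes, so the result type outside the region is the per-lane slice of the type yielded inside.

```mlir
// Warp of 32 lanes: the vector<128xf32> yielded by the body is
// distributed so that each lane holds a vector<4xf32> (128 / 32).
%r = vector.warp_execute_on_lane_0(%laneid)[32] -> (vector<4xf32>) {
  %v = "some_def"() : () -> (vector<128xf32>)
  vector.yield %v : vector<128xf32>
}
```

The restriction under discussion is that both the yielded type and the distributed result type must be vectors, even though the lane-0 execution model itself does not depend on that.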
I am supportive of moving execute_on_lane_0 to the GPU dialect, where it can be grounded in the GPU SIMT execution model, similar to other ops that assume the GPU execution model (reductions, various IDs, etc.). This seems like a much better location than the vector dialect.
Currently, the op lives in the vector dialect and hence expects distributed types to always be vectors. This is an unnecessary restriction (there is nothing vector-specific about the op) that prevents other vector-like types from being distributed.
I don’t believe that the op currently being in the vector dialect restricts it to vector types only – that’s orthogonal to its location. I would not use it as justification for the code motion you propose.
Alright, let’s just say the gpu dialect is a better place for it. Allowing other types is indeed a separate problem that is not directly resolved by the code move.