[RFC] mlir-spirv-runner

Hi everyone,

I am currently working on the SPIR-V to LLVM conversion, and together with @antiagainst and @MaheshRavishankar we have been thinking about adding an “mlir-spirv-runner” tool. This runner would in some ways resemble the CUDA/Vulkan runners and would aim at JIT-compiling SPIR-V via the SPIR-V to LLVM conversion. One of the results I am particularly interested in is executing GPU/SPIR-V modules on the CPU.

Encoding descriptor sets
Kernel arguments in SPIR-V are represented as global variables with descriptor set and binding numbers specified, e.g.

spv.globalVariable @__var bind(0, 1) : ...

I think this information can be encoded in the symbolic reference of the variable:

spv.globalVariable @__var_set0_binding1 : ...

so that we can lower spv.globalVariable to llvm.mlir.global via the existing conversion pass.
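
For completeness, after the SPIR-V to LLVM conversion such a renamed variable would then become an ordinary LLVM dialect global, roughly along these lines (a sketch only; the exact linkage and type depend on the conversion, so the type is elided as above):

llvm.mlir.global external @__var_set0_binding1() : ...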

Pipeline
I see two options for how to structure the pipeline:

  • Nested modules

    The input is a module with a gpu module containing the kernel, a main function, and function declarations for helpers (LHS of the diagram). The outline of the passes is the following:
  1. Convert GPU dialect to SPIR-V dialect
  2. Lower ABI attributes and update VCE triple
  3. Preprocess spv.module so that it can be lowered to LLVM (called SPIRVEncodeDescriptorSets in the diagram for now and described in more detail above)
  4. Convert SPIR-V to LLVM (and drop entry points for now, assuming there are no “internal” functions)
  5. Convert standard to LLVM
  6. Handle the GPU launch op (ConvertGPULaunchToLLVM). For that, we can get the source pointer to the buffer data and the destination pointer of the kernel’s global variable. We would naturally want to transfer the buffer from the host to the device to execute the kernel. But since we are running on the CPU, we instead “emulate” this memory transfer by copying the data to the destination pointer (the global variable in our case) and then executing the kernel, which now has its global variables set up with the data.
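
To make step 6 a bit more concrete, here is a rough sketch of what the emulated “transfer” could lower to, assuming the descriptor set encoding from the previous section (types are elided, and the kernel symbol name is purely illustrative):

// %src and %size come from the host-side buffer; %dst points to the kernel's global.
%dst = llvm.mlir.addressof @__var_set0_binding1 : ...
"llvm.intr.memcpy"(%dst, %src, %size, %isVolatile) : (...) -> ()
// With the memory transfer emulated, the kernel can be called directly on the CPU.
llvm.call @kernel() : () -> ()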

The problem with this approach is that the result of running the passes is a nested module that cannot be translated to proper LLVM IR. To take care of this, we can “embed” the kernel’s module into the main one, resolving possible conflicts in symbolic references.

  • Separate modules

    This approach separates the host code and the device code into two modules in two files. The pipeline is similar to the one above, but at the end we compile the two modules separately into two object files and then link them.

This approach has a number of drawbacks, however:

  1. We would need to specify the variable/kernel declarations in the main module to tell the compiler that those symbols exist in some other module (see the sketch below this list).
  2. More importantly, there is currently no support for crossing module boundaries in MLIR. Handling this is a separate problem and has to be discussed separately.
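
For the first point, the main module would need external declarations roughly along these lines (illustrative only, with types elided):

llvm.mlir.global external @__var_set0_binding1() : ...
llvm.func @kernel()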

I find the second approach more natural, but given the current state of separate-module handling I think that embedding may be preferable.
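
To illustrate the embedding, the idea is to go from the nested structure produced by the passes to a single flat module that the existing translation to LLVM IR can handle (a sketch, with function bodies elided and symbol names purely illustrative):

// Result of running the passes: a nested module that cannot be translated as is.
module {
  module {
    llvm.mlir.global external @__var_set0_binding1() : ...
    llvm.func @kernel() { ... }
  }
  llvm.func @main() { ... }
}

// After embedding: a single flat module, with clashing symbols renamed if necessary.
module {
  llvm.mlir.global external @__var_set0_binding1() : ...
  llvm.func @kernel() { ... }
  llvm.func @main() { ... }
}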

It would be great to hear any other comments on this!

Thanks,
George

I don’t really understand: what is the conceptual difference between mlir-spirv-runner and mlir-vulkan-runner right now?

The Vulkan runner uses the Vulkan API to launch the SPIR-V binary. This one instead uses the SPIR-V converted to LLVM: the converted module is compiled and linked with the object file generated for the host side.

Oh, so this is running on the CPU then? This is more like mlir-cpu-runner with a SPIR-V lowering path? Is there a specific runtime as well, or is it just the lowering pipeline that is different?

Sorry, I probably should have been more specific on the conceptual part.

It is running on the CPU, and the runtime side is the same as in mlir-cpu-runner: lowering to LLVM IR, JIT-compiling, and executing. The difference is in the starting point (and hence the lowering pipeline), which is roughly the same as the one the GPU runners use, and in the actual goal of being able to JIT SPIR-V.