[RFC] Executing multiple MLIR modules

Related: [RFC] mlir-spirv-runner
Current implementation: https://reviews.llvm.org/D86108

I have created a prototype for mlir-spirv-runner that executes a GPU kernel on the CPU via SPIR-V to LLVM conversion. As part of the runner, I need to link and execute two MLIR modules, which is not possible with the current infrastructure.
I think it is easier to discuss here how to support executing multiple MLIR modules, and to summarize the points raised so far. Ping @antiagainst, @MaheshRavishankar, @mehdi_amini and @ftynse for visibility.

Problem
After several passes, I have a nested module structure that I want to translate into LLVM IR and execute.

// main module
module {
  ...
  module { ... } // nested module
  ...
}

Ideally, I want to link the main and the nested modules.

Current solution
The solution I have currently implemented adds an optional function reference parameter called llvmModuleBuilder to JitRunnerMain(), similar to mlirTransformer. llvmModuleBuilder is a custom function specified by the user (mlir-spirv-runner in this case) that processes the MLIR module and creates an LLVM IR module in a user-defined way. In our case, it translates each of the two modules to LLVM IR and links them with LLVM’s Linker.

The clear drawback of this solution is that we need to pass llvmModuleBuilder through the whole stack down to ExecutionEngine::create(). There, we would call

auto llvmModule = llvmModuleBuilder ? llvmModuleBuilder(m, *ctx)
                                    : translateModuleToLLVMIR(m, *ctx);

to get the LLVM IR module either from our custom function or, if none is specified, from the default translateModuleToLLVMIR() call.
The contract for this callback is still loose right now, so any suggestions are welcome.
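
To illustrate the fallback described above, here is a minimal, self-contained sketch. It does not use the real MLIR or LLVM classes: MlirModule, LlvmModule, LlvmModuleBuilder, defaultTranslate and createLlvmModule are all hypothetical stand-ins, and only the "use the callback if present, otherwise the default translation" shape is taken from the proposal.

```cpp
#include <functional>
#include <memory>
#include <string>

// Toy stand-ins for mlir::ModuleOp and llvm::Module (NOT the real classes).
struct MlirModule { std::string name; };
struct LlvmModule { std::string ir; };

// Hypothetical callback type mirroring the proposed llvmModuleBuilder.
using LlvmModuleBuilder =
    std::function<std::unique_ptr<LlvmModule>(const MlirModule &)>;

// Stand-in for the default translateModuleToLLVMIR() path.
std::unique_ptr<LlvmModule> defaultTranslate(const MlirModule &m) {
  return std::make_unique<LlvmModule>(LlvmModule{"default:" + m.name});
}

// Stand-in for the choice made inside ExecutionEngine::create():
// call the custom builder when one was supplied, otherwise fall back.
std::unique_ptr<LlvmModule>
createLlvmModule(const MlirModule &m,
                 const LlvmModuleBuilder &builder = nullptr) {
  return builder ? builder(m) : defaultTranslate(m);
}
```

In this sketch the callback is optional at every call site, which is what makes the contract loose: nothing constrains what a custom builder does with the module it receives.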

Alternative
There is an alternative proposed in the discussion on the Phabricator:

translateModuleToLLVMIR() can be modified to handle multiple modules. Instead of taking a single ModuleOp, it can take an ArrayRef<ModuleOp> as its argument and link the translated modules. This way we wouldn’t have to thread the llvmModuleBuilder callback through the entire stack. It would require ExecutionEngine::create() to take an ArrayRef<ModuleOp> instead of a single ModuleOp, plus an extra utility to separate the modules (which can be provided by mlir-spirv-runner or another user).
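
As a rough sketch of this alternative, the toy code below translates a list of modules one by one and links them into a single result. Everything here is an assumption made for illustration: modules are modelled as plain symbol tables, translateOne and linkInto are hypothetical helpers, and the duplicate-definition check only loosely imitates what a real linker does (it ignores declarations, linkage types and so on).

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Toy model: a module is a map from symbol name to its definition.
using SymbolTable = std::map<std::string, std::string>;
struct MlirModule { SymbolTable functions; }; // stand-in for mlir::ModuleOp
struct LlvmModule { SymbolTable functions; }; // stand-in for llvm::Module

// Stand-in for translating a single module.
LlvmModule translateOne(const MlirModule &m) { return LlvmModule{m.functions}; }

// Links src into dst. Returns false on a duplicate definition; dst may be
// partially updated in that case, so the caller discards it.
bool linkInto(LlvmModule &dst, const LlvmModule &src) {
  for (const auto &[name, body] : src.functions)
    if (!dst.functions.emplace(name, body).second)
      return false;
  return true;
}

// Hypothetical multi-module variant: translate each module, then link.
std::optional<LlvmModule>
translateModulesToLLVMIR(const std::vector<MlirModule> &modules) {
  assert(!modules.empty() && "expected at least one module");
  LlvmModule result = translateOne(modules.front());
  for (std::size_t i = 1; i < modules.size(); ++i)
    if (!linkInto(result, translateOne(modules[i])))
      return std::nullopt;
  return result;
}
```

The caller (for example a runner) would still be responsible for pulling the nested module out of the main one before building the list.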

Any suggestions and thoughts are welcome!
Thanks,
George

It isn’t clear to me so far why this has to be done at the time of the conversion to LLVM IR. Why can’t it be done earlier, at the MLIR level?

Since the goal is to link the main module and the nested module, it seems easier to do this at the LLVM IR level and reuse the existing linking library. Otherwise, we would have to embed the nested module into the main one and resolve all possible symbol conflicts ourselves, which seems less natural to me.

What needs to happen here is effectively compiling two separate MLIR modules that need to be linked together. As George mentioned, this would mean symbol resolution etc. Stepping back from the specific need here, I thought it might be useful to modify the ExecutionEngine to take multiple MLIR modules, compile each down to LLVM, and use LLVM to link them. This was discussed in a previous post as well, where @ftynse also mentioned compiling the MLIR modules to LLVM and linking them as an option (Nested modules translation and symbol table - #6 by ftynse). Is there a specific issue with this generalization? It could be beneficial to other use cases as well.

The module structure mentioned above is specific to mlir-spirv-runner. It makes sense to me to have the SPIR-V runner split the two modules into separate compilation units and pass them to the JIT engine.

OK, this may be the time to resume the discussion on removing the MLIR ExecutionEngine entirely and migrating any remaining features (if any?) to the LLVM ExecutionEngine (which exposes addModule(), by the way).

There isn’t anything MLIR-specific left in the implementation right now I believe?

I think there are the following differences:

  1. the API itself is arguably more user-friendly, but LLJIT is approaching the same interface and it would be great to combine efforts;
  2. we have an “API canonicalization” pass that makes all functions available with void (void **) signature and unpacks the arguments, this is a pure LLVM IR rewrite that can live anywhere;
  3. we have a flow to dynamically load shared libraries and resolve symbols to those libraries, this was the most painful part to set up;
  4. minor CLI interface things like initializing and printing certain kinds of results, this can mostly go to a C library that we can compile, link and call into instead.
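
To make item 2 concrete, here is a hand-written C++ analogue of what such a wrapper does. The real pass generates the wrapper in LLVM IR; the names addInts, packedAddInts and callPacked are hypothetical, and the convention that the last slot points to the result storage is an assumption made for this sketch.

```cpp
// The original function with its natural signature.
int addInts(int a, int b) { return a + b; }

// Wrapper with the uniform void(void **) signature.
// Assumed convention: args[i] points to the i-th argument and the
// last slot points to storage for the result.
void packedAddInts(void **args) {
  int a = *static_cast<int *>(args[0]);
  int b = *static_cast<int *>(args[1]);
  *static_cast<int *>(args[2]) = addInts(a, b);
}

// Caller side: pack pointers to the arguments and the result, then
// invoke the wrapper without knowing the original signature.
int callPacked(void (*fn)(void **), int a, int b) {
  int result = 0;
  void *args[] = {&a, &b, &result};
  fn(args);
  return result;
}
```

Because every wrapper has the same type, a runner can look up any entry point by name and call it through a single function-pointer type.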

I think a good way to make incremental progress in this direction while still landing new features is to make sure that new changes actually make the mlir::ExecutionEngine look more like the llvm::ExecutionEngine. In this case, instead of having the constructor take a list of modules, the client could initialize the ExecutionEngine and then call addModule() separately?

If I understand this correctly, that would mean ExecutionEngine::create(…) is split into

init
addModule
translateToLLVM

?
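
One possible shape of such a split, as a toy sketch only: ToyEngine and ToyModule are hypothetical stand-ins rather than the mlir::ExecutionEngine or LLVM APIs, the translation step is left out, and symbols are modelled as plain integers so the staging (create first, add modules later, then look up) is the only thing shown.

```cpp
#include <map>
#include <memory>
#include <optional>
#include <string>

// Toy stand-in for a compiled module: symbol name -> value it yields.
struct ToyModule {
  std::map<std::string, int> symbols;
};

class ToyEngine {
public:
  // Step 1 (init): build an empty engine; no module is needed up front.
  static std::unique_ptr<ToyEngine> create() {
    return std::unique_ptr<ToyEngine>(new ToyEngine());
  }

  // Step 2 (addModule): may be called repeatedly. The whole module is
  // rejected if any of its symbols is already defined in the engine.
  bool addModule(const ToyModule &m) {
    for (const auto &entry : m.symbols)
      if (symbols.count(entry.first))
        return false;
    symbols.insert(m.symbols.begin(), m.symbols.end());
    return true;
  }

  // Step 3: resolve a symbol across everything added so far.
  std::optional<int> lookup(const std::string &name) const {
    auto it = symbols.find(name);
    if (it == symbols.end())
      return std::nullopt;
    return it->second;
  }

private:
  ToyEngine() = default;
  std::map<std::string, int> symbols;
};
```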

I looked back at LLVM and I suspect that llvm::ExecutionEngine may not be what we should target (I only got there because it shares its name with mlir::ExecutionEngine).
Instead, LLJIT can be used directly. Judging by this example, it seems entirely straightforward: https://github.com/llvm/llvm-project/blob/master/llvm/examples/HowToUseLLJIT/HowToUseLLJIT.cpp#L70-L80

@george is this something you could use in the mlir-spirv-runner in place of mlir::ExecutionEngine?

Yes, this looks like something I can use in this case if we are not going through the mlir::ExecutionEngine pipeline.