[llvm-exegesis][AArch64] Load and Store Instruction Benchmarking Support

We would like to add support for AArch64 load and store instructions in llvm-exegesis. Benchmarking them currently fails with either a segmentation fault or the error “Not all operands are initialized by snippet Generator”.

The reason is that the memory these instructions load from or store to is uninitialised, so our question is how to initialise memory in the setup code. We are looking at x86, for which this seems to work, and are also trying to get inspiration from other targets that are more similar to AArch64, but it is unclear how well they support this.
We are missing the high-level idea and design for getting this working, and which concepts apply to our target. So while we continue exploring this, any high-level ideas on how to implement loads/stores would be very much appreciated.

These are some of the questions that we have:

  • We have noticed the memory annotations, e.g. LLVM-EXEGESIS-MEM-MAP and LLVM-EXEGESIS-MEM-DEF. Are these necessary to implement loads/stores?
  • We started looking at one of the most basic load instructions, one that loads into a GPR (LDR w0, [x0] → LDRWui), and found that llvm-exegesis fails to identify its operands as memory operands. Is that to be expected? It seems odd to us.

Benchmarking memory instructions for scheduling model validation is only done with the scratch register. From the documentation:

  • Scratch memory register - The specific register that this value is put in is platform dependent (e.g., it is the RDI register on X86 Linux). Setting this register as a live in ensures that a pointer to a block of memory (1MB) is placed within this register that can be used by the snippet.

It should just end up being put in the register that carries the first argument of a function call (RDI on X86 according to the SysV ABI on Linux). See llvm-project/llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp at commit cb0b9614f8ca7ffcd5f091b1c9990adfd6cb7e33 in llvm/llvm-project on GitHub.
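
To make the "first argument" point concrete, here is a rough C++ sketch of the idea, not the actual llvm-exegesis code. All names in it (kScratchSize, SnippetFn, loadWordSnippet, runWithScratch) are made up for illustration, and the fill value is an assumption. The snippet is modelled as a function that takes one pointer, so the calling convention decides which register holds the scratch pointer: RDI under SysV on x86-64 Linux, and x0 under AAPCS64 on AArch64.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical names for illustration only; not llvm-exegesis's real API.
// 1MB scratch block, matching the size given in the documentation.
constexpr std::size_t kScratchSize = 1 << 20;

// A benchmark snippet modelled as a function taking a single pointer.
// The calling convention picks the register that carries it:
// RDI (SysV, x86-64 Linux) or x0 (AAPCS64, AArch64).
using SnippetFn = std::uint32_t (*)(char *Scratch);

// Stand-in for a load such as `ldr w0, [x0]`: read 32 bits through the
// scratch pointer. memcpy avoids alignment and aliasing problems.
std::uint32_t loadWordSnippet(char *Scratch) {
  std::uint32_t Value;
  std::memcpy(&Value, Scratch, sizeof(Value));
  return Value;
}

// Allocate the scratch block, fill every byte with a known value (an
// assumption made here so the load result is predictable), and pass the
// block to the snippet as its first argument.
std::uint32_t runWithScratch(SnippetFn Snippet, unsigned char Fill) {
  assert(Snippet && "snippet must not be null");
  std::vector<char> Scratch(kScratchSize, static_cast<char>(Fill));
  return Snippet(Scratch.data());
}
```

Calling runWithScratch(loadWordSnippet, 0x01) gives the load valid, initialised memory to read from, which is all the scratch register is meant to provide for the simple cases.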

The memory annotations are primarily intended for use in custom snippets, although I’ve been wanting to play around with using them for automatic snippet generation. They shouldn’t be needed for the simple cases though.

I’m not sure why this isn’t working on AArch64, and I have no idea whether it has ever worked in the past. It probably just needs someone to hack on it a bit to get it into shape.