Inline Asm Constraints and Modifiers - RVC, Raw Encodings, Pairs #92

Merged
merged 5 commits into from
Dec 12, 2024

@lenary lenary commented Oct 15, 2024

We have customers with use cases that want more kinds of register
constraints and modifiers. This change proposes support for these
constraints and modifiers, and their names.

Broadly, these are intended to make it easier for users who want to
manually assemble instructions inside inline assembly blocks, either
using the existing instruction formats, or using the raw form of the
.insn directive. This makes it easier for hardware designers to
experiment on new ISA extensions, and makes it easier to support
the use of proprietary extensions with unmodified open-source
toolchains.

There are three groups of additions here:

  • Constraints for RVC-compatible registers. These use the `c` prefix on
    an existing register constraint, so `cr` gives a GPR between `x8`-`x15`,
    and `cf` does the same for an FPR between `f8`-`f15`.

    I'm not aware of compressed vector instructions, but we could add
    `cvr`, `cvd` and `cvm` in the future if the core architecture ends up
    having the concept of a vector register with an RVC encoding.

  • A modifier, `N`, to print the raw encoding of a register. This is used
    when using `.insn <length>, <encoding>`, where the user wants to pass
    a value to the instruction in a known register, but where the
    instruction doesn't follow the existing instruction formats, so the
    assembly parser is not expecting a register name, just a raw integer.

  • Constraints for even-odd pairs of general-purpose registers. These use
    the `R` constraint.

    While the concept of even-odd register pairs is reasonably "new",
    there are places in the architecture where these already exist - the
    doubleword/quad CAS in Zacas, and they are also present in the Zilsd
    specification.


There's also a commit which fixes a header for the Assembly Constraints table.
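
As a rough illustration of the `N` modifier, the sketch below hand-assembles an ordinary `add` through the raw `.insn <length>, <encoding>` form, so the assembler only ever sees integers. The helper names `encode_add` and `add_raw` and the opt-in macro `USE_NEW_ASM_CONSTRAINTS` are invented for this example. The inline-assembly path assumes a RISC-V toolchain recent enough to implement this proposal; everywhere else a plain C fallback is used.

```c
#include <assert.h>
#include <stdint.h>

/* Encoding of `add rd, rs1, rs2` (R-type, opcode 0x33, funct3 = 0,
   funct7 = 0) built from raw register numbers. This mirrors the
   arithmetic the assembler evaluates inside add_raw() below. */
static inline uint32_t encode_add(unsigned rd, unsigned rs1, unsigned rs2) {
    assert(rd < 32 && rs1 < 32 && rs2 < 32);
    return 0x33u | ((uint32_t)rd << 7) | ((uint32_t)rs1 << 15) |
           ((uint32_t)rs2 << 20);
}

/* Adds two values by emitting the instruction as a raw 4-byte word.
   %N0/%N1/%N2 print register numbers (e.g. 10 for a0), not names. */
static inline unsigned long add_raw(unsigned long a, unsigned long b) {
#if defined(__riscv) && defined(USE_NEW_ASM_CONSTRAINTS) /* hypothetical opt-in */
    unsigned long r;
    __asm__(".insn 4, 0x33 | (%N0 << 7) | (%N1 << 15) | (%N2 << 20)"
            : "=r"(r)
            : "r"(a), "r"(b));
    return r;
#else
    return a + b; /* portable stand-in so the sketch builds anywhere */
#endif
}
```

If the compiler happens to pick `a0`, `a1` and `a2`, the expression evaluates to `0x00C58533`, which is the usual encoding of `add a0, a1, a2`; `encode_add(10, 11, 12)` computes the same word on the host.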

We have customers with use cases that want more kinds of register
constraints and modifiers. This change proposes support for these
constraints and modifiers, and their names.

Broadly, these are intended to make it easier for users who want to
manually assemble instructions inside inline assembly blocks, either
using the existing instruction formats, or using the raw form of the
`.insn` directive. This makes it easier for hardware designers to
experiment on new ISA extensions, and makes it easier to support
the use of proprietary extensions with unmodified open-source
toolchains.

There are three groups of additions here:

- Constraints for RVC-compatible registers. These use the `c` prefix on
  an existing register constraint, so `cr` gives a GPR between x8-x15,
  and `cf` does the same for an FPR between f8-f15.

  I'm not aware of compressed vector instructions, but we could add
  `cvr`, `cvd` and `cvm` in the future if the core architecture ends up
  having the concept of a vector register with an RVC encoding.

- A modifier, `N`, to print the raw encoding of a register. This is used
  when using `.insn <length>, <encoding>`, where the user wants to pass
  a value to the instruction in a known register, but where the
  instruction doesn't follow the existing instruction formats, so the
  assembly parser is not expecting a register name, just a raw integer.

- Constraints for even-odd pairs of registers. These use the `P` prefix
  on an existing register constraint. At the moment, this only defines
  `Pr` to mean an even-odd pair of GPRs. (We use `P` as a prefix as `p`
  already means "pointer" in GCC's target-independent constraints).

  I think this will print as the even register in the even-odd register
  pair, but I'm still working on the details around this.

  While the concept of even-odd register pairs is reasonably "new",
  there are places in the architecture where these already exist - the
  doubleword/quad CAS in Zacas, and they are also present in the Zilsd
  specification.

Signed-off-by: Sam Elliott <[email protected]>
src/c-api.adoc Outdated
@@ -746,13 +746,18 @@ statements, including both RISC-V specific and common operand constraints.
|K |5-bit unsigned immediate integer operand |
|J |Zero integer immediate operand |
|s |symbol or label reference with a constant offset |
|cr |RVC general purpose register (`x8`-`x15`) |
|cf |RVC floating point register (`f8`-`f15`) |
|Pr |Even-odd general purpose register pair |
Contributor

Do any other targets support pair? I haven't been able to find any in LLVM and I don't know how to implement it in LLVM.

Contributor Author

AArch64 supports tuples of 8 64-bit registers, as implemented here llvm/llvm-project@7d94043

I noticed that the NXP LLVM fork with Zilsd support seems to allocate both v2i<xlen> and i<2*xlen> to the paired register class, but maybe we only need to do the former? I'm not sure, as it would be nice to be able to pass uint<2*xlen>_t to a GPR pair constraint.

I understand that pair constraint is the most complex part of this proposal. I have most of a patch for the other constraints, which I will try to finish in the next few days and post to LLVM.

Contributor Author

Arm supports pairs in inline asm using the "H" output modifier: https://developer.arm.com/documentation/dui0774/l/armclang-Inline-Assembler/Inline-assembly-template-modifiers/Template-modifiers-for-AArch32-state?lang=en. E.g. when reading a 64-bit system register in 32-bit mode:

    uint64_t _val;
    /* %0 prints the lower-numbered register of the pair, %H0 the higher-numbered one */
    __asm__("mrrc p15, 0, %0, %H0, c14" : "=r" (_val));
    return _val;

Contributor Author

I went looking at this and honestly, using a specific constraint rather than something implicit in a modifier seems much more obvious. But thanks for pointing out they had it, the LLVM implementation (they do a bunch of fixing things up later in isel rather than in lowering) was so strange that I hadn't come across what was going on. The RISC-V implementation has ended up following what SystemZ did, which has some overlaps with how Arm works but not many.

Collaborator

Also, one more constraint on the GCC side: multi-letter constraints that start with the same letter must all have the same length, e.g. defining both `Rp` and `Rpr` is invalid, since they have lengths 2 and 3.
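
The length rule can be pictured with a toy checker like the one below. It is only a sketch of the rule as stated in this comment, not GCC code, and the function name `constraint_lengths_ok` is made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if all constraint names that start with the same letter
   have the same length, and 0 otherwise. */
static int constraint_lengths_ok(const char *const *names, size_t n) {
    /* Every name must be a non-empty string. */
    for (size_t i = 0; i < n; i++)
        assert(names[i] != NULL && names[i][0] != '\0');

    for (size_t i = 0; i < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            if (names[i][0] == names[j][0] &&
                strlen(names[i]) != strlen(names[j]))
                return 0; /* same first letter, different lengths */
        }
    }
    return 1;
}
```

Under this rule `Rp` together with `Rpr` is rejected, while `cr`, `cf` and a single-letter `R` can coexist because the two `c*` names are both two characters long.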

Contributor Author

I'm starting to wonder how extensible we really need pairs to be. So far they've only turned up in the ISA for GPRs. For vector tuples, I think right now it's possible to just use `vr` or `vd` with the right type of argument?

So maybe all we need is GPR pairs, and we don't need to think about extensibility?

Contributor Author

I've pushed an update with R, just saying that c* is extensible and pairs are not. We can still change the letter though.

Collaborator

> I think right now it's possible to just use `vr` or `vd` with the right type of argument?

Yes, just vr or vd for vector tuple is fine.

Collaborator

> So maybe all we need is GPR pairs, and we don't need to think about extensibility?

I am OK with R for GPR pair only for now :)

lenary added a commit to lenary/llvm-project that referenced this pull request Oct 16, 2024
This change implements support for the `cr` and `cf` register
constraints (which allocate a RVC GPR or RVC FPR respectively), and the
`N` modifier (which prints the raw encoding of a register rather than the
name).

The intention behind these additions is to make it easier to use inline
assembly when assembling raw instructions that are not supported by the
compiler, for instance when experimenting with new instructions or when
supporting proprietary extensions outside the toolchain.

These implement part of my proposal in riscv-non-isa/riscv-c-api-doc#92

As part of the implementation, I felt there was not enough coverage of
inline assembly and the "in X" floating-point extensions, so I have
added more regression tests around these configurations.
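
To make the `cr` constraint concrete, here is a minimal sketch that uses the compressed `c.and` instruction, whose operands must come from `x8`-`x15`. The helper names and the opt-in macro `USE_NEW_ASM_CONSTRAINTS` are hypothetical. The inline-assembly path assumes a RISC-V compiler that implements these constraints with the C extension enabled; other hosts take the plain C branch.

```c
#include <assert.h>

/* True when a register number fits the 3-bit register fields used by
   most compressed instructions (x8-x15, or f8-f15 for FPRs). */
static inline int is_rvc_reg(unsigned regno) {
    assert(regno < 32);
    return regno >= 8 && regno <= 15;
}

/* Value stored in a 3-bit RVC register field: register number minus 8. */
static inline unsigned rvc_field(unsigned regno) {
    assert(is_rvc_reg(regno));
    return regno - 8;
}

/* Bitwise AND using c.and; `cr` asks the register allocator for a GPR
   in x8-x15 so that the compressed encoding is always valid. */
static inline unsigned long and_rvc(unsigned long a, unsigned long b) {
#if defined(__riscv) && defined(__riscv_compressed) && \
    defined(USE_NEW_ASM_CONSTRAINTS) /* hypothetical opt-in */
    __asm__("c.and %0, %1" : "+cr"(a) : "cr"(b));
    return a;
#else
    return a & b; /* portable stand-in so the sketch builds anywhere */
#endif
}
```

With a plain `r` constraint the compiler would be free to pick a register such as `t3` (`x28`), which `c.and` cannot encode; `cf` plays the same role for floating-point operands.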

lenary commented Oct 16, 2024

I have an LLVM implementation of the c* constraints and the N modifier here llvm/llvm-project#112561 which has been approved.

I am still working on the implementation of the Pr constraint.

lenary added a commit to llvm/llvm-project that referenced this pull request Oct 18, 2024

lenary commented Oct 18, 2024

There was a question on #39 about whether the paired constraint means "an even GPR" or "an even-odd GPR pair", as both would likely print exactly the same, as the even subregister.

This matters for liveness analysis as, if Pr meant "an even GPR", the odd GPR in the pair might be allocated to a different input/output for the inline assembly block, which would conflict with the use of the even/odd value for a pair.

So, to be clear, Pr means "treat this as an even-odd GPR pair for liveness" and "print this as the even subregister", so if the compiler chooses a0/a1, then a0 is printed, but a1 will also be defined/used by the inline assembly block.
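
A sketch of how this looks from the user's side, written with the `R` spelling that the pull request was eventually merged with. It wraps the doubleword CAS from Zacas on RV32. The function names, the opt-in macro `USE_NEW_ASM_CONSTRAINTS`, and the feature-test macro `__riscv_zacas` (assumed to follow the usual `__riscv_<ext>` naming) are assumptions for the example, and the fallback branch is a non-atomic model for other hosts.

```c
#include <assert.h>
#include <stdint.h>

/* The operand is printed as the even register; the odd partner is the
   next register number and is implicitly read/written as well. */
static inline unsigned pair_odd_partner(unsigned even_reg) {
    assert(even_reg < 32 && even_reg % 2 == 0);
    return even_reg + 1;
}

/* Compare-and-swap on a 64-bit value; returns the value found in memory. */
static inline uint64_t cas64(uint64_t *p, uint64_t expected, uint64_t desired) {
#if defined(__riscv) && __riscv_xlen == 32 && defined(__riscv_zacas) && \
    defined(USE_NEW_ASM_CONSTRAINTS) /* hypothetical opt-in */
    /* %0 and %2 each print an even register such as a0; the odd partners
       (a1, ...) are reserved by the pair constraint but never printed. */
    __asm__ volatile("amocas.d %0, %2, (%1)"
                     : "+R"(expected)
                     : "r"(p), "R"(desired)
                     : "memory");
    return expected;
#else
    /* Non-atomic stand-in so the sketch builds and runs on any host. */
    uint64_t old = *p;
    if (old == expected)
        *p = desired;
    return old;
#endif
}
```

If the constraint only meant "an even GPR", the compiler could hand the odd partner of `expected` to the pointer operand, and the instruction would then clobber it.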

lenary added a commit to lenary/llvm-project that referenced this pull request Oct 23, 2024
This patch adds support for getting even-odd general purpose register
pairs into and out of inline assembly using the `Pr` constraint as
proposed in riscv-non-isa/riscv-c-api-doc#92

There are a few different pieces to this patch, each of which needs its
own explanation.

Target-Independent Changes:
- This adds two new Machine Value Types (MVTs), which represent pairs for
  each xlen. Two are needed because MVTs usually have a fixed length. This
  change unfortunately increases the size of SelectionDAG tables indexed
  by MVT by a small percentage.

- When an inline assembly block returns multiple values, it returns them
  in a struct, rather than as a single value. This fixes TargetLowering
  so that `getAsmOperandValueType` is called on the types in that
  struct, so that targets have the opportunity to propose their own MVT
  for an inline assembly operand where this wouldn't match conventional
  arguments/return values. This matches what happens when a single value
  is returned.

RISC-V Changes:
- Renames the Register Class used for f64 values on rv32i_zdinx from
  `GPRPair*` to `GPRF64Pair*`. These register classes are kept broadly
  unmodified, as their primary value type is used for type inference
  over selection patterns. This rename affects quite a lot of files. I
  reordered the definitions in RISCVRegisterInfo.td and added headings
  to make it easier to browse.

- Adds new `GPRPair*` register classes which will be used for `Pr`
  constraints and for instructions that need an even-odd GPR pair. This
  new type is used for `amocas.d.*`(rv32) and `amocas.q.*`(rv64) in
  Zacas, instead of the `GPRF64Pair` class being used before.

- Marks the new `GPRPair` class as legal for holding a
  `MVT::riscv_i<xlen>_pair`. Two new RISCVISD node types are added for
  creating and destructing a pair - `BuildGPRPair` and `SplitGPRPair`,
  and are introduced when bitcasting to/from the pair type and the
  `i<2*xlen>` type.

- This adds an override for `getNumRegisters` to ensure that `i<2*xlen>`
  values, when going to/from inline assembly, only allocate one (pair)
  register (they would otherwise allocate two).

- Ensures that the DAGCombiner doesn't merge the `bitcast` between
  `i<2*xlen>` types and the pair type into a load/store, as we want to
  legalise these 2*xlen-wide loads/stores as before - by splitting them
  into two xlen-wide loads/stores, which will happen with `i<2*xlen>`
  types.

- Ensures that Clang understands that `Pr` is a valid inline assembly
  constraint.
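
The build/split step between an `i<2*xlen>` value and a register pair can be modelled in plain C as below, for `xlen` = 32. This is only an illustration of the data movement, not LLVM code. The type and function names are invented, and placing the low half in the even register is an assumption of the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Model of an even-odd GPR pair on RV32. */
typedef struct {
    uint32_t even; /* assumed to hold the low 32 bits */
    uint32_t odd;  /* assumed to hold the high 32 bits */
} gpr_pair32;

/* Rough analogue of splitting a 64-bit integer into a pair. */
static inline gpr_pair32 split_u64(uint64_t v) {
    gpr_pair32 p = { (uint32_t)v, (uint32_t)(v >> 32) };
    return p;
}

/* Rough analogue of rebuilding the 64-bit integer from the pair. */
static inline uint64_t join_u64(gpr_pair32 p) {
    return ((uint64_t)p.odd << 32) | p.even;
}
```

The round trip `join_u64(split_u64(v))` returns `v` unchanged, which is the property the bitcast between the pair type and `i<2*xlen>` has to preserve.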

lenary commented Oct 23, 2024

I just opened llvm/llvm-project#112983 which implements the Pr constraint.


lenary commented Oct 29, 2024

Gentle Ping

Is there anything missing from the textual additions that is needed to clarify the proposal?

How are people feeling about this proposal?

@kito-cheng
Collaborator

The proposal looks good; let me make sure it's implementable on the GCC side before we move forward... :P

lenary added a commit to lenary/llvm-project that referenced this pull request Nov 12, 2024
lenary added a commit to lenary/llvm-project that referenced this pull request Nov 13, 2024
This patch adds support for getting even-odd general purpose register
pairs into and out of inline assembly using the `Pr` constraint as
proposed in riscv-non-isa/riscv-c-api-doc#92

There are a few different pieces to this patch, each of which needs its
own explanation.

- Renames the Register Class used for f64 values on rv32i_zdinx from
  `GPRPair*` to `GPRF64Pair*`. These register classes are kept broadly
  unmodified, as their primary value type is used for type inference
  over selection patterns. This rename affects quite a lot of files.

- Adds new `GPRPair*` register classes which will be used for `Pr`
  constraints and for instructions that need an even-odd GPR pair. This
  new type is used for `amocas.d.*`(rv32) and `amocas.q.*`(rv64) in
  Zacas, instead of the `GPRF64Pair` class being used before.

- Marks the new `GPRPair` class as legal for holding an `MVT::Untyped`.
  Two new RISCVISD node types are added for creating and destructing a
  pair - `BuildGPRPair` and `SplitGPRPair`, and are introduced when
  bitcasting to/from the pair type and `untyped`.

- Adds functionality to `splitValueIntoRegisterParts` and
  `joinRegisterPartsIntoValue` to handle changing `i<2*xlen>` MVTs into
  `untyped` pairs.

- Adds an override for `getNumRegisters` to ensure that `i<2*xlen>`
  values, when going to/from inline assembly, only allocate one (pair)
  register (they would otherwise allocate two). This is due to a bug in
  SelectionDAGBuilder.cpp which other backends also work around.

- Ensures that Clang understands that `Pr` is a valid inline assembly
  constraint.

- Adds Conditions to the GPRF64Pair-related changes to `LowerOperation`
  and `ReplaceNodeResults` which match when BITCAST for the relevant
  types should be handled in a custom manner.

- This also allows `Pr` to be used for `f64` types on `rv32_zdinx`
  architectures, where doubles are stored in a GPR pair.
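
For the `f64` case, a user-level sketch might look like the following, written with the final `R` spelling. It assumes an RV32 target with Zdinx and a compiler that accepts `double` operands for the pair constraint, as this commit message describes for LLVM. The helper names, the opt-in macro `USE_NEW_ASM_CONSTRAINTS`, and the feature-test macro `__riscv_zdinx` are assumptions for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Adds two doubles; on rv32 with Zdinx each operand lives in an
   even-odd GPR pair and is printed as its even register. */
static inline double add_f64(double a, double b) {
#if defined(__riscv) && __riscv_xlen == 32 && defined(__riscv_zdinx) && \
    defined(USE_NEW_ASM_CONSTRAINTS) /* hypothetical opt-in */
    double r;
    __asm__("fadd.d %0, %1, %2" : "=R"(r) : "R"(a), "R"(b));
    return r;
#else
    return a + b; /* portable stand-in so the sketch builds anywhere */
#endif
}

/* Bit pattern of a double, i.e. the 64 bits that would be spread over
   the two registers of the pair. */
static inline uint64_t f64_bits(double d) {
    uint64_t u;
    assert(sizeof u == sizeof d);
    memcpy(&u, &d, sizeof u);
    return u;
}
```

On a host with IEEE 754 doubles, `f64_bits(1.0)` is `0x3FF0000000000000`, so the two registers of the pair would carry `0x00000000` and `0x3FF00000`.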
lenary added a commit to lenary/llvm-project that referenced this pull request Nov 14, 2024
[RISCV] GPR Pairs for Inline Asm using `R`

This patch adds support for getting even-odd general purpose register
pairs into and out of inline assembly using the `R` constraint as
proposed in riscv-non-isa/riscv-c-api-doc#92

There are a few different pieces to this patch, each of which needs its
own explanation.

- Renames the Register Class used for f64 values on rv32i_zdinx from
  `GPRPair*` to `GPRF64Pair*`. These register classes are kept broadly
  unmodified, as their primary value type is used for type inference
  over selection patterns. This rename affects quite a lot of files.

- Adds new `GPRPair*` register classes which will be used for `R`
  constraints and for instructions that need an even-odd GPR pair. This
  new type is used for `amocas.d.*`(rv32) and `amocas.q.*`(rv64) in
  Zacas, instead of the `GPRF64Pair` class being used before.

- Marks the new `GPRPair` class as legal for holding an `MVT::Untyped`.
  Two new RISCVISD node types are added for creating and destructing a
  pair - `BuildGPRPair` and `SplitGPRPair`, and are introduced when
  bitcasting to/from the pair type and `untyped`.

- Adds functionality to `splitValueIntoRegisterParts` and
  `joinRegisterPartsIntoValue` to handle changing `i<2*xlen>` MVTs into
  `untyped` pairs.

- Adds an override for `getNumRegisters` to ensure that `i<2*xlen>`
  values, when going to/from inline assembly, only allocate one (pair)
  register (they would otherwise allocate two). This is due to a bug in
  SelectionDAGBuilder.cpp which other backends also work around.

- Ensures that Clang understands that `R` is a valid inline assembly
  constraint.

- Adds Conditions to the GPRF64Pair-related changes to `LowerOperation`
  and `ReplaceNodeResults` which match when BITCAST for the relevant
  types should be handled in a custom manner.

- This also allows `R` to be used for `f64` types on `rv32_zdinx`
  architectures, where doubles are stored in a GPR pair.

@kito-cheng kito-cheng left a comment

LGTM

lenary added a commit to llvm/llvm-project that referenced this pull request Nov 18, 2024
This patch adds support for getting even-odd general purpose register
pairs into and out of inline assembly using the `R` constraint as
proposed in riscv-non-isa/riscv-c-api-doc#92

There are a few different pieces to this patch, each of which needs its
own explanation.

- Renames the Register Class used for f64 values on rv32i_zdinx from
  `GPRPair*` to `GPRF64Pair*`. These register classes are kept broadly
  unmodified, as their primary value type is used for type inference
  over selection patterns. This rename affects quite a lot of files.

- Adds new `GPRPair*` register classes which will be used for `R`
  constraints and for instructions that need an even-odd GPR pair. This
  new type is used for `amocas.d.*`(rv32) and `amocas.q.*`(rv64) in
  Zacas, instead of the `GPRF64Pair` class being used before.

- Marks the new `GPRPair` class as legal for holding an `MVT::Untyped`.
  Two new RISCVISD node types are added for creating and destructing a
  pair - `BuildGPRPair` and `SplitGPRPair`, and are introduced when
  bitcasting to/from the pair type and `untyped`.

- Adds functionality to `splitValueIntoRegisterParts` and
  `joinRegisterPartsIntoValue` to handle changing `i<2*xlen>` MVTs into
  `untyped` pairs.

- Adds an override for `getNumRegisters` to ensure that `i<2*xlen>`
  values, when going to/from inline assembly, only allocate one (pair)
  register (they would otherwise allocate two). This is due to a bug in
  SelectionDAGBuilder.cpp which other backends also work around.

- Ensures that Clang understands that `R` is a valid inline assembly
  constraint.

- This also allows `R` to be used for `f64` types on `rv32_zdinx`
  architectures, where doubles are stored in a GPR pair.

cmuellner commented Nov 21, 2024

Kito just mentioned in the Toolchain SIG that he has a PoC implementation of this for LLVM and GCC.

@kito-cheng
Collaborator

Just to let you know the reason why I haven't sent the GCC patch yet: our servers are being relocated and I forgot to push it to GitHub, so I need to wait a few more days until the server is up again...

@kito-cheng
Collaborator

We have both GCC and LLVM implementations, so I'm going to merge this PR :)

GCC Part: https://patchwork.sourceware.org/project/gcc/list/?series=41826, this will be merged once CI passes.
LLVM Part: llvm/llvm-project#112983, llvm/llvm-project#112561

@kito-cheng kito-cheng merged commit 6c42a1f into riscv-non-isa:main Dec 12, 2024
@lenary lenary deleted the pr/inline-asm branch December 12, 2024 12:52