Updates to Casting.h

Why

Casting.h is a critical piece of infrastructure, but its customization hooks are currently undocumented and inflexible, and as a result the old infra really only works well for base-to-derived pointer conversions. This leads to projects like MLIR having to reinvent the wheel any time they want to support slightly different casting behavior. The old infra wasn’t designed to support all the use cases we expect of it, and anything else evolved piecemeal. Since casting is core to a lot of code, we should update its abstractions/interfaces with a design that reflects its current uses but leaves room for future growth.

What do we have now?

Casting.h works through 3 structs - isa_impl, cast_retty, and cast_convert_val. They’re fairly inflexible and not really documented, making it difficult to customize behavior (as the MLIR folks have seen).

An instructive example here is the MLIR Attribute:

template <typename U> bool isa() const;
template <typename First, typename Second, typename... Rest>
bool isa() const;
template <typename First, typename... Rest>
bool isa_and_nonnull() const;
template <typename U> U dyn_cast() const;
template <typename U> U dyn_cast_or_null() const;
template <typename U> U cast() const;

Another instructive example that has to use non-standard APIs to do casting is llvm::PointerUnion. It provides equivalents of isa, cast, and dyn_cast, but they are named is, get, and dyn_cast, so only dyn_cast keeps the standard name, which adds to the confusion:

  /// Test if the Union currently holds the type matching T.
  template <typename T> bool is() const {
    return this->Val.getInt() == FirstIndexOfType<T, PTs...>::value;
  }
  
  /// Returns the value of the specified pointer type.
  ///
  /// If the specified pointer type is incorrect, assert.
  template <typename T> T get() const {
    assert(is<T>() && "Invalid accessor called");
    return PointerLikeTypeTraits<T>::getFromVoidPointer(this->Val.getPointer());
  }
  
  /// Returns the current pointer if it is of the specified pointer type,
  /// otherwise returns null.
  template <typename T> T dyn_cast() const {
    if (is<T>())
      return get<T>();
    return T();
  }

Since Casting.h doesn’t support value-to-value casting, these structs have to re-implement the whole infrastructure with class methods. This makes it difficult to re-use infrastructure and decreases modularity and uniformity across the LLVM monorepo. Personally, one of the things I like most about LLVM is that I can drop in pretty much anywhere and the code looks largely similar - it makes it very easy to start to grok things that I have never looked at before.

A brave new world

This patch redesigns the interface from a user’s perspective. It expands the expressive capability of the casting utilities by introducing new, documented abstractions that can be used to support more flexible/varied forms of casting. Current cases are supported by default, of course. This is enabled by a new struct, CastInfo, shown here with details elided.

template <typename To, typename From, typename Enable = void>
struct CastInfo {
  static inline bool isPossible(From f);
  static inline CastReturnType doCast(From f);
  static inline CastReturnType castFailed();
  static inline CastReturnType doCastIfPossible(From f);
};

CastInfo is the main entry point to the cast functionality in this patch, and as you can see it’s a little more flexible than cast_convert_val. To begin, isPossible is functionally isa_impl, and because it can be overridden from CastInfo, you can specialize just one struct if you want to customize casting behavior. Next, we have doCast, which provides the implementation of cast<T>. Functionally, it’s cast_convert_val, and in fact the default implementation forwards to cast_convert_val for backwards compatibility. This might be updated in the future, but once it’s hidden behind an interface, we can change it without affecting users! Finally, we can group castFailed and doCastIfPossible; together these two provide the functionality for dyn_cast and friends. doCastIfPossible is separated from doCast because in some cases (e.g. MLIR Dialect Interfaces) (a) the lookup required for the isPossible step is non-trivial and (b) the lookup will return the exact result of castFailed on failure. In these cases we currently rely on inlining to avoid multiple lookups, but with the new interface this is no longer a concern - just implement doCastIfPossible.
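To make the division of labor between the four hooks concrete, here is a minimal, self-contained sketch. It is not the code from the patch: the castinfo_sketch namespace, the Shape/Circle/Square hierarchy, and the pointer-only partial specialization are assumptions made for illustration, and the real CastInfo handles many more cases.

```cpp
#include <cassert>

namespace castinfo_sketch {

// Toy hierarchy (hypothetical) that uses the usual classof() convention.
struct Shape {
  enum Kind { K_Circle, K_Square };
  explicit Shape(Kind k) : kind(k) {}
  Kind kind;
};
struct Circle : Shape {
  Circle() : Shape(K_Circle) {}
  static bool classof(const Shape *s) { return s->kind == K_Circle; }
};
struct Square : Shape {
  Square() : Shape(K_Square) {}
  static bool classof(const Shape *s) { return s->kind == K_Square; }
};

// Simplified stand-in for CastInfo; the primary template is left undefined
// in this sketch and only pointer sources are handled.
template <typename To, typename From, typename Enable = void>
struct CastInfo;

template <typename To, typename From> struct CastInfo<To, From *> {
  using CastReturnType = To *;
  // The isa<> question: can this cast succeed?
  static bool isPossible(From *f) { return To::classof(f); }
  // The cast<> implementation; only valid when isPossible(f) is true.
  static CastReturnType doCast(From *f) { return static_cast<To *>(f); }
  // What a failed dyn_cast<> returns.
  static CastReturnType castFailed() { return nullptr; }
  // The dyn_cast<> implementation; a specialization with an expensive
  // lookup could override this to perform the lookup only once.
  static CastReturnType doCastIfPossible(From *f) {
    return isPossible(f) ? doCast(f) : castFailed();
  }
};

// User-facing functions that forward to the hooks above.
template <typename To, typename From> bool isa(From *f) {
  return CastInfo<To, From *>::isPossible(f);
}
template <typename To, typename From>
typename CastInfo<To, From *>::CastReturnType cast(From *f) {
  assert(isa<To>(f) && "cast<To>() argument of incompatible type");
  return CastInfo<To, From *>::doCast(f);
}
template <typename To, typename From>
typename CastInfo<To, From *>::CastReturnType dyn_cast(From *f) {
  return CastInfo<To, From *>::doCastIfPossible(f);
}

} // namespace castinfo_sketch
```

In this sketch, isa, cast, and dyn_cast never touch the class hierarchy directly, so changing how a cast is performed for some type only requires another specialization of the one struct.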

This can of course create a lot of boilerplate for simple use cases, and so this patch also introduces Cast Traits. Cast Traits provide one or more of the methods of CastInfo and can be used to implement a specialization of CastInfo by inheriting from them. As an example, we can use the MLIR Operation:

namespace llvm {
/// Cast from an (const) Operation * to a derived operation type.
template <typename T>
struct CastInfo<T, ::mlir::Operation *>
    : public ValueFromPointerCast<T, ::mlir::Operation,
                                  CastInfo<T, ::mlir::Operation *>> {
  static bool isPossible(::mlir::Operation *op) { return T::classof(op); }
};
template <typename T>
struct CastInfo<T, const ::mlir::Operation *>
    : public ConstStrippingForwardingCast<T, const ::mlir::Operation *,
                                          CastInfo<T, ::mlir::Operation *>> {};

/// Cast from an (const) Operation & to a derived operation type.
template <typename T>
struct CastInfo<T, ::mlir::Operation>
    : public NullableValueCastFailed<T>,
      public DefaultDoCastIfPossible<T, ::mlir::Operation &,
                                     CastInfo<T, ::mlir::Operation>> {
  static bool isPossible(::mlir::Operation &val) { return T::classof(&val); }
  static T doCast(::mlir::Operation &val) { return T(&val); }
};
template <typename T>
struct CastInfo<T, const ::mlir::Operation>
    : public ConstStrippingForwardingCast<T, const ::mlir::Operation,
                                          CastInfo<T, ::mlir::Operation>> {};
} // namespace llvm

Here we override a few of the hooks manually, but choose to use already-provided Cast Traits for expressive and compact specialization. The idea of the Cast Traits is that many common casting cases that are not supported by the old infrastructure should be possible (e.g. returning Optional from a dyn_cast) with the ones provided already, but also that adding new ones as new use cases arise should also be very simple.
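The way traits compose can be sketched without any LLVM headers. The example below is an illustration only: the trait names EmptyOptionalOnFailure and IfPossibleFromHooks, the Value and IntValue types, and the traits_sketch namespace are all hypothetical, and std::optional (C++17) stands in for Optional. It shows a dyn_cast on a value type that returns an optional, assembled from two small traits plus two hand-written hooks.

```cpp
#include <cassert>
#include <optional>

namespace traits_sketch {

// Hypothetical trait: a failed cast yields an empty optional.
template <typename To> struct EmptyOptionalOnFailure {
  static std::optional<To> castFailed() { return std::nullopt; }
};

// Hypothetical trait: builds doCastIfPossible() out of the other hooks,
// which it finds on Derived (CRTP).
template <typename To, typename From, typename Derived>
struct IfPossibleFromHooks {
  static auto doCastIfPossible(const From &f) {
    return Derived::isPossible(f) ? Derived::doCast(f) : Derived::castFailed();
  }
};

// A type-erased value and one concrete view of it (both hypothetical).
struct Value {
  int tag;     // 0 means "holds an int" in this sketch
  int payload;
};
struct IntValue {
  int payload;
  static bool classof(const Value &v) { return v.tag == 0; }
};

// Simplified stand-in for CastInfo; the primary template is undefined here.
template <typename To, typename From, typename Enable = void>
struct CastInfo;

// Two hooks come from traits, two are written by hand.
template <typename To>
struct CastInfo<To, Value>
    : EmptyOptionalOnFailure<To>,
      IfPossibleFromHooks<To, Value, CastInfo<To, Value>> {
  static bool isPossible(const Value &v) { return To::classof(v); }
  static std::optional<To> doCast(const Value &v) { return To{v.payload}; }
};

template <typename To, typename From> auto dyn_cast(const From &f) {
  return CastInfo<To, From>::doCastIfPossible(f);
}

} // namespace traits_sketch
```

Swapping EmptyOptionalOnFailure for a trait that returns a null-constructed value would change what a failed dyn_cast yields without touching the other hooks, which is the kind of reuse the traits are meant to enable.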

There is one Cast Trait that is a little special called CastIsPossible. The reason it’s special is because many projects (LLVM/Clang) specialize isa_impl for a variety of reasons, and we want to be able to hide those specializations behind a documented interface rather than using an implementation detail like isa_impl. The way that this trait is special is that the default CastInfo inherits from CastIsPossible, so just overriding CastIsPossible allows you to use a documented API rather than implementation details to provide special isa support for your types.

I mentioned value-to-value casting earlier, but the types in those examples are all constructible from nullptr. Sometimes we have value types where we’d like dyn_cast<ValueType> to return an Optional<ValueType>. Not everybody will want this, but it should at least be possible. Because of this, we also need to be able to dyn_cast_or_null on an Optional<T>. or_null doesn’t make a ton of sense when you’re not talking about a pointer, and furthermore, other use cases may have different definitions of what ‘null’ means for the case of dyn_cast_or_null. For this reason, this patch introduces dyn_cast_if_present and friends, and dyn_cast_or_null now forwards to that (it is to be deprecated in favor of the if_present variant). Again, like with the Cast Traits, we’d like to give the user a way to define what ‘present’ means for their type, so we provide the struct ValueIsPresent (details elided below):

template <typename T, typename Enable = void> struct ValueIsPresent {
  static inline bool isPresent(const T &t);
  static inline decltype(auto) unwrapValue(T &t);
};

This allows us to define what being ‘present’ means for our type as we wish, as well as how to ‘unwrap’ it for the sake of performing a cast. This shouldn’t need to be specialized often, because this patch provides specializations for Optional<T> and for any value that is nullable. Nullability is tested like this:

template <typename T>
constexpr bool IsNullable = std::is_pointer<T>::value ||
                            std::is_constructible<T, std::nullptr_t>::value;

Basically, if a value is ‘nullable’ then isPresent compares it with nullptr and unwrapValue just forwards its argument. Of course this struct can be specialized to provide custom behavior, and it is a documented, public API.
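A rough, self-contained sketch of how these pieces could fit together follows. It is an approximation under stated assumptions rather than the code from the patch: std::optional (C++17) stands in for Optional<T>, the default case treating every value as present is an assumption, and the present_sketch namespace and the free isPresent helper are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <type_traits>

namespace present_sketch {

template <typename T>
constexpr bool IsNullable = std::is_pointer<T>::value ||
                            std::is_constructible<T, std::nullptr_t>::value;

// Default (assumed for this sketch): every value counts as present and is
// unwrapped to itself.
template <typename T, typename Enable = void> struct ValueIsPresent {
  static bool isPresent(const T &) { return true; }
  static T &unwrapValue(T &t) { return t; }
};

// Optional values: present when engaged, unwrapped to the contained value.
template <typename T> struct ValueIsPresent<std::optional<T>> {
  static bool isPresent(const std::optional<T> &t) { return t.has_value(); }
  static T &unwrapValue(std::optional<T> &t) { return *t; }
};

// Nullable values: present when not equal to nullptr, unwrapped unchanged.
template <typename T>
struct ValueIsPresent<T, std::enable_if_t<IsNullable<T>>> {
  static bool isPresent(const T &t) { return t != nullptr; }
  static T &unwrapValue(T &t) { return t; }
};

// Hypothetical convenience wrapper that deduces T.
template <typename T> bool isPresent(const T &t) {
  return ValueIsPresent<T>::isPresent(t);
}

} // namespace present_sketch
```

A dyn_cast_if_present built on top of this would first ask isPresent, return the failure value if the answer is no, and otherwise run the normal dyn_cast on the result of unwrapValue.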

TL;DR: A new public API for specializing how we do isa/cast/dyn_cast called CastInfo is in this patch.

Upcoming Changes

Large swathes of MLIR could be converted to use the common casting machinery. These conversions will be large NFC changes that we’ll stage in as appropriate. Specifically, this means updating Attributes, Types, and Dialects at least; Operations are already covered by a patch.

We also want to make isa_impl, cast_retty, and cast_convert_val into implementation details, so there’ll be some changes coming to switch users of isa_impl over to specializing CastIsPossible and to move isa_impl into a detail namespace. Again, this change will be NFC. Users of cast_retty and cast_convert_val should just specialize CastInfo directly; a later change will switch those users over to the new public API and move cast_retty/cast_convert_val into a detail namespace.

I would love opinions/reviews on the patch, I find the LLVM RTTI infrastructure incredibly useful…until it’s not, and I’d like to make it useful in more cases so more folks can use it!

Thank you for working on this, this is a hugely important rock in the LLVM universe. I’m glad to see it improved,

-Chris

Does this impact compile time (if used for dyn_cast/cast/isa) or is it all folded away nicely?
Is there any change required if one wants to use the “old” dyn_cast/cast/isa behavior?

I didn’t notice anything. That said, is there a benchmark suite I can run to check this? It should be folded away; for the most part I’m not doing anything fancier than what already exists.

If you don’t want to customize the way your cast is performed (so pointer-to-pointer with the normal static bool classof() thing), you don’t have to change anything.

If you’re a current user of isa_impl you’ll have to switch to CastIsPossible. I’ll go through and do this for the monorepo in a series of NFC patches, then put isa_impl into a detail namespace. For cast_convert_val or cast_retty you’d have to switch to CastInfo. Same procedure here, I’ll switch stuff over with a find/replace and then add cast_convert_val and cast_retty to a detail namespace.


Huge +1 from me. I have battled with the casting stuff for years trying to get it to work for us (MLIR). Really happy to see this, I think this is a really great step forward.

– River

Ran the patch through llvm-compile-time-tracker, looks fine
