2011: E. Kmett uploads the semigroups package which uses the new name (<>). It’s easier to introduce a new thing than to change an old thing in a way that may break backwards-compatibility.
?: The Data.Semigroup module gets absorbed into Data.Monoid from base.
?: Semigroup becomes a superclass of Monoid.
This discussion.
Long story short: As far as intentions go, (<>) was always supposed to be identical to mappend, both semantically and operationally.
Not just similar, but identical in terms of type signature and denotation. That’s the big reason why I think a cleanup is valuable. Because null and foldr have different types, it’s pretty obvious when to use one versus the other. In Alternative, because some and many have different denotations, it should be obvious when to use one versus the other. With (<>) and mappend, they do the same thing and they have the same type—that’s the problem.
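To make the "same type, same denotation" point concrete, here is a minimal sketch. MaxInt is a made-up type used only for illustration, and the sketch assumes a recent base in which mappend defaults to (<>).

```haskell
-- Hypothetical wrapper type, introduced only for this illustration.
newtype MaxInt = MaxInt Int
  deriving (Eq, Show)

instance Semigroup MaxInt where
  MaxInt a <> MaxInt b = MaxInt (max a b)

-- mappend is deliberately not defined here; in recent versions of base
-- its default implementation is (<>).
instance Monoid MaxInt where
  mempty = MaxInt minBound

-- Both operators have the type  MaxInt -> MaxInt -> MaxInt  at this instance,
-- and both compute the same value.
viaSemigroup, viaMonoid :: MaxInt
viaSemigroup = MaxInt 3 <> MaxInt 7
viaMonoid = MaxInt 3 `mappend` MaxInt 7
```

Nothing in the types or the results tells a reader which of the two they were supposed to reach for.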
There’s one non-aesthetic reason to favor (<>) over mappend, and it’s so that your code can depend on a smaller interface/be used over more types. That’s a pretty good reason! But it’s potentially opposed by a reason to favor mappend over (<>), because mappend could be more operationally powerful if it’s written to take advantage of a more specific instance declaration. I hope we can resolve this by sweeping the opposing reason under the rug in the case of mappend; I’m more concerned about (>>).
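A small sketch of the "smaller interface" argument. twiceS and twiceM are hypothetical helpers written for this post; the only library fact relied on is that NonEmpty has a Semigroup instance but no Monoid instance.

```haskell
import Data.List.NonEmpty (NonEmpty (..))

-- Hypothetical helper that only asks for Semigroup.
twiceS :: Semigroup a => a -> a
twiceS x = x <> x

-- The same helper written against mappend, which forces a Monoid constraint.
twiceM :: Monoid a => a -> a
twiceM x = x `mappend` x

-- Accepted: NonEmpty is a Semigroup.
nonEmptyTwice :: NonEmpty Int
nonEmptyTwice = twiceS (1 :| [2])

-- Accepted: lists are Monoids.
listTwice :: [Int]
listTwice = twiceM [1, 2]

-- Rejected by the type checker if uncommented, because there is no
-- Monoid instance for NonEmpty:
-- bad = twiceM (1 :| [2])
```

So the (<>) version is usable at strictly more types, which is the non-aesthetic reason mentioned above.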
So my list would be:
the new class method is a variant, but not an exact duplicate (so it is clear when you should prefer one over the other), of an already existing class method (in the same class or an ancestor)
it does have operational advantages for certain instances (not just theoretically)
I’m saying we axe identical redundant methods, because they’re places for unexpected behavior to hide, they confuse learners, and they encourage a proliferation of wibbleA/wibbleM-style function pairs that use each version of the interface.
Non-identical redundant methods are okay because you use the one that is closest to what you want to do, and you potentially get the benefit of an implementation that is tailored to your use case. That benefit doesn’t exist with identical method pairs; they’re interchangeable, so how can you know which is tailored to your use case? (If we can define the use cases so that there is a way to know, then that’s different! I have no issue with the foldl'/foldl pair, for instance; we can explain when to use which.)
(Btw… I appreciate your sharp arguments and I’m not trying to be argumentative, but I find this is a good opportunity to figure out what criteria we have)
So your criteria for allowing redundant methods within a class hierarchy (as opposed to: a single class) would be that there is a clear operational difference across most instances?
So, hypothetically, if (>>) were 50% faster than (*>) in 80% of all known instances on Hackage, we would allow it?
Otherwise, why not make null a top-level function?
The only big difference I see between null and that hypothetical (>>) is…
This is indeed true.
(slightly OT: I wonder if something like ifcxt could help with that, but I guess it could potentially mess up optimization and inlining? Found via this SO question)
The general principle I’m stumping for here is that there needs to be some sort of difference between the methods, such that we can tell people which to use in what circumstances. Denotational, operational, distinct type… anything that creates two natural niches for the two methods to occupy.
If the difference is operational, it doesn’t even have to apply across most instances; just enough of them for the advice of ‘try to use this one just in case’ to be valid. (I don’t know exactly what fraction of instances that nets out to.) I also don’t think it should be a point-in-time thing. Imagine that null didn’t yet exist on Foldable; at time of creation, it would be faster than its default implementation on 0% of instances! But obviously I can imagine an ideal end state in which several Foldable instances can offer an O(1) null, while the default implementation remains O(n), and that ideal is what I’d want to use to make the call.
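As a rough illustration of the O(1) versus O(n) point, here is a sketch with a snoc-list. Snoc is a hypothetical type chosen because its foldr has to walk the whole structure before it can return anything; nullViaFoldr is my approximation of what a foldr-only null looks like, not a quote of the library source.

```haskell
import Data.Foldable (Foldable (..))

-- Hypothetical snoc-list: elements are appended at the right end.
data Snoc a = Nil | Snoc (Snoc a) a

instance Foldable Snoc where
  -- foldr must recurse to the leftmost element before it can finish.
  foldr _ z Nil = z
  foldr f z (Snoc xs x) = foldr f (f x z) xs

  -- Overridden method: a single pattern match, independent of size.
  null Nil = True
  null _ = False

-- A null written only in terms of foldr (an approximation of a default).
-- On Snoc this traverses every constructor, so it is O(n).
nullViaFoldr :: Foldable t => t a -> Bool
nullViaFoldr = foldr (\_ _ -> False) True
```

Both functions give the same answer; only the overridden method can give it without a traversal, and that is the kind of end state I have in mind.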
(And this is the same principle I’d use for two methods in a single class, if there was some reason to do that—no ‘as opposed to’ about it! There needs to be a reason to choose one over the other, and a practical advantage to making the doppelganger method overridable. The strict and lazy fold pairs on Foldable are an example of identically-typed methods on one class meeting these criteria.)
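For the strict and lazy fold pair, a minimal sketch. sumLazy and sumStrict are names I made up; foldl and foldl' are the real Foldable methods.

```haskell
import Data.List (foldl')

-- Identical types, identical results on ordinary inputs.
sumLazy, sumStrict :: [Int] -> Int

-- foldl may build a chain of unevaluated additions before forcing it.
sumLazy = foldl (+) 0

-- foldl' forces the accumulator at every step, so it runs in constant space here.
sumStrict = foldl' (+) 0
```

The advice that goes with this pair is easy to state (reach for foldl' when the accumulator is a strict value such as a number), which is exactly what the identical (<>)/mappend pair lacks.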
If it were faster for a ‘good reason’, yeah? If it just happens to be faster because a bunch of instances haven’t bothered to make (*>) use the same, fast implementation, then I don’t want that to count—as above, it’s the ideal end state thing again! But if it’s faster because those instances can’t do that, and tactics like using a newtype wrapper to select the better implementation aren’t feasible to employ, then yeah, I think that difference would be plenty sufficient to convince me to allow it.
(The issue with (>>) versus (*>) is primarily space consumption, not speed per se, but that doesn’t meaningfully change my answer.)
What about the other two differences I mentioned? People are surprised to learn that (*>) can behave differently from (>>), but nobody is surprised to learn that null behaves differently from foldr, right? People can get confused about whether they should use (*>) or (>>), but if they want to know if a container is empty and they’re presented with null and foldr, it’s a no-brainer, right?
Another way to frame the difference between (>>)/(*>) and foldr/null:
if we carry out the “Monad with no (>>)” change, and if it turns out that for most types (>>) is faster than (*>) (but more constrained), this difference can still be addressed by forking all of the types and redefining their (*>) as (>>), accepting the extra constraints, without changing consumers of the Applicative and Monad classes (i.e., functions with an Applicative or Monad constraint).
if we carried out a hypothetical “Foldable with no null” change, we must rewrite all consumers of the Foldable class that use null with the expectation that it is constant-time for certain common types. Constant-time null is no longer possible with only foldr, so we would have to add a HasNull constraint everywhere to enable constant-time null implementations.
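A sketch of what the second scenario could look like. Only the name HasNull comes from the point above; the method name hasNull, the list instance, and the describe function are assumptions made up for illustration.

```haskell
-- Hypothetical class that would have to exist if null were not a Foldable method.
class HasNull t where
  hasNull :: t a -> Bool

-- Example instance: a constant-time emptiness check for lists.
instance HasNull [] where
  hasNull [] = True
  hasNull _ = False

-- A hypothetical consumer. With null as a Foldable method, the constraint
-- would be just  Foldable t ; without it, every such signature has to change.
describe :: (Foldable t, HasNull t) => t a -> String
describe xs
  | hasNull xs = "empty"
  | otherwise = "has " ++ show (length xs) ++ " element(s)"
```

The extra constraint leaks into every consumer's signature, whereas in the (>>)/(*>) scenario the fix stays inside the instances.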
We have the “decision problem” for many other cases: strict vs lazy Text, ByteString vs ShortByteString, FilePath vs OsPath, foldl vs foldl', etc. These mostly differ operationally.
The question is really: can we document properly why and when to implement/use a non-canonical mappend? At the moment we were only able to come up with pathological examples. Does that mean we’ve exhausted the design space in this thread? Maybe not.
“Odd” APIs are everywhere, but they need a good excuse. It dawns on me that we don’t have a good excuse.
The reason I’m so invested in finding a good excuse is not because I think there is one, but because I’m very tired of typeclass redesigns. They’re expensive and very hard to revert. So we better be sure we didn’t miss anything.
Exactly, that’s what I meant by “order of evaluation”, although I put it sloppily.
If we have such a problem in this case as well, then we have to go for the “odd API”.