Description
Here is YAAFSP: yet another anonymous function syntax proposal. I'm opening this as an issue rather than a PR because I have no idea how to start implementing it, but if someone has some tips to get me started I could try.
It might be nice to not have to give explicit names to the arguments to anonymous functions. This is simply because, very often, these names will never get used again.
Consider:
x -> x + 1
The variable x here is only used once, so there is no need to give it an explicit name. Moreover, if we had a list of the "default" argument names, we could syntactically figure out how many arguments there are, just by looking at the surface-level expression (assuming the last argument actually gets used at some point).
Suppose that we will always call the first argument _ (one underscore), the second argument __ (two underscores), the third argument ___ (three underscores), etc. These names are somewhat arbitrary. Perhaps it is difficult to distinguish the number of underscores, so something like _1, _2, _3 might work too; proposals welcome.
Then, by looking at the body of the anonymous function, we could automatically figure out how many arguments there are:
- _ + 1 has one argument, because a single underscore appears.
- _ + __ has two arguments, because both a single and a double underscore appear.
- 1 + __ has two arguments. The first, a single underscore, isn't used, but the second, a double underscore, is used.
- abs has no arguments written. This is a bit of a corner case. We can assume that the expression itself returns a function (in this case, abs). Alternatively, you might think that it means something like () -> abs, which is perhaps a bit more consistent, but maybe not as useful.
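The counting rule above can be sketched in today's Julia by walking a quoted expression. This is only an illustration: `underscore_count` and `placeholder_arity` are hypothetical names, and the sketch assumes that any symbol made up entirely of underscores counts as a placeholder.

```julia
# Number of underscores if `s` consists only of underscores, otherwise 0.
# (Hypothetical helper, not part of Base.)
function underscore_count(s::Symbol)
    str = string(s)
    return all(==('_'), str) ? length(str) : 0
end

# Inferred number of arguments: the largest placeholder index found in the expression.
placeholder_arity(ex) = 0                          # literals and other leaves
placeholder_arity(s::Symbol) = underscore_count(s)
placeholder_arity(ex::Expr) = mapreduce(placeholder_arity, max, ex.args; init = 0)

placeholder_arity(:(_ + 1))   # 1
placeholder_arity(:(_ + __))  # 2
placeholder_arity(:(1 + __))  # 2, even though the first argument is unused
placeholder_arity(:abs)       # 0, the "no arguments written" corner case
```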
This is great, but we still need a marker to determine where the boundary of the anonymous function is. Instead of some sort of arbitrary rule, I'm proposing an explicit marker. In particular, I'm proposing "unary" |> (again, this symbol is somewhat arbitrary, so other proposals welcome).
So
- |> _ + 1 gets lowered to something like x -> x + 1.
- |> _ + __ gets lowered to something like (x, y) -> x + y.
- |> 1 + __ gets lowered to something like (x, y) -> 1 + y.
- |> abs gets lowered to abs.
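Since unary |> does not parse today, one way to experiment with this lowering is a macro that plays the role of the marker. The following is a minimal sketch, not a proposed implementation: the macro name `@fn` and the helpers `underscore_count` and `substitute_placeholders` are made up for illustration, and keyword or slurped arguments are not handled.

```julia
# Number of underscores if `s` consists only of underscores, otherwise 0.
function underscore_count(s::Symbol)
    str = string(s)
    return all(==('_'), str) ? length(str) : 0
end

# Replace each placeholder with a fresh argument name, recording which
# positions were used in `used` (position => generated name).
function substitute_placeholders(used::Dict{Int,Symbol}, ex)
    if ex isa Symbol
        n = underscore_count(ex)
        return n == 0 ? ex : get!(() -> gensym("arg"), used, n)
    elseif ex isa Expr
        return Expr(ex.head, Any[substitute_placeholders(used, a) for a in ex.args]...)
    else
        return ex
    end
end

# Stand-in for the proposed unary |>.
macro fn(body)
    used = Dict{Int,Symbol}()
    new_body = substitute_placeholders(used, body)
    # No placeholders: return the expression itself, e.g. `@fn abs` is `abs`.
    isempty(used) && return esc(body)
    # Positions up to the highest one used become arguments; unused ones get dummy names.
    args = [get(() -> gensym("unused"), used, i) for i in 1:maximum(keys(used))]
    return esc(Expr(:->, Expr(:tuple, args...), new_body))
end

inc = @fn _ + 1        # like x -> x + 1
add = @fn _ + __       # like (x, y) -> x + y
second = @fn 1 + __    # like (x, y) -> 1 + y
same = @fn abs         # just abs

map(@fn(_ + 1), [1, 2, 3])  # [2, 3, 4]
```

With a macro the boundary is marked by the macro call itself, so the precedence question for a real unary operator does not come up here.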
The cool thing about unary |> is that we can extend it for use in chaining, that is, "binary" |>.
So 1 |> _ + 1 would get lowered to something like 1 |> (x -> x + 1).
Existing chains would continue to work. This is because of the "corner case" above of no arguments. In 1 |> abs, abs would remain unchanged.
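The chaining behaviour can be tried out today with a macro that rewrites the right-hand side of each binary |>. This is a rough sketch under simplifying assumptions: the name `@chain_sketch` and its helpers are hypothetical, and only the single underscore is treated as a placeholder, since binary |> passes exactly one value.

```julia
# True if the placeholder `_` occurs anywhere in `ex`.
has_placeholder(ex) = ex === :_ || (ex isa Expr && any(has_placeholder, ex.args))

# Replace every `_` in `ex` with the symbol `name`.
function fill_placeholder(ex, name::Symbol)
    ex === :_ && return name
    ex isa Expr || return ex
    return Expr(ex.head, Any[fill_placeholder(a, name) for a in ex.args]...)
end

# Rewrite `lhs |> rhs` so that an `rhs` containing `_` becomes `x -> rhs`.
function rewrite_chain(ex)
    if ex isa Expr && ex.head === :call && length(ex.args) == 3 && ex.args[1] === Symbol("|>")
        lhs = rewrite_chain(ex.args[2])   # |> is left-associative, so recurse on the left
        rhs = ex.args[3]
        if has_placeholder(rhs)
            x = gensym("x")
            rhs = Expr(:->, x, fill_placeholder(rhs, x))
        end
        return Expr(:call, Symbol("|>"), lhs, rhs)
    end
    return ex
end

macro chain_sketch(ex)
    return esc(rewrite_chain(ex))
end

@chain_sketch(1 |> _ + 1)   # same as 1 |> (x -> x + 1)
@chain_sketch(-2 |> abs)    # no underscore, so the chain is left unchanged
```

This relies on + binding more tightly than |> in Julia's parser, so 1 |> _ + 1 is already parsed as 1 |> (_ + 1) before the macro sees it.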
So pros and cons:
Pros:
- Relatively non-breaking. Unary |> currently errors. Binary |> with underscores throws errors due to underscores being banned as r-values, and if there are no underscores, nothing changes anyway.
- There is no need for arbitrary rules to figure out the "boundaries" of anonymous functions. This comes with a con below.
- Fairly powerful, and can do a lot that existing anonymous function syntax can do. See the cons about silent arguments, keyword arguments, and slurping below though.
- Similar to (and inspired by) how standalone queries work in the queryverse, so at least some users might use similar syntax already. https://www.queryverse.org/Query.jl/stable/standalonequerycommands/
- No need to continually add arbitrary curried methods for functions (side-note: can we please stop doing that).
- Works very nicely with chaining, and can greatly reduce the amount of text needed.
x |> (x -> x + 1) |> (x -> x - 1)
will be much shorter:
x |> _ + 1 |> _ - 1
Cons:
- Because we are explicitly marking the boundary of the anonymous function, this is a fairly verbose option. In most cases, it probably won't significantly reduce the amount of text needed. Even if it doesn't reduce the number of characters much, it may reduce the mental work needed to understand what a symbol means, because the arguments always have the same names and always mean the same thing. It's also a clear winner in brevity when used with chaining; see the pro above.
- Doesn't really work well if you want to write an anonymous function where you don't use the last argument. That is because we can't guess how many arguments there are if you don't actually use the last argument.
- Doesn't really work with slurped or keyword arguments (although perhaps there are clever ways to support these too).
- We might have to think about what the order of operations rules for unary |> should be. Ideally, something like the following would work without any additional parentheses:
map(|> _ + 1, X)
Notes:
- There might be an argument for automatically inlining functions defined this way. Certainly it would make sense in the context of chaining, where the little anonymous functions can never be used again anyway.
- People complain that "chaining is confusing" but I think that might be because they haven't practiced much, or didn't come to Julia via working with tabular data. Pretty much any software for tabular data implicitly or explicitly uses chaining (e.g. SQL, dplyr, LINQ, Query.jl, DataFramesMeta). And I'd hazard a guess that most of the growth in Julia users in future years will come from this area, as opposed to MATLAB/Fortran matrix people. I bet someone smarter than me could come up with a fancier explanation of why it's useful too (monoids? a linear graph of inputs?)