Discussion: various forms of generic programming?
raould
2010-02-09 00:38:26 UTC
hello

the term generic seems to be over-used, and i'm not sure i grok all
the uses and/or how they overlap. but anyway, i'm attempting to learn
more about the subject. some things i've read [1] make it sound to me
like c++ is the most complete language when it comes to generic
approaches. of course that makes me sad. (maybe d does well, i dunno.)
supposedly scala fares better than most, but still falls at some
point. any thoughts? i'm looking to learn about such things in a
concrete real world language rather than e.g. an extension to ghc.

gracias.

[1] http://repository.tamu.edu/bitstream/handle/1969.1/5008/etd-tamu-2006C-CPSC-Nguessan.pdf
Michael Ekstrand
2010-02-09 01:30:32 UTC
Post by raould
the term generic seems to be over-used, and i'm not sure i grok all
the uses and/or how they overlap. but anyway, i'm attempting to learn
more about the subject. some things i've read [1] make it sound to me
like c++ is the most complete language when it comes to generic
approaches. of course that makes me sad. (maybe d does well, i dunno.)
supposedly scala fares better than most, but still falls at some
point. any thoughts? i'm looking to learn about such things in a
concrete real world language rather than e.g. an extension to ghc.
C++ does have some very powerful generic programming features. I find,
however, that I can do the vast majority of useful generic programming
things that I might do with C++ templates by using functors and
higher-order functions in OCaml (or Standard ML).

Functors allow modules to be parameterized over other modules,
frequently data types and associated operations. Practically, this
allows you to write code that can operate over any structure/backend
that meets particular requirements, and the backend can be switched by
simply instantiating the functor over a different module. Further, you
get very strong typing guarantees, and type errors are far easier to
debug than C++ template error messages.

The OCamlGraph library is a very good example of how ML-style generic
programming can be implemented and used. It provides a variety of
graph-related algorithms, each functorized over a module interface
exposing the graph functionality it needs. It also provides a variety
of graph implementations. You can then assemble the pieces you
need. For example:

(* let's use persistent directed graphs *)
module G = Graph.Persistent.Digraph.Concrete(VertexModule)

(* everything has weight 1 *)
module W = struct
  type label = unit
  type t = int
  let weight () = 1
  let compare = Pervasives.compare
  let add = (+)
  let zero = 0
end

(* instantiate Dijkstra's algorithm with our graph and weight *)
module SP = Path.Dijkstra(G)(W)

Now we have access to Dijkstra's single-source shortest-path algorithm
over our choice of graph implementation. Many other things don't even
need the syntactic overhead of modules and functors - passing around
closures will suffice.

- Michael
raould
2010-02-10 18:07:39 UTC
Post by Michael Ekstrand
The OCamlGraph library is a very good example of how ML-style generic
programming can be implemented and used.
thanks for the note! graph libraries are what are used in the paper(s)
referenced, and ml and haskell and scala turn out to not be too bad
compared to java and c#.

sincerely.

(mls have been a favourite family of mine since sml in college, but i
have never had super great luck with ocaml.)
Keith H Duggar
2010-02-09 05:20:50 UTC
Post by raould
the term generic seems to be over-used, and i'm not sure i grok all
the uses and/or how they overlap. but anyway, i'm attempting to learn
more about the subject. some things i've read [1] make it sound to me
like c++ is the most complete language when it comes to generic
approaches. of course that makes me sad. (maybe d does well, i dunno.)
supposedly scala fares better than most, but still falls at some
point. any thoughts? i'm looking to learn about such things in a
concrete real world language rather than e.g. an extension to ghc.
You might find these useful

"A Comparative Study of Language Support for Generic Programming"
Garcia et al
http://www.osl.iu.edu/publications/prints/2003/comparing_generic_programming03.pdf

"An Extended Comparative Study of Language Support for Generic
Programming" Garcia et al
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.110.122&rep=rep1&type=pdf

KHD
raould
2010-02-10 18:08:24 UTC
Post by Keith H Duggar
You might find these useful
ja, i think some of those were precursors to the scala paper?
Ertugrul Söylemez
2010-02-09 05:43:29 UTC
Post by raould
the term generic seems to be over-used, and i'm not sure i grok all
the uses and/or how they overlap. but anyway, i'm attempting to learn
more about the subject. some things i've read [1] make it sound to me
like c++ is the most complete language when it comes to generic
approaches. of course that makes me sad. (maybe d does well, i dunno.)
supposedly scala fares better than most, but still falls at some
point. any thoughts? i'm looking to learn about such things in a
concrete real world language rather than e.g. an extension to ghc.
For most generic programming tasks the type system of standard Haskell
is totally sufficient. If you really need GHC extensions, you're
already doing something more sophisticated. Note that the type systems
of most other languages can't even express the most basic Haskell
constructs.

In fact, C++ isn't even a good choice when it comes to generic
programming. Besides its ugly syntax, it also lacks type inference,
which can be very annoying. Further, it has no support for anonymous or
even lexically scoped functions, not to mention closures.

Another important fact is that C++ has no contracts, contexts,
interfaces or whatever you call them, so you need dynamic casts, which
are bad. With contracts, a lot of problems would be caught at compile
time, but because of the lack of them, most type errors in C++ can only
be found at run time, and only if the programmer doesn't forget to
check for them.

Finally it's impossible to separate template interface and
implementation in a sane way. There is no module system of any kind.
You get header files and the classic dumb linker.

To see all of these problems in action at the same time, try to
implement Haskell's Functor class as a C++ template. It's about the
simplest example of useful generic programming, and C++ makes it a
nightmare.

No type inference means that you have to specify your container type all
the time. No anonymous functions, not even lexically scoped functions,
so your mapping function will be a named top-level function. No
contracts, so your fmap function doesn't even know whether your
container type is in fact a container type. It needs dynamic_cast, so
type errors are not caught at compile time. Furthermore, it's
impossible to extend existing container classes to support your new
Functor template, not to mention primitive types like arrays. You need
to subclass existing classes and make a completely new class for arrays.
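
For contrast, here is the Haskell side of that exercise - the class
itself is tiny, and any container can join in with a single instance
(a minimal sketch; Pair is just an illustrative container, not from
the post):

import Prelude hiding (Functor, fmap)

-- The Functor class, as in the Prelude:
class Functor f where
  fmap :: (a -> b) -> f a -> f b

-- A new container opts in with one instance declaration:
data Pair a = Pair a a deriving Show

instance Functor Pair where
  fmap f (Pair x y) = Pair (f x) (f y)

main :: IO ()
main = print (fmap (+ 1) (Pair 3 4))  -- Pair 4 5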


Greets
Ertugrul
--
nightmare = unsafePerformIO (getWrongWife >>= sex)
http://blog.ertes.de/
raould
2010-02-10 18:10:07 UTC
Post by Ertugrul Söylemez
In fact, C++ isn't even a good choice when it comes to generic
programming.
hrm, part of this might be the "which kind of generic are we talking
about" question? the generic was for STL-style generic, rather than
data-shape-style generic.

(yes, personally, i would rather stay well away from C++ in just about
any shape or form, if i have the option ;-)
Jon Harrop
2010-02-10 23:35:18 UTC
Post by Ertugrul Söylemez
For most generic programming tasks the type system of standard Haskell
is totally sufficient. If you really need GHC extensions, you're
already doing something more sophisticated. Note that the type systems
of most other languages can't even express the most basic Haskell
constructs.
In fact, C++ isn't even a good choice when it comes to generic
programming. Besides its ugly syntax, it also lacks type inference,
which can be very annoying. Further, it has no support for anonymous or
even lexically scoped functions, not to mention closures.
Another important fact is that C++ has no contracts, contexts,
interfaces or whatever you call them, so you need dynamic casts, which
are bad. With contracts, a lot of problems would be caught at compile
time, but because of the lack of them, most type errors in C++ can only
be found at run time, and only if the programmer doesn't forget to
check for them.
Finally it's impossible to separate template interface and
implementation in a sane way. There is no module system of any kind.
You get header files and the classic dumb linker.
To see all of these problems in action at the same time, try to
implement Haskell's Functor class as a C++ template. It's about the
simplest example of useful generic programming, and C++ makes it a
nightmare.
No type inference means that you have to specify your container type all
the time. No anonymous functions, not even lexically scoped functions,
so your mapping function will be a named top-level function. No
contracts, so your fmap function doesn't even know whether your
container type is in fact a container type. It needs dynamic_cast, so
type errors are not caught at compile time. Furthermore, it's
impossible to extend existing container classes to support your new
Functor template, not to mention primitive types like arrays. You need
to subclass existing classes and make a completely new class for arrays.
Interesting statements given Haskell's lack of higher-order modules and
comparatively poor type inference.
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u
Ertugrul Söylemez
2010-02-11 08:59:42 UTC
Post by Jon Harrop
Post by Ertugrul Söylemez
For most generic programming tasks the type system of standard
Haskell is totally sufficient. If you really need GHC extensions,
you're already doing something more sophisticated. Note that the
type systems of most other languages can't even express the most
basic Haskell constructs.
In fact, C++ isn't even a good choice when it comes to generic
programming. Besides its ugly syntax, it also lacks type inference,
which can be very annoying. Further, it has no support for anonymous
or even lexically scoped functions, not to mention closures.
Another important fact is that C++ has no contracts, contexts,
interfaces or whatever you call them, so you need dynamic casts,
which are bad. With contracts, a lot of problems would be caught at
compile time, but because of the lack of them, most type errors in
C++ can only be found at run time, and only if the programmer doesn't
forget to check for them.
Finally it's impossible to separate template interface and
implementation in a sane way. There is no module system of any
kind. You get header files and the classic dumb linker.
To see all of these problems in action at the same time, try to
implement Haskell's Functor class as a C++ template. It's about the
simplest example of useful generic programming, and C++ makes it a
nightmare.
No type inference means that you have to specify your container type
all the time. No anonymous functions, not even lexically scoped
functions, so your mapping function will be a named top-level
function. No contracts, so your fmap function doesn't even know
whether your container type is in fact a container type. It needs
dynamic_cast, so type errors are not caught at compile time.
Furthermore it's impossible to extend existing container classes to
support your new Functor template, not to mention primitive types
like arrays. You need to subclass existing classes and make a
completely new class for arrays.
Interesting statements given Haskell's lack of higher-order modules and
comparatively poor type inference.
Ah, it's Dr. Harrop again. I know that you love making things up, but
frankly I don't care. Firstly Haskell doesn't need higher order
modules, but you've always failed to use Haskell properly. Secondly
Haskell's type inference isn't poor, especially not compared to C++.


Greets
Ertugrul
--
nightmare = unsafePerformIO (getWrongWife >>= sex)
http://blog.ertes.de/
Jon Harrop
2010-02-11 15:52:08 UTC
Post by Ertugrul Söylemez
Post by Jon Harrop
Interesting statements given Haskell's lack of higher-order modules and
comparatively poor type inference.
Ah, it's Dr. Harrop again. I know that you love making things up, but
frankly I don't care. Firstly Haskell doesn't need higher order
modules, but you've always failed to use Haskell properly. Secondly
Haskell's type inference isn't poor, especially not compared to C++.
Haskell cannot even infer sum types. You actually have to define and
maintain them by hand.
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u
Ertugrul Söylemez
2010-02-11 15:22:09 UTC
Post by Jon Harrop
Post by Ertugrul Söylemez
Post by Jon Harrop
Interesting statements given Haskell's lack of higher-order modules
and comparatively poor type inference.
Ah, it's Dr. Harrop again. I know that you love making things up,
but frankly I don't care. Firstly Haskell doesn't need higher order
modules, but you've always failed to use Haskell properly. Secondly
Haskell's type inference isn't poor, especially not compared to C++.
Haskell cannot even infer sum types. You actually have to define and
maintain them by hand.
As I said, you're making things up again. Type inference is the process
of finding out the type of an expression.


Greets
Ertugrul
--
nightmare = unsafePerformIO (getWrongWife >>= sex)
http://blog.ertes.de/
Jon Harrop
2010-02-11 22:09:12 UTC
Post by Jon Harrop
Post by Ertugrul Söylemez
Post by Jon Harrop
Interesting statements given Haskell's lack of higher-order modules
and comparatively poor type inference.
Ah, it's Dr. Harrop again. I know that you love making things up,
but frankly I don't care. Firstly Haskell doesn't need higher order
modules, but you've always failed to use Haskell properly. Secondly
Haskell's type inference isn't poor, especially not compared to C++.
Haskell cannot even infer sum types. You actually have to define and
maintain them by hand.
As I said, you're making things up again. Type inference is the process of
finding out the type of an expression.
Try to translate the following simple OCaml expression into Haskell, for
example:

# `Add(`Mul(`Int 3, `Int 4), `Int 5);;
- : [> `Add of
[> `Mul of [> `Int of int ] * [> `Int of int ] ] * [> `Int of
int ] ]
= `Add (`Mul (`Int 3, `Int 4), `Int 5)

Note how OCaml infers the algebraic datatype.
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u
Ertugrul Söylemez
2010-02-11 21:59:59 UTC
Post by Jon Harrop
Post by Jon Harrop
Post by Ertugrul Söylemez
Post by Jon Harrop
Interesting statements given Haskell's lack of higher-order
modules and comparatively poor type inference.
Ah, it's Dr. Harrop again. I know that you love making things
up, but frankly I don't care. Firstly Haskell doesn't need
higher order modules, but you've always failed to use Haskell
properly. Secondly Haskell's type inference isn't poor,
especially not compared to C++.
Haskell cannot even infer sum types. You actually have to define
and maintain them by hand.
As I said, you're making things up again. Type inference is the
process of finding out the type of an expression.
Try to translate the following simple OCaml expression into Haskell,
# `Add(`Mul(`Int 3, `Int 4), `Int 5);;
- : [> `Add of
[> `Mul of [> `Int of int ] * [> `Int of int ] ] * [> `Int of
int ] ]
= `Add (`Mul (`Int 3, `Int 4), `Int 5)
Note how OCaml infers the algebraic datatype.
I'm sorry, but my knowledge of OCaml is too limited. I don't understand
your code. Could you please describe in words what it does?


Greets
Ertugrul
--
nightmare = unsafePerformIO (getWrongWife >>= sex)
http://blog.ertes.de/
Dirk Thierbach
2010-02-11 22:24:57 UTC
Post by Ertugrul Söylemez
Post by Jon Harrop
# `Add(`Mul(`Int 3, `Int 4), `Int 5);;
- : [> `Add of
[> `Mul of [> `Int of int ] * [> `Int of int ] ] * [> `Int of
int ] ]
= `Add (`Mul (`Int 3, `Int 4), `Int 5)
Note how OCaml infers the algebraic datatype.
I'm sorry, but my knowledge of OCaml is too limited. I don't understand
your code. Could you please describe in words what it does?
These are polymorphic variants. In Haskell, one can implement them
with a straightforward extension of the typeclass system. See
"A Polymorphic Type System for Extensible Records and Variants" by
Gaster and Jones, 1996. GHC doesn't implement this; Hugs implements at
least the extensible records (Trex). I'm not sure about the variants.

- Dirk
Jon Harrop
2010-02-12 00:01:41 UTC
Post by Dirk Thierbach
Post by Ertugrul Söylemez
Post by Jon Harrop
# `Add(`Mul(`Int 3, `Int 4), `Int 5);;
- : [> `Add of
[> `Mul of [> `Int of int ] * [> `Int of int ] ] * [> `Int of
int ] ]
= `Add (`Mul (`Int 3, `Int 4), `Int 5)
Note how OCaml infers the algebraic datatype.
I'm sorry, but my knowledge of OCaml is too limited. I don't understand
your code. Could you please describe in words what it does?
These are polymorphic variants. In Haskell, one can implement them
with a straightforward extension of the typeclass system.
No, you cannot.
Post by Dirk Thierbach
See "A Polymorphic Type System for Extensible Records and Variants" by
Gaster and Jones, 1996. GHC doesn't implement this; Hugs implements at
least the extensible records (Trex). I'm not sure about the variants.
That paper describes how you can implement open (extensible) sum types, not
polymorphic variants. You're still missing the type inference and there is
no way to implement that within Haskell.

Note that this discussion was specifically about the type inference and had
nothing to do with open sum types.
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u
Dirk Thierbach
2010-02-12 14:13:04 UTC
Post by Jon Harrop
That paper describes how you can implement open (extensible) sum types, not
polymorphic variants. You're still missing the type inference and
Read the paper again. I recommend section 4.2 "A type inference algorithm".
Post by Jon Harrop
there is no way to implement that within Haskell.
One cannot implement that "within" Haskell, in the same way one
cannot implement that "within" OCaml (i.e., without using the in-built
polymorphic variants). That's why I wrote "a straightforward extension". Hugs
implements this algorithm, not "within" Haskell, but "within" Hugs,
at least for extensible records. Polymorphic variants are just the dual.

The GHC people apparently didn't consider this extension worthwhile,
and I tend to agree.

It's also rather old news. Now, back into my killfile.

- Dirk
Jon Harrop
2010-02-12 19:33:56 UTC
...One cannot implement that "within" Haskell...
This is the only relevant point. Everything else you wrote seems to have
been a Turing argument shrouded in irrelevancies.
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u
Jon Harrop
2010-02-11 23:43:07 UTC
Post by Ertugrul Söylemez
Post by Jon Harrop
Try to translate the following simple OCaml expression into Haskell,
# `Add(`Mul(`Int 3, `Int 4), `Int 5);;
- : [> `Add of
[> `Mul of [> `Int of int ] * [> `Int of int ] ] * [> `Int of
int ] ]
= `Add (`Mul (`Int 3, `Int 4), `Int 5)
Note how OCaml infers the algebraic datatype.
I'm sorry, but my knowledge of OCaml is too limited. I don't understand
your code. Could you please describe in words what it does?
That is just a value representing the symbolic expression 3*4+5.

A Haskell equivalent is:

data Expr = Int Int
          | Add Expr Expr
          | Mul Expr Expr
          deriving Show

main = do
  print $ Add (Mul (Int 3) (Int 4)) (Int 5)

Note that you must define the algebraic datatype (called "Expr" here)
in Haskell but not in OCaml, because OCaml can infer it.

OCaml also infers object types.
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u
Ertugrul Söylemez
2010-02-12 13:54:54 UTC
Post by Jon Harrop
Post by Ertugrul Söylemez
Post by Jon Harrop
Try to translate the following simple OCaml expression into
# `Add(`Mul(`Int 3, `Int 4), `Int 5);;
- : [> `Add of
[> `Mul of [> `Int of int ] * [> `Int of int ] ] * [> `Int of
int ] ]
= `Add (`Mul (`Int 3, `Int 4), `Int 5)
Note how OCaml infers the algebraic datatype.
I'm sorry, but my knowledge of OCaml is too limited. I don't
understand your code. Could you please describe in words what it
does?
That is just a value representing the symbolic expression 3*4+5.
data Expr = Int Int
          | Add Expr Expr
          | Mul Expr Expr
          deriving Show
main = do
  print $ Add (Mul (Int 3) (Int 4)) (Int 5)
Note that you must define the algebraic datatype (called "Expr" here)
in Haskell but not in OCaml, because OCaml can infer it.
OCaml also infers object types.
Okay, OCaml doesn't need me to define my types, but what's wrong with
having to do so? I mean, C doesn't require you to write function
prototypes either, which is bad rather than good. I'm not saying that
the OCaml feature is bad, but that I'd rather define my type
explicitly. After all, that's where type safety comes from. We know how
awful PHP is because it tries hard to make sense of your statements,
even if they _are_ totally pointless and _should_ cause an error.

Also, it's not the same kind of type inference that I'm talking about.
Usually type inference finds the type of an expression instead of
creating one that fits. Haskell's type system is very powerful at that.
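
In GHCi that direction looks like this (a sketch in the style of the
session above; the binding name is mine and type-variable names may
differ):

let compose = \f g x -> f (g x)
:t compose
compose :: (b -> c) -> (a -> b) -> a -> c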


Greets
Ertugrul
--
nightmare = unsafePerformIO (getWrongWife >>= sex)
http://blog.ertes.de/
Jon Harrop
2010-02-12 21:11:04 UTC
Post by Ertugrul Söylemez
Okay, OCaml doesn't need me to define my types, but what's wrong with
having to do so?
Having to write out definitions for types manually can damage productivity.

The first-order effect is that these unnecessary type definitions bloat the
code, making it harder to read, write and maintain. In the "Expr" example I
just gave, the Haskell code is several times longer. This is exactly the
same effect you cited with Haskell vs C++.

A secondary effect is that there is no definition to refer to when
constructing a value of the type. For example, using .NET from F# you might
write:

System.Windows.Media.Effects.SamplingMode.Bilinear

whereas with polymorphic variants you can just write:

`Bilinear

In essence, tuples in languages like Haskell can be hugely beneficial over
the nearest equivalent in C++ (e.g. std::pair<..>) or record types for
exactly the same reason that polymorphic variants can be better than
ordinary variants.

There is no logical reason to stop at tuples and force programmers to
declare sum and class types. Moreover, OCaml has proven that polymorphic
variants are useful.
Post by Ertugrul Söylemez
I mean, C doesn't require you to write function prototypes either, which
is bad rather than good.
I disagree. I don't want to have to write and maintain function prototypes
by hand. I prefer lots of inference with throwback of inferred types from
the compiler to the IDE.
Post by Ertugrul Söylemez
I'm not saying that that OCaml feature is bad, but that I'd rather define
my type explicitly.
That is often the right thing to do but I still want the choice.
Post by Ertugrul Söylemez
After all that's where type safety comes from.
To some extent, yes. Polymorphic variants are still completely type safe, of
course. There is a larger possibility of error but, in practice, that isn't
their main disadvantage. The main problem is that inference over sum types
(particularly recursive sum types) gives the compiler more freedom to be
misled and, consequently, can lead to obfuscated error messages. Hence
explicitly-declared ordinary variant types are still very common in OCaml.

However, by far the most common application of polymorphic variants in
practice is non-recursive sum types like enumerations. For example, the
following type from LablGL represents the different OpenGL buffers:

type read_buffer =
  [ `aux of int
  | `back
  | `back_left
  | `back_right
  | `front
  | `front_left
  | `front_right
  | `left
  | `right ]

As this is almost an enumeration, there is almost no room for error. Many
practical applications fall into this category.

My use of polymorphic variants for the recursive type representing HLVM
types is more adventurous but requires minimal type annotations to make it
easy to develop with:

https://forge.ocamlcore.org/scm/viewvc.php/hlvm.ml?view=markup&revision=42&root=hlvm
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u
Ertugrul Söylemez
2010-02-14 23:13:33 UTC
Post by Jon Harrop
Post by Ertugrul Söylemez
Okay, OCaml doesn't need me to define my types, but what's wrong
with having to do so?
Having to write out definitions for types manually can damage
productivity.
The first-order effect is that these unnecessary type definitions
bloat the code, making it harder to read, write and maintain. In the
"Expr" example I just gave, the Haskell code is several times
longer. This is exactly the same effect you cited with Haskell vs C++.
A secondary effect is that there is no definition to refer to when
constructing a value of the type. For example, using .NET from F# you
System.Windows.Media.Effects.SamplingMode.Bilinear
`Bilinear
In essence, tuples in languages like Haskell can be hugely beneficial
over the nearest equivalent in C++ (e.g. std::pair<..>) or record
types for exactly the same reason that polymorphic variants can be
better than ordinary variants.
There is no logical reason to stop at tuples and force programmers to
declare sum and class types. Moreover, OCaml has proven that
polymorphic variants are useful.
Well, when you choose Haskell, you specifically choose type safety over
convenience at certain places. Haskell code is often way shorter than
equivalent code in other languages (especially with recent versions of
GHC, which are much smarter and require far fewer explicit performance
hints than earlier ones), but it may not always be the shortest code,
and where it isn't, that may be because the type system gets in your
way. But the average Haskell programmer will prefer to define the type
explicitly anyway.

Whether or not you consider this good or bad is probably a matter of
personal preference. There are certain things which are easier/faster
to do/express in OCaml, and other things are easier/faster to do/express
in Haskell. I don't think that either of the two languages is less
powerful than the other, but rather that they make different
performance, safety and productivity tradeoffs.

An interesting consequence of this is that few people will like both
languages at the same time. People preferring OCaml will write their
code in it regardless of whether it would be more convenient to write in
Haskell and vice versa.
Post by Jon Harrop
Post by Ertugrul Söylemez
I mean, C doesn't require you to write function prototypes either,
which is bad rather than good.
I disagree. I don't want to have to write and maintain function
prototypes by hand. I prefer lots of inference with throwback of
inferred types from the compiler to the IDE.
It's bad the way C does it. It just assumes you are right. OCaml
assumes you're right, too, but you can't be right in two different,
contradictory ways at the same time. In C you can, so the prototype is
actually needed to enforce correctness. Otherwise you may end up with
strange bugs that are hard to find.
Post by Jon Harrop
Post by Ertugrul Söylemez
After all that's where type safety comes from.
To some extent, yes. Polymorphic variants are still completely type
safe, of course. There is a larger possibility of error but, in
practice, that isn't their main disadvantage. The main problem is that
inference over sum types (particularly recursive sum types) gives the
compiler more freedom to be misled and, consequently, can lead to
obfuscated error messages. Hence explicitly-declared ordinary variant
types are still very common in OCaml.
However, by far the most common application of polymorphic variants in
practice is non-recursive sum types like enumerations. For example,
the following type from LablGL represents the different OpenGL
type read_buffer =
  [ `aux of int
  | `back
  | `back_left
  | `back_right
  | `front
  | `front_left
  | `front_right
  | `left
  | `right ]
As this is almost an enumeration, there is almost no room for
error. Many practical applications fall into this category.
My use of polymorphic variants for the recursive type representing
HLVM types is more adventurous but requires minimal type annotations
https://forge.ocamlcore.org/scm/viewvc.php/hlvm.ml?view=markup&revision=42&root=hlvm
Yes, it can be more productive and more convenient, too. I still
wouldn't use it, because if your type is simple, it won't hurt you to
define it explicitly. On the other hand, if your type is complex, you
generally want an explicit, clear description of it in your code anyway.
This is, of course, Haskell thinking.

If you prefer to use it, there is probably nothing wrong with it. OCaml
is still ages ahead of most other languages in widespread use today.


Side note: I like your new tone. Discussing with you has become more
pleasant than in the past. Keep it that way.


Greets
Ertugrul
--
nightmare = unsafePerformIO (getWrongWife >>= sex)
http://blog.ertes.de/
Jon Harrop
2010-02-15 02:46:50 UTC
Post by Ertugrul Söylemez
Post by Jon Harrop
There is no logical reason to stop at tuples and force programmers to
declare sum and class types. Moreover, OCaml has proven that
polymorphic variants are useful.
Well, when you choose Haskell, you specifically choose type safety over
convenience at certain places.
Yes.
Post by Ertugrul Söylemez
Haskell code is often way shorter than
equivalent code in other languages (especially with recent versions of
GHC, which are much smarter and require far fewer explicit performance
hints than earlier ones), but it may not always be the shortest code,
and where it isn't, that may be because the type system gets in your
way.
Other languages can obviously be far more concise than Haskell in specific
domains. In general, Haskell is clearly more concise than C++ but I'm not
sure where it sits among languages such as APL and Mathematica or even F#.

On the u64 shootout, Haskell is ~2x shorter than OCaml on chameneos-redux,
pidigits and binary-trees but ~2x longer on spectral-norm,
reverse-complement, k-nucleotide, fannkuch and regex-dna. However, these
are optimized programs. Moreover, F# is usually significantly more concise
than OCaml.

On Rosetta code, my F# is shorter than the Haskell on Anagrams, Bresenham,
Game of Life and Evolutionary Algorithm. Haskell is shorter than F# on LZW.
These programs are not optimized.

The two most compelling examples of Haskell I have ever seen were LZW:

http://www.haskell.org/haskellwiki/Toy_compression_implementations

and power series:

http://www.cs.dartmouth.edu/~doug/powser.html
Post by Ertugrul Söylemez
But the average Haskell programmer will prefer to define the type
explicitly anyway.
Yes, of course. They are a self-selected group of people who are, by
definition, willing to endure that.
Post by Ertugrul Söylemez
Whether or not you consider this good or bad is probably a matter of
personal preference. There are certain things which are easier/faster
to do/express in OCaml, and other things are easier/faster to do/express
in Haskell. I don't think that either of the two languages is less
powerful than the other, but rather that they make different
performance, safety and productivity tradeoffs.
Yes. OCaml is unusually unsafe among MLs.
Post by Ertugrul Söylemez
An interesting consequence of this is that few people will like both
languages at the same time. People preferring OCaml will write their
code in it regardless of whether it would be more convenient to write in
Haskell and vice versa.
There seem to be a growing number of people fluent in both but, as you say,
they do seem to stick with one regardless.
Post by Ertugrul Söylemez
Post by Jon Harrop
Post by Ertugrul Söylemez
I mean, C doesn't require you to write function prototypes either,
which is bad rather than good.
I disagree. I don't want to have to write and maintain function
prototypes by hand. I prefer lots of inference with throwback of
inferred types from the compiler to the IDE.
It's bad the way C does it. It just assumes you are right. OCaml
assumes you're right, too, but you can't be right in two different,
contradictory ways at the same time. In C you can, so the prototype is
actually needed to enforce correctness. Otherwise you may end up with
strange bugs that are hard to find.
Yes. C is absurd. OCaml tried to address that problem by including MD5 sums
of APIs in object files but that shifts the problem to one of brittle
bindings. Now, a new minor-minor OCaml release forces all downstream
developers to completely recompile from scratch, which takes days for the
Debian maintainers because there is so much OCaml code out there now. For
example, my commercial Smoke Vector Graphics library for OCaml requires
different binaries to be shipped for every different permutation of minor
versions of libraries (including the compiler) that it depends upon. That
is obviously not practicable.
Post by Ertugrul Söylemez
Yes, it can be more productive and more convenient, too. I still
wouldn't use it, because if your type is simple, it won't hurt you to
define it explicitly.
Surely you use tuples without explicitly naming them? What's the difference?
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u
Ertugrul Söylemez
2010-02-16 10:28:21 UTC
Post by Jon Harrop
http://www.haskell.org/haskellwiki/Toy_compression_implementations
http://www.cs.dartmouth.edu/~doug/powser.html
I'd like to add another one, which I wrote myself. This code is
insanely difficult to write in other languages:

isPrime :: Integral i => i -> Bool
isPrime n = all (\d -> rem n d /= 0) . takeWhile (\d -> d^2 <= n) $ primes

primes :: Integral i => [i]
primes = 2 : filter isPrime [3..]

Note that this is not a prime sieve, but more intelligent than the naive
filtering approach. If you replace 'primes' in the second line by
'[2..]', the code runs considerably slower.
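
As a usage sketch (mine, not from the post; the two definitions are
repeated so it runs standalone), indexing the lazy list forces exactly
as many primes as are demanded - e.g. the 100,000th prime discussed
further down the thread:

isPrime :: Integral i => i -> Bool
isPrime n = all (\d -> rem n d /= 0) . takeWhile (\d -> d^2 <= n) $ primes

primes :: Integral i => [i]
primes = 2 : filter isPrime [3..]

-- 0-based index, so this prints the 100,000th prime.
main :: IO ()
main = print (primes !! 99999)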
Post by Jon Harrop
Post by Ertugrul Söylemez
Yes, it can be more productive and more convenient, too. I still
wouldn't use it, because if your type is simple, it won't hurt you
to define it explicitly.
Surely you use tuples without explicitly naming them? What's the difference?
let n = (Left (Right 15), Right 17)
:t n
n :: (Either (Either a Integer) b, Either a1 Integer)

So there is also a straightforward translation of your example to
Haskell:
let add = Left
let mul = Right
let n = add (mul (3, 4), 5)
:t n
n :: Either (Either a (Integer, Integer), Integer) b

Unfortunately you can't pattern-match against 'add' and 'mul', but need
to match against Left and Right, so it's still better to make explicit
what you mean:

data Term a
  = Literal a
  | Add (Term a) (Term a)
  | Mul (Term a) (Term a)


Greets
Ertugrul
--
nightmare = unsafePerformIO (getWrongWife >>= sex)
http://blog.ertes.de/
Dan Doel
2010-02-16 11:50:06 UTC
Post by Ertugrul Söylemez
So there is also a straightforward translation of your example to
Haskell:
let add = Left
let mul = Right
let n = add (mul (3, 4), 5)
:t n
n :: Either (Either a (Integer, Integer), Integer) b
Unfortunately you can't pattern-match against 'add' and 'mul'
You can if you download the she preprocessor:

{-# OPTIONS_GHC -F -pgmF she #-}
{-# LANGUAGE TypeOperators #-}

data Free f a = Unit a | Roll (f (Free f a))

newtype Join f a = J (f a a)
newtype (f :. g) a = O (f (g a))

pattern Add x y = Roll (O (J (Left (J (x, y)))))
pattern Mul x y = Roll (O (J (Right (J (x, y)))))

type Term a = Free (Join Either :. Join (,)) a

n = Add (Mul (Unit 3) (Unit 4)) (Unit 5)

q = case n of Add x y -> x


*Main> :t n
n :: (Num t) => Free (Join Either :. Join (,)) t
*Main> :t q
q :: (Num t) => Free (Join Either :. Join (,)) t

Of course, that's not all it does.

http://personal.cis.strath.ac.uk/~conor/pub/she/
Jon Harrop
2010-02-17 06:18:19 UTC
Post by Ertugrul Söylemez
Post by Jon Harrop
http://www.haskell.org/haskellwiki/Toy_compression_implementations
http://www.cs.dartmouth.edu/~doug/powser.html
I'd like to add another one, which I wrote myself. This code is
isPrime :: Integral i => i -> Bool
isPrime n = all (\d -> rem n d /= 0) . takeWhile (\d -> d^2 <= n) $ primes
primes :: Integral i => [i]
primes = 2 : filter isPrime [3..]
Note that this is not a prime sieve, but more intelligent than the naive
filtering approach. If you replace 'primes' in the second line by
'[2..]', the code runs considerably slower.
Interesting challenge. I'll have a fiddle! :-)
Post by Ertugrul Söylemez
Post by Jon Harrop
Post by Ertugrul Söylemez
Yes, it can be more productive and more convenient, too. I still
wouldn't use it, because if your type is simple, it won't hurt you
to define it explicitly.
Surely you use tuples without explicitly naming them? What's the difference?
let n = (Left (Right 15), Right 17)
:t n
n :: (Either (Either a Integer) b, Either a1 Integer)
Does it handle recursive types?

For example, what happens if you try to write a function to compute a
symbolic derivative in that style:

# let rec d f x =
    match f with
    | `Int n -> `Int 0
    | `Var y -> `Int(if x=y then 1 else 0)
    | `Add(f, g) -> `Add(d f x, d g x)
    | `Mul(f, g) -> `Add(`Mul(f, d g x), `Mul(g, d f x));;
val d :
([< `Add of 'a * 'a | `Int of 'b | `Mul of 'a * 'a | `Var of 'c ]
as 'a) ->
'c -> ([> `Add of 'd * 'd | `Int of int | `Mul of 'a * 'd ] as 'd) = <fun>

For example:

# let x = `Var "x";;
val x : [> `Var of string ] = `Var "x"

# let f = `Add(`Add(`Mul(`Mul(x, x), x), `Mul(`Int(-1), x)), `Int(-1));;
val f :
[> `Add of
[> `Add of
[> `Mul of
[> `Mul of [> `Var of string ] * [> `Var of string ] ] *
[> `Var of string ] ] *
[> `Mul of [> `Int of int ] * [> `Var of string ] ] ] *
[> `Int of int ] ] =
`Add
(`Add
(`Mul (`Mul (`Var "x", `Var "x"), `Var "x"),
`Mul (`Int (-1), `Var "x")),
`Int (-1))

# d f "x";;
- : [> `Add of 'a * 'a
| `Int of int
| `Mul of
([ `Add of 'b * 'b | `Int of int | `Mul of 'b * 'b | `Var of
string ]
as 'b) *
'a ]
as 'a
=
`Add
(`Add
(`Add
(`Mul (`Mul (`Var "x", `Var "x"), `Int 1),
`Mul
(`Var "x",
`Add (`Mul (`Var "x", `Int 1), `Mul (`Var "x", `Int 1)))),
`Add (`Mul (`Int (-1), `Int 1), `Mul (`Var "x", `Int 0))),
`Int 0)
Post by Ertugrul Söylemez
So there is also a straightforward translation of your example to
Haskell:
let add = Left
let mul = Right
let n = add (mul (3, 4), 5)
:t n
n :: Either (Either a (Integer, Integer), Integer) b
Unfortunately you can't pattern-match against 'add' and 'mul', but need
to match against Left and Right, so it's still better to make explicit
data Term a
  = Literal a
  | Add (Term a) (Term a)
  | Mul (Term a) (Term a)
A view pattern would handle that.
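
For instance, with GHC's ViewPatterns extension (a sketch of the idea;
the names add, mul and unAdd are illustrative, not from the thread):

{-# LANGUAGE ViewPatterns #-}

add :: (a, b) -> Either (a, b) c
add = Left

mul :: (a, b) -> Either c (a, b)
mul = Right

-- A view function exposing an addition node, if there is one.
unAdd :: Either (a, b) c -> Maybe (a, b)
unAdd (Left p) = Just p
unAdd _        = Nothing

main :: IO ()
main =
  case add (mul (3 :: Int, 4 :: Int), 5 :: Int) of
    (unAdd -> Just (_, y)) -> print y  -- the second summand: 5
    _                      -> putStrLn "not an addition"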
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u
Ertugrul Söylemez
2010-02-18 12:32:27 UTC
Post by Jon Harrop
Post by Ertugrul Söylemez
Post by Jon Harrop
http://www.haskell.org/haskellwiki/Toy_compression_implementations
http://www.cs.dartmouth.edu/~doug/powser.html
I'd like to add another one, which I wrote myself. This code is
isPrime :: Integral i => i -> Bool
isPrime n = all (\d -> rem n d /= 0) . takeWhile (\d -> d^2 <= n) $ primes
primes :: Integral i => [i]
primes = 2 : filter isPrime [3..]
Note that this is not a prime sieve, but more intelligent than the
naive filtering approach. If you replace 'primes' in the second
line by '[2..]', the code runs considerably slower.
Interesting challenge. I'll have a fiddle! :-)
Have fun. This is one case where laziness really pays off. I'd like
to see your results. =)
Post by Jon Harrop
Post by Ertugrul Söylemez
Post by Jon Harrop
Post by Ertugrul Söylemez
Yes, it can be more productive and more convenient, too. I still
wouldn't use it, because if your type is simple, it won't hurt
you to define it explicitly.
Surely you use tuples without explicitly naming them? What's the difference?
let n = (Left (Right 15), Right 17)
:t n
n :: (Either (Either a Integer) b, Either a1 Integer)
Does it handle recursive types?
For example, what happens if you try to write a function to compute
[...]
Unfortunately no. At least I haven't found a way to encode it (easily).
Post by Jon Harrop
Post by Ertugrul Söylemez
So there is also a straightforward translation of your example to
Haskell:
let add = Left
let mul = Right
let n = add (mul (3, 4), 5)
:t n
n :: Either (Either a (Integer, Integer), Integer) b
Unfortunately you can't pattern-match against 'add' and 'mul', but
need to match against Left and Right, so it's still better to make
data Term a
  = Literal a
  | Add (Term a) (Term a)
  | Mul (Term a) (Term a)
A view pattern would handle that.
Well, it would, but not very nicely I think.


Greets
Ertugrul
--
nightmare = unsafePerformIO (getWrongWife >>= sex)
http://blog.ertes.de/
Jon Harrop
2010-02-19 01:23:27 UTC
Post by Ertugrul Söylemez
Post by Jon Harrop
Post by Ertugrul Söylemez
I'd like to add another one, which I wrote myself. This code is
isPrime :: Integral i => i -> Bool
isPrime n = all (\d -> rem n d /= 0) . takeWhile (\d -> d^2 <= n) $ primes
primes :: Integral i => [i]
primes = 2 : filter isPrime [3..]
Note that this is not a prime sieve, but more intelligent than the
naive filtering approach. If you replace 'primes' in the second
line by '[2..]', the code runs considerably slower.
Interesting challenge. I'll have a fiddle! :-)
Have fun. This is one case where laziness really pays off. I'd like
to see your results. =)
That is a nice program. :-)

Here is a simple translation to F#:

let rec isPrime n =
  primes
  |> Seq.takeWhile (fun d -> d*d <= n)
  |> Seq.forall (fun d -> n%d <> 0)
and primes =
  let rec f n =
    if isPrime n then Some(n, n+1) else f(n+1)
  Seq.unfold f 3
  |> Seq.append [2]
  |> Seq.cache

Takes about the same time as the Haskell to find the 100,000th prime.

Would be interesting to convert it into imperative style, accumulating a
resizeable array.
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u
Keith H Duggar
2010-02-20 06:57:08 UTC
Post by Jon Harrop
Post by Ertugrul Söylemez
Post by Jon Harrop
Post by Ertugrul Söylemez
I'd like to add another one, which I wrote myself. This code is
isPrime :: Integral i => i -> Bool
isPrime n = all (\d -> rem n d /= 0) . takeWhile (\d -> d^2 <= n) $ primes
primes :: Integral i => [i]
primes = 2 : filter isPrime [3..]
Note that this is not a prime sieve, but more intelligent than the
naive filtering approach. If you replace 'primes' in the second
line by '[2..]', the code runs considerably slower.
Interesting challenge. I'll have a fiddle! :-)
Have fun. This is one case where laziness really pays off. I'd like
to see your results. =)
That is a nice program. :-)
let rec isPrime n =
  primes
  |> Seq.takeWhile (fun d -> d*d <= n)
  |> Seq.forall (fun d -> n%d <> 0)
and primes =
  let rec f n =
    if isPrime n then Some(n, n+1) else f(n+1)
  Seq.unfold f 3
  |> Seq.append [2]
  |> Seq.cache
Takes about the same time as the Haskell to find the 100,000th prime.
How much time?
Post by Jon Harrop
Would be interesting to convert it into imperative style, accumulating a
resizeable array.
Ignoring integer overflow, here is a C++ program that uses the
isPrime function to print primes:

#include <iostream>
#include <vector>

bool isPrime ( unsigned n ) ;

// Primes is an iterator over the prime numbers
class Primes
{
    unsigned _idx ;
    static std::vector<unsigned> _primes ;
    void grow ( ) {
        unsigned t = _primes[_idx] ;
        while ( not isPrime(++t) ) ;
        _primes.push_back(t) ;
    }
public :
    Primes ( ) : _idx(0) { }
    unsigned operator* ( ) { return _primes[_idx] ; }
    Primes & operator++ ( ) {
        if ( _idx+1 == _primes.size() ) grow() ;
        ++_idx ;
        return *this ;
    }
} ;

std::vector<unsigned> Primes::_primes(1,2) ;

bool isPrime ( unsigned n ) {
    Primes primes ;
    unsigned d = *primes ;
    while ( d*d <= n and n % d != 0 ) d = *(++primes) ;
    return d*d > n ;
}

int main ( )
{
    for ( unsigned i = 2 ; i < ~0 ; ++i ) {
        if ( isPrime(i) ) {
            std::cout << i << '\n' ;
        }
    }

    return 0 ;
}
Jon Harrop
2010-02-21 06:35:16 UTC
Post by Keith H Duggar
Post by Jon Harrop
let rec isPrime n =
  primes
  |> Seq.takeWhile (fun d -> d*d <= n)
  |> Seq.forall (fun d -> n%d <> 0)
and primes =
  let rec f n =
    if isPrime n then Some(n, n+1) else f(n+1)
  Seq.unfold f 3
  |> Seq.append [2]
  |> Seq.cache
Takes about the same time as the Haskell to find the 100,000th prime.
How much time?
About 4.5s for GHC 6.12.1 and F# 2.0, IIRC.
Post by Keith H Duggar
Post by Jon Harrop
Would be interesting to convert it into imperative style, accumulating a
resizeable array.
Ignoring integer overflow, here is a C++ program that uses
I'm not loving it. ;-)

Here's an imperative F# implementation of the same algorithm that is only
50% longer than the pure lazy Haskell but ~25x faster (0.18s):

let primes =
  let a = ResizeArray[2]
  let rec loop i p =
    let d = a.[i]
    if d*d>p then p else
    if p%d=0 then loop 0 (p+1) else
    loop (i+1) p
  fun n ->
    while n >= a.Count do
      a.[a.Count - 1] + 1 |> loop 0 |> a.Add
    a.[n]

You can get the 100,000th prime with "primes 100000".
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u
Jon Harrop
2010-02-21 17:19:19 UTC
Post by Jon Harrop
Post by Keith H Duggar
Post by Jon Harrop
let rec isPrime n =
  primes
  |> Seq.takeWhile (fun d -> d*d <= n)
  |> Seq.forall (fun d -> n%d <> 0)
and primes =
  let rec f n =
    if isPrime n then Some(n, n+1) else f(n+1)
  Seq.unfold f 3
  |> Seq.append [2]
  |> Seq.cache
Takes about the same time as the Haskell to find the 100,000th prime.
How much time?
About 4.5s for GHC 6.12.1 and F# 2.0, IIRC.
Sorry, with GHC 6.12 the Haskell actually takes 6.6s.
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u
frankenstein
2010-02-20 19:33:38 UTC
let rec isPrime n =
  primes
  |> Seq.takeWhile (fun d -> d*d <= n)
  |> Seq.forall (fun d -> n%d <> 0)
and primes =
  let rec f n =
    if isPrime n then Some(n, n+1) else f(n+1)
  Seq.unfold f 3
  |> Seq.append [2]
  |> Seq.cache
Takes about the same time as the Haskell to find the 100,000th prime.
Would be interesting to convert it into imperative style, accumulating a
resizeable array.
Hi: I somehow managed to understand the Haskell version based on lazy
evaluation but gave up on grasping the OCaml version.

What does 'Some' do? What is 'f' supposed to stand for in 'rec f n' and
how is it all processed (what is the starting point)? I do not
understand how the 'guards |>' in the OCaml code accomplish the
behavior of 'laziness' here.

Thanks,
Europaer
Ertugrul Söylemez
2010-02-20 22:39:28 UTC
Post by frankenstein
let rec isPrime n =
  primes
  |> Seq.takeWhile (fun d -> d*d <= n)
  |> Seq.forall (fun d -> n%d <> 0)
and primes =
  let rec f n =
    if isPrime n then Some(n, n+1) else f(n+1)
  Seq.unfold f 3
  |> Seq.append [2]
  |> Seq.cache
Takes about the same time as the Haskell to find the 100,000th prime.
Would be interesting to convert it into imperative style,
accumulating a resizeable array.
Hi: I somehow managed to understand the Haskell version based on lazy
evaluation but gave up on grasping the OCaml version.
What does 'Some' do? What is 'f' supposed to stand for in 'rec f n' and
how is it all processed (what is the starting point)? I do not
understand how the 'guards |>' in the OCaml code accomplish the
behavior of 'laziness' here.
Note that it's F# code. It doesn't emulate laziness, but takes the
imperative approach of establishing a cache explicitly. It's quite
elegant code, but the elegance of the Haskell version can only be
matched with true laziness.

BTW, as an interesting side note, this example shows that the "laziness"
construct in F# is not real laziness, but rather just a wrapper around
unary functions, because it doesn't help here. A "lazy" value in F# is
the same as a function that returns the value in question when called.
In fact, using functions directly leads to more readable code in F#, at
least in my experience.


Greets
Ertugrul
--
nightmare = unsafePerformIO (getWrongWife >>= sex)
http://blog.ertes.de/
Jon Harrop
2010-02-21 05:51:45 UTC
Post by Ertugrul Söylemez
Note that it's F# code. It doesn't emulate laziness, but takes the
imperative approach of establishing a cache explicitly.
There is no mutation so it is not really "imperative" code.
Post by Ertugrul Söylemez
It's quite elegant code, but the elegance of the Haskell version can only
be matched with true laziness.
Well, your original Haskell was 171 bytes of code whereas my F# is 186
bytes. Moreover, Haskell's 8% brevity advantage only appears to apply in
the tiny niche of performance-agnostic infinite lazy sequences.
Post by Ertugrul Söylemez
BTW, as an interesting side note, this example shows that the "laziness"
construct in F# is not real laziness, but rather just a wrapper around
unary functions, because it doesn't help here.
My code didn't even use F#'s laziness construct. I think you are referring
to .NET's IEnumerable sequences which, as you say, are not lazy.
Post by Ertugrul Söylemez
A "lazy" value in F# is the same as a function that returns the value in
question when called.
No, a "lazy" value in F# is a real lazy value, exactly as in Haskell.
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u
Ertugrul Söylemez
2010-02-21 06:43:43 UTC
Post by Jon Harrop
Post by Ertugrul Söylemez
Note that it's F# code. It doesn't emulate laziness, but takes the
imperative approach of establishing a cache explicitly.
There is no mutation so it is not really "imperative" code.
It doesn't need to use mutation to be in an imperative style. You're
establishing a cache explicitly.
Post by Jon Harrop
Post by Ertugrul Söylemez
It's quite elegant code, but the elegance of the Haskell version can
only be matched with true laziness.
Well, your original Haskell was 171 bytes of code whereas my F# is 186
bytes. Moreover, Haskell's 8% brevity advantage only appears to apply
in the tiny niche of performance-agnostic infinite lazy sequences.
Note that the type signatures in the Haskell version are strictly
optional. Without them it has only 107 bytes (not counting line feeds).
Infinite lazy sequences are not a niche. I use them a lot for all kinds
of stuff. Remember that in Haskell you encode algorithms in a different
way than in F#.
Post by Jon Harrop
Post by Ertugrul Söylemez
BTW, as an interesting side note, this example shows that the
"laziness" construct in F# is not real laziness, but rather just a
wrapper around unary functions, because it doesn't help here.
My code didn't even use F#'s laziness construct. I think you are
referring to .NET's IEnumerable sequences which, as you say, are not
lazy.
No, I really meant the Lazy type. Your code doesn't use it, that's why
it's just an interesting side note.
Post by Jon Harrop
Post by Ertugrul Söylemez
A "lazy" value in F# is the same as a function that returns the
value in question when called.
No, a "lazy" value in F# is a real lazy value, exactly as in Haskell.
You can't pass a lazy value to a strict function. There is no implicit
demand-driven execution, but rather explicit forcing. That's the reason
why the laziness construct in F# does not help to solve this problem.
You can't really have useful laziness without nonstrict semantics.
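
In Haskell the demand-driven behaviour is implicit (an illustrative
two-liner, not from the original post): no call site forces anything by
hand, yet only five elements of the infinite list are ever computed.

main :: IO ()
main = print (take 5 (filter even [1 ..]))  -- [2,4,6,8,10]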


Greets
Ertugrul
--
nightmare = unsafePerformIO (getWrongWife >>= sex)
http://blog.ertes.de/
Jon Harrop
2010-02-21 18:30:02 UTC
Post by Ertugrul Söylemez
Post by Jon Harrop
Post by Ertugrul Söylemez
Note that it's F# code. It doesn't emulate laziness, but takes the
imperative approach of establishing a cache explicitly.
There is no mutation so it is not really "imperative" code.
It doesn't need to use mutation to be in an imperative style. You're
establishing a cache explicitly.
If Haskell had an "unfold" function that took an enumerating function
(e.g. Int -> Int) and returned an infinite lazy list of its values, you
would not claim it was imperative because it used caching. The F# is
not imperative in exactly the same way.
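
For what it's worth, such a function is a one-liner in Haskell (a
sketch; unfoldInf is an illustrative name, and the Prelude's iterate
already behaves this way), with the lazy list itself playing the role
of the cache:

-- Infinite lazy list of repeated applications; memoized by the list.
unfoldInf :: (a -> a) -> a -> [a]
unfoldInf f x = x : unfoldInf f (f x)

main :: IO ()
main = print (take 5 (unfoldInf (+ 1) 3))  -- [3,4,5,6,7]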
Post by Ertugrul Söylemez
Post by Jon Harrop
Post by Ertugrul Söylemez
It's quite elegant code, but the elegance of the Haskell version can
only be matched with true laziness.
Well, your original Haskell was 171 bytes of code whereas my F# is 186
bytes. Moreover, Haskell's 8% brevity advantage only appears to apply
in the tiny niche of performance-agnostic infinite lazy sequences.
Note that the type signatures in the Haskell version are strictly
optional.
You annotated the types by hand because that is idiomatic Haskell. That is
idiomatic Haskell because the type system is too easily misled by design
(flaw). Type annotations are one of the main sources of verbosity in real
Haskell code.

Just look at the source code to darcs, for example:

http://allmydata.org/trac/darcs-2/browser/src/Progress.hs

Every single definition there has a type annotation. I see no merit in
pretending that doesn't bloat real Haskell source code.
Post by Ertugrul Söylemez
Without them it has only 107 bytes (not counting line feeds).
There are still only two significant differences in verbosity between your
Haskell and my equivalent F#. Firstly, your Haskell used the syntactic
sugar [3..] whereas my F# used:

unfold (fun n -> Some(n, n+1)) 3

Secondly, your Haskell benefitted from functions being in the global
namespace whereas my F# qualified them explicitly.

Excluding these mere syntactic differences, my F# is still only 18% longer
than your Haskell. In contrast, the real Sieve of Eratosthenes can be
implemented in APL in only 17 chars:

http://www.vaxman.de/publications/apl_slides.pdf

And that will be a *lot* faster than your 6x longer Haskell solution.
Post by Ertugrul Söylemez
Infinite lazy sequences are not a niche. I use them a lot for all kinds
of stuff. Remember that in Haskell you encode algorithms in a different
way than in F#.
As I've shown here, you can encode algorithms the same way in F# if you want
to. The fact that F# programmers rarely choose to is a statement about the
practicality of the Haskell-style of encoding algorithms.

Indeed, implementing the Sieve of Eratosthenes in Haskell was an unsolved
problem for many years and, even today, is still the stuff of research.
Melissa O'Neill wrote a seminal paper on it:

http://www.cs.hmc.edu/~oneill/papers/Sieve-JFP.pdf

The nearest she came to a decent Haskell implementation is 4,690 bytes of
Haskell code (!) and still runs several times slower than my 436 byte F#
solution.
Post by Ertugrul Söylemez
You can't pass a lazy value to a strict function. There is no implicit
demand-driven execution, but rather explicit forcing. That's the reason
why the laziness construct in F# does not help to solve this problem.
I solved the problem in F# using laziness.
Post by Ertugrul Söylemez
You can't really have useful laziness without nonstrict semantics.
Can you have useful laziness with any semantics?

As I've shown here, F# is more than capable of expressing lazy Haskell-like
solutions. I have implemented many as academic exercises but I have never
adopted that style in production code precisely because it always suffers
from these kinds of problems.

Even when I want something like the sequence of prime numbers (and our F#
for Numerics product does provide one), there's no way I'm going to use a
two line solution that is hundreds of times slower than a dozen line
solution. Nor will I adopt a style that saves me a couple of lines of code
at the cost of orders of magnitude in performance or costs me an order of
magnitude in bloat when I come to optimize it.
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u
Ertugrul Söylemez
2010-02-22 11:13:19 UTC
Permalink
Post by Jon Harrop
Post by Ertugrul Söylemez
Post by Jon Harrop
Well, your original Haskell was 171 bytes of code whereas my F# is
186 bytes. Moreover, Haskell's 8% brevity advantage only appears to
apply in the tiny niche of performance-agnostic infinite lazy
sequences.
Note that the type signatures in the Haskell version are strictly
optional.
You annotated the types by hand because that is idiomatic
Haskell. That is idiomatic Haskell because the type system is too
easily misled by design (flaw). Type annotations are one of the main
sources of verbosity in real Haskell code.
http://allmydata.org/trac/darcs-2/browser/src/Progress.hs
Every single definition there has a type annotation. I see no merit in
pretending that doesn't bloat real Haskell source code.
Unless you use Haskell extensions, annotating types is strictly
optional. In general, the type system will infer the type of an
expression even better (i.e. more polymorphic) than you would. The type
system of standard Haskell is designed such that it always succeeds or
the code is wrong.

There is one exception to this rule, though: the monomorphism
restriction, but it seldom applies.
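For illustration, a minimal case where the restriction bites (plusOne is
a made-up name):

-- Without a signature, the monomorphism restriction makes this binding
-- monomorphic, and defaulting then fixes it to Integer -> Integer:
--   plusOne = (+) 1
-- An explicit signature (or NoMonomorphismRestriction) keeps it general:
plusOne :: Num a => a -> a
plusOne = (+) 1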
Post by Jon Harrop
Post by Ertugrul Söylemez
Without them it has only 107 bytes (not counting line feeds).
There are still only two significant differences in verbosity between
your Haskell and my equivalent F#. Firstly, your Haskell used the
unfold (fun n -> Some(n, n+1)) 3
Secondly, your Haskell benefitted from functions being in the global
namespace whereas my F# qualified them explicitly.
Excluding these mere syntactic differences, my F# is still only 18%
longer than your Haskell. In contrast, the real Sieve of Eratosthenes
http://www.vaxman.de/publications/apl_slides.pdf
And that will be a *lot* faster than your 6x longer Haskell solution.
That may be because my Haskell version is not the sieve of
Eratosthenes. ;)

Of course you cannot implement it that short in Haskell, because APL has
good support for matrix operations built into the language. But anyway,
see my other post. The SoE is very fast in Haskell (the C version is
only 1.6 times faster) and needs little code. The performance
difference is related to the fact that standard Haskell arrays are safe.
I left out the bounds checking from the C code, which makes it faster.
Post by Jon Harrop
Post by Ertugrul Söylemez
Infinite lazy sequences are not a niche. I use them a lot for all
kinds of stuff. Remember that in Haskell you encode algorithms in a
different way than in F#.
As I've shown here, you can encode algorithms the same way in F# if
you want to. The fact that F# programmers rarely choose to is a
statement about the practicality of the Haskell-style of encoding
algorithms.
Indeed, implementing the Sieve of Eratosthenes in Haskell was an
unsolved problem for many years and, even today, is still the stuff of
http://www.cs.hmc.edu/~oneill/papers/Sieve-JFP.pdf
The nearest she came to a decent Haskell implementation is 4,690 bytes
of Haskell code (!) and still runs several times slower than my 436
byte F# solution.
See my other post. Haskell sieve of Eratosthenes in 7 lines of code.
Two unnecessary writeArrays are there just for efficiency. And it gets
near C speed.
Post by Jon Harrop
Post by Ertugrul Söylemez
You can't pass a lazy value to a strict function. There is no
implicit demand-driven execution, but rather explicit forcing.
That's the reason why the laziness construct in F# does not help to
solve this problem.
I solved the problem in F# using laziness.
You used features of Seq, not laziness.
Post by Jon Harrop
Post by Ertugrul Söylemez
You can't really have useful laziness without nonstrict semantics.
Can you have useful laziness with any semantics?
As I've shown here, F# is more than capable of expressing lazy
Haskell-like solutions. I have implemented many as academic exercises
but I have never adopted that style in production code precisely
because it always suffers from these kinds of problems.
Even when I want something like the sequence of prime numbers (and our
F# for Numerics product does provide one), there's no way I'm going to
use a two line solution that is hundreds of times slower than a dozen
line solution. Nor will I adopt a style that saves me a couple of
lines of code at the cost of orders of magnitude in performance or
costs me an order of magnitude in bloat when I come to optimize it.
Then use the imperative style SoE. It's short, concise and fast.


Greets
Ertugrul
--
nightmare = unsafePerformIO (getWrongWife >>= sex)
http://blog.ertes.de/
Jon Harrop
2010-02-23 02:38:31 UTC
Permalink
Post by Ertugrul Söylemez
Post by Jon Harrop
Post by Ertugrul Söylemez
Post by Jon Harrop
Well, your original Haskell was 171 bytes of code whereas my F# is
186 bytes. Moreover, Haskell's 8% brevity advantage only appears to
apply in the tiny niche of performance-agnostic infinite lazy
sequences.
Note that the type signatures in the Haskell version are strictly
optional.
You annotated the types by hand because that is idiomatic
Haskell. That is idiomatic Haskell because the type system is too
easily misled by design (flaw). Type annotations are one of the main
sources of verbosity in real Haskell code.
http://allmydata.org/trac/darcs-2/browser/src/Progress.hs
Every single definition there has a type annotation. I see no merit in
pretending that doesn't bloat real Haskell source code.
Unless you use Haskell extensions, annotating types is strictly
optional.
Type annotations do bloat real Haskell code. There is no merit in trying to
pretend otherwise.
Post by Ertugrul Söylemez
The type system of standard Haskell is designed such that it always
succeeds or the code is wrong.
The following traditional implementation of the y-combinator is correct
code:

fix f = (\x -> f (x x)) (\x -> f (x x))

Yet the type system of standard Haskell fails on it.
Post by Ertugrul Söylemez
Post by Jon Harrop
And that will be a *lot* faster than your 6x longer Haskell solution.
That may be because my Haskell version is not the sieve of
Eratosthenes. ;)
Exactly.
Post by Ertugrul Söylemez
Of course you cannot implement it that short in Haskell, because APL has
good support for matrix operations built into the language. But anyway,
see my other post. The SoE is very fast in Haskell (the C version is
only 1.6 times faster) and needs little code. The performance
difference is related to the fact that standard Haskell arrays are safe.
I left out the bounds checking from the C code, which makes it faster.
That speculation is obviously wrong: my F# is bounds checked and is still much
faster than your Haskell.
Post by Ertugrul Söylemez
Post by Jon Harrop
The nearest she came to a decent Haskell implementation is 4,690 bytes
of Haskell code (!) and still runs several times slower than my 436
byte F# solution.
See my other post. Haskell sieve of Eratosthenes in 7 lines of code.
No. Your incomplete Haskell solution is already 15 lines of code.
Post by Ertugrul Söylemez
Two unnecessary writeArrays are there just for efficiency. And it gets
near C speed.
On my machine your Haskell is over 3x slower than your C.
Post by Ertugrul Söylemez
Post by Jon Harrop
Post by Ertugrul Söylemez
You can't pass a lazy value to a strict function. There is no
implicit demand-driven execution, but rather explicit forcing.
That's the reason why the laziness construct in F# does not help to
solve this problem.
I solved the problem in F# using laziness.
You used features of Seq, not laziness.
No, I used features of Seq in one solution and laziness in another.
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u
Ertugrul Söylemez
2010-02-23 10:51:33 UTC
Permalink
Post by Jon Harrop
Post by Ertugrul Söylemez
The type system of standard Haskell is designed such that it always
succeeds or the code is wrong.
The following traditional implementation of the y-combinator is
fix f = (\x -> f (x x)) (\x -> f (x x))
Yet the type system of standard Haskell fails on it.
This is the traditional Y combinator of untyped lambda calculus.
Haskell implements the typed lambda calculus, in which you can't express
the Y combinator. It is correct and expected that this code does not
typecheck.

This is not related to any shortcoming of Haskell's type system. It
does its job properly. It's a shortcoming of your understanding of the
lambda calculus. Note that the typed lambda calculus is not Turing
complete.
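Both halves of that point can be made concrete in Haskell; a sketch
(fixY and Rec are illustrative names):

-- General recursion comes back through recursive bindings rather than
-- self-application; Data.Function.fix is essentially:
fix :: (a -> a) -> a
fix f = f (fix f)

-- Self-application itself can be typed by routing it through an
-- explicitly recursive newtype, the standard workaround:
newtype Rec a = In { out :: Rec a -> a }

fixY :: (a -> a) -> a
fixY f = (\x -> f (out x x)) (In (\x -> f (out x x)))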
Post by Jon Harrop
Post by Ertugrul Söylemez
Post by Jon Harrop
And that will be a *lot* faster than your 6x longer Haskell
solution.
That may be because my Haskell version is not the sieve of
Eratosthenes. ;)
Exactly.
Good. Then your whole statement is void, because you're comparing the
running times of two different algorithms with different complexities,
which is obviously pointless.
Post by Jon Harrop
Post by Ertugrul Söylemez
Of course you cannot implement it that short in Haskell, because APL
has good support for matrix operations built into the language. But
anyway, see my other post. The SoE is very fast in Haskell (the C
version is only 1.6 times faster) and needs little code. The
performance difference is related to the fact that standard Haskell
arrays are safe. I left out the bounds checking from the C code,
which makes it faster.
That speculation is obviously wrong: my F# is bounds checked and is still
much faster than your Haskell.
You're saying that F# outperforms C. I used to program in F# at work.
My F# code was always much slower than the equivalent C code.
Post by Jon Harrop
Post by Ertugrul Söylemez
Post by Jon Harrop
The nearest she came to a decent Haskell implementation is 4,690
bytes of Haskell code (!) and still runs several times slower than
my 436 byte F# solution.
See my other post. Haskell sieve of Eratosthenes in 7 lines of code.
No. Your incomplete Haskell solution is already 15 lines of code.
My solution is not incomplete and I could remove what would be "bloat"
in your terminology: type signatures, convenience functions and
imports. But I find the code more readable the way I've written it.
The core sieve can be implemented in only three lines. With combinators
and less syntactic sugar I can even squeeze it further:

main = fmap (map read) getArgs >>=
mapM_ (\n -> print . last . map fst . filter snd . assocs . runSTUArray $
newArray (2,n) True >>= \arr ->
arr <$ forM_ [2..n] (\i -> readArray arr i >>=
(`when` forM_ [2*i, 3*i .. n] (flip (writeArray arr) False))))

This is not merely the sieve. It's a complete program, which expects
sieve sizes as parameters, runs the sieve for each of these sizes and
prints the largest prime found. It will be slower than the original
sieve by a small constant factor, because this one uses a 2-based array.
Post by Jon Harrop
Post by Ertugrul Söylemez
Two unnecessary writeArrays are there just for efficiency. And it
gets near C speed.
On my machine your Haskell is over 3x slower than your C.
Haskell code is always orders of magnitude slower, as soon as you
compile it.


Greets
Ertugrul
--
nightmare = unsafePerformIO (getWrongWife >>= sex)
http://blog.ertes.de/
Jon Harrop
2010-02-23 21:05:36 UTC
Permalink
Post by Ertugrul Söylemez
Post by Jon Harrop
Post by Ertugrul Söylemez
The type system of standard Haskell is designed such that it always
succeeds or the code is wrong.
The following traditional implementation of the y-combinator is
fix f = (\x -> f (x x)) (\x -> f (x x))
Yet the type system of standard Haskell fails on it.
This is the traditional Y combinator of untyped lambda calculus.
Haskell implements the typed lambda calculus, in which you can't express
the Y combinator. It is correct and expected that this code does not
typecheck.
This is not related to any shortcoming of Haskell's type system. It
does its job properly. It's a shortcoming of your understanding of the
lambda calculus.
If it is not a shortcoming of Haskell's type system, why does it work in
OCaml:

$ ocaml -rectypes
Objective Caml version 3.11.1

# let fix f = (fun x -> f(x x)) (fun x y -> f(x x) y);;
val fix : (('a -> 'b) -> 'a -> 'b) -> 'a -> 'b = <fun>

# fix (fun f -> function 0 -> 1 | n -> n*f(n-1)) 10;;
- : int = 3628800
Post by Ertugrul Söylemez
Post by Jon Harrop
Post by Ertugrul Söylemez
That may be because my Haskell version is not the sieve of
Eratosthenes. ;)
Exactly.
Good. Then your whole statement is void, because you're comparing the
running times of two different algorithms with different complexities,
which is obviously pointless.
You cannot see the point of observing that other languages let you solve
problems more succinctly and efficiently than Haskell?
Post by Ertugrul Söylemez
Post by Jon Harrop
That speculation is obviously wrong: my F# is bounds checked and is still
much faster than your Haskell.
You're saying that F# outperforms C.
C compiled with gcc, yes.
Post by Ertugrul Söylemez
I used to program in F# at work. My F# code was always much slower than
the equivalent C code.
My F# code often outperforms C. My F# code has even outperformed
vendor-tuned Fortran on linear algebra.
Post by Ertugrul Söylemez
Post by Jon Harrop
Post by Ertugrul Söylemez
Post by Jon Harrop
The nearest she came to a decent Haskell implementation is 4,690
bytes of Haskell code (!) and still runs several times slower than
my 436 byte F# solution.
See my other post. Haskell sieve of Eratosthenes in 7 lines of code.
No. Your incomplete Haskell solution is already 15 lines of code.
My solution is not incomplete
Your solution does not extend its sieve => it is incomplete.
Post by Ertugrul Söylemez
and I could remove what would be "bloat"
in your terminology: type signatures, convenience functions and
imports. But I find the code more readable the way I've written it.
The core sieve can be implemented in only three lines. With combinators
main = fmap (map read) getArgs >>=
mapM_ (\n -> print . last . map fst . filter snd . assocs . runSTUArray $
newArray (2,n) True >>= \arr ->
arr <$ forM_ [2..n] (\i -> readArray arr i >>=
(`when` forM_ [2*i, 3*i .. n] (flip (writeArray arr) False))))
That's 7 lines of code, still incomplete by design, and it doesn't even
compile:

$ ghc-6.12.1 --make -O2 sieve.hs -o sieve
[1 of 1] Compiling Main ( sieve.hs, sieve.o )

sieve.hs:1:23: Not in scope: `getArgs'

sieve.hs:2:59: Not in scope: `assocs'

sieve.hs:2:68: Not in scope: `runSTUArray'

sieve.hs:3:10: Not in scope: `newArray'

sieve.hs:4:14: Not in scope: `<$'

sieve.hs:4:17: Not in scope: `forM_'

sieve.hs:4:37: Not in scope: `readArray'

sieve.hs:5:13: Not in scope: `when'

sieve.hs:5:20: Not in scope: `forM_'

sieve.hs:5:49: Not in scope: `writeArray'
Post by Ertugrul Söylemez
This is not merely the sieve.
Indeed, it doesn't even implement the extensible sieve correctly.
Post by Ertugrul Söylemez
It's a complete program,
Then why doesn't it compile or work?
Post by Ertugrul Söylemez
which expects sieve sizes as parameters,
A complete program would not even require sieve sizes.
Post by Ertugrul Söylemez
Post by Jon Harrop
Post by Ertugrul Söylemez
Two unnecessary writeArrays are there just for efficiency. And it
gets near C speed.
On my machine your Haskell is over 3x slower than your C.
Haskell code is always orders of magnitude slower, as soon as you
compile it.
I suppose it's my fault your programs are so long and incomplete as well?
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u
Ertugrul Söylemez
2010-02-24 14:42:13 UTC
Permalink
Post by Jon Harrop
Post by Ertugrul Söylemez
Post by Jon Harrop
The following traditional implementation of the y-combinator is
fix f = (\x -> f (x x)) (\x -> f (x x))
Yet the type system of standard Haskell fails on it.
This is the traditional Y combinator of untyped lambda calculus.
Haskell implements the typed lambda calculus, in which you can't
express the Y combinator. It is correct and expected that this code
does not typecheck.
This is not related to any shortcoming of Haskell's type system. It
does its job properly. It's a shortcoming of your understanding of
the lambda calculus.
If it is not a shortcoming of Haskell's type system, why does it work
$ ocaml -rectypes
Objective Caml version 3.11.1
# let fix f = (fun x -> f(x x)) (fun x y -> f(x x) y);;
val fix : (('a -> 'b) -> 'a -> 'b) -> 'a -> 'b = <fun>
# fix (fun f -> function 0 -> 1 | n -> n*f(n-1)) 10;;
- : int = 3628800
I don't know OCaml's type system, but obviously it doesn't implement the
typed lambda calculus exactly, at least not with the '-rectypes'
argument you're passing to it. Note that I'm talking about standard
Haskell without extensions.
Post by Jon Harrop
Post by Ertugrul Söylemez
Post by Jon Harrop
Post by Ertugrul Söylemez
That may be because my Haskell version is not the sieve of
Eratosthenes. ;)
Exactly.
Good. Then your whole statement is void, because you're comparing
the running times of two different algorithms with different
complexities, which is obviously pointless.
You cannot see the point of observing that other languages let you
solve problems more succinctly and efficiently than Haskell?
If your point is something completely different and unrelated to your
actual statement, no I can't see your point. I don't care either,
because your point is wrong anyway.
Post by Jon Harrop
Post by Ertugrul Söylemez
Post by Jon Harrop
Post by Ertugrul Söylemez
Post by Jon Harrop
The nearest she came to a decent Haskell implementation is 4,690
bytes of Haskell code (!) and still runs several times slower
than my 436 byte F# solution.
See my other post. Haskell sieve of Eratosthenes in 7 lines of code.
No. Your incomplete Haskell solution is already 15 lines of code.
My solution is not incomplete
Your solution does not extend its sieve => it is incomplete.
It is complete, because we were talking about the sieve of Eratosthenes.
The original description does not include extension. If it did, I would
have written a growing sieve and it wouldn't have 4690 bytes. Just
because someone else fails to do it properly doesn't mean it's not
possible.
Post by Jon Harrop
Post by Ertugrul Söylemez
and I could remove what would be "bloat" in your terminology: type
signatures, convenience functions and imports. But I find the code
more readable the way I've written it. The core sieve can be
implemented in only three lines. With combinators and less syntactic
main = fmap (map read) getArgs >>=
mapM_ (\n -> print . last . map fst . filter snd . assocs .
runSTUArray $
newArray (2,n) True >>= \arr ->
arr <$ forM_ [2..n] (\i -> readArray arr i >>=
(`when` forM_ [2*i, 3*i .. n] (flip (writeArray arr)
False))))
That's 7 lines of code, still incomplete by design, and it doesn't even
I posted five lines of code, not seven. Not my fault if your
newsreader messes up my post. Also I don't count imports, because I
could use fully qualified names just as well, but it would be braindead
to do it.
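For reference, a plausible import header for the snippet above (module
names as in GHC's base and array packages):

import Control.Monad (forM_, when)
import Data.Array (assocs)
import Data.Array.ST (newArray, readArray, writeArray, runSTUArray)
import Data.Functor ((<$))
import System.Environment (getArgs)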
Post by Jon Harrop
Post by Ertugrul Söylemez
This is not merely the sieve.
Indeed, it doesn't even implement the extensible sieve correctly.
Post by Ertugrul Söylemez
It's a complete program,
Then why doesn't it compile or work?
Post by Ertugrul Söylemez
which expects sieve sizes as parameters,
A complete program would not even require sieve sizes.
Thousands of implementations in history disagree with you.
Post by Jon Harrop
Post by Ertugrul Söylemez
Post by Jon Harrop
Post by Ertugrul Söylemez
Two unnecessary writeArrays are there just for efficiency. And it
gets near C speed.
On my machine your Haskell is over 3x slower than your C.
Haskell code is always orders of magnitude slower, as soon as you
compile it.
I suppose it's my fault your programs are so long and incomplete as well?
If I were to write a "complete" (in your sense) implementation, it would
be shorter than your F# code. But I don't feel like wasting my time to
prove things I don't care about to people I don't care about. I've
already done much more than I wanted to.


Greets
Ertugrul
--
nightmare = unsafePerformIO (getWrongWife >>= sex)
http://blog.ertes.de/
Jon Harrop
2010-02-24 21:09:57 UTC
Permalink
Post by Ertugrul Söylemez
Post by Jon Harrop
Post by Ertugrul Söylemez
Post by Jon Harrop
The following traditional implementation of the y-combinator is
fix f = (\x -> f (x x)) (\x -> f (x x))
Yet the type system of standard Haskell fails on it.
This is the traditional Y combinator of untyped lambda calculus.
Haskell implements the typed lambda calculus, in which you can't
express the Y combinator. It is correct and expected that this code
does not typecheck.
This is not related to any shortcoming of Haskell's type system. It
does its job properly. It's a shortcoming of your understanding of
the lambda calculus.
If it is not a shortcoming of Haskell's type system, why does it work
$ ocaml -rectypes
Objective Caml version 3.11.1
# let fix f = (fun x -> f(x x)) (fun x y -> f(x x) y);;
val fix : (('a -> 'b) -> 'a -> 'b) -> 'a -> 'b = <fun>
# fix (fun f -> function 0 -> 1 | n -> n*f(n-1)) 10;;
- : int = 3628800
I don't know OCaml's type system, but obviously it doesn't implement the
typed lambda calculus exactly, at least not with the '-rectypes'
argument you're passing to it. Note that I'm talking about standard
Haskell without extensions.
The problem is that you have been confusing "the typed lambda calculus" with
the *simply* typed lambda calculus.
Post by Ertugrul Söylemez
Post by Jon Harrop
Post by Ertugrul Söylemez
Post by Jon Harrop
Post by Ertugrul Söylemez
See my other post. Haskell sieve of Eratosthenes in 7 lines of code.
No. Your incomplete Haskell solution is already 15 lines of code.
My solution is not incomplete
Your solution does not extend its sieve => it is incomplete.
It is complete, because we were talking about the sieve of Eratosthenes.
No, we were talking specifically about my extensible implementation of it
here:

http://groups.google.com/group/comp.lang.functional/msg/3659ea9adaa46653?hl=en

You can tell that by the way everyone else posted solutions equivalent to
that one.
Post by Ertugrul Söylemez
The original description does not include extension.
Yes, it did. I know that because I posted the original description.
Post by Ertugrul Söylemez
If it did, I would
have written a growing sieve and it wouldn't have 4690 bytes. Just
because someone else fails to do it proper doesn't mean it's not
possible.
Nobody claimed it was impossible to implement in Haskell. You claimed it
would be "easy" to write and "fast" to run and then failed to produce any
correct or performant solutions in Haskell.
Post by Ertugrul Söylemez
Post by Jon Harrop
Post by Ertugrul Söylemez
and I could remove what would be "bloat" in your terminology: type
signatures, convenience functions and imports. But I find the code
more readable the way I've written it. The core sieve can be
implemented in only three lines. With combinators and less syntactic
main = fmap (map read) getArgs >>=
mapM_ (\n -> print . last . map fst . filter snd . assocs .
runSTUArray $
newArray (2,n) True >>= \arr ->
arr <$ forM_ [2..n] (\i -> readArray arr i >>=
(`when` forM_ [2*i, 3*i .. n] (flip (writeArray arr)
False))))
That's 7 lines of code, is still incomplete by design and doesn't even
I posted five lines of code, not seven. Not my fault if your
newsreader messes up my post.
But it is your fault that your code overran 80 char lines. You simply put
more code on fewer lines and switched from counting chars to counting lines
in order to substantiate the pro-Haskell conclusion you started out with.

You'll notice that my original F# solution was properly formatted and its
longest line was only 43 chars.
Post by Ertugrul Söylemez
Also I don't count imports, because...
because you want to conclude that Haskell is concise.
Post by Ertugrul Söylemez
Post by Jon Harrop
Post by Ertugrul Söylemez
This is not merely the sieve.
Indeed, it doesn't even implement the extensible sieve correctly.
Post by Ertugrul Söylemez
It's a complete program,
Then why doesn't it compile or work?
Post by Ertugrul Söylemez
which expects sieve sizes as parameters,
A complete program would not even require sieve sizes.
Thousands of implementations in history disagree with you.
Three other people responded with competing implementations to the challenge
I set. Only you got it wrong.
Post by Ertugrul Söylemez
Post by Jon Harrop
Post by Ertugrul Söylemez
Post by Jon Harrop
Post by Ertugrul Söylemez
Two unnecessary writeArrays are there just for efficiency. And it
gets near C speed.
On my machine your Haskell is over 3x slower than your C.
Haskell code is always orders of magnitude slower, as soon as you
compile it.
I suppose it's my fault your programs are so long and incomplete as well?
If I were to write a "complete" (in your sense) implementation, it would
be shorter than your F# code...
Wow.
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u
Matthias
2010-02-25 02:16:48 UTC
Permalink
Post by Ertugrul Söylemez
I don't know OCaml's type system, but obviously it doesn't implement the
typed lambda calculus exactly, at least not with the '-rectypes'
argument you're passing to it.
You are talking about "the" typed lambda calculus as if there were
only one. There are many typed lambda calculi, some for which your
claim is true, and others for which your claim is false. In
particular, in a lambda-calculus with equi-recursive types the Y
combinator can be typed. OCaml with -rectypes happens to implement
such a lambda calculus.

(However, it is widely acknowledged that the behavior that you get
from -rectypes is not usually desirable for many practical reasons.
That's why the option is not on by default.)

Cheers,
Matthias
Juan Pedro Bolivar Puente
2010-02-22 23:35:01 UTC
Permalink
Post by Jon Harrop
You annotated the types by hand because that is idiomatic Haskell. That is
idiomatic Haskell because the type system is too easily misled by design
(flaw). Type annotations are one of the main sources of verbosity in real
Haskell code.
How can a single line per top-level declaration ever be considered
bloat? Really, I did not want to participate in this discussion because
I am far from an expert on FP, but you keep repeating this argument and
it is complete nonsense. As Ertugrul has just said, the type
annotations are **completely optional**; the type system will always
infer the correct, most general type -- except with certain odd extensions
and in the monomorphism restriction case.

Why, then, are type annotations widely used? Because they help the
programmer. I don't know what kind of code you write, but I guess
that you write comments. It is idiomatic in any language to write one
comment per top-level function unless you want to make everybody else in
your team angry. Just consider those type annotations formal
documentation. A very nice form of documentation though, because if the
code is wrong the compiler will tell you that it doesn't do what you
meant in the type signature.

My two cents,

JP
Jon Harrop
2010-02-23 02:36:18 UTC
Permalink
Post by Juan Pedro Bolivar Puente
Post by Jon Harrop
You annotated the types by hand because that is idiomatic Haskell. That
is idiomatic Haskell because the type system is too easily misled by
design (flaw). Type annotations are one of the main sources of verbosity
in real Haskell code.
How can a single line per top-level declaration ever be considered
bloat?
Type annotations are a significant proportion of the total number of lines
of code in real Haskell code bases (but not in other FPLs).
Post by Juan Pedro Bolivar Puente
Really, I did not want to participate in this discussion because
I am far from an expert on FP, but you keep repeating this argument and
it is complete nonsense.
I already gave you one example where this is the case. Look at any
production Haskell code and you will find the same thing. In Ertugrul's
original Haskell code on this thread, these unnecessary type annotations
*doubled* the size of his code. That is not uncommon.
Post by Juan Pedro Bolivar Puente
As Ertugrul has just said, the type annotations are **completely
optional**; the type system will always infer the correct, most general
type -- except with certain odd extensions and in the monomorphism
restriction case.
You are describing a property of Hindley-Milner yet other languages based on
HM are not afflicted by these type annotations. How do you explain that?
Post by Juan Pedro Bolivar Puente
Why, then, are type annotations widely used? Because they help the
programmer. I don't know what kind of code you write, but I guess
that you write comments. It is idiomatic in any language to write one
comment per top-level function unless you want to make everybody else in
your team angry. Just consider those type annotations formal
documentation. A very nice form of documentation though, because if the
code is wrong the compiler will tell you that it doesn't do what you
meant in the type signature.
You are just repeating the same bullshit "explanation" that Ertugrul posted.
If that really were the motivation for bloating every definition with a
type annotation in production code you would see the same effect in all
FPLs but you do not. Why?

The real reason is, of course, that those theoretically-optional type
annotations are practically essential in Haskell for developer
productivity but not in most other FPLs. They are a manifestation of a
design flaw in Haskell's type system: its "power" makes it too easily
misled so it needs that extra guidance. If you removed those type
annotations you would suffer from incomprehensible error messages in
Haskell but not in other FPLs with simpler type systems.
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u
Erik de Castro Lopo
2010-02-23 01:40:38 UTC
Permalink
Post by Jon Harrop
Type annotations are a significant proportion of the total number of lines
of code in real Haskell code bases (but not in other FPLs).
Comments take up far more lines than type annotations. Are they also
bloat?

Idiot!

Erik
--
----------------------------------------------------------------------
Erik de Castro Lopo
http://www.mega-nerd.com/
Jon Harrop
2010-02-23 03:22:37 UTC
Permalink
Post by Erik de Castro Lopo
Post by Jon Harrop
Type annotations are a significant proportion of the total number of
lines of code in real Haskell code bases (but not in other FPLs).
Comments take up far more lines than type annotations.
Where did this "fact" come from?
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u
Paul Rubin
2010-02-23 03:05:31 UTC
Permalink
Post by Jon Harrop
Post by Juan Pedro Bolivar Puente
As Ertugrul has just said, the type annotations are **completely
optional**; the type system will always infer the correct, most general
type -- except with certain odd extensions and in the monomorphism
restriction case.
You are describing a property of Hindley-Milner yet other languages based on
HM are not afflicted by these type annotations. How do you explain that?
That claim is pretty bogus. ML, for example, has an entire separate
sublanguage (its module system) for writing type annotations, that
Haskell handles with a class constraint here or there. Even to add two
plus two in Ocaml, you have to annotate both the "+" and the "2"
(i.e. "2+2" or "2. +. 2.") to distinguish integer from floating versions
of the numeric literal and the addition function. I think other ML
dialects get around that problem somehow, but Haskell's scheme for it is
quite natural and extensible and the automatic inference works pretty
well.
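The overloading in question is carried by the Num class, so one literal
and one operator serve every numeric type; a small illustration:

-- Numeric literals are sugar for fromInteger, so the same expression
-- typechecks at any Num instance; no separate (+.) operator is needed:
four :: Int
four = 2 + 2

four' :: Double
four' = 2 + 2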

Writing a top-level function in Haskell, I usually either write the type
annotation first, which then guides writing the code; or else I write
the code first, then let ghci infer the type, then paste the type back
into the program for later reference. Either way, the annotation feels
natural and useful as machine-checked documentation, even though it's
not necessary.
Jon Harrop
2010-02-23 05:44:08 UTC
Permalink
Post by Paul Rubin
Post by Jon Harrop
Post by Juan Pedro Bolivar Puente
As Ertugrul has just said, the type annotations are **completely
optional**; the type system will always infer the correct, most general
type -- except with certain odd extensions and in the monomorphism
restriction case.
You are describing a property of Hindley-Milner yet other languages based
on HM are not afflicted by these type annotations. How do you explain
that?
That claim is pretty bogus. ML, for example, has an entire separate
sublanguage (its module system) for writing type annotations,
No, that is not what the module system is for at all. The primary use of the
module system is to generate new data structures from one or more existing
data structures using higher-order modules. Consequently, type annotations
appear only in the library's abstract signatures and not in the user's
concrete structures at all.

For example, using the ocamlgraph library to create a data structure
representing an undirected network of coordinates:

module G = Imperative.Graph.Abstract(struct type t = int * int end)

From: http://ocamlgraph.lri.fr/sudoku.ml

There is a type definition but no annotation.

Creating an undirected network with integer coordinates for nodes and
integer labels on edges:

module IntInt = struct
type t = int * int
end
module Int = struct
type t = int
let compare = compare
let hash = Hashtbl.hash
let equal = (=)
let default = 0
end
module G = Imperative.Graph.AbstractLabeled(IntInt)(Int)

From: http://ocamlgraph.lri.fr/color.ml

Again, two type definitions and the rest are definitions without
annotations.

As the types appear only in the reusable abstract interfaces, they are
comparatively rare whereas type annotations in Haskell appear on almost
every definition. Annotating the signature argument to a higher-order
module is not comparable to annotating the type of a definition.
Post by Paul Rubin
that Haskell handles with a class constraint here or there.
No, Haskell cannot express this because it doesn't have a higher order
module system. See:

http://augustss.blogspot.com/2008/12/somewhat-failed-adventure-in-haskell.html
Post by Paul Rubin
Even to add two
plus two in OCaml, you have to annotate both the "+" and the "2"
(i.e. "2+2" or "2. +. 2.") to distinguish integer from floating versions
of the numeric literal and the addition function. I think other ML
dialects get around that problem somehow,
Standard ML ad-hoc overloads "+" for int and float but no other types and
you must still write 2.0 instead of 2 if you want a float. F# has a limited
form of type classes for operators but, again, you must write the literals
correctly.
Post by Paul Rubin
but Haskell's scheme for it is quite natural and extensible and the
automatic inference works pretty well.
Type classes are great for some very specific problems like operator
overloading but they are abused for many other things in Haskell because it
lacks alternatives like a higher-order module system, polymorphic variants
an object system and so on.

You're right that there are other forms of bloat buried in OCaml code
though. Perhaps the only logical course is to compare verbosity.
Post by Paul Rubin
Writing a top-level function in Haskell, I usually either write the type
annotation first, which then guides writing the code; or else I write
the code first, then let ghci infer the type, then paste the type back
into the program for later reference. Either way, the annotation feels
natural and useful as machine-checked documentation, even though it's
not necessary.
What about source files for batch compilation? Don't you annotate almost all
definitions? What happens if you try to develop a complicated program
without writing and maintaining those type annotations by hand?
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u
Ertugrul Söylemez
2010-02-23 03:16:56 UTC
Permalink
[a lot of bullshit, which is entirely wrong]
Greets
Ertugrul
--
nightmare = unsafePerformIO (getWrongWife >>= sex)
http://blog.ertes.de/
Andreas Rossberg
2010-02-24 23:22:09 UTC
Permalink
I can't refrain from a couple of comments on Jon's agitation, though
I'm sure I'm gonna regret posting them...
[...] Look at any
production Haskell code and you will find the same thing. In Ertugrul's
original Haskell code on this thread, these unnecessary type annotations
*doubled* the size of his code. That is not uncommon.
That is a wild claim, and I doubt that you have any serious statistics
to back it up.

Moreover, if the declaration-to-definition ratio is really higher in
Haskell code than in ML, then that probably says more about the
high-levelness of Haskell code than about alleged weaknesses of its type
inference.
Post by Juan Pedro Bolivar Puente
As Ertugrul has just said, the type annotations are **completely
optional**; the type system will always infer the correct, most general
type -- except with certain odd extensions and in the monomorphism
restriction case.
You are describing a property of Hindley-Milner yet other languages based on
HM are not afflicted by these type annotations. How do you explain that?
Haskell's monomorphism restriction is a relaxed version of ML's value
restriction. And while Haskell at least allows you to get around it
with a type annotation, there is no way to do so in ML. Are you truly
arguing that that is an advantage and makes ML type inference "more
powerful"?

Generally speaking, annotations are a moot point anyway. It has long
been acknowledged by most experts that complete type inference is a
dead end. As a trade-off for increased expressiveness, future type
systems will almost certainly require some amount of annotation. In
fact, they already do in all existing languages, including all MLs.
Post by Juan Pedro Bolivar Puente
Why, then, are type annotations widely used? Because they help the
programmer. [...]
You are just repeating the same bullshit "explanation" that Ertugrul posted.
If that really were the motivation for bloating every definition with a
type annotation in production code you would see the same effect in all
FPLs but you do not. Why?
A reasonable ML programmer will typically put a similar number of type
declarations into module signatures. The fact that you never do that
probably says more about your approach to software engineering than
about the alleged verbosity of ML vs Haskell.


[in another subthread:]
The following traditional implementation of the y-combinator is correct
fix f = (\x -> f (x x)) (\x -> f (x x))
Yet the type system of standard Haskell fails on it.
And so does any of the MLs, except if you intentionally run OCaml with
rectypes on. Arbitrary recursive types are usually disallowed for very
good reasons that you should be aware of before using them as an
argument. Incidentally, these reasons have to do with too much power
and incomprehensible error messages.
No, Haskell cannot express this because it doesn't have a higher order
module system.
Interestingly, with the same verve, you have denied any need for
functors in previous discussions (where you happened to pitch F#,
which does not have them either).
Jon Harrop
2010-02-25 03:38:42 UTC
Permalink
Post by Andreas Rossberg
[...] Look at any
production Haskell code and you will find the same thing. In Ertugrul's
original Haskell code on this thread, these unnecessary type annotations
*doubled* the size of his code. That is not uncommon.
That is a wild claim, and I doubt that you have any serious statistics
to back it up.
I meant that type annotations often amount to a significant proportion of
Haskell code bases, not that they commonly amount to 50%.
Post by Andreas Rossberg
Moreover, if the declaration-to-definition ratio is really higher in
Haskell code than in ML, then that probably says more about the
high-levelness of Haskell code than about alleged weaknesses of its type
inference.
What exactly do you mean by "high-levelness" there?
Post by Andreas Rossberg
Post by Juan Pedro Bolivar Puente
As Ertugrul has just said, the type annotations are **completely
optional**; the type system will always infer the correct, most general
type -- except with certain odd extensions and in the monomorphism
restriction case.
You are describing a property of Hindley-Milner yet other languages based
on HM are not afflicted by these type annotations. How do you explain
that?
Haskell's monomorphism restriction is a relaxed version of ML's value
restriction. And while Haskell at least allows you to get around it
with a type annotation, there is no way to do so in ML. Are you truly
arguing that that is an advantage and makes ML type inference "more
powerful"?
No, I was not referring to the "monomorphic restriction" part.

My point was that there are trade-offs between weak and powerful type
systems in terms of usability. More powerful => incomprehensible errors =>
add more type annotations to keep it usable.
Post by Andreas Rossberg
Generally speaking, annotations are a moot point anyway. It has long
been acknowledged by most experts that complete type inference is a
dead end. As a trade-off for increased expressiveness, future type
systems will almost certainly require some amount of annotation.
Sure.
Post by Andreas Rossberg
In fact, they already do in all existing languages, including all MLs.
What problem requires a solution using type annotations in OCaml?
Post by Andreas Rossberg
Post by Juan Pedro Bolivar Puente
Why are then the type annotations widely used? Because they help the
programmer. [...]
You are just repeating the same bullshit "explanation" that Ertugrul
posted. If that really were the motivation for bloating every definition
with a type annotation in production code you would see the same effect
in all FPLs but you do not. Why?
A reasonable ML programmer will typically put a similar number of type
declarations into module signatures. The fact that you never do that
probably says more about your approach to software engineering than about
the alleged verbosity of ML vs Haskell.
Firstly, Standard ML programmers might put that density of annotations in
their code but OCaml programmers certainly do not. Look at the examples I
already cited.

Secondly, I said I don't annotate every definition not that I never annotate
any definition. Look at my open source code.

Thirdly, why the strawman-powered ad-hominem?
Post by Andreas Rossberg
[in another subthread:]
The following traditional implementation of the y-combinator is correct
fix f = (\x -> f (x x)) (\x -> f (x x))
Yet the type system of standard Haskell fails on it.
And so does any of the MLs, except if you intentionally run OCaml with
rectypes on. Arbitrary recursive types are usually disallowed for very
good reasons that you should be aware of before using them as an
argument. Incidentally, these reasons have to do with too much power
and incomprehensible error messages.
Sure. And because the workaround is generally easy:

# let fix f =
(fun (`X x) -> f(x (`X x))) (`X(fun (`X x) y -> f(x (`X x)) y));;
val fix : (('a -> 'b) -> 'a -> 'b) -> 'a -> 'b = <fun>
Post by Andreas Rossberg
No, Haskell cannot express this because it doesn't have a higher order
module system.
Interestingly, with the same verve, you have denied any need for
functors in previous discussions (where you happened to pitch F#,
which does not have them either).
There are alternatives to a higher-order module system. Haskell doesn't have
them either.
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u
Ertugrul Söylemez
2010-02-25 16:22:14 UTC
Permalink
Post by Jon Harrop
Post by Andreas Rossberg
Post by Jon Harrop
No, Haskell cannot express this because it doesn't have a higher
order module system.
Interestingly, with the same verve, you have denied any need for
functors in previous discussions (where you happened to pitch F#,
which does not have them either).
There are alternatives to a higher-order module system. Haskell
doesn't have them either.
It has type classes, which cover 90% of use cases in standard Haskell.
If they don't cover your use case, there are numerous extensions that
will.
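A minimal sketch of the idea, with made-up names (the class plays the
role of an ML signature, each instance the role of a structure):

-- Generic code is written against the class and works for any instance.
class Weight t where
  wzero :: t
  wadd  :: t -> t -> t

instance Weight Int where
  wzero = 0
  wadd  = (+)

total :: Weight t => [t] -> t
total = foldr wadd wzero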

Now be a mature boy and don't try to make up a difficult use case,
forcing us to prove things with actual code. I don't feel like wasting
my time anymore. Haskell works great for me for all of my use cases,
much better than OCaml ever will.


Greets
Ertugrul
--
nightmare = unsafePerformIO (getWrongWife >>= sex)
http://blog.ertes.de/
Jon Harrop
2010-02-21 05:20:34 UTC
Permalink
Post by frankenstein
Post by Jon Harrop
let rec isPrime n =
primes
|> Seq.takeWhile (fun d -> d*d <= n)
|> Seq.forall (fun d -> n%d <> 0)
and primes =
let rec f n =
if isPrime n then Some(n, n+1) else f(n+1)
Seq.unfold f 3
|> Seq.append [2]
|> Seq.cache
Takes about the same time as the Haskell to find the 100,000th prime.
Would be interesting to convert it into imperative style, accumulating a
resizeable array.
Hi: I somehow managed to understand the Haskell version based on lazy
evaluation but gave up on grasping the OCaml version.
Note that this is F# code and not OCaml code.
Post by frankenstein
What does 'Some' do? What is 'f' supposed to stand for in 'rec f n' and
how is it all processed (what is the starting point)? I do not
understand how the 'guards |>' in the OCaml code accomplish the
behavior of 'laziness' here.
This code is confusing and fragile primarily because I (ab)used the .NET
IEnumerable (called Seq in F#) as a poor man's lazy infinite sequence.

The function "f" is a generator that takes one prime and produces the next
prime. The Seq.unfold function uses "f" to create the infinite sequence of
primes. The "|>" operator just pipes a value through a function.
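The Haskell counterpart of such a generator is Data.List.unfoldr; for
comparison:

import Data.List (unfoldr)

-- Builds the lazy infinite sequence 3, 4, 5, ... from a seed; returning
-- Nothing instead of Just would terminate the sequence.
from3 :: [Integer]
from3 = unfoldr (\n -> Just (n, n + 1)) 3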
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u
Jon Harrop
2010-02-21 05:35:50 UTC
Permalink
Post by Jon Harrop
let rec isPrime n =
primes
|> Seq.takeWhile (fun d -> d*d <= n)
|> Seq.forall (fun d -> n%d <> 0)
and primes =
let rec f n =
if isPrime n then Some(n, n+1) else f(n+1)
Seq.unfold f 3
|> Seq.append [2]
|> Seq.cache
Takes about the same time as the Haskell to find the 100,000th prime.
Here is a shorter, simpler and more direct translation that just uses F#'s
LazyList data structure:

open LazyList

let rec isPrime n =
Seq.takeWhile (fun d -> d*d <= n) primes
|> Seq.forall (fun d -> n%d <> 0)
and primes =
unfold (fun n -> Some(n, n+1)) 3
|> filter isPrime
|> cons 2

As you can see, the differences are merely syntactic. Moreover, the
performance is still comparable.
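For comparison, a reconstruction of the Haskell two-liner under
discussion (the exact upthread version is not reproduced in this part of
the thread):

-- Mutually recursive: primes is built by filtering the naturals with
-- isPrime, which in turn consults only the primes up to sqrt n.
isPrime :: Int -> Bool
isPrime n = all (\d -> n `mod` d /= 0) (takeWhile (\d -> d * d <= n) primes)

primes :: [Int]
primes = 2 : filter isPrime [3 ..]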
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u
Jon Harrop
2010-02-21 06:58:57 UTC
Permalink
Post by Jon Harrop
Post by Jon Harrop
let rec isPrime n =
primes
|> Seq.takeWhile (fun d -> d*d <= n)
|> Seq.forall (fun d -> n%d <> 0)
and primes =
let rec f n =
if isPrime n then Some(n, n+1) else f(n+1)
Seq.unfold f 3
|> Seq.append [2]
|> Seq.cache
Takes about the same time as the Haskell to find the 100,000th prime.
Here is a shorter, simpler and more direct translation that just uses F#'s
open LazyList
let rec isPrime n =
Seq.takeWhile (fun d -> d*d <= n) primes
|> Seq.forall (fun d -> n%d <> 0)
and primes =
unfold (fun n -> Some(n, n+1)) 3
|> filter isPrime
|> cons 2
As you can see, the differences are merely syntactic. Moreover, the
performance is still comparable.
Here is the Sieve of Eratosthenes implemented in imperative F# such that the
number of primes sieved roughly doubles with each iteration:

let primes =
let a = ResizeArray[2]
let grow() =
let p0 = a.[a.Count-1]+1
let b = Array.create p0 true
for di in a do
let i0 = p0/di*di
let i0 = if i0<p0 then i0+di else i0
let rec loop i =
if i<b.Length then
b.[i] <- false
loop(i+di)
loop(i0-p0)
for i=0 to b.Length-1 do
if b.[i] then a.Add(p0+i)
fun n ->
while n >= a.Count do
grow()
a.[n]

This is 2.5x more chars and over 100x faster than the original Haskell. For
example, you can get the 10,000,000th prime in only a few seconds.
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u
Jon Harrop
2010-02-21 19:07:41 UTC
Permalink
Post by Jon Harrop
Here is the Sieve of Eratosthenes implemented in imperative F# such that
let primes =
let a = ResizeArray[2]
let grow() =
let p0 = a.[a.Count-1]+1
let b = Array.create p0 true
for di in a do
let i0 = p0/di*di
let i0 = if i0<p0 then i0+di else i0
let rec loop i =
if i<b.Length then
b.[i] <- false
loop(i+di)
loop(i0-p0)
for i=0 to b.Length-1 do
if b.[i] then a.Add(p0+i)
fun n ->
while n >= a.Count do
grow()
a.[n]
This is 2.5x more chars and over 100x faster than the original Haskell.
For example, you can get the 10,000,000th prime in only a few seconds.
Compare with the Haskell implementation of the same simple sieve by Melissa
O'Neill (that is slower and 10x longer!):

module ONeillPrimes (primes, sieve, calcPrimes, primesToNth,
primesToLimit) where

data PriorityQ k v = Lf
                   | Br {-# UNPACK #-} !k v !(PriorityQ k v) !(PriorityQ k v)
deriving (Eq, Ord, Read, Show)

emptyPQ :: PriorityQ k v
emptyPQ = Lf

isEmptyPQ :: PriorityQ k v -> Bool
isEmptyPQ Lf = True
isEmptyPQ _ = False

minKeyValuePQ :: PriorityQ k v -> (k, v)
minKeyValuePQ (Br k v _ _) = (k,v)
minKeyValuePQ _ = error "Empty heap!"

minKeyPQ :: PriorityQ k v -> k
minKeyPQ (Br k v _ _) = k
minKeyPQ _ = error "Empty heap!"

minValuePQ :: PriorityQ k v -> v
minValuePQ (Br k v _ _) = v
minValuePQ _ = error "Empty heap!"

insertPQ :: Ord k => k -> v -> PriorityQ k v -> PriorityQ k v
insertPQ wk wv (Br vk vv t1 t2)
| wk <= vk = Br wk wv (insertPQ vk vv t2) t1
| otherwise = Br vk vv (insertPQ wk wv t2) t1
insertPQ wk wv Lf = Br wk wv Lf Lf

siftdown :: Ord k => k -> v -> PriorityQ k v -> PriorityQ k v -> PriorityQ k v
siftdown wk wv Lf _ = Br wk wv Lf Lf
siftdown wk wv (t @ (Br vk vv _ _)) Lf
| wk <= vk = Br wk wv t Lf
| otherwise = Br vk vv (Br wk wv Lf Lf) Lf
siftdown wk wv (t1 @ (Br vk1 vv1 p1 q1)) (t2 @ (Br vk2 vv2 p2 q2))
| wk <= vk1 && wk <= vk2 = Br wk wv t1 t2
| vk1 <= vk2 = Br vk1 vv1 (siftdown wk wv p1 q1) t2
| otherwise = Br vk2 vv2 t1 (siftdown wk wv p2 q2)

deleteMinAndInsertPQ :: Ord k => k -> v -> PriorityQ k v -> PriorityQ k v
deleteMinAndInsertPQ wk wv Lf = error "Empty PriorityQ"
deleteMinAndInsertPQ wk wv (Br _ _ t1 t2) = siftdown wk wv t1 t2

leftrem :: PriorityQ k v -> (k, v, PriorityQ k v)
leftrem (Br vk vv Lf Lf) = (vk, vv, Lf)
leftrem (Br vk vv t1 t2) = (wk, wv, Br vk vv t t2) where
(wk, wv, t) = leftrem t1
leftrem _ = error "Empty heap!"

deleteMinPQ :: Ord k => PriorityQ k v -> PriorityQ k v
deleteMinPQ (Br vk vv Lf _) = Lf
deleteMinPQ (Br vk vv t1 t2) = siftdown wk wv t2 t where
(wk,wv,t) = leftrem t1
deleteMinPQ _ = error "Empty heap!"

type HybridQ k v = (PriorityQ k v, [(k,v)])

initHQ :: PriorityQ k v -> [(k,v)] -> HybridQ k v
initHQ pq feeder = (pq, feeder)

insertHQ :: (Ord k) => k -> v -> HybridQ k v -> HybridQ k v
insertHQ k v (pq, q) = (insertPQ k v pq, q)

deleteMinAndInsertHQ :: (Ord k) => k -> v -> HybridQ k v -> HybridQ k v
deleteMinAndInsertHQ k v (pq, q) = postRemoveHQ (deleteMinAndInsertPQ k v pq, q)
where
postRemoveHQ mq@(pq, []) = mq
postRemoveHQ mq@(pq, (qk,qv) : qs)
| qk < minKeyPQ pq = (insertPQ qk qv pq, qs)
| otherwise = mq

minKeyHQ :: HybridQ k v -> k
minKeyHQ (pq, q) = minKeyPQ pq

minKeyValueHQ :: HybridQ k v -> (k, v)
minKeyValueHQ (pq, q) = minKeyValuePQ pq

{-# SPECIALIZE wheel :: [Int] #-}
{-# SPECIALIZE wheel :: [Integer] #-}
wheel :: Integral a => [a]
wheel = 2:4:2:4:6:2:6:4:2:4:6:6:2:6:4:2:6:4:6:8:4:2:4:2:4:8:6:4:6:2:4:6
:2:6:6:4:2:4:6:2:6:4:2:4:2:10:2:10:wheel

{-# SPECIALIZE calcPrimes :: () -> [Int] #-}
{-# SPECIALIZE calcPrimes :: () -> [Integer] #-}
calcPrimes :: Integral a => () -> [a]
calcPrimes () = 2 : 3 : 5 : 7 : sieve 11 wheel

{-# SPECIALIZE primes :: [Int] #-}
{-# SPECIALIZE primes :: [Integer] #-}
primes :: Integral a => [a]
primes = calcPrimes ()

{-# SPECIALIZE primesToNth :: Int -> [Integer] #-}
{-# SPECIALIZE primesToNth :: Int -> [Int] #-}
primesToNth :: Integral a => Int -> [a]
primesToNth n = take n (calcPrimes ())

{-# SPECIALIZE primesToLimit :: Integer -> [Integer] #-}
{-# SPECIALIZE primesToLimit :: Int -> [Int] #-}
primesToLimit :: Integral a => a -> [a]
primesToLimit limit = takeWhile (< limit) (calcPrimes ())

{-# SPECIALIZE sieve :: Int -> [Int] -> [Int] #-}
{-# SPECIALIZE sieve :: Integer -> [Integer] -> [Integer] #-}
sieve :: Integral a => a -> [a] -> [a]
sieve n [] = []
sieve n wheel@(d:ds) = n : (map (\(p,wheel) -> p) primes1) where
primes1 = sieve' (n+d) ds initialTable
primes2 = sieve' (n+d) ds initialTable
initialTable = initHQ (insertPQ (n*n) (n, wheel) emptyPQ)
(map (\(p,wheel) -> (p*p,(p,wheel))) primes2)
sieve' n [] table = []
sieve' n wheel@(d:ds) table
| nextComposite <= n = sieve' (n+d) ds (adjust table)
| otherwise = (n,wheel) : sieve' (n+d) ds table
where
nextComposite = minKeyHQ table
adjust table
| m <= n = adjust (deleteMinAndInsertHQ m' (p, ds)
table)
| otherwise = table
where
(m, (p, d:ds)) = minKeyValueHQ table
m' = m + p * d
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u
frankenstein
2010-02-21 20:49:07 UTC
Permalink
Post by Jon Harrop
Post by Jon Harrop
Here is the Sieve of Eratosthenes implemented in imperative F# such that
  let primes =
    let a = ResizeArray[2]
    let grow() =
      let p0 = a.[a.Count-1]+1
      let b = Array.create p0 true
      for di in a do
        let i0 = p0/di*di
        let i0 = if i0<p0 then i0+di else i0
        let rec loop i =
          if i<b.Length then
            b.[i] <- false
            loop(i+di)
        loop(i0-p0)
      for i=0 to b.Length-1 do
        if b.[i] then a.Add(p0+i)
    fun n ->
      while n >= a.Count do
        grow()
      a.[n]
This is 2.5x more chars and over 100x faster than the original Haskell.
For example, you can get the 10,000,000th prime in only a few seconds.
Compare with the Haskell implementation of the same simple sieve by Melissa
Sod! What's that? As far as I remember, the Sieve could be expressed
using a list comprehension. I haven't been using Clean (a language
similar to Haskell) for years, but the Clean book showed how to
calculate the Sieve in a clever way. I am not 100% sure, though I think
I came across the Sieve in the Clean book.

Doesn't Haskell have similar built-in capabilities to Clean?
frankenstein
2010-02-21 21:02:42 UTC
Permalink
Post by Jon Harrop
Post by Jon Harrop
Here is the Sieve of Eratosthenes implemented in imperative F# such that
  let primes =
    let a = ResizeArray[2]
    let grow() =
      let p0 = a.[a.Count-1]+1
      let b = Array.create p0 true
      for di in a do
        let i0 = p0/di*di
        let i0 = if i0<p0 then i0+di else i0
        let rec loop i =
          if i<b.Length then
            b.[i] <- false
            loop(i+di)
        loop(i0-p0)
      for i=0 to b.Length-1 do
        if b.[i] then a.Add(p0+i)
    fun n ->
      while n >= a.Count do
        grow()
      a.[n]
This is 2.5x more chars and over 100x faster than the original Haskell.
For example, you can get the 10,000,000th prime in only a few seconds.
Compare with the Haskell implementation of the same simple sieve by Melissa
nicked from the Clean book (page 70 in the Acrobat reader):
http://www.st.cs.ru.nl/papers/cleanbook/CleanBookI.pdf

==
This algorithm to compute prime numbers is called the sieve of
Eratosthenes. Eratosthenes was a Greek mathematician born in Cyrene who
lived 276-196 BC. His algorithm can be expressed slightly more elegantly
using list comprehensions:

primes :: [Int]
primes = sieve [2..]

sieve :: [Int] -> [Int]
sieve [prime:rest] = [prime: sieve [i \\ i <- rest | i mod prime <> 0]]
==
Jon Harrop
2010-02-22 04:09:35 UTC
Permalink
Post by frankenstein
http://www.st.cs.ru.nl/papers/cleanbook/CleanBookI.pdf
==
This algorithm to compute prime numbers is called the sieve of
Eratosthenes. Eratosthenes was a Greek mathematician born in Cyrene who
lived 276-196 BC. His algorithm can be expressed slightly more elegantly
using list comprehensions:
primes :: [Int]
primes = sieve [2..]
sieve :: [Int] -> [Int]
sieve [prime:rest] = [prime: sieve [i \\ i <- rest | i mod prime <> 0]]
That is actually Turner's Sieve and not the Sieve of Eratosthenes. This
common misconception was described in detail in the following paper:

http://www.cs.hmc.edu/~oneill/papers/Sieve-JFP.pdf
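 
The difference is easy to see side by side. A minimal sketch (mine, in
the spirit of O'Neill's paper, not code from it): Turner's version
filters every surviving candidate by divisibility, whereas a faithful
sieve only ever visits the multiples of each prime:
 
import Data.Array (accumArray, assocs)
 
-- Turner's "sieve": trial division in disguise; every candidate is
-- tested against every prime found so far.
turner :: [Int]
turner = sieve [2..]
  where sieve (p:xs) = p : sieve [x | x <- xs, x `mod` p /= 0]
 
-- A bounded Sieve of Eratosthenes: only composite positions are ever
-- touched, via accumArray; no divisibility tests at all.
eratosthenes :: Int -> [Int]
eratosthenes n = [i | (i, True) <- assocs sieve]
  where
    sieve = accumArray (\_ _ -> False) True (2, n)
                       [(m, ()) | i <- [2 .. isqrt n], m <- [i*i, i*i+i .. n]]
    isqrt = floor . sqrt . (fromIntegral :: Int -> Double)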
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u
frankenstein
2010-02-22 10:43:13 UTC
Permalink
Post by Jon Harrop
Post by frankenstein
http://www.st.cs.ru.nl/papers/cleanbook/CleanBookI.pdf
==
This algorithm to compute prime numbers is called the sieve of
Eratosthenes. Eratosthenes
was a greek mathematician born in Cyrene who lived 276-196 BC. His
algorithm can be
primes :: [Int]
primes = sieve [2..]
sieve :: [Int] -> [Int]
sieve [prime:rest] = [prime: sieve [i \\ i <- rest | i mod prime <> 0]]
That is actually Turner's Sieve and not the Sieve of Eratosthenes. This
 http://www.cs.hmc.edu/~oneill/papers/Sieve-JFP.pdf
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.http://www.ffconsultancy.com/?u
Hi

I am left in the dark. Why would it be that hard to translate your F#
version into a similarly concise Haskell version? Or are people here
(with the posted Haskell version) trying to avoid imperative style in
Haskell at all costs? Or is it simply impossible to imitate the
imperative F# code in Haskell?
Ertugrul Söylemez
2010-02-22 15:06:06 UTC
Permalink
Post by frankenstein
Post by Jon Harrop
Post by frankenstein
This algorithm to compute prime numbers is called the sieve of
Eratosthenes. Eratosthenes was a greek mathematician born in
Cyrene who lived 276-196 BC. His algorithm can be expressed
primes :: [Int]
primes = sieve [2..]
sieve :: [Int] -> [Int]
sieve [prime:rest] = [prime: sieve [i \\ i <- rest | i mod prime <> 0]]
That is actually Turner's Sieve and not the Sieve of
Eratosthenes. This common misconception was described in detail in
 http://www.cs.hmc.edu/~oneill/papers/Sieve-JFP.pdf
I am left in the dark. Why would it be that hard to translate your F#
version into a similarly concise Haskell version? Or are people here
(with the posted Haskell version) trying to avoid imperative style in
Haskell at all costs? Or is it simply impossible to imitate the
imperative F# code in Haskell?
See the Haskell code in my other post. It is not a direct translation,
because my code uses a fixed sieve, but making it self-growing wouldn't
be difficult using the 'vector' package. It's also quite fast, slower
than C by a factor of only 1.6.

Also note when talking to Dr. Harrop that he's a notorious Haskell
hater. =)


Greets
Ertugrul
--
nightmare = unsafePerformIO (getWrongWife >>= sex)
http://blog.ertes.de/
frankenstein
2010-02-22 13:52:46 UTC
Permalink
Post by Jon Harrop
Here is the Sieve of Eratosthenes implemented in imperative F# such that
  let primes =
    let a = ResizeArray[2]
    let grow() =
      let p0 = a.[a.Count-1]+1
      let b = Array.create p0 true
      for di in a do
        let i0 = p0/di*di
        let i0 = if i0<p0 then i0+di else i0
        let rec loop i =
          if i<b.Length then
            b.[i] <- false
            loop(i+di)
        loop(i0-p0)
      for i=0 to b.Length-1 do
        if b.[i] then a.Add(p0+i)
    fun n ->
      while n >= a.Count do
        grow()
      a.[n]
Over the weekend I'd like to give it a try with Bigloo. However, the
only issue I have: ResizeArray[2], what does it do? Will it create a
resizable list of [0,1,2], or [1,2], or just [2], or [0,0], or [1,1],
or what?
 
I cannot find any information at
http://research.microsoft.com/en-us/um/cambridge/projects/fsharp/manual/fsharp.powerpack/Microsoft.FSharp.Collections.ResizeArray.html
 
regarding ResizeArray[]. Last but not least: is the indexing in
"a.[a.Count-1]+1" zero-based (like in Scheme) or 1-based (like in
Fortran)?

Thanks,
Foerster vom Silberwald
Jon Harrop
2010-02-22 21:36:01 UTC
Permalink
Post by frankenstein
Post by Jon Harrop
Here is the Sieve of Eratosthenes implemented in imperative F# such
  let primes =
    let a = ResizeArray[2]
    let grow() =
      let p0 = a.[a.Count-1]+1
      let b = Array.create p0 true
      for di in a do
        let i0 = p0/di*di
        let i0 = if i0<p0 then i0+di else i0
        let rec loop i =
          if i<b.Length then
            b.[i] <- false
            loop(i+di)
        loop(i0-p0)
      for i=0 to b.Length-1 do
        if b.[i] then a.Add(p0+i)
    fun n ->
      while n >= a.Count do
        grow()
      a.[n]
Over the weekend I'd like to give it a try with Bigloo. However, the
only issue I have: ResizeArray[2] what does it do? It will create a
resizable list of [0,1,2], or [1,2] or just [2] or [0,0] or [1,1] or
what?
Creates an extensible array containing the single element 2.
Post by frankenstein
regarding ResizeArray[]. Last but not least "a.[a.Count-1]+1' is
indexing in F# zero based (like in Scheme) or 1 based (like in
Fortran).
Zero based. That expression gives 1 + the last known prime.
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u
Ertugrul Söylemez
2010-02-22 10:55:30 UTC
Permalink
Post by Jon Harrop
Here is the Sieve of Eratosthenes implemented in imperative F# such
  let primes =
    let a = ResizeArray[2]
    let grow() =
      let p0 = a.[a.Count-1]+1
      let b = Array.create p0 true
      for di in a do
        let i0 = p0/di*di
        let i0 = if i0<p0 then i0+di else i0
        let rec loop i =
          if i<b.Length then
            b.[i] <- false
            loop(i+di)
        loop(i0-p0)
      for i=0 to b.Length-1 do
        if b.[i] then a.Add(p0+i)
    fun n ->
      while n >= a.Count do
        grow()
      a.[n]
This is 2.5x more chars and over 100x faster than the original
Haskell. For example, you can get the 10,000,000th prime in only a few
seconds.
I have no F# compiler here, but I don't believe it. I have just written
a non-growing (i.e. faster) sieve in both C and Haskell and both need a
lot of time just sieving the first 100 million numbers, which don't even
contain the 10 millionth prime number. The C version takes 4.4 seconds,
the Haskell version takes 7.0 seconds. Following are the source codes.

Haskell version:

import Control.Monad
import Control.Monad.ST
import Data.Array.ST
import Data.Array.Unboxed
 
soeST :: Int -> ST s (STUArray s Int Bool)
soeST n = do
    arr <- newArray (0,n) True
    writeArray arr 0 False
    writeArray arr 1 False
    forM_ [2..n] $ \i -> do
        b <- readArray arr i
        when b $ forM_ [2*i, 3*i .. n] (\j -> writeArray arr j False)
    return arr
 
soe :: Int -> [Int]
soe n = map fst (filter snd (assocs (runSTUArray (soeST n))))
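 
A minimal driver for the above (my sketch, assuming GHC with -O2)
mirrors the C version's output by printing the count of primes below
the bound:
 
-- Count the primes below 100 million, as the C version below does.
main :: IO ()
main = print (length (soe 100000000))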

C version:

#include <stdint.h>
#include <stdio.h>
#include <string.h>
 
#define SIEVE (100000000)
#define SIEVEARRAY (SIEVE / 32 + 1)
 
static inline void setBit(uint32_t *p, uint64_t i) {
    p[i / 32] |= 1u << (i % 32);
}
 
static inline int getBit(uint32_t *p, uint64_t i) {
    return p[i / 32] & (1u << (i % 32));
}
 
int main() {
    static uint32_t sieve[SIEVEARRAY];
    uint32_t primes = 0;
 
    memset(sieve, 0, SIEVEARRAY * sizeof(uint32_t));
    for (uint64_t i = 2; i < SIEVE; i++) {
        if (getBit(sieve, i)) continue;
        primes++;
        for (uint64_t j = 2*i; j < SIEVE; j += i)
            setBit(sieve, j);
    }
    printf("%u\n", primes);
    return 0;
}


Greets
Ertugrul
--
nightmare = unsafePerformIO (getWrongWife >>= sex)
http://blog.ertes.de/
Jon Harrop
2010-02-22 21:36:42 UTC
Permalink
Post by Ertugrul Söylemez
Post by Jon Harrop
Here is the Sieve of Eratosthenes implemented in imperative F# such
  let primes =
    let a = ResizeArray[2]
    let grow() =
      let p0 = a.[a.Count-1]+1
      let b = Array.create p0 true
      for di in a do
        let i0 = p0/di*di
        let i0 = if i0<p0 then i0+di else i0
        let rec loop i =
          if i<b.Length then
            b.[i] <- false
            loop(i+di)
        loop(i0-p0)
      for i=0 to b.Length-1 do
        if b.[i] then a.Add(p0+i)
    fun n ->
      while n >= a.Count do
        grow()
      a.[n]
This is 2.5x more chars and over 100x faster than the original
Haskell. For example, you can get the 10,000,000th prime in only a few
seconds.
I have no F# compiler here, but I don't believe it. I have just written
a non-growing (i.e. faster)
It isn't significantly faster.
Post by Ertugrul Söylemez
sieve in both C and Haskell and both need a
lot of time just sieving the first 100 million numbers, which don't even
contain the 10 millionth prime number.
Your C code takes only a few seconds to sieve 200M numbers and find the 10
millionth prime here.
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u
Ertugrul Söylemez
2010-02-22 23:02:36 UTC
Permalink
Post by Jon Harrop
Post by Ertugrul Söylemez
Post by Jon Harrop
Here is the Sieve of Eratosthenes implemented in imperative F# such
that the number of primes sieved roughly doubles with each
  let primes =
    let a = ResizeArray[2]
    let grow() =
      let p0 = a.[a.Count-1]+1
      let b = Array.create p0 true
      for di in a do
        let i0 = p0/di*di
        let i0 = if i0<p0 then i0+di else i0
        let rec loop i =
          if i<b.Length then
            b.[i] <- false
            loop(i+di)
        loop(i0-p0)
      for i=0 to b.Length-1 do
        if b.[i] then a.Add(p0+i)
    fun n ->
      while n >= a.Count do
        grow()
      a.[n]
This is 2.5x more chars and over 100x faster than the original
Haskell. For example, you can get the 10,000,000th prime in only a
few seconds.
I have no F# compiler here, but I don't believe it. I have just
written a non-growing (i.e. faster)
It isn't significantly faster.
Sure, but that's not the point. I'm just saying that a nongrowing sieve
is not slower, so the comparison is justified.
Post by Jon Harrop
Post by Ertugrul Söylemez
sieve in both C and Haskell and both need a lot of time just sieving
the first 100 million numbers, which don't even contain the 10
millionth prime number.
Your C code takes only a few seconds to sieve 200M numbers and find
the 10 millionth prime here.
Then the Haskell version should behave similarly. My point is that it's
easy to write the imperative-style, fast sieve of Eratosthenes in
Haskell, and if your F# version is really that fast, then you've got a
much faster computer than I have and C and Haskell should behave
similarly.

To sieve the first 200M numbers the C version takes 9.4 seconds and the
Haskell version takes 14.7 seconds. Note that I'm compiling with GHC
6.10.4, which is a bit outdated. As soon as GHC 6.12 is in my
distribution, I'll test again.


Greets
Ertugrul
--
nightmare = unsafePerformIO (getWrongWife >>= sex)
http://blog.ertes.de/
Jon Harrop
2010-02-23 03:07:11 UTC
Permalink
Post by Ertugrul Söylemez
Post by Jon Harrop
Post by Ertugrul Söylemez
I have no F# compiler here, but I don't believe it. I have just
written a non-growing (i.e. faster)
It isn't significantly faster.
Sure, but that's not the point. I'm just saying that a nongrowing sieve
is not slower, so the comparison is justified.
There is no evidence that the nongrowing sieves are not slower.
Post by Ertugrul Söylemez
Post by Jon Harrop
Post by Ertugrul Söylemez
sieve in both C and Haskell and both need a lot of time just sieving
the first 100 million numbers, which don't even contain the 10
millionth prime number.
Your C code takes only a few seconds to sieve 200M numbers and find
the 10 millionth prime here.
Then the Haskell version should behave similarly.
With that hardcoded limit, your Haskell takes 36.4s and your C takes 12.7s.
Post by Ertugrul Söylemez
My point is that it's easy to write the imperative-style, fast sieve of
Eratosthenes in Haskell,
Given that your Haskell solution introduced a hard-coded limit that my F#
did not have and your Haskell runs several times slower than everything
else, how can you possibly justify your claim?
Post by Ertugrul Söylemez
To sieve the first 200M numbers the C version takes 9.4 seconds
In your last post you said the C programs took "a lot of time" but now
you're quoting times under 10s. Is that what you meant by "a lot of time"?
Post by Ertugrul Söylemez
and the
Haskell version takes 14.7 seconds. Note that I'm compiling with GHC
6.10.4, which is a bit outdated. As soon as GHC 6.12 is in my
distribution, I'll test again.
I managed to get GHC 6.12 working. Even if I statically encode the *optimal*
sieve size (a wildly unfair advantage) into your incomplete Haskell
implementation it is still 6x slower than my F#:

F#: 5.9s (parallel)
F#: 15.9s (serial)
C++: 7.9s
C: 10.8s (with optimal limit)
Haskell: 33.4s (with optimal limit)

Here is my C++ with the extensible sieve and no hard-coded limit:

#include <iostream>
#include <vector>
#include <algorithm>
#include <cstring>
#include <cstdlib>
 
std::vector<int> a;
 
void grow() {
  const int p0 = a[a.size()-1]+1;
  std::vector<bool> b(p0);
  std::fill(b.begin(), b.end(), true);
  for (int j=0; j<(int)a.size(); ++j) {
    const int di = a[j];
    const int i0 = p0/di*di;
    for (int i=(i0<p0 ? i0+di-p0 : i0-p0); i<(int)b.size(); i+=di)
      b[i] = false;
  }
  for (int i=0; i<(int)b.size(); ++i)
    if (b[i]) a.push_back(p0+i);
}
 
int primes(int n) {
  while (n >= (int)a.size())
    grow();
  return a[n];
}
 
int main(int argc, char *argv[]) {
  if (argc != 2) {
    std::cerr << "Usage: sieve <n>\n";
    return 1;
  }
  int n = atoi(argv[1]);
  a.push_back(2);
  std::cout << "Prime " << n << " is " << primes(n) << std::endl;
  return 0;
}
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u
Ertugrul Söylemez
2010-02-23 03:14:32 UTC
Permalink
Post by Jon Harrop
Post by Ertugrul Söylemez
Post by Jon Harrop
Post by Ertugrul Söylemez
sieve in both C and Haskell and both need a lot of time just
sieving the first 100 million numbers, which don't even contain
the 10 millionth prime number.
Your C code takes only a few seconds to sieve 200M numbers and find
the 10 millionth prime here.
Then the Haskell version should behave similarly.
With that hardcoded limit, your Haskell takes 36.4s and your C takes 12.7s.
I wonder why the Haskell code is always much slower as soon as you are
the one compiling and running it. I'm using an older GHC version and
yet the speed difference factor between C and Haskell is much smaller
here. Also note that my code is not parallelized, so claiming that your
F# version is six times faster is not fair. It's only two times faster,
comparing unparallelized to unparallelized.
Post by Jon Harrop
Post by Ertugrul Söylemez
To sieve the first 200M numbers the C version takes 9.4 seconds
In your last post you said the C programs took "a lot of time" but now
you're quoting times under 10s. Is that what you meant by "a lot of time"?
You said "a few seconds" to find the 10 millionth prime number. I
reached "a few seconds" already to sieve 100 million numbers, which
contain only around the first five million primes. Hence "a lot of
time".


Greets
Ertugrul
--
nightmare = unsafePerformIO (getWrongWife >>= sex)
http://blog.ertes.de/
Jon Harrop
2010-02-23 04:38:43 UTC
Permalink
Post by Ertugrul Söylemez
Post by Jon Harrop
Post by Ertugrul Söylemez
Post by Jon Harrop
Post by Ertugrul Söylemez
sieve in both C and Haskell and both need a lot of time just
sieving the first 100 million numbers, which don't even contain
the 10 millionth prime number.
Your C code takes only a few seconds to sieve 200M numbers and find
the 10 millionth prime here.
Then the Haskell version should behave similarly.
With that hardcoded limit, your Haskell takes 36.4s and your C takes 12.7s.
I wonder why Haskell code is always much slower as soon as you
compile/use it. I'm using an older GHC version and yet the speed
difference factor between C and Haskell is much smaller here.
I've discovered why:

ghc 6.10: 16.5s
ghc 6.12: 36.4s

So the new version of GHC is generating code over 2x slower than before. How
unpredictable...
Post by Ertugrul Söylemez
Also note
that my code is not parallelized, so claiming that your F# version is
six times faster is not fair. It's only two times faster, namely
comparing unparallelized to unparallelized.
Let's make it "fair" by parallelizing the Haskell rather than crippling my
F#. If you cannot parallelize it, let's keep it fair by comparing the
fastest implementations.
Post by Ertugrul Söylemez
Post by Jon Harrop
Post by Ertugrul Söylemez
To sieve the first 200M numbers the C version takes 9.4 seconds
In your last post you said the C programs took "a lot of time" but now
you're quoting times under 10s. Is that what you meant by "a lot of time"?
You said "a few seconds" to find the 10 millionth prime number. I
reached "a few seconds" already to sieve 100 million numbers, which
contain only around the first five million primes. Hence "a lot of
time".
Ok. When you said "a lot of time" I thought you meant minutes or hours
rather than 10 seconds.
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u
Keith H Duggar
2010-02-23 06:12:55 UTC
Permalink
Post by Jon Harrop
Post by Ertugrul Söylemez
Post by Jon Harrop
Post by Ertugrul Söylemez
I have no F# compiler here, but I don't believe it. I have just
written a non-growing (i.e. faster)
It isn't significantly faster.
Sure, but that's not the point. I'm just saying that a nongrowing sieve
is not slower, so the comparison is justified.
There is no evidence that the nongrowing sieves are not slower.
Post by Ertugrul Söylemez
Post by Jon Harrop
Post by Ertugrul Söylemez
sieve in both C and Haskell and both need a lot of time just sieving
the first 100 million numbers, which don't even contain the 10
millionth prime number.
Your C code takes only a few seconds to sieve 200M numbers and find
the 10 millionth prime here.
Then the Haskell version should behave similarly.
With that hardcoded limit, your Haskell takes 36.4s and your C takes 12.7s.
Post by Ertugrul Söylemez
My point is that it's easy to write the imperative-style, fast sieve of
Eratosthenes in Haskell,
Given that your Haskell solution introduced a hard-coded limit that my F#
did not have and your Haskell runs several times slower than everything
else, how can you possibly justify your claim?
Post by Ertugrul Söylemez
To sieve the first 200M numbers the C version takes 9.4 seconds
In your last post you said the C programs took "a lot of time" but now
you're quoting times under 10s. Is that what you meant by "a lot of time"?
Post by Ertugrul Söylemez
and the
Haskell version takes 14.7 seconds. Note that I'm compiling with GHC
6.10.4, which is a bit outdated. As soon as GHC 6.12 is in my
distribution, I'll test again.
I managed to get GHC 6.12 working. Even if I statically encode the *optimal*
sieve size (a wildly unfair advantage) into your incomplete Haskell
F#: 5.9s (parallel)
F#: 15.9s (serial)
C++: 7.9s
C: 10.8s (with optimal limit)
Haskell: 33.4s (with optimal limit)
Which is twice as slow (on my machine) as my equivalent C++
version (code below) for the 10-millionth prime:

g++ -O3 harrop.cpp -o harrop
g++ -O3 duggar.cpp -o duggar

time ./harrop 10000000
Prime 10000000 is 179424691

real 0m20.05s
user 0m19.76s
sys 0m0.25s

time ./duggar 10000000
179424691

real 0m8.12s
user 0m7.95s
sys 0m0.14s

time ./harrop 50000000
Prime 50000000 is 982451707

real 3m27.12s
user 3m11.42s
sys 0m2.14s

time ./duggar 50000000
982451707

real 0m45.55s
user 0m44.34s
sys 0m0.66s

and 4.6 times slower for the 50-millionth prime. I think the main
reason is the reduced memory consumption and improved memory locality
in my version, due to scanning a (more) minimal gap size when growing.
It can also be further optimized (at least in C++) should one need:

#include <cstdlib>
#include <iostream>
#include <vector>
 
using std::vector ;
 
class Primes
{
    unsigned _idx ;
    static unsigned g ;
    static vector<unsigned> _primes ;
 
    void grow ( ) {
        unsigned g0 = g*g - g ; ++g ;
        unsigned g1 = g*g - g ;
        sieve(g0,g1) ;
    }
 
    void sieve ( unsigned g0, unsigned g1 ) {
        vector<bool> cross(g1-g0,0) ;
        vector<unsigned>::iterator pi = _primes.begin() ;
        vector<unsigned>::iterator pe = _primes.end() ;
        for ( ; pi != pe ; ++pi ) {
            unsigned p = *pi ;
            if ( p*p >= g1 ) break ;
            unsigned m = (g0+p-1)/p*p ;
            for ( ; m < g1 ; m += p ) cross[m-g0] = true ;
        }
        for ( unsigned ci = 0 ; ci < cross.size() ; ++ci ) {
            if ( not cross[ci] ) _primes.push_back(g0+ci) ;
        }
    }
 
public :
    Primes ( ) : _idx(0) { }
    unsigned operator * ( ) { return _primes[_idx] ; }
    Primes & operator ++ ( ) {
        ++_idx ;
        if ( _idx >= _primes.size() ) grow() ;
        return *this ;
    }
} ;
 
unsigned Primes::g = 2 ;
std::vector<unsigned> Primes::_primes(1,2) ;
 
unsigned nthPrime ( unsigned n ) {
    Primes primes ;
    for ( unsigned i = 0 ; i < n ; ++i ) ++primes ;
    return *primes ;
}
 
int main ( int argc, char * argv[] )
{
    unsigned const i = std::atoi(argv[1]) ;
    std::cout << nthPrime(i) << '\n' ;
    return 0 ;
}

KHD
Jon Harrop
2010-02-24 00:30:19 UTC
Permalink
Post by Keith H Duggar
Post by Jon Harrop
F#: 5.9s (parallel)
F#: 15.9s (serial)
C++: 7.9s
C: 10.8s (with optimal limit)
Haskell: 33.4s (with optimal limit)
Which is twice as slow (on my machine) as my equivalent C++
version (code below) for the 10-millionth prime
Your C++ is 4x faster than my C++ on this machine, taking only 2s.
Post by Keith H Duggar
and 4.6 times slower for the 50-millionth prime. I think the main
reason is the reduced memory consumption and improved memory locality
in my version, due to scanning a (more) minimal gap size when growing.
I'll have to check it out. Thanks!
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u
Jon Harrop
2010-02-23 01:07:51 UTC
Permalink
Post by Jon Harrop
Here is the Sieve of Eratosthenes implemented in imperative F# such that
  let primes =
    let a = ResizeArray[2]
    let grow() =
      let p0 = a.[a.Count-1]+1
      let b = Array.create p0 true
      for di in a do
        let i0 = p0/di*di
        let i0 = if i0<p0 then i0+di else i0
        let rec loop i =
          if i<b.Length then
            b.[i] <- false
            loop(i+di)
        loop(i0-p0)
      for i=0 to b.Length-1 do
        if b.[i] then a.Add(p0+i)
    fun n ->
      while n >= a.Count do
        grow()
      a.[n]
This is 2.5x more chars and over 100x faster than the original Haskell.
For example, you can get the 10,000,000th prime in only a few seconds.
Parallelizing the loop over "a" requires tiny changes and provides a ~3x
performance improvement on my 8 core:

  let primes =
    let a = ResizeArray[2]
    let grow() =
      let p0 = a.[a.Count-1]+1
      let b = Array.create p0 true
      System.Threading.Tasks.Parallel.For(0, a.Count, fun j ->
        let di = a.[j]
        let rec loop i =
          if i<b.Length then
            b.[i] <- false
            loop(i+di)
        let i0 = p0/di*di
        loop(if i0<p0 then i0+di-p0 else i0-p0)) |> ignore
      for i=0 to b.Length-1 do
        if b.[i] then a.Add(p0+i)
    fun n ->
      while n >= a.Count do
        grow()
      a.[n]
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u
frankenstein
2010-02-23 07:07:11 UTC
Permalink
Post by Jon Harrop
Parallelizing the loop over "a" requires tiny changes and provides a ~3x
  let primes =
    let a = ResizeArray[2]
    let grow() =
      let p0 = a.[a.Count-1]+1
      let b = Array.create p0 true
      System.Threading.Tasks.Parallel.For(0, a.Count, fun j ->
        let di = a.[j]
        let rec loop i =
          if i<b.Length then
            b.[i] <- false
            loop(i+di)
        let i0 = p0/di*di
        loop(if i0<p0 then i0+di-p0 else i0-p0)) |> ignore
      for i=0 to b.Length-1 do
        if b.[i] then a.Add(p0+i)
    fun n ->
      while n >= a.Count do
        grow()
      a.[n]
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.http://www.ffconsultancy.com/?u
I am not interested in micro-benchmark times because they are fairly
useless in real life; what matters is how a language handles memory
and large data sets.

However, general programming concepts are still important.

It seems to be the case that F# is way ahead of, let's say, Bigloo when
it comes to parallelizing, since it comes out of the box as far as I
can tell from your code. I would have no idea how to parallelize the
Bigloo version using POSIX.

But F#, like any other Lisp and Scheme out there: no one can tell me F#
leads to user-friendly code which can easily be read by others.

No one knows the type of this by just reading the code:

==
i0 = p0/di*di
==

Is it an int or a float? Not to mention how F# will parse it:
i0 = (p0/di)*di or p0/(di*di)?

The context suggesting that i0 should be an integer is no excuse here.
It is just funny: a strict type system which only the compiler knows.

This is one of the reasons why I would never use Common Lisp or any
other Scheme for writing numerical code. Granted, Bigloo only has basic
types on top of user-defined class types. In Bigloo I would write:

==
(let ((p0::int (* (/ p0 di) di))) ...)
==

Clean paved the way for how to document your code: every example in
the Clean book is annotated by its respective types. This is good
style.
frankenstein
2010-02-23 07:13:23 UTC
Permalink
Post by frankenstein
==
(let ((p0::int (* (/ p0 di) di)
==
Clean paved the way for how to document your code: every example in
the Clean book is annotated by its respective types. This is good
style.
I think it should be:

==
(let ((p0::int (inexact->exact (* (/ p0 di) di)))) ...)
==

I still haven't figured out what the F# line actually does.
Jon Harrop
2010-02-23 14:46:37 UTC
Permalink
Post by frankenstein
It seems to be the case that F# is way ahead of lets say Bigloo when
it comes to parallelizing since it comes out of the box as far as I
can tell from your code. I would have no idea how to parallelize the
Bigloo version by using POSIX.
F# is way ahead of everything open source when it comes to parallelism.
Post by frankenstein
But F#, like any other Lisp and Scheme out there: no one can tell me F#
leads to user-friendly code which can easily be read by others.
==
i0 = p0/di*di
==
An F# developer would hover the mouse over those identifiers to see that
they correspond to variables with the inferred type "int".
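 
Concretely, both operators are integer operations and associate to the
left, so p0/di*di parses as (p0/di)*di and rounds p0 down to the
nearest multiple of di. The same idiom in Haskell (a sketch of mine;
div is integer division):
 
-- p0 `div` di * di is the greatest multiple of di not exceeding p0;
-- like F#'s p0/di*di, it parses as (p0 `div` di) * di.
roundDown :: Int -> Int -> Int
roundDown p0 di = p0 `div` di * di
 
-- e.g. roundDown 10 3 == 9, roundDown 12 3 == 12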
Post by frankenstein
Clean paved the way for how to document your code: every example in
the Clean book is annotated by its respective types. This is good
style.
I prefer automatic inference with throwback of inferred data in the IDE.
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u
frankenstein
2010-02-23 16:55:38 UTC
Permalink
Post by Jon Harrop
Post by frankenstein
Clean paved the way for how to document your code: every example in
the Clean book is annotated by its respective types. This is good
style.
I prefer automatic inference with throwback of inferred data in the IDE.
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.http://www.ffconsultancy.com/?u
Hi Jon

It took me a long time to figure out how to avoid too much consing.
push_back in C++ seems to be VERY, VERY efficient, and so, apparently,
is the caching in F# (if one wants to believe your timings)! I do not
think that any Common Lisp out there would stand a chance against
C++'s push_back.
 
My Bigloo version (a mix of your F# and the C++ version; code attached)
for 1e7 (10 million) follows. The version might still be open to some
more improvements.


==
bigloo -Obench sieve.scm

time ./a.out
==

delivers:
==
10466504
prime 179424691
#<output_port:stdout>

real 0m22.969s
user 0m20.678s
sys 0m1.068s
==


The C version you posted:

==
g++ -O3 sieve.c

time ./a.out
==


delivers:

==
a size 10466504
Prime 10000000 is 179424691 1

real 0m14.299s
user 0m13.310s
sys 0m0.237s
==


==
(module sieve)
 
 
(define-inline (acc::vector i::bint p0::bint b::vector a::pair)
   (cond ((=fx i (vector-length b))
          (list->vector (reverse a)))
         ((= (vector-ref b i) 1)
          (acc (+fx i 1) p0 b (append! (list (+fx i p0)) a)))
         (else
          (acc (+fx i 1) p0 b a))))
 
 
(define (primes::obj n::bint)
   (let ([a::vector (vector 2)])
      (define (grow a)
         (let* ([p0::bint (+fx (vector-ref a
                                           (-fx (vector-length a) 1)) 1)]
                [b::vector (make-vector p0 1)]
                [dim::bint p0]
                [dim-a::bint (vector-length a)])
            (do ([jj 0 (+fx jj 1)])
                ([=fx jj dim-a])
                (let* ([di::bint (vector-ref a jj)]
                       [i0::bint (*fx (inexact->exact (/ p0 di)) di)]
                       [i0::bint (if (<fx i0 p0)
                                     (-fx (+fx i0 di) p0)
                                     (-fx i0 p0))])
                   (do ([j i0 (+fx j di)])
                       ([>fx j (-fx dim 1)])
                       (vector-set! b j 0))))
            (acc 0 p0 b (vector->list a))))
      (let while ((a a))
         (cond ([>= n (vector-length a)]
                (print "Doing n " (vector-length a))
                (while (grow a)))
               (else
                (print (vector-length a))
                (print "prime " (vector-ref a n)))))))
 
 
(print (primes 10000000))
==
frankenstein
2010-02-23 16:57:55 UTC
Permalink
Post by frankenstein
==
g++ -O3 sieve.c
time ./a.out
==
typo: the C++ version:

I ment:

g++ -O3 sieve.cpp
frankenstein
2010-02-23 17:07:57 UTC
Permalink
Post by frankenstein
g++ -O3 sieve.cpp
To put the figures into the right light:

my machine: Mac OS X, MacBook laptop, dual core 2GHz processor with
4GB RAM and gnu gcc compiler (both for Bigloo and C++)

I am not so interested in the raw-metal timings because they are
useless. However, after all these years consing is still a crucial
problem in every Scheme and Lisp implementation out there.
frankenstein
2010-02-23 17:24:07 UTC
Permalink
Post by frankenstein
==
(module sieve)
(define-inline (acc::vector i::bint p0::bint b::vector a::pair)
   (cond ((=fx i (vector-length b))
          (list->vector (reverse a)))
         ((= (vector-ref b i) 1)
          (acc (+fx i 1) p0 b (append! (list (+fx i p0)) a)))
         (else
          (acc (+fx i 1) p0 b a))))
(define (primes::obj n::bint)
   (let ([a::vector (vector 2)])
      (define (grow a)
         (let* ([p0::bint (+fx (vector-ref a
                                           (-fx (vector-length a) 1)) 1)]
                [b::vector (make-vector p0 1)]
                [dim::bint p0]
                [dim-a::bint (vector-length a)])
            (do ([jj 0 (+fx jj 1)])
                ([=fx jj dim-a])
                (let* ([di::bint (vector-ref a jj)]
                       (i0::bint (*fx (inexact->exact (/ p0 di)) di))
                       [i0::bint (if (<fx i0 p0)
                                     (-fx (+fx i0 di) p0)
                                     (-fx i0 p0))])
                   (do ([j i0 (+fx j di)])
                       ([>fx j (-fx dim 1)])
                       (vector-set! b j 0))))
            (acc 0 p0 b (vector->list a))))
      (let while ((a a))
         (cond ([>= n (vector-length a)]
                (print "Doing n " (vector-length a))
                (while (grow a)))
               (else
                (print (vector-length a))
                (print "prime " (vector-ref a n)))))))
(print (primes 10000000))
==
Additional note: Using Bigloo's vector-append would push the timings
toward infinity. It seems vector-append has to copy the whole vector
every time a new element is appended:
 
==
 
(define-inline (acc::vector i::bint p0::bint b::vector a::vector)
   (cond ((=fx i (vector-length b))
          a)
         ((= (vector-ref b i) 1)
          (acc (+fx i 1) p0 b (vector-append a (vector (+fx i p0)))))
         (else
          (acc (+fx i 1) p0 b a))))
==
 
 
The secret lies here: a) convert the passed vector into a list, b)
accumulate the list using append, c) upon exit convert the list back
into a vector. The main program (grow) would suffer if you used lists
instead of vectors, because list access time is not constant and you
need a vector here.
 
==
(define-inline (acc::vector i::bint p0::bint b::vector a::pair)
   (cond ((=fx i (vector-length b))
          (list->vector (reverse a)))
         ((= (vector-ref b i) 1)
          (acc (+fx i 1) p0 b (append! (list (+fx i p0)) a)))
         (else
          (acc (+fx i 1) p0 b a))))
==
Jussi Piitulainen
2010-02-23 18:08:52 UTC
Permalink
... (append! (list (+fx i p0)) a) ...
You could use cons here if cons wasn't such a performance killer.
Right?
frankenstein
2010-02-23 18:21:11 UTC
Permalink
Post by Jussi Piitulainen
           ... (append! (list (+fx i p0)) a) ...
You could use cons here if cons wasn't such a performance killer.
Right?
The strange thing: it doesn't make any difference whether I use cons,
append, or append! here (the timings are still fast and more or less
the same: 23 seconds).
 
At least for Bigloo. I haven't measured the memory cost of using cons,
though.

Again, a comment on vector-append:
 
Even when appending the element at the beginning of the vector:
 
(vector-append (vector value) a)
 
and reversing it upon exit, the timings do not come down; they still
look clearly O(n^2) or O(n^3). Not sure why vector-append in Bigloo is
such a dog.
Jussi Piitulainen
2010-02-23 19:24:06 UTC
Permalink
Post by frankenstein
Post by Jussi Piitulainen
           ... (append! (list (+fx i p0)) a) ...
You could use cons here if cons wasn't such a performance killer.
Right?
The strange thing: it wouldn't make any difference (the timings are
still fast and more or less the same: 23 seconds): using cons,
append, or append! here.
(cons x a) simply allocates the cell you want.

(append (list x) a) allocates at least two cells, at least one of
which can be collected immediately: append copies all but the last
argument.

(append! (list x) a) allocates one cell and then mutates it.

(append! a (list x)) would also only allocate one cell, but it needs
to mutate the last cell of a, which requires traversal of whole a.
Post by frankenstein
At least for Bigloo. I haven't observed the cost of memory whilst
using cons though.
Are you aware that (list x) is equivalent to (cons x '())?
Post by frankenstein
(vector-append (vector value) a)
and reversing it upon exit will not put the timings down and they
are still highly n^2 or n^3. Not sure why append-vector in Bigloo is
such a dog.
It allocates a whole new vector and stores the contents of both old
vectors to it.

Least work and least allocation here is done by cons, though I suppose
when lispers talk about consing they may refer to allocation of memory
in general.
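 
The same trade-off is easy to demonstrate outside Scheme. A small
Haskell sketch (mine, using the 'vector' package) of why accumulating
a list and converting once beats repeated vector appends:
 
import qualified Data.Vector as V
 
-- Quadratic overall: snoc copies the whole vector on every call,
-- much like repeated vector-append.
buildSlow :: Int -> V.Vector Int
buildSlow n = foldl V.snoc V.empty [1..n]
 
-- Linear overall: accumulate a cheap list, convert once at the end.
buildFast :: Int -> V.Vector Int
buildFast n = V.fromList [1..n]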
frankenstein
2010-02-23 19:32:21 UTC
Permalink
Post by Jussi Piitulainen
Post by frankenstein
Post by Jussi Piitulainen
           ... (append! (list (+fx i p0)) a) ...
You could use cons here if cons wasn't such a performance killer.
Right?
The strange thing: it wouldn't make any difference (the timings are
still fast and more or less the same: 23 seconds): using cons,
append, or append! here.
(cons x a) simply allocates the cell you want.
(append (list x) a) allocates at least two cells, at least one of
which can be collected immediately: append copies all but the last
argument.
(append! (list x) a) allocates one cell and then mutates it.
(append! a (list x)) would also only allocate one cell, but it needs
to mutate the last cell of a, which requires traversal of whole a.
Post by frankenstein
At least for Bigloo. I haven't observed the cost of memory whilst
using cons though.
Are you aware that (list x) is equivalent to (cons x '())?
Post by frankenstein
(vector-append (vector value) a)
and reversing it upon exit will not put the timings down and they
are still highly n^2 or n^3. Not sure why append-vector in Bigloo is
such a dog.
It allocates a whole new vector and stores the contents of both old
vectors to it.
Least work and least allocation here is done by cons, though I suppose
when lispers talk about consing they may refer to allocation of memory
in general.
Thanks for the lesson Jussi. Believe it or not over the years I have
been always admiring your deep knowledge of the internals of the
Scheme language and how things work.

I haven't had a deeper look into the other C++ version yet, to learn
whether some more of its tricks could be applied to the Scheme version.
However, I am not sure why the guy (no pun intended) wrote a C++
version which shows all the signs of bad C++ programming, only to show
off, even though it might be very fast. His version reminds me of old
Fortran code with lots of GOTOs. I liked Jon's version for that matter:
it is readable.
frankenstein
2010-02-25 16:20:56 UTC
Permalink
Post by frankenstein
My Bigloo (which is a mix of your F# and the C++ version) version
(code attached) for 1e7 (10 million): The version might be up for some
more improvements.
==
bigloo -Obench sieve.scm
time ./a.out
==
==
10466504
prime 179424691
#<output_port:stdout>
real    0m22.969s
user    0m20.678s
sys     0m1.068s
==
==
g++ -O3 sieve.c
time ./a.out
==
==
a size 10466504
Prime 10000000 is 179424691 1
real    0m14.299s
user    0m13.310s
sys     0m0.237s
==
Update: Using strict SRFI-4 vectors (s8 and u32, respectively) results
in timings nearly on par with Jon's C++ version. The code structure is
still the same:


$ bigloo -Obench sieve.scm
$ time a.out

==
10466504
prime 179424691
#<output_port:stdout>

real 0m17.240s
user 0m15.086s
sys 0m0.640s
==

==
(module Jon)
 
(define (primes-u32::obj n::int)
   (let ([a::u32vector (u32vector 2)])
      (define (grow a)
         (let* ([p0::int (+fx (u32vector-ref a
                                             (-fx (u32vector-length a) 1)) 1)]
                [b::s8vector (make-s8vector p0 1)]
                [dim::int p0]
                [dim-a::int (u32vector-length a)])
            (do ([jj 0 (+fx jj 1)])
                ([=fx jj dim-a])
                (let* ([di::int (u32vector-ref a jj)]
                       [i0::int (*fx (/fx p0 di) di)])
                   (do ([j (if (<fx i0 p0)
                               (-fx (+fx i0 di) p0)
                               (-fx i0 p0))
                          (+fx j di)])
                       ([>fx j (-fx dim 1)])
                       (s8vector-set! b j 0))))
            (acc-u32 0 p0 b (u32vector->list a))))
      (let while ((a a))
         (cond ([>= n (u32vector-length a)]
                (print "Doing n " (u32vector-length a))
                (while (grow a)))
               (else
                (print (u32vector-length a))
                (print "prime " (u32vector-ref a n)))))))
 
 
(define-inline (acc-u32::u32vector i::bint p0::bint b::s8vector a::pair)
   (cond ((=fx i (s8vector-length b))
          (list->u32vector (reverse a)))
         ((=fx (s8vector-ref b i) 1)
          (acc-u32 (+fx i 1) p0 b (cons (+fx i p0) a)))
         (else
          (acc-u32 (+fx i 1) p0 b a))))
 
(print (primes-u32 10000000))
==
toby
2010-02-24 01:03:30 UTC
Permalink
Post by Jon Harrop
Post by frankenstein
It seems to be the case that F# is way ahead of lets say Bigloo when
it comes to parallelizing since it comes out of the box as far as I
can tell from your code. I would have no idea how to parallelize the
Bigloo version by using POSIX.
F# is way ahead of everything open source when it comes to parallelism.
How about Erlang?

System.Threading.Tasks.Parallel.For(0, a.Count, fun j ->
^^
looks like something only a Microsoftie could love.
Jon Harrop
2010-02-24 12:12:04 UTC
Permalink
Post by toby
Post by Jon Harrop
Post by frankenstein
It seems to be the case that F# is way ahead of lets say Bigloo when
it comes to parallelizing since it comes out of the box as far as I
can tell from your code. I would have no idea how to parallelize the
Bigloo version by using POSIX.
F# is way ahead of everything open source when it comes to parallelism.
How about Erlang?
Erlang is great for concurrency but useless for parallelism.
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u
Paul Rubin
2010-02-09 08:24:55 UTC
Permalink
Post by raould
the term generic seems to be over-used, and i'm not sure i grok all
the uses and/or how they overlap. but anyway, i'm attempting to learn
more about the subject. some things i've read [1] make it sound to me
like c++ is the most complete language when it comes to generic
approaches.
There was a study by Garcia, Jarvi et al that Keith mentioned. A
working url is here:

http://www.osl.iu.edu/publications/prints/2005/garcia05:_extended_comparing05.pdf

It compared a bunch of different languages and Haskell basically came
out on top, except in one area, which was that it didn't support
associated types as well as could be done in C++ through template
programming. In response to the paper, SPJ et al added type families to
GHC:

http://haskell.org/haskellwiki/Simonpj/Talk:FunWithTypeFuns

and that took care of the weak spot. It mostly seems to me that we hear
more about generics in connection with C++ than with functional languages,
because the C++ community uses the term a lot more, and puts a lot of
energy into the template hacking needed to make it happen in C++. In
functional languages like ML or Haskell, it's just called "polymorphism"
and taken for granted. The Haskell community actually uses the term
"generic" to mean something different, i.e. the automatic derivation of
code from algebraic types, like the "deriving" mechanism or in libraries
like Scrap Your Boilerplate.
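 
For concreteness, here is a minimal sketch (mine, not from the paper)
of an associated type in GHC, the feature added in response to the
study: each collection names its own element type.
 
{-# LANGUAGE TypeFamilies #-}
 
-- An associated type: the instance decides what Elem c is.
class Collection c where
  type Elem c
  empty  :: c
  insert :: Elem c -> c -> c
 
instance Collection [a] where
  type Elem [a] = a
  empty  = []
  insert = (:)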

I only spent about a minute looking at the thesis about Scala that you
cited, but it looks interesting and I will put it on my reading list.
The thesis advisor, Jaakko Jarvi, is one of the co-authors of the
comparison paper that I linked to.
raould
2010-02-09 18:50:15 UTC
Permalink
hi,
Post by Paul Rubin
programming. In response to the paper, SPJ et al added type families to
http://haskell.org/haskellwiki/Simonpj/Talk:FunWithTypeFuns
thanks!
raould
2010-02-10 18:13:32 UTC
Permalink
Post by raould
programming.  In response to the paper, SPJ et al added type families to
 http://haskell.org/haskellwiki/Simonpj/Talk:FunWithTypeFuns
thanks!
although i have to say that originally i was hoping not to have to use
ghc specificness ;-) oh well.
Andreas Rossberg
2010-02-09 19:56:04 UTC
Permalink
Unfortunately, I can't remember the exact quote, nor who said it, but:
people always use "generic" to refer to the kind of polymorphism their
favorite language does not have yet (or acquired late).
Dan Doel
2010-02-09 20:15:24 UTC
Permalink
Post by raould
the term generic seems to be over-used, and i'm not sure i grok all
the uses and/or how they overlap. but anyway, i'm attempting to learn
more about the subject. some things i've read [1] make it sound to me
like c++ is the most complete language when it comes to generic
approaches. of course that makes me sad. (maybe d does well, i dunno.)
supposedly scala fares better than most, but still falls at some
point. any thoughts? i'm looking to learn about such things in a
concrete real world language rather than e.g. an extension to ghc.
Perhaps this is not relevant to what you're interested in, but "generic
programming" has a relatively different (than in C++/Java/C#...) meaning in
the Haskell community (at least, and I suspect elsewhere as well), probably
because parametric polymorphism is so fundamental and almost trivial in a
Haskell (or ML, or...)-like language.

Anyhow, generic programming in Haskell is typically meant to refer to
functions that are able to work over multiple different structures, by
somehow using a description of how the structure is built from its
constituent parts. One basic way of doing this is to treat all data as fixed
points of functors, and so you get functions like:

cata :: forall f r. Functor f => (f r -> r) -> Mu f -> r
ana :: forall f s. Functor f => (s -> f s) -> s -> Nu f

which are in some sense parameterized by the shape of the structure (f),
rather than just being parameterized by what you can put in some fixed
structure. I think there is at least one framework building on the above
(the name of which escapes me), but there are others that use other ideas,
like Uniplate and Scrap Your Boilerplate.
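 
For readers who haven't seen the fixed-point view, a minimal
self-contained sketch (mine) of cata over a list base functor:
 
{-# LANGUAGE DeriveFunctor #-}
 
-- The least fixed point of a functor f.
newtype Mu f = In (f (Mu f))
 
-- Catamorphism: fold any Mu f given an algebra for one layer of f.
cata :: Functor f => (f r -> r) -> Mu f -> r
cata phi (In x) = phi (fmap (cata phi) x)
 
-- Lists as the fixed point of their base functor.
data ListF a r = NilF | ConsF a r deriving Functor
 
fromList :: [a] -> Mu (ListF a)
fromList = foldr (\x r -> In (ConsF x r)) (In NilF)
 
-- total (fromList [1,2,3]) == 6
total :: Mu (ListF Int) -> Int
total = cata alg
  where alg NilF        = 0
        alg (ConsF x r) = x + r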

Clean also has built-in support for this kind of generic programming, but
finding information on it is difficult, as I recall, because the section on
it in the Clean report is blank for some reason.

With regard to another message, GHC at least now implements associated
types, so that bubble can be filled in all the way for it now. :)

-- Dan
raould
2010-02-10 18:12:57 UTC
Permalink
Post by Dan Doel
Clean also has built-in support for this kind of generic programming, but
finding information on it is difficult, as I recall, because the section on
it in the Clean report is blank for some reason.
i wish clean were doing better. it seemed like a nice idea in lots of
ways, but i think it really seriously fell down in lots of other ways,
mostly usability related (their IDE is horrible evilness, and the
uniqueness typing apparently becomes a bear according to the DDC
thesis).

sounds like they are "not dead yet" and really /are/ turning clean
into a competitor for ghc; clean will be a compiler for haskell.
Post by Dan Doel
With regard to another message, GHC at least now implements associated
types, so that bubble can be filled in all the way for it now. :)
yes, that is very cool, it seems to me!
Jon Harrop
2010-02-11 01:02:53 UTC
Permalink
Post by raould
the term generic seems to be over-used, and i'm not sure i grok all
the uses and/or how they overlap.
Everyone uses the term "generic programming" to refer to making definitions
more widely applicable (i.e. general), which seems to me indistinguishable
from code reuse and/or factoring. This is, in turn, strongly related
to brevity.

Lisp's macros were one of the earliest forms of generic programming and,
yet, modern uses of the term "generic programming" often refer specifically
to the expressiveness of static type systems. This seems to be particularly
prevalent in the context of Haskell where generic programming is conflated
with type constraints and any relevance to the true objective of code reuse
is lost.

For example, Haskellers Garcia et al. published a paper "An extended
comparative study of language support for generic programming" that was
actually more about static type constraints than it was about generalizing
definitions or improving brevity and reuse. Although they claimed to have
implemented graph theoretic code in several languages, their OCaml code
seems to have remained secret:

http://www.osl.iu.edu/research/comparing/

Garcia had published research on Haskell and apparently wrote great Haskell
solutions in this paper but I think their OCaml code leaves a lot to be
desired (and probably all of the other non-Haskell solutions as well) and
their relevant conclusions are simply wrong. Apparently they managed to get
this published without obtaining peer review from anyone familiar with the
other languages.

Even their first example (complete with superfluous parentheses and
semicolons):

class type comparable = object ('a) method better : 'a -> bool end

class apple init = object (self: 'a)
  val value_ : int = init
  method better (y: 'a) = self#value > y#value
  method value = value_;
end
 
let pick ((x: #comparable as 'a), (y: 'a)) : 'a =
  if x#better y then x else y

let a1 = (new apple 3);;
let a2 = (new apple 1);;
let a3 = pick (a1, a2);;

would normally be written:

let apple i = object
  method better y = i > y#value
  method value = i
end

let pick x y = if x#better y then x else y

let a1 = apple 3 and a2 = apple 1
let a3 = pick a1 a2

The essential difference is that OCaml can infer the relevant types and,
consequently, there is no need to define these types and their
relationships by hand. Ironically, their OCaml code inherited superfluous
definitions from the Haskell and they then concluded that these were a
disadvantage for OCaml because non-trivial combinations must be maintained
by hand (which is simply wrong: they are all automatically inferred).

Now look at their keystone OCaml solution for the graph theoretic problem:

class type ['vertex_t] vertex_list_graph = object
  method vertices : 'vertex_t list
  method num_vertices : int
end
 
class type ['vertex_t] edge = object
  method source : 'vertex_t
  method target : 'vertex_t
end
 
class type ['vertex_t, 'edge_t] incidence_graph = object
  constraint 'edge_t = 'vertex_t #edge
  method out_edges : 'vertex_t -> 'edge_t list
end

In the context of generic programming, these type definitions are entirely
superfluous. Sadly, they use these superfluous definitions to justify the
conclusion that OCaml is needlessly verbose.

They do give the following concrete implementation of an adjacency-list
based graph type in their paper:

class algraph_edge (s: int) (t: int) = object
  val src = s
  val tgt = t
  method source = src
  method target = tgt
end
 
class adjacency_list (num_vertices_) = object
  val g = Array.make num_vertices_ []
 
  method add_edge (src, tgt: int * int) = g.(src) <- (tgt::g.(src))
 
  method vertices =
    let rec floop (i: int) =
      if i = num_vertices_ then [] else i::floop(i+1) in
    floop 0
 
  method num_vertices = num_vertices_
 
  method out_edges v = List.map (fun n -> new algraph_edge v n) g.(v)
 
  method adjacent_vertices v = g.(v)
 
  method edges =
    let (_, result) = Array.fold_left
      (fun (src, (sofar: (algraph_edge) list)) (tgts: int list) ->
        (src+1, List.append
                  (List.map (fun n -> new algraph_edge src n) tgts)
                  sofar))
      (0, []) g in
    result
 
  method create_property_map : 'a. 'a -> 'a array =
    fun def -> Array.make num_vertices_ def
end
 
let graph_search
    (graph: (('vertex_t, 'edge_t) #incidence_graph) as 'graph_t)
    (v: 'vertex_t)
    (q: 'value_t #buffer)
    (vis: ('graph_t, 'vertex_t, 'edge_t) #visitor)
    (map: ('color_t, 'key_t) #color_map) = ...

This is just abhorrent and could not be less representative of a real OCaml
solution.

Why the superfluous immutable private data in the "algraph_edge" class?

Why are they defining classes that only have the effect of making the code
less generic?

Why are they mixing mutable (array) and immutable (list) data structures
like this?

Why the unnecessary nested function in the "vertices" method to create the
list [0..n-1]?

Why did they not annotate the type of "g" where it might actually help but
did annotate the arguments to the inner anonymous function in the "edges"
method?

You can rewrite their 46-line OCaml solution with the following 14-line
OCaml solution:

let edge s t = object method source = s method target = t end
 
let adjacency_list n = object
  val g = Array.init n (fun _ -> Stack.create())
 
  method add_edge (src, tgt) = Stack.push tgt g.(src)
  method vertices = Array.init n id
  method num_vertices = n
  method out_edges v = g.(v)
  method adjacent_vertices v = g.(v)
  method edges =
    let es = Stack.create() in
    Array.iteri (fun u -> Stack.iter (fun v -> Stack.push (edge u v) es)) g;
    es
end

This uses more idiomatic data structures but that is irrelevant in this
context.

Finally, they claim that the OCaml solution cannot support separate
compilation. Is that true? I doubt it...
Post by raould
but anyway, i'm attempting to learn
more about the subject. some things i've read [1] make it sound to me
like c++ is the most complete language when it comes to generic
approaches.
The problem with C++ (and Haskell) is that the Turing complete type system
gives you complete generality in some uselessly-impractical sense, i.e. it
is a Turing argument.
Post by raould
of course that makes me sad. (maybe d does well, i dunno.)
supposedly scala fares better than most, but still falls at some
point. any thoughts? i'm looking to learn about such things in a
concrete real world language rather than e.g. an extension to ghc.
You must take into account the applicability of these "solutions" in real
programs. The .NET guys are acutely aware of many of the academic features
found in languages like Haskell but they choose not to adopt them because
they do not solve real problems.
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u
Jon Harrop
2010-02-11 01:37:00 UTC
Permalink
Post by Jon Harrop
For example, Haskellers Garcia et al. published a paper "An extended
comparative study of language support for generic programming" that was
actually more about static type constraints than it was about generalizing
definitions or improving brevity and reuse.
Incidentally, I was surprised they did not even mention genericity over data
structures because this is a well-known example where Haskell falls down on
something MLs have handled well for decades using their higher-order module
system (which Haskell lacks):

See the article "A SOMEWHAT FAILED ADVENTURE IN HASKELL ABSTRACTION" by
Lennart Augustsson, for example:

http://augustss.blogspot.com/2008/12/somewhat-failed-adventure-in-haskell.html
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u
Ertugrul Söylemez
2010-02-11 09:17:53 UTC
Permalink
Post by Jon Harrop
but anyway, i'm attempting to learn more about the subject. some
things i've read [1] make it sound to me like c++ is the most
complete language when it comes to generic approaches.
The problem with C++ (and Haskell) is that the Turing complete type
system gives you complete generality in some uselessly-impractical
sense, i.e. it is a Turing argument.
Once again you're just proving that you have no idea what you're talking
about. Haskell's type system is not Turing-complete. You specifically
need language extensions (like UndecidableInstances) for that. Secondly
if you decide to make it Turing-complete, you get a few interesting
features like type arithmetic.
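 
For instance, the classic type-arithmetic encoding (a sketch of mine,
not Ertugrul's) does exactly this, and indeed needs
UndecidableInstances to get past GHC's instance-coverage check:
 
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies,
             FlexibleInstances, UndecidableInstances #-}
 
-- Peano naturals at the type level.
data Z   = Z
data S n = S n
 
-- Addition as a type-level relation: a + b = c.
class Add a b c | a b -> c where
  add :: a -> b -> c
 
instance Add Z b b where
  add Z b = b
 
-- This instance is the one that requires UndecidableInstances.
instance Add a b c => Add (S a) b (S c) where
  add (S a) b = S (add a b)
 
-- :t add (S (S Z)) (S Z)   -- infers S (S (S Z))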

You may want to tell us what's wrong with a Turing-complete type system
anyway.
Post by Jon Harrop
of course that makes me sad. (maybe d does well, i dunno.)
supposedly scala fares better than most, but still falls at some
point. any thoughts? i'm looking to learn about such things in a
concrete real world language rather than e.g. an extension to ghc.
You must take into account the applicability of these "solutions" in
real programs. The .NET guys are acutely aware of many of the academic
features found in languages like Haskell but they choose not to adopt
them because they do not solve real problems.
What do you think C# LINQ and F# computation expressions are? Yes,
Haskell monads! Why does F# support a lazy evaluation construct
(despite that it's just function wrapping)? Some other F# features
include ADTs and pattern matching. There are not many more "academic"
features left.

Clearly even Microsoft knows the practical value of Haskell's language
features. You on the other hand don't.


Greets
Ertugrul
--
nightmare = unsafePerformIO (getWrongWife >>= sex)
http://blog.ertes.de/
raould
2010-02-11 19:36:14 UTC
Permalink
Post by Ertugrul Söylemez
You may want to tell us what's wrong with a Turing-complete type system
anyway.
this is a complete aside, but i think Qi-Lisp purports to have a
turing-complete and user-extensible-ish type system.
raould
2010-02-11 19:43:06 UTC
Permalink
Post by Ertugrul Söylemez
Clearly even Microsoft knows the practical value of Haskell's language
features.  You on the other hand don't.
i do not currently work at MSFT, and when i was there i was not in
the .NET team, but nevertheless i believe i have enough insight to
say:

As with any team of size and age, the .NET team does not have one
single perspective on programming language design or theory. i would
say that the original C# team was less PLT, and the later folks who
came along and influenced .NET, like Meijer and Syme and SPL,
obviously were more PLT.

point being that when you look at .NET as it is today, it is a
melange.

sincerely.
Jon Harrop
2010-02-12 00:17:22 UTC
Permalink
Post by Ertugrul Söylemez
Post by Jon Harrop
The problem with C++ (and Haskell) is that the Turing complete type
system gives you complete generality in some uselessly-impractical
sense, i.e. it is a Turing argument.
...
Haskell's type system is not Turing-complete. You specifically
need language extensions (like UndecidableInstances) for that.
So Haskell's type system can be Turing complete, just as I said.
Post by Ertugrul Söylemez
You may want to tell us what's wrong with a Turing-complete type system
anyway.
They are only used to emulate features that are better implemented directly,
e.g. metaprogramming using C++ templates.
Post by Ertugrul Söylemez
Post by Jon Harrop
You must take into account the applicability of these "solutions" in
real programs. The .NET guys are acutely aware of many of the academic
features found in languages like Haskell but they choose not to adopt
them because they do not solve real problems.
What do you think C# LINQ and F# computation expressions are? Yes,
Haskell monads! Why does F# support a lazy evaluation construct
(despite that it's just function wrapping)? Some other F# features
include ADTs and pattern matching. There are not much more "academic"
features left.
To the extent that they are present, I do not consider those features
to be academic.

I was referring to features such as:

. Purity.
. Non-strict evaluation as the default.
. Multiparameter type classes.
. Monad transformers.
. Stream fusion.
. Generalized algebraic datatypes.
...
Post by Ertugrul Söylemez
Clearly even Microsoft knows the practical value of Haskell's language
features. You on the other hand don't.
I write commercial literature on all of the topics you just mentioned and
none of the ones I just mentioned. And I'm not the only one...
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u
raould
2010-02-11 19:44:24 UTC
Permalink
Post by Jon Harrop
Garcia had published research on Haskell and apparently wrote great Haskell
solutions in this paper but I think their OCaml code leaves a lot to be
desired (and probably all of the other non-Haskell solutions as well) and
their relevant conclusions are simply wrong. Apparently they managed to get
this published without obtaining peer review from anyone familiar with the
other languages.
thanks for the review, it is interesting to me, and reassuring that
O'Caml isn't out of the running as it were.

sincerely.