Parallel fetching #1
This is fantastic, thanks for doing it!
Hello, it's an honor to see this used in quicklisp.
@orivej, it's not clear what you mean by uninterruptible. If you expected that interrupting (C-c-c) None of the lparallel API defaults to such child-killing behavior (ouch, bad turn of phrase). lparallel can't know what the user wants, of course, so it defaults to the most general case.
I originally wrote the ptree algorithm to parallelize Makefile-like tasks, a case where you probably don't want to kill child tasks. For example, just because one C file failed to compile doesn't mean that compilation of unrelated files should be aborted.
[Update] Note this fetching could also be done by partitioning the data according to hostname then using pmap,
Ptrees have an advantage for long-running work like fetching because a computation can be resumed after being aborted (either by an error or by a user interrupt), if you ever need that functionality.
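The partition-by-hostname alternative mentioned in the comment above could look roughly like the sketch below. It is not the code from the original comment. `source-hostname` and `fetch-source` are hypothetical stand-ins (a source is modeled as a `(hostname . name)` pair), and the kernel size of 4 is an arbitrary assumption. Only `lparallel:pmap`, `lparallel:make-kernel` and `lparallel:*kernel*` are real lparallel names.

```lisp
;; Assumes Quicklisp is installed; lparallel is the only dependency.
(ql:quickload :lparallel)

;; Hypothetical stand-ins, not part of quicklisp or lparallel:
;; a source is a (hostname . name) pair and FETCH-SOURCE fakes a download.
(defun source-hostname (source) (car source))

(defun fetch-source (source)
  (sleep 0.1)
  (list :fetched (cdr source)))

(defun partition-by-hostname (sources)
  "Group SOURCES into lists that share a hostname, keeping the original order."
  (let ((table (make-hash-table :test #'equal))
        (hosts '()))
    (dolist (source sources)
      (let ((host (source-hostname source)))
        ;; Remember each hostname the first time it is seen.
        (unless (nth-value 1 (gethash host table))
          (push host hosts))
        (push source (gethash host table))))
    (mapcar (lambda (host) (reverse (gethash host table)))
            (reverse hosts))))

(defun fetch-all (sources)
  "Fetch host groups in parallel; sources within one group run sequentially."
  (lparallel:pmap 'list
                  (lambda (group) (mapcar #'fetch-source group))
                  (partition-by-hostname sources)))

;; Arbitrary worker count, chosen only for this sketch.
(setf lparallel:*kernel* (lparallel:make-kernel 4))

(fetch-all '(("a.example" . "foo") ("b.example" . "bar") ("a.example" . "baz")))
```

Each host is only ever contacted by one worker at a time, which is the same constraint the ptree version expresses through dependencies. The trade-off is that one very long host group cannot be split across workers.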
@lmj, I have not yet looked at the implementation to give an informed answer, yet I mean that after interrupting CALL-PTREE, new tasks continue to be spawned, and KILL-TASKS cannot stop it. For example, this runs to completion under SBCL after being interrupted with C-c C-c and calling
(let ((tree (lparallel:make-ptree))
      (range (loop for i from 0 to 100 collect i)))
  (dolist (i range)
    (lparallel:ptree-fn
     i nil (lambda (&aux (i i)) (print i) (sleep 1)) tree))
  (lparallel:ptree-fn 'all range (constantly nil) tree)
  (lparallel:call-ptree 'all tree))
@orivej Yes you're right, I hadn't considered that task production can exceed consumption in this case. This particular tree has many parallelizable tasks, presumably many more than the number of workers one would choose. The ptree algorithm finds all the parallelizable tasks and queues them up. After Instead of
In I also notice the need for
In general, deciding how to cancel a computation gracefully may require human judgement, as in the above case where
Reposting my comment at the blog:
I explored an implementation with lparallel ptrees.
PMAP-SOURCES builds a dependency ptree in which each source, except for the first one per hostname, depends on the previous source with the same hostname; it then adds a node that depends on all of the sources and has lparallel carry out the computation. CALL-WITH-SKIPPING is adjusted to establish its restart in worker threads and to guard its debug output with a mutex. UPDATE-WHAT-YOU-CAN takes an additional optional argument that turns on the parallel run.
The downside of ptrees is that they are, as it seems, uninterruptible.
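The dependency structure described for PMAP-SOURCES can be illustrated with a minimal sketch. This is not the actual quicklisp code: `fetch-with-ptree`, `source-hostname` and `fetch-source` are hypothetical names, sources are modeled as `(hostname . name)` pairs, and the kernel size is an arbitrary assumption. The lparallel calls (`make-ptree`, `ptree-fn`, `call-ptree`) are used the same way as in the example earlier in the thread.

```lisp
;; Assumes Quicklisp is installed; lparallel is the only dependency.
(ql:quickload :lparallel)

;; Hypothetical stand-ins, not part of quicklisp or lparallel.
(defun source-hostname (source) (car source))

(defun fetch-source (source)
  (sleep 0.1)
  (list :fetched (cdr source)))

(defun fetch-with-ptree (sources)
  "Chain sources that share a hostname, then compute a node depending on all."
  (let ((tree (lparallel:make-ptree))
        (last-id-for-host (make-hash-table :test #'equal))
        (ids '()))
    (loop for source in sources
          for id from 0
          do (let* ((host (source-hostname source))
                    (previous (gethash host last-id-for-host))
                    ;; Fresh binding so each closure captures its own source.
                    (source source))
               ;; A node depends on the previous node for the same host, if
               ;; any, so one host never sees two fetches at the same time.
               (lparallel:ptree-fn id
                                   (if previous (list previous) '())
                                   (lambda (&rest ignored)
                                     (declare (ignore ignored))
                                     (fetch-source source))
                                   tree)
               (setf (gethash host last-id-for-host) id)
               (push id ids)))
    ;; The final node depends on every source and collects their results.
    (lparallel:ptree-fn 'all (reverse ids) #'list tree)
    (lparallel:call-ptree 'all tree)))

;; Arbitrary worker count, chosen only for this sketch.
(setf lparallel:*kernel* (lparallel:make-kernel 4))

(fetch-with-ptree
 '(("a.example" . "foo") ("b.example" . "bar") ("a.example" . "baz")))
```

Chains for different hostnames have no edges between them, so lparallel is free to run them on separate workers, while the per-host chain keeps requests to a single server sequential.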