Re: Parallel sec scan in plpgsql - Mailing list pgsql-hackers
From | Alex Ignatov |
---|---|
Subject | Re: Parallel sec scan in plpgsql |
Date | |
Msg-id | [email protected] Whole thread Raw |
In response to | Re: Parallel sec scan in plpgsql (Amit Kapila <[email protected]>) |
List | pgsql-hackers |
On 18.09.2016 06:54, Amit Kapila wrote: > On Fri, Sep 16, 2016 at 8:48 PM, Alex Ignatov <[email protected]> wrote: >> >> On 16.09.2016 16:50, Amit Kapila wrote: >>> >>> >>> Can you try by setting force_parallel_mode = off;? I think it is >>> sending the whole function execution to worker due to >>> force_parallel_mode. >>> >>> >> >> No changes: >> > > Okay, it just skipped from my mind that we don't support parallel > queries for SQL statement execution (or statements executed via > exec_stmt_execsql) from plpgsql. For detailed explanation of why that > is not feasible you can refer one of my earlier e-mails [1] on similar > topic. I think if we can somehow get the results via Perform > statement, then it could be possible to use parallelism via plpgsql. > > However, you can use it via SQL functions, an example is below: > > set min_parallel_relation_size =0; > set parallel_tuple_cost=0; > set parallel_setup_cost=0; > > Load 'auto_explain'; > set auto_explain.log_min_duration = 0; > set auto_explain.log_analyze = true; > set auto_explain.log_nested_statements = true; > > create table test_plpgsql(c1 int, c2 char(1000)); > insert into test_plpgsql values(generate_series(1,100000),'aaa'); > > create or replace function parallel_test_set_sql() returns > setof bigint as $$ > select count(*) from test_plpgsql; > $$language sql PARALLEL SAFE STRICT STABLE; > > Then execute function as: select * from parallel_test_set_sql(); You > can see below plan if auto_explain module is loaded. > > Finalize Aggregate (cost=14806.85..14806.86 rows=1 width=8) (actual tim > e=1094.966..1094.967 rows=1 loops=1) > -> Gather (cost=14806.83..14806.84 rows=2 width=8) (actual time=472. > 216..1094.943 rows=3 loops=1) > Workers Planned: 2 > Workers Launched: 2 > -> Partial Aggregate (cost=14806.83..14806.84 rows=1 width=8) > (actual time=177.867..177.868 rows=1 loops=3) > -> Parallel Seq Scan on test_plpgsql (cost=0.00..14702.6 > 7 rows=41667 width=0) (actual time=0.384..142.565 rows=33333 loops=3) > CONTEXT: SQL function "parallel_test_set_sql" statement 1 > LOG: duration: 2965.040 ms plan: > Query Text: select * from parallel_test_set_sql(); > Function Scan on parallel_test_set_sql (cost=0.25..10.25 rows=1000 widt > h=8) (actual time=2538.620..2776.955 rows=1 loops=1) > > > In general, I think we should support the cases as required (or > written) by you from plpgsql or sql functions. We need more work to > support such cases. There are probably two ways of supporting such > cases, we can build some intelligence in plpgsql execution such that > it can recognise such queries and allow to use parallelism or we need > to think of enabling parallelism for cases where we don't run the plan > to completion. Most of the use cases from plpgsql or sql function > fall into later category as they don't generally run the plan to > completion. > > > [1] - https://www.postgresql.org/message-id/CAA4eK1K8kaO_jRk42-o2rmhSRbKV-3mR%2BiNVcONLdbcSXW5TfQ%40mail.gmail.com > Thank you for you sugestion! That works. But what we can do with this function: create or replace function parallel_test_sql(t int) returns setof bigint as $$ select count(*) from (select a,b,c,d,e,sum(bytes) from test where a>= $1 group by a,b,c,d,e)t; $$ language sql PARALLEL SAFE STRICT STABLE; explain (analyze,buffers) select * from parallel_test_sql(2); "Function Scan on parallel_test_sql (cost=0.25..10.25 rows=1000 width=8) (actual time=2410.789..2410.790 rows=1 loops=1)" " Buffers: shared hit=63696" "Planning time: 0.082 ms" "Execution time: 2410.841 ms" 2016-09-20 14:09:04 MSK [13037]: [75-1] user=ipdr,db=ipdr,app=pgAdmin III - Query Tool,client=127.0.0.1 LOG: duration: 2410.135ms plan: Query Text: select count(*) from (select a,b,c,d,e,sum(bytes) from test where a>= $1 groupby a,b,c,d,e)t; Aggregate (cost=230701.42..230701.43 rows=1 width=8) -> HashAggregate (cost=230363.59..230513.74 rows=15015width=28) Group Key: test.a, test.b, test.c, test.d, test.e -> Seq Scan on test (cost=0.00..188696.44 rows=3333372 width=20) Filter: (a >= $1) No parallelism again. Looks like that Filter: (a >= $1) breaks parallelism Alex Ignatov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
pgsql-hackers by date: