Re: Fast COPY FROM based on batch insert - Mailing list pgsql-hackers
From | Andrey Lepikhov |
---|---|
Subject | Re: Fast COPY FROM based on batch insert |
Date | |
Msg-id | [email protected] Whole thread Raw |
In response to | Re: Fast COPY FROM based on batch insert (Etsuro Fujita <[email protected]>) |
Responses |
Re: Fast COPY FROM based on batch insert
|
List | pgsql-hackers |
On 18/7/2022 13:22, Etsuro Fujita wrote: > On Thu, Mar 24, 2022 at 3:43 PM Andrey V. Lepikhov > <[email protected]> wrote: >> On 3/22/22 06:54, Etsuro Fujita wrote: >>> * To allow foreign multi insert, the patch made an invasive change to >>> the existing logic to determine whether to use multi insert for the >>> target relation, adding a new member ri_usesMultiInsert to the >>> ResultRelInfo struct, as well as introducing a new function >>> ExecMultiInsertAllowed(). But I’m not sure we really need such a >>> change. Isn’t it reasonable to *adjust* the existing logic to allow >>> foreign multi insert when possible? >> Of course, such approach would look much better, if we implemented it. > >> I'll ponder how to do it. > > I rewrote the decision logic to something much simpler and much less > invasive, which reduces the patch size significantly. Attached is an > updated patch. What do you think about that? > > While working on the patch, I fixed a few issues as well: > > + if (resultRelInfo->ri_FdwRoutine->GetForeignModifyBatchSize != NULL) > + resultRelInfo->ri_BatchSize = > + > resultRelInfo->ri_FdwRoutine->GetForeignModifyBatchSize(resultRelInfo); > > When determining the batch size, I think we should check if the > ExecForeignBatchInsert callback routine is also defined, like other > places such as execPartition.c. For consistency I fixed this by > copying-and-pasting the code from that file. > > + * Also, the COPY command requires a non-zero input list of attributes. > + * Therefore, the length of the attribute list is checked here. > + */ > + if (!cstate->volatile_defexprs && > + list_length(cstate->attnumlist) > 0 && > + !contain_volatile_functions(cstate->whereClause)) > + target_resultRelInfo->ri_usesMultiInsert = > + ExecMultiInsertAllowed(target_resultRelInfo); > > I think “list_length(cstate->attnumlist) > 0” in the if-test would > break COPY FROM; it currently supports multi-inserting into *plain* > tables even in the case where they have no columns, but this would > disable the multi-insertion support in that case. postgres_fdw would > not be able to batch into zero-column foreign tables due to the INSERT > syntax limitation (i.e., the syntax does not allow inserting multiple > empty rows into a zero-column table in a single INSERT statement). > Which is the reason why this was added to the if-test? But I think > some other FDWs might be able to, so I think we should let the FDW > decide whether to allow batching even in that case, when called from > GetForeignModifyBatchSize. So I removed the attnumlist test from the > patch, and modified postgresGetForeignModifyBatchSize as such. I > might miss something, though. Thanks a lot, maybe you forgot this code: /* * If a partition's root parent isn't allowed to use it, neither is the * partition. */ if (rootResultRelInfo->ri_usesMultiInsert) leaf_part_rri->ri_usesMultiInsert = ExecMultiInsertAllowed(leaf_part_rri); Also, maybe to describe in documentation, if the value of batch_size is more than 1, the ExecForeignBatchInsert routine have a chance to be called? -- regards, Andrey Lepikhov Postgres Professional
pgsql-hackers by date: