-ECPG modifies and extends the core grammar in a way that
-1) every token in ECPG is <str> type. New tokens are
- defined in ecpg.tokens, types are defined in ecpg.type
-2) most tokens from the core grammar are simply converted
- to literals concatenated together to form the SQL string
- passed to the server, this is done by parse.pl.
-3) some rules need side-effects, actions are either added
- or completely overridden (compared to the basic token
- concatenation) for them, these are defined in ecpg.addons,
- the rules for ecpg.addons are explained below.
-4) new grammar rules are needed for ECPG metacommands.
- These are in ecpg.trailer.
-5) ecpg.header contains common functions, etc. used by
- actions for grammar rules.
-
-In "ecpg.addons", every modified rule follows this pattern:
- ECPG: dumpedtokens postfix
-where "dumpedtokens" is simply tokens from core gram.y's
-rules concatenated together. e.g. if gram.y has this:
- ruleA: tokenA tokenB tokenC {...}
-then "dumpedtokens" is "ruleAtokenAtokenBtokenC".
-"postfix" above can be:
-a) "block" - the automatic rule created by parse.pl is completely
- overridden, the code block has to be written completely as
- it were in a plain bison grammar
-b) "rule" - the automatic rule is extended on, so new syntaxes
- are accepted for "ruleA". E.g.:
- ECPG: ruleAtokenAtokenBtokenC rule
- | tokenD tokenE { action_code; }
- ...
- It will be substituted with:
- ruleA: <original syntax forms and actions up to and including
- "tokenA tokenB tokenC">
- | tokenD tokenE { action_code; }
- ...
-c) "addon" - the automatic action for the rule (SQL syntax constructed
- from the tokens concatenated together) is prepended with a new
- action code part. This code part is written as is's already inside
- the { ... }
-
-Multiple "addon" or "block" lines may appear together with the
-new code block if the code block is common for those rules.
+ECPG's grammar (preproc.y) is built by parse.pl from the
+backend's grammar (gram.y) plus various add-on rules.
+Some notes:
+
+1) Most input matching core grammar productions is simply converted
+ to strings and concatenated together to form the SQL string
+ passed to the server. parse.pl can automatically build the
+ grammar actions needed to do this.
+2) Some grammar rules need special actions that are added to or
+ completely override the default token-concatenation behavior.
+ This is controlled by ecpg.addons as explained below.
+3) Additional grammar rules are needed for ECPG's own commands.
+ These are in ecpg.trailer, as is the "epilogue" part of preproc.y.
+4) ecpg.header contains the "prologue" part of preproc.y, including
+ support functions, Bison options, etc.
+5) Additional terminals added by ECPG must be defined in ecpg.tokens.
+ Additional nonterminals added by ECPG must be defined in ecpg.type.
+
+ecpg.header, ecpg.tokens, ecpg.type, and ecpg.trailer are just
+copied verbatim into preproc.y at appropriate points.
+
+ecpg.addons contains entries that begin with a line like
+ ECPG: concattokens ruletype
+and typically have one or more following lines that are the code
+for a grammar action. Any line not starting with "ECPG:" is taken
+to be part of the code block for the preceding "ECPG:" line.
+
+"concattokens" identifies which gram.y production this entry affects.
+It is simply the target nonterminal and the tokens from the gram.y rule
+concatenated together. For example, to modify the action for a gram.y
+rule like this:
+ target: tokenA tokenB tokenC {...}
+"concattokens" would be "targettokenAtokenBtokenC". If we want to
+modify a non-first alternative for a nonterminal, we still write the
+nonterminal. For example, "concattokens" should be "targettokenDtokenE"
+to affect the second alternative in:
+ target: tokenA tokenB tokenC {...}
+ | tokenD tokenE {...}
+
+"ruletype" is one of:
+
+a) "block" - the automatic action that parse.pl would create is
+ completely overridden. Instead the entry's code block is emitted.
+ The code block must include the braces ({}) needed for a Bison action.
+
+b) "addon" - the entry's code block is inserted into the generated
+ action, ahead of the automatic token-concatenation code.
+ In this case the code block need not contain braces, since
+ it will be inserted within braces.
+
+c) "rule" - the automatic action is emitted, but then the entry's
+ code block is added verbatim afterwards. This typically is
+ used to add new alternatives to a nonterminal of the core grammar.
+ For example, given the entry:
+ ECPG: targettokenAtokenBtokenC rule
+ | tokenD tokenE { custom_action; }
+ what will be emitted is
+ target: tokenA tokenB tokenC { automatic_action; }
+ | tokenD tokenE { custom_action; }
+
+Multiple "ECPG:" entries can share the same code block, if the
+same action is needed for all. When an "ECPG:" line is immediately
+followed by another one, it is not assigned an empty code block;
+rather the next nonempty code block is assumed to apply to all
+immediately preceding "ECPG:" entries.
+
+In addition to the modifications specified by ecpg.addons,
+parse.pl contains some tables that list backend grammar
+productions to be ignored or modified.
+
+Nonterminals that construct strings (as described above) should be
+given <str> type, which is parse.pl's default assumption for
+nonterminals found in gram.y. That can be overridden at need by
+making an entry in parse.pl's %replace_types table. %replace_types
+can also be used to suppress output of a nonterminal's rules
+altogether (in which case ecpg.trailer had better provide replacement
+rules, since the nonterminal will still be referred to elsewhere).
#!/usr/bin/perl
# src/interfaces/ecpg/preproc/parse.pl
-# parser generator for ecpg version 2
-# call with backend parser as stdin
+# parser generator for ecpg
+#
+# See README.parser for some explanation of what this does.
+#
+# Command-line options:
+# --srcdir: where to find ecpg-provided input files (default ".")
+# --parser: the backend gram.y file to read (required, no default)
+# --output: where to write preproc.y (required, no default)
#
# Copyright (c) 2007-2024, PostgreSQL Global Development Group
#
close($parserfh);
+# Cross-check that we don't have dead or ambiguous addon rules.
+foreach (keys %addons)
+{
+ die "addon rule $_ was never used\n" if $addons{$_}{used} == 0;
+ die "addon rule $_ was matched multiple times\n" if $addons{$_}{used} > 1;
+}
+
+
sub main
{
line: while (<$parserfh>)
my $rec = $addons{$block};
return 0 unless $rec;
- my $rectype = (defined $rec->{type}) ? $rec->{type} : '';
+ # Track usage for later cross-check
+ $rec->{used}++;
+
+ my $rectype = $rec->{type};
if ($rectype eq 'rule')
{
dump_fields($stmt_mode, $fields, ' { ');
}
=top
- load addons into cache
+ load ecpg.addons into %addons hash. The result is something like
%addons = {
- stmtClosePortalStmt => { 'type' => 'block', 'lines' => [ "{", "if (INFORMIX_MODE)" ..., "}" ] },
- stmtViewStmt => { 'type' => 'rule', 'lines' => [ "| ECPGAllocateDescr", ... ] }
+ stmtClosePortalStmt => { 'type' => 'block', 'lines' => [ "{", "if (INFORMIX_MODE)" ..., "}" ], 'used' => 0 },
+ stmtViewStmt => { 'type' => 'rule', 'lines' => [ "| ECPGAllocateDescr", ... ], 'used' => 0 }
}
=cut
my $filename = $srcdir . "/ecpg.addons";
open(my $fh, '<', $filename) or die;
- # there may be multiple lines starting ECPG: and then multiple lines of code.
- # the code need to be add to all prior ECPG records.
- my (@needsRules, @code, $record);
+ # There may be multiple "ECPG:" lines and then multiple lines of code.
+ # The block of code needs to be added to each of the consecutively-
+ # preceding "ECPG:" records.
+ my (@needsRules, @code);
- # there may be comments before the first ECPG line, skip them
+ # there may be comments before the first "ECPG:" line, skip them
my $skip = 1;
while (<$fh>)
{
- if (/^ECPG:\s(\S+)\s?(\w+)?/)
+ if (/^ECPG:\s+(\S+)\s+(\w+)\s*$/)
{
+ # Found an "ECPG:" line, so we're done skipping the header
$skip = 0;
+ # Validate record type and target
+ die "invalid record type $2 in addon rule for $1\n"
+ unless ($2 eq 'block' or $2 eq 'addon' or $2 eq 'rule');
+ die "duplicate addon rule for $1\n" if (exists $addons{$1});
+ # If we had some preceding code lines, attach them to all
+ # as-yet-unfinished records.
if (@code)
{
for my $x (@needsRules)
@code = ();
@needsRules = ();
}
- $record = {};
+ my $record = {};
$record->{type} = $2;
$record->{lines} = [];
- if (exists $addons{$1}) { die "Ga! there are dups!\n"; }
+ $record->{used} = 0;
$addons{$1} = $record;
push(@needsRules, $record);
}
+ elsif (/^ECPG:/)
+ {
+ # Complain if preceding regex failed to match
+ die "incorrect syntax in ECPG line: $_\n";
+ }
else
{
+ # Non-ECPG line: add to @code unless we're still skipping
next if $skip;
push(@code, $_);
}
}
close($fh);
+ # Deal with final code block
if (@code)
{
for my $x (@needsRules)