Parser¶
The Parser (c parser.c) performs an LR(1) parse of the input source using parse tables generated by the Parse Table Generator utility and tokens returned by the Scanner. For a description of the LR parsing method used, see references [6] and [7].
Refer to Appendix I for a listing of the grammar which drives the Parser.
Data Structures¶
Parse Stack — contains parse states and is maintained by the Parser. When a ‘read transition’ occurs, a state is added to the stack. When a reduction occurs, the stack is reduced by the number of tokens on the right hand side of the reduction minus one.
Semantic Stack — is maintained in parallel with the Parse Stack, and is
described in detail in section 5 since it is only used by the
Semantic Analyzer.
The variable,
stktop
,
is used to point to the current top of the semantic stack.
Whenever a read transition occurs, the Parser adds to the
semantic stack, information on the current token (for example symbol table
pointer) which the Scanner passed to it.
Parse Tables — tables created by the utility PRSTAB and which control the parsing process. These tables are described in the PRSTAB Maintenance and Operation Document, [7].
Processing Overview¶
The Parser uses parse tables generated by the Parse Table Generator utility (PRSTAB), and tokens returned by the Scanner. As each parse reduction is made, the Semantic Analyzer is called to perform the function associated with the particular reduction.
The parser maintains the parse stack. Semantic actions are done in parallel with reductions involving the parse stack. Figure 4-1 gives a pseudo-code description of the operation of the Parser.
loop for each statement in subprogram unit {
parse_init();
repeat /\* once for each input token in the statement \*/ {
get token;
while (a reduction can be made using the current
token as a look-ahead symbol) {
call Semantic Analyzer for this reduction;
if (sem.ignore_stmt) {
/\* skip this statement \*/
goto next_statement;
}
replace on the parse stack, the right hand side
symbols by the left hand side symbol for
this reduction;
current_state = new value determined from
Parse Tables;
stack current_state on Parse Stack;
}
if (there is a valid read transition) {
current_state = new value determined using
current_state, token, and Parse Tables;
stack current_state information on
Parse Stack.
}
else
error;
if (token == end_of_line) {
if end-of-program-unit return;
if at end-of-file { error; return; }
break;
}
}
next_statement:;
}
Figure 4-1: Parser Operation
The
parse_init
routine sets the Parse Stack to its initial
state.
parse_init
calls the
reset
routine in the lexical analyzer to begin processing tokens
from the next statement.
During semantic analysis, the semantic analyzer,
while processing reductions, may detect an error for which it’s
useless to process the remaining tokens in the statement.
For example, if an executable statement appears in an interface,
there’s no need to semantically analyze the statement.
The semantic analyzer can direct the parser to ignore the
remaining portion of the statement and to move on to the
next statement;
the global variable (a flag) for effecting this is
sem.ignore_stmt
.
The last token processed by the parser for a statement is
the
end of line
(
TK_EOL
)
token.
When this token is seen, the parser determines if the
END
statement is the current statement.
If it is, the parser returns to the caller (main)
since this represents the end of the current subprogram.
Parse Table Generator (PRSTAB)¶
Overview¶
The main function of PRSTAB is to generate data initializations
for the Parse Tables used by the Parser.
In addition, it generates three files of C
#define
directives
used to coordinate the operation of the Scanner, Parser, and
Semantic Analyzer, and two files of static character information
used for error messages and debugging.
PRSTAB generates a listing and cross reference of the grammar
which is used as Appendix I of this document.
Inputs¶
PRSTAB requires two input files:
Grammar Definition File. This file defines the grammar used by Flang. Reference [6] describes the format of this file; a short example is given here:
<entry statement> ::= <entry id> | <entry id> ( ) | <entry id> ( <id list> ) <entry id> ::= ENTRY <ident>
If the semantic actions (see section 5) are divided into multiple C source files, this file contains a
.BR
directive at the points in the grammar where the breaks occur.Token Definition File. This file contains one line for each terminal symbol of the grammar (token). Each line consists of the token name as it appears in the grammar definition file, followed by the name which will be used for the C constant symbol representing the token id number within Flang. These names must differ in the first 8 characters, and begin with
TK\_
. Example:<ident> TK_IDENT CALL TK_CALL + TK_PLUS
Outputs¶
There are six output files:
Parse Table Definition File. This file contains the C data definitions and initialization code for the Parse Tables. It is included directly into the Parser source module.
Token Id Number Definition File. This file contains a series of
#define
directives for the token id numbers. The symbol names used are those specified in the Token Definition File.Semantic Actions Definition File. This file contains a series of
#define
directives for the semantic action case labels (see section 5). There is one definition for each production of the grammar. The names for these symbols are constructed from the name of the symbol on the left hand side of each production, using an algorithm described in [6].Non-terminal Number Definition File. This file contains a series of
#define
directives for the grammar’s non-terminal symbols. The names for these symbols are of the formNT_<name>
where name is constructed from the symbol’s name in the grammar in an obvious fashion.Token Name Definition File. This file contains a statically initialized array indexed by terminal and non-terminal number giving the name (exactly as in the grammar) of each terminal and non-terminal symbol. It is used for error messages and debugging.
Grammar Listing and Cross Reference File. Lists grammar with alphabetic cross reference of grammar symbols at end. Used to create Appendix I of this document.