Semantic Analyzer¶
Overview¶
The Semantic Analyzer compiler module performs three major functions:
generates the first internal representation of the executable statements of the user’s subprogram (AST’s).
enters symbols and their attributes into the symbol table and related global data structures.
performs semantic error checking and issues appropriate diagnostic messages.
Data Structures¶
Global Data Structures¶
Symbol Table — described in section 11.
AST’s — internal representation of executable statements written to a temporary file. Refer to appendix IV for a list of the AST opcodes and descriptions of their meanings.
astb.df — external temporary file containing information generated from the processing of initializations. See section 14.
Data Initialization File — external temporary file containing information generated by the Semantic Analyzer and other phases to effect initializations of compiler-created variables. See section 14.
Reference File — external temporary file containing information on symbol usage for the Cross Reference Listing. See section 13.
Semantic Stack¶
The Semantic Stack is operated in parallel with the Parse Stack; the same variable is used to point to the top of each (see section 4).
Each stack entry consists of 5 words (of type INT). The contents of each stack entry depend on the corresponding grammar symbol. For example, for the symbol <ident>, the stack contains a symbol table pointer to the identifier; for the symbol <arg list>, the stack contains pointers to the beginning and end of the list, and so on.
Each stack entry also contains an AST field.
For terminal symbols of the grammar, the first word of the corresponding semantic stack entry is set to the token value returned by the Scanner (see section 3). For instance, for <integer> it is the 32-bit integer value of the constant. The exceptions to this rule are constants requiring more than 32 bits of storage, such as a complex constant. In these cases the semantic stack entry contains a symbol table pointer to the constant.
For a non-terminal symbol, the sections of semantic code associated with productions which generate the symbol are responsible for ensuring that the proper values are put into the stack entry.
Macros defined in semant.h are used to access the words of the semantic stack. These are of the form SST_<name>P, which stores a value into the field, and SST_<name>G, which retrieves it.
For example,
SST_SYMG(s);
SST_SYMP(s, sptr);
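To make the naming convention concrete, here is a minimal sketch of one P/G macro pair over a five-word entry. The SST struct, the choice of word, and the helper sst_roundtrip_sym are assumptions made for illustration; the actual definitions are those in semant.h.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t INT;              /* assumption: INT is a 32-bit integer */

/* Hypothetical stack entry: five INT words, as described above. */
typedef struct sst {
    INT word[5];
} SST;

/* Illustrative accessors. Placing the symbol pointer in the first word
 * follows the S_IDENT description; the real macros are in semant.h. */
#define SST_SYMG(p)    ((p)->word[0])
#define SST_SYMP(p, v) ((p)->word[0] = (v))

/* Store a symbol pointer with the P form and read it back with the G form. */
INT sst_roundtrip_sym(SST *entry, INT sptr)
{
    assert(entry != 0);
    SST_SYMP(entry, sptr);
    return SST_SYMG(entry);
}
```

Keeping all field access behind such macros means the layout of the entry can change without touching the semantic actions that use it.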
The most important type of stack entry is the one used for <expression> and a number of other associated grammar symbols such as <primary>, <postfix exp>, <add exp>, etc. This entry itself has several formats, depending on the value of the stack identifier. The following macros are used to reference the various fields described below:
FIELD | MACRO
stack type | SST_IDP, SST_IDG
AST pointer | SST_ASTP, SST_ASTG
Constant value | SST_CVALP, SST_CVALG
Symbol pointer | SST_SYMP, SST_SYMG
S_EXPR
AST’s have been written for the associated expression.
- word2
data type of the expression (pointer into dtype area, see section 11.4.1).
- word5
pointer to AST.
S_LVALUE
The expression is one which can be used as an lvalue, and AST’s have been written for it.
- word2
dtype.
- word3
symbol of lvalue.
- word4
shape.
- word5
pointer to AST.
S_LOGEXPR
Logical expression. An expression of the form e1 .and. e2 or e1 .or. e2, or the logical negation (.not. operator) of one of these. See discussion of logical expression processing below.
- word1
pointer to LAND or LOR AST.
- word2
dtype = DT_INT.
- word5
pointer to AST.
S_CONST
the associated expression is a constant.
- word1
32-bit constant value if data type of constant is an integer type, otherwise a symbol table pointer to an entry for the constant.
- word2
data type of the constant. Note that although the Scanner only returns constants of type DT_INT or DT_FLOAT, many other data types are possible here because of the constant folding of type cast operations.
- word5
pointer to AST.
S_STAR
created when * is seen as a specifier for a dimension.
S_VAL
created when the %VAL built-in is seen; word5 is the pointer to the AST of the operand.
S_IDENT
identifier
- word1
a symbol table pointer to an entry for the identifier.
- word2
data type of the identifier.
- word5
pointer to A_ID AST.
S_LABEL
label
- word1
a symbol table pointer to an entry for the label.
- word5
pointer to the A_LABEL AST.
S_STFUNC
statement function definition
- word1
a symbol table pointer to an entry for the statement function which is being defined.
- word2
pointer to the list of the statement function’s formal arguments.
S_ACONST
array constant
- word1
a symbol table pointer to an entry for the array temporary representing the constant.
- word2
its data type record.
- word3
the pointer to the array constructor list representing the values of the array.
Semantic Stack Lists¶
Often a list of entities must be tied to a semantic stack entry. This happens for identifier lists to program entry points, array subscript lists, subprogram call argument lists, and character substring lists. The semantic analyzer uses item lists to retain semantic stack information until enough information is gathered about the statement being parsed. For example,
foo(a, b, 1, 17, c)
could be an array reference or a function call. Therefore, the semantic analyzer cannot generate ASTs while processing each of the arguments. The only alternative is to save any required information about the arguments in an item list until it is determined what kind of reference is being made.
The following structure defines an item list.
typedef struct xyyz {
    struct xyyz *next;
    int ast;
    union {
        int sptr;
        struct sst *stkp;
        INT conval;
    } t;
} ITEM;
The first field of an item entry is used to point to the next item in the list. A value of LIST_END is used to mark the end of the list. The second field is the AST pointer of the argument.
There are three types of item entries: symbol pointer, semantic stack pointer, and constant value. They are each handled in a similar manner. The semantic stack item list entry is discussed here because it is more complex.
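As an illustration of how such a list is built and walked, the following sketch appends constant-value items and counts them. The definition of LIST_END as a null pointer and the helper names item_append_const and item_count are assumptions; the real sentinel and allocation scheme may differ.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef int32_t INT;                  /* assumption: 32-bit INT */
struct sst;                           /* semantic stack entry, opaque here */

typedef struct xyyz {
    struct xyyz *next;
    int ast;
    union {
        int sptr;
        struct sst *stkp;
        INT conval;
    } t;
} ITEM;

#define LIST_END ((ITEM *)0)          /* assumption: actual sentinel may differ */

/* Append a constant-value item and return the (possibly new) list head. */
ITEM *item_append_const(ITEM *head, int ast, INT conval)
{
    ITEM *item = malloc(sizeof *item);
    assert(item != NULL);
    item->next = LIST_END;
    item->ast = ast;
    item->t.conval = conval;
    if (head == LIST_END)
        return item;
    ITEM *tail = head;
    while (tail->next != LIST_END)
        tail = tail->next;
    tail->next = item;
    return head;
}

/* Count the items, e.g. to compare against an array's rank once the
 * kind of reference is known. */
int item_count(const ITEM *head)
{
    int n = 0;
    for (const ITEM *p = head; p != LIST_END; p = p->next)
        n++;
    return n;
}
```

Deferring all decisions to a later walk of the list is what allows foo(a, b, 1, 17, c) to be treated as either a subscript list or an argument list.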
An argument list that could represent an array reference, character substring, or function call is represented by an item list of semantic stack pointers. An argument can be:
- <expression>
a simple expression which denotes a function argument or a single subscript,
- <ident> = <expression>
an identifier keyword which is allowed in certain intrinsic or subprogram calls,
- <expression>:<expression>:<expression>
or a subscript triplet which denotes a vector slice.
An argument that is a simple expression is represented by a semantic stack entry of the normal type S_IDENT or S_EXPR.
An argument that is a keyword entry is represented by the corresponding semantic stack id of S_KEYWORD. The SST_E3 field of the semantic stack entry points to another semantic stack entry containing the expression.
An argument that is a subscript triplet is represented by a semantic stack id of S_TRIPLE. The SST_E1, SST_E2, and SST_E3 fields of that semantic stack entry point to other semantic stack entries which represent the initial element, the ending element, and the stride for the triplet notation.
Structure Stack¶
The structure stack is used to keep track of the current nesting of STRUCTURE, UNION, and MAP statements. The variables sem.stsk_base and sem.stsk_depth are used to find the current stack top. The “Structure Processing” section describes how the structure stack is used by the semantic analyzer.
The form of the structure stack is:
typedef struct {
    char type;
    int sptr;
    int dtype;
    int last;
    CONST *ict_beg;
    CONST *ict_end;
} STSK;
- type
the type of statement that a stack entry represents (‘s’, ‘u’, or ‘m’ for STRUCTURE, UNION, or MAP, respectively).
- sptr
symbol table pointer to the first symbol in a linked list of symbols representing the field name list of a STRUCTURE statement. For example, for STRUCTURE /A/ B,C,D(10), sptr would point to B, which is linked to C and D. For UNION and MAP, this field is the symbol table pointer to the compiler-created ST_MEMBER symbol of type TY_UNION or TY_STRUCT, respectively.
- dtype
data type pointer to an entry of type TY_STRUCT (TY_UNION) for the current structure.
- last
symbol table pointer to the last member which belongs to the structure with respect to the scope. All members which are at the same scoping level are linked together via the VARIANT field in reverse order.
- ict_beg
pointer to the beginning of the initializer constant tree for the current structure.
- ict_end
pointer to the end of the initializer constant tree for the current structure.
Initializer Constant Tree¶
An Initializer Constant Tree is built (in dynamic storage) as the initializer for a static or external variable is processed (for automatic variables, AST’s are generated as for an ordinary expression, then an assignment statement is simulated).
The tree consists of nodes linked by (absolute) pointers, and its structure parallels the tree defined by the nesting of braces ({}) in the C language form of the initializer and by STRUCTUREs in the Fortran language form of the initializer.
Normally, when processing a data initialization statement, the initializer constant tree is allocated, built, passed to dinit.c to generate dinit records, and finally deallocated. This all occurs during the processing of a single Fortran statement. This works for DATA statements and type declaration statements but will not work for STRUCTURE statements. STRUCTURE statements should not cause dinit records to be written. The dinit records should only be written for an instance of a structure declared with a RECORD statement. This implies that the initializer constant tree cannot be deallocated during end of statement processing. In fact, structure initializer constant trees must be allocated from memory that is not deallocated until end of program module processing.
The format of nodes is defined by the following C structure declaration:
struct const {
    struct const *next;
    struct const *subc;
    INT conval;
    INT repeat;
    int sptr;
    int dtype;
};
There are two types of nodes (distinguished by the value of subc):
Set Node (subc != 0): represents a set of constants in a structure group.
- next
pointer to next element of set (if any) which is contained within the parent set. NULL if this set is at the top level, or is the last element.
- subc
pointer to the first element in the initializer constant tree of the subordinate structure.
- conval
not used.
- repeat
not used.
- sptr
pointer to symbol table entry of variable, array, or structure member to initialize. If zero, initialization continues in the area currently being initialized.
- dtype
data type record of the structure.
Terminal Node (subc = 0): represents the occurrence of a constant (or constant expression) in the initializer.
- next
same as for Set Nodes.
- subc
NULL.
- conval
32-bit constant value for integer types, else symbol table pointer to entry for constant.
- repeat
the number of times to repeat the constant in conval.
- sptr
same as for Set Nodes.
- dtype
data type of the constant.
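The distinction between the two node kinds can be sketched with a small recursive walk that counts how many scalar constants a tree supplies. The struct tag const_node, the function name ict_count, and the assumption that repeat is at least 1 on every Terminal Node are illustrative only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int32_t INT;                 /* assumption: 32-bit INT */

/* Same fields as the node declaration above; the tag is renamed in this
 * sketch because "const" is a C keyword. */
typedef struct const_node {
    struct const_node *next;
    struct const_node *subc;
    INT conval;
    INT repeat;
    int sptr;
    int dtype;
} CONST;

/* A Terminal Node contributes its repeat count; a Set Node contributes
 * whatever its subordinate tree supplies. */
long ict_count(const CONST *node)
{
    long total = 0;
    for (; node != NULL; node = node->next) {
        if (node->subc != NULL) {
            total += ict_count(node->subc);   /* Set Node */
        } else {
            assert(node->repeat >= 1);        /* assumed invariant */
            total += node->repeat;            /* Terminal Node */
        }
    }
    return total;
}
```

A count of this kind is one way to detect that an initializer supplies more constants than the target can hold.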
Initializer Variable List¶
An Initializer Variable List goes hand in hand with the Initializer Constant Tree. It contains the list of variables to be initialized by the constants in the Initializer Constant Tree. It is built in dynamic storage as the initializer for a static or external variable is processed. The Initializer Variable List is only used during DATA statement processing. Therefore, its memory space can be released during end of statement processing.
The format of the Initializer Variable List is:
struct dinit_var {
    short id;  /* {Dostart, Doend, Varref} */
    union {
        struct {
            short indvar;
            short lowbd, upbd, step;
        } dostart;
        struct {
            struct dinit_var *dostart;
        } doend;
        struct {
            int id;
            int ptr;
            int dtype;
            int shape;
        } varref;
    } u;
    struct dinit_var *next;
};
The next field links together more than one variable list element. A variable list element can be one of three types: a simple variable reference, an implied-do start, or an implied-do end.
Simple Variable: This is a variable, array, or array element reference. Information from the semantic stack is copied to this entry.
- id
This holds the value of the SST_ID field in the semantic stack.
- ptr
This can either be a symbol table pointer or an AST pointer. It will be an AST pointer for an array element reference (i.e. it points to a SUBSCR AST).
- dtype
This contains the data type of the variable from the semantic stack.
- shape
This contains the shape of vector references from the semantic stack.
Do-start: This marks the beginning of an implied DO-loop.
- indvar, lowbd, upbd, step
These contain AST pointers to the index variable, the lower bound, the upper bound, and the step increment for the DO-loop.
Do-end: This marks the end of an implied DO-loop. It simply points back to the associated Do-start entry.
Loop Stack¶
The Loop Stack is used to keep track of the current nesting of do, while, and forall loops, and where and block if statements. It consists of fixed size records and is allocated a contiguous area of dynamic storage. The variable sem.loop_depth is used to find the current stack top.
Each record contains a field which is the beginning line number of the control statement. The remaining contents of a record depend on the type of loop it represents:
Do Loop:
- do_label
symbol table pointer to the label of the last statement in the loop; this field may be zero.
- doinfo
pointer to the DOINFO record for the do/dowhile loop (see below).
- name
construct name. This is just an index into the symbol names area; this field is zero if the construct is unnamed.
- exit_label
pointer to the symbol table entry for the label of any EXIT statement which appeared in the DO body; 0 if an EXIT statement did not appear in the body.
- cycle_label
pointer to the symbol table entry for the label of any CYCLE statement which appeared in the DO body; 0 if a CYCLE statement did not appear in the body.
Do While Loop:
- do_label
symbol table pointer to the label of the last statement in the loop; this field may be zero.
- doinfo
pointer to the DOINFO record for the do/dowhile loop.
- name
construct name.
- exit_label
pointer to the symbol table entry for the label of any EXIT statement which appeared in the DO body; if an EXIT statement did not appear in the body, a label is created. When the terminating statement of the dowhile statement is processed, the do while loop is transformed into:
top_label: if (.not. <dowhile expr>) goto exit_label;
    <body of dowhile>
    goto top_label;
exit_label: <statement after dowhile>
- cycle_label
pointer to the symbol table entry for the label of any CYCLE statement which appeared in the DO body; 0 if a CYCLE statement did not appear in the body.
- top_label
pointer to the symbol table entry for the compiler-created label which represents the top of the loop.
Forall Loop:
- do_label
0
- doinfo
pointer to the DOINFO record for the do/dowhile loop (see below).
- name
0
Block IF:
- do_label
0
- doinfo
0
- name
construct name
Where:
- do_label
0
- doinfo
0
- name
construct name
For each DO and DOWHILE loop, a DOINFO record is created to record additional information for the construct:
DOINFO:
- index_var
pointer to the symbol table entry for the DO index variable (DO loop only).
- init_expr
ast of the initial expression (DO loop only).
- step_expr
ast of the increment expression (DO loop only).
- limit_expr
ast of the limit expression (DO loop only).
- count
ast of the expression which computes the loop count (DO loop only).
Array Constructor List¶
When an array constructor is parsed, a list of array constructor items is created. An item, during parsing, is either an expression or an implied do construct. Each item is built in dynamic storage which is released during end of statement processing.
The format of an array constructor item is:
typedef struct _acl {
    int id;
    struct _acl *next;
    union {
        struct sst *stkp;
        int ast;
        struct _acl *aclp;
    } t;
    union {
        DOINFO *doinfo;
        INT count;
    } u;
} ACL;
The next field links together the items. The id of an array constructor item while parsing is one of AC_EXPR and AC_IDO.
AC_EXPR: This represents an item which is an expression. Information from the semantic stack is copied to this entry.
- stkp
This holds the value of the semantic stack for the expression.
AC_IDO: This marks the beginning of an implied DO-loop.
- aclp
This is a pointer to an array constructor list under the control of the implied do.
- doinfo
This is a pointer to the DOINFO record created for the implied do.
Other types of array constructor items are created when the list is actually processed (after the parsing is complete). The other fields in the structure are used during this processing.
Processing¶
Overview¶
The Semantic Analyzer code can be divided into three parts:
Initialization - The semant_init routine, which is called from the compiler init routine to initialize the Semantic Analyzer data items and allocate space for certain Semantic Analyzer data structures.
semantic actions - The body of the Semantic Analyzer logically consists of a large switch statement with one case for each production of the grammar (case labels are created by the parse table generator utility - see section 4). Because of the number of productions, the Semantic Analyzer is divided into 4 separate files: semant deals with declarations; semant2 deals with expressions and simple statements; semant3 deals with allocate statements, conditional statements, branching and call/function statements; semantio deals with I/O statements. Each semantic action is responsible for performing the processing associated with the particular production.
utility routines - routines called by the semantic actions to do such things as change expression types, perform constant folding, etc. These routines are found in the module semutil.c. The module dinit.c contains the routines to implement data initialization statements and is discussed in the chapter.
The remainder of this section discusses a number of the important semantic processing issues.
Declaration Processing¶
The base data type of a symbol is kept globally and modified by length specifiers and KIND specifiers. For example, INTEGER*2 will cause the global data type to be DT_SINT. When the symbols in the declaration list are analyzed, the global data type is used by default. Of course the symbol can have its own length specifier, in which case its data type is chosen accordingly. For example, INTEGER*2 ZIGGY*1 will result in the symbol ZIGGY having a data type DT_BINT. Length and KIND specifiers are only allowed in data type declaration statements. The statements DIMENSION and COMMON are not allowed to use length specifiers.
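The default-versus-override rule can be sketched as follows. The data type names DT_BINT, DT_SINT, and DT_INT come from this document; their numeric values, DT_NONE, and the helper names integer_dtype and symbol_dtype are placeholders invented for the sketch.

```c
#include <assert.h>

/* Data type names come from the text; the enum values are placeholders. */
enum { DT_NONE, DT_BINT, DT_SINT, DT_INT };

/* Map an INTEGER length specifier to a data type (hypothetical helper).
 * A length of 0 means "no specifier given". */
int integer_dtype(int length)
{
    switch (length) {
    case 1:  return DT_BINT;
    case 2:  return DT_SINT;
    case 0:
    case 4:  return DT_INT;
    default: return DT_NONE;      /* length not handled in this sketch */
    }
}

/* Pick the data type for one symbol in the declaration list: its own
 * length specifier wins, otherwise the statement-wide type applies. */
int symbol_dtype(int global_dtype, int symbol_length)
{
    assert(symbol_length >= 0);
    return symbol_length ? integer_dtype(symbol_length) : global_dtype;
}
```

For INTEGER*2 ZIGGY*1, the statement-wide type is integer_dtype(2) and the type chosen for ZIGGY is symbol_dtype(global, 1).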
Data type declaration statements, DIMENSION statements, and COMMON statements can modify a symbol to be an array. For example, the symbol ZIGGY could be defined as above and then specified as an array in a DIMENSION or COMMON statement later. The general rule is that the symbol’s type can change from a simple variable to an array once. If an attempt is made to change it to an array again, an error is flagged.
A declaration problem arises because of intrinsic functions. When the declaration statement REAL SIN is encountered, the compiler does not know whether the application programmer meant simply to reaffirm the declaration of the intrinsic function or to declare a local variable called SIN. The case is simple if the symbol SIN is declared as an array, as in REAL SIN(10), because this tells the compiler unequivocally that SIN is to be used as a local array. Likewise, if the compiler encounters the symbol SIN later in a DIMENSION, COMMON, or EQUIVALENCE statement, it knows that the symbol SIN is to lose its intrinsic properties.
If none of these events happens, the compiler must wait until the first reference to the symbol to determine its intended use. If the following first reference is encountered, the symbol SIN is assumed to be a local variable and the intrinsic SIN will not be available to the application programmer.
sin = sqrt(100)
If the following statement is encountered as the first reference, the symbol SIN is reaffirmed as the intrinsic SIN and the intrinsic is frozen; that is, it can only be used as an intrinsic in the current program section.
x = sin(x)
Interface Blocks¶
A subprogram for which an interface is explicitly specified is entered in the same scope (level 0) as the main subprogram and its variables. Any variables declared, such as the dummy arguments, are entered into the symbol table at a scope 1 greater than the current scope. When the END statement for the subprogram is seen, the scope is popped (the symbols in scope are removed from the symbol table’s hash table), thus hiding the symbols declared in the scope from the rest of the subprogram.
Since subprograms declared in an interface block do not inherit any information from their host, entities such as the implicit rules and named (parameter) constants are hidden from the semantic analysis of the interface block. Hiding the implicit rules is performed by pushing the current rules onto a stack, and references to outer-scoped parameters are hidden by restricting the scope levels which can be accessed while in an interface block.
Structure Processing¶
The top of the structure stack represents the STRUCTURE statement currently being parsed, and the preceding stack entries represent uncompleted STRUCTURE statements that were interrupted by nested STRUCTURE statements.
The goal of processing a STRUCTURE/END STRUCTURE block of statements is to create a data type for the structure being declared. The data type entry has a pointer to a linked list of structure members and a pointer to the initializer constant tree for the members that are data initialized (see “Initializer Constant Tree” section). The following example shows how the nesting of structures is handled by the structure stack.
structure /a/
integer b /2/
structure /c/ d
integer e /3/
end structure
end structure
The data type for structure tag a will point to the list of members b and d, and will point to an initializer constant tree containing the constant 2. At the point where structure c is encountered, the state for structure a must be saved on the structure stack so that it is not confused with structure c. Structure c has its own member list and initializer constant tree built and initialized in a new data type of its own. When the END STRUCTURE for structure c is encountered, the structure stack is popped and the processing for structure a is continued.
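The save-and-resume behaviour described for structures a and c can be sketched as a plain push/pop stack. The fixed capacity STSK_MAX, the reduced entry type STSK_SKETCH, and the function names are assumptions for illustration; the real stack is located through sem.stsk_base and sem.stsk_depth.

```c
#include <assert.h>

#define STSK_MAX 16                 /* assumption: fixed capacity for the sketch */

/* Reduced stack entry: only the fields this sketch needs. */
typedef struct {
    char type;                      /* 's', 'u', or 'm' */
    int sptr;
    int dtype;
} STSK_SKETCH;

typedef struct {
    STSK_SKETCH base[STSK_MAX];
    int depth;                      /* number of open statements */
} STSK_STACK;

/* Enter a STRUCTURE, UNION, or MAP: its state becomes the new top. */
void stsk_push(STSK_STACK *st, char type, int dtype)
{
    assert(st->depth < STSK_MAX);
    st->base[st->depth].type = type;
    st->base[st->depth].sptr = 0;
    st->base[st->depth].dtype = dtype;
    st->depth++;
}

/* Leave the innermost statement; the enclosing one becomes current again. */
void stsk_pop(STSK_STACK *st)
{
    assert(st->depth > 0);
    st->depth--;
}

/* Current (innermost) entry. */
STSK_SKETCH *stsk_top(STSK_STACK *st)
{
    assert(st->depth > 0);
    return &st->base[st->depth - 1];
}
```

In the example, the entry for a is pushed first, the entry for c is pushed on top of it, and popping at the inner END STRUCTURE makes the entry for a current again with its member list and initializer tree intact.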
Initializer Processing¶
The Initializer Constant Tree and Initializer Variable List are built as the Semantic Analyzer processes data initialization statements. These data structures are passed to the routine dinit in the module dinit.c, which matches each variable with a constant and writes the necessary records to the Data Initialization File. The Semantic Analyzer is responsible for handling the various special features of Fortran initializers, such as:
repeated constant fields
whole array initializations
partial array initializations
three forms of initialization syntax: DATA statements, type declaration statements, and STRUCTURE statements
The three forms of data initialization statements are discussed in the following sections.
DATA Statements¶
The following DATA statement will be used as an example.
INTEGER a, b(5), c(2,4,3)
DATA a, b, c(1,2,3), c(2,3,2) / 3*1, 5*4/
Data statement processing requires passing both an Initializer Constant Tree and an Initializer Variable List to the dinit routine. Processing requires a walk of both the Initializer Constant Tree and the Initializer Variable List, assigning constants to variables. The DATA statement is the only initializing statement that allows implied DO-loops and array element initialization. Record references are not allowed in DATA statements.
In the example the variable a and the first 2 elements of the array b will be initialized to the constant 1. Three elements of the array b and two elements in the array c will be initialized to the constant 4. Here are the Initializer Constant Tree and the Initializer Variable List that dinit would process.
IVL ---> a ---> b ---> c ---> c
ICT ---> next ---> next
         Term      Term
         3*1       5*4
         0         0
The entries for c will contain AST pointers for array element referencing. These AST pointers will be traced back by dinit to determine the array element being initialized.
The DATA statement is the only form of data initialization statement that allows implied DO-loops. The following example is used to describe the processing of implied DO-loops.
DATA ((a(i,j), i=1, 10), j = 21, 30) / 100*42.0 /
IVL ---> dostartj ---> dostarti ---> a ---> doendi ---> doendj
         j             i
         21            1
         30            10
         1             1
ICT ---> next
         Term
         100*42.0
         0
The values in the IVL structures above are actually AST links. When the array element reference to a is encountered, the AST links are scanned and the array indexes i and j are evaluated. The offset from the beginning of array a is computed so that the dinit record can be generated.
When doendi is encountered, the corresponding dostarti is located. Then the index variable i is incremented by the step amount and a test is made to see if the upper bound has been exceeded. If the inner implied DO-loop controlled by index variable i does not exceed its upper bound, the IVL pointer is reset to process a again, this time with an updated index variable i. Later, when the index variable i does exceed its upper bound, processing does not back up; it simply moves forward to the doendj entry. Processing for the outer implied DO-loop is exactly the same as for the inner DO-loop.
Type Declaration Initializations¶
Data initializations can occur within a type declaration statement. The following example will be used.
INTEGER a(30) /5*2, 5*3, 10*4/, b/1/
In this example the array a will be initialized with 20 constants: five locations with 2, five locations with 3, and ten locations with 4. The variable b will be initialized with the constant 1. Notice that the type declaration statement does two things. It declares the number of elements in an array and it initializes that array starting from its base address. Record references are not allowed with this form of data initialization. Implied DO-loop specifiers are not allowed with this type of declaration.
In the example above, the dinit routine will be called twice. An Initializer Variable List is not used. The variable information is embedded within the Initializer Constant Tree. Here is the Initializer Constant Tree for each call:
ICT ---> next ---> next ---> next
         Term      Term      Term
         5*2       5*3       10*4
         a         0         0
ICT ---> next
         Term
         1
         b
The first ICT will cause the dinit routine to begin with the base address of a. It will write dinit records to initialize the first five elements of a with the constant 2; then, since the next ICT entry does not have a new symbol table pointer entry, the next five constant 3’s will go into a(6) through a(10). The remaining ten constant 4’s are assigned locations similarly. Notice that it is not a problem that there are not enough constants to initialize the entire array. However, too many constants would be a problem.
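The placement rule can be sketched by copying the constants into an in-memory array. This is only an analogy: the real routine writes dinit records rather than storing values, and the reduced node type TERM and the name init_from_ict are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced Terminal Node: only the fields this sketch needs. */
typedef struct term {
    struct term *next;
    int conval;
    int repeat;
} TERM;

/* Copy the constants into dest[0..nelem-1], starting at the base of the
 * array. Returns the number of elements initialized, or -1 if the tree
 * supplies more constants than the array can hold. */
int init_from_ict(const TERM *ict, int *dest, int nelem)
{
    int pos = 0;
    assert(dest != NULL && nelem >= 0);
    for (; ict != NULL; ict = ict->next) {
        for (int r = 0; r < ict->repeat; r++) {
            if (pos >= nelem)
                return -1;             /* too many constants */
            dest[pos++] = ict->conval;
        }
    }
    return pos;                        /* may be < nelem: that is allowed */
}
```

With the tree for a(30), the function fills zero-based positions 0 to 4 with 2, 5 to 9 with 3, and 10 to 19 with 4, leaving the last ten elements untouched.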
STRUCTURE Data Initializations¶
The following STRUCTURE statement will be used as an example.
structure /s0/
    integer a(40) /10*5, 10*10, 10*15/
    real b(10)
    integer c/6/
end structure
structure /s1/
    integer a, b(30)/30*42/
    structure /s2/ c
        integer a /42/
    end structure
    real d, e(30)
    record /s0/ f
end structure
record /s1/ r1, r2(30)
Structure initializations differ from the other forms of data initializations. The occurrence of data initialization within a structure statement does not cause dinit records to be generated. The dinit records are generated when an instance of the structure is declared via the RECORD statement. This implies that the Initializer Constant Tree for a structure must not be deallocated during end of statement processing, as it is with the other forms of initialization statements. It must be kept around until end of program module processing. The data type entry for a structure contains a pointer to the structure’s Initializer Constant Tree.
Structure initializations use the same form of Initializer Constant Tree as the Type Declaration Initialization statements do. The structure tag entry and the structure in the symbol table will have a pointer to the structure’s data type entry. From this the structure’s Initializer Constant Tree can be located.
Here are the Initializer Constant Trees for our example.
ICT s0
---> next ---> next ---> next ---> next
     Term      Term      Term      Term
     10*5      10*10     10*15     6
     a         0         0         c
ICT s1
---> next ---> next ---------> next
     Term      Subc (ICT s2)   Subc (ICT s0)
     30*42     ---             ---
     b         c               f
ICT s2
---> next
     Term
     42
     a
Notice that the Initializer Constant Tree for structure s1 contains references to the Initializer Constant Trees for other structures. Also, the sptr field in an entry contains a symbol table pointer to a member of a structure. This is used to get the offset from the beginning of a structure for a particular member. For example, to obtain the offset from the beginning of record r1 for the member r1.c.a, you would first calculate the offset from the beginning of r1 to its member c, then add to that the offset from the beginning of c to its member a.
Implied DO-loops and partial array initialization are not allowed.
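The nested offset calculation for r1.c.a can be sketched in C using offsetof. The C structs below are stand-ins for the Fortran structures /s1/ and /s2/ (member f is omitted), so the actual byte offsets are those of the C compiler, not of the Fortran layout; path_offset is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* C stand-ins for the Fortran structures /s2/ and /s1/ in the example
 * (member f of type /s0/ is omitted for brevity). */
struct s2 { int a; };
struct s1 {
    int a;
    int b[30];
    struct s2 c;
    float d;
    float e[30];
};

/* Sum the per-level offsets along a member path such as r1.c.a. */
size_t path_offset(const size_t *level_offsets, size_t depth)
{
    assert(level_offsets != NULL || depth == 0);
    size_t total = 0;
    for (size_t i = 0; i < depth; i++)
        total += level_offsets[i];
    return total;
}
```

For r1.c.a the path holds two entries: offsetof(struct s1, c) for the step from r1 to c, and offsetof(struct s2, a) for the step from c to a.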
Expressions¶
When a reduction is made for an arithmetic, character, or logical operation, such as
<addition> ::= <arith expr> <addop> <term> ,
the semantic action code typically performs the following steps:
The data types of the operands are checked and AST’s generated to convert the data type of one or both of the operands if necessary. Hollerith and non-decimal constants are handled by data type assumption rather than data type conversion. Both of the operands would be converted for the case where one operand is doubleprecision and the other operand is complex. In this case both operands are converted to doublecomplex.
Each operand is checked to see if one requires a scalar to vector promotion.
If both operands are constant, the operation is constant folded. Constant folding is also done in the Expander, but is required here for array bounds, initializers, and switch labels.
Otherwise the routine mkexpr is called for each operand, to ensure that AST’s have been generated.
The AST for the operation is added.
The stack entry corresponding to the left hand side of the reduction is set up with the AST pointer to the added AST, and information on the data type of the expression.
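The fold-or-generate decision in these steps can be sketched for integer addition. Everything here is a reduced model: ENTRY stands in for the semantic stack entry, mkexpr_sketch stands in for mkexpr, AST pointers are just counters, and the type conversion and promotion steps are left out.

```c
#include <assert.h>

enum { S_CONST, S_EXPR };                 /* stack identifiers from the text */
enum { DT_INT };                          /* placeholder value */

/* Reduced stack entry for an integer-only sketch. */
typedef struct {
    int id;        /* S_CONST or S_EXPR */
    int dtype;
    int cval;      /* valid when id == S_CONST */
    int ast;       /* valid when id == S_EXPR */
} ENTRY;

static int next_ast = 1;                  /* hypothetical AST allocator state */

/* Hypothetical stand-in for mkexpr: make sure the operand has an AST. */
static void mkexpr_sketch(ENTRY *e)
{
    if (e->id == S_CONST) {
        e->ast = next_ast++;              /* pretend a constant AST was made */
        e->id = S_EXPR;
    }
}

/* Reduce "lhs + rhs" into *result, folding when both sides are constant. */
void reduce_add(ENTRY *lhs, ENTRY *rhs, ENTRY *result)
{
    assert(lhs->dtype == DT_INT && rhs->dtype == DT_INT);
    result->dtype = DT_INT;
    if (lhs->id == S_CONST && rhs->id == S_CONST) {
        result->id = S_CONST;
        result->cval = lhs->cval + rhs->cval;
        result->ast = 0;
        return;
    }
    mkexpr_sketch(lhs);
    mkexpr_sketch(rhs);
    result->id = S_EXPR;
    result->cval = 0;
    result->ast = next_ast++;             /* the AST for the add operation */
}
```

Folding first means a constant operand only gets an AST when the other operand turns out not to be constant.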
Logical Expressions¶
A logical expression is one of the form e1 .or. e2, e1 .and. e2, or the negation operator .not. e1.
Array Constructors¶
After an array constructor list is created during parsing, the list is analyzed by the function mk_constructor(). The result of this function is a semantic stack entry of type S_IDENT or S_ACONST. In either case, an array temporary is created which represents the value of the constructor. In the former case, code is generated to define the values of the temporary; in the latter case, enough information is saved so that data initialization records are created when (if) the constructor is actually referenced.
The first step in mk_constructor() is to compute the size of the constructor. A side-effect of this step is to determine if the temporary can be data initialized. Data initialization will occur only if one of the following criteria is met:
implied do’s are not present and all of the items are constants, or
the context requires an array constant, implied do’s with constant bounds are present, the expressions under the control of the implied do’s will yield constants, and all other items are constants.
Intrinsic Processing¶
Intrinsic functions are supplied by the processor and have a special meaning. Generic names simplify the referencing of specific intrinsic functions. They allow the function argument to be of any type and provide a mapping to the specific intrinsic function based on the data type of the argument.
Generic names cannot be used to pass an intrinsic function as an actual argument. Specific intrinsic names can be passed as an argument only if they have been confirmed as intrinsics (see below). If not confirmed, then they are treated as variables.
An intrinsic name is predefined by the compiler. The name can be confirmed as an intrinsic function or can have its intrinsic property taken away. If the compiler encounters one of the cases that confirms the intended use of a symbol as an intrinsic function, then that symbol is frozen and can only be used to reference that intrinsic function for the remainder of the program unit. If the compiler encounters one of the cases that removes the symbol’s intrinsic property, then that symbol is redefined and must be used in a manner according to the user’s overriding definition and cannot be used as an intrinsic function for the remainder of the program unit.
CASE 1: CONTEXTS HAVING NO EFFECT ON INTRINSIC NAMES
A specific intrinsic name occurring in a type declaration alone does not affect the intrinsic name. If the name is later confirmed as an intrinsic name, then the type declaration has no effect. If the name is later used in a context that removes the intrinsic property of the name, then the name takes on the data type specified in the type declaration encountered earlier.
CASE 2: CONTEXTS CAUSING INTRINSIC CONFIRMATION
Use of a name in the INTRINSIC statement.
Use of a name that agrees in context and number and type of arguments with the predefined intrinsic confirms that name as an intrinsic.
CASE 3: CONTEXTS CAUSING INTRINSIC PROPERTY REMOVAL
In general, use of an intrinsic function name in a non-executable statement other than the type specification statements removes the intrinsic property of that name.
An intrinsic name declared as an array via a type specification statement removes the intrinsic property of that name.
The use of a symbol on the left side of an assignment statement removes a symbol’s intrinsic property if its intrinsic property is not already confirmed.
The use of a symbol without the correct number and type of arguments in an expression removes the symbol’s intrinsic property if its intrinsic property is not already confirmed.
EXAMPLES
subroutine a | subroutine b | subroutine c
real exp | integer exp | integer exp
x = exp(x) | x = exp(x) | x = exp
end | end | end
In subroutines a and b, the type declaration statement has no effect on the intrinsic property of symbol exp. In subroutine c, the symbol exp will be redefined as an integer variable.
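The three cases can be summarized as a small state machine. The state and event names below are invented for this sketch, and it models only the state transitions: diagnostics for conflicting uses after a name is frozen or redefined are not modelled.

```c
#include <assert.h>

typedef enum { INTR_UNDECIDED, INTR_CONFIRMED, INTR_REMOVED } IntrState;

typedef enum {
    USE_TYPE_DECL,        /* e.g. REAL EXP: case 1, no effect */
    USE_INTRINSIC_STMT,   /* INTRINSIC EXP: case 2 */
    USE_MATCHING_CALL,    /* x = exp(x): case 2 */
    USE_ARRAY_DECL,       /* REAL EXP(10): case 3 */
    USE_ASSIGN_TARGET,    /* exp = ...: case 3 */
    USE_MISMATCHED_REF    /* x = exp: case 3 */
} IntrUse;

/* Return the state of an intrinsic name after one more use of it. */
IntrState intr_next(IntrState state, IntrUse use)
{
    assert(use <= USE_MISMATCHED_REF);
    if (state != INTR_UNDECIDED)
        return state;                 /* frozen or already redefined */
    switch (use) {
    case USE_TYPE_DECL:
        return INTR_UNDECIDED;
    case USE_INTRINSIC_STMT:
    case USE_MATCHING_CALL:
        return INTR_CONFIRMED;
    default:
        return INTR_REMOVED;
    }
}
```

Applied to the examples, subroutines a and b see a type declaration followed by a matching call, while subroutine c sees a type declaration followed by a reference without arguments.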