Semantic Analyzer

Overview

The Semantic Analyzer compiler module performs three major functions:

  • generates the first internal representation of the executable statements of the user’s subprogram (AST’s).

  • enters symbols and their attributes into the symbol table and related global data structures.

  • performs semantic error checking and issues appropriate diagnostic messages.

Data Structures

Global Data Structures

Symbol Table — described in section 11.

AST’s — internal representation of executable statements written to a temporary file. Refer to appendix IV for a list of the AST opcodes and descriptions of their meanings.

astb.df — external temporary file containing information generated from the processing of initializations. See section 14.

Data Initialization File — external temporary file containing information generated by the Semantic Analyzer and other phases to effect initializations of compiler-created variables. See section 14.

Reference File — external temporary file containing information on symbol usage for the Cross Reference Listing. See section 13.

Semantic Stack

The Semantic Stack is operated in parallel with the Parse Stack; the same variable is used to point to the top of each (see section 4).

Each stack entry consists of 5 words (of type INT). The contents of each stack entry depend on the corresponding grammar symbol. For example, for the symbol <ident>, the stack contains a symbol table pointer to the identifier; for the symbol <arg list>, the stack contains pointers to the beginning and end of the list. Each stack entry also contains an AST field.

For terminal symbols of the grammar, the first word of the corresponding semantic stack entry is set to the token value returned by the Scanner (see section 3). For instance, for <integer> it is the 32-bit integer value of the constant. The exceptions to this rule are constants requiring more than 32 bits of storage, such as a complex constant. In these cases the semantic stack entry contains a symbol table pointer to the constant.

For a non-terminal symbol, the sections of semantic code associated with productions which generate the symbol are responsible for ensuring that the proper values are put into the stack entry.

Macros defined in semant.h are used to access the words of the semantic stack. These are of the form SST_<name>P and SST_<name>G. For example,

SST_SYMG(s);
SST_SYMP(s, sptr);
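As a rough illustration of the macro convention, the sketch below defines a getter (G) and a putter (P) for a few fields. The SST layout, the placement of the stack type outside the five words, the enum values, and the helper set_ident are all assumptions made for this example; the real definitions live in semant.h and may differ.

```c
typedef int INT;                 /* assumed to be a 32-bit word */

enum { S_EXPR = 1, S_IDENT };    /* illustrative values only */

/* Hypothetical entry layout: a stack type plus word1..word5. */
typedef struct sst {
    int id;
    INT word[5];
} SST;

/* G macros fetch a field, P macros store one. The word positions follow
 * the S_IDENT description (word1 = symbol, word5 = AST). */
#define SST_IDG(p)      ((p)->id)
#define SST_IDP(p, v)   ((p)->id = (v))
#define SST_SYMG(p)     ((p)->word[0])
#define SST_SYMP(p, v)  ((p)->word[0] = (v))
#define SST_ASTG(p)     ((p)->word[4])
#define SST_ASTP(p, v)  ((p)->word[4] = (v))

/* Hypothetical helper: fill in an entry for an identifier. */
static void set_ident(SST *s, int sptr, int ast)
{
    SST_IDP(s, S_IDENT);
    SST_SYMP(s, sptr);
    SST_ASTP(s, ast);
}
```

Keeping all access behind macros means the word assignments can change without touching the semantic actions that use them.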

The most important type of stack entry is the one used for <expression> and a number of other associated grammar symbols such as <primary>, <postfix exp>, <add exp>, etc. This entry itself has several formats, depending on the value of the stack identifier. The following macros are used to reference the various fields described below:

FIELD             MACRO
----------------  ----------------------
stack type        SST_IDP, SST_IDG
AST pointer       SST_ASTP, SST_ASTG
Constant value    SST_CVALP, SST_CVALG
Symbol pointer    SST_SYMP, SST_SYMG

S_EXPR

AST’s have been written for the associated expression.

word2

data type of the expression (pointer into dtype area, see section 11.4.1).

word5

pointer to AST.

S_LVALUE

The expression is one which can be used as an lvalue, and AST’s have been written for it.

word2

dtype.

word3

symbol of lvalue.

word4

shape.

word5

pointer to AST.

S_LOGEXPR

Logical expression. An expression of the form e1 .and. e2 or e1 .or. e2, or the logical negation (.not. operator) of one of these. See discussion of logical expression processing below.

word1

pointer to LAND or LOR AST.

word2

dtype = DT_INT.

word5

pointer to AST.

S_CONST

the associated expression is a constant.

word1

32-bit constant value if data type of constant is an integer type, otherwise a symbol table pointer to an entry for the constant.

word2

data type of the constant. Note that although the Scanner only returns constants of type DT_INT or DT_FLOAT, many other data types are possible here because of the constant folding of type cast operations.

word5

pointer to AST.

S_STAR

created when * is seen as a specifier for a dimension.

S_VAL

created when the %VAL built-in is seen; word5 is the pointer to the AST of the operand.

S_IDENT

identifier

word1

a symbol table pointer to an entry for the identifier.

word2

data type of the identifier.

word5

pointer to A_ID AST.

S_LABEL

label

word1

a symbol table pointer to an entry for the label.

word5

pointer to the A_LABEL AST.

S_STFUNC

statement function definition

word1

a symbol table pointer to an entry for the statement function which is being defined.

word2

pointer to the list of the statement function’s formal arguments.

S_ACONST

array constant

word1

a symbol table pointer to an entry for the array temporary representing the constant.

word2

its data type record.

word3

the pointer to the array constructor list representing the values of the array.

Semantic Stack Lists

Often a list of entities must be tied to a semantic stack entry. This happens for identifier lists to program entry points, array subscript lists, subprogram call argument lists, and character substring lists. The semantic analyzer uses item lists to retain semantic stack information until enough information is gathered about the statement being parsed. For example,

foo(a, b, 1, 17, c)

could be an array reference or a function call. Therefore, the semantic analyzer cannot generate ASTs while processing each of the arguments. The only alternative is to save any required information about the arguments in an item list until it is determined what kind of reference is being made.

The following structure defines an item list.

typedef struct xyyz {
    struct xyyz *next;
    int          ast;
    union {
        int         sptr;
        struct sst *stkp;
        INT         conval;
    } t;
} ITEM;

The first field of an item entry is used to point to the next item in the list. A value of LIST_END is used to mark the end of the list. The second field is the AST pointer of the argument. There are three types of item entries: symbol pointer, semantic stack pointer, and constant value. They are each handled in a similar manner. The semantic stack item list entry is discussed here because it is the most complex.
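The following sketch shows one way an item list could be built and walked. The value chosen for LIST_END and the helpers append_conval and count_items are assumptions for illustration; only the ITEM layout comes from the text.

```c
#include <stdlib.h>

typedef int INT;
struct sst;                        /* left opaque in this sketch */

typedef struct xyyz {
    struct xyyz *next;
    int          ast;
    union {
        int         sptr;
        struct sst *stkp;
        INT         conval;
    } t;
} ITEM;

#define LIST_END ((ITEM *)0)       /* assumed value; the text only names it */

/* Hypothetical helper: append a constant-valued item, return the head. */
static ITEM *append_conval(ITEM *head, int ast, INT conval)
{
    ITEM *item = malloc(sizeof *item);
    ITEM *p;

    if (item == NULL)
        abort();
    item->next = LIST_END;
    item->ast = ast;
    item->t.conval = conval;
    if (head == LIST_END)
        return item;
    for (p = head; p->next != LIST_END; p = p->next)
        ;                          /* find the current tail */
    p->next = item;
    return head;
}

/* Hypothetical helper: count the items up to LIST_END. */
static int count_items(const ITEM *head)
{
    int n = 0;
    for (; head != LIST_END; head = head->next)
        n++;
    return n;
}
```

For the reference foo(a, b, 1, 17, c), a list like this would hold five entries until the analyzer decides whether foo is an array or a function.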

An argument list that could represent an array reference, character substring, or function call is represented by an item list of semantic stack pointers. An argument can be:

<expression>

a simple expression which denotes a function argument or a single subscript,

<ident> = <expression>

an identifier keyword which is allowed in certain intrinsic or subprogram calls,

<expression>:<expression>:<expression>

or a subscript triplet which denotes a vector slice.

An argument that is a simple expression is represented by a semantic stack entry of the normal type S_IDENT or S_EXPR.

An argument that is a keyword entry is represented by the corresponding semantic stack id of S_KEYWORD. The SST_E3 field of the semantic stack entry points to another semantic stack entry containing the expression.

An argument that is a subscript triplet is represented by a semantic stack id of S_TRIPLE. The SST_E1, SST_E2, and SST_E3 fields of that semantic stack entry point to other semantic stack entries which represent the initial element, the ending element, and the stride for the triplet notation.

Structure Stack

The structure stack is used to keep track of the current nesting of STRUCTURE, UNION, and MAP statements. The variables sem.stsk_base and sem.stsk_depth are used to find the current stack top. The “Structure Processing” section describes how the structure stack is used by the semantic analyzer.

The form of the structure stack is:

typedef struct {
    char   type;
    int    sptr;
    int    dtype;
    int    last;
    CONST *ict_beg;
    CONST *ict_end;
} STSK;

type

the type of statement that the stack entry represents (‘s’, ‘u’, or ‘m’ for STRUCTURE, UNION, or MAP, respectively).

sptr

symbol table pointer to the first symbol in a linked list of symbols representing the field name list of a STRUCTURE statement. For example, for STRUCTURE /A/ B,C,D(10), sptr would point to B, which is linked to C and D. For UNION and MAP, this field is the symbol table pointer to the compiler-created ST_MEMBER symbol of type TY_UNION or TY_STRUCT, respectively.

dtype

data type pointer to an entry of type TY_STRUCT (TY_UNION) for the current structure.

last

symbol table pointer to the last member which belongs to the structure with respect to the scope. All members which are at the same scoping level are linked together via the VARIANT field in reverse order.

ict_beg

pointer to the beginning of the initializer constant tree for the current structure.

ict_end

pointer to the end of the initializer constant tree for the current structure.

Initializer Constant Tree

An Initializer Constant Tree is built (in dynamic storage) as the initializer for a static or external variable is processed (for automatic variables, AST’s are generated as for an ordinary expression, then an assignment statement is simulated).

The tree consists of nodes linked by (absolute) pointers, and its structure parallels the tree defined by the nesting of braces ({}) in the C language form of the initializer and by STRUCTURE’s in the Fortran language form of the initializer.

Normally, when processing a data initialization statement, the initializer constant tree is allocated, built, passed to dinit.c to generate dinit records, and finally deallocated. This all occurs during the processing of a single Fortran statement. This works for DATA statements and type declaration statements but will not work for STRUCTURE statements. STRUCTURE statements should not cause dinit records to be written. The dinit records should only be written for an instance of a structure declared with a RECORD statement. This implies that the initializer constant tree cannot be deallocated during end of statement processing. In fact, structure initializer constant trees must be allocated from memory that is not deallocated until end of program module processing.

The format of nodes is defined by the following C structure declaration:

struct const {
    struct const *next;
    struct const *subc;
    INT           conval;
    INT           repeat;
    int           sptr;
    int           dtype;
};

There are two types of nodes (distinguished by the value of subc):

Set Node (subc != 0): represents a set of constants in a structure group.

next

pointer to next element of set (if any) which is contained within the parent set. NULL if this set is at the top level, or is the last element.

subc

pointer to the first element in the initializer constant tree of the subordinate structure.

conval

not used.

repeat

not used.

sptr

pointer to symbol table entry of variable, array, or structure member to initialize. If zero, initialization continues in the area currently being initialized.

dtype

data type record of the structure.

Terminal Node (subc = 0): represents the occurrence of a constant (or constant expression) in the initializer.

next

same as for Set Nodes.

subc

NULL.

conval

32-bit constant value for integer types, else symbol table pointer to entry for constant.

repeat

the number of times to repeat the constant in conval.

sptr

same as for Set Nodes.

dtype

data type of the constant.
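To make the two node kinds concrete, here is a minimal sketch that walks a tree and counts the scalar constants it supplies. It uses the same fields as the node structure above, but the tag is renamed to the hypothetical ict_node because const is a reserved word in C. The function ict_count is not part of the compiler, and the sketch assumes an unrepeated constant is stored with repeat equal to 1.

```c
#include <stddef.h>

typedef int INT;

/* Hypothetical name; fields mirror the node structure in the text. */
struct ict_node {
    struct ict_node *next;
    struct ict_node *subc;
    INT              conval;
    INT              repeat;
    int              sptr;
    int              dtype;
};

/* Count the scalar constants in a tree. Set nodes (subc != NULL)
 * contribute whatever their subtree holds; terminal nodes contribute
 * `repeat` copies of conval. */
static long ict_count(const struct ict_node *n)
{
    long total = 0;

    for (; n != NULL; n = n->next) {
        if (n->subc != NULL)
            total += ict_count(n->subc);
        else
            total += n->repeat;
    }
    return total;
}
```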

Initializer Variable List

An Initializer Variable List goes hand in hand with the Initializer Constant Tree. It contains the list of variables to be initialized by the constants in the Initializer Constant Tree. It is built in dynamic storage as the initializer for a static or external variable is processed. The Initializer Variable List is only used during DATA statement processing. Therefore, its memory space can be released during end of statement processing.

The format of the Initializer Variable List is:

struct dinit_var {
    short id;    /* {Dostart, Doend, Varref} */
    union {
        struct {
            short indvar;
            short lowbd, upbd, step;
        } dostart;
        struct {
            struct dinit_var *dostart;
        } doend;
        struct {
            int id;
            int ptr;
            int dtype;
            int shape;
        } varref;
    } u;
    struct dinit_var *next;
};

The next field links together more than one variable list element. A variable list element can be one of three types: a simple variable reference, an implied-do start, or an implied-do end.

Simple Variable: This is a variable, array, or array element reference. Information from the semantic stack is copied to this entry.

id

This holds the value of the SST_ID field in the semantic stack.

ptr

This can either be a symbol table pointer or an AST pointer. It will be an AST pointer for an array element reference (i.e. it points to a SUBSCR AST).

dtype

This contains the data type of the variable from the semantic stack.

shape

This contains the shape of vector references from the semantic stack.

Do-start: This marks the beginning of an implied DO-loop.

indvar, lowbd, upbd, step

These contain AST pointers to the index variable, the lower bound, the upper bound, and the step increment for the DO-loop.

Do-end: This marks the end of an implied DO-loop. It simply points back to the associated Do-start entry.

Loop Stack

The Loop Stack is used to keep track of the current nesting of do, while, and forall loops, and where and block if statements. It consists of fixed size records and is allocated a contiguous area of dynamic storage. The variable sem.loop_depth is used to find the current stack top.

Each record contains a field which is the beginning line number of the control statement. The remaining contents of a record depend on the type of loop it represents:

Do Loop:

do_label

symbol table pointer to the label of the last statement in the loop; this field may be zero.

doinfo

pointer to the DOINFO record for the do/dowhile loop (see below).

name

construct name. This is just an index into the symbol names area; this field is zero if the construct is unnamed.

exit_label

pointer to the symbol table entry for the label of any EXIT statement which appeared in the DO body; 0 if an EXIT statement did not appear in the body.

cycle_label

pointer to the symbol table entry for the label of any CYCLE statement which appeared in the DO body; 0 if a CYCLE statement did not appear in the body.

Do While Loop:

do_label

symbol table pointer to the label of the last statement in the loop; this field may be zero.

doinfo

pointer to the DOINFO record for the do/dowhile loop.

name

construct name.

exit_label

pointer to the symbol table entry for the label of any EXIT statement which appeared in the DO body; if an EXIT statement did not appear in the body, a label is created. When the terminating statement of the dowhile statement is processed, the do while loop is transformed into:

  top_label:
    if (.not. <dowhile expr>) goto exit_label;
    <body of dowhile>
    goto top_label;
  exit_label:
    <statement after dowhile>

cycle_label

pointer to the symbol table entry for the label of any CYCLE statement which appeared in the DO body; 0 if a CYCLE statement did not appear in the body.

top_label

pointer to the symbol table entry for the compiler-created label which represents the top of the loop.

Forall Loop:

do_label

0

doinfo

pointer to the DOINFO record for the do/dowhile loop (see below).

name

0

Block IF:

do_label

0

doinfo

0

name

construct name

Where:

do_label

0

doinfo

0

name

construct name

For each DO and DOWHILE loop, a DOINFO record is created to record additional information for the construct:

DOINFO:

index_var

pointer to the symbol table entry for the DO index variable (DO loop only).

init_expr

ast of the initial expression (DO loop only).

step_expr

ast of the increment expression (DO loop only).

limit_expr

ast of the limit expression (DO loop only).

count

ast of the expression which computes the loop count (DO loop only).

Array Constructor List

When an array constructor is parsed, a list of array constructor items is created. An item, during parsing, is either an expression or an implied do construct. Each item is built in dynamic storage which is released during end of statement processing.

The format of an array constructor item is:

typedef struct _acl {
    int              id;
    struct _acl     *next;
    union {
        struct sst  *stkp;
        int          ast;
        struct _acl *aclp;
    } t;
    union {
        DOINFO      *doinfo;
        INT          count;
    } u;
} ACL;

The next field links together the items. The id of an array constructor item while parsing is either AC_EXPR or AC_IDO.

AC_EXPR: This represents an item which is an expression. Information from the semantic stack is copied to this entry.

stkp

This holds the value of the semantic stack for the expression.

AC_IDO: This marks the beginning of an implied DO-loop.

aclp

This is a pointer to an array constructor list under the control of the implied do.

doinfo

This is a pointer to the DOINFO record created for the implied do.

Other types of array constructor items are created when the list is actually processed (after the parsing is complete). The other fields in the structure are used during this processing.

Processing

Overview

The Semantic Analyzer code can be divided into three parts:

  1. Initialization - The semant_init routine, which is called from the compiler init routine to initialize the Semantic Analyzer data items and allocate space for certain Semantic Analyzer data structures.

  2. semantic actions - The body of the Semantic Analyzer logically consists of a large switch statement with one case for each production of the grammar (case labels are created by the parse table generator utility - see section 4). Because of the number of productions, the Semantic Analyzer is divided into 4 separate files: semant deals with declarations; semant2 deals with expressions and simple statements; semant3 deals with allocate statements, conditional statements, branching and call/function statements; semantio deals with I/O statements. Each semantic action is responsible for performing the processing associated with the particular production.

  3. utility routines - routines called by the semantic actions to do such things as change expression types, perform constant folding, etc. These routines are found in the module semutil.c. The module dinit.c contains the routines to implement data initialization statements and is discussed in the chapter.

The remainder of this section discusses a number of the important semantic processing issues.

Declaration Processing

The base data type of a symbol is kept globally and modified by length specifiers and KIND specifiers. For example, INTEGER*2 will cause the global data type to be DT_SINT. When the symbols in the declaration list are analyzed, the global data type is used by default. A symbol can also have its own length specifier, in which case its data type is chosen accordingly. For example, INTEGER*2 ZIGGY*1 will result in the symbol ZIGGY having the data type DT_BINT. Length and KIND specifiers are only allowed in data type declaration statements. The statements DIMENSION and COMMON are not allowed to use length specifiers.
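The override rule can be sketched as a small lookup. Only the DT_ names come from the text; the enum values, the function names, and the mapping of a length of 4 to DT_INT are assumptions for this example.

```c
/* Hypothetical encodings; only the DT_ names come from the text. */
enum dtype { DT_NONE, DT_BINT, DT_SINT, DT_INT };

/* Map an INTEGER length specifier to a data type. The *1 and *2 cases
 * follow the examples in the text; *4 -> DT_INT is an assumption. */
static enum dtype integer_dtype(int length)
{
    switch (length) {
    case 1:  return DT_BINT;
    case 2:  return DT_SINT;
    case 4:  return DT_INT;
    default: return DT_NONE;    /* unsupported length */
    }
}

/* A symbol's own length specifier (sym_length > 0) overrides the
 * statement-wide one; otherwise the global data type is used. */
static enum dtype symbol_dtype(enum dtype global_dtype, int sym_length)
{
    return sym_length > 0 ? integer_dtype(sym_length) : global_dtype;
}
```

With these definitions, INTEGER*2 ZIGGY*1 corresponds to symbol_dtype(integer_dtype(2), 1), which yields DT_BINT, while a symbol without its own specifier keeps DT_SINT.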

Data type declaration statements, DIMENSION statements, and COMMON statements can modify a symbol to be an array. For example, the symbol ZIGGY could be defined as above and then specified as an array in a DIMENSION or COMMON statement later. The general rule is that the symbol’s type can change from a simple variable to an array once. If an attempt is made to change to an array again an error is flagged.

A declaration problem arises because of intrinsic functions. When the declaration statement REAL SIN is encountered the compiler does not know whether the application programmer meant to simply reaffirm the declaration of the intrinsic function or that they are declaring a local variable called SIN. It becomes simple if the symbol SIN were declared an array as in REAL SIN(10), because this tells the compiler unequivocally that SIN is to be used as a local array. Likewise, if the compiler encounters the symbol SIN later in a DIMENSION, COMMON, or EQUIVALENCE statement it knows that the symbol SIN is to lose its intrinsic properties. If none of these events happens then the compiler must wait until the first reference to the symbol to determine its intended use.

If the following statement is the first reference encountered, then the symbol SIN is assumed to be a local variable and the intrinsic SIN will not be available to the application programmer.

sin = sqrt(100)

If the following statement were encountered as the first reference then the symbol SIN is reaffirmed as the intrinsic SIN and the intrinsic is frozen, that is, it can only be used as an intrinsic in the current program section.

x = sin(x)

Interface Blocks

A subprogram for which an interface is explicitly specified is entered in the same scope (level 0) as the main subprogram and its variables. Any variables declared, such as the dummy arguments, are entered into the symbol table at a scope 1 greater than the current scope. When the END statement for the subprogram is seen, the scope is popped (the symbols in scope are removed from the symbol table’s hash table), thus hiding the symbols declared in the scope from the rest of the subprogram. Since subprograms declared in an interface block do not inherit any information from their host, entities, such as the implicit rules and named (parameter) constants are hidden from the semantic analysis of the interface block. Hiding the implicit rules is performed by pushing the current rules onto a stack and references to outer-scoped parameters are hidden by restricting the scope levels which can be accessed if in an interface block.

Structure Processing

The top of the structure stack represents the STRUCTURE statement currently being parsed and the preceding stack entries represent uncompleted STRUCTURE statements that were interrupted by nested STRUCTURE statements.

The goal of processing a STRUCTURE/END STRUCTURE block of statements is to create a data type for the structure being declared. The data type entry has a pointer to a linked list of structure members and a pointer to the initializer constant tree for the members that are data initialized (see “Initializer Constant Tree” section). The following example will show how the nesting of structures is handled by the structure stack.

structure /a/
 integer b /2/
 structure /c/ d
     integer e /3/
 end structure
end structure

The data type for structure tag a will point to the list of members b and d, and will point to an initializer constant tree containing the constant 2. At the point where structure c is encountered, the state for structure a must be saved on the structure stack so that it is not confused with structure c. Structure c has its own member list and initializer constant tree built and initialized in a new data type of its own. When the END STRUCTURE for structure c is encountered, the structure stack is popped and the processing for structure a is continued.
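The save and restore behaviour described above can be sketched with a simple push/pop pair. This is an illustration only: the fixed-size array, the STSK_MAX limit, and the helper names are assumptions, and the ict_beg and ict_end fields are left out.

```c
#include <stddef.h>

#define STSK_MAX 16          /* arbitrary capacity for this sketch */

typedef struct {
    char type;               /* 's', 'u', or 'm' */
    int  sptr;
    int  dtype;
    int  last;
} STSK;                      /* ict_beg and ict_end omitted for brevity */

static struct {
    STSK stsk_base[STSK_MAX];
    int  stsk_depth;         /* number of entries in use */
} sem;

/* Enter a new STRUCTURE, UNION, or MAP; returns NULL on overflow. */
static STSK *stsk_push(char type, int dtype)
{
    STSK *e;

    if (sem.stsk_depth >= STSK_MAX)
        return NULL;
    e = &sem.stsk_base[sem.stsk_depth++];
    e->type = type;
    e->sptr = 0;
    e->dtype = dtype;
    e->last = 0;
    return e;
}

/* Entry for the construct currently being parsed, or NULL if none. */
static STSK *stsk_top(void)
{
    return sem.stsk_depth > 0 ? &sem.stsk_base[sem.stsk_depth - 1] : NULL;
}

/* Leave the current construct (its END statement was seen). */
static void stsk_pop(void)
{
    if (sem.stsk_depth > 0)
        sem.stsk_depth--;
}
```

In the example, structure a is pushed first and structure c on top of it; popping at the inner END STRUCTURE makes a the current structure again.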

Initializer Processing

The Initializer Constant Tree and Initializer Variable List are built as the Semantic Analyzer processes data initialization statements. These data structures are passed to the routine dinit in the module dinit.c which matches each variable with a constant and writes the necessary records to the Data Initialization File. The Semantic Analyzer is responsible for handling the various special features of Fortran initializers, such as:

  • repeated constant fields

  • whole array initializations

  • partial array initializations

  • three forms of initialization syntax; DATA statements, type declaration statements, and STRUCTURE statements

The three forms of data initialization statements are discussed in the following sections.

DATA Statements

The following DATA statement will be used as an example.

INTEGER a, b(5), c(2,4,3)
DATA a, b, c(1,2,3), c(2,3,2) / 3*1, 5*4/

DATA statement processing requires passing both an Initializer Constant Tree and an Initializer Variable List to the dinit routine. Processing requires a walk of both the Initializer Constant Tree and the Initializer Variable List, assigning constants to variables.

The DATA statement is the only initializing statement that allows implied DO-loops and array element initialization. Record references are not allowed in DATA statements.

In the example, the variable a and the first 2 elements of the array b will be initialized to the constant 1. The remaining three elements of the array b and two elements of the array c will be initialized to the constant 4. Here are the Initializer Constant Tree and the Initializer Variable List that dinit would process.

IVL ---> a ---> b ---> c ---> c

ICT ---> next ---> next
         Term      Term
         3*1       5*4
         0         0

The entries for c will contain AST pointers for array element referencing. These AST pointers will be traced back by dinit to determine the array element being initialized.

The DATA statement is the only form of data initialization statement that allows implied DO-loops. The following example is used to describe the processing of implied DO-loops.

      DATA ((a(i,j), i=1, 10), j = 21, 30) / 100*42.0 /

IVL ---> dostartj ---> dostarti ---> a ---> doendi ---> doendj
         j             i
         21            1
         30            10
         1             1

ICT ---> next
         Term
         100*42.0
         0

The values in the IVL structures above are actually AST links. When the array element reference to a is encountered, the AST links are scanned and the array indexes i and j are evaluated. The offset from the beginning of array a is computed so that the dinit record can be generated.

When doendi is encountered, the corresponding dostarti is located. Then the index variable i is incremented by the step amount and a test is made to see if the upper bound has been exceeded. If the inner implied DO-loop controlled by index variable i does not exceed its upper bound then we reset the IVL pointer back to process a again, this time with an updated index variable i. Later when the index variable i does exceed its upper bound we do not back up, we simply move forward to the doendj entry. Processing for the outer implied DO-loop is exactly the same as for the inner DO-loop.
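The back-up-or-move-forward walk can be sketched as follows. This is a simplified model, not the dinit code: bounds are plain integers rather than AST pointers, the struct ivl layout and the function ivl_count_refs are hypothetical, and the sketch assumes a positive step and that every loop runs at least once.

```c
#include <stddef.h>

enum ivl_id { Dostart, Doend, Varref };

/* Simplified list element; the text stores AST pointers instead. */
struct ivl {
    enum ivl_id id;
    int value;                  /* Dostart: current index value */
    int lowbd, upbd, step;      /* Dostart only */
    struct ivl *dostart;        /* Doend: matching Dostart */
    struct ivl *next;
};

/* Walk the list and return how many times a Varref is visited, i.e.
 * how many array elements the implied DO-loops name. */
static long ivl_count_refs(struct ivl *p)
{
    long refs = 0;

    while (p != NULL) {
        switch (p->id) {
        case Dostart:
            p->value = p->lowbd;            /* (re)initialize the index */
            p = p->next;
            break;
        case Varref:
            refs++;                         /* one element initialized */
            p = p->next;
            break;
        case Doend:
            p->dostart->value += p->dostart->step;
            if (p->dostart->value <= p->dostart->upbd)
                p = p->dostart->next;       /* back up: run the body again */
            else
                p = p->next;                /* loop finished: move forward */
            break;
        }
    }
    return refs;
}
```

For the list dostartj, dostarti, a, doendi, doendj with bounds 21..30 and 1..10, the walk visits a 100 times, which matches the 100 constants supplied by the example.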

Type Declaration Initializations

Data initializations can occur within a type declaration statement. The following example will be used.

INTEGER a(30) /5*2, 5*3, 10*4/, b/1/

In this example the array a will be initialized with 20 constants; five locations with 2, five locations with 3, and ten locations with 4. The variable b will be initialized with the constant 1. Notice that the type declaration statement does two things. It declares the number of elements in an array and it initializes that array starting from its base address.

Record references are not allowed with this form of data initialization. Implied DO-loop specifiers are not allowed with this type of declaration.

In the example above, the dinit routine will be called twice. An Initializer Variable List is not used. The variable information is embedded within the Initializer Constant Tree. Here is the Initializer Constant Tree for each call:

ICT ---> next ---> next ---> next
         Term      Term      Term
         5*2       5*3       10*4
         a         0         0

ICT ---> next
         Term
         1
         b

The first ICT will cause the dinit routine to begin with the base address of a. It will write dinit records to initialize the first five elements of a with the constant 2; then, since the next ICT entry does not have a new symbol table pointer, the five constant 3’s will go into a(6) through a(10). The remaining ten constant 4’s are assigned to a(11) through a(20) similarly. Notice that it is not a problem that there are not enough constants to initialize the entire array. However, too many constants would be a problem.
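As a rough model of this walk for a single array, the sketch below expands a chain of terminal nodes into consecutive elements. The reduced struct term and the function expand are hypothetical; the sketch covers only the continuation case, where every node after the first has a zero symbol pointer, and it writes into a C array instead of emitting dinit records.

```c
#include <stddef.h>

/* Terminal node reduced to what this walk needs (hypothetical). */
struct term {
    struct term *next;
    int          conval;
    int          repeat;
};

/* Store the constants into `array` (capacity `size`) starting at its
 * base. Returns the number of elements written, or -1 if there are
 * more constants than elements. */
static int expand(const struct term *n, int *array, int size)
{
    int pos = 0;

    for (; n != NULL; n = n->next) {
        int i;
        for (i = 0; i < n->repeat; i++) {
            if (pos >= size)
                return -1;          /* too many constants */
            array[pos++] = n->conval;
        }
    }
    return pos;
}
```

For the chain 5*2, 5*3, 10*4 and a 30-element array, 20 elements are written and the last 10 are left untouched, which mirrors the remark that too few constants is acceptable while too many is an error.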

STRUCTURE Data Initializations

The following STRUCTURE statement will be used as an example.

structure /s0/
    integer a(40) /10*5, 10*10, 10*15/
    real b(10)
    integer c/6/
end structure
structure /s1/
    integer a, b(30)/30*42/
    structure /s2/ c
        integer a /42/
    end structure
    real d, e(30)
    record /s0/ f
end structure
record /s1/ r1, r2(30)

Structure initializations differ from the other forms of data initializations. The occurrence of data initialization within a structure statement does not cause dinit records to be generated. The dinit records are generated when an instance of the structure is declared via the RECORD statement. This implies that the Initializer Constant Tree for a structure must not be deallocated during end of statement processing, as it is with the other forms of initialization statements. It must be kept around until end of program module processing. The data type entry for a structure contains a pointer to the structure’s Initializer Constant Tree.

Structure initializations use the same form of Initializer Constant Tree as do the Type Declaration Initialization statements. The structure tag entry and the structure in the symbol table will have a pointer to the structure’s data type entry. From this the structure’s Initializer Constant Tree can be located.

Here are the Initializer Constant Trees for our example.

ICT s0
---> next ---> next ---> next ---> next
     Term      Term      Term      Term
     10*5      10*10     10*15     6
     a         0         0         c

ICT s1
---> next ---> next ---------> next
     Term      Subc (ICT s2)   Subc (ICT s0)
     30*42     ---             ---
     b         c               f

ICT s2
---> next
     Term
     42
     a

Notice that the Initializer Constant Tree for structure s1 contains references to the Initializer Constant Trees for other structures. Also, the sptr field in an entry contains a symbol table pointer to a member of a structure. This is used to get the offset from the beginning of a structure for a particular member. For example, to obtain the offset from the beginning of record r1 for the member r1.c.a you would first calculate the offset from the beginning of r1 to its member c, then add to that the offset from the beginning of c to its member a.
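The offset arithmetic for r1.c.a can be illustrated with C analogues of the example structures. This is only an analogy: it relies on a C compiler's layout rules, which are not necessarily those of the Fortran compiler, and the function offset_of_c_a is hypothetical.

```c
#include <stddef.h>

/* C analogues of the Fortran structures s0, s2, and s1. */
struct s0 { int a[40]; float b[10]; int c; };
struct s2 { int a; };
struct s1 {
    int       a, b[30];
    struct s2 c;
    float     d, e[30];
    struct s0 f;
};

/* Offset of r1.c.a: the offset of c within s1 plus the offset of a
 * within s2, accumulated one nesting level at a time. */
static size_t offset_of_c_a(void)
{
    return offsetof(struct s1, c) + offsetof(struct s2, a);
}
```

The same accumulation extends to deeper nesting: each level adds the offset of one member within its enclosing structure.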

Implied DO-loops and partial array initialization are not allowed.

Expressions

When a reduction is made for an arithmetic, character, or logical operation, such as

<addition> ::= <arith expr> <addop> <term> ,

the semantic action code typically performs the following steps:

  1. The data types of the operands are checked and AST’s generated to convert the data type of one or both of the operands if necessary. Hollerith and non-decimal constants are handled by data type assumption rather than data type conversion. Both of the operands would be converted for the case where one operand is doubleprecision and the other operand is complex. In this case both operands are converted to doublecomplex.

  2. Each operand is checked to see if one requires a scalar to vector promotion.

  3. If both operands are constant, the operation is constant folded. Constant folding is also done in the Expander, but is required here for array bounds, initializers, and switch labels.

  4. Otherwise the routine mkexpr is called for each operand, to ensure that AST’s have been generated.

  5. The AST for the operation is added.

  6. The stack entry corresponding to the left hand side of the reduction is set up with the AST pointer to the added AST, and information on the data type of the expression.
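Steps 3 through 6 can be sketched for an integer addition as follows. Everything here is a simplification for illustration: the OPND type, the AST counter, and add_operands are hypothetical, the mkexpr shown is a stand-in and not the compiler's routine, and the type conversion and promotion of steps 1 and 2 are left out.

```c
enum { S_CONST, S_EXPR };    /* stack ids from the text; values arbitrary */

typedef struct {
    int id;                  /* S_CONST or S_EXPR */
    int value;               /* constant value when id == S_CONST */
    int ast;                 /* AST index when id == S_EXPR */
} OPND;

static int next_ast = 1;     /* stand-in for a real AST allocator */

/* Stand-in for mkexpr: make sure the operand has an AST. */
static void mkexpr(OPND *op)
{
    if (op->id == S_CONST) {
        op->ast = next_ast++;
        op->id = S_EXPR;
    }
}

/* Reduce <lhs> + <rhs>: fold if both are constants, else emit an AST. */
static OPND add_operands(OPND lhs, OPND rhs)
{
    OPND result;

    if (lhs.id == S_CONST && rhs.id == S_CONST) {
        result.id = S_CONST;             /* step 3: constant folding */
        result.value = lhs.value + rhs.value;
        result.ast = 0;
        return result;
    }
    mkexpr(&lhs);                        /* step 4 */
    mkexpr(&rhs);
    result.id = S_EXPR;                  /* steps 5 and 6 */
    result.value = 0;
    result.ast = next_ast++;
    return result;
}
```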

Logical Expressions

A logical expression has one of the forms e1 .or. e2, e1 .and. e2, or .not. e1.

Array Constructors

After an array constructor list is created during parsing, the list is analyzed by the function mk_constructor(). The result of this function is a semantic stack entry of type S_IDENT or S_ACONST. In either case, an array temporary is created which represents the value of the constructor. In the former case, code is generated to define the values of the temporary; in the latter case, enough information is saved so that data initialization records are created when (if) the constructor is actually referenced.

The first step in mk_constructor() is to compute the size of the constructor. A side-effect of this step is to determine if the temporary can be data initialized. Data initialization will occur only if the following criteria are met:

  1. implied do’s are not present and all of the items are constants, or

  2. the context requires an array constant, implied do’s with constant bounds are present, the expressions under the control of the implied do’s will yield constants, and all other items are constants.
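The two criteria above can be read as a single predicate over the items of the constructor. The sketch below is an assumption-laden illustration in Python, not the logic of mk_constructor(): the classes Const, Expr, and ImpliedDo, their flags, and the function can_data_initialize are hypothetical, and nested implied do’s are not modeled.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Const:
    """A constant item of the constructor."""
    value: object


@dataclass
class Expr:
    """A non-constant item of the constructor."""
    text: str


@dataclass
class ImpliedDo:
    """An implied do, reduced to the two properties the criteria refer to."""
    constant_bounds: bool    # the bounds of the implied do are constants
    yields_constants: bool   # the controlled expressions will yield constants


Item = Union[Const, Expr, ImpliedDo]


def can_data_initialize(items: List[Item], needs_array_constant: bool) -> bool:
    """Return True if the constructor temporary can be data initialized."""
    implied_dos = [i for i in items if isinstance(i, ImpliedDo)]
    others_constant = all(isinstance(i, Const)
                          for i in items if not isinstance(i, ImpliedDo))
    if not implied_dos:
        return others_constant            # criterion 1
    return (needs_array_constant          # criterion 2
            and others_constant
            and all(d.constant_bounds and d.yields_constants for d in implied_dos))
```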

Intrinsic Processing

Intrinsic functions are supplied by the processor and have a special meaning. Generic names simplify the referencing of specific intrinsic functions. They allow the function argument to be of any type and provide a mapping to the specific intrinsic function based on the data type of the argument.
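The mapping from a generic name to a specific intrinsic can be pictured as a table lookup keyed on the generic name and the data type of the argument. The sketch below is illustrative only: the SPECIFIC table is a small subset of the standard Fortran specific names, and the table layout and the resolve_generic function are assumptions, not the compiler's data structures.

```python
from typing import Optional

# Small illustrative subset: (generic name, argument type) -> specific name.
SPECIFIC = {
    ("sin", "real"): "sin",
    ("sin", "double precision"): "dsin",
    ("sin", "complex"): "csin",
    ("abs", "integer"): "iabs",
    ("abs", "real"): "abs",
    ("abs", "double precision"): "dabs",
    ("abs", "complex"): "cabs",
}


def resolve_generic(name: str, arg_type: str) -> Optional[str]:
    """Return the specific intrinsic for a generic reference, or None if there is none."""
    return SPECIFIC.get((name, arg_type))
```

For instance, a reference to sin with a double precision argument resolves to dsin, while a reference to sin with an integer argument has no entry in this table.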

Generic names cannot be used to pass an intrinsic function as an actual argument. Specific intrinsic names can be passed as an argument only if they have been confirmed as intrinsics (see below). If not confirmed, then they are treated as variables.

An intrinsic name is predefined by the compiler. The name can be confirmed as an intrinsic function or can have its intrinsic property taken away. If the compiler encounters one of the cases that confirms the intended use of a symbol as an intrinsic function, then that symbol is frozen and can only be used to reference that intrinsic function for the remainder of the program unit. If the compiler encounters one of the cases that removes the symbol’s intrinsic property, then that symbol is redefined and must be used in a manner according to the user’s overriding definition and cannot be used as an intrinsic function for the remainder of the program unit.

CASE 1: CONTEXTS HAVING NO EFFECT ON INTRINSIC NAMES

  1. A specific intrinsic name occurring in a type declaration alone does not affect the intrinsic name. If the name is later confirmed as an intrinsic name, then the type declaration has no effect. If the name is later used in a context that removes the intrinsic property of the name, then the name takes on the data type specified in the type declaration encountered earlier.

CASE 2: CONTEXTS CAUSING INTRINSIC CONFIRMATION

  1. Use of a name in the INTRINSIC statement.

  2. Use of a name that agrees in context and number and type of arguments with the predefined intrinsic confirms that name as an intrinsic.

CASE 3: CONTEXTS CAUSING INTRINSIC PROPERTY REMOVAL

  1. In general, use of an intrinsic function name in a non-executable statement other than the type specification statements removes the intrinsic property of that name.

  2. An intrinsic name declared as an array via a type specification statement removes the intrinsic property of that name.

  3. The use of a symbol on the left side of an assignment statement removes the symbol’s intrinsic property if its intrinsic property is not already confirmed.

  4. The use of a symbol without the correct number and type of arguments in an expression removes the symbol’s intrinsic property if its intrinsic property is not already confirmed.

EXAMPLES

     subroutine a     subroutine b     subroutine c
     real exp         integer exp      integer exp
     x = exp(x)       x = exp(x)       x = exp
     end              end              end

In subroutines a and b, the type declaration statement has no effect on the intrinsic property of symbol exp, because the reference exp(x) confirms exp as an intrinsic. In subroutine c, the symbol exp will be redefined as an integer variable.
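The confirm and remove rules can be viewed as a small state machine attached to each intrinsic name. The following Python sketch is a hedged illustration, not the compiler's implementation: the State enumeration, the IntrinsicName class, and its method names are hypothetical, argument types are ignored (only the argument count is checked), and the diagnostics issued for misuse of a frozen name are not modeled.

```python
from enum import Enum
from typing import Optional


class State(Enum):
    PREDEFINED = "predefined"  # neither confirmed nor removed yet
    CONFIRMED = "confirmed"    # frozen as an intrinsic for the program unit
    REMOVED = "removed"        # redefined by the user


class IntrinsicName:
    """Hypothetical per-symbol record tracking the intrinsic property."""

    def __init__(self, name: str, nargs: int) -> None:
        self.name = name
        self.nargs = nargs                        # argument count of the intrinsic
        self.state = State.PREDEFINED
        self.declared_type: Optional[str] = None  # remembered from a type declaration
        self.variable_type: Optional[str] = None  # set once the property is removed

    def type_declaration(self, dtype: str) -> None:
        self.declared_type = dtype                # Case 1: no effect on the state

    def _confirm(self) -> None:
        if self.state is State.PREDEFINED:
            self.state = State.CONFIRMED

    def _remove(self) -> None:
        if self.state is State.PREDEFINED:        # only if not already confirmed
            self.state = State.REMOVED
            self.variable_type = self.declared_type

    def intrinsic_statement(self) -> None:
        self._confirm()                           # Case 2, item 1

    def reference(self, nargs: int) -> None:
        if nargs == self.nargs:
            self._confirm()                       # Case 2, item 2
        else:
            self._remove()                        # Case 3, item 4

    def assign_to(self) -> None:
        self._remove()                            # Case 3, item 3


def run(declared: str, nargs_used: int) -> IntrinsicName:
    """Replay one example subroutine: a declaration of exp, then one reference."""
    sym = IntrinsicName("exp", nargs=1)
    sym.type_declaration(declared)
    sym.reference(nargs_used)
    return sym


sub_a = run("real", 1)     # real exp;    x = exp(x)
sub_b = run("integer", 1)  # integer exp; x = exp(x)
sub_c = run("integer", 0)  # integer exp; x = exp
```

Under these assumptions, exp ends up confirmed in subroutines a and b and removed in subroutine c, where it takes the integer type from the earlier declaration.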