Introduction¶
Scope¶
This document is the design specification for the Fortran compiler, Flang. The following subjects are included:
First level module decomposition of the compiler and flow of control between the modules.
Processing and data structure overview for each first level module.
Detailed specification of those data structures which are used by more than one first level module. This includes definition of the various internal code representations used within Flang.
Detailed (external) specification of compiler routines required by more than one module, for example, symbol table access and error reporting routines.
Descriptions of important algorithms used by Flang.
Full descriptions of the utility programs used to build the tables and other data used by Flang.
Certain coding conventions and practices, including the source file directory structure and file naming conventions.
Topics not covered include:
Flang functional specifications or requirements.
Design of compiler run-time support or I/O libraries.
Detailed specification of data structures or routines local to a single module, except for those critical for the module design.
Design¶
Flang is divided into a number of passes which communicate via various internal forms of the user’s program. Figure 1-1 gives an overview of the compiler control flow and some of the important data items. Refer to the Document Overview section below for more detail.
Conventions¶
A number of the usual notational conventions are used informally throughout this document. These include the use of “[ ]” for optional items and “|” to indicate a choice of items.
A basic knowledge of the C programming language is assumed throughout this document. In particular:
pseudo-code descriptions of algorithms use a C-like syntax and control structures.
function definitions are given using C syntax.
reference is made to the various C data types. Note particularly the special data type INT, defined in the utility package specification [8], which is used for integer data that requires 32 bits.
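As an illustration of the INT convention, the following is a minimal sketch of how a 32-bit integer type could be declared and checked in C. The authoritative definition is the one in the utility package specification [8]; the `int32_t` alias and the helper `fits_in_INT` are assumptions made for this example only.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Assumption: INT is an alias for a 32-bit signed integer. The real
 * definition is given in the utility package specification [8]. */
typedef int32_t INT;

/* Compile-time check (C11) that INT is exactly 32 bits wide. */
static_assert(sizeof(INT) * CHAR_BIT == 32, "INT must be 32 bits");

/* Hypothetical helper: returns nonzero if a wider value fits in INT. */
static int fits_in_INT(long long value)
{
    return value >= INT32_MIN && value <= INT32_MAX;
}
```

Using a fixed-width alias rather than plain `int` keeps table and file layouts the same on hosts where `int` is not 32 bits.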
Definition¶
Document Overview¶
A variety of compiler modules, data structures, files, and utility programs are described in this document. This section summarizes the contents of this document and tells where to find information on the various subjects.
Compiler Inputs¶
Command Line — The command line specifies the input and output files and the switches to be in effect for a given compilation. Its format and the allowed switches are described in the Flang functional specification [1]. Processing of the command line is performed by the initialization module INIT, described in section 2.
Source Input File — The input to Flang is a file written in the version of the Fortran language described in [1]. This file is read and tokenized by the Scanner module, described in section 3.
Compiler Outputs¶
Object Module File — The object file format is described in the Object File Standard, [4]. This file is written by the Assembler module described in section 9.
Source Listing — The listing of the user’s source file is written by the Scanner (section 3) as it reads the source input file.
Object Code Listing — The Object Code Listing is an assembler type listing of the object code generated by Flang. It is written by the Assembler, described in section 9.
Cross Reference Listing — an alphabetical listing of user defined symbols and labels giving the various attributes (e.g. data type) of the symbols and the source line numbers of references to each symbol. It is written by the Cross Reference Generator, described in section 10.
Compiler¶
Program Controller and Initialization — initializes the compiler and controls invocation of the remaining compiler modules.
Scanner — reads the source input file and breaks it up into tokens which are passed to the Parser. The C preprocessor functions, such as macro expansion, conditional compilation, and include file processing, are performed by the Scanner. It writes the source listing, if requested, and enters symbols into the Symbol Table. See section 3.
Parser — performs syntactic analysis of the user’s program using tokens returned by the Scanner. Calls the appropriate semantic analysis routines as it parses. Reports syntax errors as they are detected and makes limited attempts to recover from them. See section 4.
Semantic Analyzer — performs semantic error checking, translates the user program into the first internal representation, ILMs, and enters information into the Symbol Table. See section 5.
Expander — translates the ILMs produced by the Semantic Analyzer into the second internal representation, ILI.
Cross Reference Generator — writes the Cross Reference Listing using information in the Symbol Table and the Reference File. See section 10.
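The relationship among the modules listed above can be pictured with a small C sketch. It is not the actual Program Controller: every function name here is hypothetical, and each phase only records that it ran. The point it shows is that the controller invokes phases in order and stops at the first failure, while the Parser phase stands for the whole front end (Scanner, Parser, and Semantic Analyzer working together).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* All names below are hypothetical; they only mirror the module list. */
typedef int (*phase_fn)(void);       /* returns 0 on success */

static char trace[64];               /* records the order in which phases ran */

static int note(const char *name)
{
    assert(strlen(trace) + strlen(name) + 2 <= sizeof trace);
    strcat(trace, name);
    strcat(trace, " ");
    return 0;
}

static int init_phase(void)   { return note("init"); }
/* The parser drives the front end: it pulls tokens from the scanner
 * and calls semantic routines, which produce ILMs. */
static int parse_phase(void)  { return note("parse"); }
static int expand_phase(void) { return note("expand"); }   /* ILMs to ILI */
static int xref_phase(void)   { return note("xref"); }

static const phase_fn phases[] = {
    init_phase, parse_phase, expand_phase, xref_phase
};

/* Run each phase in order and stop at the first failure.
 * Returns the number of phases that completed. */
static size_t run_compiler(void)
{
    size_t done = 0;
    trace[0] = '\0';
    for (size_t i = 0; i < sizeof phases / sizeof phases[0]; i++) {
        if (phases[i]() != 0)
            break;
        done++;
    }
    return done;
}
```

A table of function pointers is only one way to express the ordering; a controller written as straight-line calls would serve equally well.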
Subprogram Libraries¶
SCUTIL — is the utility package used for I/O, numerical conversions, operating system interfacing, etc. This library is described in reference [5].
Error Reporting Routines — is a set of routines used to construct and issue compiler error diagnostics, described in section 17.
Symbol Table Access Routines — is a set of routines used throughout the compiler to enter symbols into the symbol table. These are described with the Symbol Table, section 11.
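To give a feel for what a symbol table access routine does, here is a minimal sketch of a look-up-or-enter function over a hashed table. The record layout, the sizes, and the name `sym_enter` are all assumptions for illustration; the actual routines and data structures are the ones specified in section 11.

```c
#include <assert.h>
#include <string.h>

#define MAX_SYMS  256   /* illustrative capacity */
#define HASH_SIZE 64
#define NAME_LEN  32

/* Hypothetical symbol record: a name plus a hash-chain link. */
struct sym {
    char name[NAME_LEN];
    int  next;           /* next symbol in the same bucket, 0 = end */
};

static struct sym symtab[MAX_SYMS];  /* index 0 is reserved as "no symbol" */
static int hashtab[HASH_SIZE];       /* bucket heads, 0 = empty */
static int sym_count = 1;            /* next free symbol index */

static unsigned hash_name(const char *name)
{
    unsigned h = 0;
    while (*name)
        h = h * 31u + (unsigned char)*name++;
    return h % HASH_SIZE;
}

/* Return the index of name, entering it first if it is absent.
 * Returns 0 if the table is full or the name is too long. */
static int sym_enter(const char *name)
{
    assert(name != NULL);
    unsigned h = hash_name(name);
    for (int s = hashtab[h]; s != 0; s = symtab[s].next)
        if (strcmp(symtab[s].name, name) == 0)
            return s;                 /* already present */
    if (sym_count >= MAX_SYMS || strlen(name) >= NAME_LEN)
        return 0;
    int s = sym_count++;
    strcpy(symtab[s].name, name);
    symtab[s].next = hashtab[h];      /* push onto the bucket chain */
    hashtab[h] = s;
    return s;
}
```

Returning an integer index rather than a pointer lets other structures, such as intermediate code or temporary files, refer to symbols by small stable numbers.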
Dynamic Data Bases/Temporary Files¶
Symbol Table — contains information on user defined and compiler created symbols and constants, used throughout the compiler. See section 11.
Intermediate Language Macros (ILMs) — is the internal representation of the executable statements of the user’s program, created by the Semantic Analyzer and translated into ILI by the Expander. These are written to a temporary binary file as they are created. Section 12 describes the format, etc., of the ILMs and Appendix IV contains the definition of the ILM opcodes, their meanings, and attributes.
Intermediate Language Instructions (ILI) — is the internal representation of the user’s program, created by the Expander. The general format of the ILI, and associated data structures are described in section 13. Appendix V contains the definitions of the ILI opcodes, their attributes, and their mappings into micro operations.
Data Initialization File — is an external binary file containing data initialization information written during semantic analysis of declarations which contain initializations. This is used by the Assembler to create part of the Object Module File. The format of this file is described in section 16.
Reference File — is a temporary external binary file written by the Semantic Analyzer, containing symbol/line number usage information used by the Cross Reference Generator to create the Cross Reference Listing. Its format is described in section 10.
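The temporary binary files described above share a simple pattern: one phase appends fixed-size records and a later phase reads them back. The sketch below illustrates that pattern for symbol/line number usage records. The record layout and the function names are hypothetical; the real Reference File format is the one given in section 10.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical record layout, for illustration only. */
struct xref_rec {
    int32_t sym;     /* symbol table index */
    int32_t line;    /* source line number of the reference */
};

/* Append one reference record; returns 1 on success, 0 on failure. */
static int xref_write(FILE *fp, int32_t sym, int32_t line)
{
    struct xref_rec r = { sym, line };
    assert(fp != NULL);
    return fwrite(&r, sizeof r, 1, fp) == 1;
}

/* Rewind the file and count the references made to sym. */
static int xref_count(FILE *fp, int32_t sym)
{
    struct xref_rec r;
    int n = 0;
    assert(fp != NULL);
    rewind(fp);
    while (fread(&r, sizeof r, 1, fp) == 1)
        if (r.sym == sym)
            n++;
    return n;
}
```

A file opened with the standard `tmpfile()` function suits this use, since it is opened for update in binary mode and is removed automatically when closed.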
Static Data Bases and Utility Programs¶
Several phases of Flang are table driven. These tables have fixed values and are typically initialized by C source code created by utility programs using text files as input. In most cases the input file is also used to create one of the appendices to this document, ensuring the accuracy of the appendices. The important static data bases and the utilities which generate them are:
Parse Tables (PRSTAB) — are LR(1) parse tables created by the utility PRSTAB and used by the Parser. The format of these tables is described in reference [7]. Appendix I contains a listing and symbol cross reference of the grammar used to create the parse tables.
Error Messages (ERRMSG) — is the structure containing the text of error messages, created by the utility ERRMSG, and used by the Error Reporting routines to issue error messages. Appendix II lists the messages issued by Flang.
ILM Attributes and Templates (ILMTP) — are tables created by the utility ILMTP which define the attributes of the various ILM opcodes and their translation into ILI. See section 12 for the format of this information, and Appendix IV for the current attribute and template values.
ILI Attributes and Templates (ILITP) — are tables created by the utility ILITP which define the attributes of the various ILI and their translation into micro-operations. See section 13 for the format of this information and Appendix V for the current attribute and template values.
Micro Operations (MICROP) — are the primitive operations that make up an ILI template and that are combined to form SC instructions. The tables defining the micro operations are created by the utility MICROP and used by the Assembler. The format of this information, and the MICROP utility, are described in section 14. Appendix VI contains a listing of the current symbolic definition of the micro operations.
Symbol Table (SYMTAB) — <to be supplied>
Machine Characteristics (MACHAR) — <to be supplied>
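The table-building approach described in this section, where a utility turns a text file into C source that initializes a static table, can be sketched as follows. This is not the ERRMSG utility or any other actual Flang tool: the input format (one message per line), the emitted array name `errtxt`, and the function names are assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write s as a C string literal, escaping quotes and backslashes. */
static void emit_literal(FILE *out, const char *s)
{
    fputc('"', out);
    for (; *s != '\0'; s++) {
        if (*s == '"' || *s == '\\')
            fputc('\\', out);
        fputc(*s, out);
    }
    fputc('"', out);
}

/* Read one message per line from in and write a C array definition
 * to out. Returns the number of messages written. Lines longer than
 * the buffer would be split; this is a sketch, not a robust tool. */
static int emit_msg_table(FILE *in, FILE *out)
{
    char line[256];
    int count = 0;

    assert(in != NULL && out != NULL);
    fputs("static const char *errtxt[] = {\n", out);
    while (fgets(line, sizeof line, in) != NULL) {
        line[strcspn(line, "\r\n")] = '\0';   /* strip the line ending */
        if (line[0] == '\0')
            continue;                          /* skip blank lines */
        fputs("    ", out);
        emit_literal(out, line);
        fputs(",\n", out);
        count++;
    }
    fputs("    0  /* end of table */\n};\n", out);
    return count;
}
```

Because the same text file can also be formatted into an appendix, the generated table and the documentation cannot drift apart, which is the benefit the section describes.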
Coding Practices¶
Section 18 discusses a number of topics, including coding conventions, variable naming conventions, source file data base structure, and compiler build procedures.