Compiler Internals
The functions of the Turbo Pascal compiler are grouped into several categories (units) according to their role in the compiler. The grouping mainly serves to give a better overview of the individual parts of the compiler, but functions in one group also usually share common types and variables, so it makes sense to place them in separate units.
Basic Turbo Pascal Internals
Unit files in Turbo Pascal (tpu extension) are actually symbol tables that are compacted and saved as individual files. The System unit is implicitly used in every program or unit. It contains the bootstrap symbol table and compiler procedures. The definition order of these compiler procedures is important because the compiler calls them by id number. To compile the System unit you need the bootstrap symbol table (SYSTEM.TPS).
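To see why the definition order matters when procedures are called by id number, consider the minimal Python sketch below. The procedure names and their order are hypothetical placeholders, not the actual contents of the System unit.

```python
# Hypothetical names and order; the real list is defined by the System unit.
SYSTEM_PROCEDURES = ["InitRuntime", "CheckRange", "CopyString"]


def procedure_for_id(procedure_id):
    """Return the procedure a caller reaches when it asks for an id number."""
    return SYSTEM_PROCEDURES[procedure_id]


def id_for_procedure(name):
    """Return the id number, which is simply the position in the definition order."""
    return SYSTEM_PROCEDURES.index(name)
```

If two definitions were swapped, every caller that uses the old id number would silently reach the wrong procedure, which is why the order has to stay fixed.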
The bootstrap symbol table contains system types like Byte, Char, Boolean, port identifiers, memory identifiers, system functions and system procedures.
A Turbo Pascal library (extension tpl) is a simple binary concatenation of one or more units. It is loaded when the compiler starts and should contain at least the System unit. You can create a library with the console command copy:
copy /b system.tpu+unit1.tpu+unit2.tpu turbo.tpl
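The same binary concatenation can be sketched in Python for systems without the copy command. This is a minimal sketch: the function name build_library is made up for illustration, and it assumes that a library is nothing more than the unit files written one after another, as described above.

```python
from pathlib import Path


def build_library(unit_paths, library_path):
    """Concatenate compiled unit files byte for byte into one library file.

    Returns the size of the resulting library in bytes.
    """
    with open(library_path, "wb") as library:
        for unit_path in unit_paths:
            # No header or padding is added; the units are simply appended.
            library.write(Path(unit_path).read_bytes())
    return Path(library_path).stat().st_size


# Example call (the files must exist):
# build_library(["system.tpu", "unit1.tpu", "unit2.tpu"], "turbo.tpl")
```

The System unit is listed first here only to mirror the command above.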
Turbo Pascal uses a low-level intermediate code. Each record can contain a target instruction with reference data, an intermediate code instruction for subroutines, or a special meta instruction.
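One way to picture such records is a tagged structure with three kinds. The sketch below is only an illustration in Python: the names RecordKind, CodeRecord and target_size, and the field layout, are assumptions and do not reflect the compiler's actual record format.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class RecordKind(Enum):
    TARGET = auto()      # target machine instruction, optionally with reference data
    SUBROUTINE = auto()  # intermediate code instruction for a subroutine
    META = auto()        # special meta instruction


@dataclass
class CodeRecord:
    kind: RecordKind
    payload: bytes = b""             # instruction bytes for TARGET records
    reference: Optional[str] = None  # symbol whose address is filled in later


def target_size(records):
    """Count the bytes contributed by TARGET records only."""
    return sum(len(r.payload) for r in records if r.kind is RecordKind.TARGET)


records = [
    # Opcode A1 is MOV AX,[addr16]; the two address bytes are placeholders.
    CodeRecord(RecordKind.TARGET, b"\xA1\x00\x00", reference="Counter"),
    CodeRecord(RecordKind.SUBROUTINE, reference="Helper"),
    CodeRecord(RecordKind.META),
]
```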
Turbo Pascal relies heavily on the segment:offset architecture of the x86 family in real mode. In many cases this is a limiting factor because many data structures are limited to 64 KB. On the other hand, it is very convenient when dealing with addresses and offsets.
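The arithmetic behind segment:offset addressing is short enough to show directly. In real mode the physical address is the segment multiplied by 16 plus the offset, and because the offset is a 16-bit value, one segment can span at most 64 KB. The Python sketch below uses made-up helper names and assumes the 20-bit address wraparound of the original 8086.

```python
SEGMENT_SIZE = 0x10000  # a 16-bit offset addresses 65536 bytes (64 KB)


def physical_address(segment, offset):
    """Real-mode address: segment * 16 + offset, wrapped to 20 bits (8086 assumption)."""
    return ((segment << 4) + offset) & 0xFFFFF


def normalize(segment, offset):
    """Return an equivalent segment:offset pair whose offset is in the range 0..15."""
    linear = physical_address(segment, offset)
    return linear >> 4, linear & 0xF


# Different segment:offset pairs can name the same byte:
# physical_address(0x1234, 0x0010) == physical_address(0x1235, 0x0000) == 0x12350
```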
Turbo Pascal Internal Structure
This category contains the main program, common compiler variables, and everything else that does not belong in the other categories.
The Scanner unit contains functions that process source files and compiler directives and extract tokens, the basic elements of the language.
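As a rough illustration of what a scanner does, the Python sketch below splits a small piece of Pascal-like source into tokens, keeps compiler directives such as {$R+} as tokens of their own, and drops comments and whitespace. The token categories and the regular expression are simplifications chosen for this example; they are not how the Turbo Pascal scanner is implemented.

```python
import re

TOKEN_PATTERN = re.compile(
    r"""
    (?P<directive>\{\$[^}]*\})               # compiler directive such as {$R+}
  | (?P<comment>\{[^}]*\}|\(\*.*?\*\))       # { ... } or (* ... *) comment
  | (?P<number>\d+)                          # unsigned integer literal
  | (?P<ident>[A-Za-z_][A-Za-z0-9_]*)        # identifier or reserved word
  | (?P<string>'(?:[^']|'')*')               # quoted string, '' escapes a quote
  | (?P<symbol>:=|<=|>=|<>|\.\.|[-+*/=<>()\[\];:,.^@])
  | (?P<space>\s+)
    """,
    re.VERBOSE | re.DOTALL,
)


def tokenize(source):
    """Return a list of (kind, text) pairs, skipping whitespace and comments."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_PATTERN.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        pos = match.end()
        if match.lastgroup not in ("space", "comment"):
            tokens.append((match.lastgroup, match.group()))
    return tokens
```

For example, tokenize("{$R+} x := 10; { note }") yields a directive token, the identifier x, the symbol :=, the number 10 and the symbol ; while the trailing comment is discarded.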
Symbol tables are a core part of every compiler. Turbo Pascal uses linked lists and hashing to store and retrieve identifiers efficiently. Functions in this unit take care of data storage, identifier searching and various symbol table management tasks.
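The combination of hashing and linked lists can be sketched as a table of buckets, where each bucket holds a chain of symbols with the same hash value. The Python sketch below is a generic illustration: the class names, the additive hash function and the bucket count are assumptions, not the ones Turbo Pascal uses. Names are upper-cased because Pascal identifiers are case-insensitive.

```python
class Symbol:
    """One node of a bucket's linked list."""

    def __init__(self, name, info, next_symbol=None):
        self.name = name
        self.info = info
        self.next = next_symbol


class SymbolTable:
    def __init__(self, bucket_count=64):
        self.buckets = [None] * bucket_count

    def _index(self, name):
        # Simple additive hash, chosen only for illustration.
        return sum(name.upper().encode("ascii")) % len(self.buckets)

    def insert(self, name, info):
        """Prepend the symbol to its bucket's list, so newer entries are found first."""
        index = self._index(name)
        self.buckets[index] = Symbol(name.upper(), info, self.buckets[index])

    def lookup(self, name):
        """Walk the bucket's list and return the stored info, or None if absent."""
        key = name.upper()
        node = self.buckets[self._index(name)]
        while node is not None:
            if node.name == key:
                return node.info
            node = node.next
        return None
```

Hashing narrows a search to one short list, and the linked list handles identifiers that happen to share a hash value.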
The parser processes the main program and units, checks syntax, processes the stream of tokens and generates intermediate code. This is where the core compiler functions are located.
An expression in Turbo Pascal can be anything from a constant, a variable or just an identifier to a full calculation. Expressions are made up of operators and operands. Most Pascal operators are binary; they take two operands. The rest are unary and take only one operand. This unit contains over 100 functions to process every possible Turbo Pascal expression.
This unit is used by the Expressions unit and contains functions that process calculations consisting of one or two operands and an operation. It is the unit that actually generates the code for addition, subtraction, multiplication, division, shifts, etc.
This unit contains functions that process each Pascal statement: If, While, For, Repeat, Case, With, GoTo, Inline, Asm block, or system procedure.
The Assembler unit processes assembly instructions in the block between Asm and End and generates code for them.
This is another unit that is used by the Expressions unit. It processes system functions like Abs, UpCase, Sqr, Succ, Pred, etc.
This unit contains functions to process system procedures like Write, Writeln, Assign, Dispose, Delete, etc.
The type definitions unit defines data structures for basic types and contains a few functions to process type definitions.
This unit imports and processes object files and generates intermediate code for OMF records.
This unit processes intermediate code and generates executable code and reference records for the linker.
The linker joins code from all used units, determines the addresses of variables, functions and procedures, resolves references and generates the executable file.
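The linking steps described above (join the code, assign addresses, resolve references) can be sketched in a few lines of Python. This is a deliberately simplified model under stated assumptions: the block layout, the dictionary keys and the use of absolute 16-bit little-endian offsets are made up for illustration and ignore segments, relocation and the real executable format.

```python
def link(blocks):
    """Join code blocks and patch references to other blocks.

    Each block is a dict with "name", "code" (bytes) and "refs", a list of
    (offset_in_code, target_name) pairs marking 2-byte placeholders.
    Returns the joined image and the address assigned to each block.
    """
    # Pass 1: place the blocks one after another and record their addresses.
    addresses = {}
    position = 0
    for block in blocks:
        addresses[block["name"]] = position
        position += len(block["code"])

    # Pass 2: fill each placeholder with the address of its target.
    image = bytearray()
    for block in blocks:
        code = bytearray(block["code"])
        for offset, target in block["refs"]:
            code[offset:offset + 2] = addresses[target].to_bytes(2, "little")
        image += code
    return bytes(image), addresses


blocks = [
    # Opcode B8 is MOV AX,imm16; the two zero bytes are the placeholder.
    {"name": "Main", "code": b"\xB8\x00\x00", "refs": [(1, "Data")]},
    {"name": "Data", "code": b"\x2A\x00", "refs": []},
]
image, addresses = link(blocks)
```

Two passes are needed because a block may refer to another block that is placed later, so all addresses must be known before any reference is patched.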
Turbo Pascal contains many functions that read or write files, handle error messages and take care of compiler operation.