CP502: Descriptors.

10.3. Descriptors.

This information can be passed to print (and other external predefined routines) along with the values themselves. This enables print to format and print each value correctly.

Some of this information is also passed to the linker program as directions for storing variables, procedures and arrays. We shall look at these "descriptors", but first the linker.

The linker accepts as input the output of one of the computer's translators or compilers and produces an executable version of your program. The compiler output, known as an object module or relocatable binary module, contains the user's program plus additional information necessary for linking separately compiled modules. Most object modules contain relocatable code, so that the module's position in memory can be determined by the linker.

The user benefits by not having to worry about where one module ends and another starts. The user can also code the modules without worrying about where each one is put. When moving the relocatable code in memory to the place where it will execute, the linker adjusts all relocatable addresses in the code to absolute addresses. Linkage between modules is accomplished through symbolic addresses. By using a symbolic address, the user delays assigning an actual value until load time.

By linking these modules, the linker provides communication between independently compiled modules. Once all the symbolic addresses have been resolved, your program is ready to run.
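The two jobs described above (relocating module-relative addresses and resolving symbolic addresses) can be sketched as a toy two-pass linker. This is only an illustration under stated assumptions: the ObjectModule layout, the field names relocs, exports and imports, and the one-integer-per-word memory model are all hypothetical and do not correspond to any real object file format.

```python
from dataclasses import dataclass, field

@dataclass
class ObjectModule:
    """Hypothetical, much simplified object module (not a real format)."""
    name: str
    code: list                                    # one integer per memory word
    relocs: list = field(default_factory=list)    # offsets of words holding module-relative addresses
    exports: dict = field(default_factory=dict)   # symbol -> offset inside this module
    imports: dict = field(default_factory=dict)   # offset of word -> symbol it refers to

def link(modules, load_base=0):
    # Pass 1: place the modules one after another and record the
    # absolute address of every exported symbol.
    bases, symbols, addr = {}, {}, load_base
    for m in modules:
        bases[m.name] = addr
        for sym, off in m.exports.items():
            if sym in symbols:
                raise ValueError(f"duplicate symbol {sym}")
            symbols[sym] = addr + off
        addr += len(m.code)

    # Pass 2: copy the code, turn relocatable addresses into absolute
    # ones, and fill in the symbolic (imported) addresses.
    image = []
    for m in modules:
        words = list(m.code)
        for off in m.relocs:
            words[off] += bases[m.name]
        for off, sym in m.imports.items():
            if sym not in symbols:
                raise ValueError(f"unresolved symbol {sym}")
            words[off] = symbols[sym]
        image.extend(words)
    return image, symbols

# Two separately "compiled" modules; main refers to sqrt symbolically.
main = ObjectModule("main", code=[10, 0, 2, 99], relocs=[2],
                    exports={"start": 0}, imports={1: "sqrt"})
lib = ObjectModule("lib", code=[20, 21, 1], relocs=[2], exports={"sqrt": 1})

image, symbols = link([main, lib], load_base=100)
print(image)    # [10, 105, 102, 99, 20, 21, 105]
print(symbols)  # {'start': 100, 'sqrt': 105}
```

Neither module needs to know where the other ends up: main only names sqrt, and the absolute address 105 is filled in once the linker has fixed the layout.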

NOTE: The relocation done by the linker has nothing to do with relocation of your code by segmentation or paging. This is yet another level of address translation, but it is the same each time your program is executed.

TABLE 2 shows the information which is required in general to describe an object. This block of information is called a descriptor and is created in the declaration phase of syntax analysis. In the case of PL0, what information was created?

        TABLE = record
                    name of identifier
                    val if constant, or
                    address and level (level = scope of procedure or variable)
                end


       When is it created?

The only declarations in PL0 are of constants, variables and procedures, and these occur as the first things tested within a block. Therefore, by the time a statement is processed, all descriptors must already be defined.
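A minimal sketch of such a table, loosely following the PL0 record above. The names Kind, TableEntry, enter and position, and the explicit kind field, are assumptions made for illustration; only the name, val, address and level fields come from the text.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class Kind(Enum):
    CONSTANT = "constant"
    VARIABLE = "variable"
    PROCEDURE = "procedure"

@dataclass
class TableEntry:
    name: str
    kind: Kind
    val: Optional[int] = None     # used only for constants
    level: Optional[int] = None   # nesting level (scope) of variable or procedure
    adr: Optional[int] = None     # address of variable or procedure

table = []

def enter(name, kind, *, val=None, level=None, adr=None):
    """Called from the declaration part of a block, before any statement."""
    entry = TableEntry(name, kind, val, level, adr)
    table.append(entry)
    return entry

def position(name):
    """Search backwards so that the most recent declaration is found first."""
    for entry in reversed(table):
        if entry.name == name:
            return entry
    return None   # undeclared identifier

# Declarations are processed first ...
enter("max", Kind.CONSTANT, val=100)
enter("x", Kind.VARIABLE, level=0, adr=3)
enter("p", Kind.PROCEDURE, level=0, adr=0)
enter("x", Kind.VARIABLE, level=1, adr=3)   # local x inside p

# ... so every identifier met in a statement can be looked up.
print(position("x").level)   # 1
print(position("max").val)   # 100
print(position("q"))         # None
```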

In Pascal, this means all procedure descriptors must be defined before the procedures can be used.


Forward declarations of procedures must be used (if needed) and must give the number, order and type of the parameters, since these descriptors must also be filled in.

The "kind", "type" and "accessing information" entries are determined by the details of the declaration. The "runtime address" can be symbolic and is set by the linker program.

The descriptor of a procedure will contain a pointer to the descriptor for each of its parameters. Each parameter descriptor will also be accessible via the entry for the name of the parameter.

It is common for a single name to correspond to more than one descriptor. Therefore the "descriptor" field in the symbol table entry is normally a pointer to a list of descriptor records rather than the descriptor record itself.
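The arrangement described above can be sketched as follows: each name maps to a list of descriptors, and a procedure descriptor holds references to its parameter descriptors, which are also reachable through the parameter's own name. The class and field names (Descriptor, SymbolTable, params, leave_scope) are hypothetical, and nested scopes are used here as one assumed reason for a name having several descriptors.

```python
from dataclasses import dataclass, field

@dataclass
class Descriptor:
    name: str
    kind: str          # e.g. "variable", "procedure"
    type: str          # e.g. "integer", "real", "none"
    level: int         # nesting level at which the object was declared
    params: list = field(default_factory=list)   # used by procedures only

class SymbolTable:
    def __init__(self):
        # name -> list of descriptors, innermost declaration last
        self.entries = {}

    def declare(self, desc):
        self.entries.setdefault(desc.name, []).append(desc)
        return desc

    def lookup(self, name):
        descs = self.entries.get(name)
        return descs[-1] if descs else None

    def leave_scope(self, level):
        # Drop every descriptor declared at the scope being closed.
        for descs in self.entries.values():
            while descs and descs[-1].level == level:
                descs.pop()

symtab = SymbolTable()
symtab.declare(Descriptor("n", "variable", "integer", level=0))

# Procedure scale(n: real): the parameter descriptor is shared between
# the procedure's parameter list and the entry for the name n.
n_param = Descriptor("n", "variable", "real", level=1)
proc = symtab.declare(Descriptor("scale", "procedure", "none", level=0,
                                 params=[n_param]))
symtab.declare(n_param)

print(symtab.lookup("n").type)   # real (inside the procedure)
symtab.leave_scope(1)
print(symtab.lookup("n").type)   # integer (outer n is visible again)
print(proc.params[0].type)       # real (still needed to check calls)
```

Keeping the parameter descriptor on the procedure means that calls can still be checked for number, order and type of parameters after the procedure's own scope has been closed.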

The "kind" of object in Pascal is one of

 (variable, array, procedure, constant, function, label, set,
  packed-array, record, pointer, file, std-proc, std-function).

Some of these may be superfluous, since we also have the "type" field at our disposal to hold extra information. In fact, it is possible to separate "kind" into "scalar-kind" and "structured-kind". What goes where?

In simple languages such as FORTRAN-66 and BASIC, the "type" is one of several fixed types, e.g. INTEGER, REAL, LOGICAL. The "type" field may then be a simple enumerated type, i.e.

(int_type, real_type, complex_type, logical_type, double_type).

Since some languages allow users to create their own types, the "type" field has to be a pointer to a descriptor which describes the user-defined type. We could emulate the types complex and double precision with user-defined types, but we still have to process complex in some manner,

        i.e.  var x,y,z: complex /* some predefined type  */
                        :
                z := x*y     /*  not legal since not allowed type  */

Some languages allow new operators to be defined to act upon the newly defined data type, i.e. x*y is legal and does the correct thing.

Multiplication of complex numbers must then also be defined in some manner beforehand. SIMULA provides the CLASS concept, which is similar to Java's class and also embodies procedures and functions which can act on that data type.
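The idea of a "type" field that points to a type descriptor, together with operators that must be defined before x*y becomes legal, can be sketched as below. The TypeDescriptor class, the operators table and the result_type function are hypothetical names chosen for this illustration; a real compiler would organise this differently.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TypeDescriptor:
    name: str
    fields: tuple = ()   # (field name, field type) pairs; empty for built-in scalars

REAL = TypeDescriptor("real")
# A user-defined type emulating complex as a pair of reals.
COMPLEX = TypeDescriptor("complex", fields=(("re", REAL), ("im", REAL)))

# Operator table: (operator, left type name, right type name) -> result type.
operators = {("*", "real", "real"): REAL}

def result_type(op, left, right):
    """Return the result type of `left op right`, or None if it is not legal."""
    return operators.get((op, left.name, right.name))

# Before multiplication is defined for complex, x*y is rejected.
before = result_type("*", COMPLEX, COMPLEX)

# The language (or the user) defines * for the new type beforehand ...
operators[("*", "complex", "complex")] = COMPLEX

# ... after which x*y is legal and has type complex.
after = result_type("*", COMPLEX, COMPLEX)

print(before)       # None
print(after.name)   # complex
```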

An example of typical layout:


ptr_to_node = ^node;

            node = record
                         info: integer;
                         fptr: ptr_to_node;
                         bptr: ptr_to_node
                   end

Note that none of the information in Table 2 is relevant to syntax analysis; the syntax of a language has little to do with it.

Example:        {
                char a;
                a = 1;
                }

This is quite legitimate in terms of syntax, although in a strictly typed language the semantics is incorrect, since a is of type character, not integer.

Once syntax analysis has been done, the information in the descriptors is crucial for checking for typing errors and the like, and for code generation.


Example:  x = y

     call EXPRESSION(    ) to get information re x;
     similarly call EXPRESSION (  ) re y.

     if (x.type in [inttype, booltype, chartype, realtype])
         and (x.type = y.type) then /*  OK  */
     else /*  TYPE conflict of OPERANDS  */

Each variable has a "type" and a "kind" associated with it, since it must have been pre-declared, and hence a check for type conflicts can be made at the same time.
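The check in the example above can be written out as a small runnable sketch. The Operand class and check_assignment function are hypothetical; the type names are the ones used in the pseudocode, and the rule (same scalar type on both sides) is exactly the one shown there, with no implicit conversions assumed.

```python
from dataclasses import dataclass

SCALAR_TYPES = {"inttype", "booltype", "chartype", "realtype"}

@dataclass
class Operand:
    """What EXPRESSION is assumed to hand back about an operand."""
    name: str
    kind: str
    type: str

def check_assignment(x, y):
    """Return None if x = y is type-correct, otherwise an error message."""
    if x.type in SCALAR_TYPES and x.type == y.type:
        return None   # OK
    return (f"type conflict of operands: "
            f"{x.name} is {x.type}, {y.name} is {y.type}")

a = Operand("a", "variable", "chartype")
c = Operand("c", "variable", "chartype")
one = Operand("1", "constant", "inttype")

print(check_assignment(a, c))     # None
print(check_assignment(a, one))   # type conflict of operands: a is chartype, 1 is inttype
```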

Suppose b has not been declared. What type is b? What is assumed? Some compilers add an extra value to the "type" enumeration, (..., NOTYPE), to cater for this. Hence, if we had:

if b then a = ' ';

The compiler may produce two errors: (1) unidentified variable, and (2) incorrect type for operand (since it is NOTYPE). Other compilers, on finding an unidentified variable, set its type to whatever the context requires (in this case BOOLEAN), so only one error message is produced.
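The two strategies can be compared with a short sketch. Everything here is an assumption for illustration: the symbol table is reduced to a dictionary from name to type, and check_condition with its assume_expected flag is a hypothetical helper, not part of any particular compiler.

```python
NOTYPE = "notype"
BOOLTYPE = "booltype"

def check_condition(name, symtab, errors, assume_expected):
    """Check an identifier used as the condition of an if statement.

    symtab maps names to types; errors collects the messages produced.
    """
    type_ = symtab.get(name)
    if type_ is None:
        errors.append(f"unidentified variable {name}")
        # Strategy 1: record NOTYPE.  Strategy 2: assume the type that
        # the context requires, so no second error follows.
        type_ = BOOLTYPE if assume_expected else NOTYPE
        symtab[name] = type_   # enter it so later uses are not reported again
    if type_ != BOOLTYPE:
        errors.append(f"incorrect type for operand {name}")
    return type_

# if b then a = ' ';   with b undeclared
errors_notype = []
check_condition("b", {"a": "chartype"}, errors_notype, assume_expected=False)

errors_assumed = []
check_condition("b", {"a": "chartype"}, errors_assumed, assume_expected=True)

print(errors_notype)    # ['unidentified variable b', 'incorrect type for operand b']
print(errors_assumed)   # ['unidentified variable b']
```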