The syntax analyser has to determine if a given "sentence" in the source language
Language is described by a finite number of
rules = SYNTAX of language
but can generate an infinite number of sentences.
The syntax analyser tests the sentences against the rules.
A computer program must be unambiguous if it is to produce correct, reproducible results. Therefore our syntax rules must be strict and, like our MU puzzle, we must not try to break away from these rules.
Note that in the above ambiguous sentence the use of
punctuation can remove the ambiguity. In the same way
punctuation (, ; : BEGIN END) can be used to remove ambiguities
from computer programs.
Many compilers are smart enough to be
able to supply missing semicolons, just as we are able to supply
meaning to slightly ambiguous sentences.
We can do that by noting the context of the ambiguity.
<sentence>::=<subject><predicate>
< > Non terminal symbols
::= is defined by
> < followed by
| choice of symbols
<subject>::= <article><adjective><noun>
<predicate>::= <verb><object>
<object>::= <article><adjective><noun>
<article>::= "the"|"a" )
<adjective>::= "blue"|"hairy-nosed"|"frightened" } terminal
<noun>::= "wombat"|"fish"|"house" } symbols
<verb>::= "flew"|"hunted" )
The hairy-nosed wombat hunted a frightened fish
A hairy-nosed fish hunted the frightened wombat
(1) Non terminal symbols i.e. <_> are written as
identifiers without angle brackets but in CAPITALS.
(2) Terminal symbols are in lower case letters and
their values are in quotation marks if necessary.
(3) Braces { } are used to denote zero or more occurrences.
(4) Square brackets [] are used to denote 0 or 1 occurrence.
<statement>::=[<label>":"]<unlabelled statement>
or
<statement>::=<unlabelled statement>|<label>":"<unlabelled statement>
(1) A vocabulary T of terminal symbols
e.g. IF, THEN, ELSE, BEGIN, END, REPEAT
e.g. <expression> <factor> <term> <block>
e.g. <expression>::=
<simple expr><relational operator><simple expr>
| <simple expr>
e.g. <program>::=
Starting symbol in MU puzzle was MI. We can generate one
of set of possible states by adding U, MI -> MIU,
then by doubling MIU -> MIUIU
(1) (3)
MI --> MIU --> MIUIU etc.