# 1 The Syntax of ALS Prolog

- 1.1 Constants
- 1.2 Variables
- 1.3 Compound Terms
- 1.4 Curly Braces
- 1.5 Lists
- 1.6 Strings
- 1.7 Operators
- 1.8 Comments
- 1.9 Preprocessor Directives: Syntax

This chapter describes the syntax of ALS Prolog, which is for the most part the syntax of the ISO Prolog standard. Prolog syntax is quite simple and regular, which is a great strength.

## 1.1 Constants

The simplest Prolog data type is a constant, which comes in two flavors:

- atoms (sometimes called **symbols**)
- numbers

The notion of a *constant* corresponds roughly to the notion of a *name* in a natural
language. Names in natural languages refer to things (which covers a lot of
ground), and constants in Prolog are used to refer to things when the language is
interpreted.

### 1.1.1 Numbers

Prolog uses two representations for numbers:

- integer
- floating point

When it is impossible to use an integer representation due to the size of a nominal integer, a floating point representation can be used instead. This means that extremely large integers may actually require the extended precision of a floating point value. Any operation involving integers, such as a call to `is/2`, will first attempt to use an integer representation for the result, and will use a floating point value only when necessary. This type *coercion* is carried out consistently within the Prolog system.

There is no automatic conversion of floating point numbers into integers. (Note that the ISO Prolog standard now forbids this kind of conversion.)

#### Integers

The textual representation of an integer consists of a sequence of one or more digits (0 through 9) optionally preceded by a ‘-’ to signify a negative number. The parser assumes that all integers are written using base ten, unless the special binary, octal, or hexadecimal notation is used.

The hexadecimal notation is a `0x` followed by a sequence of valid hexadecimal digits. The following are valid hexadecimal digits:

```
0 1 2 3 4 5 6 7 8 9 a b c d e f A B C D E F
```

The octal notation is a `0o` followed by a sequence of valid octal digits. The octal digits are:

```
0 1 2 3 4 5 6 7
```

The binary notation is a `0b` followed by a sequence of 0’s and 1’s.

Here are some examples of integers:

```
0 4532 -273 0000001 0x1fff 0b1001 0o123
```

It is important to note that a term of the form `+5` is not an integer, but instead is a structured term.
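The notations above can be checked with a small sketch. The predicate names `literal_value/2`, `notations_agree/0`, and `plus_five_is_structured/0` are invented for this illustration; only the literal syntax comes from the text.

```prolog
% literal_value(Literal, DecimalValue): each literal is written in one of the
% notations described above, paired with the same number in base ten.
literal_value(0x1fff,  8191).    % hexadecimal
literal_value(0o123,   83).      % octal
literal_value(0b1001,  9).       % binary
literal_value(0000001, 1).       % leading zeros are allowed

% Succeeds when every literal is an integer equal to its decimal partner.
notations_agree :-
    \+ ( literal_value(L, V), \+ ( integer(L), L =:= V ) ).

% +5 is read as the structured term +(5), so integer/1 fails on it.
plus_five_is_structured :-
    T = +5,
    \+ integer(T),
    functor(T, (+), 1),
    arg(1, T, 5).
```

Both `notations_agree` and `plus_five_is_structured` should succeed when called as goals.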

#### Floating point numbers

Floating point numbers are slightly more complex than integers in that they may
have either a fractional part, an exponent, or both. A *fractional floating point number* consists of a sequence of one or more numeric characters, followed by a dot (‘.’), in turn followed by another sequence of one or more numeric characters; the entire expression may optionally be preceded by a ‘-’. Here are some examples of
floating point numbers:

```
0.0 3.1415927 -3.4 000023.540000
```

You can also specify an exponent using *scientific notation*. An exponent is either
an e or an E followed by an optional ‘-’, signifying a negative exponent, followed
by a sequence of one or more numeric characters. Here are examples of floating
point numbers with exponents:

```
0.1e-3 10E99 -44.66e-88 0E-0
```
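As a brief illustration of how integer and floating point literals relate, the sketch below compares two spellings of the same number. The predicate names are hypothetical, and the values were chosen to be exactly representable so that the arithmetic comparisons are safe.

```prolog
% 1500 and 1.5e3 denote the same number but are different terms:
% the first is an integer, the second a floating point number.
same_number_different_term :-
    integer(1500),
    float(1.5e3),
    1500 =:= 1.5e3,      % arithmetic comparison: equal values
    1500 \== 1.5e3.      % term comparison: not identical

% A leading '-' and a negative exponent can be combined in one literal.
negative_with_exponent :-
    X is -2.5E-1 * 4,
    float(X),
    X =:= -1.0.
```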

#### ASCII Codes

ASCII (American Standard Code for Information Interchange) codes are small integers between 0 and 255 inclusive that represent characters. The parser will translate any printable character into its corresponding ASCII integer. In order to get the ASCII code for a character, precede the character by the characters `0'`. For example, the codes for the characters ‘A’, ‘8’, and ‘%’ would be given by:

```
0'A 0'8 0'%
```

In addition, the ANSI C-style octal and hexadecimal escape forms can be used. Thus, all of the expressions below denote the number 65:

```
0'A 0'\101 0'\x41
```
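Because `0'c` is just an integer, it can be used directly in arithmetic. The following is a minimal sketch; the predicates `digit_value/2` and `upcase_code/2` are illustrative names, not ALS builtins.

```prolog
% digit_value(Code, Value): convert the character code of a decimal digit
% into the number it stands for, using 0'0 as the base.
digit_value(Code, Value) :-
    Code >= 0'0, Code =< 0'9,
    Value is Code - 0'0.

% upcase_code(Lower, Upper): map a lower case letter code to upper case.
upcase_code(Lower, Upper) :-
    Lower >= 0'a, Lower =< 0'z,
    Upper is Lower - 0'a + 0'A.
```

For example, `?- digit_value(0'8, V).` binds `V` to 8, and `?- upcase_code(0'c, U).` binds `U` to 67, the code for ‘C’.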

The table below displays several example ASCII sequences:

| Expr | Octal Expr | Hex Expr | ASCII Code | Char |
|---|---|---|---|---|
| `0'A` | `0'\101` | `0'\x41` | 65 | Upper case A |
| `0'c` | `0'\143` | `0'\x63` | 99 | Lower case c |
| `0'~` | `0'\176` | `0'\x7e` | 126 | Tilde character |

There also exists a small collection of symbolic control characters which can be thought of as synonyms for certain of the ASCII control character codes. These are presented in the following table:

| Expr | Octal Expr | Hex Expr | ASCII Code | Char |
|---|---|---|---|---|
| `0'\a` | `0'\007` | `0'\x7` | 7 | alert (‘bell’) |
| `0'\b` | `0'\010` | `0'\x8` | 8 | backspace |
| `0'\f` | `0'\014` | `0'\xC` | 12 | form feed |
| `0'\n` | `0'\012` | `0'\xA` | 10 | new line |
| `0'\r` | `0'\015` | `0'\xD` | 13 | return |
| `0'\t` | `0'\011` | `0'\x9` | 9 | horizontal tab |
| `0'\v` | `0'\013` | `0'\xB` | 11 | vertical tab |

#### Atoms

An atom is a sequence of characters that are parsed together as a constant.

##### Alphanumeric atoms

An alphanumeric atom is a sequence of characters that begins with a lower case letter, and is followed by zero or more alphanumeric characters, possibly including ‘_’. Here are some examples of alphanumeric atoms:

```
foobar123 zIPPY bread_and_butter money
```

##### Quoted atoms

A quoted atom is formed by placing any sequence of characters between single quotes (`'`). A single quote can be included in the text of the atom by using two consecutive single quotes for each one desired, or by prefixing the embedded single quote with the backslash (`\`) escape character. The following are all quoted atoms:

```
'any char will do' '$*#!#@%#*'
'Can''t miss' 'Can\'t miss' '99999'
```

If the characters that compose a quoted atom can be interpreted as an atom when they occur without the enclosing single quotes, then it is not necessary to use the quoted form. However, if the atom contains characters that aren’t allowed in a simple atom, then the quotes are required. Note that the last example above is an atom whose print name is `99999`, not the integer `99999`.

Quoted atoms can span multiple lines, but in this case the end of each such line must be preceded by the backslash escape character, as in the following example of an atom:

```
'We are the stars which sing. \
We sing with our light; \
We are the birds of fire, \
We fly over the sky. \
-- Algonquin poem.'
```
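The quoting rules can be summarised in a short sketch. The predicate names below are made up for the illustration; each one should succeed when called.

```prolog
% Quotes are optional when the text is already a valid alphanumeric atom.
quotes_optional :- 'foobar123' == foobar123.

% Both ways of embedding a single quote yield the same atom.
same_embedded_quote :- 'Can''t miss' == 'Can\'t miss'.

% '99999' is an atom whose print name happens to be digits;
% 99999 without quotes is an integer, and the two are different terms.
atom_not_integer :-
    atom('99999'),
    integer(99999),
    '99999' \== 99999.
```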

##### Special atoms

A special atom is any sequence of characters from the following set:

```
+-*/\^<>=':.?@#&.
```

In addition, the atoms `[]`, `!`, `;` and `,` are considered to be special atoms. Some other examples of special atoms are:

```
+= && @>= == <---------
```

Most special atoms are automatically read as quoted atoms unless they have been declared as operators (see Section 1.7 Operators).

## 1.2 Variables

A variable consists of either a _ (underbar character) or an upper case letter, followed by a (possibly empty) sequence of alphanumeric characters and dollar signs. Here are some variables:

```
Variable X123a _a$bc _123 _
```
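To see how the first character decides whether a token is read as a variable or as a constant, a classifier can be sketched with the standard type-checking builtins. `term_kind/2` is a hypothetical helper, not part of ALS Prolog.

```prolog
% term_kind(Term, Kind): classify a term by the category it was read as.
% The output is unified after the cut so the predicate stays correct
% when Kind is already bound.
term_kind(T, Kind) :- var(T),      !, Kind = variable.
term_kind(T, Kind) :- atom(T),     !, Kind = atom.
term_kind(T, Kind) :- number(T),   !, Kind = number.
term_kind(T, Kind) :- compound(T),    Kind = compound.
```

For instance, `?- term_kind(Variable, K).` gives `K = variable`, while `?- term_kind(zIPPY, K).` gives `K = atom` because the token begins with a lower case letter.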

## 1.3 Compound Terms

A compound term consists of a symbolic constant, called a *functor*, followed
by a left parenthesis followed by one or more terms separated by commas, followed
by a right parenthesis. The number of terms separated by commas enclosed in the
parentheses is called the *arity* of the structure. For example, the compound term

```
f(a,b(X),y)
```

has arity 3.
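The functor and arity of a term can be inspected with the standard builtins `functor/3` and `arg/3`. The wrapper predicates below are illustrative names only.

```prolog
% functor/3 reports the functor name and arity of a compound term.
name_and_arity(Name, Arity) :-
    functor(f(a, b(_), y), Name, Arity).

% arg/3 picks out an argument by position; the second argument of the
% example term is itself a compound term, b(X), of arity 1.
second_argument_arity(Arity) :-
    arg(2, f(a, b(_), y), Arg),
    functor(Arg, b, Arity).
```

Calling `?- name_and_arity(N, A).` yields `N = f` and `A = 3`, matching the arity stated above.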

## 1.4 Curly Braces

Instead of prefixing a structured term with a functor, the curly brace notation allows a sequence of terms, separated by commas, to be grouped together in a comma list with ‘{}’ as the principal functor. For example,

```
{all,the,young,dudes}
```

parses internally into:

```
'{}'((all,the,young,dudes))
```
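Since the braces wrap a single comma list, the elements can be recovered by walking the `','/2` structure. This is a minimal sketch; `curly_items/2` and `comma_list/2` are hypothetical helpers, and the sketch assumes the elements themselves are not unbound variables or comma terms.

```prolog
% curly_items(CurlyTerm, Items): collect the elements of a curly-brace term
% into an ordinary list.
curly_items('{}'(Body), Items) :-
    comma_list(Body, Items).

% comma_list(CommaTerm, Items): flatten a right-nested comma list.
comma_list((A, Rest), Items) :- !,
    Items = [A | Tail],
    comma_list(Rest, Tail).
comma_list(A, [A]).
```

With these definitions, `?- curly_items({all,the,young,dudes}, L).` binds `L` to `[all,the,young,dudes]`.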

## 1.5 Lists

The simplest list is the empty list, represented by the atom ‘[]’. Any other list is a structured term with `./2` as principal functor and whose second argument is a list. Lists can be written by using ‘.’ explicitly as a functor, or using the special *list* notation. A list using list notation is written as a `[` followed by the successive first arguments of all the sublists in order, separated by commas, followed by `]`. The following are all different ways of writing the same list:

```
a.b.c.[]
[a,b,c]
'.'(a,'.'(b,'.'(c,[])))
```

Unless specified, the last tail of a list is assumed to be `[]`. A tail of a list can be specified explicitly by using `|`, as in these examples:

```
[a|X]
[1,2,3|[]]
[Head|Tail]
```

The list notation for lists is preferable to using ‘.’ explicitly because the dot is also used in floating point numbers and to signal termination of input terms.
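The `[Head|Tail]` form is what most list-processing predicates pattern-match on. A small sketch follows; `list_length/2` and `explicit_tail_same/0` are illustrative names chosen so as not to clash with any library predicate.

```prolog
% list_length(List, N): count elements by peeling off the head with [_|Tail].
list_length([], 0).
list_length([_ | Tail], N) :-
    list_length(Tail, M),
    N is M + 1.

% Writing the final tail [] explicitly adds nothing:
% both spellings denote the same list.
explicit_tail_same :- [1,2,3 | []] == [1,2,3].
```

For example, `?- [a|X] = [a,b,c].` binds `X` to `[b,c]`, and `?- list_length([a,b,c], N).` binds `N` to 3.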

## 1.6 Strings

A string is any sequence of characters enclosed in double quotes (`"`). The parser automatically translates any string into the list of ASCII codes that corresponds to the characters between the quotes. For example, the string

```
"It's a dog's life"
```

is translated into

```
[73,116,39,115,32,97,32,100,111,103,39,115,32,108,105,102,101]
```

Double quotes can be embedded in strings by either repeating the double quote or by using the backslash escape character before the embedded `"`, as for example in

```
"She said, ""hi.""".
"She said, \"hi.\"".
```
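The translation into code lists can be observed directly. The sketch below assumes double-quoted text is read as a list of character codes, as described in this section; some other Prolog systems use a different default representation, in which case the first predicate would not hold. The predicate names are hypothetical.

```prolog
% A double-quoted string is read as a list of character codes.
string_is_code_list :-
    "cat" == [99, 97, 116],
    "cat" == [0'c, 0'a, 0't].

% Both ways of embedding a double quote produce the same term.
same_embedded_double_quote :-
    "She said, ""hi.""" == "She said, \"hi.\"".
```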

## 1.7 Operators

The prefix functor notation is convenient for writing terms with many arguments. However, Prolog allows a program to define a more readable syntax for structured terms with one or two arguments. For example, the parser recognizes the text

```
a+b+c
```

as an expression representing

```
+(+(a,b),c)
```

because the special atom + is declared as an infix operator. Infix operators are written between their two arguments. For the other operator types, prefix and postfix, the operator (functor) is written before (prefix) or after (postfix) the single argument to the term.

### What Makes an Operator?

Operators are either alphanumeric atoms or special atoms which have a corresponding *precedence* and *associativity*. The associativity is sometimes referred to as the *type* of an operator. Operators may be declared by using the `op/3` builtin. Precedences range from 1 to 1200 with the lower precedences having the tightest binding. Another way of looking at this is that in an expression such as `1*X+Y`, the operator with the highest precedence will be the principal functor. So `1*X+Y` is equivalent to `'+'('*'(1,X),Y)` because the ‘*’ binds tighter than the ‘+’. The types of operators are named

```
fx, fy, xf, yf, xfx, yfx, and xfy,
```

where the ‘f’ shows the position of the operator. Hence, `fx` and `fy` indicate prefix operators, `yf` and `xf` indicate postfix operators, and `xfx`, `yfx`, and `xfy` indicate infix operators. An ‘x’ indicates that the operator will not associate with operators of the same or greater precedence, while a ‘y’ indicates that it will associate with operators of the same or lower precedence, but not operators of greater precedence. The table below describes all of the predefined binary operators in ALS Prolog:

| Operator | Specifier | Precedence | Operator | Specifier | Precedence |
|---|---|---|---|---|---|
| `:-` | xfx | 1200 | `=:=` | xfx | 700 |
| `-->` | xfx | 1200 | `=\=` | xfx | 700 |
| `==>` | xfy | 1200 | `<` | xfx | 700 |
| `when` | xfx | 1190 | `=<` | xfx | 700 |
| `where` | xfx | 1180 | `>` | xfx | 700 |
| `with` | xfx | 1170 | `>=` | xfx | 700 |
| `if` | xfx | 1160 | `:=` | xfy | 600 |
| `;` | xfy | 1100 | `+` | yfx | 500 |
| `\|` | xfy | 1100 | `-` | yfx | 500 |
| `->` | xfy | 1050 | `/\` | yfx | 500 |
| `,` | xfy | 1000 | `\/` | yfx | 500 |
| `:` | xfy | 950 | `xor` | yfx | 500 |
| `.` | xfy | 800 | `or` | yfx | 500 |
| `=` | xfx | 700 | `and` | yfx | 500 |
| `\=` | xfx | 700 | `*` | yfx | 400 |
| `==` | xfx | 700 | `/` | yfx | 400 |
| `\==` | xfx | 700 | `//` | yfx | 400 |
| `@<` | xfx | 700 | `div` | yfx | 400 |
| `@=<` | xfx | 700 | `rem` | yfx | 400 |
| `@>` | xfx | 700 | `mod` | yfx | 400 |
| `@>=` | xfx | 700 | `<<` | yfx | 400 |
| `=..` | xfx | 700 | `>>` | yfx | 400 |
| `is` | xfx | 700 | `**` | xfx | 200 |
|  |  |  | `^` | xfy | 200 |

The next table describes the predefined prefix operators for ALS Prolog:

| Operator | Specifier | Precedence | Operator | Specifier | Precedence |
|---|---|---|---|---|---|
| `:-` | fx | 1200 | `nospy` | fx | 800 |
| `?-` | fx | 1200 | `-` | fy | 200 |
| `vi` | fx | 1125 | `+` | fy | 200 |
| `edit` | fx | 1125 | `\` | fy | 200 |
| `ls` | fx | 1125 | `export` | fx | 1200 |
| `cd` | fx | 1125 | `use` | fx | 1200 |
| `dir` | fx | 1125 | `module` | fx | 1200 |
| `not` | fx | 900 | ’’ | fx | 925 |
| `\+` | fx | 900 | ’ | fx | 930 |
| `trace` | fx | 800 | `~` | fy | 300 |
| `spy` | fx | 800 |  |  |  |
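The effect of precedence, associativity, and `op/3` declarations can be explored with a short sketch. The operator `likes` and all predicate names here are hypothetical and chosen only for illustration; precedence 700 and type `xfx` mirror the comparison operators.

```prolog
% Declare a hypothetical infix operator; likes is not predefined.
:- op(700, xfx, likes).

% With the declaration in place, operator and functor notation are
% two ways of writing the same term.
same_term :- (mary likes wine) == likes(mary, wine).

% * (precedence 400) binds tighter than + (precedence 500),
% so + is the principal functor of 1*X+Y.
principal_functor(Name) :-
    functor(1*_X+_Y, Name, 2).

% yfx operators such as - associate to the left,
% while xfy operators such as ^ associate to the right.
left_assoc  :- (a-b-c) == ((a-b)-c).
right_assoc :- (a^b^c) == (a^(b^c)).
```

Because `likes` is declared `xfx`, a term such as `a likes b likes c` would be rejected by the parser: neither argument may itself be an unparenthesized term of precedence 700.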

### Special Cases

It is possible to declare an operator via `op/3` that can never be parsed. Even though quoted atoms can be assigned a precedence and associativity, the parser will only interpret alphanumeric atoms or special atoms as operators.

### White space

*White space*, or *layout characters*, refers to the part of source code, data, and goals that is not made up of readable characters. The term white space comes from the fact that these unreadable characters appear white when source code is printed on a sheet of white paper. White space is any sequence of spaces, tabs, or new lines.

Generally speaking, white space has little meaning to the parser. It is occasionally important for recognizing full stops, and for delimiting constructs which, if they were run together, would not be recognizable as separate constructs. There are also places where additional white space is either inappropriate or changes the meaning of the text. For example, you can’t embed a space in a number.

## 1.8 Comments

Comments can be put anywhere white space can occur. Comments can take one of two forms:

- A line comment: anything following a percent sign (%) is ignored until the end of line.
- A block comment: anything enclosed in a ‘/* */’ pair is ignored. Block comments may span many lines if desired. Block comments may be nested, thus allowing commented code to be commented out.

```
/*
 *
 * /*
 *  * This is one way to use block comments
 *  */
 *
 */
connected(footbone, legbone).
/* Here's another */
connected(headbone, neckbone).  % line comments can
connected(doggone, fishbone).   % look good
                                % next to code that
                                % you write
```

## 1.9 Preprocessor Directives: Syntax

ALS Prolog supports *preprocessor directives* which can affect the text at the time the program is compiled (or loaded into an image). These expressions include the
following:

```
#include #if #else #elif #endif
```

Each of these must occur at the beginning of a line of program text. Each of `#include`, `#if`, and `#elif` must be followed by a Prolog term, but each of `#else` and `#endif` must stand on a line by itself. The `#include` directive should be followed by a Prolog double quoted string, intended to name a file:

```
#include "/mydir/foo.pro"
```

No full stop (.) should follow this expression, nor the expressions following `#if` and `#elif`. The expression following `#if` or `#elif` can be an arbitrary Prolog term.

The expressions `#if`, `#else`, `#elif`, `#endif` must be organized as conditionals in a manner similar to their use in C programs. Thus, the first expression occurring must be an `#if`, and the last must be an `#endif`. Between them there can be zero or more occurrences of `#else` and `#elif`. There can be at most one occurrence of `#else` between a given `#if ... #endif` pair, and it must follow all of the zero or more occurrences of `#elif` between the same pair. Preprocessor directive semantics are described in Section 2.3 Preprocessor Directives: Semantics.
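A purely syntactic sketch of a well-formed conditional follows. The terms `debug_build` and `test_build` and the predicate `log_level/1` are hypothetical placeholders; how the term after `#if` or `#elif` is evaluated is a matter of semantics and is not assumed here.

```prolog
#include "/mydir/foo.pro"

#if debug_build
log_level(verbose).
#elif test_build
log_level(normal).
#else
log_level(quiet).
#endif
```

Note that each directive starts in the first column, that no full stop follows the directive lines, that the single `#else` comes after the `#elif`, and that the ordinary clauses between the directives still end with a full stop.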