Lexical conventions

A Metaslang text consists of a sequence of symbols, possibly interspersed with whitespace. The term whitespace refers to any non-empty sequence of spaces, tabs, newlines, and comments (described below). A symbol is presented in the text as a sequence of one or more "marks" (ASCII characters). Within a composite (multi-mark) symbol, no whitespace is allowed, but whitespace may be needed between two symbols if together they could be taken for one symbol; in particular, two names that follow each other should be separated by whitespace. More precisely, whitespace is required between two adjacent symbols for each of the following combinations, in which "abc" stands for an arbitrary word-symbol, "<*>" stands for an arbitrary non-word-symbol, "?:!" stands for an arbitrary non-word-symbol starting with a ?-mark, and "123" stands for an arbitrary literal (see below for the definitions of the various classes of symbols):
    abc  abc
    abc  ?:!
    abc  123
    <*>  <*>
    123  abc
    123  123
    abc  _
      (  *
Apart from the last two cases, no whitespace is ever needed adjacent to a special-symbol.

Inside literals (constant-denotations) whitespace is also disallowed, except for "significant-whitespace" as described under String-literals.

Other than that, whitespace -- or the lack of it -- has no significance. Whitespace can be used to lay-out the text for readability, but as far as only the meaning is concerned, the following two presentations of the same spec are entirely equivalent:
    spec
      sort Even
      op next : Even -> Even
      axiom nextEffect is
        fa(x : Even) ~(next x = x)
    endspec

    spec sort   Even op   next : Even -> Even axiom nextEffect
    is fa(x : Even)~(next     x            = x)endspec

Symbols and Names

symbol ::= name | literal | special-symbol

name ::= word-symbol | non-word-symbol

word-symbol ::= word-start-mark { word-continue-mark }*

word-start-mark ::= letter

word-continue-mark ::=
    letter | decimal-digit | _ | ?

letter ::=
      A | B | C | D | E | F | G | H | I | J | K | L | M
    | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
    | a | b | c | d | e | f | g | h | i | j | k | l | m
    | n | o | p | q | r | s | t | u | v | w | x | y | z

decimal-digit ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

non-word-symbol ::= non-word-mark { non-word-mark }*

non-word-mark ::=
      ` | ~ | ! | @ | $ | ^ | & | * | -
    | = | + | \ | | | : | < | > | / | ?

special-symbol ::= _ | ( | ) | [ | ] | { | } | ; | , | .
Sample names:
    Date                    $$            ?!
    yymmdd2date             <*>           :=:
    well_ordered?           ~==           
For convenience, here are the 14 printing ASCII marks that, next to letters and decimal-digits, can not occur in a non-word-symbol:
      #    %    '    "    _    (    )
      [    ]    {    }    ;    ,    .
Restriction. As mentioned before, no whitespace is allowed in symbols: while anode is a single name, both a node and an ode consist of two names. Further, the case (lower or upper) of letters in names is significant: grandparent, grandParent and grandpaRent are three different names.

Restriction. In general, names can be chosen freely. However, the following reserved words have a special meaning and must not be used for names:
    as            endspec       infixr        relax
    axiom         ex            is            restrict
    case          fa            let           sort
    choose        false         morphism      spec
    colimit       fn            obligations   then
    conjecture    from          of            theorem
    def           generate      op            translate
    diagram       if            project       true
    else          import        prove         where
    embed         in            qualifying
    embed?        infixl        quotient
They each count as a single symbol, and no whitespace is allowed inside any reserved word.

Likewise, the following reserved non-words must not be used for names:
    :       ::      |
    <-      ->      +->
In addition, several names -- for example = -- are pre-defined in built-in libraries. While strictly speaking not reserved, they must not be redefined. See further the Libraries Appendix.

The non-word-symbols can be used to choose convenient names for infix operators that, conventionally, are written with non-alphabetic marks.

Some Metaslang users follow the convention of using names that start with a capital letter for unit-identifiers and sort-names and for constructors, while word-symbols chosen for op-names and field-names start with a lower-case letter. Both plain local-variables and local-sort-variables are often chosen to be single lower-case letters: x, y, z, a, b, c, with the start of the alphabet preferred for local-sort-variables. Op-names of predicates (that is, having some sort S -> Boolean) often end with the mark ?. These are just conventions that users are free to follow or ignore, but in particular some convention distinguishing constructors from op-names and local-variables is recommended.

Comments

comment ::= line-end-comment | block-comment

line-end-comment ::= % line-end-comment-body

line-end-comment-body ::=
    any-text-up-to-end-of-line

block-comment ::= (* block-comment-body *)

block-comment-body ::=
    any-text-including-newlines-and-nested-block-comments
Sample comments:
    % keys must be unique
    (* op yymmdd2Date : String -> Date *)
Metaslang allows two styles of comments. The %-style is light-weight, for adding comment on a line after the formal text (or taking a line on its own, but always confined to a single line). The (*...*)-style can be used for blocks of text, spanning several lines, or stay within a line. Any text remaining on the line after the closing *) is processed as formal text. Block-comments may be nested, so the pairs of brackets (* and *) must be balanced.

A block-comment can not contain a line-end-comment and vice versa: whichever starts first has "the right of way". For example, (* 100 % or more! *) is a block-comment with block-comment-body 100 % or more! . The % here is a mark like any other; it does not introduce a line-end-comment. Conversely, in the line-end-comment % op <*> stands for (*) the (* is part of the line-end-comment-body; it does not introduce a block-comment. Note also that % and (* have no special significance in literals (which may not contain whitespace, including comments): "100 % or more!" is a well-formed string-literal.