Conventions

WebAssembly is a programming language that has multiple concrete representations (its binary format and the text format). Both map to a common structure. For conciseness, this structure is described in the form of an abstract syntax. All parts of this specification are defined in terms of this abstract syntax.

Grammar Notation

The following conventions are adopted in defining grammar rules for abstract syntax.

  • Terminal symbols (atoms) are written in sans-serif font: \(\mathsf{i32}, \mathsf{end}\).

  • Nonterminal symbols are written in italic font: \(\mathit{valtype}, \mathit{instr}\).

  • \(A^n\) is a sequence of \(n\geq 0\) iterations of \(A\).

  • \(A^\ast\) is a possibly empty sequence of iterations of \(A\). (This is a shorthand for \(A^n\) used where \(n\) is not relevant.)

  • \(A^+\) is a non-empty sequence of iterations of \(A\). (This is a shorthand for \(A^n\) where \(n \geq 1\).)

  • \(A^?\) is an optional occurrence of \(A\). (This is a shorthand for \(A^n\) where \(n \leq 1\).)

  • Productions are written \(\mathit{sym} ::= A_1 ~|~ \dots ~|~ A_n\).

  • Large productions may be split into multiple definitions, indicated by ending the first one with explicit ellipses, \(\mathit{sym} ::= A_1 ~|~ \dots\), and starting continuations with ellipses, \(\mathit{sym} ::= \dots ~|~ A_2\).

  • Some productions are augmented with side conditions in parentheses, “\((\mathrel{\mbox{if}} \mathit{condition})\)”, that provide a shorthand for a combinatorial expansion of the production into many separate cases.

Auxiliary Notation

When dealing with syntactic constructs the following notation is also used:

  • \(\epsilon\) denotes the empty sequence.

  • \(|s|\) denotes the length of a sequence \(s\).

  • \(s[i]\) denotes the \(i\)-th element of a sequence \(s\), starting from \(0\).

  • \(s[i \href{../syntax/conventions.html#notation-slice}{\mathrel{\mathbf{:}}} n]\) denotes the sub-sequence \(s[i]~\dots~s[i+n-1]\) of a sequence \(s\).

  • \(s \href{../syntax/conventions.html#notation-replace}{\mathrel{\mbox{with}}} [i] = A\) denotes the same sequence as \(s\), except that the \(i\)-th element is replaced with \(A\).

  • \(s \href{../syntax/conventions.html#notation-replace}{\mathrel{\mbox{with}}} [i \href{../syntax/conventions.html#notation-slice}{\mathrel{\mathbf{:}}} n] = A^n\) denotes the same sequence as \(s\), except that the sub-sequence \(s[i \href{../syntax/conventions.html#notation-slice}{\mathrel{\mathbf{:}}} n]\) is replaced with \(A^n\).

  • \(\href{../syntax/conventions.html#notation-concat}{\mathrm{concat}}(s^\ast)\) denotes the flat sequence formed by concatenating all sequences \(s_i\) in \(s^\ast\).

Moreover, the following conventions are employed:

  • The notation \(x^n\), where \(x\) is a non-terminal symbol, is treated as a meta variable ranging over respective sequences of \(x\) (similarly for \(x^\ast\), \(x^+\), \(x^?\)).

  • When given a sequence \(x^n\), then the occurrences of \(x\) in a sequence written \((A_1~x~A_2)^n\) are assumed to be in point-wise correspondence with \(x^n\) (similarly for \(x^\ast\), \(x^+\), \(x^?\)). This implicitly expresses a form of mapping syntactic constructions over a sequence.

Productions of the following form are interpreted as records that map a fixed set of fields \(\mathsf{field}_i\) to “values” \(A_i\), respectively:

\[\mathit{r} ~::=~ \{ \mathsf{field}_1~A_1, \mathsf{field}_2~A_2, \dots \}\]

The following notation is adopted for manipulating such records:

  • \(r.\mathsf{field}\) denotes the contents of the \(\mathsf{field}\) component of \(r\).

  • \(r \href{../syntax/conventions.html#notation-replace}{\mathrel{\mbox{with}}} \mathsf{field} = A\) denotes the same record as \(r\), except that the contents of the \(\mathsf{field}\) component is replaced with \(A\).

  • \(r_1 \href{../syntax/conventions.html#notation-compose}{\oplus} r_2\) denotes the composition of two records with the same fields of sequences by appending each sequence point-wise:

    \[\{ \mathsf{field}_1\,A_1^\ast, \mathsf{field}_2\,A_2^\ast, \dots \} \href{../syntax/conventions.html#notation-compose}{\oplus} \{ \mathsf{field}_1\,B_1^\ast, \mathsf{field}_2\,B_2^\ast, \dots \} = \{ \mathsf{field}_1\,A_1^\ast~B_1^\ast, \mathsf{field}_2\,A_2^\ast~B_2^\ast, \dots \}\]
  • \(\href{../syntax/conventions.html#notation-compose}{\bigoplus} r^\ast\) denotes the composition of a sequence of records, respectively; if the sequence is empty, then all fields of the resulting record are empty.

The update notation for sequences and records generalizes recursively to nested components accessed by “paths” \(\mathit{pth} ::= ([\dots] \;| \;.\mathsf{field})^+\):

  • \(s \href{../syntax/conventions.html#notation-replace}{\mathrel{\mbox{with}}} [i]\,\mathit{pth} = A\) is short for \(s \href{../syntax/conventions.html#notation-replace}{\mathrel{\mbox{with}}} [i] = (s[i] \href{../syntax/conventions.html#notation-replace}{\mathrel{\mbox{with}}} \mathit{pth} = A)\).

  • \(r \href{../syntax/conventions.html#notation-replace}{\mathrel{\mbox{with}}} \mathsf{field}\,\mathit{pth} = A\) is short for \(r \href{../syntax/conventions.html#notation-replace}{\mathrel{\mbox{with}}} \mathsf{field} = (r.\mathsf{field} \href{../syntax/conventions.html#notation-replace}{\mathrel{\mbox{with}}} \mathit{pth} = A)\).

where \(r \href{../syntax/conventions.html#notation-replace}{\mathrel{\mbox{with}}}~.\mathsf{field} = A\) is shortened to \(r \href{../syntax/conventions.html#notation-replace}{\mathrel{\mbox{with}}} \mathsf{field} = A\).

Vectors

Vectors are bounded sequences of the form \(A^n\) (or \(A^\ast\)), where the \(A\) can either be values or complex constructions. A vector can have at most \(2^{32}-1\) elements.

\[\begin{split}\begin{array}{lllll} \def\mathdef1445#1{{}}\mathdef1445{vector} & \href{../syntax/conventions.html#syntax-vec}{\mathit{vec}}(A) &::=& A^n & (\mathrel{\mbox{if}} n < 2^{32})\\ \end{array}\end{split}\]