Conventions¶
Validation checks that a WebAssembly module is well-formed. Only valid modules can be instantiated.
Validity is defined by a type system over the abstract syntax of a module and its contents. For each piece of abstract syntax, there is a typing rule that specifies the constraints that apply to it. All rules are given in two equivalent forms:
In prose, describing the meaning in intuitive form.
In formal notation, describing the rule in mathematical form. [1]
Note
The prose and formal rules are equivalent, so that understanding of the formal notation is not required to read this specification. The formalism offers a more concise description in notation that is used widely in programming languages semantics and is readily amenable to mathematical proof.
In both cases, the rules are formulated in a declarative manner. That is, they only formulate the constraints, they do not define an algorithm. The skeleton of a sound and complete algorithm for type-checking instruction sequences according to this specification is provided in the appendix.
Types¶
To define the semantics, the definition of some sorts of types is extended to include additional forms. By virtue of not being representable in either the binary format or the text format, these forms cannot be used in a program; they only occur during validation or execution.
The unique value type
Note
No validation rule uses bottom types explicitly, but various rules can pick any value or heap type, including bottom. This ensures the existence of principal types, and thus a validation algorithm without back tracking.
A concrete heap type can consist of a defined type directly. this occurs as the result of substituting a type index with its definition.
A concrete heap type may also be a recursive type index.
Such an index refers to the
Finally, the representation of supertypes in a sub type is generalized from mere type indices to heap types. They occur as defined types or recursive type indices after substituting type indices or rolling up recursive types.
Note
It is an invariant of the semantics that sub types occur only in one of two forms: either as “syntactic” types as in a source module, where all supertypes are type indices, or as “semantic” types, where all supertypes are resolved to either defined types or recursive type indices.
A type of any form is closed when it does not contain a heap type that is a type index or a recursive type index without a surrounding recursive type, i.e., all type indices have been substituted with their defined type and all free recursive type indices have been unrolled.
Note
Recursive type indices are internal to a recursive type. They are distinguished from regular type indices and represented such that two closed types are syntactically equal if and only if they have the same recursive structure.
Convention¶
The difference
between two reference types is defined as follows:
Note
This definition computes an approximation of the reference type that is inhabited by all values from
Defined Types¶
Defined types denote the individual types defined in a module. Each such type is represented as a projection from the recursive type group it originates from, indexed by its position in that group.
Defined types do not occur in the binary or text format, but are formed by rolling up the recursive types defined in a module.
It is hence an invariant of the semantics that all recursive types occurring in defined types are rolled up.
Conventions¶
denotes the parallel substitution of type indices with defined types in type , provided . denotes the parallel substitution of recursive type indices with defined types in type , provided . is shorthand for the substitution , where .
Rolling and Unrolling¶
In order to allow comparing recursive types for equivalence, their representation is changed such that all type indices internal to the same recursive type are replaced by recursive type indices.
Note
This representation is independent of the type index space, so that it is meaningful across module boundaries. Moreover, this representation ensures that types with equivalent recursive structure are also syntactically equal, hence allowing a simple equality check on (closed) types. It gives rise to an iso-recursive interpretation of types.
The representation change is performed by two auxiliary operations on the syntax of recursive types:
Rolling up a recursive type substitutes its internal type indices with corresponding recursive type indices.
Unrolling a recursive type substitutes its recursive type indices with the corresponding defined types.
These operations are extended to defined types and defined as follows:
In addition, the following auxiliary function denotes the expansion of a defined type:
Instruction Types¶
Instruction types classify the behaviour of instructions or instruction sequences, by describing how they manipulate the operand stack and the initialization status of locals:
An instruction type
Note
Instruction types are only used for validation, they do not occur in programs.
Local Types¶
Local types classify locals, by describing their value type as well as their initialization status:
Note
Local types are only used for validation, they do not occur in programs.
Contexts¶
Validity of an individual definition is specified relative to a context, which collects relevant information about the surrounding module and the definitions in scope:
Types: the list of types defined in the current module.
Functions: the list of functions declared in the current module, represented by a defined type that expands to their function type.
Tables: the list of tables declared in the current module, represented by their table type.
Memories: the list of memories declared in the current module, represented by their memory type.
Globals: the list of globals declared in the current module, represented by their global type.
Tags: the list of tags declared in the current module, represented by their tag type.
Element Segments: the list of element segments declared in the current module, represented by the elements’ reference type.
Data Segments: the list of data segments declared in the current module, each represented by an
entry.Locals: the list of locals declared in the current function (including parameters), represented by their local type.
Labels: the stack of labels accessible from the current position, represented by their result type.
Return: the return type of the current function, represented as an optional result type that is absent when no return is allowed, as in free-standing expressions.
References: the list of function indices that occur in the module outside functions and can hence be used to form references inside them.
In other words, a context contains a sequence of suitable types for each index space, describing each defined entry in that space. Locals, labels and return type are only used for validating instructions in function bodies, and are left empty elsewhere. The label stack is the only part of the context that changes as validation of an instruction sequence proceeds.
More concretely, contexts are defined as records
In addition to field access written
When spelling out a context, empty fields are omitted.
denotes the same context as but with the elements prepended to its component sequence.
Note
Indexing notation like
Convention¶
Any form of type can be closed to bring it into closed form relative to a context it is valid in by substituting each type index
Prose Notation¶
Validation is specified by stylised rules for each relevant part of the abstract syntax. The rules not only state constraints defining when a phrase is valid, they also classify it with a type. The following conventions are adopted in stating these rules.
A phrase
is said to be “valid with type ” if and only if all constraints expressed by the respective rules are met. The form of depends on what is.Note
For example, if
is a function, then is a function type; for an that is a global, is a global type; and so on.The rules implicitly assume a given context
.In some places, this context is locally extended to a context
with additional entries. The formulation “Under context , … statement …” is adopted to express that the following statement must apply under the assumptions embodied in the extended context.
Formal Notation¶
Note
This section gives a brief explanation of the notation for specifying typing rules formally. For the interested reader, a more thorough introduction can be found in respective text books. [2]
The proposition that a phrase
The formal typing rules use a standard approach for specifying type systems, rendering them into deduction rules. Every rule has the following general form:
Such a rule is read as a big implication: if all premises hold, then the conclusion holds.
Some rules have no premises; they are axioms whose conclusion holds unconditionally.
The conclusion always is a judgment
Note
For example, the typing rule for the
The instruction is always valid with type
An instruction like
Here, the premise enforces that the immediate global index
Finally, a structured instruction requires a recursive rule, where the premise is itself a typing judgement:
A