Type Soundness¶

The type system of WebAssembly is sound, implying both type safety and memory safety with respect to the WebAssembly semantics. For example:

All types declared and derived during validation are respected at run time; e.g., every local or global variable will only contain type-correct values, every instruction will only be applied to operands of the expected type, and every function invocation always evaluates to a result of the right type (if it does not trap or diverge).
No memory location will be read or written except those explicitly defined by the program, i.e., as a local, a global, an element in a table, or a location within a linear memory.
There is no undefined behavior, i.e., the execution rules cover all possible cases that can occur in a valid program, and the rules are mutually consistent.

Soundness is also instrumental in ensuring additional properties, most notably, encapsulation of function and module scopes: no locals can be accessed outside their own function and no module components can be accessed outside their own module unless they are explicitly exported or imported.

The typing rules defining WebAssembly validation only cover the static components of a WebAssembly program. In order to state and prove soundness precisely, the typing rules must be extended to the dynamic components of the abstract runtime, that is, the store, configurations, and administrative instructions. [1]

Contexts¶

In order to check rolled up recursive types, the context is locally extended with an additional component that records the sub type corresponding to each recursive type index within the current recursive type:

\begin{array}{r} \begin{array}{llll} C & ::= & {\dots, recs {subtype}^{*}} \end{array} \end{array}

Types¶

Well-formedness for extended type forms is defined as follows.

Heap Type $bot$ ¶

The heap type is valid.

\frac{}{C ⊢ bot ok}

Heap Type $rec i$ ¶

The recursive type index $i$ must exist in $C . recs$ .
Then the heap type is valid.

\frac{C . recs [i] = subtype}{C ⊢ rec i ok}

Value Type $bot$ ¶

The value type is valid.

\frac{}{C ⊢ bot ok}

Recursive Types $rec {subtype}^{*}$ ¶

Let $C^{'}$ be the current context $C$ , but where $recs$ is ${subtype}^{*}$ .
There must be a type index $x$ , such that for each sub type ${subtype}_{i}$ in ${subtype}^{*}$ :
- Under the context $C^{'}$ , the sub type ${subtype}_{i}$ must be valid for type index $x + i$ and recursive type index $i$ .
Then the recursive type is valid for the type index $x$ .

\frac{C, recs {subtype}^{*} ⊢ rec {subtype}^{*} ok (x, 0)}{C ⊢ rec {subtype}^{*} ok (x)}

\frac{}{C ⊢ rec ϵ ok (x, i)}

\frac{C ⊢ subtype ok (x, i) C ⊢ rec {{subtype}^{'}}^{*} ok (x + 1, i + 1)}{C ⊢ rec subtype {{subtype}^{'}}^{*} ok (x, i)}

Note

These rules are a generalisation of the ones previously given.

Sub types $sub {final}^{?} {ht}^{*} comptype$ ¶

The composite type $comptype$ must be valid.
The sequence ${ht}^{*}$ may be no longer than $1$ .
For every heap type ${ht}_{k}$ in ${ht}^{*}$ :
- The heap type ${ht}_{k}$ must be ordered before a type index $x$ and recursive type index $i$ , meaning:
  - Either ${ht}_{k}$ is a defined type.
  - Or ${ht}_{k}$ is a type index $y_{k}$ that is smaller than $x$ .
  - Or ${ht}_{k}$ is a recursive type index $rec j_{k}$ where $j_{k}$ is smaller than $i$ .
- Let sub type ${subtype}_{k}$ be the unrolling of the heap type ${ht}_{k}$ , meaning:
  - Either ${ht}_{k}$ is a defined type ${deftype}_{k}$ , then ${subtype}_{k}$ must be the unrolling of ${deftype}_{k}$ .
  - Or ${ht}_{k}$ is a type index $y_{k}$ , then ${subtype}_{k}$ must be the unrolling of the defined type $C . types [y_{k}]$ .
  - Or ${ht}_{k}$ is a recursive type index $rec j_{k}$ , then ${subtype}_{k}$ must be $C . recs [j_{k}]$ .
- The sub type ${subtype}_{k}$ must not contain $final$ .
- Let ${comptype}_{k}^{'}$ be the composite type in ${subtype}_{k}$ .
- The composite type $comptype$ must match ${comptype}_{k}^{'}$ .
Then the sub type is valid for the type index $x$ and recursive type index $i$ .

\begin{array}{r} \frac{\begin{array}{c} | {ht}^{*} | \leq 1 (ht ≺ x, i)^{*} ({unroll}_{C} (ht) = sub {{ht}^{'}}^{*} {comptype}^{'})^{*} \\ C ⊢ comptype ok (C ⊢ comptype \leq {comptype}^{'})^{*} \end{array}}{C ⊢ sub {final}^{?} {ht}^{*} comptype ok (x, i)} \end{array}

where:

\begin{array}{r} \begin{array}{lll} (deftype ≺ x, i) & = & true \\ (y ≺ x, i) & = & y < x \\ (rec j ≺ x, i) & = & j < i \\ {unroll}_{C} (deftype) & = & unroll (deftype) \\ {unroll}_{C} (y) & = & unroll (C . types [y]) \\ {unroll}_{C} (rec j) & = & C . recs [j] \end{array} \end{array}

Note

This rule is a generalisation of the ones previously given, which only allowed type indices as supertypes.
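
As an informal illustration (not part of the specification), the ordering relation $≺$ defined above can be sketched in Python. The classes `DefType`, `TypeIdx`, and `RecIdx` are hypothetical stand-ins for the three forms of heap type that may appear as a supertype; they are assumptions of this sketch.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class DefType:
    """Hypothetical stand-in for a closed defined type."""
    name: str


@dataclass(frozen=True)
class TypeIdx:
    """A type index y into C.types."""
    y: int


@dataclass(frozen=True)
class RecIdx:
    """A recursive type index (rec j) into C.recs."""
    j: int


HeapType = Union[DefType, TypeIdx, RecIdx]


def ordered_before(ht: HeapType, x: int, i: int) -> bool:
    """Return True when ht is ordered before type index x and recursive index i."""
    if isinstance(ht, DefType):
        return True        # (deftype ≺ x, i) = true
    if isinstance(ht, TypeIdx):
        return ht.y < x    # (y ≺ x, i) = y < x
    if isinstance(ht, RecIdx):
        return ht.j < i    # (rec j ≺ x, i) = j < i
    raise TypeError("not a supertype form covered by this sketch")
```

For instance, `ordered_before(TypeIdx(3), 3, 0)` is false, which reflects that a sub type cannot declare itself as its own supertype.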

Subtyping¶

In a rolled-up recursive type, a recursive type index $rec i$ matches another heap type $ht$ if:

Let $sub {final}^{?} {{ht}^{'}}^{*} comptype$ be the sub type $C . recs [i]$ .
The heap type $ht$ is contained in ${{ht}^{'}}^{*}$ .

\frac{C . recs [i] = sub {final}^{?} ({ht}_{1}^{*} ht {ht}_{2}^{*}) comptype}{C ⊢ rec i \leq ht}

Note

This rule is only invoked when checking validity of rolled-up recursive types.

Results¶

Results can be classified by result types as follows.

Results ${val}^{*}$ ¶

For each value ${val}_{i}$ in ${val}^{*}$ :
- The value ${val}_{i}$ is valid with some value type $t_{i}$ .
Let $t^{*}$ be the concatenation of all $t_{i}$ .
Then the result is valid with result type $[t^{*}]$ .

\frac{(S ⊢ val : t)^{*}}{S ⊢ {val}^{*} : [t^{*}]}

Results $trap$ ¶

The result is valid with any valid closed result type $[t^{*}]$ .

\frac{⊢ [t^{*}] ok}{S ⊢ trap : [t^{*}]}

Store Validity¶

The following typing rules specify when a runtime store $S$ is valid. A valid store must consist of function, table, memory, global, and module instances that are themselves valid, relative to $S$ .

To that end, each kind of instance is classified by a respective function, table, memory, or global type. Module instances are classified by module contexts, which are regular contexts repurposed as module types describing the index spaces defined by a module.

Store $S$ ¶

Each function instance ${funcinst}_{i}$ in $S . funcs$ must be valid with some function type ${functype}_{i}$ .
Each table instance ${tableinst}_{i}$ in $S . tables$ must be valid with some table type ${tabletype}_{i}$ .
Each memory instance ${meminst}_{i}$ in $S . mems$ must be valid with some memory type ${memtype}_{i}$ .
Each global instance ${globalinst}_{i}$ in $S . globals$ must be valid with some global type ${globaltype}_{i}$ .
Each element instance ${eleminst}_{i}$ in $S . elems$ must be valid with some reference type ${reftype}_{i}$ .
Each data instance ${datainst}_{i}$ in $S . datas$ must be valid.
Each structure instance ${structinst}_{i}$ in $S . structs$ must be valid.
Each array instance ${arrayinst}_{i}$ in $S . arrays$ must be valid.
No reference to a bound structure address may be reachable from itself through a path consisting only of indirections through immutable structure or array fields.
No reference to a bound array address may be reachable from itself through a path consisting only of indirections through immutable structure or array fields.
Then the store is valid.

\begin{array}{r} \frac{\begin{array}{c} (S ⊢ funcinst : deftype)^{*} (S ⊢ tableinst : tabletype)^{*} \\ (S ⊢ meminst : memtype)^{*} (S ⊢ globalinst : globaltype)^{*} \\ (S ⊢ eleminst : reftype)^{*} (S ⊢ datainst ok)^{*} \\ (S ⊢ structinst ok)^{*} (S ⊢ arrayinst ok)^{*} \\ S = {\begin{cases} funcs {funcinst}^{*}, globals {globalinst}^{*}, tables {tableinst}^{*}, mems {meminst}^{*}, \\ elems {eleminst}^{*}, datas {datainst}^{*}, structs {structinst}^{*}, arrays {arrayinst}^{*} \end{cases}} \\ (S . structs [a_{s}] = structinst)^{*} ((ref . struct a_{s}) {≫̸}_{S}^{+} (ref . struct a_{s}))^{*} \\ (S . arrays [a_{a}] = arrayinst)^{*} ((ref . array a_{a}) {≫̸}_{S}^{+} (ref . array a_{a}))^{*} \end{array}}{⊢ S ok} \end{array}

where ${val}_{1} ≫_{S}^{+} {val}_{2}$ denotes the transitive closure of the following reachability relation on values:

\begin{array}{r} \begin{array}{lcll} (ref . struct a) & ≫_{S} & S . structs [a] . fields [i] & if expand (S . structs [a] . type) = struct {ft}_{1}^{i} (const st) {ft}_{2}^{*} \\ (ref . array a) & ≫_{S} & S . arrays [a] . fields [i] & if expand (S . arrays [a] . type) = array (const st) \\ (ref . extern ref) & ≫_{S} & ref \end{array} \end{array}

Note

The constraint on reachability through immutable fields prevents the presence of cyclic data structures that cannot be constructed in the language. Cycles can only be formed using mutation.
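
The reachability constraint can be pictured with a small Python sketch. It is only an illustration under assumed encodings: values are tuples such as `("ref.struct", a)`, the store is a dictionary with `structs` and `arrays` lists, and each instance carries hypothetical `const` flags that, in the specification, would be obtained by expanding the instance's type.

```python
def successors(store, val):
    """One step of the reachability relation: values held in immutable fields."""
    kind = val[0]
    if kind == "ref.struct":
        inst = store["structs"][val[1]]
        return [f for f, is_const in zip(inst["fields"], inst["const"]) if is_const]
    if kind == "ref.array":
        inst = store["arrays"][val[1]]
        return list(inst["fields"]) if inst["const"] else []
    if kind == "ref.extern":
        return [val[1]]  # an external reference reaches the reference it wraps
    return []            # numbers and other values reach nothing


def reaches_itself(store, start):
    """True if `start` is reachable from itself in one or more steps."""
    seen = set()
    stack = list(successors(store, start))
    while stack:
        v = stack.pop()
        if v == start:
            return True
        if v in seen:
            continue
        seen.add(v)
        stack.extend(successors(store, v))
    return False
```

Under this encoding, a store satisfies the constraint when `reaches_itself` is false for every bound structure and array address.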

Function Instances ${type functype, module moduleinst, code func}$ ¶

The function type $functype$ must be valid under an empty context.
The module instance $moduleinst$ must be valid with some context $C$ .
Under context $C$ :
- The function $func$ must be valid with some function type ${functype}^{'}$ .
- The function type ${functype}^{'}$ must match $functype$ .
Then the function instance is valid with function type $functype$ .

\begin{array}{r} \frac{\begin{array}{c} ⊢ functype ok S ⊢ moduleinst : C \\ C ⊢ func : {functype}^{'} C ⊢ {functype}^{'} \leq functype \end{array}}{S ⊢ {type functype, module moduleinst, code func} : functype} \end{array}

Host Function Instances ${type functype, hostcode hf}$ ¶

The function type $functype$ must be valid under an empty context.
Let $[t_{1}^{*}] \to [t_{2}^{*}]$ be the function type $functype$ .
For every valid store $S_{1}$ extending $S$ and every sequence ${val}^{*}$ of values whose types coincide with $t_{1}^{*}$ :
- Executing $hf$ in store $S_{1}$ with arguments ${val}^{*}$ has a non-empty set of possible outcomes.
- For every element $R$ of this set:
  - Either $R$ must be $⊥$ (i.e., divergence).
  - Or $R$ consists of a valid store $S_{2}$ extending $S_{1}$ and a result $result$ whose type coincides with $[t_{2}^{*}]$ .
Then the function instance is valid with function type $functype$ .

\begin{array}{r} \frac{\begin{array}{l} ⊢ [t_{1}^{*}] \to [t_{2}^{*}] ok \end{array} \begin{array}{l} \forall S_{1}, {val}^{*}, ⊢ S_{1} ok \land ⊢ S ⪯ S_{1} \land S_{1} ⊢ {val}^{*} : [t_{1}^{*}] ⟹ \\ hf (S_{1}; {val}^{*}) \supset \emptyset \land \\ \forall R \in hf (S_{1}; {val}^{*}), R = ⊥ \lor \\ \exists S_{2}, result, ⊢ S_{2} ok \land ⊢ S_{1} ⪯ S_{2} \land S_{2} ⊢ result : [t_{2}^{*}] \land R = (S_{2}; result) \end{array}}{S ⊢ {type [t_{1}^{*}] \to [t_{2}^{*}], hostcode hf} : [t_{1}^{*}] \to [t_{2}^{*}]} \end{array}

Note

This rule states that, if appropriate pre-conditions about store and arguments are satisfied, then executing the host function must satisfy appropriate post-conditions about store and results. The post-conditions match the ones in the execution rule for invoking host functions.

Any store under which the function is invoked is assumed to be an extension of the current store. That way, the function itself is able to make sufficient assumptions about future stores.

Table Instances ${type (limits t), elem {ref}^{*}}$ ¶

The table type $limits t$ must be valid under the empty context.
The length of ${ref}^{*}$ must equal $limits . \min$ .
For each reference ${ref}_{i}$ in the table’s elements ${ref}^{n}$ :
- The reference ${ref}_{i}$ must be valid with some reference type $t_{i}^{'}$ .
- The reference type $t_{i}^{'}$ must match the reference type $t$ .
Then the table instance is valid with table type $limits t$ .

\frac{⊢ limits t ok n = limits . \min (S ⊢ ref : t^{'})^{n} (⊢ t^{'} \leq t)^{n}}{S ⊢ {type (limits t), elem {ref}^{n}} : limits t}

Memory Instances ${type limits, data b^{*}}$ ¶

The memory type $limits$ must be valid under the empty context.
The length of $b^{*}$ must equal $limits . \min$ multiplied by the page size $64 Ki$ .
Then the memory instance is valid with memory type $limits$ .

\frac{⊢ limits ok n = limits . \min \cdot 64 Ki}{S ⊢ {type limits, data b^{n}} : limits}
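
The length condition of this rule amounts to simple arithmetic, sketched below. The function name and its arguments are assumptions made for illustration, and the sketch does not check that the limits themselves are valid.

```python
PAGE_SIZE = 64 * 1024  # 64 Ki bytes per page


def memory_length_ok(limits_min: int, data: bytes) -> bool:
    """Check n = limits.min * 64Ki for a memory instance with contents `data`."""
    return len(data) == limits_min * PAGE_SIZE
```

A memory instance whose limits have a minimum of one page must therefore hold exactly 65536 bytes.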

Global Instances ${type (mut t), value val}$ ¶

The global type $mut t$ must be valid under the empty context.
The value $val$ must be valid with some value type $t^{'}$ .
The value type $t^{'}$ must match the value type $t$ .
Then the global instance is valid with global type $mut t$ .

\frac{⊢ mut t ok S ⊢ val : t^{'} ⊢ t^{'} \leq t}{S ⊢ {type (mut t), value val} : mut t}

Element Instances ${type t, elem {ref}^{*}}$ ¶

The reference type $t$ must be valid under the empty context.
For each reference ${ref}_{i}$ in the elements ${ref}^{n}$ :
- The reference ${ref}_{i}$ must be valid with some reference type $t_{i}^{'}$ .
- The reference type $t_{i}^{'}$ must match the reference type $t$ .
Then the element instance is valid with reference type $t$ .

\frac{⊢ t ok (S ⊢ ref : t^{'})^{*} (⊢ t^{'} \leq t)^{*}}{S ⊢ {type t, elem {ref}^{*}} : t}

Data Instances ${data b^{*}}$ ¶

The data instance is valid.

\frac{}{S ⊢ {data b^{*}} ok}

Structure Instances ${type deftype, fields {fieldval}^{*}}$ ¶

The defined type $deftype$ must be valid.
The expansion of $deftype$ must be a structure type $struct {fieldtype}^{*}$ .
The length of the sequence of field values ${fieldval}^{*}$ must be the same as the length of the sequence of field types ${fieldtype}^{*}$ .
For each field value ${fieldval}_{i}$ in ${fieldval}^{*}$ and corresponding field type ${fieldtype}_{i}$ in ${fieldtype}^{*}$ :
- Let ${fieldtype}_{i}$ be $mut {storagetype}_{i}$ .
- The field value ${fieldval}_{i}$ must be valid with storage type ${storagetype}_{i}$ .
Then the structure instance is valid.

\frac{⊢ dt ok expand (dt) = struct (mut st)^{*} (S ⊢ fv : st)^{*}}{S ⊢ {type dt, fields {fv}^{*}} ok}

Array Instances ${type deftype, fields {fieldval}^{*}}$ ¶

The defined type $deftype$ must be valid.
The expansion of $deftype$ must be an array type $array fieldtype$ .
Let $fieldtype$ be $mut storagetype$ .
For each field value ${fieldval}_{i}$ in ${fieldval}^{*}$ :
- The field value ${fieldval}_{i}$ must be valid with storage type $storagetype$ .
Then the array instance is valid.

\frac{⊢ dt ok expand (dt) = array (mut st) (S ⊢ fv : st)^{*}}{S ⊢ {type dt, fields {fv}^{*}} ok}

Field Values $fieldval$ ¶

If $fieldval$ is a value $val$ , then:
- The value $val$ must be valid with value type $t$ .
- Then the field value is valid with value type $t$ .
Else, $fieldval$ is a packed value $packedval$ :
- Let $packedtype . pack i$ be the field value $fieldval$ .
- Then the field value is valid with packed type $packedtype$ .

\frac{}{S ⊢ pt . pack i : pt}

Export Instances ${name name, value externval}$ ¶

The external value $externval$ must be valid with some external type $externtype$ .
Then the export instance is valid.

\frac{S ⊢ externval : externtype}{S ⊢ {name name, value externval} ok}

Module Instances $moduleinst$ ¶

Each defined type ${deftype}_{i}$ in $moduleinst . types$ must be valid under the empty context.
For each function address ${funcaddr}_{i}$ in $moduleinst . funcaddrs$ , the external value $func {funcaddr}_{i}$ must be valid with some external type $func {functype}_{i}$ .
For each table address ${tableaddr}_{i}$ in $moduleinst . tableaddrs$ , the external value $table {tableaddr}_{i}$ must be valid with some external type $table {tabletype}_{i}$ .
For each memory address ${memaddr}_{i}$ in $moduleinst . memaddrs$ , the external value $mem {memaddr}_{i}$ must be valid with some external type $mem {memtype}_{i}$ .
For each global address ${globaladdr}_{i}$ in $moduleinst . globaladdrs$ , the external value $global {globaladdr}_{i}$ must be valid with some external type $global {globaltype}_{i}$ .
For each element address ${elemaddr}_{i}$ in $moduleinst . elemaddrs$ , the element instance $S . elems [{elemaddr}_{i}]$ must be valid with some reference type ${reftype}_{i}$ .
For each data address ${dataaddr}_{i}$ in $moduleinst . dataaddrs$ , the data instance $S . datas [{dataaddr}_{i}]$ must be valid.
Each export instance ${exportinst}_{i}$ in $moduleinst . exports$ must be valid.
For each export instance ${exportinst}_{i}$ in $moduleinst . exports$ , the name ${exportinst}_{i} . name$ must be different from any other name occurring in $moduleinst . exports$ .
Let ${deftype}^{*}$ be the concatenation of all ${deftype}_{i}$ in order.
Let ${functype}^{*}$ be the concatenation of all ${functype}_{i}$ in order.
Let ${tabletype}^{*}$ be the concatenation of all ${tabletype}_{i}$ in order.
Let ${memtype}^{*}$ be the concatenation of all ${memtype}_{i}$ in order.
Let ${globaltype}^{*}$ be the concatenation of all ${globaltype}_{i}$ in order.
Let ${reftype}^{*}$ be the concatenation of all ${reftype}_{i}$ in order.
Let $m$ be the length of $moduleinst . funcaddrs$ .
Let $n$ be the length of $moduleinst . dataaddrs$ .
Let $x^{*}$ be the sequence of function indices from $0$ to $m - 1$ .
Then the module instance is valid with context ${types {deftype}^{*}, funcs {functype}^{*}, tables {tabletype}^{*}, mems {memtype}^{*}, globals {globaltype}^{*}, elems {reftype}^{*}, datas {ok}^{n}, refs x^{*}}$ .

\begin{array}{r} \frac{\begin{array}{c} (⊢ deftype ok)^{*} \\ (S ⊢ func funcaddr : func functype)^{*} (S ⊢ table tableaddr : table tabletype)^{*} \\ (S ⊢ mem memaddr : mem memtype)^{*} (S ⊢ global globaladdr : global globaltype)^{*} \\ (S ⊢ S . elems [elemaddr] : reftype)^{*} (S ⊢ S . datas [dataaddr] ok)^{n} \\ (S ⊢ exportinst ok)^{*} (exportinst . name)^{*} disjoint \end{array}}{S ⊢ {\begin{cases} types & {deftype}^{*}, \\ funcaddrs & {funcaddr}^{*}, \\ tableaddrs & {tableaddr}^{*}, \\ memaddrs & {memaddr}^{*}, \\ globaladdrs & {globaladdr}^{*}, \\ elemaddrs & {elemaddr}^{*}, \\ dataaddrs & {dataaddr}^{n}, \\ exports & {exportinst}^{*} \end{cases}} : {\begin{cases} types & {deftype}^{*}, \\ funcs & {functype}^{*}, \\ tables & {tabletype}^{*}, \\ mems & {memtype}^{*}, \\ globals & {globaltype}^{*}, \\ elems & {reftype}^{*}, \\ datas & {ok}^{n}, \\ refs & 0 \dots (| {funcaddr}^{*} | - 1) \end{cases}}} \end{array}
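
Of the premises above, the disjointness of export names is the easiest to check mechanically. The following sketch assumes export instances are represented as dictionaries with a `name` key, which is an encoding chosen here for illustration only.

```python
def export_names_disjoint(exports):
    """True when no two export instances share the same name."""
    names = [e["name"] for e in exports]
    return len(names) == len(set(names))
```

The remaining premises require the typing judgements for external values and instances and are not covered by this sketch.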

Configuration Validity¶

To relate the WebAssembly type system to its execution semantics, the typing rules for instructions must be extended to configurations $S; T$ , which relate the store to execution threads.

Configurations and threads are classified by their result type. In addition to the store $S$ , threads are typed under a return type ${resulttype}^{?}$ , which controls whether and with which type a $return$ instruction is allowed. This type is absent ( $ϵ$ ) except for instruction sequences inside an administrative $frame$ instruction.

Finally, frames are classified with frame contexts, which extend the module contexts of a frame’s associated module instance with the locals that the frame contains.

Configurations $S; T$ ¶

The store $S$ must be valid.
Under no allowed return type, the thread $T$ must be valid with some result type $[t^{*}]$ .
Then the configuration is valid with the result type $[t^{*}]$ .

\frac{⊢ S ok S; ϵ ⊢ T : [t^{*}]}{⊢ S; T : [t^{*}]}

Threads $F; {instr}^{*}$ ¶

Let ${resulttype}^{?}$ be the current allowed return type.
The frame $F$ must be valid with a context $C$ .
Let $C^{'}$ be the same context as $C$ , but with $return$ set to ${resulttype}^{?}$ .
Under context $C^{'}$ , the instruction sequence ${instr}^{*}$ must be valid with some type $[] \to [t^{*}]$ .
Then the thread is valid with the result type $[t^{*}]$ .

\frac{S ⊢ F : C S; C, return {resulttype}^{?} ⊢ {instr}^{*} : [] \to [t^{*}]}{S; {resulttype}^{?} ⊢ F; {instr}^{*} : [t^{*}]}

Frames ${locals {val}^{*}, module moduleinst}$ ¶

The module instance $moduleinst$ must be valid with some module context $C$ .
Each value ${val}_{i}$ in ${val}^{*}$ must be valid with some value type $t_{i}$ .
Let $t^{*}$ be the concatenation of all $t_{i}$ in order.
Let $C^{'}$ be the same context as $C$ , but with the value types $t^{*}$ prepended to the $locals$ vector.
Then the frame is valid with frame context $C^{'}$ .

\frac{S ⊢ moduleinst : C (S ⊢ val : t)^{*}}{S ⊢ {locals {val}^{*}, module moduleinst} : (C, locals t^{*})}

Administrative Instructions¶

Typing rules for administrative instructions are specified as follows. In addition to the context $C$ , typing of these instructions is defined under a given store $S$ .

To that end, all previous typing judgements $C ⊢ prop$ are generalized to include the store, as in $S; C ⊢ prop$ , by implicitly adding $S$ to all rules – $S$ is never modified by the pre-existing rules, but it is accessed in the extra rules for administrative instructions given below.

$trap$ ¶

The instruction is valid with any valid instruction type of the form $[t_{1}^{*}] \to [t_{2}^{*}]$ .

\frac{C ⊢ [t_{1}^{*}] \to [t_{2}^{*}] ok}{S; C ⊢ trap : [t_{1}^{*}] \to [t_{2}^{*}]}

$val$ ¶

The value $val$ must be valid with value type $t$ .
Then it is valid as an instruction with type $[] \to [t]$ .

\frac{S ⊢ val : t}{S; C ⊢ val : [] \to [t]}

$invoke funcaddr$ ¶

The external function value $func funcaddr$ must be valid with external function type $func functype$ .
Let $[t_{1}^{*}] \to [t_{2}^{*}]$ be the function type $functype$ .
Then the instruction is valid with type $[t_{1}^{*}] \to [t_{2}^{*}]$ .

\frac{S ⊢ func funcaddr : func [t_{1}^{*}] \to [t_{2}^{*}]}{S; C ⊢ invoke funcaddr : [t_{1}^{*}] \to [t_{2}^{*}]}

${label}_{n} {{instr}_{0}^{*}} {instr}^{*} end$ ¶

The instruction sequence ${instr}_{0}^{*}$ must be valid with some type $[t_{1}^{n}] \to_{x^{*}} [t_{2}^{*}]$ .
Let $C^{'}$ be the same context as $C$ , but with the result type $[t_{1}^{n}]$ prepended to the $labels$ vector.
Under context $C^{'}$ , the instruction sequence ${instr}^{*}$ must be valid with type $[] \to_{{x^{'}}^{*}} [t_{2}^{*}]$ .
Then the compound instruction is valid with type $[] \to [t_{2}^{*}]$ .

\frac{S; C ⊢ {instr}_{0}^{*} : [t_{1}^{n}] \to_{x^{*}} [t_{2}^{*}] S; C, labels [t_{1}^{n}] ⊢ {instr}^{*} : [] \to_{{x^{'}}^{*}} [t_{2}^{*}]}{S; C ⊢ {label}_{n} {{instr}_{0}^{*}} {instr}^{*} end : [] \to [t_{2}^{*}]}

${frame}_{n} {F} {instr}^{*} end$ ¶

Under the valid return type $[t^{n}]$ , the thread $F; {instr}^{*}$ must be valid with result type $[t^{n}]$ .
Then the compound instruction is valid with type $[] \to [t^{n}]$ .

\frac{C ⊢ [t^{n}] ok S; [t^{n}] ⊢ F; {instr}^{*} : [t^{n}]}{S; C ⊢ {frame}_{n} {F} {instr}^{*} end : [] \to [t^{n}]}

Store Extension¶

Programs can mutate the store and its contained instances. Any such modification must respect certain invariants, such as not removing allocated instances or changing immutable definitions. While these invariants are inherent to the execution semantics of WebAssembly instructions and modules, host functions do not automatically adhere to them. Consequently, the required invariants must be stated as explicit constraints on the invocation of host functions. Soundness only holds when the embedder ensures these constraints.

The necessary constraints are codified by the notion of store extension: a store state $S^{'}$ extends state $S$ , written $S ⪯ S^{'}$ , when the following rules hold.

Note

Extension does not imply that the new store is valid, which is defined separately above.

Store $S$ ¶

The length of $S . funcs$ must not shrink.
The length of $S . tables$ must not shrink.
The length of $S . mems$ must not shrink.
The length of $S . globals$ must not shrink.
The length of $S . elems$ must not shrink.
The length of $S . datas$ must not shrink.
The length of $S . structs$ must not shrink.
The length of $S . arrays$ must not shrink.
For each function instance ${funcinst}_{i}$ in the original $S . funcs$ , the new function instance must be an extension of the old.
For each table instance ${tableinst}_{i}$ in the original $S . tables$ , the new table instance must be an extension of the old.
For each memory instance ${meminst}_{i}$ in the original $S . mems$ , the new memory instance must be an extension of the old.
For each global instance ${globalinst}_{i}$ in the original $S . globals$ , the new global instance must be an extension of the old.
For each element instance ${eleminst}_{i}$ in the original $S . elems$ , the new element instance must be an extension of the old.
For each data instance ${datainst}_{i}$ in the original $S . datas$ , the new data instance must be an extension of the old.
For each structure instance ${structinst}_{i}$ in the original $S . structs$ , the new structure instance must be an extension of the old.
For each array instance ${arrayinst}_{i}$ in the original $S . arrays$ , the new array instance must be an extension of the old.

\begin{array}{r} \frac{\begin{array}{ccc} S_{1} . funcs = {funcinst}_{1}^{*} & S_{2} . funcs = {{funcinst}_{1}^{'}}^{*} {funcinst}_{2}^{*} & (⊢ {funcinst}_{1} ⪯ {funcinst}_{1}^{'})^{*} \\ S_{1} . tables = {tableinst}_{1}^{*} & S_{2} . tables = {{tableinst}_{1}^{'}}^{*} {tableinst}_{2}^{*} & (⊢ {tableinst}_{1} ⪯ {tableinst}_{1}^{'})^{*} \\ S_{1} . mems = {meminst}_{1}^{*} & S_{2} . mems = {{meminst}_{1}^{'}}^{*} {meminst}_{2}^{*} & (⊢ {meminst}_{1} ⪯ {meminst}_{1}^{'})^{*} \\ S_{1} . globals = {globalinst}_{1}^{*} & S_{2} . globals = {{globalinst}_{1}^{'}}^{*} {globalinst}_{2}^{*} & (⊢ {globalinst}_{1} ⪯ {globalinst}_{1}^{'})^{*} \\ S_{1} . elems = {eleminst}_{1}^{*} & S_{2} . elems = {{eleminst}_{1}^{'}}^{*} {eleminst}_{2}^{*} & (⊢ {eleminst}_{1} ⪯ {eleminst}_{1}^{'})^{*} \\ S_{1} . datas = {datainst}_{1}^{*} & S_{2} . datas = {{datainst}_{1}^{'}}^{*} {datainst}_{2}^{*} & (⊢ {datainst}_{1} ⪯ {datainst}_{1}^{'})^{*} \\ S_{1} . structs = {structinst}_{1}^{*} & S_{2} . structs = {{structinst}_{1}^{'}}^{*} {structinst}_{2}^{*} & (⊢ {structinst}_{1} ⪯ {structinst}_{1}^{'})^{*} \\ S_{1} . arrays = {arrayinst}_{1}^{*} & S_{2} . arrays = {{arrayinst}_{1}^{'}}^{*} {arrayinst}_{2}^{*} & (⊢ {arrayinst}_{1} ⪯ {arrayinst}_{1}^{'})^{*} \end{array}}{⊢ S_{1} ⪯ S_{2}} \end{array}
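
The shape of this rule, a prefix-wise comparison per store component, can be sketched as follows. The dictionary encoding of stores and the `inst_extends` table of per-component predicates are assumptions of this sketch; the predicates stand for the instance-level extension rules given in the remainder of this section.

```python
COMPONENTS = ("funcs", "tables", "mems", "globals",
              "elems", "datas", "structs", "arrays")


def store_extends(s1, s2, inst_extends):
    """Check S1 ⪯ S2 for stores given as dicts from component name to a list.

    `inst_extends[kind](old, new)` decides extension for one instance.
    """
    for kind in COMPONENTS:
        old, new = s1[kind], s2[kind]
        if len(new) < len(old):
            return False  # the component shrank
        # zip stops at len(old), so only pre-existing instances are compared;
        # newly allocated instances at the end are unconstrained here.
        if not all(inst_extends[kind](o, n) for o, n in zip(old, new)):
            return False
    return True
```

Note that, like the rule it mirrors, this check says nothing about whether the new store is valid.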

Function Instance $funcinst$ ¶

A function instance must remain unchanged.

\frac{}{⊢ funcinst ⪯ funcinst}

Table Instance $tableinst$ ¶

The table type $tableinst . type$ must remain unchanged.
The length of $tableinst . elem$ must not shrink.

\frac{n_{1} \leq n_{2}}{⊢ {type tt, elem ({fa}_{1}^{?})^{n_{1}}} ⪯ {type tt, elem ({fa}_{2}^{?})^{n_{2}}}}

Memory Instance $meminst$ ¶

The memory type $meminst . type$ must remain unchanged.
The length of $meminst . data$ must not shrink.

\frac{n_{1} \leq n_{2}}{⊢ {type mt, data b_{1}^{n_{1}}} ⪯ {type mt, data b_{2}^{n_{2}}}}
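
The table and memory rules share the same shape: the type is fixed and the contents may only grow. A minimal sketch, assuming instances are dictionaries with `type` and `elem` or `data` entries (an encoding chosen for illustration):

```python
def table_extends(old, new):
    """Table type unchanged and the element vector did not shrink."""
    return old["type"] == new["type"] and len(old["elem"]) <= len(new["elem"])


def mem_extends(old, new):
    """Memory type unchanged and the byte vector did not shrink."""
    return old["type"] == new["type"] and len(old["data"]) <= len(new["data"])
```

The contents themselves are unconstrained by extension, since tables and memories are mutable.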

Global Instance $globalinst$ ¶

The global type $globalinst . type$ must remain unchanged.
Let $mut t$ be the structure of $globalinst . type$ .
If $mut$ is $const$ , then the value $globalinst . value$ must remain unchanged.

\frac{mut = var \lor {val}_{1} = {val}_{2}}{⊢ {type (mut t), value {val}_{1}} ⪯ {type (mut t), value {val}_{2}}}
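
As a sketch of this rule, assume a global instance is a dictionary whose `type` is a pair of a mutability string (`"const"` or `"var"`) and a value type. This encoding is hypothetical and used only for illustration.

```python
def global_extends(old, new):
    """Global type unchanged; the value may change only if the global is mutable."""
    if old["type"] != new["type"]:
        return False
    mut, _valtype = old["type"]
    return mut == "var" or old["value"] == new["value"]
```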

Element Instance $eleminst$ ¶

The reference type $eleminst . type$ must remain unchanged.
The vector $eleminst . elem$ must:
- either remain unchanged,
- or shrink to length $0$ .

\frac{}{⊢ {type t, elem a^{*}} ⪯ {type t, elem a^{*}}}

\frac{}{⊢ {type t, elem a^{*}} ⪯ {type t, elem ϵ}}

Data Instance $datainst$ ¶

The vector $datainst . data$ must:
- either remain unchanged,
- or shrink to length $0$ .

\frac{}{⊢ {data b^{*}} ⪯ {data b^{*}}}

\frac{}{⊢ {data b^{*}} ⪯ {data ϵ}}
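
Element and data instances follow the same pattern: either nothing changes or the contents are dropped entirely. The following sketch assumes a dictionary encoding of instances, which is not prescribed by the specification.

```python
def elem_extends(old, new):
    """Reference type unchanged; elements unchanged or dropped to empty."""
    return old["type"] == new["type"] and (
        new["elem"] == old["elem"] or len(new["elem"]) == 0
    )


def data_extends(old, new):
    """Bytes unchanged or dropped to empty."""
    return new["data"] == old["data"] or len(new["data"]) == 0
```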

Structure Instance $structinst$ ¶

The defined type $structinst . type$ must remain unchanged.
Assert: due to store well-formedness, the expansion of $structinst . type$ is a structure type.
Let $struct {fieldtype}^{*}$ be the expansion of $structinst . type$ .
The length of the vector $structinst . fields$ must remain unchanged.
Assert: due to store well-formedness, the length of $structinst . fields$ is the same as the length of ${fieldtype}^{*}$ .
For each field value ${fieldval}_{i}$ in $structinst . fields$ and corresponding field type ${fieldtype}_{i}$ in ${fieldtype}^{*}$ :
- Let ${mut}_{i} {st}_{i}$ be the structure of ${fieldtype}_{i}$ .
- If ${mut}_{i}$ is $const$ , then the field value ${fieldval}_{i}$ must remain unchanged.

\frac{(mut = var \lor {fieldval}_{1} = {fieldval}_{2})^{*}}{⊢ {type (mut st)^{*}, fields {fieldval}_{1}^{*}} ⪯ {type (mut st)^{*}, fields {fieldval}_{2}^{*}}}
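
The per-field condition can be sketched as below. The argument `field_muts` is a hypothetical list of `"const"` or `"var"` strings standing for the mutabilities obtained by expanding the structure's type; by store well-formedness it is assumed to have the same length as the field vector.

```python
def struct_extends(field_muts, old, new):
    """Type and field count unchanged; immutable fields keep their values."""
    if old["type"] != new["type"]:
        return False
    if len(old["fields"]) != len(new["fields"]):
        return False
    return all(
        mut == "var" or a == b
        for mut, a, b in zip(field_muts, old["fields"], new["fields"])
    )
```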

Array Instance $arrayinst$ ¶

The defined type $arrayinst . type$ must remain unchanged.
Assert: due to store well-formedness, the expansion of $arrayinst . type$ is an array type.
Let $array fieldtype$ be the expansion of $arrayinst . type$ .
The length of the vector $arrayinst . fields$ must remain unchanged.
Let $mut st$ be the structure of $fieldtype$ .
If $mut$ is $const$ , then the sequence of field values $arrayinst . fields$ must remain unchanged.

\frac{mut = var \lor {fieldval}_{1}^{*} = {fieldval}_{2}^{*}}{⊢ {type (mut st), fields {fieldval}_{1}^{*}} ⪯ {type (mut st), fields {fieldval}_{2}^{*}}}

Theorems¶

Given the definition of valid configurations, the standard soundness theorems hold. [2] [3]

Theorem (Preservation). If a configuration $S; T$ is valid with result type $[t^{*}]$ (i.e., $⊢ S; T : [t^{*}]$ ), and steps to $S^{'}; T^{'}$ (i.e., $S; T ↪ S^{'}; T^{'}$ ), then $S^{'}; T^{'}$ is a valid configuration with the same result type (i.e., $⊢ S^{'}; T^{'} : [t^{*}]$ ). Furthermore, $S^{'}$ is an extension of $S$ (i.e., $⊢ S ⪯ S^{'}$ ).

A terminal thread is one whose sequence of instructions is a result. A terminal configuration is a configuration whose thread is terminal.

Theorem (Progress). If a configuration $S; T$ is valid (i.e., $⊢ S; T : [t^{*}]$ for some result type $[t^{*}]$ ), then either it is terminal, or it can step to some configuration $S^{'}; T^{'}$ (i.e., $S; T ↪ S^{'}; T^{'}$ ).

From Preservation and Progress the soundness of the WebAssembly type system follows directly.

Corollary (Soundness). If a configuration $S; T$ is valid (i.e., $⊢ S; T : [t^{*}]$ for some result type $[t^{*}]$ ), then it either diverges or takes a finite number of steps to reach a terminal configuration $S^{'}; T^{'}$ (i.e., $S; T ↪^{*} S^{'}; T^{'}$ ) that is valid with the same result type (i.e., $⊢ S^{'}; T^{'} : [t^{*}]$ ) and where $S^{'}$ is an extension of $S$ (i.e., $⊢ S ⪯ S^{'}$ ).

In other words, every thread in a valid configuration either runs forever, traps, or terminates with a result that has the expected type. Consequently, given a valid store, no computation defined by instantiation or invocation of a valid module can “crash” or otherwise (mis)behave in ways not covered by the execution semantics given in this specification.
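
How Preservation and Progress combine can be illustrated with a generic driver loop. Everything in this sketch is hypothetical: `step`, `is_terminal`, and `type_of` are caller-supplied stand-ins for the reduction relation, the terminal check, and configuration typing, and `fuel` merely bounds the loop so that a diverging run can be cut off.

```python
def run_checked(config, step, is_terminal, type_of, fuel=1000):
    """Step a valid configuration, asserting both theorems along the way.

    Returns the terminal configuration, or None if `fuel` runs out
    (which this sketch treats as possible divergence).
    """
    expected = type_of(config)
    for _ in range(fuel):
        if is_terminal(config):
            return config
        nxt = step(config)
        if nxt is None:
            raise AssertionError("Progress violated: stuck, non-terminal configuration")
        if type_of(nxt) != expected:
            raise AssertionError("Preservation violated: result type changed")
        config = nxt
    return None
```

For a sound type system and a valid starting configuration, neither assertion in this loop can fail.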

Type System Properties¶

Principal Types¶

The type system of WebAssembly features both subtyping and simple forms of polymorphism for instruction types. That has the effect that every instruction or instruction sequence can be classified with multiple different instruction types.

However, the typing rules still allow deriving principal types for instruction sequences. That is, every valid instruction sequence has one particular type scheme, possibly containing some unconstrained placeholder type variables, that is a subtype of all its valid instruction types, after substituting its type variables with suitable specific types.

Moreover, when deriving an instruction type in a “forward” manner, i.e., the input of the instruction sequence is already fixed to specific types, then it has a principal output type expressible without type variables, up to a possibly polymorphic stack bottom representable with one single variable. In other words, “forward” principal types are effectively closed.

Note

For example, in isolation, the instruction $ref . as_non_null$ has the type $[(ref null ht)] \to [(ref ht)]$ for any choice of valid heap type $ht$ . Moreover, if the input type $[(ref null ht)]$ is already determined, i.e., a specific $ht$ is given, then the output type $[(ref ht)]$ is fully determined as well.

The implication of the latter property is that a validator for complete instruction sequences (as they occur in valid modules) can be implemented with a simple left-to-right algorithm that does not require the introduction of type variables.
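
A minimal sketch of such a left-to-right algorithm is given below for a tiny, hand-picked subset of instructions (`i32.const`, `i64.const`, `i32.add`, `drop`, `unreachable`). The tuple encoding of instructions and the function name are assumptions of this sketch, and subtyping is ignored because the subset only uses number types. The polymorphic stack bottom is tracked with a single flag rather than a type variable.

```python
def forward_type(instrs, inputs):
    """Type a complete instruction sequence from a closed input type.

    Returns (poly, outputs): `poly` is True when the stack bottom is
    polymorphic, `outputs` is the closed list of value types above it.
    Raises ValueError if the sequence is invalid.
    """
    stack = list(inputs)
    poly = False  # becomes True after `unreachable`

    def pop(expected=None):
        if stack:
            actual = stack.pop()
        elif poly:
            # Operand is drawn from the polymorphic bottom, so it fits any type.
            return expected
        else:
            raise ValueError("stack underflow")
        if expected is not None and actual != expected:
            raise ValueError(f"expected {expected}, got {actual}")
        return actual

    for instr in instrs:
        op = instr[0]
        if op in ("i32.const", "i64.const"):
            stack.append(op.split(".")[0])
        elif op == "i32.add":
            pop("i32")
            pop("i32")
            stack.append("i32")
        elif op == "drop":
            pop()
        elif op == "unreachable":
            stack.clear()
            poly = True
        else:
            raise ValueError(f"unsupported instruction {op}")
    return poly, stack
```

No type variable is ever introduced: the only open part of the result is the stack bottom, represented here by the `poly` flag.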

A typing algorithm capable of handling partial instruction sequences (as might be considered for program analysis or program manipulation) needs to introduce type variables and perform substitutions, but it does not need to perform backtracking or record any non-syntactic constraints on these type variables.

Technically, the syntax of heap, value, and result types can be enriched with type variables as follows:

\begin{array}{r} \begin{array}{llll} null & ::= & {null}^{?} | α_{null} \\ heaptype & ::= & \dots | α_{heaptype} \\ reftype & ::= & ref null heaptype \\ valtype & ::= & \dots | α_{valtype} | α_{numvectype} \\ resulttype & ::= & [α_{{valtype}^{*}}^{?} {valtype}^{*}] \end{array} \end{array}

where each $α_{xyz}$ ranges over a set of type variables for syntactic class $xyz$ , respectively. The special class $numvectype$ is defined as $numtype | vectype | bot$ , and is only needed to handle unannotated $select$ instructions.

A type is closed when it does not contain any type variables, and open otherwise. A type substitution $σ$ is a finite mapping from type variables to closed types of the respective syntactic class. When applied to an open type, it replaces the type variables $α$ from its domain with the respective $σ (α)$ .
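The notions of open type, closed type, and substitution can be sketched in a few lines of Python. The classes `Var` and `SeqVar` and the functions `substitute` and `is_closed` are hypothetical names chosen for this illustration. The sketch assumes that result types are flat lists whose entries are either closed value types (plain strings), a single value type variable, or a variable standing for a whole sequence of value types, corresponding loosely to $α_{valtype}$ and $α_{{valtype}^{*}}$.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass(frozen=True)
class Var:
    """Hypothetical variable standing for one value type."""
    name: str


@dataclass(frozen=True)
class SeqVar:
    """Hypothetical variable standing for a sequence of value types."""
    name: str


ValType = Union[str, Var, SeqVar]
Subst = Dict[Union[Var, SeqVar], Union[str, List[str]]]


def is_closed(result_type: List[ValType]) -> bool:
    """A type is closed when it contains no type variables."""
    return not any(isinstance(t, (Var, SeqVar)) for t in result_type)


def substitute(sigma: Subst, result_type: List[ValType]) -> List[ValType]:
    """Replace every variable in the domain of sigma by its closed image;
    variables outside the domain are left untouched."""
    out: List[ValType] = []
    for t in result_type:
        if isinstance(t, SeqVar) and t in sigma:
            out.extend(sigma[t])  # a sequence variable expands in place
        elif isinstance(t, Var) and t in sigma:
            out.append(sigma[t])
        else:
            out.append(t)
    return out
```

Applying the substitution `{SeqVar("a"): ["i64"], Var("t"): "i32"}` to the open type `[SeqVar("a"), Var("t"), "f32"]` gives the closed type `["i64", "i32", "f32"]`.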

Theorem (Principal Types). If an instruction sequence ${instr}^{*}$ is valid with some closed instruction type $instrtype$ (i.e., $C ⊢ {instr}^{*} : instrtype$ ), then it is also valid with a possibly open instruction type ${instrtype}_{min}$ (i.e., $C ⊢ {instr}^{*} : {instrtype}_{min}$ ), such that for every closed type ${instrtype}^{'}$ with which ${instr}^{*}$ is valid (i.e., for all $C ⊢ {instr}^{*} : {instrtype}^{'}$ ), there exists a substitution $σ$ , such that $σ ({instrtype}_{min})$ is a subtype of ${instrtype}^{'}$ (i.e., $C ⊢ σ ({instrtype}_{min}) \leq {instrtype}^{'}$ ). Furthermore, ${instrtype}_{min}$ is unique up to the choice of type variables.

Theorem (Closed Principal Forward Types). If closed input type $[t_{1}^{*}]$ is given and the instruction sequence ${instr}^{*}$ is valid with instruction type $[t_{1}^{*}] \to_{x^{*}} [t_{2}^{*}]$ (i.e., $C ⊢ {instr}^{*} : [t_{1}^{*}] \to_{x^{*}} [t_{2}^{*}]$ ), then it is also valid with instruction type $[t_{1}^{*}] \to_{x^{*}} [α_{{valtype}^{*}} t^{*}]$ (i.e., $C ⊢ {instr}^{*} : [t_{1}^{*}] \to_{x^{*}} [α_{{valtype}^{*}} t^{*}]$ ), where all $t^{*}$ are closed, such that for every closed result type $[{t_{2}^{'}}^{*}]$ with which ${instr}^{*}$ is valid (i.e., for all $C ⊢ {instr}^{*} : [t_{1}^{*}] \to_{x^{*}} [{t_{2}^{'}}^{*}]$ ), there exists a substitution $σ$ , such that $[{t_{2}^{'}}^{*}] = [σ (α_{{valtype}^{*}}) t^{*}]$ .

Type Lattice¶

The Principal Types property depends on the existence of a greatest lower bound for any pair of types.

Theorem (Greatest Lower Bounds for Value Types). For any two value types $t_{1}$ and $t_{2}$ that are valid (i.e., $C ⊢ t_{1} ok$ and $C ⊢ t_{2} ok$ ), there exists a valid value type $t$ that is a subtype of both $t_{1}$ and $t_{2}$ (i.e., $C ⊢ t ok$ and $C ⊢ t \leq t_{1}$ and $C ⊢ t \leq t_{2}$ ), such that every valid value type $t^{'}$ that also is a subtype of both $t_{1}$ and $t_{2}$ (i.e., for all $C ⊢ t^{'} ok$ and $C ⊢ t^{'} \leq t_{1}$ and $C ⊢ t^{'} \leq t_{2}$ ), is a subtype of $t$ (i.e., $C ⊢ t^{'} \leq t$ ).

Note

The greatest lower bound of two types may be $bot$ .

Theorem (Conditional Least Upper Bounds for Value Types). Any two value types $t_{1}$ and $t_{2}$ that are valid (i.e., $C ⊢ t_{1} ok$ and $C ⊢ t_{2} ok$ ) either have no common supertype, or there exists a valid value type $t$ that is a supertype of both $t_{1}$ and $t_{2}$ (i.e., $C ⊢ t ok$ and $C ⊢ t_{1} \leq t$ and $C ⊢ t_{2} \leq t$ ), such that every valid value type $t^{'}$ that also is a supertype of both $t_{1}$ and $t_{2}$ (i.e., for all $C ⊢ t^{'} ok$ and $C ⊢ t_{1} \leq t^{'}$ and $C ⊢ t_{2} \leq t^{'}$ ), is a supertype of $t$ (i.e., $C ⊢ t \leq t^{'}$ ).

Note

If a top type were added to the type system, a least upper bound would exist for any two types.

Corollary (Type Lattice). Assuming the addition of a provisional top type, value types form a lattice with respect to their subtype relation.

Finally, value types can be partitioned into multiple disjoint hierarchies that are not related by subtyping, except through $bot$ .

Theorem (Disjoint Subtype Hierarchies). The greatest lower bound of two value types is $bot$ or $ref bot$ if and only if they do not have a least upper bound.

In other words, types that do not have common supertypes do not have common subtypes either (other than $bot$ or $ref bot$ ), and vice versa.

Note

Types from disjoint hierarchies can safely be represented in mutually incompatible ways in an implementation, because their values can never flow to the same place.
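The lattice properties stated in this section can be explored by brute force on a small example. The Python sketch below uses a hypothetical, heavily simplified hierarchy (the table `PARENTS`) that conflates heap types with reference types and ignores nullability; it is not the actual WebAssembly subtype relation. The functions `glb` and `lub` are likewise illustrative names. The final check only ranges over types other than the two bottom types themselves, which is a restriction chosen for this sketch.

```python
from typing import Dict, List, Optional, Set

# Hypothetical toy hierarchy: each type maps to its direct supertypes.
PARENTS: Dict[str, List[str]] = {
    "bot": ["i32", "i64", "refbot"],
    "refbot": ["none", "nofunc"],
    "none": ["i31", "struct"],
    "i31": ["eq"],
    "struct": ["eq"],
    "eq": ["any"],
    "nofunc": ["func"],
    "i32": [], "i64": [], "any": [], "func": [],
}
TYPES = list(PARENTS)


def supertypes(t: str) -> Set[str]:
    """Reflexive-transitive closure of the direct supertype edges."""
    seen, todo = {t}, [t]
    while todo:
        for p in PARENTS[todo.pop()]:
            if p not in seen:
                seen.add(p)
                todo.append(p)
    return seen


def is_subtype(a: str, b: str) -> bool:
    return b in supertypes(a)


def glb(a: str, b: str) -> str:
    """Greatest lower bound; it always exists here since bot is below everything."""
    lower = [t for t in TYPES if is_subtype(t, a) and is_subtype(t, b)]
    return next(t for t in lower if all(is_subtype(u, t) for u in lower))


def lub(a: str, b: str) -> Optional[str]:
    """Least upper bound, or None if a and b have no common supertype."""
    upper = [t for t in TYPES if is_subtype(a, t) and is_subtype(b, t)]
    least = [t for t in upper if all(is_subtype(t, u) for u in upper)]
    return least[0] if least else None


def hierarchies_disjoint() -> bool:
    """Check, for non-bottom types, that the greatest lower bound is a
    bottom type exactly when no least upper bound exists."""
    proper = [t for t in TYPES if t not in ("bot", "refbot")]
    return all(
        (glb(a, b) in ("bot", "refbot")) == (lub(a, b) is None)
        for a in proper for b in proper
    )
```

In this toy hierarchy, `i31` and `struct` share the lower bound `none` and the upper bound `eq`, whereas `i32` and `i64` meet only at `bot` and have no common supertype.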

Compositionality¶

Valid instruction sequences can be freely composed, as long as their types match up.

Theorem (Composition). If two instruction sequences ${instr}_{1}^{*}$ and ${instr}_{2}^{*}$ are valid with types $[t_{1}^{*}] \to_{x_{1}^{*}} [t^{*}]$ and $[t^{*}] \to_{x_{2}^{*}} [t_{2}^{*}]$ , respectively (i.e., $C ⊢ {instr}_{1}^{*} : [t_{1}^{*}] \to_{x_{1}^{*}} [t^{*}]$ and $C ⊢ {instr}_{2}^{*} : [t^{*}] \to_{x_{2}^{*}} [t_{2}^{*}]$ ), then the concatenated instruction sequence $({instr}_{1}^{*} {instr}_{2}^{*})$ is valid with type $[t_{1}^{*}] \to_{x_{1}^{*} x_{2}^{*}} [t_{2}^{*}]$ (i.e., $C ⊢ {instr}_{1}^{*} {instr}_{2}^{*} : [t_{1}^{*}] \to_{x_{1}^{*} x_{2}^{*}} [t_{2}^{*}]$ ).

Note

More generally, instead of a shared type $[t^{*}]$ , it suffices if the output type of ${instr}_{1}^{*}$ is a subtype of the input type of ${instr}_{2}^{*}$ , since the subtype can always be weakened to its supertype by subsumption.
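The Composition theorem can be read as a recipe for combining two instruction types. The following Python sketch models an instruction type $[t_{1}^{*}] \to_{x^{*}} [t_{2}^{*}]$ as a small record and composes two of them. The class `InstrType`, its field names, and the function `compose` are hypothetical and exist only for this illustration. The sketch assumes the simple case of the theorem, where the intermediate type must match exactly; weakening by subsumption is not modeled.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class InstrType:
    """Hypothetical model of [inputs] ->_{locals_init} [outputs]."""
    inputs: Tuple[str, ...]
    locals_init: Tuple[int, ...]  # indices of locals that become initialized
    outputs: Tuple[str, ...]


def compose(first: InstrType, second: InstrType) -> Optional[InstrType]:
    """Type of running `first` followed by `second`, or None if the output
    of `first` does not equal the input of `second`."""
    if first.outputs != second.inputs:
        return None
    return InstrType(
        inputs=first.inputs,
        # The local index lists are concatenated, as in x1* x2*.
        locals_init=first.locals_init + second.locals_init,
        outputs=second.outputs,
    )
```

For example, composing `InstrType(("i32",), (0,), ("i64",))` with `InstrType(("i64",), (1,), ("f32",))` yields `InstrType(("i32",), (0, 1), ("f32",))`.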

Inversely, valid instruction sequences can also freely be decomposed, that is, splitting them anywhere produces two instruction sequences that are both valid.

Theorem (Decomposition). If an instruction sequence ${instr}^{*}$ that is valid with type $[t_{1}^{*}] \to_{x^{*}} [t_{2}^{*}]$ (i.e., $C ⊢ {instr}^{*} : [t_{1}^{*}] \to_{x^{*}} [t_{2}^{*}]$ ) is split into two instruction sequences ${instr}_{1}^{*}$ and ${instr}_{2}^{*}$ at any point (i.e., ${instr}^{*} = {instr}_{1}^{*} {instr}_{2}^{*}$ ), then these are separately valid with some types $[t_{1}^{*}] \to_{x_{1}^{*}} [t^{*}]$ and $[t^{*}] \to_{x_{2}^{*}} [t_{2}^{*}]$ , respectively (i.e., $C ⊢ {instr}_{1}^{*} : [t_{1}^{*}] \to_{x_{1}^{*}} [t^{*}]$ and $C ⊢ {instr}_{2}^{*} : [t^{*}] \to_{x_{2}^{*}} [t_{2}^{*}]$ ), where $x^{*} = x_{1}^{*} x_{2}^{*}$ .

Note

This property holds because validation is required even for unreachable code. Without that, ${instr}_{2}^{*}$ might not be valid in isolation.
