Type Soundness¶
The type system of WebAssembly is sound, implying both type safety and memory safety with respect to the WebAssembly semantics. For example:
All types declared and derived during validation are respected at run time; e.g., every local or global variable will only contain type-correct values, every instruction will only be applied to operands of the expected type, and every function invocation always evaluates to a result of the right type (if it does not trap or diverge).
No memory location will be read or written except those explicitly defined by the program, i.e., as a local, a global, an element in a table, or a location within a linear memory.
There is no undefined behavior, i.e., the execution rules cover all possible cases that can occur in a valid program, and the rules are mutually consistent.
Soundness is also instrumental in ensuring additional properties, most notably, encapsulation of function and module scopes: no locals can be accessed outside their own function and no module components can be accessed outside their own module unless they are explicitly exported or imported.
The typing rules defining WebAssembly validation only cover the static components of a WebAssembly program. In order to state and prove soundness precisely, the typing rules must be extended to the dynamic components of the abstract runtime, that is, the store, configurations, and administrative instructions. [1]
Contexts¶
In order to check rolled-up recursive types, the context is locally extended with an additional component that records the sub type corresponding to each recursive type index within the current recursive type.
Types¶
Well-formedness for extended type forms is defined as follows.
Heap Type ¶
The heap type is valid.
Heap Type ¶
The recursive type index must exist in the context.
Then the heap type is valid.
Value Type ¶
The value type is valid.
Recursive Types ¶
Let
be the current context , but where is .There must be a type index
, such that for each sub type in :Under the context
, the sub type must be valid for type index and recursive type index .
Then the recursive type is valid for the type index
.
Note
These rules are a generalisation of the ones previously given.
Sub types ¶
The composite type
must be valid.The sequence
may be no longer than .For every heap type
in :The heap type
must be ordered before a type index and recursive type index a , meaning:Either
is a defined type.Or
is a type index that is smaller than .Or
is a recursive type index where is smaller than .
Let sub type
be the unrolling of the heap type , meaning:Either
is a defined type , then must be the unrolling of .Or
is a type index , then must be the unrolling of the defined type .Or
is a recursive type index , then must be .
The sub type
must not contain .Let
be the composite type in .The composite type
must match .
Then the sub type is valid for the type index
and recursive type index .
where:
Note
This rule is a generalisation of the ones previously given, which only allowed type indices as supertypes.
Subtyping¶
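In a rolled-up recursive type, a recursive type index matches a heap type if:
Let the sub type be the one recorded in the context for that recursive type index.
The heap type is contained in the supertypes declared by that sub type.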
Note
This rule is only invoked when checking validity of rolled-up recursive types.
Results¶
Results can be classified by result types as follows.
Results ¶
For each value in the result:
The value is valid with some value type.
Let the result type be the concatenation of all these value types.
Then the result is valid with that result type.
Results ¶
The result is valid with any valid closed result type.
Store Validity¶
The following typing rules specify when a runtime store is valid.
To that end, each kind of instance is classified by a respective function, table, memory, or global type. Module instances are classified by module contexts, which are regular contexts repurposed as module types describing the index spaces defined by a module.
Store ¶
Each function instance in the store must be valid with some function type.
Each table instance in the store must be valid with some table type.
Each memory instance in the store must be valid with some memory type.
Each global instance in the store must be valid with some global type.
Each element instance in the store must be valid with some reference type.
Each data instance in the store must be valid.
Each structure instance in the store must be valid.
Each array instance in the store must be valid.
No reference to a bound structure address must be reachable from itself through a path consisting only of indirections through immutable structure or array fields.
No reference to a bound array address must be reachable from itself through a path consisting only of indirections through immutable structure or array fields.
Then the store is valid.
Note
The constraint on reachability through immutable fields prevents the presence of cyclic data structures that cannot be constructed in the language. Cycles can only be formed using mutation.
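The reachability constraint amounts to a cycle search over the graph whose edges are immutable reference fields. The following is a minimal sketch in Python, assuming a hypothetical simplified heap model in which each structure or array address maps to a list of `(is_mutable, target)` fields; the names `Heap` and `has_immutable_cycle` are illustrative and do not come from the specification.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical model: address -> list of (is_mutable, referenced address or None).
Heap = Dict[int, List[Tuple[bool, Optional[int]]]]

def has_immutable_cycle(heap: Heap) -> bool:
    """Return True if some address reaches itself through immutable fields only."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {addr: WHITE for addr in heap}

    def visit(addr: int) -> bool:
        colour[addr] = GREY
        for is_mutable, target in heap[addr]:
            if is_mutable or target is None or target not in heap:
                continue  # mutable edges and non-aggregate values are ignored
            if colour[target] == GREY:
                return True  # back edge: a cycle made of immutable fields only
            if colour[target] == WHITE and visit(target):
                return True
        colour[addr] = BLACK
        return False

    return any(colour[addr] == WHITE and visit(addr) for addr in heap)
```

In this model a store would violate the constraint exactly when `has_immutable_cycle` returns `True`; a cycle that passes through at least one mutable field is accepted, matching the remark that cycles can only be formed using mutation.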
Function Instances ¶
The function type of the instance must be valid under an empty context.
The module instance must be valid with some context.
Under that context:
The function must be valid with some function type.
That function type must match the function type of the instance.
Then the function instance is valid with its function type.
Host Function Instances ¶
The function type
must be valid under an empty context.Let
be the function type .For every valid store
extending and every sequence of values whose types coincide with :Then the function instance is valid with function type
.
Note
This rule states that, if appropriate pre-conditions about store and arguments are satisfied, then executing the host function must satisfy appropriate post-conditions about store and results. The post-conditions match the ones in the execution rule for invoking host functions.
Any store under which the function is invoked is assumed to be an extension of the current store. That way, the function itself is able to make sufficient assumptions about future stores.
Table Instances ¶
The table type must be valid under the empty context.
The number of elements in the table must equal the minimum of the table type's limits.
For each reference in the table's elements:
The reference must be valid with some reference type.
That reference type must match the reference type of the table type.
Then the table instance is valid with its table type.
Memory Instances ¶
The memory type must be valid under the empty context.
The length of the memory's data must equal the minimum of the memory type's limits multiplied by the page size.
Then the memory instance is valid with its memory type.
Global Instances ¶
The global type must be valid under the empty context.
The value must be valid with some value type.
That value type must match the value type of the global type.
Then the global instance is valid with its global type.
Element Instances ¶
The reference type must be valid under the empty context.
For each reference in the elements:
The reference must be valid with some reference type.
That reference type must match the reference type of the element instance.
Then the element instance is valid with its reference type.
Data Instances ¶
The data instance is valid.
Structure Instances ¶
The defined type must be valid.
The expansion of the defined type must be a structure type.
The length of the sequence of field values must be the same as the length of the sequence of field types.
For each field value and corresponding field type:
Let the storage type be the one contained in the field type.
The field value must be valid with that storage type.
Then the structure instance is valid.
Array Instances ¶
The defined type must be valid.
The expansion of the defined type must be an array type.
Let the storage type be the one contained in the array's field type.
For each field value in the array:
The field value must be valid with that storage type.
Then the array instance is valid.
Field Values ¶
If the field value is a value, then:
The value must be valid with some value type.
Then the field value is valid with that value type.
Else, the field value is a packed value:
Then the field value is valid with the packed type of that packed value.
Export Instances ¶
The external value must be valid with some external type.
Then the export instance is valid.
Module Instances ¶
Each defined type
in must be valid under the empty context.For each function address
in , the external value must be valid with some external type .For each table address
in , the external value must be valid with some external type .For each memory address
in , the external value must be valid with some external type .For each global address
in , the external value must be valid with some external type .For each element address
in , the element instance must be valid with some reference type .For each data address
in , the data instance must be valid.Each export instance
in must be valid.For each export instance
in , the name must be different from any other name occurring in .Let
be the concatenation of all in order.Let
be the concatenation of all in order.Let
be the concatenation of all in order.Let
be the concatenation of all in order.Let
be the concatenation of all in order.Let
be the concatenation of all in order.Let
be the length of .Let
be the length of .Let
be the sequence of function indices from to .Then the module instance is valid with context
.
Configuration Validity¶
To relate the WebAssembly type system to its execution semantics, the typing rules for instructions must be extended to configurations.
Configurations and threads are classified by their result type.
In addition to the store
Finally, frames are classified with frame contexts, which extend the module contexts of a frame’s associated module instance with the locals that the frame contains.
Configurations ¶
Under no allowed return type, the thread must be valid with some result type.
Then the configuration is valid with that result type.
Threads ¶
Let
be the current allowed return type.Let
be the same context as , but with set to .Under context
, the instruction sequence must be valid with some type .Then the thread is valid with the result type
.
Frames ¶
The module instance must be valid with some module context.
Each value in the frame's locals must be valid with some value type.
Let the frame context be the same as the module context, but with these value types, concatenated in order, prepended to the locals vector.
Then the frame is valid with that frame context.
Administrative Instructions¶
Typing rules for administrative instructions are specified as follows.
In addition to the context
To that end, all previous typing judgements
¶
The instruction is valid with any valid instruction type of the form
.
¶
The value
must be valid with value type .Then it is valid as an instruction with type
.
¶
The external function value
must be valid with external function type .Let
be the function type .Then the instruction is valid with type
.
¶
The instruction sequence
must be valid with some type .Let
be the same context as , but with the result type prepended to the vector.Under context
, the instruction sequence must be valid with type .Then the compound instruction is valid with type
.
¶
Under the valid return type
, the thread must be valid with result type .Then the compound instruction is valid with type
.
Store Extension¶
Programs can mutate the store and its contained instances. Any such modification must respect certain invariants, such as not removing allocated instances or changing immutable definitions. While these invariants are inherent to the execution semantics of WebAssembly instructions and modules, host functions do not automatically adhere to them. Consequently, the required invariants must be stated as explicit constraints on the invocation of host functions. Soundness only holds when the embedder ensures these constraints.
The necessary constraints are codified by the notion of store extension:
a store state extends an earlier store state when the following rules hold.
Note
Extension does not imply that the new store is valid, which is defined separately above.
Store ¶
The number of function instances must not shrink.
The number of table instances must not shrink.
The number of memory instances must not shrink.
The number of global instances must not shrink.
The number of element instances must not shrink.
The number of data instances must not shrink.
The number of structure instances must not shrink.
The number of array instances must not shrink.
For each function instance in the original store, the new function instance must be an extension of the old.
For each table instance in the original store, the new table instance must be an extension of the old.
For each memory instance in the original store, the new memory instance must be an extension of the old.
For each global instance in the original store, the new global instance must be an extension of the old.
For each element instance in the original store, the new element instance must be an extension of the old.
For each data instance in the original store, the new data instance must be an extension of the old.
For each structure instance in the original store, the new structure instance must be an extension of the old.
For each array instance in the original store, the new array instance must be an extension of the old.
Function Instance ¶
A function instance must remain unchanged.
Table Instance ¶
The table type must remain unchanged.
The length of the table's elements must not shrink.
Memory Instance ¶
The memory type must remain unchanged.
The length of the memory's data must not shrink.
Global Instance ¶
The global type must remain unchanged.
If the global type is immutable, then the value must remain unchanged.
Element Instance ¶
The reference type must remain unchanged.
The vector of elements must:
either remain unchanged,
or shrink to length 0.
Data Instance ¶
The vector of bytes must:
either remain unchanged,
or shrink to length 0.
Structure Instance ¶
The defined type must remain unchanged.
Assert: due to store well-formedness, the expansion of the defined type is a structure type.
The length of the vector of field values must remain unchanged.
Assert: due to store well-formedness, the number of field values is the same as the number of field types.
For each field value and corresponding field type:
If the field type is immutable, then the field value must remain unchanged.
Array Instance ¶
The defined type must remain unchanged.
Assert: due to store well-formedness, the expansion of the defined type is an array type.
The length of the vector of field values must remain unchanged.
If the array's field type is immutable, then the sequence of field values must remain unchanged.
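The extension rules can be read as a pairwise comparison between an old and a new store. The following is a minimal sketch in Python covering only tables, memories, and globals; the classes `TableInst`, `MemInst`, `GlobalInst`, and `Store`, and the representation of types as opaque strings, are assumptions made for illustration and are not the specification's data structures.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence

@dataclass
class TableInst:
    type: str             # opaque table type in this sketch
    elems: List[object]

@dataclass
class MemInst:
    type: str             # opaque memory type in this sketch
    data: bytes

@dataclass
class GlobalInst:
    mutable: bool         # mutability part of the global type
    valtype: str          # value type part of the global type
    value: object

@dataclass
class Store:
    tables: List[TableInst] = field(default_factory=list)
    mems: List[MemInst] = field(default_factory=list)
    globals: List[GlobalInst] = field(default_factory=list)

def table_extends(old: TableInst, new: TableInst) -> bool:
    # Type unchanged, element vector must not shrink.
    return old.type == new.type and len(new.elems) >= len(old.elems)

def mem_extends(old: MemInst, new: MemInst) -> bool:
    # Type unchanged, data must not shrink.
    return old.type == new.type and len(new.data) >= len(old.data)

def global_extends(old: GlobalInst, new: GlobalInst) -> bool:
    # Type unchanged; an immutable global must keep its value.
    same_type = (old.mutable, old.valtype) == (new.mutable, new.valtype)
    return same_type and (old.mutable or old.value == new.value)

def component_extends(olds: Sequence, news: Sequence, extends: Callable) -> bool:
    # The component must not shrink, and each original instance must be extended.
    return len(news) >= len(olds) and all(extends(o, n) for o, n in zip(olds, news))

def store_extends(old: Store, new: Store) -> bool:
    return (component_extends(old.tables, new.tables, table_extends)
            and component_extends(old.mems, new.mems, mem_extends)
            and component_extends(old.globals, new.globals, global_extends))
```

An embedder could use a check of this shape in debug builds around host function calls, comparing the store before and after the call. As noted above, passing it says nothing about whether the new store is valid.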
Theorems¶
Given the definition of valid configurations, the standard soundness theorems hold. [2] [3]
Theorem (Preservation).
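If a configuration is valid with some result type and steps to another configuration, then the new configuration is valid with the same result type. Furthermore, the new store is an extension of the original store.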
A terminal thread is one whose sequence of instructions is a result. A terminal configuration is a configuration whose thread is terminal.
Theorem (Progress).
If a configuration is valid with some result type, then either it is terminal, or it can step to some configuration.
From Preservation and Progress the soundness of the WebAssembly type system follows directly.
Corollary (Soundness).
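If a configuration is valid with some result type, then it either diverges or takes a finite number of steps to reach a terminal configuration that is valid with the same result type and whose store is an extension of the original store.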
In other words, every thread in a valid configuration either runs forever, traps, or terminates with a result that has the expected type. Consequently, given a valid store, no computation defined by instantiation or invocation of a valid module can “crash” or otherwise (mis)behave in ways not covered by the execution semantics given in this specification.
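Stated symbolically, the three results take the following standard shape. The notation is assumed here for illustration: $\vdash S;T : [t^\ast]$ for validity of a configuration with a result type, $\hookrightarrow$ for a single reduction step, and $\vdash S \preceq S'$ for store extension.

```latex
\begin{aligned}
\textbf{Preservation:}\quad
  & \vdash S;T : [t^\ast] \;\wedge\; S;T \hookrightarrow S';T'
    \;\Longrightarrow\; \vdash S';T' : [t^\ast] \;\wedge\; \vdash S \preceq S' \\[4pt]
\textbf{Progress:}\quad
  & \vdash S;T : [t^\ast]
    \;\Longrightarrow\; T \text{ is terminal} \;\vee\; \exists\, S', T'.\; S;T \hookrightarrow S';T' \\[4pt]
\textbf{Soundness:}\quad
  & \vdash S;T : [t^\ast]
    \;\Longrightarrow\; S;T \text{ diverges} \;\vee\;
    \exists\, S', T'.\; S;T \hookrightarrow^\ast S';T' \;\wedge\; T' \text{ is terminal}
    \;\wedge\; \vdash S';T' : [t^\ast] \;\wedge\; \vdash S \preceq S'
\end{aligned}
```

Soundness follows by induction on the number of steps: Progress guarantees that a valid non-terminal configuration can always take a step, and Preservation guarantees that the configuration reached is again valid with the same result type.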
Type System Properties¶
Principal Types¶
The type system of WebAssembly features both subtyping and simple forms of polymorphism for instruction types. That has the effect that every instruction or instruction sequence can be classified with multiple different instruction types.
However, the typing rules still allow deriving principal types for instruction sequences. That is, every valid instruction sequence has one particular type scheme, possibly containing some unconstrained placeholder type variables, that is a subtype of all its valid instruction types, after substituting its type variables with suitable specific types.
Moreover, when deriving an instruction type in a “forward” manner, i.e., the input of the instruction sequence is already fixed to specific types, then it has a principal output type expressible without type variables, up to a possibly polymorphic stack bottom representable with one single variable. In other words, “forward” principal types are effectively closed.
Note
For example, in isolation, the instruction
The implication of the latter property is that a validator for complete instruction sequences (as they occur in valid modules) can be implemented with a simple left-to-right algorithm that does not require the introduction of type variables.
A typing algorithm capable of handling partial instruction sequences (as might be considered for program analysis or program manipulation) needs to introduce type variables and perform substitutions, but it does not need to perform backtracking or record any non-syntactic constraints on these type variables.
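The left-to-right algorithm for closed input types can be illustrated on a toy instruction set. The following is a minimal sketch in Python, not the validation algorithm of the specification: the table `SIGNATURES`, the function `check_forward`, and the restriction to a handful of numeric instructions plus `drop` and `unreachable` are assumptions made for illustration. The only trace of polymorphism in the output is a single boolean standing for the one variable that represents a polymorphic stack bottom.

```python
from typing import List, Optional, Tuple

# Hypothetical toy instruction set: name -> (operand types popped, result types pushed).
SIGNATURES = {
    "i32.const": ([], ["i32"]),
    "i64.const": ([], ["i64"]),
    "i32.add": (["i32", "i32"], ["i32"]),
    "i32.eqz": (["i32"], ["i32"]),
    "i64.eqz": (["i64"], ["i32"]),
}

def check_forward(instrs: List[str], inputs: List[str]) -> Optional[Tuple[bool, List[str]]]:
    """Type-check `instrs` left to right against a closed input stack.

    Returns (polymorphic_bottom, output_stack), or None if the sequence is invalid.
    """
    stack = list(inputs)
    polymorphic = False  # True once the stack bottom has become unconstrained

    def pop(expected: Optional[str]) -> bool:
        if stack:
            actual = stack.pop()
            return expected is None or actual == expected
        return polymorphic  # an unconstrained bottom can supply any operand

    for instr in instrs:
        if instr == "unreachable":
            stack.clear()
            polymorphic = True
        elif instr == "drop":
            if not pop(None):
                return None
        elif instr in SIGNATURES:
            params, results = SIGNATURES[instr]
            for t in reversed(params):
                if not pop(t):
                    return None
            stack.extend(results)
        else:
            return None  # unknown instruction in this toy model
    return polymorphic, stack
```

For instance, `check_forward(["unreachable", "i32.add"], [])` yields `(True, ["i32"])`: an `i32` on top of an otherwise unconstrained stack. No type variables, substitutions, or backtracking are needed because every operand is either a known closed type or comes from the polymorphic bottom.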
Technically, the syntax of heap, value, and result types can be enriched with type variables as follows:
where each
A type is closed when it does not contain any type variables, and open otherwise.
A type substitution
Theorem (Principal Types).
If an instruction sequence
Theorem (Closed Principal Forward Types).
If closed input type
Type Lattice¶
The Principal Types property depends on the existence of a greatest lower bound for any pair of types.
Theorem (Greatest Lower Bounds for Value Types).
For any two value types
Note
The greatest lower bound of two types may be
Theorem (Conditional Least Upper Bounds for Value Types).
Any two value types
Note
If a top type were added to the type system, a least upper bound would exist for any two types.
Corollary (Type Lattice). Assuming the addition of a provisional top type, value types form a lattice with respect to their subtype relation.
Finally, value types can be partitioned into multiple disjoint hierarchies that are not related by subtyping, except through
Theorem (Disjoint Subtype Hierarchies).
The greatest lower bound of two value types is
In other words, types that do not have common supertypes,
do not have common subtypes either (other than
Note
Types from disjoint hierarchies can safely be represented in mutually incompatible ways in an implementation, because their values can never flow to the same place.
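The interplay between greatest lower bounds and disjoint hierarchies can be illustrated on a small fragment of the type system. The following is a minimal sketch in Python, assuming a hypothetical subset of types: numeric types, and non-null reference types written by their abstract heap type (so `"struct"` stands for a non-null reference to `struct`). Concrete type indices, nullability, and vector types are deliberately left out, and the function names are illustrative only.

```python
from typing import Dict, Optional

NUMERIC = {"i32", "i64", "f32", "f64"}

# Hypothetical subset of abstract heap types: heap type -> direct supertype
# (None marks the top of a hierarchy).
PARENT: Dict[str, Optional[str]] = {
    "any": None, "eq": "any", "i31": "eq", "struct": "eq", "array": "eq",
    "func": None,
    "extern": None,
}
# Bottom heap type of each hierarchy, keyed by the hierarchy's top.
BOTTOM = {"any": "none", "func": "nofunc", "extern": "noextern"}
TOP_OF_BOTTOM = {bottom: top for top, bottom in BOTTOM.items()}

def top_of(heap: str) -> str:
    """Return the top type of the hierarchy that `heap` belongs to."""
    if heap in TOP_OF_BOTTOM:
        return TOP_OF_BOTTOM[heap]
    while PARENT[heap] is not None:
        heap = PARENT[heap]
    return heap

def is_subtype(a: str, b: str) -> bool:
    if a == b:
        return True
    if a in TOP_OF_BOTTOM:  # a hierarchy bottom is below everything in its hierarchy
        return b not in TOP_OF_BOTTOM and top_of(a) == top_of(b)
    if b in TOP_OF_BOTTOM:
        return False
    while PARENT[a] is not None:
        a = PARENT[a]
        if a == b:
            return True
    return False

def glb(t1: str, t2: str) -> str:
    """Greatest lower bound of two types in this toy fragment."""
    if t1 == t2:
        return t1
    if t1 in NUMERIC or t2 in NUMERIC:
        return "bot"      # no common subtype other than the bottom value type
    if is_subtype(t1, t2):
        return t1
    if is_subtype(t2, t1):
        return t2
    if top_of(t1) == top_of(t2):
        return BOTTOM[top_of(t1)]
    return "ref bot"      # reference types from disjoint hierarchies
```

In this fragment, `glb("struct", "array")` is `"none"` because both types live under `any`, whereas `glb("any", "func")` is `"ref bot"`: the two types have no common supertype, and correspondingly no common subtype other than the bottom reference type.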
Compositionality¶
Valid instruction sequences can be freely composed, as long as their types match up.
Theorem (Composition).
If two instruction sequences
Note
More generally, instead of a shared type
Inversely, valid instruction sequences can also freely be decomposed, that is, splitting them anywhere produces two instruction sequences that are both valid.
Theorem (Decomposition).
If an instruction sequence
Note
This property holds because validation is required even for unreachable code.
Without that,