Runtime Structure

Store, stack, and other runtime structure forming the WebAssembly abstract machine, such as values or module instances, are made precise in terms of additional auxiliary syntax.

Values

WebAssembly computations manipulate values of either the four basic number types, i.e., integers and floating-point data of 32 or 64 bit width each, or vectors of 128 bit width, or of reference type.

In most places of the semantics, values of different types can occur. In order to avoid ambiguities, values are therefore represented with an abstract syntax that makes their type explicit. It is convenient to reuse the same notation as for the const instructions and ref.null producing them.

References other than null are represented with additional administrative instructions. They are either scalar references, containing a 31-bit integer, structure references, pointing to a specific structure address, array references, pointing to a specific array address, function references, pointing to a specific function address, or host references, pointing to an uninterpreted form of host address defined by the embedder. Any of the aforementioned references can furthermore be wrapped up as an external reference.

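$$
\begin{array}{lcl}
\mathit{num} &::=& \mathsf{i32.const}~\mathit{i32} \mid \mathsf{i64.const}~\mathit{i64} \mid \mathsf{f32.const}~\mathit{f32} \mid \mathsf{f64.const}~\mathit{f64} \\
\mathit{vec} &::=& \mathsf{v128.const}~\mathit{i128} \\
\mathit{ref} &::=& \mathsf{ref.null}~(\mathit{absheaptype} \mid \mathit{deftype}) \mid \mathsf{ref.i31}~\mathit{u31} \mid \mathsf{ref.struct}~\mathit{structaddr} \mid \mathsf{ref.array}~\mathit{arrayaddr} \\
&& \mid~\mathsf{ref.func}~\mathit{funcaddr} \mid \mathsf{ref.host}~\mathit{hostaddr} \mid \mathsf{ref.extern}~\mathit{ref} \\
\mathit{val} &::=& \mathit{num} \mid \mathit{vec} \mid \mathit{ref}
\end{array}
$$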

Note

Future versions of WebAssembly may add additional forms of reference.

Value types can have an associated default value; it is the respective value 0 for number types, 0 for vector types, and null for nullable reference types. For other references, no default value is defined; $\mathrm{default}_t$ hence is an optional value $\mathit{val}^?$.

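$$
\begin{array}{lcll}
\mathrm{default}_t &=& t\mathsf{.const}~0 & (\text{if}~t = \mathit{numtype}) \\
\mathrm{default}_t &=& t\mathsf{.const}~0 & (\text{if}~t = \mathit{vectype}) \\
\mathrm{default}_t &=& \mathsf{ref.null}~t & (\text{if}~t = (\mathsf{ref}~\mathsf{null}~\mathit{heaptype})) \\
\mathrm{default}_t &=& \epsilon & (\text{if}~t = (\mathsf{ref}~\mathit{heaptype}))
\end{array}
$$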
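The default-value definition can be illustrated with a small Python sketch. The tuple encoding of values (for example `("i32.const", 0)`), the `RefType` record, and the function name `default_value` are hypothetical choices made for this illustration and are not part of the specification. As a further assumption, the sketch annotates a null reference with the heap type only, to fit the value grammar given earlier.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union

NUM_TYPES = {"i32", "i64", "f32", "f64"}
VEC_TYPES = {"v128"}


@dataclass(frozen=True)
class RefType:
    """Hypothetical encoding of (ref null? heaptype)."""
    nullable: bool
    heaptype: str


def default_value(t: Union[str, RefType]) -> Optional[Tuple]:
    """Return the default value for value type t, or None if undefined."""
    if isinstance(t, str) and t in NUM_TYPES:
        zero = 0.0 if t.startswith("f") else 0
        return (f"{t}.const", zero)
    if isinstance(t, str) and t in VEC_TYPES:
        return ("v128.const", 0)
    if isinstance(t, RefType):
        # Only nullable references have a default (the null reference).
        return ("ref.null", t.heaptype) if t.nullable else None
    raise ValueError(f"not a value type: {t!r}")
```

Returning `None` for non-nullable references mirrors the empty result in the last case of the definition.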

Convention

  • The meta variable r ranges over reference values where clear from context.

Results

A result is the outcome of a computation. It is either a sequence of values or a trap.

$$
\begin{array}{lcl}
\mathit{result} &::=& \mathit{val}^\ast \mid \mathsf{trap}
\end{array}
$$

Store

The store represents all global state that can be manipulated by WebAssembly programs. It consists of the runtime representation of all instances of functions, tables, memories, globals, element segments, data segments, structures, and arrays that have been allocated during the lifetime of the abstract machine. [1]

It is an invariant of the semantics that no element or data instance is addressed from anywhere else but the owning module instances.

Syntactically, the store is defined as a record listing the existing instances of each category:

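$$
\begin{array}{lcl}
\mathit{store} &::=& \{~ \mathsf{funcs}~\mathit{funcinst}^\ast, \\
&& \phantom{\{~} \mathsf{tables}~\mathit{tableinst}^\ast, \\
&& \phantom{\{~} \mathsf{mems}~\mathit{meminst}^\ast, \\
&& \phantom{\{~} \mathsf{globals}~\mathit{globalinst}^\ast, \\
&& \phantom{\{~} \mathsf{elems}~\mathit{eleminst}^\ast, \\
&& \phantom{\{~} \mathsf{datas}~\mathit{datainst}^\ast, \\
&& \phantom{\{~} \mathsf{structs}~\mathit{structinst}^\ast, \\
&& \phantom{\{~} \mathsf{arrays}~\mathit{arrayinst}^\ast ~\}
\end{array}
$$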

Convention

  • The meta variable S ranges over stores where clear from context.

Addresses

Function instances, table instances, memory instances, global instances, element instances, data instances, and structure or array instances in the store are referenced with abstract addresses. These are simply indices into the respective store component. In addition, an embedder may supply an uninterpreted set of host addresses.

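$$
\begin{array}{lcl}
\mathit{addr} &::=& 0 \mid 1 \mid 2 \mid \dots \\
\mathit{funcaddr} &::=& \mathit{addr} \\
\mathit{tableaddr} &::=& \mathit{addr} \\
\mathit{memaddr} &::=& \mathit{addr} \\
\mathit{globaladdr} &::=& \mathit{addr} \\
\mathit{elemaddr} &::=& \mathit{addr} \\
\mathit{dataaddr} &::=& \mathit{addr} \\
\mathit{structaddr} &::=& \mathit{addr} \\
\mathit{arrayaddr} &::=& \mathit{addr} \\
\mathit{hostaddr} &::=& \mathit{addr}
\end{array}
$$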

An embedder may assign identity to exported store objects corresponding to their addresses, even where this identity is not observable from within WebAssembly code itself (such as for function instances or immutable globals).

Note

Addresses are dynamic, globally unique references to runtime objects, in contrast to indices, which are static, module-local references to their original definitions. A memory address memaddr denotes the abstract address of a memory instance in the store, not an offset inside a memory instance.

There is no specific limit on the number of allocations of store objects, hence logical addresses can be arbitrarily large natural numbers.
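To make the relationship between the store and addresses concrete, the following Python sketch models the store as a record of lists and an address as an index into one of them. The `Store` class and the `alloc` helper are hypothetical names chosen for this illustration; the specification defines allocation separately and does not prescribe any such representation.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class Store:
    """One list per store component; an address indexes into its list."""
    funcs: List[Any] = field(default_factory=list)
    tables: List[Any] = field(default_factory=list)
    mems: List[Any] = field(default_factory=list)
    globals: List[Any] = field(default_factory=list)
    elems: List[Any] = field(default_factory=list)
    datas: List[Any] = field(default_factory=list)
    structs: List[Any] = field(default_factory=list)
    arrays: List[Any] = field(default_factory=list)


def alloc(store: Store, component: str, inst: Any) -> int:
    """Append inst to the named component and return its fresh address."""
    insts = getattr(store, component)
    insts.append(inst)
    return len(insts) - 1
```

Because instances are only ever appended in this sketch, an address stays valid once handed out, and addresses in different components are independent of each other.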

Conventions

  • The notation addr(A) denotes the set of addresses from address space addr occurring free in A. We sometimes reinterpret this set as the vector of its elements.

Module Instances

A module instance is the runtime representation of a module. It is created by instantiating a module, and collects runtime representations of all entities that are imported, defined, or exported by the module.

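$$
\begin{array}{lcl}
\mathit{moduleinst} &::=& \{~ \mathsf{types}~\mathit{deftype}^\ast, \\
&& \phantom{\{~} \mathsf{funcaddrs}~\mathit{funcaddr}^\ast, \\
&& \phantom{\{~} \mathsf{tableaddrs}~\mathit{tableaddr}^\ast, \\
&& \phantom{\{~} \mathsf{memaddrs}~\mathit{memaddr}^\ast, \\
&& \phantom{\{~} \mathsf{globaladdrs}~\mathit{globaladdr}^\ast, \\
&& \phantom{\{~} \mathsf{elemaddrs}~\mathit{elemaddr}^\ast, \\
&& \phantom{\{~} \mathsf{dataaddrs}~\mathit{dataaddr}^\ast, \\
&& \phantom{\{~} \mathsf{exports}~\mathit{exportinst}^\ast ~\}
\end{array}
$$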

Each component references runtime instances corresponding to respective declarations from the original module – whether imported or defined – in the order of their static indices. Function instances, table instances, memory instances, and global instances are referenced with an indirection through their respective addresses in the store.

It is an invariant of the semantics that all export instances in a given module instance have different names.

Function Instances

A function instance is the runtime representation of a function. It effectively is a closure of the original function over the runtime module instance of its originating module. The module instance is used to resolve references to other definitions during execution of the function.

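$$
\begin{array}{lcl}
\mathit{funcinst} &::=& \{ \mathsf{type}~\mathit{deftype}, \mathsf{module}~\mathit{moduleinst}, \mathsf{code}~\mathit{func} \} \\
&\mid& \{ \mathsf{type}~\mathit{deftype}, \mathsf{hostcode}~\mathit{hostfunc} \} \\
\mathit{hostfunc} &::=& \dots
\end{array}
$$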

A host function is a function expressed outside WebAssembly but passed to a module as an import. The definition and behavior of host functions are outside the scope of this specification. For the purpose of this specification, it is assumed that when invoked, a host function behaves non-deterministically, but within certain constraints that ensure the integrity of the runtime.

Note

Function instances are immutable, and their identity is not observable by WebAssembly code. However, the embedder might provide implicit or explicit means for distinguishing their addresses.

Table Instances

A table instance is the runtime representation of a table. It records its type and holds a vector of reference values.

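$$
\begin{array}{lcl}
\mathit{tableinst} &::=& \{ \mathsf{type}~\mathit{tabletype}, \mathsf{elem}~\mathit{vec}(\mathit{ref}) \}
\end{array}
$$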

Table elements can be mutated through table instructions, the execution of an active element segment, or by external means provided by the embedder.

It is an invariant of the semantics that all table elements have a type matching the element type of tabletype. It also is an invariant that the length of the element vector never exceeds the maximum size of tabletype, if present.

Memory Instances

A memory instance is the runtime representation of a linear memory. It records its type and holds a vector of bytes.

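$$
\begin{array}{lcl}
\mathit{meminst} &::=& \{ \mathsf{type}~\mathit{memtype}, \mathsf{data}~\mathit{vec}(\mathit{byte}) \}
\end{array}
$$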

The length of the vector always is a multiple of the WebAssembly page size, which is defined to be the constant 65536 – abbreviated 64Ki.

The bytes can be mutated through memory instructions, the execution of an active data segment, or by external means provided by the embedder.

It is an invariant of the semantics that the length of the byte vector, divided by page size, never exceeds the maximum size of memtype, if present.
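The two size constraints on memory instances can be checked mechanically. The sketch below is only an illustration: the `MemInst` record flattens the memory type into hypothetical `min_pages` and `max_pages` fields, and `check_meminst` is a name invented here.

```python
from dataclasses import dataclass
from typing import Optional

PAGE_SIZE = 65536  # 64 Ki bytes per WebAssembly page


@dataclass
class MemInst:
    min_pages: int             # lower limit from the memory type
    max_pages: Optional[int]   # upper limit, if the memory type has one
    data: bytearray            # the byte vector of the instance


def check_meminst(m: MemInst) -> bool:
    """True if the byte vector is page-aligned and within the maximum."""
    pages, remainder = divmod(len(m.data), PAGE_SIZE)
    if remainder != 0:
        return False
    return m.max_pages is None or pages <= m.max_pages
```

For instance, a one-page memory with a maximum of two pages passes the check, while the same memory grown to three pages would violate the invariant.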

Global Instances

A global instance is the runtime representation of a global variable. It records its type and holds an individual value.

$$
\begin{array}{lcl}
\mathit{globalinst} &::=& \{ \mathsf{type}~\mathit{globaltype}, \mathsf{value}~\mathit{val} \}
\end{array}
$$

The value of mutable globals can be mutated through variable instructions or by external means provided by the embedder.

It is an invariant of the semantics that the value has a type matching the value type of globaltype.

Element Instances

An element instance is the runtime representation of an element segment. It holds a vector of references and their common type.

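$$
\begin{array}{lcl}
\mathit{eleminst} &::=& \{ \mathsf{type}~\mathit{reftype}, \mathsf{elem}~\mathit{vec}(\mathit{ref}) \}
\end{array}
$$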

Data Instances

A data instance is the runtime representation of a data segment. It holds a vector of bytes.

$$
\begin{array}{lcl}
\mathit{datainst} &::=& \{ \mathsf{data}~\mathit{vec}(\mathit{byte}) \}
\end{array}
$$

Export Instances

An export instance is the runtime representation of an export. It defines the export’s name and the associated external value.

$$
\begin{array}{lcl}
\mathit{exportinst} &::=& \{ \mathsf{name}~\mathit{name}, \mathsf{value}~\mathit{externval} \}
\end{array}
$$

External Values

An external value is the runtime representation of an entity that can be imported or exported. It is an address denoting either a function instance, table instance, memory instance, or global instance in the shared store.

$$
\begin{array}{lcl}
\mathit{externval} &::=& \mathsf{func}~\mathit{funcaddr} \mid \mathsf{table}~\mathit{tableaddr} \mid \mathsf{mem}~\mathit{memaddr} \mid \mathsf{global}~\mathit{globaladdr}
\end{array}
$$

Conventions

The following auxiliary notation is defined for sequences of external values. It filters out entries of a specific kind in an order-preserving fashion:

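  • $\mathrm{funcs}(\mathit{externval}^\ast) = [\mathit{funcaddr} \mid (\mathsf{func}~\mathit{funcaddr}) \in \mathit{externval}^\ast]$

  • $\mathrm{tables}(\mathit{externval}^\ast) = [\mathit{tableaddr} \mid (\mathsf{table}~\mathit{tableaddr}) \in \mathit{externval}^\ast]$

  • $\mathrm{mems}(\mathit{externval}^\ast) = [\mathit{memaddr} \mid (\mathsf{mem}~\mathit{memaddr}) \in \mathit{externval}^\ast]$

  • $\mathrm{globals}(\mathit{externval}^\ast) = [\mathit{globaladdr} \mid (\mathsf{global}~\mathit{globaladdr}) \in \mathit{externval}^\ast]$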
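These filters behave like order-preserving list comprehensions. The following Python sketch assumes a hypothetical encoding of an external value as a `(kind, address)` pair; the helper `_filter_kind` is invented for this illustration.

```python
from typing import List, Tuple

ExternVal = Tuple[str, int]  # (kind, address), e.g. ("func", 3)


def _filter_kind(kind: str, externvals: List[ExternVal]) -> List[int]:
    """Keep the addresses of entries of the given kind, in original order."""
    return [addr for (k, addr) in externvals if k == kind]


def funcs(externvals: List[ExternVal]) -> List[int]:
    return _filter_kind("func", externvals)


def tables(externvals: List[ExternVal]) -> List[int]:
    return _filter_kind("table", externvals)


def mems(externvals: List[ExternVal]) -> List[int]:
    return _filter_kind("mem", externvals)


def globals_(externvals: List[ExternVal]) -> List[int]:
    # Trailing underscore avoids shadowing Python's built-in globals().
    return _filter_kind("global", externvals)
```

Applied to the sequence `[("func", 3), ("mem", 0), ("func", 1)]`, `funcs` yields `[3, 1]` and `tables` yields the empty list.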

Aggregate Instances

A structure instance is the runtime representation of a heap object allocated from a structure type. Likewise, an array instance is the runtime representation of a heap object allocated from an array type. Both record their respective defined type and hold a vector of the values of their fields.

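$$
\begin{array}{lcl}
\mathit{structinst} &::=& \{ \mathsf{type}~\mathit{deftype}, \mathsf{fields}~\mathit{vec}(\mathit{fieldval}) \} \\
\mathit{arrayinst} &::=& \{ \mathsf{type}~\mathit{deftype}, \mathsf{fields}~\mathit{vec}(\mathit{fieldval}) \} \\
\mathit{fieldval} &::=& \mathit{val} \mid \mathit{packedval} \\
\mathit{packedval} &::=& \mathsf{i8.pack}~\mathit{u8} \mid \mathsf{i16.pack}~\mathit{u16}
\end{array}
$$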

Conventions

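  • Conversion of a regular value to a field value is defined as follows:

$$
\begin{array}{lcl}
\mathrm{pack}_{\mathit{valtype}}(\mathit{val}) &=& \mathit{val} \\
\mathrm{pack}_{\mathit{packedtype}}(\mathsf{i32.const}~i) &=& \mathit{packedtype}\mathsf{.pack}~(\mathrm{wrap}_{32,|\mathit{packedtype}|}(i))
\end{array}
$$

  • The inverse conversion of a field value to a regular value is defined as follows:

$$
\begin{array}{lcl}
\mathrm{unpack}_{\mathit{valtype}}(\mathit{val}) &=& \mathit{val} \\
\mathrm{unpack}^{\mathit{sx}}_{\mathit{packedtype}}(\mathit{packedtype}\mathsf{.pack}~i) &=& \mathsf{i32.const}~(\mathrm{extend}^{\mathit{sx}}_{|\mathit{packedtype}|,32}(i))
\end{array}
$$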
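As a worked illustration of the two conversions, the Python sketch below packs an i32 value into an 8-bit or 16-bit field and unpacks it again with signed or unsigned extension. The tuple encoding of values and the function names `pack` and `unpack` are assumptions of this sketch, as is the choice to represent i32 values as unsigned integers in the range 0 to 2^32 - 1.

```python
PACKED_BITS = {"i8": 8, "i16": 16}


def pack(storagetype: str, val: tuple) -> tuple:
    """Convert a regular value to a field value for the given storage type."""
    if storagetype in PACKED_BITS:
        tag, i = val
        assert tag == "i32.const"
        bits = PACKED_BITS[storagetype]
        # wrap: keep only the low `bits` bits of the 32-bit integer
        return (f"{storagetype}.pack", i % (1 << bits))
    return val  # value types are stored unchanged


def unpack(storagetype: str, fieldval: tuple, sx: str = "u") -> tuple:
    """Convert a field value back to a regular value; sx is 'u' or 's'."""
    if storagetype in PACKED_BITS:
        _, i = fieldval
        bits = PACKED_BITS[storagetype]
        if sx == "s" and i >= 1 << (bits - 1):
            # sign-extend, then reinterpret as an unsigned 32-bit integer
            i = (i - (1 << bits)) % (1 << 32)
        return ("i32.const", i)
    return fieldval
```

Packing `0x1FF` into an `i8` field keeps `0xFF`; unpacking that field gives 255 with unsigned extension and `0xFFFFFFFF` (that is, -1 as a signed 32-bit integer) with signed extension.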

Stack

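Besides the store, most instructions interact with an implicit stack. The stack contains three kinds of entries:

  • Values: the operands of instructions.

  • Labels: active structured control instructions that can be targeted by branches.

  • Activations: the call frames of active function calls.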

These entries can occur on the stack in any order during the execution of a program. Stack entries are described by abstract syntax as follows.

Note

It is possible to model the WebAssembly semantics using separate stacks for operands, control constructs, and calls. However, because the stacks are interdependent, additional bookkeeping about associated stack heights would be required. For the purpose of this specification, an interleaved representation is simpler.

Values

Values are represented by themselves.

Labels

Labels carry an argument arity n and their associated branch target, which is expressed syntactically as an instruction sequence:

$$
\begin{array}{lcl}
\mathit{label} &::=& \mathsf{label}_n\{\mathit{instr}^\ast\}
\end{array}
$$

Intuitively, $\mathit{instr}^\ast$ is the continuation to execute when the branch is taken, in place of the original control construct.

Note

For example, a loop label has the form

$$
\mathsf{label}_n\{\mathsf{loop}~\dots~\mathsf{end}\}
$$

When performing a branch to this label, this executes the loop, effectively restarting it from the beginning. Conversely, a simple block label has the form

$$
\mathsf{label}_n\{\epsilon\}
$$

When branching, the empty continuation ends the targeted block, such that execution can proceed with consecutive instructions.

Activation Frames

Activation frames carry the return arity n of the respective function, hold the values of its locals (including arguments) in the order corresponding to their static local indices, and hold a reference to the function’s own module instance:

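$$
\begin{array}{lcl}
\mathit{frame} &::=& \mathsf{frame}_n\{\mathit{framestate}\} \\
\mathit{framestate} &::=& \{ \mathsf{locals}~(\mathit{val}^?)^\ast, \mathsf{module}~\mathit{moduleinst} \}
\end{array}
$$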

Locals may be uninitialized, in which case they are empty. Locals are mutated by respective variable instructions.

Conventions

  • The meta variable L ranges over labels where clear from context.

  • The meta variable F ranges over frame states where clear from context.

  • The following auxiliary definition takes a block type and looks up the instruction type that it denotes in the current frame:

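$$
\begin{array}{lcll}
\mathrm{instrtype}_{S;F}(\mathit{typeidx}) &=& \mathit{functype} & (\text{if}~\mathrm{expand}(F.\mathsf{module}.\mathsf{types}[\mathit{typeidx}]) = \mathsf{func}~\mathit{functype}) \\
\mathrm{instrtype}_{S;F}([\mathit{valtype}^?]) &=& [] \to [\mathit{valtype}^?]
\end{array}
$$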

Administrative Instructions

Note

This section is only relevant for the formal notation.

In order to express the reduction of traps, calls, and control instructions, the syntax of instructions is extended to include the following administrative instructions:

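$$
\begin{array}{lcl}
\mathit{instr} &::=& \dots \\
&\mid& \mathsf{trap} \\
&\mid& \mathsf{ref.i31}~\mathit{u31} \\
&\mid& \mathsf{ref.struct}~\mathit{structaddr} \\
&\mid& \mathsf{ref.array}~\mathit{arrayaddr} \\
&\mid& \mathsf{ref.func}~\mathit{funcaddr} \\
&\mid& \mathsf{ref.host}~\mathit{hostaddr} \\
&\mid& \mathsf{ref.extern}~\mathit{ref} \\
&\mid& \mathsf{invoke}~\mathit{funcaddr} \\
&\mid& \mathsf{return\_invoke}~\mathit{funcaddr} \\
&\mid& \mathsf{label}_n\{\mathit{instr}^\ast\}~\mathit{instr}^\ast~\mathsf{end} \\
&\mid& \mathsf{frame}_n\{\mathit{framestate}\}~\mathit{instr}^\ast~\mathsf{end}
\end{array}
$$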

The trap instruction represents the occurrence of a trap. Traps are bubbled up through nested instruction sequences, ultimately reducing the entire program to a single trap instruction, signalling abrupt termination.

The ref.i31 instruction represents unboxed scalar reference values, ref.struct and ref.array represent structure and array reference values, respectively, and ref.func represents function reference values. Similarly, ref.host represents host references and ref.extern represents any externalized reference.

The invoke instruction represents the imminent invocation of a function instance, identified by its address. It unifies the handling of different forms of calls. Analogously, return_invoke represents the imminent tail invocation of a function instance.

The label and frame instructions model labels and frames “on the stack”. Moreover, the administrative syntax maintains the nesting structure of the original structured control instruction or function body and their instruction sequences with an end marker. That way, the end of the inner instruction sequence is known when part of an outer sequence.

Note

For example, the reduction rule for block is:

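$$
\mathsf{block}~[t^n]~\mathit{instr}^\ast~\mathsf{end} \quad\hookrightarrow\quad \mathsf{label}_n\{\epsilon\}~\mathit{instr}^\ast~\mathsf{end}
$$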

This replaces the block with a label instruction, which can be interpreted as “pushing” the label on the stack. When end is reached, i.e., the inner instruction sequence has been reduced to the empty sequence – or rather, a sequence of n const instructions representing the resulting values – then the label instruction is eliminated courtesy of its own reduction rule:

$$
\mathsf{label}_m\{\mathit{instr}^\ast\}~\mathit{val}^n~\mathsf{end} \quad\hookrightarrow\quad \mathit{val}^n
$$

This can be interpreted as removing the label from the stack and only leaving the locally accumulated operand values.
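The two rules in this note can be mimicked on a toy term representation. In the Python sketch below, instructions are tuples whose first component is the opcode; the encodings `("block", result_types, body)` and `("label", arity, continuation, body)` and the function names are hypothetical, and only constant instructions are recognized as values.

```python
def is_val(instr: tuple) -> bool:
    """Simplification: treat only t.const instructions as values."""
    return instr[0].endswith(".const")


def enter_block(instr: tuple) -> tuple:
    """block [t^n] instr* end  ~>  label_n{eps} instr* end"""
    tag, result_types, body = instr
    assert tag == "block"
    # A block label has the empty continuation and arity n = |t^n|.
    return ("label", len(result_types), [], body)


def exit_label(instr: tuple):
    """label_m{instr*} val^n end  ~>  val^n, or None if not yet reducible."""
    tag, _arity, _cont, body = instr
    assert tag == "label"
    if all(is_val(i) for i in body):
        return list(body)  # only the accumulated operand values remain
    return None
```

Entering a block "pushes" the label by wrapping the body; exiting "pops" it once the body consists solely of values.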

Block Contexts

In order to specify the reduction of branches, the following syntax of block contexts is defined, indexed by the count k of labels surrounding a hole [_] that marks the place where the next step of computation is taking place:

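$$
\begin{array}{lcl}
B^0 &::=& \mathit{val}^\ast~[\_]~\mathit{instr}^\ast \\
B^{k+1} &::=& \mathit{val}^\ast~\mathsf{label}_n\{\mathit{instr}^\ast\}~B^k~\mathsf{end}~\mathit{instr}^\ast
\end{array}
$$

This definition makes it possible to index the active labels surrounding a branch or return instruction.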

Note

For example, the reduction of a simple branch can be defined as follows:

$$
\mathsf{label}_0\{\mathit{instr}^\ast\}~B^l[\mathsf{br}~l]~\mathsf{end} \quad\hookrightarrow\quad \mathit{instr}^\ast
$$

Here, the hole [_] of the context is instantiated with a branch instruction. When a branch occurs, this rule replaces the targeted label and associated instruction sequence with the label’s continuation. The selected label is identified through the label index l, which corresponds to the number of surrounding label instructions that must be hopped over – which is exactly the count encoded in the index of a block context.
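The matching of a branch against its block context can be sketched in Python. The sketch covers only the simplified arity-0 rule from this note; the tuple encoding `("label", arity, continuation, body)` and the names `hole_branch` and `branch_step` are hypothetical, and only constant instructions are treated as values.

```python
def is_val(instr: tuple) -> bool:
    """Simplification: treat only t.const instructions as values."""
    return instr[0].endswith(".const")


def hole_branch(body: list):
    """Return (k, l) if body has the shape B^k[br l], otherwise None."""
    for instr in body:
        if is_val(instr):
            continue  # skip the val* prefix of the block context
        if instr[0] == "br":
            return (0, instr[1])  # the hole of B^0 holds the branch
        if instr[0] == "label":
            inner = hole_branch(instr[3])
            # One more label surrounds the hole: B^(k+1)
            return None if inner is None else (inner[0] + 1, inner[1])
        return None  # some other instruction comes first
    return None


def branch_step(label: tuple):
    """label_0{instr*} B^l[br l] end  ~>  instr*, or None if no match."""
    tag, arity, cont, body = label
    assert tag == "label" and arity == 0
    found = hole_branch(body)
    if found is not None and found[0] == found[1]:
        return list(cont)  # replace the label by its continuation
    return None
```

A branch `br 1` nested inside one inner label targets the outer label, because exactly one label has to be hopped over; the inner label itself does not match the rule.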

Configurations

A configuration consists of the current store and an executing thread.

A thread is a computation over instructions that operates relative to the state of a current frame referring to the module instance in which the computation runs, i.e., where the current function originates from.

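$$
\begin{array}{lcl}
\mathit{config} &::=& \mathit{store}; \mathit{thread} \\
\mathit{thread} &::=& \mathit{framestate}; \mathit{instr}^\ast
\end{array}
$$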

Note

The current version of WebAssembly is single-threaded, but configurations with multiple threads may be supported in the future.

Evaluation Contexts

Finally, the following definition of evaluation context and associated structural rules enable reduction inside instruction sequences and administrative forms as well as the propagation of traps:

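$$
\begin{array}{lcl}
E &::=& [\_] \mid \mathit{val}^\ast~E~\mathit{instr}^\ast \mid \mathsf{label}_n\{\mathit{instr}^\ast\}~E~\mathsf{end}
\end{array}
$$

$$
\begin{array}{rcll}
S; F; E[\mathit{instr}^\ast] &\hookrightarrow& S'; F'; E[{\mathit{instr}'}^\ast] & (\text{if}~S; F; \mathit{instr}^\ast \hookrightarrow S'; F'; {\mathit{instr}'}^\ast) \\
S; F; \mathsf{frame}_n\{F'\}~\mathit{instr}^\ast~\mathsf{end} &\hookrightarrow& S'; F; \mathsf{frame}_n\{F''\}~{\mathit{instr}'}^\ast~\mathsf{end} & (\text{if}~S; F'; \mathit{instr}^\ast \hookrightarrow S'; F''; {\mathit{instr}'}^\ast) \\
S; F; E[\mathsf{trap}] &\hookrightarrow& S; F; \mathsf{trap} & (\text{if}~E \neq [\_]) \\
S; F; \mathsf{frame}_n\{F'\}~\mathsf{trap}~\mathsf{end} &\hookrightarrow& S; F; \mathsf{trap}
\end{array}
$$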

Reduction terminates when a thread’s instruction sequence has been reduced to a result, that is, either a sequence of values or a trap.

Note

The restriction on evaluation contexts rules out contexts like [_] and ϵ [_] ϵ for which E[trap]=trap.

For an example of reduction under evaluation contexts, consider the following instruction sequence.

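$$
(\mathsf{f64.const}~x_1)~(\mathsf{f64.const}~x_2)~\mathsf{f64.neg}~(\mathsf{f64.const}~x_3)~\mathsf{f64.add}~\mathsf{f64.mul}
$$

This can be decomposed into $E[(\mathsf{f64.const}~x_2)~\mathsf{f64.neg}]$ where

$$
E = (\mathsf{f64.const}~x_1)~[\_]~(\mathsf{f64.const}~x_3)~\mathsf{f64.add}~\mathsf{f64.mul}
$$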

Moreover, this is the only possible choice of evaluation context where the contents of the hole matches the left-hand side of a reduction rule.
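The decomposition in this example can be reproduced with a small Python sketch that splits an instruction sequence into the values before the hole, the redex, and the remaining instructions. It handles only the three f64 operators used above, relies on Python floats as a stand-in for f64 arithmetic, and uses hypothetical names (`OPS`, `decompose`, `step`) and a hypothetical tuple encoding of instructions.

```python
import operator

# opcode -> (number of operands, Python function standing in for the operator)
OPS = {
    "f64.neg": (1, operator.neg),
    "f64.add": (2, operator.add),
    "f64.mul": (2, operator.mul),
}


def decompose(instrs: list):
    """Split instrs into (values before hole, redex, rest), or None if all values."""
    for i, ins in enumerate(instrs):
        if ins[0] != "f64.const":
            arity, _ = OPS[ins[0]]
            start = i - arity
            if start < 0:
                raise ValueError("not enough operands")
            return instrs[:start], instrs[start:i + 1], instrs[i + 1:]
    return None


def step(instrs: list):
    """Perform one reduction step inside the evaluation context."""
    parts = decompose(instrs)
    if parts is None:
        return None  # already a sequence of values
    before, redex, after = parts
    _, fn = OPS[redex[-1][0]]
    args = [value for (_, value) in redex[:-1]]
    return before + [("f64.const", fn(*args))] + after
```

With x1 = 2.0, x2 = 3.0, and x3 = 4.0, the first call to `decompose` isolates `(f64.const 3.0) f64.neg` as the redex, and repeated calls to `step` reduce the sequence to the single value `(f64.const 2.0)`.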