WebAssembly Core Specification

Editor’s Draft,

This version:
https://webassembly.github.io/spec/core/bikeshed/
Latest published version:
https://www.w3.org/TR/wasm-core-2/
Feedback:
GitHub
Editor:
Andreas Rossberg
Issue Tracking:
GitHub Issues

Abstract

This document describes release 2.0 of the core WebAssembly standard, a safe, portable, low-level code format designed for efficient execution and compact representation.

This is part of a collection of related documents: the Core WebAssembly Specification, the WebAssembly JS Interface, and the WebAssembly Web API.

Status of this document

This is a public copy of the editors’ draft. It is provided for discussion only and may change at any moment. Its publication here does not imply endorsement of its contents by W3C. Don’t cite this document other than as work in progress.

GitHub Issues are preferred for discussion of this specification. All issues and comments are archived.

This document was produced by the WebAssembly Working Group.

This document was produced by a group operating under the W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

This document is governed by the 03 November 2023 W3C Process Document.

1. Introduction

1.1. Introduction

WebAssembly (abbreviated Wasm [1]) is a safe, portable, low-level code format designed for efficient execution and compact representation. Its main goal is to enable high performance applications on the Web, but it does not make any Web-specific assumptions or provide Web-specific features, so it can be employed in other environments as well.

WebAssembly is an open standard developed by a W3C Community Group.

This document describes version 2.0 + tail calls + function references + gc (Draft 2024-12-09) of the core WebAssembly standard. It is intended that it will be superseded by new incremental releases with additional features in the future.

1.1.1. Design Goals

The design goals of WebAssembly are the following:

  • Fast, safe, and portable semantics:

    • Fast: executes with near native code performance, taking advantage of capabilities common to all contemporary hardware.

    • Safe: code is validated and executes in a memory-safe [2], sandboxed environment preventing data corruption or security breaches.

    • Well-defined: fully and precisely defines valid programs and their behavior in a way that is easy to reason about informally and formally.

    • Hardware-independent: can be compiled on all modern architectures, desktop or mobile devices and embedded systems alike.

    • Language-independent: does not privilege any particular language, programming model, or object model.

    • Platform-independent: can be embedded in browsers, run as a stand-alone VM, or integrated in other environments.

    • Open: programs can interoperate with their environment in a simple and universal manner.

  • Efficient and portable representation:

    • Compact: has a binary format that is fast to transmit by being smaller than typical text or native code formats.

    • Modular: programs can be split up in smaller parts that can be transmitted, cached, and consumed separately.

    • Efficient: can be decoded, validated, and compiled in a fast single pass, equally with either just-in-time (JIT) or ahead-of-time (AOT) compilation.

    • Streamable: allows decoding, validation, and compilation to begin as soon as possible, before all data has been seen.

    • Parallelizable: allows decoding, validation, and compilation to be split into many independent parallel tasks.

    • Portable: makes no architectural assumptions that are not broadly supported across modern hardware.

WebAssembly code is also intended to be easy to inspect and debug, especially in environments like web browsers, but such features are beyond the scope of this specification.

1.1.2. Scope

At its core, WebAssembly is a virtual instruction set architecture (virtual ISA). As such, it has many use cases and can be embedded in many different environments. To encompass their variety and enable maximum reuse, the WebAssembly specification is split and layered into several documents.

This document is concerned with the core ISA layer of WebAssembly. It defines the instruction set, binary encoding, validation, and execution semantics, as well as a textual representation. It does not, however, define how WebAssembly programs can interact with a specific environment they execute in, nor how they are invoked from such an environment.

Instead, this specification is complemented by additional documents defining interfaces to specific embedding environments such as the Web. These will each define a WebAssembly application programming interface (API) suitable for a given environment.

1.1.3. Security Considerations

WebAssembly provides no ambient access to the computing environment in which code is executed. Any interaction with the environment, such as I/O, access to resources, or operating system calls, can only be performed by invoking functions provided by the embedder and imported into a WebAssembly module. An embedder can establish security policies suitable for a respective environment by controlling or limiting which functional capabilities it makes available for import. Such considerations are an embedder’s responsibility and the subject of API definitions for a specific environment.

Because WebAssembly is designed to be translated into machine code running directly on the host’s hardware, it is potentially vulnerable to side channel attacks on the hardware level. In environments where this is a concern, an embedder may have to put suitable mitigations into place to isolate WebAssembly computations.

1.1.4. Dependencies

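WebAssembly depends on two existing standards:

  • [IEEE-754-2019], for the representation of floating-point data and the semantics of respective numeric operations.

  • [UNICODE], for the representation of import/export names and the text format.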

However, to make this specification self-contained, relevant aspects of the aforementioned standards are defined and formalized as part of this specification, such as the binary representation and rounding of floating-point values, and the value range and UTF-8 encoding of Unicode characters.

Note

The aforementioned standards are the authoritative source of all respective definitions. Formalizations given in this specification are intended to match these definitions. Any discrepancy in the syntax or semantics described is to be considered an error.

1.2. Overview

1.2.1. Concepts

WebAssembly encodes a low-level, assembly-like programming language. This language is structured around the following concepts.

Values

WebAssembly provides only four basic number types. These are integers and [IEEE-754-2019] numbers, each in 32 and 64 bit width. 32 bit integers also serve as Booleans and as memory addresses. The usual operations on these types are available, including the full matrix of conversions between them. There is no distinction between signed and unsigned integer types. Instead, integers are interpreted by respective operations as either unsigned or signed in two’s complement representation.
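To illustrate the last point, here is a minimal Python sketch (not part of the specification; the helper names `to_unsigned` and `to_signed` are hypothetical) showing how one and the same 32-bit pattern can be read either as an unsigned integer or as a signed integer in two's complement representation.

```python
def to_unsigned(value: int, width: int = 32) -> int:
    """Reduce a Python integer to its unsigned bit pattern of the given width."""
    return value & ((1 << width) - 1)


def to_signed(bits: int, width: int = 32) -> int:
    """Read an unsigned bit pattern as a two's complement signed integer."""
    bits = to_unsigned(bits, width)
    sign_bit = 1 << (width - 1)
    return bits - (1 << width) if bits & sign_bit else bits


pattern = 0xFFFFFFFF          # one 32-bit pattern ...
print(to_unsigned(pattern))   # ... read as an unsigned integer
print(to_signed(pattern))     # ... read as a signed integer
```

The stored bits never change; only the operation that consumes them decides which reading applies.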

In addition to these basic number types, there is a single 128 bit wide vector type representing different types of packed data. The supported representations are 4 32-bit, or 2 64-bit [IEEE-754-2019] numbers, or different widths of packed integer values, specifically 2 64-bit integers, 4 32-bit integers, 8 16-bit integers, or 16 8-bit integers.
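The different packed representations can be pictured as alternative views of the same 16 bytes. The following Python sketch is only an illustration: it assumes a little-endian byte layout for the view, and the variable names are hypothetical.

```python
import struct

# One 128-bit value, written out as 16 raw bytes (little-endian view assumed).
raw = bytes(range(16))

lanes_i8x16 = struct.unpack("<16B", raw)  # 16 8-bit integers
lanes_i16x8 = struct.unpack("<8H", raw)   # 8 16-bit integers
lanes_i32x4 = struct.unpack("<4I", raw)   # 4 32-bit integers
lanes_i64x2 = struct.unpack("<2Q", raw)   # 2 64-bit integers
lanes_f32x4 = struct.unpack("<4f", raw)   # 4 32-bit floating-point numbers
lanes_f64x2 = struct.unpack("<2d", raw)   # 2 64-bit floating-point numbers
```

In every view the lane width multiplied by the number of lanes is 128 bits.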

Finally, values can consist of opaque references that represent pointers towards different sorts of entities. Unlike with other types, their size or representation is not observable.

Instructions

The computational model of WebAssembly is based on a stack machine. Code consists of sequences of instructions that are executed in order. Instructions manipulate values on an implicit operand stack [1] and fall into two main categories. Simple instructions perform basic operations on data. They pop arguments from the operand stack and push results back to it. Control instructions alter control flow. Control flow is structured, meaning it is expressed with well-nested constructs such as blocks, loops, and conditionals. Branches can only target such constructs.
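The behavior of simple instructions can be sketched with a toy evaluator. The Python code below is an illustration only: the tuple encoding of instructions and the function name `run` are assumptions made for this example, only three instructions are covered, and control instructions are left out entirely.

```python
MASK32 = 0xFFFFFFFF


def run(instrs):
    """Execute a sequence of simple instructions on a fresh operand stack."""
    stack = []
    for op, *immediates in instrs:
        if op == "i32.const":
            stack.append(immediates[0] & MASK32)   # push a constant
        elif op == "i32.add":
            rhs, lhs = stack.pop(), stack.pop()    # pop two arguments
            stack.append((lhs + rhs) & MASK32)     # push one result
        elif op == "i32.mul":
            rhs, lhs = stack.pop(), stack.pop()
            stack.append((lhs * rhs) & MASK32)
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return stack


# Computes (2 + 3) * 4 and leaves the result on the stack.
program = [("i32.const", 2), ("i32.const", 3), ("i32.add",),
           ("i32.const", 4), ("i32.mul",)]
result = run(program)
```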

Traps

Under some conditions, certain instructions may produce a trap, which immediately aborts execution. Traps cannot be handled by WebAssembly code, but are reported to the outside environment, where they typically can be caught.

Functions

Code is organized into separate functions. Each function takes a sequence of values as parameters and returns a sequence of values as results. Functions can call each other, including recursively, resulting in an implicit call stack that cannot be accessed directly. Functions may also declare mutable local variables that are usable as virtual registers.

Tables

A table is an array of opaque values of a particular reference type. It allows programs to select such values indirectly through a dynamic index operand. Thereby, for example, a program can call functions indirectly through a dynamic index into a table. This allows emulating function pointers by way of table indices.
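As a rough illustration of emulating function pointers, the following Python sketch models a table as a list and an indirect call as a checked lookup. The names `Trap` and `call_indirect` are hypothetical stand-ins for this example, and the check of the callee's function type that a real indirect call performs is omitted.

```python
class Trap(Exception):
    """Stands in for a WebAssembly trap in this sketch."""


def call_indirect(table, index, *args):
    """Call the function stored at a dynamic index of the table."""
    if not 0 <= index < len(table):
        raise Trap("table index out of bounds")
    func = table[index]
    if func is None:
        raise Trap("null table entry")
    return func(*args)


def double(x):
    return 2 * x


def negate(x):
    return -x


# The "function pointer" held by a program is just an index into this table.
table = [double, negate, None]
```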

Linear Memory

A linear memory is a contiguous, mutable array of raw bytes. Such a memory is created with an initial size but can be grown dynamically. A program can load and store values from/to a linear memory at any byte address (including unaligned). Integer loads and stores can specify a storage size which is smaller than the size of the respective value type. A trap occurs if an access is not within the bounds of the current memory size.
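The following Python sketch models these rules under stated assumptions: a page size of 64 KiB, little-endian byte order, and hypothetical names (`Memory`, `Trap`, `load`, `store`, `grow`). It is an illustration, not an implementation of the specification.

```python
PAGE_SIZE = 65536  # bytes per page (assumed for this sketch)


class Trap(Exception):
    """Stands in for a WebAssembly trap in this sketch."""


class Memory:
    def __init__(self, initial_pages, max_pages=None):
        self.data = bytearray(initial_pages * PAGE_SIZE)
        self.max_pages = max_pages

    def grow(self, delta_pages):
        """Grow by whole pages; return the old size in pages, or -1 on failure."""
        old_pages = len(self.data) // PAGE_SIZE
        if self.max_pages is not None and old_pages + delta_pages > self.max_pages:
            return -1
        self.data.extend(bytes(delta_pages * PAGE_SIZE))
        return old_pages

    def _check(self, addr, nbytes):
        # Any byte address is allowed, aligned or not, as long as it is in bounds.
        if addr < 0 or addr + nbytes > len(self.data):
            raise Trap("out of bounds memory access")

    def store(self, addr, value, nbytes):
        """Store the low nbytes of value, which may be narrower than its type."""
        self._check(addr, nbytes)
        mask = (1 << (8 * nbytes)) - 1
        self.data[addr:addr + nbytes] = (value & mask).to_bytes(nbytes, "little")

    def load(self, addr, nbytes, signed=False):
        self._check(addr, nbytes)
        return int.from_bytes(self.data[addr:addr + nbytes], "little", signed=signed)
```

For example, storing `0xABCD` with a storage size of 2 bytes at the unaligned address 3 succeeds, while a 2-byte load at the last byte of the memory traps.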

Modules

A WebAssembly binary takes the form of a module that contains definitions for functions, tables, and linear memories, as well as mutable or immutable global variables. Definitions can also be imported, specifying a module/name pair and a suitable type. Each definition can optionally be exported under one or more names. In addition to definitions, modules can define initialization data for their memories or tables that takes the form of segments copied to given offsets. They can also define a start function that is automatically executed.

Embedder

A WebAssembly implementation will typically be embedded into a host environment. This environment defines how loading of modules is initiated, how imports are provided (including host-side definitions), and how exports can be accessed. However, the details of any particular embedding are beyond the scope of this specification, and will instead be provided by complementary, environment-specific API definitions.

1.2.2. Semantic Phases

Conceptually, the semantics of WebAssembly is divided into three phases. For each part of the language, the specification specifies each of them.

Decoding

WebAssembly modules are distributed in a binary format. Decoding processes that format and converts it into an internal representation of a module. In this specification, this representation is modelled by abstract syntax, but a real implementation could compile directly to machine code instead.

Validation

A decoded module has to be valid. Validation checks a number of well-formedness conditions to guarantee that the module is meaningful and safe. In particular, it performs type checking of functions and the instruction sequences in their bodies, ensuring for example that the operand stack is used consistently.

Execution

Finally, a valid module can be executed. Execution can be further divided into two phases:

Instantiation. A module instance is the dynamic representation of a module, complete with its own state and execution stack. Instantiation executes the module body itself, given definitions for all its imports. It initializes globals, memories and tables and invokes the module’s start function if defined. It returns the instances of the module’s exports.

Invocation. Once instantiated, further WebAssembly computations can be initiated by invoking an exported function on a module instance. Given the required arguments, that executes the respective function and returns its results.

Instantiation and invocation are operations within the embedding environment.

2. Structure

2.1. Conventions

WebAssembly is a programming language that has multiple concrete representations (its binary format and the text format). Both map to a common structure. For conciseness, this structure is described in the form of an abstract syntax. All parts of this specification are defined in terms of this abstract syntax.

2.1.1. Grammar Notation

The following conventions are adopted in defining grammar rules for abstract syntax.

  • Terminal symbols (atoms) are written in sans-serif font or in symbolic form: $\mathsf{i32}, \mathsf{end}, {\rightarrow}, [, ]$.

  • Nonterminal symbols are written in italic font: $\mathit{valtype}, \mathit{instr}$.

  • $A^n$ is a sequence of $n \geq 0$ iterations of $A$.

  • $A^\ast$ is a possibly empty sequence of iterations of $A$. (This is a shorthand for $A^n$ used where $n$ is not relevant.)

  • $A^+$ is a non-empty sequence of iterations of $A$. (This is a shorthand for $A^n$ where $n \geq 1$.)

  • $A^?$ is an optional occurrence of $A$. (This is a shorthand for $A^n$ where $n \leq 1$.)

  • Productions are written $\mathit{sym} ::= A_1 ~|~ \dots ~|~ A_n$.

  • Large productions may be split into multiple definitions, indicated by ending the first one with explicit ellipses, $\mathit{sym} ::= A_1 ~|~ \dots$, and starting continuations with ellipses, $\mathit{sym} ::= \dots ~|~ A_2$.

  • Some productions are augmented with side conditions in parentheses, “$(\text{if}~\mathit{condition})$”, that provide a shorthand for a combinatorial expansion of the production into many separate cases.

  • If the same meta variable or non-terminal symbol appears multiple times in a production, then all those occurrences must have the same instantiation. (This is a shorthand for a side condition requiring multiple different variables to be equal.)

2.1.2. Auxiliary Notation

When dealing with syntactic constructs the following notation is also used:

  • $\epsilon$ denotes the empty sequence.

  • $|s|$ denotes the length of a sequence $s$.

  • $s[i]$ denotes the $i$-th element of a sequence $s$, starting from $0$.

  • $s[i : n]$ denotes the sub-sequence $s[i]~\dots~s[i+n-1]$ of a sequence $s$.

  • $s~\text{with}~[i] = A$ denotes the same sequence as $s$, except that the $i$-th element is replaced with $A$.

  • $s~\text{with}~[i : n] = A^n$ denotes the same sequence as $s$, except that the sub-sequence $s[i : n]$ is replaced with $A^n$.

  • $\mathrm{concat}(s^\ast)$ denotes the flat sequence formed by concatenating all sequences $s_i$ in $s^\ast$.

Moreover, the following conventions are employed:

  • The notation $x^n$, where $x$ is a non-terminal symbol, is treated as a meta variable ranging over respective sequences of $x$ (similarly for $x^\ast$, $x^+$, $x^?$).

  • When given a sequence $x^n$, then the occurrences of $x$ in a sequence written $(A_1~x~A_2)^n$ are assumed to be in point-wise correspondence with $x^n$ (similarly for $x^\ast$, $x^+$, $x^?$). This implicitly expresses a form of mapping syntactic constructions over a sequence.

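Productions of the following form are interpreted as records that map a fixed set of fields $\mathsf{field}_i$ to “values” $A_i$, respectively:

$r ::= \{ \mathsf{field}_1~A_1, \mathsf{field}_2~A_2, \dots \}$

The following notation is adopted for manipulating such records:

  • $r.\mathsf{field}$ denotes the contents of the $\mathsf{field}$ component of $r$.

  • $r~\text{with}~\mathsf{field} = A$ denotes the same record as $r$, except that the contents of the $\mathsf{field}$ component is replaced with $A$.

  • $r_1 \oplus r_2$ denotes the composition of two records with the same fields of sequences by appending each sequence point-wise:

    $\{ \mathsf{field}_1~A_1^\ast, \mathsf{field}_2~A_2^\ast, \dots \} \oplus \{ \mathsf{field}_1~B_1^\ast, \mathsf{field}_2~B_2^\ast, \dots \} = \{ \mathsf{field}_1~A_1^\ast~B_1^\ast, \mathsf{field}_2~A_2^\ast~B_2^\ast, \dots \}$

  • $\bigoplus r^\ast$ denotes the composition of a sequence of records, respectively; if the sequence is empty, then all fields of the resulting record are empty.

The update notation for sequences and records generalizes recursively to nested components accessed by “paths” $\mathit{pth} ::= ([\dots] ~|~ .\mathsf{field})^+$:

  • $s~\text{with}~[i]\,\mathit{pth} = A$ is short for $s~\text{with}~[i] = (s[i]~\text{with}~\mathit{pth} = A)$,

  • $r~\text{with}~\mathsf{field}\,\mathit{pth} = A$ is short for $r~\text{with}~\mathsf{field} = (r.\mathsf{field}~\text{with}~\mathit{pth} = A)$,

where $r~\text{with}~.\mathsf{field} = A$ is shortened to $r~\text{with}~\mathsf{field} = A$.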

2.1.3. Vectors

Vectors are bounded sequences of the form $A^n$ (or $A^\ast$), where the $A$ can either be values or complex constructions. A vector can have at most $2^{32}-1$ elements.

2.2. Values

WebAssembly programs operate on primitive numeric values. Moreover, in the definition of programs, immutable sequences of values occur to represent more complex data, such as text strings or other vectors.

2.2.1. Bytes

The simplest form of value are raw uninterpreted bytes. In the abstract syntax they are represented as hexadecimal literals.

2.2.1.1. Conventions
  • The meta variable $b$ ranges over bytes.

  • Bytes are sometimes interpreted as natural numbers $n < 256$.

2.2.2. Integers

Different classes of integers with different value ranges are distinguished by their bit width and by whether they are unsigned or signed.

The class $i\mathit{N}$ defines uninterpreted integers, whose signedness interpretation can vary depending on context. In the abstract syntax, they are represented as unsigned values. However, some operations convert them to signed based on a two’s complement interpretation.

Note

The main integer types occurring in this specification are u32, u64, s32, s64, i8, i16, i32, i64. However, other sizes occur as auxiliary constructions, e.g., in the definition of floating-point numbers.

2.2.2.1. Conventions
  • The meta variables $m, n, i$ range over integers.

  • Numbers may be denoted by simple arithmetics, as in the grammar above. In order to distinguish arithmetics like $2^N$ from sequences like $(1)^N$, the latter is distinguished with parentheses.

2.2.3. Floating-Point

Floating-point data represents 32 or 64 bit values that correspond to the respective binary formats of the [IEEE-754-2019] standard (Section 3.3).

Every value has a sign and a magnitude. Magnitudes can either be expressed as normal numbers of the form $m_0.m_1 m_2 \dots m_M \cdot 2^e$, where $e$ is the exponent and $m$ is the significand whose most significant bit $m_0$ is $1$, or as a subnormal number where the exponent is fixed to the smallest possible value and $m_0$ is $0$; among the subnormals are positive and negative zero values. Since the significands are binary values, normals are represented in the form $(1 + m \cdot 2^{-M}) \cdot 2^e$, where $M$ is the bit width of $m$; similarly for subnormals.

Possible magnitudes also include the special values $\infty$ (infinity) and $\mathsf{nan}$ (NaN, not a number). NaN values have a payload that describes the mantissa bits in the underlying binary representation. No distinction is made between signalling and quiet NaNs.

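$$
\begin{array}{llll}
f\mathit{N} &::=& {+}\,f\mathit{Nmag} ~|~ {-}\,f\mathit{Nmag} \\
f\mathit{Nmag} &::=& (1 + m \cdot 2^{-M}) \cdot 2^e & (\text{if}~ -2^{E-1}+2 \leq e \leq 2^{E-1}-1) \\
&|& (0 + m \cdot 2^{-M}) \cdot 2^e & (\text{if}~ e = -2^{E-1}+2) \\
&|& \infty \\
&|& \mathsf{nan}(n) & (\text{if}~ 1 \leq n < 2^M)
\end{array}
$$

where $M = \mathrm{signif}(N)$ and $E = \mathrm{expon}(N)$ with

$\mathrm{signif}(32) = 23$, $\mathrm{expon}(32) = 8$, $\mathrm{signif}(64) = 52$, $\mathrm{expon}(64) = 11$.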

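A canonical NaN is a floating-point value $\pm\mathsf{nan}(\mathrm{canon}_N)$ where $\mathrm{canon}_N$ is a payload whose most significant bit is $1$ while all others are $0$:

$\mathrm{canon}_N = 2^{\mathrm{signif}(N)-1}$

An arithmetic NaN is a floating-point value $\pm\mathsf{nan}(n)$ with $n \geq \mathrm{canon}_N$, such that the most significant bit is $1$ while all others are arbitrary.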
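The two NaN classes can be checked directly on bit patterns. The Python sketch below is an illustration under stated assumptions: it uses the IEEE 754 binary32 and binary64 layouts (23 and 52 significand field bits, 8 and 11 exponent field bits), and all function names are hypothetical.

```python
SIGNIF = {32: 23, 64: 52}   # bits in the significand field
EXPON = {32: 8, 64: 11}     # bits in the exponent field


def canon(n):
    """Canonical payload: only the most significant payload bit is set."""
    return 1 << (SIGNIF[n] - 1)


def nan_payload(bits, n):
    """Return the payload if `bits` encodes a NaN of width n, else None."""
    exponent = (bits >> SIGNIF[n]) & ((1 << EXPON[n]) - 1)
    payload = bits & ((1 << SIGNIF[n]) - 1)
    if exponent == (1 << EXPON[n]) - 1 and payload != 0:
        return payload
    return None  # infinity (payload 0) or an ordinary number


def is_canonical_nan(bits, n):
    return nan_payload(bits, n) == canon(n)


def is_arithmetic_nan(bits, n):
    payload = nan_payload(bits, n)
    return payload is not None and payload >= canon(n)
```

The sign bit is ignored by both checks, so a canonical NaN may be positive or negative, and every canonical NaN is also an arithmetic NaN.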

Note

In the abstract syntax, subnormals are distinguished by the leading 0 of the significand. The exponent of subnormals has the same value as the smallest possible exponent of a normal number. Only in the binary representation the exponent of a subnormal is encoded differently than the exponent of any normal number.

The notion of canonical NaN defined here is unrelated to the notion of canonical NaN that the [IEEE-754-2019] standard (Section 3.5.2) defines for decimal interchange formats.

2.2.3.1. Conventions
  • The meta variable $z$ ranges over floating-point values where clear from context.

2.2.4. Vectors

Numeric vectors are 128-bit values that are processed by vector instructions (also known as SIMD instructions, single instruction multiple data). They are represented in the abstract syntax using i128. The interpretation of lane types (integer or floating-point numbers) and lane sizes is determined by the specific instruction operating on them.

2.2.5. Names

Names are sequences of characters, which are scalar values as defined by [UNICODE] (Section 2.4).

Due to the limitations of the binary format, the length of a name is bounded by the length of its UTF-8 encoding.

2.2.5.1. Convention
  • Characters (Unicode scalar values) are sometimes used interchangeably with natural numbers $n < 1114112$.

2.3. Types

Various entities in WebAssembly are classified by types. Types are checked during validation, instantiation, and possibly execution.

2.3.1. Number Types

Number types classify numeric values.

The types i32 and i64 classify 32 and 64 bit integers, respectively. Integers are not inherently signed or unsigned; their interpretation is determined by individual operations.

The types f32 and f64 classify 32 and 64 bit floating-point data, respectively. They correspond to the respective binary floating-point representations, also known as single and double precision, as defined by the [IEEE-754-2019] standard (Section 3.3).

Number types are transparent, meaning that their bit patterns can be observed. Values of number type can be stored in memories.

2.3.1.1. Conventions
  • The notation $|t|$ denotes the bit width of a number type $t$. That is, $|\mathsf{i32}| = |\mathsf{f32}| = 32$ and $|\mathsf{i64}| = |\mathsf{f64}| = 64$.

2.3.2. Vector Types

Vector types classify vectors of numeric values processed by vector instructions (also known as SIMD instructions, single instruction multiple data).

The type v128 corresponds to a 128 bit vector of packed integer or floating-point data. The packed data can be interpreted as signed or unsigned integers, single or double precision floating-point values, or a single 128 bit type. The interpretation is determined by individual operations.

Vector types, like number types, are transparent, meaning that their bit patterns can be observed. Values of vector type can be stored in memories.

2.3.2.1. Conventions
  • The notation $|t|$ for bit width extends to vector types as well, that is, $|\mathsf{v128}| = 128$.

2.3.3. Heap Types

Heap types classify objects in the runtime store. There are three disjoint hierarchies of heap types:

  • function types classify functions,

  • aggregate types classify dynamically allocated managed data, such as structures, arrays, or unboxed scalars,

  • external types classify external references possibly owned by the embedder.

The values from the latter two hierarchies are interconvertible by way of the any.convert_extern and extern.convert_any instructions. That is, both type hierarchies are inhabited by an isomorphic set of values, but may have different, incompatible representations in practice.

A heap type is either abstract or concrete.

The abstract type func denotes the common supertype of all function types, regardless of their concrete definition. Dually, the type nofunc denotes the common subtype of all function types, regardless of their concrete definition. This type has no values.

The abstract type extern denotes the common supertype of all external references received through the embedder. This type has no concrete subtypes. Dually, the type noextern denotes the common subtype of all forms of external references. This type has no values.

The abstract type any denotes the common supertype of all aggregate types, as well as possibly abstract values produced by internalizing an external reference of type extern. Dually, the type none denotes the common subtype of all forms of aggregate types. This type has no values.

The abstract type eq is a subtype of any that includes all types for which references can be compared, i.e., aggregate values and i31.

The abstract types struct and array denote the common supertypes of all structure and array aggregates, respectively.

The abstract type i31 denotes unboxed scalars, that is, integers injected into references. Their observable value range is limited to 31 bits.

Note

An i31 is not actually allocated in the store, but represented in a way that allows it to be mixed with actual references into the store without ambiguity. Engines need to perform some form of pointer tagging to achieve this, which is why 1 bit is reserved.

Although the types none, nofunc, and noextern are not inhabited by any values, they can be used to form the types of all null references in their respective hierarchy. For example, (ref null nofunc) is the generic type of a null reference compatible with all function reference types.
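The pointer tagging mentioned in the note above can be sketched as follows. This Python code is purely illustrative: the choice of the low bit as the tag, the word layout, and all function names are assumptions, since the specification does not prescribe any representation for unboxed scalars.

```python
MASK31 = (1 << 31) - 1


def ref_i31(i32_value):
    """Keep the low 31 bits of an integer and tag the word as a scalar."""
    return ((i32_value & MASK31) << 1) | 1   # low bit 1: unboxed scalar


def is_i31(word):
    """Distinguish a tagged scalar from an (even) store address."""
    return word & 1 == 1


def i31_get_u(word):
    """Recover the 31-bit value with zero extension."""
    return (word >> 1) & MASK31


def i31_get_s(word):
    """Recover the 31-bit value with sign extension (as a Python integer)."""
    value = i31_get_u(word)
    return value - (1 << 31) if value & (1 << 30) else value
```

Because one bit of the word is spent on the tag, only 31 bits remain for the value, which matches the observable value range stated above.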

A concrete heap type consists of a type index and classifies an object of the respective type defined in a module.

The syntax of heap types is extended with additional forms for the purpose of specifying validation and execution.

2.3.4. Reference Types

Reference types classify values that are first-class references to objects in the runtime store.

A reference type is characterised by the heap type it points to.

In addition, a reference type of the form (ref null ht) is nullable, meaning that it can either be a proper reference to ht or null. Other references are non-null.

Reference types are opaque, meaning that neither their size nor their bit pattern can be observed. Values of reference type can be stored in tables.

2.3.4.1. Conventions

2.3.5. Value Types

Value types classify the individual values that WebAssembly code can compute with and the values that a variable accepts. They are either number types, vector types, or reference types.

The syntax of value types is extended with additional forms for the purpose of specifying validation.

2.3.5.1. Conventions
  • The meta variable $t$ ranges over value types or subclasses thereof where clear from context.

2.3.6. Result Types

Result types classify the result of executing instructions or functions, which is a sequence of values, written with brackets.

2.3.7. Function Types

Function types classify the signature of functions, mapping a vector of parameters to a vector of results. They are also used to classify the inputs and outputs of instructions.

2.3.8. Aggregate Types

Aggregate types describe compound objects consisting of multiple values. These are either structures or arrays, which both consist of a list of possibly mutable and possibly packed fields. Structures are heterogeneous, but require static indexing, while arrays need to be homogeneous, but allow dynamic indexing.

2.3.8.1. Conventions

2.3.9. Composite Types

Composite types are all types composed from simpler types, including function types and aggregate types.

2.3.10. Recursive Types

Recursive types denote a group of mutually recursive composite types, each of which can optionally declare a list of type indices of supertypes that it matches. Each type can also be declared final, preventing further subtyping.

In a module, each member of a recursive type is assigned a separate type index.

The syntax of sub types is generalized for the purpose of specifying validation and execution.

2.3.11. Limits

Limits classify the size range of resizeable storage associated with memory types and table types.

If no maximum is given, the respective storage can grow to any size.
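A minimal Python sketch of this rule, with hypothetical names (`Limits`, `within`) chosen only for illustration:

```python
from typing import NamedTuple, Optional


class Limits(NamedTuple):
    minimum: int
    maximum: Optional[int] = None   # None models "no maximum is given"


def within(limits, size):
    """Check whether a storage size lies in the range described by the limits."""
    if size < limits.minimum:
        return False
    return limits.maximum is None or size <= limits.maximum
```

With `Limits(1, 4)` the sizes 1 through 4 are accepted, while `Limits(1)` accepts any size of at least 1, subject to whatever a particular implementation can actually allocate.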

2.3.12. Memory Types

Memory types classify linear memories and their size range.

The limits constrain the minimum and optionally the maximum size of a memory. The limits are given in units of page size.

2.3.13. Table Types

Table types classify tables over elements of reference type within a size range.

Like memories, tables are constrained by limits for their minimum and optionally maximum size. The limits are given in numbers of entries.

2.3.14. Global Types

Global types classify global variables, which hold a value and can either be mutable or immutable.

2.3.15. External Types

External types classify imports and external values with their respective types.

2.3.15.1. Conventions

The following auxiliary notation is defined for sequences of external types. It filters out entries of a specific kind in an order-preserving fashion:
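The idea of such an order-preserving filter can be sketched in Python. The pair encoding `(kind, type)` and the function name `filter_kind` are assumptions made for this example, and the type strings are placeholders rather than real syntax.

```python
def filter_kind(kind, externtypes):
    """Keep only the entries of one kind, preserving their relative order."""
    return [typ for entry_kind, typ in externtypes if entry_kind == kind]


# Hypothetical sequence of external types, each tagged with its kind.
externtypes = [
    ("func", "[i32] -> [i32]"),
    ("mem", "min 1, max 2"),
    ("func", "[] -> []"),
    ("global", "mut i64"),
]

funcs = filter_kind("func", externtypes)
```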

2.4. Instructions

WebAssembly code consists of sequences of instructions. Its computational model is based on a stack machine in that instructions manipulate values on an implicit operand stack, consuming (popping) argument values and producing or returning (pushing) result values.

In addition to dynamic operands from the stack, some instructions also have static immediate arguments, typically indices or type annotations, which are part of the instruction itself.

Some instructions are structured in that they bracket nested sequences of instructions.

The following sections group instructions into a number of different categories.

2.4.1. Numeric Instructions

Numeric instructions provide basic operations over numeric values of specific type. These operations closely match respective operations available in hardware.

Numeric instructions are divided by number type. For each type, several subcategories can be distinguished:

  • Constants: return a static constant.

  • Unary Operations: consume one operand and produce one result of the respective type.

  • Binary Operations: consume two operands and produce one result of the respective type.

  • Tests: consume one operand of the respective type and produce a Boolean integer result.

  • Comparisons: consume two operands of the respective type and produce a Boolean integer result.

  • Conversions: consume a value of one type and produce a result of another (the source type of the conversion is the one after the “_”).

Some integer instructions come in two flavors, where a signedness annotation distinguishes whether the operands are to be interpreted as unsigned or signed integers. For the other integer instructions, the use of two’s complement for the signed interpretation means that they behave the same regardless of signedness.
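The difference between the two groups can be seen in a small Python sketch. It models 32-bit values as unsigned Python integers; the function names mirror instruction names for readability, but the code is an illustration and not a definition of those instructions.

```python
MASK32 = 0xFFFFFFFF


def signed(x):
    """Two's complement reading of a 32-bit pattern."""
    return x - (1 << 32) if x & 0x80000000 else x


def i32_add(a, b):
    # No signedness annotation: wrap-around addition gives the same bits
    # whether the operands are read as signed or unsigned.
    return (a + b) & MASK32


def i32_lt_u(a, b):
    return int(a < b)


def i32_lt_s(a, b):
    return int(signed(a) < signed(b))


def i32_shr_u(a, k):
    return a >> (k % 32)                      # shifts in zero bits


def i32_shr_s(a, k):
    return (signed(a) >> (k % 32)) & MASK32   # shifts in copies of the sign bit
```

For the pattern `0xFFFFFFFF`, the unsigned comparison with 1 yields 0 and the signed comparison yields 1, whereas adding 1 yields 0 under both readings.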

2.4.1.1. Conventions

Occasionally, it is convenient to group operators together according to the following grammar shorthands:

2.4.2. Vector Instructions

Vector instructions (also known as SIMD instructions, single instruction multiple data) provide basic operations over values of vector type.

Vector instructions have a naming convention involving a prefix that determines how their operands will be interpreted. This prefix describes the shape of the operand, written $t\mathsf{x}N$, and consisting of a packed numeric type $t$ and the number of lanes $N$ of that type. Operations are performed point-wise on the values of each lane.

Note

For example, the shape i32x4 interprets the operand as four i32 values, packed into an i128. The bit width of the numeric type $t$ times $N$ always is 128.

Instructions prefixed with v128 do not involve a specific interpretation, and treat the v128 as an i128 value or a vector of 128 individual bits.
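How the shape prefix changes the meaning of an operation on the same 128 bits can be sketched in Python. The code assumes a little-endian lane layout for the 16-byte values, and the function names merely echo instruction names; it is an illustration, not a definition.

```python
import struct


def i32x4_add(a, b):
    """Point-wise addition on four 32-bit lanes, wrapping within each lane."""
    lanes = zip(struct.unpack("<4I", a), struct.unpack("<4I", b))
    return struct.pack("<4I", *((x + y) & 0xFFFFFFFF for x, y in lanes))


def i8x16_add(a, b):
    """Point-wise addition on sixteen 8-bit lanes, wrapping within each lane."""
    return bytes((x + y) & 0xFF for x, y in zip(a, b))


def v128_and(a, b):
    """Bitwise operation without any lane interpretation."""
    return bytes(x & y for x, y in zip(a, b))


a = struct.pack("<4I", 0xFF, 2, 3, 0xFFFFFFFF)
b = struct.pack("<4I", 1, 20, 30, 1)
```

With these operands the first lane of the 32-bit addition carries into the next byte and becomes `0x100`, while the 8-bit addition wraps the first byte to 0 without a carry.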

Vector instructions can be grouped into several subcategories:

  • Constants: return a static constant.

  • Unary Operations: consume one operand and produce one result.

  • Binary Operations: consume two operands and produce one result.

  • Ternary Operations: consume three operands and produce one result.

  • Tests: consume one operand and produce a Boolean integer result.

  • Shifts: consume a v128 operand and an i32 operand, producing one v128 result.

  • Splats: consume a value of numeric type and produce a v128 result of a specified shape.

  • Extract lanes: consume a v128 operand and return the numeric value in a given lane.

  • Replace lanes: consume a v128 operand and a numeric value for a given lane, and produce a v128 result.

Some vector instructions have a signedness annotation which distinguishes whether the elements in the operands are to be interpreted as unsigned or signed integers. For the other vector instructions, the use of two’s complement for the signed interpretation means that they behave the same regardless of signedness.

2.4.2.1. Conventions

Occasionally, it is convenient to group operators together according to the following grammar shorthands:

2.4.3. Reference Instructions

Instructions in this group are concerned with accessing references.

The ref.null and ref.func instructions produce a null value or a reference to a given function, respectively.

The ref.is_null instruction checks for null, while ref.as_non_null converts a nullable reference to a non-null one, and traps if it encounters null.

The ref.eq instruction compares two references.

The instructions ref.test and ref.cast test the dynamic type of a reference operand. The former merely returns the result of the test, while the latter performs a downcast and traps if the operand’s type does not match.

Note

The br_on_cast and br_on_cast_fail instructions provide versions of the latter that branch depending on the success of the downcast instead of trapping.

2.4.4. Aggregate Instructions

Instructions in this group are concerned with creating and accessing references to aggregate types.

The instructions struct.new and struct.new_default allocate a new structure, initializing it either with operands or with default values. The remaining instructions on structs access individual fields, allowing for different sign extension modes in the case of packed storage types.

Similarly, arrays can be allocated either with an explicit initialization operand or a default value. Furthermore, array.new_fixed allocates an array with statically fixed size, and array.new_data and array.new_elem allocate an array and initialize it from a data or element segment, respectively. array.get, array.get_s, array.get_u, and array.set access individual slots, again allowing for different sign extension modes in the case of a packed storage type. array.len produces the length of an array. array.fill fills a specified slice of an array with a given value and array.copy, array.init_data, and array.init_elem copy elements to a specified slice of an array from a given array, data segment, or element segment, respectively.

The instructions and convert between type and an unboxed scalar.

The instructions and allow lossless conversion between references represented as type .

2.4.5. Parametric Instructions

Instructions in this group can operate on operands of any value type.

The drop instruction simply throws away a single operand.

The select instruction selects one of its first two operands based on whether its third operand is zero or not. It may include a value type determining the type of these operands. If missing, the operands must be of numeric type.

Note

In future versions of WebAssembly, the type annotation on select may allow for more than a single value being selected at the same time.
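
As an informal illustration, the effect of these two instructions on the operand stack can be sketched in Python. The function names and the representation of the stack as a list whose last element is the top are assumptions made for this example only; typing of the operands is not modelled.

```python
def drop(stack: list) -> None:
    """Discard the operand on top of the stack."""
    stack.pop()


def select(stack: list) -> None:
    """Pop a condition and two operands, then push back the first operand
    if the condition is non-zero, or the second operand otherwise."""
    cond = stack.pop()
    val2 = stack.pop()
    val1 = stack.pop()
    stack.append(val1 if cond != 0 else val2)
```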

2.4.6. Variable Instructions

Variable instructions are concerned with access to local or global variables.

These instructions get or set the values of variables, respectively. The local.tee instruction is like local.set but also returns its argument.

2.4.7. Table Instructions

Instructions in this group are concerned with tables.

The table.get and table.set instructions load or store an element in a table, respectively.

The table.size instruction returns the current size of a table. The table.grow instruction grows a table by a given delta and returns the previous size, or -1 if enough space cannot be allocated. It also takes an initialization value for the newly allocated entries.

The table.fill instruction sets all entries in a range to a given value.

The table.copy instruction copies elements from a source table region to a possibly overlapping destination region; the first index denotes the destination. The table.init instruction copies elements from a passive element segment into a table. The elem.drop instruction prevents further use of a passive element segment. This instruction is intended to be used as an optimization hint. After an element segment is dropped its elements can no longer be retrieved, so the memory used by this segment may be freed.

An additional instruction that accesses a table is the control instruction call_indirect.
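
The growing and filling behaviour described above can be sketched informally in Python. The sketch models a table as a plain list; the names table_grow, table_fill, Trap and MAX_TABLE_SIZE are hypothetical, and the fallback bound used when no maximum is declared is an assumption of the example. An implementation may additionally fail to grow for resource reasons, which is not modelled.

```python
class Trap(Exception):
    """Models a WebAssembly trap."""


MAX_TABLE_SIZE = 2**32 - 1  # assumed bound when the table type has no maximum


def table_grow(table: list, delta: int, init, maximum=None) -> int:
    """Append `delta` entries set to `init`; return the previous size or -1."""
    old_size = len(table)
    limit = MAX_TABLE_SIZE if maximum is None else maximum
    if old_size + delta > limit:
        return -1  # failure is reported by value, not by a trap
    table.extend([init] * delta)
    return old_size


def table_fill(table: list, start: int, value, count: int) -> None:
    """Set `count` entries beginning at `start` to `value`."""
    if start + count > len(table):
        raise Trap("out of bounds table access")
    table[start:start + count] = [value] * count
```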

2.4.8. Memory Instructions

Instructions in this group are concerned with linear memory.

Memory is accessed with load and store instructions for the different number types. They all take a memory immediate memarg that contains an address offset and the expected alignment (expressed as the exponent of a power of 2). Integer loads and stores can optionally specify a storage size that is smaller than the bit width of the respective value type. In the case of loads, a sign extension mode is then required to select appropriate behavior.

Vector loads can specify a shape that is half the bit width of v128. Each lane is half its usual size, and the sign extension mode then specifies how the smaller lane is extended to the larger lane. Alternatively, vector loads can perform a splat, such that only a single lane of the specified storage size is loaded, and the result is duplicated to all lanes.

The static address offset is added to the dynamic address operand, yielding a 33 bit effective address that is the zero-based index at which the memory is accessed. All values are read and written in little endian byte order. A trap results if any of the accessed memory bytes lies outside the address range implied by the memory’s current size.

Note

Future versions of WebAssembly might provide memory instructions with 64 bit address ranges.
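
The address computation, bounds check, byte order, and sign extension described above can be sketched informally in Python. This is an illustration under simplifying assumptions, not a normative definition: memory is modelled as a bytearray, and the names effective_address, load, store and Trap are hypothetical.

```python
class Trap(Exception):
    """Models a WebAssembly trap."""


def effective_address(base: int, offset: int) -> int:
    """Add the static offset to the dynamic address without wrapping.

    Both inputs are unsigned 32-bit values, so the sum needs up to 33 bits."""
    if not (0 <= base < 2**32 and 0 <= offset < 2**32):
        raise ValueError("base and offset must be unsigned 32-bit values")
    return base + offset


def load(mem: bytearray, base: int, offset: int, width: int,
         signed: bool = False) -> int:
    """Read `width` bytes in little endian order, trapping when out of bounds.

    `signed` plays the role of the sign extension mode of a narrow load."""
    ea = effective_address(base, offset)
    if ea + width > len(mem):
        raise Trap("out of bounds memory access")
    return int.from_bytes(mem[ea:ea + width], "little", signed=signed)


def store(mem: bytearray, base: int, offset: int, width: int, value: int) -> None:
    """Write the low `width` bytes of `value` in little endian order."""
    ea = effective_address(base, offset)
    if ea + width > len(mem):
        raise Trap("out of bounds memory access")
    mem[ea:ea + width] = (value % (1 << (8 * width))).to_bytes(width, "little")
```

Because the sum is not truncated to 32 bits, an access with a large base and a large offset is detected as out of bounds rather than wrapping around to a low address.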

The memory.size instruction returns the current size of a memory. The memory.grow instruction grows memory by a given delta and returns the previous size, or -1 if enough memory cannot be allocated. Both instructions operate in units of page size.
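
A minimal Python sketch of these two instructions follows. It assumes the page size of 64 KiB and a hard upper bound of 2^16 pages when no maximum is declared; the class and method names are hypothetical. An embedder may also refuse to grow for lack of resources, which the sketch does not model.

```python
PAGE_SIZE = 65536   # bytes per page
MAX_PAGES = 2**16   # assumed bound when the memory type has no maximum


class Memory:
    def __init__(self, min_pages: int, max_pages=None):
        self.data = bytearray(min_pages * PAGE_SIZE)  # zero-initialized
        self.max_pages = max_pages

    def size(self) -> int:
        """Current size in pages."""
        return len(self.data) // PAGE_SIZE

    def grow(self, delta: int) -> int:
        """Grow by `delta` pages; return the previous size in pages, or -1."""
        old_pages = self.size()
        limit = MAX_PAGES if self.max_pages is None else self.max_pages
        if old_pages + delta > limit:
            return -1
        self.data.extend(bytes(delta * PAGE_SIZE))  # new pages are zeroed
        return old_pages
```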

The memory.fill instruction sets all values in a region to a given byte. The memory.copy instruction copies data from a source memory region to a possibly overlapping destination region. The memory.init instruction copies data from a passive data segment into a memory. The data.drop instruction prevents further use of a passive data segment. This instruction is intended to be used as an optimization hint. After a data segment is dropped its data can no longer be retrieved, so the memory used by this segment may be freed.

Note

In the current version of WebAssembly, all memory instructions implicitly operate on memory index 0. This restriction may be lifted in future versions.
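
The bulk memory instructions described above can be sketched informally in Python. The sketch assumes a single memory modelled as a bytearray and a list of passive data segments; all function names are hypothetical, and a dropped segment is modelled as an empty byte string.

```python
class Trap(Exception):
    """Models a WebAssembly trap."""


def memory_fill(mem: bytearray, dest: int, value: int, count: int) -> None:
    if dest + count > len(mem):
        raise Trap("out of bounds memory access")
    mem[dest:dest + count] = bytes([value & 0xFF]) * count


def memory_copy(mem: bytearray, dest: int, src: int, count: int) -> None:
    if src + count > len(mem) or dest + count > len(mem):
        raise Trap("out of bounds memory access")
    # The source slice is copied before assignment, so overlap is handled.
    mem[dest:dest + count] = mem[src:src + count]


def memory_init(mem: bytearray, segments: list, x: int,
                dest: int, src: int, count: int) -> None:
    segment = segments[x]
    if src + count > len(segment) or dest + count > len(mem):
        raise Trap("out of bounds memory access")
    mem[dest:dest + count] = segment[src:src + count]


def data_drop(segments: list, x: int) -> None:
    segments[x] = b""  # the segment's bytes can no longer be retrieved
```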

2.4.9. Control Instructions

Instructions in this group affect the flow of control.

The nop instruction does nothing.

The unreachable instruction causes an unconditional trap.

The block, loop and if instructions are structured instructions. They bracket nested sequences of instructions, called blocks, terminated with, or separated by, end or else pseudo-instructions. As the grammar prescribes, they must be well-nested.

A structured instruction can consume input and produce output on the operand stack according to its annotated block type. It is given either as a type index that refers to a suitable function type reinterpreted as an instruction type, or as an optional value type inline, which is a shorthand for the instruction type [] → [valtype?].

Each structured control instruction introduces an implicit label. Labels are targets for branch instructions that reference them with label indices. Unlike with other index spaces, indexing of labels is relative by nesting depth, that is, label 0 refers to the innermost structured control instruction enclosing the referring branch instruction, while increasing indices refer to those farther out. Consequently, labels can only be referenced from within the associated structured control instruction. This also implies that branches can only be directed outwards, “breaking” from the block of the control construct they target. The exact effect depends on that control construct. In case of block or if it is a forward jump, resuming execution after the matching end. In case of loop it is a backward jump to the beginning of the loop.

Note

This enforces structured control flow. Intuitively, a branch targeting a block or if behaves like a break statement in most C-like languages, while a branch targeting a loop behaves like a continue statement.
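
The relative indexing of labels can be illustrated with a small Python sketch. It assumes the enclosing structured instructions are given as a list of their names from outermost to innermost; the function name resolve_label is hypothetical.

```python
def resolve_label(enclosing: list, label: int) -> tuple:
    """Return the structured instruction targeted by `label` and the
    direction of the jump. Label 0 denotes the innermost construct."""
    if not 0 <= label < len(enclosing):
        raise IndexError("label index out of range")
    target = enclosing[len(enclosing) - 1 - label]
    direction = "backward" if target == "loop" else "forward"
    return target, direction
```

For a branch placed inside an if that is nested in a loop that is nested in a block, label 0 targets the if, label 1 the loop, and label 2 the block.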

Branch instructions come in several flavors: br performs an unconditional branch, br_if performs a conditional branch, and br_table performs an indirect branch through an operand indexing into the label vector that is an immediate to the instruction, or to a default target if the operand is out of bounds. The br_on_null and br_on_non_null instructions check whether a reference operand is null and branch if that is the case or not the case, respectively. Similarly, br_on_cast and br_on_cast_fail attempt a downcast on a reference operand and branch if that succeeds, or fails, respectively.

The return instruction is a shortcut for an unconditional branch to the outermost block, which implicitly is the body of the current function. Taking a branch unwinds the operand stack up to the height where the targeted structured control instruction was entered. However, branches may additionally consume operands themselves, which they push back on the operand stack after unwinding. Forward branches require operands according to the output of the targeted block’s type, i.e., represent the values produced by the terminated block. Backward branches require operands according to the input of the targeted block’s type, i.e., represent the values consumed by the restarted block.
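
The unwinding performed by a branch can be sketched in Python as follows. The sketch assumes the operand stack is a list whose last element is the top, that entry_height is the stack height recorded when the targeted construct was entered, and that arity is the number of operands the branch carries; all three names are hypothetical.

```python
def take_branch(stack: list, entry_height: int, arity: int) -> None:
    """Keep the top `arity` operands, unwind the stack to `entry_height`,
    then push the kept operands back."""
    if len(stack) - arity < entry_height:
        raise ValueError("not enough operands above the entry height")
    operands = stack[len(stack) - arity:]
    del stack[entry_height:]
    stack.extend(operands)
```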

The call instruction invokes another function, consuming the necessary arguments from the stack and returning the result values of the call. The call_ref instruction invokes a function indirectly through a function reference operand. The call_indirect instruction calls a function indirectly through an operand indexing into a table that is denoted by a table index and must contain function references. Since it may contain functions of heterogeneous type, the callee is dynamically checked against the function type indexed by the instruction’s second immediate, and the call is aborted with a trap if it does not match.
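
The dynamic check performed by an indirect call through a table can be sketched in Python. The sketch is a simplification: table entries are modelled as None or as pairs of a function type and a Python callable, and the type check is approximated by equality of function types rather than by matching. The names call_indirect and Trap are hypothetical.

```python
class Trap(Exception):
    """Models a WebAssembly trap."""


def call_indirect(table: list, expected_type: tuple, index: int, args: list):
    """Call the function stored at `index`, trapping if the slot is out of
    bounds, null, or holds a function of a different type."""
    if not 0 <= index < len(table):
        raise Trap("undefined element")
    entry = table[index]
    if entry is None:
        raise Trap("uninitialized element")
    func_type, func = entry
    if func_type != expected_type:  # simplification: equality, not subtyping
        raise Trap("indirect call type mismatch")
    return func(*args)
```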

The return_call, return_call_ref, and return_call_indirect instructions are tail-call variants of the previous ones. That is, they first return from the current function before actually performing the respective call. It is guaranteed that no sequence of nested calls using only these instructions can cause resource exhaustion due to hitting an implementation’s limit on the number of active calls.

2.4.10. Expressions

Function bodies, initialization values for globals, elements and offsets of element segments, and offsets of data segments are given as expressions, which are sequences of instructions terminated by an end marker.

In some places, validation restricts expressions to be constant, which limits the set of allowable instructions.

2.5. Modules

WebAssembly programs are organized into modules, which are the unit of deployment, loading, and compilation. A module collects definitions for types, functions, tables, memories, and globals. In addition, it can declare imports and exports and provide initialization in the form of data and element segments, or a start function.

Each of the vectors – and thus the entire module – may be empty.

2.5.1. Indices

Definitions are referenced with zero-based indices. Each class of definition has its own index space, as distinguished by the following classes.

The index space for functions, tables, memories and globals includes respective imports declared in the same module. The indices of these imports precede the indices of other definitions in the same index space.

Element indices reference element segments and data indices reference data segments.

The index space for locals is only accessible inside a function and includes the parameters of that function, which precede the local variables.

Label indices reference structured control instructions inside an instruction sequence.

Each aggregate type provides an index space for its fields.
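
As an informal illustration of how imports and parameters occupy the low end of their index spaces, consider the following Python sketch. The function names and the use of strings to stand for definitions are assumptions of the example.

```python
def build_index_space(imports: list, definitions: list) -> list:
    """Index space for functions, tables, memories, or globals:
    imports come first, followed by the module's own definitions."""
    return list(imports) + list(definitions)


def build_local_space(params: list, declared_locals: list) -> list:
    """Index space for locals: parameters precede declared locals."""
    return list(params) + list(declared_locals)
```

With one imported function and two defined ones, the import receives index 0 and the first defined function receives index 1.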

2.5.1.1. Conventions
  • The meta variable l ranges over label indices.

  • The meta variables x, y range over indices in any of the other index spaces.

  • The notation idx(A) denotes the set of indices from index space idx occurring free in A. Sometimes this set is reinterpreted as the vector of its elements.

Note

For example, if instr* is (data.drop x) (memory.init y), then dataidx(instr*) = {x, y}, or equivalently, the vector x y.

2.5.2. Types

The types component of a module defines a vector of recursive types, each consisting of a list of sub types referenced by individual type indices. All function or aggregate types used in a module must be defined in this component.

2.5.3. Functions

The funcs component of a module defines a vector of functions with the following structure:

The type of a function declares its signature by reference to a type defined in the module. The parameters of the function are referenced through 0-based local indices in the function’s body; they are mutable.

The locals declare a vector of mutable local variables and their types. These variables are referenced through local indices in the function’s body. The index of the first local is the smallest index not referencing a parameter.

The body is an instruction sequence that upon termination must produce a stack matching the function type’s result type.

Functions are referenced through function indices, starting with the smallest index not referencing a function import.

2.5.4. Tables

The tables component of a module defines a vector of tables described by their table type:

A table is an array of opaque values of a particular reference type. Moreover, each table slot is initialized with the value given by a constant initializer expression. Tables can further be initialized through element segments.

The min size in the limits of the table type specifies the initial size of that table, while its max, if present, restricts the size to which it can grow later.

Tables are referenced through table indices, starting with the smallest index not referencing a table import. Most constructs implicitly reference table index 0.

2.5.5. Memories

The mems component of a module defines a vector of linear memories (or memories for short) as described by their memory type:

A memory is a vector of raw uninterpreted bytes. The min size in the limits of the memory type specifies the initial size of that memory, while its max, if present, restricts the size to which it can grow later. Both are in units of page size.

Memories can be initialized through data segments.

Memories are referenced through memory indices, starting with the smallest index not referencing a memory import. Most constructs implicitly reference memory index 0.

Note

In the current version of WebAssembly, at most one memory may be defined or imported in a single module, and all constructs implicitly reference this memory 0. This restriction may be lifted in future versions.

2.5.6. Globals

The globals component of a module defines a vector of global variables (or globals for short):

Each global stores a single value of the given global type. Its type also specifies whether a global is immutable or mutable. Moreover, each global is initialized with an init value given by a constant initializer expression.

Globals are referenced through global indices, starting with the smallest index not referencing a global import.

2.5.7. Element Segments

The initial contents of a table are uninitialized. Element segments can be used to initialize a subrange of a table from a static vector of elements.

The elems component of a module defines a vector of element segments. Each element segment defines a reference type and a corresponding list of constant element expressions.

Element segments have a mode that identifies them as either passive, active, or declarative. A passive element segment’s elements can be copied to a table using the table.init instruction. An active element segment copies its elements into a table during instantiation, as specified by a table index and a constant expression defining an offset into that table. A declarative element segment is not available at runtime but merely serves to forward-declare references that are formed in code with instructions like ref.func.

The offset is given by a constant expression.

Element segments are referenced through element indices.

2.5.8. Data Segments

The initial contents of a memory are zero bytes. Data segments can be used to initialize a range of memory from a static vector of bytes.

The datas component of a module defines a vector of data segments.

Like element segments, data segments have a mode that identifies them as either passive or active. A passive data segment’s contents can be copied into a memory using the memory.init instruction. An active data segment copies its contents into a memory during instantiation, as specified by a memory index and a constant expression defining an offset into that memory.

Data segments are referenced through data indices.

Note

In the current version of WebAssembly, at most one memory is allowed in a module. Consequently, the only valid memidx is 0.

2.5.9. Start Function

The start component of a module declares the function index of a start function that is automatically invoked when the module is instantiated, after tables and memories have been initialized.

Note

The start function is intended for initializing the state of a module. The module and its exports are not accessible externally before this initialization has completed.

2.5.10. Exports

The exports component of a module defines a set of exports that become accessible to the host environment once the module has been instantiated.

Each export is labeled by a unique name. Exportable definitions are functions, tables, memories, and globals, which are referenced through a respective descriptor.

2.5.10.1. Conventions

The following auxiliary notation is defined for sequences of exports, filtering out indices of a specific kind in an order-preserving fashion:

2.5.11. Imports

The imports component of a module defines a set of imports that are required for instantiation.

Each import is labeled by a two-level name space, consisting of a module name and a name for an entity within that module. Importable definitions are functions, tables, memories, and globals. Each import is specified by a descriptor with a respective type that a definition provided during instantiation is required to match.

Every import defines an index in the respective index space. In each index space, the indices of imports go before the first index of any definition contained in the module itself.

Note

Unlike export names, import names are not necessarily unique. It is possible to import the same module/name pair multiple times; such imports may even have different type descriptions, including different kinds of entities. A module with such imports can still be instantiated depending on the specifics of how an embedder allows resolving and supplying imports. However, embedders are not required to support such overloading, and a WebAssembly module itself cannot implement an overloaded name.

3. Validation

3.1. Conventions

Validation checks that a WebAssembly module is well-formed. Only valid modules can be instantiated.

Validity is defined by a type system over the abstract syntax of a module and its contents. For each piece of abstract syntax, there is a typing rule that specifies the constraints that apply to it. All rules are given in two equivalent forms:

  1. In prose, describing the meaning in intuitive form.

  2. In formal notation, describing the rule in mathematical form. [1]

Note

The prose and formal rules are equivalent, so that understanding of the formal notation is not required to read this specification. The formalism offers a more concise description in notation that is used widely in programming languages semantics and is readily amenable to mathematical proof.

In both cases, the rules are formulated in a declarative manner. That is, they only formulate the constraints; they do not define an algorithm. The skeleton of a sound and complete algorithm for type-checking instruction sequences according to this specification is provided in the appendix.

3.1.1. Types

To define the semantics, the definition of some sorts of types is extended to include additional forms. By virtue of not being representable in either the binary format or the text format, these forms cannot be used in a program; they only occur during validation or execution.

The unique value type bot is a bottom type that matches all value types. Similarly, bot is also used as a bottom type of all heap types.

Note

No validation rule uses bottom types explicitly, but various rules can pick any value or heap type, including bottom. This ensures the existence of principal types, and thus a validation algorithm without backtracking.

A concrete heap type can consist of a defined type directly. This occurs as the result of substituting a type index with its definition.

A concrete heap type may also be a recursive type index. Such an index refers to the i-th component of a surrounding recursive type. It occurs as the result of rolling up the definition of a recursive type.

Finally, the representation of supertypes in a sub type is generalized from mere type indices to heap types. They occur as defined types or recursive type indices after substituting type indices or rolling up recursive types.

Note

It is an invariant of the semantics that sub types occur only in one of two forms: either as “syntactic” types as in a source module, where all supertypes are type indices, or as “semantic” types, where all supertypes are resolved to either defined types or recursive type indices.

A type of any form is closed when it does not contain a heap type that is a type index or a recursive type index without a surrounding recursive type, i.e., all type indices have been substituted with their defined type and all free recursive type indices have been unrolled.

Note

Recursive type indices are internal to a recursive type. They are distinguished from regular type indices and represented such that two closed types are syntactically equal if and only if they have the same recursive structure.

3.1.1.1. Convention

Note

This definition computes an approximation of the reference type that is inhabited by all values from one reference type except those from another. Since the type system does not have general union types, the definition only affects the presence of null and cannot express the absence of other values.

3.1.2. Defined Types

Defined types denote the individual types defined in a module. Each such type is represented as a projection from the recursive type group it originates from, indexed by its position in that group.

Defined types do not occur in the binary or text format, but are formed by rolling up the recursive types defined in a module.

It is hence an invariant of the semantics that all recursive types occurring in defined types are rolled up.

3.1.2.1. Conventions
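  • t[x* := dt*] denotes the parallel substitution of type indices x* with defined types dt* in type t, provided |x*| = |dt*|.

  • t[(rec i)* := dt*] denotes the parallel substitution of recursive type indices (rec i)* with defined types dt* in type t, provided |(rec i)*| = |dt*|.

  • t[:= dt*] is shorthand for the substitution t[x* := dt*], where x* = 0 ⋯ (|dt*| − 1).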

3.1.3. Rolling and Unrolling

In order to allow comparing recursive types for equivalence, their representation is changed such that all type indices internal to the same recursive type are replaced by recursive type indices.

Note

This representation is independent of the type index space, so that it is meaningful across module boundaries. Moreover, this representation ensures that types with equivalent recursive structure are also syntactically equal, hence allowing a simple equality check on (closed) types. It gives rise to an iso-recursive interpretation of types.

The representation change is performed by two auxiliary operations on the syntax of recursive types:

These operations are extended to defined types and defined as follows:

In addition, the following auxiliary function denotes the expansion of a defined type:

3.1.4. Instruction Types

Instruction types classify the behaviour of instructions or instruction sequences, by describing how they manipulate the operand stack and the initialization status of locals:

An instruction type [t1*] →_{x*} [t2*] describes the required input stack with argument values of types t1* that an instruction pops off and the provided output stack with result values of types t2* that it pushes back. Moreover, it enumerates the indices x* of locals that have been set by the instruction or sequence.

Note

Instruction types are only used for validation; they do not occur in programs.

3.1.5. Local Types

Local types classify locals, by describing their value type as well as their initialization status:

Note

Local types are only used for validation; they do not occur in programs.

3.1.6. Contexts

Validity of an individual definition is specified relative to a context, which collects relevant information about the surrounding module and the definitions in scope:

  • Types: the list of types defined in the current module.

  • Functions: the list of functions declared in the current module, represented by a defined type that expands to their function type.

  • Tables: the list of tables declared in the current module, represented by their table type.

  • Memories: the list of memories declared in the current module, represented by their memory type.

  • Globals: the list of globals declared in the current module, represented by their global type.

  • Element Segments: the list of element segments declared in the current module, represented by the elements’ reference type.

  • Data Segments: the list of data segments declared in the current module, each represented by an entry.

  • Locals: the list of locals declared in the current function (including parameters), represented by their local type.

  • Labels: the stack of labels accessible from the current position, represented by their result type.

  • Return: the return type of the current function, represented as an optional result type that is absent when no return is allowed, as in free-standing expressions.

  • References: the list of function indices that occur in the module outside functions and can hence be used to form references inside them.

In other words, a context contains a sequence of suitable types for each index space, describing each defined entry in that space. Locals, labels and return type are only used for validating instructions in function bodies, and are left empty elsewhere. The label stack is the only part of the context that changes as validation of an instruction sequence proceeds.

More concretely, contexts are defined as records with abstract syntax:

In addition to field access written C.field the following notation is adopted for manipulating contexts:

  • When spelling out a context, empty fields are omitted.

  • C, field A* denotes the same context as C but with the elements A* prepended to its field component sequence.

Note

Indexing notation like C.labels[i] is used to look up indices in their respective index space in the context. Context extension notation C, field A is primarily used to locally extend relative index spaces, such as label indices. Accordingly, the notation is defined to append at the front of the respective sequence, introducing a new relative index 0 and shifting the existing ones.
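
The effect of extending a context with a label can be sketched in Python. The sketch models only the labels field of a context; the class name Context and the method with_labels are hypothetical, and result types are represented as tuples of type names.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Context:
    labels: tuple = ()  # result types of accessible labels, innermost first

    def with_labels(self, *result_types) -> "Context":
        """Return a new context with the given labels prepended, so that the
        newly added label gets index 0 and existing labels shift outwards."""
        return replace(self, labels=tuple(result_types) + self.labels)
```

Extending an outer context whose only label has result type (i32) with a label of empty result type yields a context in which index 0 denotes the new label and index 1 the previous one. The outer context itself is left unchanged.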

3.1.6.1. Convention

Any form of type can be closed to bring it into closed form relative to a context C it is valid in by substituting each type index x occurring in it with the corresponding defined type C.types[x], after first closing the types in C.types themselves.

3.1.7. Prose Notation

Validation is specified by stylised rules for each relevant part of the abstract syntax. The rules not only state constraints defining when a phrase is valid, they also classify it with a type. The following conventions are adopted in stating these rules.

  • A phrase A is said to be “valid with type T” if and only if all constraints expressed by the respective rules are met. The form of T depends on what A is.

    Note

    For example, if A is a function, then T is a function type; for an A that is a global, T is a global type; and so on.

  • The rules implicitly assume a given context C.

  • In some places, this context is locally extended to a context C′ with additional entries. The formulation “Under context C′, … statement …” is adopted to express that the following statement must apply under the assumptions embodied in the extended context.

3.1.8. Formal Notation

Note

This section gives a brief explanation of the notation for specifying typing rules formally. For the interested reader, a more thorough introduction can be found in respective text books. [2]

The proposition that a phrase A has a respective type T is written A : T. In general, however, typing is dependent on a context C. To express this explicitly, the complete form is a judgement C ⊢ A : T, which says that A : T holds under the assumptions encoded in C.

The formal typing rules use a standard approach for specifying type systems, rendering them into deduction rules. Every rule has the following general form:

    premise_1    premise_2    …    premise_n
    ─────────────────────────────────────────
                    conclusion

Such a rule is read as a big implication: if all premises hold, then the conclusion holds. Some rules have no premises; they are axioms whose conclusion holds unconditionally. The conclusion always is a judgment C ⊢ A : T, and there is one respective rule for each relevant construct of the abstract syntax.

Note

For example, the typing rule for the i32.add instruction can be given as an axiom:

The instruction is always valid with type [i32 i32] → [i32] (saying that it consumes two i32 values and produces one), independent of any side conditions.

An instruction like global.get x can be typed as follows:

Here, the premise enforces that the immediate global index x exists in the context. The instruction produces a value of its respective type t (and does not consume any values). If C.globals[x] does not exist then the premise does not hold, and the instruction is ill-typed.

Finally, a structured instruction requires a recursive rule, where the premise is itself a typing judgement:

A block instruction is only valid when the instruction sequence in its body is. Moreover, the result type must match the block’s annotation blocktype. If so, then the block instruction has the same type as the body. Inside the body an additional label of the corresponding result type is available, which is expressed by extending the context C with the additional label information for the premise.
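
The three example rules discussed in this note can be written out in LaTeX as follows. This is a reconstruction from the accompanying prose, given under the assumption that the names i32.add, global.get and block are the instructions meant; the symbols mut, t, t_1, t_2 and blocktype are metavariables chosen for this rendering.

```latex
% Axiom: no premises, the conclusion holds unconditionally.
\frac{}{C \vdash \mathsf{i32.add} : [\mathsf{i32}~\mathsf{i32}] \to [\mathsf{i32}]}

% One premise: the global index x must be defined in the context.
\frac{C.\mathsf{globals}[x] = \mathit{mut}~t}
     {C \vdash \mathsf{global.get}~x : [] \to [t]}

% Recursive rule: the body is typed under a context extended with a label.
\frac{C \vdash \mathit{blocktype} : [t_1^\ast] \to [t_2^\ast]
      \qquad
      C, \mathsf{labels}\,[t_2^\ast] \vdash \mathit{instr}^\ast : [t_1^\ast] \to [t_2^\ast]}
     {C \vdash \mathsf{block}~\mathit{blocktype}~\mathit{instr}^\ast~\mathsf{end} : [t_1^\ast] \to [t_2^\ast]}
```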

3.2. Types

Simple types, such as number types, are universally valid. However, restrictions apply to most other types, such as reference types, function types, as well as the limits of table types and memory types, which must be checked during validation.

Moreover, block types are converted to plain function types for ease of processing.

3.2.1. Number Types

Number types are always valid.

3.2.2. Vector Types

Vector types are always valid.

3.2.3. Heap Types

Concrete heap types are only valid when the type index is, while abstract ones are vacuously valid.

3.2.3.1.
  • The heap type is valid.

3.2.3.2.
  • The type must be defined in the context.

  • Then the heap type is valid.

3.2.4. Reference Types

Reference types are valid when the referenced heap type is.

3.2.4.1.
  • The heap type must be valid.

  • Then the reference type is valid.

3.2.5. Value Types

Valid value types are either valid number types, valid vector types, or valid reference types.

3.2.6. Block Types

Block types may be expressed in one of two forms, both of which are converted to instruction types by the following rules.

3.2.6.1.
3.2.6.2.

3.2.7. Result Types

3.2.7.1.
  • Each value type in the type sequence must be valid.

  • Then the result type is valid.

3.2.8. Instruction Types

3.2.8.1.

3.2.9. Function Types

3.2.9.1.

3.2.10. Composite Types

3.2.10.1.
3.2.10.2.
3.2.10.3.

3.2.11. Field Types

3.2.11.1.
3.2.11.2.
  • The packed type is valid.

3.2.12. Recursive Types

Recursive types are validated for a specific type index that denotes the index of the type defined by the recursive group.

3.2.12.1.
3.2.12.2.

Note

The side condition on the index ensures that a declared supertype is a previously defined type, preventing cyclic subtype hierarchies.

Future versions of WebAssembly may allow more than one supertype.

3.2.13. Defined Types

3.2.13.1.

3.2.14. Limits

Limits must have meaningful bounds that are within a given range.

3.2.14.1. {min n, max m?}
  • The value of n must not be larger than k.

  • If the maximum m? is not empty, then:

    • Its value must not be larger than k.

    • Its value must not be smaller than n.

  • Then the limit is valid within range k.
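
The conditions above translate directly into a small check, sketched here in Python. The function name limits_valid is hypothetical; min_size, max_size and k stand for the minimum, the optional maximum, and the range, with None representing an absent maximum.

```python
def limits_valid(min_size: int, max_size, k: int) -> bool:
    """Check that limits are valid within range k."""
    if min_size > k:
        return False
    if max_size is not None:
        if max_size > k:
            return False
        if max_size < min_size:
            return False
    return True
```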

3.2.15. Table Types

3.2.15.1.
  • The limits must be valid within range 2^32 - 1.

  • The reference type must be valid.

  • Then the table type is valid.

3.2.16. Memory Types

3.2.16.1.
  • The limits must be valid within range 2^16.

  • Then the memory type is valid.

3.2.17. Global Types

3.2.17.1.
  • The value type must be valid.

  • Then the global type is valid.

3.2.18. External Types

3.2.18.1.
3.2.18.2.
3.2.18.3.
3.2.18.4.

3.2.19. Defaultable Types

A type is defaultable if it has a default value for initialization.

3.2.19.1. Value Types

3.3. Matching

On most types, a notion of subtyping is defined that is applicable in validation rules, during module instantiation when checking the types of imports, or during execution, when performing casts.

3.3.1. Number Types

A number type matches a number type if and only if:

3.3.2. Vector Types

A vector type matches a vector type if and only if:

3.3.3. Heap Types

A heap type matches a heap type if and only if:

3.3.4. Reference Types

A reference type matches a reference type if and only if:

3.3.5. Value Types

A value type matches a value type if and only if:

3.3.6. Result Types

Subtyping is lifted to result types in a pointwise manner. That is, a result type matches a result type if and only if:

3.3.7. Instruction Types

Subtyping is further lifted to instruction types. An instruction type matches a type if and only if:

Note

Instruction types are contravariant in their input and covariant in their output. Subtyping also incorporates a sort of “frame” condition, which allows adding arbitrary invariant stack elements on both sides in the super type.

Finally, the supertype may ignore variables from the init set of the subtype. It may also add variables to the init set, provided these are already set in the context, i.e., are vacuously initialized.

3.3.8. Function Types

A function type matches a type if and only if:

3.3.9. Composite Types

A composite type matches a type if and only if:

3.3.10. Field Types

A field type matches a type if and only if:

A storage type matches a type if and only if:

A packed type matches a type if and only if:

3.3.11. Defined Types

A defined type matches a type if and only if:

Note

Note that there is no explicit definition of type _equivalence_, since it coincides with syntactic equality, as used in the premise of the former rule above.

3.3.12. Limits

Limits match limits if and only if:

  • is larger than or equal to .

  • Either:

    • is empty.

  • Or:

    • Both and are non-empty.

    • is smaller than or equal to .
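The conditions above can be sketched as a small predicate. This is only an illustration under stated assumptions: the `Limits` record and the function name `limits_match` are hypothetical, an absent maximum is modelled as `None`, and the rule is read as comparing the first limits (`sub`) against the second (`sup`).

```python
from typing import NamedTuple, Optional


class Limits(NamedTuple):
    """Hypothetical model of limits: a minimum and an optional maximum."""
    min: int
    max: Optional[int]  # None models an empty (absent) maximum


def limits_match(sub: Limits, sup: Limits) -> bool:
    """Return True if `sub` matches `sup` under the assumed reading of the rule."""
    # The minimum of the matching limits must be at least the other minimum.
    if sub.min < sup.min:
        return False
    # Either the other maximum is empty ...
    if sup.max is None:
        return True
    # ... or both maxima are non-empty and the matching one is not larger.
    return sub.max is not None and sub.max <= sup.max
```

For instance, under this reading a table declared with limits (2, 5) could be supplied where limits (1, 10) are expected, but not the other way round.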

3.3.13. Table Types

A table type matches if and only if:

3.3.14. Memory Types

A memory type matches if and only if:

3.3.15. Global Types

A global type matches if and only if:

3.3.16. External Types

3.3.16.1. Functions

An external type matches if and only if:

3.3.16.2. Tables

An external type matches if and only if:

3.3.16.3. Memories

An external type matches if and only if:

3.3.16.4. Globals

An external type matches if and only if:

3.4. Instructions

Instructions are classified by instruction types that describe how they manipulate the operand stack and initialize locals: A type describes the required input stack with argument values of types that an instruction pops off and the provided output stack with result values of types that it pushes back. Moreover, it enumerates the indices of locals that have been set by the instruction. In most cases, this is empty.

Note

For example, the instruction has type , consuming two values and producing one. The instruction has type , provided is the type declared for the local .

Typing extends to instruction sequences . Such a sequence has an instruction type if the accumulative effect of executing the instructions is consuming values of types off the operand stack, pushing new values of types , and setting all locals .

For some instructions, the typing rules do not fully constrain the type, and therefore allow for multiple types. Such instructions are called polymorphic. Two degrees of polymorphism can be distinguished:

In both cases, the unconstrained types or type sequences can be chosen arbitrarily, as long as they meet the constraints imposed for the surrounding parts of the program.

Note

For example, the instruction is valid with type , for any possible number type . Consequently, both instruction sequences

and

are valid, with in the typing of being instantiated to or , respectively.

The instruction is stack-polymorphic, and hence valid with type for any possible sequences of value types and . Consequently,

is valid by assuming type for the instruction. In contrast,

is invalid, because there is no possible type to pick for the instruction that would make the sequence well-typed.

The Appendix describes a type checking algorithm that efficiently implements validation of instruction sequences as prescribed by the rules given here.
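As a rough illustration of both degrees of polymorphism, the following sketch checks a flat instruction sequence against an expected result type. It is a simplified stand-in, not the algorithm from the Appendix: the tuple encoding of instructions, the names `check` and `is_valid`, and the restriction to four instruction forms without blocks or locals are all assumptions made for this example.

```python
class ValidationError(Exception):
    """Raised when an instruction sequence is not well-typed."""


UNKNOWN = None  # placeholder for an operand type left open by polymorphism


def check(instrs, results):
    """Check that `instrs` consumes nothing and produces the types in `results`."""
    stack = []           # operand types; the stack grows to the right
    unreachable = False  # set once a stack-polymorphic instruction was seen

    def pop(expect=UNKNOWN):
        if not stack:
            if unreachable:
                # The stack-polymorphic instruction supplies whatever is needed.
                return expect
            raise ValidationError("operand stack underflow")
        actual = stack.pop()
        if expect is not UNKNOWN and actual != expect:
            raise ValidationError(f"expected {expect}, found {actual}")
        return actual

    for instr in instrs:
        op = instr[0]
        if op == "const":            # ("const", t) has type [] -> [t]
            stack.append(instr[1])
        elif op == "add":            # ("add", t) has type [t t] -> [t]
            pop(instr[1])
            pop(instr[1])
            stack.append(instr[1])
        elif op == "drop":           # value-polymorphic: [t] -> [] for any t
            pop()
        elif op == "unreachable":    # stack-polymorphic: [t1*] -> [t2*]
            stack.clear()
            unreachable = True
        else:
            raise ValueError(f"unsupported instruction {op!r}")

    for t in reversed(results):
        pop(t)
    if stack:
        raise ValidationError("values left on the operand stack")
    return True


def is_valid(instrs, results):
    """Boolean wrapper around `check`."""
    try:
        return check(instrs, results)
    except ValidationError:
        return False
```

In this sketch, `is_valid([("unreachable",), ("add", "i32")], ["i32"])` holds because the operands of the addition are supplied by the stack-polymorphic instruction, whereas inserting `("const", "i64")` before the addition makes the sequence invalid.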

3.4.1. Numeric Instructions

3.4.1.1.
  • The instruction is valid with type .

3.4.1.2.
  • The instruction is valid with type .

3.4.1.3.
  • The instruction is valid with type .

3.4.1.4.
  • The instruction is valid with type .

3.4.1.5.
  • The instruction is valid with type .

3.4.1.6.
  • The instruction is valid with type .

3.4.2. Reference Instructions

3.4.2.1.
3.4.2.2.
3.4.2.3.
3.4.2.4.
3.4.2.5.
3.4.2.6.

Note

The liberty to pick a supertype allows typing the instruction with the least precise supertype of as input, that is, the top type in the corresponding heap subtyping hierarchy.

3.4.2.7.

Note

The liberty to pick a supertype allows typing the instruction with the least precise supertype of as input, that is, the top type in the corresponding heap subtyping hierarchy.

3.4.3. Aggregate Reference Instructions

3.4.3.1.
3.4.3.2.
3.4.3.3.
3.4.3.4.
3.4.3.5.
3.4.3.6.
3.4.3.7.
3.4.3.8.
3.4.3.9.
3.4.3.10.
3.4.3.11.
3.4.3.12.
3.4.3.13.
3.4.3.14.
3.4.3.15.
3.4.3.16.

3.4.4. Scalar Reference Instructions

3.4.4.1.
3.4.4.2.

3.4.5. External Reference Instructions

3.4.5.1.
3.4.5.2.

3.4.6. Vector Instructions

Vector instructions can have a prefix to describe the shape of the operand. Packed numeric types, and , are not value types. An auxiliary function maps such packed type shapes to value types:

The following auxiliary function denotes the number of lanes in a vector shape, i.e., its dimension:

3.4.6.1.
  • The instruction is valid with type .

3.4.6.2.
3.4.6.3.
3.4.6.4.
3.4.6.5.
  • The instruction is valid with type .

3.4.6.6.
3.4.6.7.
3.4.6.8.
3.4.6.9.
3.4.6.10.
3.4.6.11.
3.4.6.12.
3.4.6.13.
3.4.6.14.
3.4.6.15.
  • The instruction is valid with type .

3.4.6.16.
3.4.6.17.
3.4.6.18.
  • The instruction is valid with type .

3.4.6.19.
3.4.6.20.
3.4.6.21.

3.4.7. Parametric Instructions

3.4.7.1.

Note

Both and without annotation are value-polymorphic instructions.

3.4.7.2.

Note

In future versions of WebAssembly, may allow more than one value per choice.

3.4.8. Variable Instructions

3.4.8.1.
3.4.8.2.
  • The local must be defined in the context.

  • Let be the local type .

  • Then the instruction is valid with type .

3.4.8.3.
  • The local must be defined in the context.

  • Let be the local type .

  • Then the instruction is valid with type .

3.4.8.4.
3.4.8.5.

3.4.9. Table Instructions

3.4.9.1.
3.4.9.2.
3.4.9.3.
  • The table must be defined in the context.

  • Then the instruction is valid with type .

3.4.9.4.
3.4.9.5.
3.4.9.6.
3.4.9.7.
3.4.9.8.
  • The element segment must be defined in the context.

  • Then the instruction is valid with type .

3.4.10. Memory Instructions

3.4.10.1.
  • The memory must be defined in the context.

  • The alignment must not be larger than the bit width of divided by .

  • Then the instruction is valid with type .
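The alignment condition can be sketched as follows. The sketch assumes that the alignment is stored as the exponent of a power of two and that the bit width is divided by 8 to obtain a byte count; the function name `alignment_ok` is hypothetical.

```python
def alignment_ok(align_exponent: int, bit_width: int) -> bool:
    """Check that the alignment 2**align_exponent does not exceed the access width in bytes.

    Assumption: the alignment immediate is an exponent, and the bound is
    the bit width of the accessed type divided by 8.
    """
    return (1 << align_exponent) <= bit_width // 8
```

Under these assumptions, a 32-bit load may declare an alignment exponent of at most 2 (4 bytes), and a 64-bit load an exponent of at most 3 (8 bytes).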

3.4.10.2.
  • The memory must be defined in the context.

  • The alignment must not be larger than .

  • Then the instruction is valid with type .

3.4.10.3.
  • The memory must be defined in the context.

  • The alignment must not be larger than the bit width of divided by .

  • Then the instruction is valid with type .

3.4.10.4.
  • The memory must be defined in the context.

  • The alignment must not be larger than .

  • Then the instruction is valid with type .

3.4.10.5.
  • The memory must be defined in the context.

  • The alignment must not be larger than .

  • Then the instruction is valid with type .

3.4.10.6.
  • The memory must be defined in the context.

  • The alignment must not be larger than .

  • Then the instruction is valid with type .

3.4.10.7.
  • The memory must be defined in the context.

  • The alignment must not be larger than .

  • Then the instruction is valid with type .

3.4.10.8.
  • The lane index must be smaller than .

  • The memory must be defined in the context.

  • The alignment must not be larger than .

  • Then the instruction is valid with type .

3.4.10.9.
  • The lane index must be smaller than .

  • The memory must be defined in the context.

  • The alignment must not be larger than .

  • Then the instruction is valid with type .

3.4.10.10.
  • The memory must be defined in the context.

  • Then the instruction is valid with type .

3.4.10.11.
  • The memory must be defined in the context.

  • Then the instruction is valid with type .

3.4.10.12.
  • The memory must be defined in the context.

  • Then the instruction is valid with type .

3.4.10.13.
  • The memory must be defined in the context.

  • Then the instruction is valid with type .

3.4.10.14.
  • The memory must be defined in the context.

  • The data segment must be defined in the context.

  • Then the instruction is valid with type .

3.4.10.15.
  • The data segment must be defined in the context.

  • Then the instruction is valid with type .

3.4.11. Control Instructions

3.4.11.1.
  • The instruction is valid with type .

3.4.11.2.
  • The instruction is valid with any valid type of the form .

Note

The instruction is stack-polymorphic.

3.4.11.3.

Note

The notation inserts the new label type at index , shifting all others.

3.4.11.4.

Note

The notation inserts the new label type at index , shifting all others.

3.4.11.5.
  • The block type must be valid as some instruction type .

  • Let be the same context as , but with the result type prepended to the vector.

  • Under context , the instruction sequence must be valid with type .

  • Under context , the instruction sequence must be valid with type .

  • Then the compound instruction is valid with type .

Note

The notation inserts the new label type at index , shifting all others.

3.4.11.6.
  • The label must be defined in the context.

  • Let be the result type .

  • Then the instruction is valid with any valid type of the form .

Note

The label index space in the context contains the most recent label first, so that performs a relative lookup as expected.

The instruction is stack-polymorphic.

3.4.11.7.
  • The label must be defined in the context.

  • Let be the result type .

  • Then the instruction is valid with type .

Note

The label index space in the context contains the most recent label first, so that performs a relative lookup as expected.

3.4.11.8.
  • The label must be defined in the context.

  • For each label in , the label must be defined in the context.

  • There must be a sequence of value types, such that:

  • Then the instruction is valid with any valid type of the form .

Note

The label index space in the context contains the most recent label first, so that performs a relative lookup as expected.

The instruction is stack-polymorphic.

Furthermore, the result type is also chosen non-deterministically in this rule. Although it may seem necessary to compute as the greatest lower bound of all label types in practice, a simple linear algorithm does not require this.

3.4.11.9.
3.4.11.10.
  • The label must be defined in the context.

  • Let be the result type .

  • The result type must contain at least one type.

  • Let the value type be the last element in the sequence , and the remainder of the sequence preceding it.

  • The value type must be a reference type of the form .

  • Then the instruction is valid with type .

3.4.11.11.
3.4.11.12.
3.4.11.13.
  • The return type must not be absent in the context.

  • Let be the result type of .

  • Then the instruction is valid with any valid type of the form .

Note

The instruction is stack-polymorphic.

is absent (set to ) when validating an expression that is not a function body. This differs from it being set to the empty result type (), which is the case for functions not returning anything.

3.4.11.14.
3.4.11.15.
3.4.11.16.
3.4.11.17.

Note

The instruction is stack-polymorphic.

3.4.11.18.

Note

The instruction is stack-polymorphic.

3.4.11.19.

Note

The instruction is stack-polymorphic.

3.4.12. Instruction Sequences

Typing of instruction sequences is defined recursively.

3.4.12.1. Empty Instruction Sequence:
  • The empty instruction sequence is valid with type .

3.4.12.2. Non-empty Instruction Sequence:
  • The instruction must be valid with some type .

  • Let be the same context as , but with:

  • Under the context , the instruction sequence must be valid with some type .

  • Then the combined instruction sequence is valid with type .

3.4.12.3. Subsumption for

Note

In combination with the previous rule, subsumption allows composing instructions whose types would not directly fit otherwise. For example, consider the instruction sequence

To type this sequence, its subsequence needs to be valid with an intermediate type. But the direct type of is , not matching the two inputs expected by . The subsumption rule allows weakening the type of to the supertype , such that it can be composed with and yields the intermediate type for the subsequence. That can in turn be composed with the first constant.

Furthermore, subsumption allows dropping init variables from the instruction type in a context where they are not needed, for example, at the end of the body of a block.

3.4.13. Expressions

Expressions are classified by result types of the form .

3.4.13.1.
3.4.13.2. Constant Expressions

Note

Currently, constant expressions occurring in globals are further constrained in that contained instructions are only allowed to refer to imported or previously defined globals. Constant expressions occurring in tables may only have instructions that refer to imported globals. This is enforced in the validation rule for modules by constraining the context accordingly.

The definition of constant expression may be extended in future versions of WebAssembly.

3.5. Modules

Modules are valid when all the components they contain are valid. Furthermore, most definitions are themselves classified with a suitable type.

3.5.1. Types

The sequence of types defined in a module is validated incrementally, yielding a suitable context.

3.5.1.1.

Note

Despite the appearance, the context is effectively an _output_ of this judgement.

3.5.2. Functions

Functions are classified by defined types that expand to function types of the form .

3.5.2.1.

3.5.3. Locals

Locals are classified with local types.

3.5.3.1.

Note

For cases where both rules are applicable, the former yields the more permissible type.

3.5.4. Tables

Tables are classified by table types.

3.5.4.1.

3.5.5. Memories

Memories are classified by memory types.

3.5.5.1.

3.5.6. Globals

Globals are classified by global types of the form .

Sequences of globals are handled incrementally, such that each definition has access to previous definitions.

3.5.6.1.
3.5.6.2.
  • If the sequence is empty, then it is valid with the empty sequence of global types.

  • Else:

    • The first global definition must be valid with some global type .

    • Let be the same context as , but with the global type appended to the vector.

    • Under context , the remainder of the sequence must be valid with some sequence of global types.

    • Then the sequence is valid with the sequence of global types consisting of prepended to .
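The incremental scheme above can be sketched as a recursive function that extends the context after each definition. Everything concrete here is an assumption for illustration: a global type is reduced to a bare value type string (mutability is ignored), a definition is a pair of declared type and initializer, and the initializer is either `("const", t)` or `("global.get", index)`.

```python
class ValidationError(Exception):
    """Raised when a global definition is not valid in the given context."""


def validate_global(context_globals, definition):
    """Validate one hypothetical global definition and return its global type."""
    declared, (kind, arg) = definition
    if kind == "const":
        actual = arg
    elif kind == "global.get":
        # Only globals already present in the context may be referenced.
        if not 0 <= arg < len(context_globals):
            raise ValidationError(f"global {arg} is not defined in the context")
        actual = context_globals[arg]
    else:
        raise ValidationError(f"unsupported initializer {kind!r}")
    if actual != declared:
        raise ValidationError(f"initializer has type {actual}, expected {declared}")
    return declared


def validate_globals(context_globals, definitions):
    """Validate a sequence of globals, giving each access to the previous ones."""
    if not definitions:
        return []  # the empty sequence is valid with the empty sequence of types
    first, rest = definitions[0], definitions[1:]
    first_type = validate_global(context_globals, first)
    extended = context_globals + [first_type]  # append to the context's globals
    return [first_type] + validate_globals(extended, rest)
```

With one imported global of type `i32` in the context, a definition initialized by `("global.get", 0)` is accepted, and a following definition may refer to index 1; a reference to a later index is rejected because that global is not yet in the context.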

3.5.7. Element Segments

Element segments are classified by the reference type of their elements.

3.5.7.1.
3.5.7.2.
3.5.7.3.
3.5.7.4.

3.5.8. Data Segments

Data segments are not classified by any type but merely checked for well-formedness.

3.5.8.1.
  • The data mode must be valid.

  • Then the data segment is valid.

3.5.8.2.
  • The data mode is valid.

3.5.8.3.

3.5.9. Start Function

Start function declarations are not classified by any type.

3.5.9.1.

3.5.10. Exports

Exports and export descriptions are classified by their external type.

3.5.10.1.
3.5.10.2.
3.5.10.3.
3.5.10.4.
  • The memory must be defined in the context.

  • Then the export description is valid with external type .

3.5.10.5.

3.5.11. Imports

Imports and import descriptions are classified by external types.

3.5.11.1.
3.5.11.2.
3.5.11.3.
3.5.11.4.
3.5.11.5.

3.5.12. Modules

Modules are classified by their mapping from the external types of their imports to those of their exports.

A module is entirely closed, that is, its components can only refer to definitions that appear in the module itself. Consequently, no initial context is required. Instead, the context for validation of the module’s content is constructed from the definitions in the module.

The external types classifying a module may contain free type indices that refer to types defined within the module.

Note

All functions in a module are mutually recursive. Consequently, the definition of the context in this rule is recursive: it depends on the outcome of validation of the function, table, memory, and global definitions contained in the module, which itself depends on . However, this recursion is just a specification device. All types needed to construct can easily be determined from a simple pre-pass over the module that does not perform any actual validation.

Globals, however, are not recursive but evaluated sequentially, such that each constant expression only has access to imported or previously defined globals.

Note

The restriction on the number of memories may be lifted in future versions of WebAssembly.

4. Execution

4.1. Conventions

WebAssembly code is executed when instantiating a module or invoking an exported function on the resulting module instance.

Execution behavior is defined in terms of an abstract machine that models the program state. It includes a stack, which records operand values and control constructs, and an abstract store containing global state.

For each instruction, there is a rule that specifies the effect of its execution on the program state. Furthermore, there are rules describing the instantiation of a module. As with validation, all rules are given in two equivalent forms:

  1. In prose, describing the execution in intuitive form.

  2. In formal notation, describing the rule in mathematical form. [1]

Note

As with validation, the prose and formal rules are equivalent, so that understanding of the formal notation is not required to read this specification. The formalism offers a more concise description in notation that is used widely in programming languages semantics and is readily amenable to mathematical proof.

4.1.1. Prose Notation

Execution is specified by stylised, step-wise rules for each instruction of the abstract syntax. The following conventions are adopted in stating these rules.

  • The execution rules implicitly assume a given store .

  • The execution rules also assume the presence of an implicit stack that is modified by pushing or popping values, labels, and frames.

  • Certain rules require the stack to contain at least one frame. The most recent frame is referred to as the current frame.

  • Both the store and the current frame are mutated by replacing some of their components. Such replacement is assumed to apply globally.

  • The execution of an instruction may trap, in which case the entire computation is aborted and no further modifications to the store are performed by it. (Other computations can still be initiated afterwards.)

  • The execution of an instruction may also end in a jump to a designated target, which defines the next instruction to execute.

  • Execution can enter and exit instruction sequences that form blocks.

  • Instruction sequences are implicitly executed in order, unless a trap or jump occurs.

  • In various places the rules contain assertions expressing crucial invariants about the program state.

4.1.2. Formal Notation

Note

This section gives a brief explanation of the notation for specifying execution formally. For the interested reader, a more thorough introduction can be found in the respective textbooks. [2]

The formal execution rules use a standard approach for specifying operational semantics, rendering them into reduction rules. Every rule has the following general form:

A configuration is a syntactic description of a program state. Each rule specifies one step of execution. As long as there is at most one reduction rule applicable to a given configuration, reduction – and thereby execution – is deterministic. WebAssembly has only very few exceptions to this, which are noted explicitly in this specification.

For WebAssembly, a configuration typically is a tuple consisting of the current store , the call frame of the current function, and the sequence of instructions that is to be executed. (A more precise definition is given later.)

To avoid unnecessary clutter, the store and the frame are omitted from reduction rules that do not touch them.

There is no separate representation of the stack. Instead, it is conveniently represented as part of the configuration’s instruction sequence. In particular, values are defined to coincide with instructions, and a sequence of instructions can be interpreted as an operand “stack” that grows to the right.

Note

For example, the reduction rule for the instruction can be given as follows:

Per this rule, two instructions and the instruction itself are removed from the instruction stream and replaced with one new instruction. This can be interpreted as popping two values off the stack and pushing the result.

When no result is produced, an instruction reduces to the empty sequence:

Labels and frames are similarly defined to be part of an instruction sequence.

The order of reduction is determined by the definition of an appropriate evaluation context.

Reduction terminates when no more reduction rules are applicable. Soundness of the WebAssembly type system guarantees that this is only the case when the original instruction sequence has either been reduced to a sequence of instructions, which can be interpreted as the values of the resulting operand stack, or if a trap occurred.

Note

For example, the following instruction sequence,

terminates after three steps:

where and and .
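The idea that values coincide with instructions, so that reduction rewrites the instruction sequence itself, can be illustrated with a small stepper. This is a sketch under assumptions: the tuple encoding, the function names `step` and `run`, the choice of three floating-point operators, and the concrete operand values are all picked for the example.

```python
def step(instrs):
    """Perform one reduction step, or return None if no rule applies."""
    for i, ins in enumerate(instrs):
        if ins[0] == "f64.const":
            continue  # constants are values; keep scanning to the right
        # `ins` is the leftmost non-value, so everything before it is a constant.
        if ins[0] == "f64.neg" and i >= 1:
            a = instrs[i - 1][1]
            return instrs[:i - 1] + [("f64.const", -a)] + instrs[i + 1:]
        if ins[0] in ("f64.add", "f64.mul") and i >= 2:
            a, b = instrs[i - 2][1], instrs[i - 1][1]
            result = a + b if ins[0] == "f64.add" else a * b
            return instrs[:i - 2] + [("f64.const", result)] + instrs[i + 1:]
        return None  # stuck: unknown instruction or too few operands
    return None  # only values remain, so reduction has terminated


def run(instrs):
    """Reduce until no rule applies; return the final sequence and the step count."""
    steps = 0
    while True:
        reduced = step(instrs)
        if reduced is None:
            return instrs, steps
        instrs, steps = reduced, steps + 1
```

Running it on two constants, a negation, a third constant, an addition, and a multiplication takes three steps, one per operator, and leaves a single constant that can be read as the resulting operand stack.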

4.2. Runtime Structure

Store, stack, and other runtime structure forming the WebAssembly abstract machine, such as values or module instances, are made precise in terms of additional auxiliary syntax.

4.2.1. Values

WebAssembly computations manipulate values of either the four basic number types, i.e., integers and floating-point data of 32 or 64 bit width each, or vectors of 128 bit width, or of reference type.

In most places of the semantics, values of different types can occur. In order to avoid ambiguities, values are therefore represented with an abstract syntax that makes their type explicit. It is convenient to reuse the same notation as for the instructions and producing them.

References other than null are represented with additional administrative instructions. They either are scalar references, containing a 31-bit integer, structure references, pointing to a specific structure address, array references, pointing to a specific array address, function references, pointing to a specific function address, or host references pointing to an uninterpreted form of host address defined by the embedder. Any of the aforementioned references can furthermore be wrapped up as an external reference.

Note

Future versions of WebAssembly may add additional forms of reference.

Value types can have an associated default value; it is the respective value 0 for number types, 0 for vector types, and null for nullable reference types. For other references, no default value is defined; hence, the default is an optional value.
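A minimal sketch of this lookup is shown below. The encoding is an assumption made for the example: number and vector types are strings, reference types are triples `("ref", nullable, heaptype)`, and an absent default is modelled as `None`.

```python
def default_value(valtype):
    """Return the default value for a value type, or None if it has none."""
    if valtype in ("i32", "i64"):
        return (valtype + ".const", 0)
    if valtype in ("f32", "f64"):
        return (valtype + ".const", 0.0)
    if valtype == "v128":
        return ("v128.const", 0)
    if isinstance(valtype, tuple) and valtype[0] == "ref":
        _, nullable, heaptype = valtype
        # Only nullable references have a default, namely null.
        return ("ref.null", heaptype) if nullable else None
    raise ValueError(f"unknown value type {valtype!r}")
```

A type is then defaultable in this model exactly when `default_value` does not return `None`.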

4.2.1.1. Convention
  • The meta variable ranges over reference values where clear from context.

4.2.2. Results

A result is the outcome of a computation. It is either a sequence of values or a trap.

4.2.3. Store

The store represents all global state that can be manipulated by WebAssembly programs. It consists of the runtime representation of all instances of functions, tables, memories, globals, element segments, data segments, and structures or arrays that have been allocated during the lifetime of the abstract machine. [1]

It is an invariant of the semantics that no element or data instance is addressed from anywhere else but the owning module instances.

Syntactically, the store is defined as a record listing the existing instances of each category:

4.2.3.1. Convention
  • The meta variable ranges over stores where clear from context.

4.2.4. Addresses

Function instances, table instances, memory instances, global instances, element instances, data instances, and structure or array instances in the store are referenced with abstract addresses. These are simply indices into the respective store component. In addition, an embedder may supply an uninterpreted set of host addresses.

An embedder may assign identity to exported store objects corresponding to their addresses, even where this identity is not observable from within WebAssembly code itself (such as for function instances or immutable globals).

Note

Addresses are dynamic, globally unique references to runtime objects, in contrast to indices, which are static, module-local references to their original definitions. A memory address denotes the abstract address of a memory instance in the store, not an offset inside a memory instance.

There is no specific limit on the number of allocations of store objects, hence logical addresses can be arbitrarily large natural numbers.
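The difference between addresses and indices can be sketched with a toy store. The class `Store`, its `alloc` method, and the dictionary used for module instances are hypothetical simplifications; only four store components are modelled.

```python
class Store:
    """Toy store: one growing list of instances per category."""

    def __init__(self):
        self.funcs = []
        self.tables = []
        self.mems = []
        self.globals = []

    def alloc(self, component, instance):
        """Append an instance to a store component and return its address."""
        instances = getattr(self, component)
        instances.append(instance)
        return len(instances) - 1  # the address is an index into the component


store = Store()
# Two modules each define a memory at static index 0 ...
addr_a = store.alloc("mems", "memory of module A")
addr_b = store.alloc("mems", "memory of module B")
# ... but their module instances map that index to different addresses.
module_a = {"memaddrs": [addr_a]}
module_b = {"memaddrs": [addr_b]}
```

Here index 0 is local to each module, while the addresses 0 and 1 are unique within the store.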

4.2.4.1. Conventions
  • The notation denotes the set of addresses from address space occurring free in . We sometimes reinterpret this set as the vector of its elements.

4.2.5. Module Instances

A module instance is the runtime representation of a module. It is created by instantiating a module, and collects runtime representations of all entities that are imported, defined, or exported by the module.

Each component references runtime instances corresponding to respective declarations from the original module – whether imported or defined – in the order of their static indices. Function instances, table instances, memory instances, and global instances are referenced with an indirection through their respective addresses in the store.

It is an invariant of the semantics that all export instances in a given module instance have different names.

4.2.6. Function Instances

A function instance is the runtime representation of a function. It effectively is a closure of the original function over the runtime module instance of its originating module. The module instance is used to resolve references to other definitions during execution of the function.

A host function is a function expressed outside WebAssembly but passed to a module as an import. The definition and behavior of host functions are outside the scope of this specification. For the purpose of this specification, it is assumed that when invoked, a host function behaves non-deterministically, but within certain constraints that ensure the integrity of the runtime.

Note

Function instances are immutable, and their identity is not observable by WebAssembly code. However, the embedder might provide implicit or explicit means for distinguishing their addresses.

4.2.7. Table Instances

A table instance is the runtime representation of a table. It records its type and holds a vector of reference values.

Table elements can be mutated through table instructions, the execution of an active element segment, or by external means provided by the embedder.

It is an invariant of the semantics that all table elements have a type matching the element type of . It also is an invariant that the length of the element vector never exceeds the maximum size of , if present.

4.2.8. Memory Instances

A memory instance is the runtime representation of a linear memory. It records its type and holds a vector of bytes.

The length of the vector always is a multiple of the WebAssembly page size, which is defined to be the constant 65536 – abbreviated 64 Ki.

The bytes can be mutated through memory instructions, the execution of an active data segment, or by external means provided by the embedder.

It is an invariant of the semantics that the length of the byte vector, divided by page size, never exceeds the maximum size of , if present.
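The two invariants on memory instances can be made concrete with a small model. The class `MemInst` and its methods are hypothetical, sizes are counted in pages of 65536 bytes, and the sketch ignores any further implementation limits on growth.

```python
PAGE_SIZE = 65536  # bytes per WebAssembly page (64 Ki)


class MemInst:
    """Toy memory instance: a byte vector plus an optional maximum in pages."""

    def __init__(self, min_pages, max_pages=None):
        self.max_pages = max_pages
        self.data = bytearray(min_pages * PAGE_SIZE)

    def size_pages(self):
        """Current size in pages; the byte length is always a page multiple."""
        return len(self.data) // PAGE_SIZE

    def grow(self, delta_pages):
        """Try to grow by whole pages; refuse if the maximum would be exceeded."""
        new_size = self.size_pages() + delta_pages
        if self.max_pages is not None and new_size > self.max_pages:
            return False
        self.data.extend(bytes(delta_pages * PAGE_SIZE))
        return True
```

Because the byte vector only ever changes by whole pages and growth is refused beyond the maximum, both invariants hold after every operation in this model.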

4.2.9. Global Instances

A global instance is the runtime representation of a global variable. It records its type and holds an individual value.

The value of mutable globals can be mutated through variable instructions or by external means provided by the embedder.

It is an invariant of the semantics that the value has a type matching the value type of .

4.2.10. Element Instances

An element instance is the runtime representation of an element segment. It holds a vector of references and their common type.

4.2.11. Data Instances

A data instance is the runtime representation of a data segment. It holds a vector of bytes.

4.2.12. Export Instances

An export instance is the runtime representation of an export. It defines the export’s name and the associated external value.

4.2.13. External Values

An external value is the runtime representation of an entity that can be imported or exported. It is an address denoting either a function instance, table instance, memory instance, or global instance in the shared store.

4.2.13.1. Conventions

The following auxiliary notation is defined for sequences of external values. It filters out entries of a specific kind in an order-preserving fashion:
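Such an order-preserving filter might look as follows. The encoding of an external value as a `(kind, address)` pair and the name `filter_kind` are assumptions made for this sketch.

```python
def filter_kind(externvals, kind):
    """Return the addresses of all external values of the given kind, in order."""
    return [addr for k, addr in externvals if k == kind]


externvals = [("func", 3), ("mem", 0), ("func", 1), ("global", 2)]
func_addrs = filter_kind(externvals, "func")
```

The relative order of the selected entries is the same as in the input sequence.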

4.2.14. Aggregate Instances

A structure instance is the runtime representation of a heap object allocated from a structure type. Likewise, an array instance is the runtime representation of a heap object allocated from an array type. Both record their respective defined type and hold a vector of the values of their fields.

4.2.14.1. Conventions

4.2.15. Stack

Besides the store, most instructions interact with an implicit stack. The stack contains three kinds of entries:

These entries can occur on the stack in any order during the execution of a program. Stack entries are described by abstract syntax as follows.

Note

It is possible to model the WebAssembly semantics using separate stacks for operands, control constructs, and calls. However, because the stacks are interdependent, additional bookkeeping about associated stack heights would be required. For the purpose of this specification, an interleaved representation is simpler.

4.2.15.1. Values

Values are represented by themselves.

4.2.15.2. Labels

Labels carry an argument arity and their associated branch target, which is expressed syntactically as an instruction sequence:

Intuitively, is the continuation to execute when the branch is taken, in place of the original control construct.

Note

For example, a loop label has the form

When performing a branch to this label, this executes the loop, effectively restarting it from the beginning. Conversely, a simple block label has the form

When branching, the empty continuation ends the targeted block, such that execution can proceed with consecutive instructions.

4.2.15.3. Activation Frames

Activation frames carry the return arity of the respective function, hold the values of its locals (including arguments) in the order corresponding to their static local indices, and a reference to the function’s own module instance:

Locals may be uninitialized, in which case they are empty. Locals are mutated by respective variable instructions.

4.2.15.4. Conventions
  • The meta variable ranges over labels where clear from context.

  • The meta variable ranges over frame states where clear from context.

  • The following auxiliary definition takes a block type and looks up the instruction type that it denotes in the current frame:

4.2.16. Administrative Instructions

Note

This section is only relevant for the formal notation.

In order to express the reduction of traps, calls, and control instructions, the syntax of instructions is extended to include the following administrative instructions:

The trap instruction represents the occurrence of a trap. Traps are bubbled up through nested instruction sequences, ultimately reducing the entire program to a single trap instruction, signalling abrupt termination.

The instruction represents unboxed scalar reference values, and represent structure and array reference values, respectively, and instruction represents function reference values. Similarly, represents host references and represents any externalized reference.

The instruction represents the imminent invocation of a function instance, identified by its address. It unifies the handling of different forms of calls. Analogously, represents the imminent tail invocation of a function instance.

The label and frame instructions model labels and frames “on the stack”. Moreover, the administrative syntax maintains the nesting structure of the original structured control instruction or function body and their instruction sequences with an end marker. That way, the end of the inner instruction sequence is known when part of an outer sequence.

Note

For example, the reduction rule for is:

This replaces the block with a label instruction, which can be interpreted as “pushing” the label on the stack. When is reached, i.e., the inner instruction sequence has been reduced to the empty sequence – or rather, a sequence of instructions representing the resulting values – then the instruction is eliminated courtesy of its own reduction rule:

This can be interpreted as removing the label from the stack and only leaving the locally accumulated operand values.

4.2.16.1. Block Contexts

In order to specify the reduction of branches, the following syntax of block contexts is defined, indexed by the count of labels surrounding a hole that marks the place where the next step of computation is taking place:

This definition makes it possible to index the active labels surrounding a branch or return instruction.

Note

For example, the reduction of a simple branch can be defined as follows:

Here, the hole of the context is instantiated with a branch instruction. When a branch occurs, this rule replaces the targeted label and associated instruction sequence with the label’s continuation. The selected label is identified through the label index , which corresponds to the number of surrounding instructions that must be hopped over – which is exactly the count encoded in the index of a block context.

4.2.16.2. Configurations

A configuration consists of the current store and an executing thread.

A thread is a computation over instructions that operates relative to the state of a current frame referring to the module instance in which the computation runs, i.e., where the current function originates from.

Note

The current version of WebAssembly is single-threaded, but configurations with multiple threads may be supported in the future.

4.2.16.3. Evaluation Contexts

Finally, the following definition of evaluation context and associated structural rules enable reduction inside instruction sequences and administrative forms as well as the propagation of traps:

Reduction terminates when a thread’s instruction sequence has been reduced to a result, that is, either a sequence of values or a trap.

Note

The restriction on evaluation contexts rules out contexts like and for which .

For an example of reduction under evaluation contexts, consider the following instruction sequence.

This can be decomposed into where

Moreover, this is the only possible choice of evaluation context where the contents of the hole matches the left-hand side of a reduction rule.

4.3. Numerics

Numeric primitives are defined in a generic manner, by operators indexed over a bit width .

Some operators are non-deterministic, because they can return one of several possible results (such as different NaN values). Technically, each operator thus returns a set of allowed values. For convenience, deterministic results are expressed as plain values, which are assumed to be identified with a respective singleton set.

Some operators are partial, because they are not defined on certain inputs. Technically, an empty set of results is returned for these inputs.

In formal notation, each operator is defined by equational clauses that apply in decreasing order of precedence. That is, the first clause that is applicable to the given arguments defines the result. In some cases, similar clauses are combined into one by using the notation ± or ∓. When several of these placeholders occur in a single clause, then they must be resolved consistently: either the upper sign is chosen for all of them or the lower sign.

Note

For example, the operator is defined as follows:

This definition is to be read as a shorthand for the following expansion of each clause into two separate ones:

Numeric operators are lifted to input sequences by applying the operator element-wise, returning a sequence of results. When there are multiple inputs, they must be of equal length.

Note

For example, the unary operator , when given a sequence of floating-point values, returns a sequence of floating-point results:

The binary operator , when given two sequences of integers of the same length, returns a sequence of integer results:
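The element-wise lifting described above can be sketched as follows. This is only an illustration: the helper names `lift_unary`, `lift_binary`, and `iadd` are hypothetical, and `math.fabs` stands in for a unary floating-point operator.

```python
import math
from typing import Callable, List, Sequence


def lift_unary(op: Callable, xs: Sequence) -> List:
    """Apply a unary operator to every element of a sequence."""
    return [op(x) for x in xs]


def lift_binary(op: Callable, xs: Sequence, ys: Sequence) -> List:
    """Apply a binary operator pairwise; both inputs must have equal length."""
    if len(xs) != len(ys):
        raise ValueError("input sequences must be of equal length")
    return [op(x, y) for x, y in zip(xs, ys)]


def iadd(n: int, i1: int, i2: int) -> int:
    """Hypothetical n-bit integer addition, modulo 2**n."""
    return (i1 + i2) % (1 << n)


abs_results = lift_unary(math.fabs, [-1.5, 2.0, -0.0])
sum_results = lift_binary(lambda a, b: iadd(8, a, b), [250, 1], [10, 2])
```

The length check mirrors the requirement that multiple inputs be of equal length; how a real implementation reports a mismatch is not specified here.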

Conventions:

  • The meta variable is used to range over single bits.

  • The meta variable is used to range over (signless) magnitudes of floating-point values, including and .

  • The meta variable is used to range over (signless) rational magnitudes, excluding or .

  • The notation denotes the inverse of a bijective function .

  • Truncation of rational values is written , with the usual mathematical definition:

  • Saturation of integers is written and . The arguments to these two functions range over arbitrary signed integers.

    • Unsigned saturation, clamps to between and :

    • Signed saturation, clamps to between and :
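As a rough sketch of the truncation and saturation conventions, assuming the usual bounds of 0 and 2^N - 1 for unsigned saturation and -2^(N-1) and 2^(N-1) - 1 for signed saturation (the function names are chosen here for illustration):

```python
import math
from fractions import Fraction


def trunc(q: Fraction) -> int:
    """Truncate a rational toward zero."""
    return math.trunc(q)


def sat_u(n: int, i: int) -> int:
    """Clamp an arbitrary signed integer to the unsigned n-bit range."""
    return max(0, min(i, (1 << n) - 1))


def sat_s(n: int, i: int) -> int:
    """Clamp an arbitrary signed integer to the signed n-bit range."""
    return max(-(1 << (n - 1)), min(i, (1 << (n - 1)) - 1))
```

Both saturation helpers accept any Python integer, matching the statement that their arguments range over arbitrary signed integers.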

4.3.1. Representations

Numbers and numeric vectors have an underlying binary representation as a sequence of bits:

The first case of these applies to representations of both integer value types and packed types.

Each of these functions is a bijection, hence they are invertible.

4.3.1.1. Integers

Integers are represented as base two unsigned numbers:

Boolean operators like , , or are lifted to bit sequences of equal length by applying them pointwise.

4.3.1.2. Floating-Point

Floating-point values are represented in the respective binary format defined by [IEEE-754-2019] (Section 3.4):

where and .

4.3.1.3. Vectors

Numeric vectors of type have the same underlying representation as an . They can also be interpreted as a sequence of numeric values packed into a with a particular , provided that .

This function is a bijection on , hence it is invertible.

4.3.1.4. Storage

When a number is stored into memory, it is converted into a sequence of bytes in little endian byte order:

Again these functions are invertible bijections.
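A minimal sketch of little-endian storage and its inverse, using Python's built-in integer and `struct` conversions. The helper names are hypothetical; the floating-point case assumes the IEEE 754 binary32 layout that `struct` uses for the `f` format.

```python
import struct


def int_to_bytes(n: int, i: int) -> bytes:
    """Store an n-bit unsigned integer as n/8 bytes, least significant first."""
    return i.to_bytes(n // 8, "little")


def int_from_bytes(b: bytes) -> int:
    """Inverse of int_to_bytes."""
    return int.from_bytes(b, "little")


def f32_to_bytes(z: float) -> bytes:
    """Store a value in binary32 format, little-endian."""
    return struct.pack("<f", z)


stored = int_to_bytes(32, 0x12345678)
```

For example, `stored` holds the bytes `78 56 34 12`, and `int_from_bytes(stored)` recovers the original value.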

4.3.2. Integer Operations

4.3.2.1. Sign Interpretation

Integer operators are defined on values. Operators that use a signed interpretation convert the value using the following definition, which takes the two’s complement when the value lies in the upper half of the value range (i.e., its most significant bit is 1):

This function is bijective, and hence invertible.
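The signed interpretation and its inverse can be sketched like this; `signed` and `unsigned` are illustrative names, and the range checks are an addition of this sketch rather than part of the definition.

```python
def signed(n: int, i: int) -> int:
    """Interpret an n-bit unsigned value as a two's-complement signed integer."""
    if not 0 <= i < (1 << n):
        raise ValueError("value out of range for the given bit width")
    return i if i < (1 << (n - 1)) else i - (1 << n)


def unsigned(n: int, j: int) -> int:
    """Inverse of signed: map a signed integer back to its n-bit encoding."""
    if not -(1 << (n - 1)) <= j < (1 << (n - 1)):
        raise ValueError("value out of range for the given bit width")
    return j % (1 << n)
```

Because the mapping is a bijection, `unsigned(n, signed(n, i)) == i` holds for every `n`-bit value `i`.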

4.3.2.2. Boolean Interpretation

The integer result of predicates – i.e., tests and relational operators – is defined with the help of the following auxiliary function producing the value 1 or 0 depending on a condition.

4.3.2.3.
  • Return the result of adding and modulo .

4.3.2.4.
  • Return the result of subtracting from modulo .

4.3.2.5.
  • Return the result of multiplying and modulo .

4.3.2.6.
  • If is , then the result is undefined.

  • Else, return the result of dividing by , truncated toward zero.

Note

This operator is partial.

4.3.2.7.
  • Let be the signed interpretation of .

  • Let be the signed interpretation of .

  • If is , then the result is undefined.

  • Else if divided by is , then the result is undefined.

  • Else, return the result of dividing by , truncated toward zero.

Note

This operator is partial. Besides division by , the result of is not representable as an -bit signed integer.

4.3.2.8.
  • If is , then the result is undefined.

  • Else, return the remainder of dividing by .

Note

This operator is partial.

As long as both operators are defined, it holds that .

4.3.2.9.
  • Let be the signed interpretation of .

  • Let be the signed interpretation of .

  • If is , then the result is undefined.

  • Else, return the remainder of dividing by , with the sign of the dividend .

Note

This operator is partial.

As long as both operators are defined, it holds that .
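The division and remainder operators above can be sketched as follows. The function names are hypothetical, and representing an undefined result as `None` is a modelling choice of this sketch only.

```python
def signed(n, i):
    """Two's-complement reading of an n-bit unsigned value."""
    return i if i < (1 << (n - 1)) else i - (1 << n)


def idiv_u(n, i1, i2):
    if i2 == 0:
        return None  # undefined: division by zero
    return i1 // i2  # operands are non-negative, so floor equals truncation


def irem_u(n, i1, i2):
    if i2 == 0:
        return None
    return i1 % i2


def idiv_s(n, i1, i2):
    j1, j2 = signed(n, i1), signed(n, i2)
    if j2 == 0:
        return None
    q = abs(j1) // abs(j2)  # truncate toward zero, then restore the sign
    if (j1 < 0) != (j2 < 0):
        q = -q
    if q == 1 << (n - 1):
        return None  # undefined: quotient is not representable in n signed bits
    return q % (1 << n)


def irem_s(n, i1, i2):
    j1, j2 = signed(n, i1), signed(n, i2)
    if j2 == 0:
        return None
    r = abs(j1) % abs(j2)
    if j1 < 0:
        r = -r  # the remainder takes the sign of the dividend
    return r % (1 << n)
```

With 8-bit operands, dividing -7 (encoded as `0xF9`) by 2 yields -3 with remainder -1, so the dividend equals divisor times quotient plus remainder in the signed interpretation.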

4.3.2.10.
  • Return the bitwise negation of .

4.3.2.11.
  • Return the bitwise conjunction of and .

4.3.2.12.
  • Return the bitwise conjunction of and the bitwise negation of .

4.3.2.13.
  • Return the bitwise disjunction of and .

4.3.2.14.
  • Return the bitwise exclusive disjunction of and .

4.3.2.15.
  • Let be modulo .

  • Return the result of shifting left by bits, modulo .

4.3.2.16.
  • Let be modulo .

  • Return the result of shifting right by bits, extended with bits.

4.3.2.17.
  • Let be modulo .

  • Return the result of shifting right by bits, extended with the most significant bit of the original value.

4.3.2.18.
  • Let be modulo .

  • Return the result of rotating left by bits.

4.3.2.19.
  • Let be modulo .

  • Return the result of rotating right by bits.
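A sketch of the shift and rotate operators, assuming that the shift count is first reduced modulo the bit width N, as the steps above indicate. All function names are illustrative.

```python
def signed(n, i):
    """Two's-complement reading of an n-bit unsigned value."""
    return i if i < (1 << (n - 1)) else i - (1 << n)


def ishl(n, i1, i2):
    k = i2 % n
    return (i1 << k) % (1 << n)


def ishr_u(n, i1, i2):
    k = i2 % n
    return i1 >> k  # vacated high bits are filled with zeros


def ishr_s(n, i1, i2):
    k = i2 % n
    # Python's >> on a negative int replicates the sign bit
    return (signed(n, i1) >> k) % (1 << n)


def irotl(n, i1, i2):
    k = i2 % n
    return ((i1 << k) | (i1 >> (n - k))) % (1 << n)


def irotr(n, i1, i2):
    k = i2 % n
    return ((i1 >> k) | (i1 << (n - k))) % (1 << n)
```

Reducing the count modulo N means that, for 8-bit values, shifting by 9 behaves like shifting by 1, and rotating by 8 returns the input unchanged.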

4.3.2.20.
  • Return the count of leading zero bits in ; all bits are considered leading zeros if is .

4.3.2.21.
  • Return the count of trailing zero bits in ; all bits are considered trailing zeros if is .

4.3.2.22.
  • Return the count of non-zero bits in .
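The three bit-counting operators can be sketched with Python integer methods; the names `iclz`, `ictz`, and `ipopcnt` are used here for illustration.

```python
def iclz(n: int, i: int) -> int:
    """Count leading zero bits of an n-bit value (n when i is 0)."""
    return n - i.bit_length()


def ictz(n: int, i: int) -> int:
    """Count trailing zero bits of an n-bit value (n when i is 0)."""
    if i == 0:
        return n
    return (i & -i).bit_length() - 1  # isolate the lowest set bit


def ipopcnt(n: int, i: int) -> int:
    """Count the non-zero bits of an n-bit value."""
    return bin(i).count("1")
```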

4.3.2.23.
  • Return if is zero, otherwise.

4.3.2.24.
  • Return if equals , otherwise.

4.3.2.25.
  • Return if does not equal , otherwise.

4.3.2.26.
  • Return if is less than , otherwise.

4.3.2.27.
4.3.2.28.
  • Return if is greater than , otherwise.

4.3.2.29.
4.3.2.30.
  • Return if is less than or equal to , otherwise.

4.3.2.31.
4.3.2.32.
  • Return if is greater than or equal to , otherwise.

4.3.2.33.
4.3.2.34.
  • Let be the result of computing .

  • Return .

4.3.2.35.
  • Let be the bitwise conjunction of and .

  • Let be the bitwise negation of .

  • Let be the bitwise conjunction of and .

  • Return the bitwise disjunction of and .
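The bit-selection steps above can be sketched as follows, assuming that the third operand acts as the mask: bits are taken from the first operand where the mask bit is set and from the second operand elsewhere. The name `ibitselect` and the operand order are assumptions of this sketch.

```python
def ibitselect(n: int, i1: int, i2: int, i3: int) -> int:
    all_ones = (1 << n) - 1
    j1 = i1 & i3            # bits of i1 selected by the mask
    j3 = ~i3 & all_ones     # bitwise negation of the mask, kept to n bits
    j2 = i2 & j3            # bits of i2 selected by the negated mask
    return j1 | j2


selected = ibitselect(8, 0xAA, 0x55, 0xF0)
```

Here the high nibble comes from `0xAA` and the low nibble from `0x55`.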

4.3.2.36.
  • Let be the signed interpretation of .

  • If is greater than or equal to , then return .

  • Else return the negation of j, modulo .

4.3.2.37.
  • Return the result of negating , modulo .

4.3.2.38.
  • Return if is , return otherwise.

4.3.2.39.
  • Return if is , return otherwise.

4.3.2.40.
  • Return if is , return otherwise.

4.3.2.41.
  • Return if is , return otherwise.

4.3.2.42.
  • Let be the result of adding and .

  • Return .

4.3.2.43.
  • Let be the signed interpretation of .

  • Let be the signed interpretation of .

  • Let be the result of adding and .

  • Return .

4.3.2.44.
  • Let be the result of subtracting from .

  • Return .

4.3.2.45.
  • Let be the signed interpretation of .

  • Let be the signed interpretation of .

  • Let be the result of subtracting from .

  • Return .
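The four saturating addition and subtraction operators above can be sketched in terms of the saturation helpers. The names are illustrative, and the signed variants return the two's-complement encoding of the clamped result, which is a choice made in this sketch.

```python
def signed(n, i):
    """Two's-complement reading of an n-bit unsigned value."""
    return i if i < (1 << (n - 1)) else i - (1 << n)


def sat_u(n, i):
    return max(0, min(i, (1 << n) - 1))


def sat_s(n, i):
    return max(-(1 << (n - 1)), min(i, (1 << (n - 1)) - 1))


def iadd_sat_u(n, i1, i2):
    return sat_u(n, i1 + i2)


def iadd_sat_s(n, i1, i2):
    return sat_s(n, signed(n, i1) + signed(n, i2)) % (1 << n)


def isub_sat_u(n, i1, i2):
    return sat_u(n, i1 - i2)


def isub_sat_s(n, i1, i2):
    return sat_s(n, signed(n, i1) - signed(n, i2)) % (1 << n)
```

Unlike the modular operators, these clamp at the range boundary: for 8-bit values, 200 plus 100 gives 255 rather than 44.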

4.3.2.46.
  • Let be the result of adding , , and .

  • Return the result of dividing by , truncated toward zero.

4.3.2.47.
  • Return the result of .

4.3.3. Floating-Point Operations

Floating-point arithmetic follows the [IEEE-754-2019] standard, with the following qualifications:

  • All operators use round-to-nearest ties-to-even, except where otherwise specified. Non-default directed rounding attributes are not supported.

  • Following the recommendation that operators propagate NaN payloads from their operands is permitted but not required.

  • All operators use “non-stop” mode, and floating-point exceptions are not otherwise observable. In particular, neither alternate floating-point exception handling attributes nor operators on status flags are supported. There is no observable difference between quiet and signalling NaNs.

Note

Some of these limitations may be lifted in future versions of WebAssembly.

4.3.3.1. Rounding

Rounding is always round-to-nearest ties-to-even, in correspondence with [IEEE-754-2019] (Section 4.3.1).
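To illustrate ties-to-even, the following sketch rounds a Python float (a binary64 value, not an arbitrary real number) to binary32 by round-tripping through `struct`. It assumes CPython's behaviour, where packing with the `f` format rounds to nearest with ties to even and raises `OverflowError` when the rounded result would be infinite. The name `round_to_f32` is hypothetical.

```python
import math
import struct


def round_to_f32(r: float) -> float:
    """Round a binary64 value to the nearest binary32 value, ties to even."""
    if math.isnan(r) or math.isinf(r):
        return r
    try:
        return struct.unpack("<f", struct.pack("<f", r))[0]
    except OverflowError:
        # magnitude too large for binary32: the result is an infinity
        return math.copysign(math.inf, r)


# 2**24 + 1 lies exactly halfway between two binary32 neighbours
tie_low = round_to_f32(16777217.0)
# 2**24 + 3 is also a tie; here the even significand is the upper neighbour
tie_high = round_to_f32(16777219.0)
```

The first tie resolves downward to 16777216 and the second upward to 16777220, because in each case that neighbour has an even significand.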

An exact floating-point number is a rational number that is exactly representable as a floating-point number of given bit width .

A limit number for a given floating-point bit width is a positive or negative number whose magnitude is the smallest power of 2 that is not exactly representable as a floating-point number of width (that magnitude is 2^128 for bit width 32 and 2^1024 for bit width 64).

A candidate number is either an exact floating-point number or a positive or negative limit number for the given bit width .

A candidate pair is a pair of candidate numbers, such that no candidate number exists that lies between the two.

A real number is converted to a floating-point value of bit width as follows:

  • If is , then return .

  • Else if is an exact floating-point number, then return .

  • Else if is greater than or equal to the positive limit, then return .

  • Else if is less than or equal to the negative limit, then return .

  • Else if and are a candidate pair such that , then:

    • If , then let be .

    • Else if , then let be .

    • Else if and the significand of is even, then let be .

    • Else, let be .

  • If is , then:

    • If , then return .

    • Else, return .

  • Else if is a limit number, then:

    • If , then return .

    • Else, return .

  • Else, return .

where:

4.3.3.2. NaN Propagation

When the result of a floating-point operator other than , , or is a NaN, then its sign is non-deterministic and the payload is computed as follows:

  • If the payload of all NaN inputs to the operator is canonical (including the case that there are no NaN inputs), then the payload of the output is canonical as well.

  • Otherwise the payload is picked non-deterministically among all arithmetic NaNs; that is, its most significant bit is 1 and all others are unspecified.

This non-deterministic result is expressed by the following auxiliary function producing a set of allowed outputs from a set of inputs:

4.3.3.3.
  • If either or is a NaN, then return an element of .

  • Else if both and are infinities of opposite signs, then return an element of .

  • Else if both and are infinities of equal sign, then return that infinity.

  • Else if either or is an infinity, then return that infinity.

  • Else if both and are zeroes of opposite sign, then return positive zero.

  • Else if both and are zeroes of equal sign, then return that zero.

  • Else if either or is a zero, then return the other operand.

  • Else if both and are values with the same magnitude but opposite signs, then return positive zero.

  • Else return the result of adding and , rounded to the nearest representable value.

4.3.3.4.
  • If either or is a NaN, then return an element of .

  • Else if both and are infinities of equal signs, then return an element of .

  • Else if both and are infinities of opposite sign, then return .

  • Else if is an infinity, then return that infinity.

  • Else if is an infinity, then return that infinity negated.

  • Else if both and are zeroes of equal sign, then return positive zero.

  • Else if both and are zeroes of opposite sign, then return .

  • Else if is a zero, then return .

  • Else if is a zero, then return negated.

  • Else if both and are the same value, then return positive zero.

  • Else return the result of subtracting from , rounded to the nearest representable value.

Note

Up to the non-determinism regarding NaNs, it always holds that .

4.3.3.5.
  • If either or is a NaN, then return an element of .

  • Else if one of and is a zero and the other an infinity, then return an element of .

  • Else if both and are infinities of equal sign, then return positive infinity.

  • Else if both and are infinities of opposite sign, then return negative infinity.

  • Else if either or is an infinity and the other a value with equal sign, then return positive infinity.

  • Else if either or is an infinity and the other a value with opposite sign, then return negative infinity.

  • Else if both and are zeroes of equal sign, then return positive zero.

  • Else if both and are zeroes of opposite sign, then return negative zero.

  • Else return the result of multiplying and , rounded to the nearest representable value.

4.3.3.6.
  • If either or is a NaN, then return an element of .

  • Else if both and are infinities, then return an element of .

  • Else if both and are zeroes, then return an element of .

  • Else if is an infinity and a value with equal sign, then return positive infinity.

  • Else if is an infinity and a value with opposite sign, then return negative infinity.

  • Else if is an infinity and a value with equal sign, then return positive zero.

  • Else if is an infinity and a value with opposite sign, then return negative zero.

  • Else if is a zero and a value with equal sign, then return positive zero.

  • Else if is a zero and a value with opposite sign, then return negative zero.

  • Else if is a zero and a value with equal sign, then return positive infinity.

  • Else if is a zero and a value with opposite sign, then return negative infinity.

  • Else return the result of dividing by , rounded to the nearest representable value.

4.3.3.7.
  • If either or is a NaN, then return an element of .

  • Else if either or is a negative infinity, then return negative infinity.

  • Else if either or is a positive infinity, then return the other value.

  • Else if both and are zeroes of opposite signs, then return negative zero.

  • Else return the smaller value of and .

4.3.3.8.
  • If either or is a NaN, then return an element of .

  • Else if either or is a positive infinity, then return positive infinity.

  • Else if either or is a negative infinity, then return the other value.

  • Else if both and are zeroes of opposite signs, then return positive zero.

  • Else return the larger value of and .
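The minimum and maximum rules above differ from a plain comparison in two ways: a NaN operand always produces a NaN, and zeroes of opposite sign are ordered. A sketch on Python floats (binary64), with hypothetical names `fmin` and `fmax`, might look like this. Returning `math.nan` picks one representative from the set of allowed NaN results.

```python
import math


def _is_negative(z: float) -> bool:
    return math.copysign(1.0, z) < 0


def fmin(z1: float, z2: float) -> float:
    if math.isnan(z1) or math.isnan(z2):
        return math.nan
    if z1 == 0 and z2 == 0:
        # zeroes of opposite sign: negative zero is the minimum
        return -0.0 if _is_negative(z1) or _is_negative(z2) else 0.0
    return z1 if z1 < z2 else z2


def fmax(z1: float, z2: float) -> float:
    if math.isnan(z1) or math.isnan(z2):
        return math.nan
    if z1 == 0 and z2 == 0:
        # zeroes of opposite sign: positive zero is the maximum
        return 0.0 if not _is_negative(z1) or not _is_negative(z2) else -0.0
    return z1 if z1 > z2 else z2
```

The infinity cases need no special handling in this sketch, since ordinary comparison already orders infinities against all other non-NaN values.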

4.3.3.9.
  • If and have the same sign, then return .

  • Else return with negated sign.

4.3.3.10.
  • If is a NaN, then return with positive sign.

  • Else if is an infinity, then return positive infinity.

  • Else if is a zero, then return positive zero.

  • Else if is a positive value, then return .

  • Else return negated.

4.3.3.11.
  • If is a NaN, then return with negated sign.

  • Else if is an infinity, then return that infinity negated.

  • Else if is a zero, then return that zero negated.

  • Else return negated.

4.3.3.12.
  • If is a NaN, then return an element of .

  • Else if is negative infinity, then return an element of .

  • Else if is positive infinity, then return positive infinity.

  • Else if is a zero, then return that zero.

  • Else if has a negative sign, then return an element of .

  • Else return the square root of .

4.3.3.13.
  • If is a NaN, then return an element of .

  • Else if is an infinity, then return .

  • Else if is a zero, then return .

  • Else if is smaller than but greater than , then return negative zero.

  • Else return the smallest integral value that is not smaller than .

4.3.3.14.
  • If is a NaN, then return an element of .

  • Else if is an infinity, then return .

  • Else if is a zero, then return .

  • Else if is greater than but smaller than , then return positive zero.

  • Else return the largest integral value that is not larger than .

4.3.3.15.
  • If is a NaN, then return an element of .

  • Else if is an infinity, then return .

  • Else if is a zero, then return .

  • Else if is greater than but smaller than , then return positive zero.

  • Else if is smaller than but greater than , then return negative zero.

  • Else return the integral value with the same sign as and the largest magnitude that is not larger than the magnitude of .

4.3.3.16.
  • If is a NaN, then return an element of .

  • Else if is an infinity, then return .

  • Else if is a zero, then return .

  • Else if is greater than but smaller than or equal to , then return positive zero.

  • Else if is smaller than but greater than or equal to , then return negative zero.

  • Else return the integral value that is nearest to ; if two values are equally near, return the even one.
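The four round-to-integral operators above share the same shape: NaNs, infinities, and zeroes are handled first, and the sign of the input is preserved even when the result is zero. A sketch on Python floats, with hypothetical function names, could read:

```python
import math


def _to_integral(z: float, rounder) -> float:
    if math.isnan(z):
        return math.nan  # one representative of the allowed NaN results
    if math.isinf(z) or z == 0:
        return z
    # copysign keeps the sign of z, so results in (-1, 0) become negative zero
    return math.copysign(float(rounder(z)), z)


def fceil(z):
    return _to_integral(z, math.ceil)


def ffloor(z):
    return _to_integral(z, math.floor)


def ftrunc(z):
    return _to_integral(z, math.trunc)


def fnearest(z):
    # Python's one-argument round() resolves ties to the even integer
    return _to_integral(z, round)
```

For instance, `fnearest(2.5)` gives `2.0` while `fnearest(3.5)` gives `4.0`, and `fceil(-0.5)` gives negative zero.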

4.3.3.17.
  • If either or is a NaN, then return .

  • Else if both and are zeroes, then return .

  • Else if both and are the same value, then return .

  • Else return .

4.3.3.18.
  • If either or is a NaN, then return .

  • Else if both and are zeroes, then return .

  • Else if both and are the same value, then return .

  • Else return .

4.3.3.19.
  • If either or is a NaN, then return .

  • Else if and are the same value, then return .

  • Else if is positive infinity, then return .

  • Else if is negative infinity, then return .

  • Else if is positive infinity, then return .

  • Else if is negative infinity, then return .

  • Else if both and are zeroes, then return .

  • Else if is smaller than , then return .

  • Else return .

4.3.3.20.
  • If either or is a NaN, then return .

  • Else if and are the same value, then return .

  • Else if is positive infinity, then return .

  • Else if is negative infinity, then return .

  • Else if is positive infinity, then return .

  • Else if is negative infinity, then return .

  • Else if both and are zeroes, then return .

  • Else if is larger than , then return .

  • Else return .

4.3.3.21.
  • If either or is a NaN, then return .

  • Else if and are the same value, then return .

  • Else if is positive infinity, then return .

  • Else if is negative infinity, then return .

  • Else if is positive infinity, then return .

  • Else if is negative infinity, then return .

  • Else if both and are zeroes, then return .

  • Else if is smaller than or equal to , then return .

  • Else return .

4.3.3.22.
  • If either or is a NaN, then return .

  • Else if and are the same value, then return .

  • Else if is positive infinity, then return .

  • Else if is negative infinity, then return .

  • Else if is positive infinity, then return .

  • Else if is negative infinity, then return .

  • Else if both and are zeroes, then return .

  • Else if is larger than or equal to , then return .

  • Else return .

4.3.3.23.
  • If is less than then return .

  • Else return .

4.3.3.24.
  • If is less than then return .

  • Else return .

4.3.4. Conversions

4.3.4.1.
  • Return .

Note

In the abstract syntax, unsigned extension just reinterprets the same value.

4.3.4.2.
  • Let be the signed interpretation of of size .

  • Return the two’s complement of relative to size .

4.3.4.3.
  • Return modulo .
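Signed extension and wrapping can be sketched as below, where `m` is the source width and `n` the target width. The names `extend_s` and `wrap` are illustrative.

```python
def signed(n, i):
    """Two's-complement reading of an n-bit unsigned value."""
    return i if i < (1 << (n - 1)) else i - (1 << n)


def extend_s(m: int, n: int, i: int) -> int:
    """Sign-extend an m-bit value to n bits."""
    j = signed(m, i)
    return j % (1 << n)  # two's complement of j relative to size n


def wrap(m: int, n: int, i: int) -> int:
    """Keep only the low n bits of an m-bit value."""
    return i % (1 << n)
```

Extending the 8-bit value `0xFF` (that is, -1) to 32 bits gives `0xFFFFFFFF`, while wrapping discards the high bits without regard to sign.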

4.3.4.4.
  • If is a NaN, then the result is undefined.

  • Else if is an infinity, then the result is undefined.

  • Else if is a number and is a value within range of the target type, then return that value.

  • Else the result is undefined.

Note

This operator is partial. It is not defined for NaNs, infinities, or values for which the result is out of range.

4.3.4.5.
  • If is a NaN, then the result is undefined.

  • Else if is an infinity, then the result is undefined.

  • Else if is a number and is a value within range of the target type, then return that value.

  • Else the result is undefined.

Note

This operator is partial. It is not defined for NaNs, infinities, or values for which the result is out of range.

4.3.4.6.
  • If is a NaN, then return .

  • Else if is negative infinity, then return .

  • Else if is positive infinity, then return .

  • Else, return .

4.3.4.7.
  • If is a NaN, then return .

  • Else if is negative infinity, then return .

  • Else if is positive infinity, then return .

  • Else, return .
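The two saturating truncations above never fail: a NaN maps to zero, infinities map to the bounds of the target range, and everything else is truncated and then clamped. The sketch below assumes exactly those bounds; the names are hypothetical, and the signed variant returns the two's-complement encoding of the result.

```python
import math


def trunc_sat_u(n: int, z: float) -> int:
    if math.isnan(z):
        return 0
    if z == -math.inf:
        return 0
    if z == math.inf:
        return (1 << n) - 1
    return max(0, min(math.trunc(z), (1 << n) - 1))


def trunc_sat_s(n: int, z: float) -> int:
    lo, hi = -(1 << (n - 1)), (1 << (n - 1)) - 1
    if math.isnan(z):
        j = 0
    elif z == -math.inf:
        j = lo
    elif z == math.inf:
        j = hi
    else:
        j = max(lo, min(math.trunc(z), hi))
    return j % (1 << n)  # encode the signed result in n bits
```

This contrasts with the partial truncation operators, which are undefined for NaNs, infinities, and out-of-range values.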

4.3.4.8.
  • If is a canonical NaN, then return an element of (i.e., a canonical NaN of size ).

  • Else if is a NaN, then return an element of (i.e., any arithmetic NaN of size ).

  • Else, return .

4.3.4.9.
  • If is a canonical NaN, then return an element of (i.e., a canonical NaN of size ).

  • Else if is a NaN, then return an element of (i.e., any arithmetic NaN of size ).

  • Else if is an infinity, then return that infinity.

  • Else if is a zero, then return that zero.

  • Else, return .

4.3.4.10.
4.3.4.11.
4.3.4.12.
  • Let be the bit sequence .

  • Return the constant for which .
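Reinterpretation leaves the underlying bit sequence unchanged and only changes the type through which it is read. A sketch for the 64-bit case, using `struct` to access the binary64 bit pattern of a Python float (the function names are hypothetical):

```python
import struct


def reinterpret_f64_as_i64(z: float) -> int:
    """Read the 64 bits of a binary64 value as an unsigned integer."""
    return struct.unpack("<Q", struct.pack("<d", z))[0]


def reinterpret_i64_as_f64(i: int) -> float:
    """Read a 64-bit unsigned integer as a binary64 value."""
    return struct.unpack("<d", struct.pack("<Q", i))[0]


one_bits = reinterpret_f64_as_i64(1.0)
```

The same byte order must be used for both the pack and unpack step; which one is used does not affect the result.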

4.3.4.13.
4.3.4.14.

4.4. Types

Execution has to check and compare types in a few places, such as executing or instantiating modules.

It is an invariant of the semantics that all types occurring during execution are closed.

Note

Runtime type checks generally involve types from multiple modules or types not defined by a module at all, such that module-local type indices are not meaningful.

4.4.1. Instantiation

Any form of type can be instantiated into a closed type inside a module instance by substituting each type index occurring in it with the corresponding defined type .

Note

This is the runtime equivalent to type closure.

4.5. Values

4.5.1. Value Typing

For the purpose of checking argument values against the parameter types of exported functions, values are classified by value types. The following auxiliary typing rules specify this typing relation relative to a store in which possibly referenced addresses live.

4.5.1.1. Numeric Values
4.5.1.2. Vector Values
4.5.1.3. Null References

Note

A null reference is typed with the least type in its respective hierarchy. That ensures that it is compatible with any nullable type in that hierarchy.

4.5.1.4. Scalar References
4.5.1.5. Structure References
4.5.1.6. Array References
4.5.1.7. Function References
4.5.1.8. Host References

Note

A host reference is considered internalized by this rule.

4.5.1.9. External References
4.5.1.10. Subsumption
  • The value must be valid with some value type .

  • The value type matches another valid type .

  • Then the value is valid with type .

4.5.2. External Typing

For the purpose of checking external values against imports, such values are classified by external types. The following auxiliary typing rules specify this typing relation relative to a store in which the referenced instances live.

4.5.2.1.
4.5.2.2.
4.5.2.3.
4.5.2.4.
4.5.2.5. Subsumption
  • The external value must be valid with some external type .

  • The external type matches another valid type .

  • Then the external value is valid with type .

4.6. Instructions

WebAssembly computation is performed by executing individual instructions.

4.6.1. Numeric Instructions

Numeric instructions are defined in terms of the generic numeric operators. The mapping of numeric instructions to their underlying operators is expressed by the following definition:

And for conversion operators:

Where the underlying operators are partial, the corresponding instruction will trap when the result is not defined. Where the underlying operators are non-deterministic, because they may return one of multiple possible NaN values, so are the corresponding instructions.

Note

For example, the result of instruction applied to operands invokes , which maps to the generic via the above definition. Similarly, applied to invokes , which maps to the generic .

4.6.1.1.
  1. Push the value to the stack.

Note

No formal reduction rule is required for this instruction, since instructions already are values.

4.6.1.2.
  1. Assert: due to validation, a value of value type is on the top of the stack.

  2. Pop the value from the stack.

  3. If is defined, then:

    1. Let be a possible result of computing .

    2. Push the value to the stack.

  4. Else:

    1. Trap.

4.6.1.3.
  1. Assert: due to validation, two values of value type are on the top of the stack.

  2. Pop the value from the stack.

  3. Pop the value from the stack.

  4. If is defined, then:

    1. Let be a possible result of computing .

    2. Push the value to the stack.

  5. Else:

    1. Trap.
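The execution steps for a binary operator can be sketched with a Python list as the operand stack. Following the convention from the Numerics section that an operator returns a set of allowed results, with the empty set meaning undefined, the sketch models operators as set-valued functions. `Trap`, `exec_binop`, and `idiv_u` are hypothetical names, and picking an arbitrary element stands in for the non-deterministic choice.

```python
class Trap(Exception):
    """Signals abrupt termination of the computation."""


def exec_binop(stack: list, op) -> None:
    c2 = stack.pop()  # the second operand is on top of the stack
    c1 = stack.pop()
    results = op(c1, c2)
    if results:
        stack.append(next(iter(results)))  # any allowed result may be chosen
    else:
        raise Trap("operator is undefined for these operands")


def idiv_u(i1: int, i2: int) -> set:
    """Unsigned division as a set-valued operator; empty when undefined."""
    return set() if i2 == 0 else {i1 // i2}


stack = [7, 2]
exec_binop(stack, idiv_u)
```

After the call, the two operands have been replaced by the single result 3; with a zero divisor on top of the stack, the call raises `Trap` instead.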

4.6.1.4.
  1. Assert: due to validation, a value of value type is on the top of the stack.

  2. Pop the value from the stack.

  3. Let be the result of computing .

  4. Push the value to the stack.

4.6.1.5.
  1. Assert: due to validation, two values of value type are on the top of the stack.

  2. Pop the value from the stack.

  3. Pop the value from the stack.

  4. Let be the result of computing .

  5. Push the value to the stack.

4.6.1.6.
  1. Assert: due to validation, a value of value type is on the top of the stack.

  2. Pop the value from the stack.

  3. If is defined:

    1. Let be a possible result of computing .

    2. Push the value to the stack.

  4. Else:

    1. Trap.

4.6.2. Reference Instructions

4.6.2.1.
  1. Let be the current frame.

  2. Assert: due to validation, the defined type exists.

  3. Let be the defined type .

  4. Push the value to the stack.

Note

No formal reduction rule is required for the case , since the instruction form is already a value.

4.6.2.2.
  1. Let be the current frame.

  2. Assert: due to validation, exists.

  3. Let be the function address .

  4. Push the value to the stack.

4.6.2.3.
  1. Assert: due to validation, a reference value is on the top of the stack.

  2. Pop the value from the stack.

  3. If is , then:

    1. Push the value to the stack.

  4. Else:

    1. Push the value to the stack.

4.6.2.4.
  1. Assert: due to validation, a reference value is on the top of the stack.

  2. Pop the value from the stack.

  3. If is , then:

    1. Trap.

  4. Push the value back to the stack.

4.6.2.5.
  1. Assert: due to validation, two reference values are on the top of the stack.

  2. Pop the value from the stack.

  3. Pop the value from the stack.

  4. If is the same as , then:

    1. Push the value to the stack.

  5. Else:

    1. Push the value to the stack.

4.6.2.6.
  1. Let be the current frame.

  2. Let be the reference type .

  3. Assert: due to validation, is closed.

  4. Assert: due to validation, a reference value is on the top of the stack.

  5. Pop the value from the stack.

  6. Assert: due to validation, the reference value is valid with some reference type.

  7. Let be the reference type of .

  8. If the reference type matches , then:

    1. Push the value to the stack.

  9. Else:

    1. Push the value to the stack.

4.6.2.7.
  1. Let be the current frame.

  2. Let be the reference type .

  3. Assert: due to validation, is closed.

  4. Assert: due to validation, a reference value is on the top of the stack.

  5. Pop the value from the stack.

  6. Assert: due to validation, the reference value is valid with some reference type.

  7. Let be the reference type of .

  8. If the reference type matches , then:

    1. Push the value back to the stack.

  9. Else:

    1. Trap.

4.6.2.8.
  1. Assert: due to validation, a value of type is on the top of the stack.

  2. Pop the value from the stack.

  3. Let be the result of computing .

  4. Push the reference value to the stack.

4.6.2.9.
  1. Assert: due to validation, a value of type is on the top of the stack.

  2. Pop the value from the stack.

  3. If is , then:

    1. Trap.

  4. Assert: due to validation, a is a scalar reference.

  5. Let be the reference value .

  6. Let be the result of computing .

  7. Push the value to the stack.

4.6.2.10.
  1. Let be the current frame.

  2. Assert: due to validation, the defined type exists.

  3. Let be the defined type .

  4. Assert: due to validation, the expansion of is a structure type.

  5. Let be the expanded structure type of .

  6. Let be the length of the field type sequence .

  7. Assert: due to validation, values are on the top of the stack.

  8. Pop the values from the stack.

  9. For every value in and corresponding field type in :

    1. Let be the result of computing .

  10. Let be the concatenation of all field values .

  11. Let be the structure instance .

  12. Let be the length of .

  13. Append to .

  14. Push the structure reference to the stack.

4.6.2.11.
  1. Let be the current frame.

  2. Assert: due to validation, the defined type exists.

  3. Let be the defined type .

  4. Assert: due to validation, the expansion of is a structure type.

  5. Let be the expanded structure type of .

  6. Let be the length of the field type sequence .

  7. For every field type in :

    1. Let be the value type .

    2. Assert: due to validation, is defined.

    3. Push the value to the stack.

  8. Execute the instruction .

4.6.2.12.
  1. Let be the current frame.

  2. Assert: due to validation, the defined type exists.

  3. Let be the defined type .

  4. Assert: due to validation, the expansion of is a structure type with at least fields.

  5. Let be the expanded structure type of .

  6. Let be the -th field type of .

  7. Assert: due to validation, a value of type is on the top of the stack.

  8. Pop the value from the stack.

  9. If is , then:

    1. Trap.

  10. Assert: due to validation, a is a structure reference.

  11. Let be the reference value .

  12. Assert: due to validation, the structure instance exists and has at least fields.

  13. Let be the field value .

  14. Let be the result of computing .

  15. Push the value to the stack.

4.6.2.13.
  1. Let be the current frame.

  2. Assert: due to validation, the defined type exists.

  3. Let be the defined type .

  4. Assert: due to validation, the expansion of is a structure type with at least fields.

  5. Let