Values

Bytes

Bytes encode themselves.

\begin{array}{llcll}
\mathrm{byte} & ::= & \mathtt{0x00} & \Rightarrow & \mathtt{0x00} \\
& | & \dots \\
& | & \mathtt{0xFF} & \Rightarrow & \mathtt{0xFF}
\end{array}

Integers

All integers are encoded using the LEB128 variable-length integer encoding, in either unsigned or signed variant.

Unsigned integers are encoded in unsigned LEB128 format. As an additional constraint, the total number of bytes encoding a value of type $uN$ must not exceed $\mathrm{ceil}(N/7)$ bytes.

\begin{array}{llclll}
uN & ::= & n{:}\mathrm{byte} & \Rightarrow & n & (\mathrm{if}~n < 2^{7} \land n < 2^{N}) \\
& | & n{:}\mathrm{byte}~~m{:}u(N-7) & \Rightarrow & 2^{7} \cdot m + (n - 2^{7}) & (\mathrm{if}~n \geq 2^{7} \land N > 7)
\end{array}
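The two productions above translate fairly directly into a recursive decoder. The following is a minimal sketch in Python, not a reference implementation; the function name `decode_unsigned` and its `(value, next_position)` return convention are assumptions made for illustration.

```python
def decode_unsigned(data: bytes, bits: int, pos: int = 0) -> tuple[int, int]:
    """Decode a uN value (N = bits) starting at data[pos].

    Returns (value, next_pos). Hypothetical helper mirroring the grammar.
    """
    if pos >= len(data):
        raise ValueError("unexpected end of input")
    n = data[pos]
    if n < 0x80:
        # Terminal byte (first production): n < 2^7 and n < 2^N.
        if n >= (1 << bits):
            raise ValueError("unused bits must be zero")
        return n, pos + 1
    # Continuation byte (second production): requires N > 7.
    if bits <= 7:
        raise ValueError("encoding exceeds ceil(N/7) bytes")
    m, end = decode_unsigned(data, bits - 7, pos + 1)
    return (m << 7) + (n - 0x80), end
```

For instance, `decode_unsigned(b"\x83\x00", 8)` accepts the padded encoding of 3, while `decode_unsigned(b"\x83\x10", 8)` raises `ValueError` because the second byte sets bits beyond the 8 available.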

Signed integers are encoded in signed LEB128 format, which uses a two’s complement representation. As an additional constraint, the total number of bytes encoding a value of type $sN$ must not exceed $\mathrm{ceil}(N/7)$ bytes.

\begin{array}{llclll}
sN & ::= & n{:}\mathrm{byte} & \Rightarrow & n & (\mathrm{if}~n < 2^{6} \land n < 2^{N-1}) \\
& | & n{:}\mathrm{byte} & \Rightarrow & n - 2^{7} & (\mathrm{if}~2^{6} \leq n < 2^{7} \land n \geq 2^{7} - 2^{N-1}) \\
& | & n{:}\mathrm{byte}~~m{:}s(N-7) & \Rightarrow & 2^{7} \cdot m + (n - 2^{7}) & (\mathrm{if}~n \geq 2^{7} \land N > 7)
\end{array}
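The three signed productions can be sketched the same way. This is an illustrative Python decoder under the assumption that a recursive structure mirroring the grammar is acceptable; the name `decode_signed` is hypothetical.

```python
def decode_signed(data: bytes, bits: int, pos: int = 0) -> tuple[int, int]:
    """Decode an sN value (N = bits) starting at data[pos].

    Returns (value, next_pos). Hypothetical helper mirroring the grammar.
    """
    if pos >= len(data):
        raise ValueError("unexpected end of input")
    n = data[pos]
    if n < 0x40:
        # Non-negative terminal byte: n < 2^6 and n < 2^(N-1).
        if n >= (1 << (bits - 1)):
            raise ValueError("unused bits must be 0 for a non-negative value")
        return n, pos + 1
    if n < 0x80:
        # Negative terminal byte: 2^6 <= n < 2^7 and n >= 2^7 - 2^(N-1).
        if n < 0x80 - (1 << (bits - 1)):
            raise ValueError("unused bits must be 1 for a negative value")
        return n - 0x80, pos + 1
    # Continuation byte: requires N > 7.
    if bits <= 7:
        raise ValueError("encoding exceeds ceil(N/7) bytes")
    m, end = decode_signed(data, bits - 7, pos + 1)
    return (m << 7) + (n - 0x80), end
```

With `bits=16`, the inputs `b"\x7e"`, `b"\xfe\x7f"`, and `b"\xfe\xff\x7f"` all decode to -2, whereas `b"\x83\x3e"` and `b"\xff\x7b"` are rejected with `bits=8`.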

Uninterpreted integers are encoded as signed integers.

\begin{array}{llclll}
iN & ::= & n{:}sN & \Rightarrow & i & (\mathrm{if}~n = \mathrm{signed}_{N}(i))
\end{array}
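To make the side condition concrete, the sketch below assumes the usual two's complement reading of $\mathrm{signed}_{N}$ (defined elsewhere in the specification, not in this section): values below $2^{N-1}$ map to themselves, and the rest map to $i - 2^{N}$. The function names are hypothetical.

```python
def signed(bits: int, i: int) -> int:
    """Two's complement interpretation of an N-bit value 0 <= i < 2^N (assumed definition)."""
    if not 0 <= i < (1 << bits):
        raise ValueError("value out of range for the given bit width")
    return i if i < (1 << (bits - 1)) else i - (1 << bits)


def uninterpreted(bits: int, n: int) -> int:
    """Inverse direction: recover i from a decoded sN value n."""
    if not -(1 << (bits - 1)) <= n < (1 << (bits - 1)):
        raise ValueError("value out of range for the given bit width")
    return n % (1 << bits)
```

Under this reading, an `i32` with all bits set is written as the signed value -1, which takes a single byte ($\mathtt{0x7F}$) instead of the five bytes needed by the unsigned encoding.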

Note

The side conditions $N > 7$ in the productions for non-terminal bytes of the $u$ and $s$ encodings restrict the encoding’s length. However, “trailing zeros” are still allowed within these bounds. For example, $\mathtt{0x03}$ and $\mathtt{0x83}~\mathtt{0x00}$ are both well-formed encodings of the value $3$ as a $u8$. Similarly, $\mathtt{0x7E}$, $\mathtt{0xFE}~\mathtt{0x7F}$, and $\mathtt{0xFE}~\mathtt{0xFF}~\mathtt{0x7F}$ are all well-formed encodings of the value $-2$ as an $s16$.

The side conditions on the value $n$ of terminal bytes further enforce that any unused bits in these bytes must be $0$ for positive values and $1$ for negative ones. For example, $\mathtt{0x83}~\mathtt{0x10}$ is malformed as a $u8$ encoding. Similarly, both $\mathtt{0x83}~\mathtt{0x3E}$ and $\mathtt{0xFF}~\mathtt{0x7B}$ are malformed as $s8$ encodings.

Floating-Point

Floating-point values are encoded directly by their IEEE 754 (Section 3.4) bit pattern in little-endian byte order:

\begin{array}{llcll}
fN & ::= & b^{*}{:}\mathrm{byte}^{N/8} & \Rightarrow & \mathrm{bytes}_{fN}^{-1}(b^{*})
\end{array}
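As a rough illustration, Python's standard `struct` module can play the role of $\mathrm{bytes}_{fN}$ and its inverse for $N = 32$ and $N = 64$. The helper names below are assumptions. Because the sketch converts through Python's `float`, it may not preserve every NaN bit pattern, so a decoder that must keep exact bits might prefer to store the raw integer instead.

```python
import struct

# Little-endian IEEE 754 format codes for the two widths.
_FORMATS = {32: "<f", 64: "<d"}


def encode_float(bits: int, value: float) -> bytes:
    """Encode value as an fN bit pattern in little-endian byte order."""
    return struct.pack(_FORMATS[bits], value)


def decode_float(bits: int, data: bytes) -> float:
    """Decode exactly N/8 bytes as an fN value."""
    if len(data) != bits // 8:
        raise ValueError("expected exactly N/8 bytes")
    return struct.unpack(_FORMATS[bits], data)[0]
```

For example, `encode_float(32, 1.0)` yields the four bytes `00 00 80 3F`, the bit pattern `0x3F800000` with its least significant byte first.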

Names

Names are encoded as a vector of bytes containing the Unicode (Section 3.9) UTF-8 encoding of the name’s character sequence.

\begin{array}{llclll}
\mathrm{name} & ::= & b^{*}{:}\mathrm{vec}(\mathrm{byte}) & \Rightarrow & \mathit{name} & (\mathrm{if}~\mathrm{utf8}(\mathit{name}) = b^{*})
\end{array}

The auxiliary $\mathrm{utf8}$ function expressing this encoding is defined as follows:

\begin{array}{l}
\begin{array}{lcll}
\mathrm{utf8}(c^{*}) & = & (\mathrm{utf8}(c))^{*} \\
\mathrm{utf8}(c) & = & b & (\begin{array}[t]{cl} \mathrm{if} & c < \mathrm{U{+}80} \\ \land & c = b) \end{array} \\
\mathrm{utf8}(c) & = & b_{1}~b_{2} & (\begin{array}[t]{cl} \mathrm{if} & \mathrm{U{+}80} \leq c < \mathrm{U{+}800} \\ \land & c = 2^{6} (b_{1} - \mathtt{0xC0}) + (b_{2} - \mathtt{0x80})) \end{array} \\
\mathrm{utf8}(c) & = & b_{1}~b_{2}~b_{3} & (\begin{array}[t]{cl} \mathrm{if} & \mathrm{U{+}800} \leq c < \mathrm{U{+}D800} \lor \mathrm{U{+}E000} \leq c < \mathrm{U{+}10000} \\ \land & c = 2^{12} (b_{1} - \mathtt{0xE0}) + 2^{6} (b_{2} - \mathtt{0x80}) + (b_{3} - \mathtt{0x80})) \end{array} \\
\mathrm{utf8}(c) & = & b_{1}~b_{2}~b_{3}~b_{4} & (\begin{array}[t]{cl} \mathrm{if} & \mathrm{U{+}10000} \leq c < \mathrm{U{+}110000} \\ \land & c = 2^{18} (b_{1} - \mathtt{0xF0}) + 2^{12} (b_{2} - \mathtt{0x80}) + 2^{6} (b_{3} - \mathtt{0x80}) + (b_{4} - \mathtt{0x80})) \end{array}
\end{array} \\
\mathrm{where}~b_{2}, b_{3}, b_{4} < \mathtt{0xC0}
\end{array}
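Read as an encoder, the case analysis above can be sketched in Python as follows. The names `utf8` and `utf8_name` are chosen to echo the auxiliary function but are otherwise hypothetical; the result is expected to agree with Python's built-in `str.encode("utf-8")` for any string without surrogate code points.

```python
def utf8(c: int) -> bytes:
    """Encode a single code point c, following the four cases of the definition."""
    if c < 0:
        raise ValueError("negative code point")
    if c < 0x80:
        return bytes([c])
    if c < 0x800:
        return bytes([0xC0 + (c >> 6), 0x80 + (c & 0x3F)])
    if c < 0xD800 or 0xE000 <= c < 0x10000:
        return bytes([0xE0 + (c >> 12),
                      0x80 + ((c >> 6) & 0x3F),
                      0x80 + (c & 0x3F)])
    if 0x10000 <= c < 0x110000:
        return bytes([0xF0 + (c >> 18),
                      0x80 + ((c >> 12) & 0x3F),
                      0x80 + ((c >> 6) & 0x3F),
                      0x80 + (c & 0x3F)])
    # Surrogates (U+D800..U+DFFF) and values >= U+110000 have no encoding.
    raise ValueError("not a Unicode scalar value")


def utf8_name(name: str) -> bytes:
    """Apply utf8 to every character of a name and concatenate the results."""
    return b"".join(utf8(ord(ch)) for ch in name)
```

Each continuation byte is built as `0x80` plus a 6-bit field, so it always stays below `0xC0`, matching the side condition on $b_{2}$, $b_{3}$, and $b_{4}$.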

Note

Unlike in some other formats, name strings are not 0-terminated.
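The following sketch shows what length-prefixed decoding of a name might look like. It assumes, from the vector encoding defined elsewhere in the specification rather than in this section, that $\mathrm{vec}(\mathrm{byte})$ is a $u32$ length followed by that many bytes. The helper names `read_u32` and `decode_name` are hypothetical, and Python's strict UTF-8 codec stands in for the $\mathrm{utf8}$ check.

```python
def read_u32(data: bytes, pos: int) -> tuple[int, int]:
    """Read an unsigned LEB128 u32; returns (value, next_pos)."""
    result = 0
    for shift in range(0, 35, 7):  # at most ceil(32/7) = 5 bytes
        if pos >= len(data):
            raise ValueError("unexpected end of input")
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte < 0x80:
            if result >= (1 << 32):
                raise ValueError("unused bits must be zero")
            return result, pos
    raise ValueError("u32 encoding too long")


def decode_name(data: bytes, pos: int = 0) -> tuple[str, int]:
    """Decode a length-prefixed name; returns (name, next_pos)."""
    length, pos = read_u32(data, pos)
    raw = data[pos:pos + length]
    if len(raw) != length:
        raise ValueError("name extends past end of input")
    # Strict decoding rejects malformed UTF-8 with UnicodeDecodeError.
    return raw.decode("utf-8"), pos + length
```

Since the length is explicit, a zero byte inside the payload is ordinary content: `decode_name(b"\x03a\x00b")` yields the three-character name `"a\x00b"`.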