Integers
All integers are encoded using the LEB128 variable-length integer encoding, in either unsigned or signed variant.
Unsigned integers are encoded in unsigned LEB128 format.
As an additional constraint, a value of type \(\href{../syntax/values.html#syntax-int}{\mathit{u}N}\) must be encoded in at most \(\mathrm{ceil}(N/7)\) bytes.
\[\begin{split}\begin{array}{llclll@{\qquad}l}
\def\mathdef992#1{{}}\mathdef992{unsigned integer} & \href{../binary/values.html#binary-int}{\def\mathdef993#1{{\mathtt{u}#1}}\mathdef993{N}} &::=&
n{:}\href{../binary/values.html#binary-byte}{\mathtt{byte}} &\Rightarrow& n & (\mathrel{\mbox{if}} n < 2^7 \wedge n < 2^N) \\ &&|&
n{:}\href{../binary/values.html#binary-byte}{\mathtt{byte}}~~m{:}\def\mathdef1033#1{{\mathtt{u}#1}}\mathdef1033{(N\mathtt{-7})} &\Rightarrow&
2^7\cdot m + (n-2^7) & (\mathrel{\mbox{if}} n \geq 2^7 \wedge N > 7) \\
\end{array}\end{split}\]
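The two productions above can be read directly as a recursive decoder. The following is a minimal sketch in Python, not a reference implementation; the function name `decode_unsigned` and its `(value, next_pos)` return convention are assumptions made for illustration.

```python
def decode_unsigned(data: bytes, n_bits: int, pos: int = 0) -> tuple:
    """Decode one uN value (N = n_bits) from data at pos.

    Returns (value, next_pos). Raises ValueError for malformed input.
    Hypothetical helper that mirrors the grammar one production at a time.
    """
    if pos >= len(data):
        raise ValueError("unexpected end of input")
    n = data[pos]
    if n < 0x80:
        # First production (terminal byte): requires n < 2^7 and n < 2^N.
        if n >= (1 << n_bits):
            raise ValueError("unused bits set in terminal byte")
        return n, pos + 1
    # Second production (continuation byte): requires n >= 2^7 and N > 7.
    if n_bits <= 7:
        raise ValueError("encoding exceeds ceil(N/7) bytes")
    m, next_pos = decode_unsigned(data, n_bits - 7, pos + 1)
    return (m << 7) + (n - 0x80), next_pos
```

Because each continuation byte reduces N by 7, the bound of ceil(N/7) bytes falls out of the recursion without a separate length counter.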
Signed integers are encoded in signed LEB128 format, which uses a two’s complement representation.
As an additional constraint, a value of type \(\href{../syntax/values.html#syntax-int}{\mathit{s}N}\) must be encoded in at most \(\mathrm{ceil}(N/7)\) bytes.
\[\begin{split}\begin{array}{llclll@{\qquad}l}
\def\mathdef992#1{{}}\mathdef992{signed integer} & \href{../binary/values.html#binary-int}{\def\mathdef999#1{{\mathtt{s}#1}}\mathdef999{N}} &::=&
n{:}\href{../binary/values.html#binary-byte}{\mathtt{byte}} &\Rightarrow& n & (\mathrel{\mbox{if}} n < 2^6 \wedge n < 2^{N-1}) \\ &&|&
n{:}\href{../binary/values.html#binary-byte}{\mathtt{byte}} &\Rightarrow& n-2^7 & (\mathrel{\mbox{if}} 2^6 \leq n < 2^7 \wedge n \geq 2^7-2^{N-1}) \\ &&|&
n{:}\href{../binary/values.html#binary-byte}{\mathtt{byte}}~~m{:}\def\mathdef1034#1{{\mathtt{s}#1}}\mathdef1034{(N\mathtt{-7})} &\Rightarrow&
2^7\cdot m + (n-2^7) & (\mathrel{\mbox{if}} n \geq 2^7 \wedge N > 7) \\
\end{array}\end{split}\]
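The three signed productions can be sketched the same way. This is an illustrative Python reading of the grammar under the assumption that input arrives as a `bytes` object; the name `decode_signed` is hypothetical.

```python
def decode_signed(data: bytes, n_bits: int, pos: int = 0) -> tuple:
    """Decode one sN value (N = n_bits) from data at pos.

    Returns (value, next_pos). Raises ValueError for malformed input.
    Hypothetical helper that mirrors the grammar one production at a time.
    """
    if pos >= len(data):
        raise ValueError("unexpected end of input")
    n = data[pos]
    half = 1 << (n_bits - 1)  # 2^(N-1)
    if n < 0x40:
        # First production: non-negative terminal byte, n < 2^6 and n < 2^(N-1).
        if n >= half:
            raise ValueError("unused bits must be 0 for a non-negative value")
        return n, pos + 1
    if n < 0x80:
        # Second production: negative terminal byte, n >= 2^7 - 2^(N-1).
        if n < 0x80 - half:
            raise ValueError("unused bits must be 1 for a negative value")
        return n - 0x80, pos + 1
    # Third production: continuation byte, requires N > 7.
    if n_bits <= 7:
        raise ValueError("encoding exceeds ceil(N/7) bytes")
    m, next_pos = decode_signed(data, n_bits - 7, pos + 1)
    return m * 128 + (n - 0x80), next_pos
```

The sign is decided only by the terminal byte: bit 6 of that byte selects between the first and second production.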
Uninterpreted integers are encoded as signed integers.
\[\begin{array}{llclll@{\qquad\qquad}l}
\def\mathdef992#1{{}}\mathdef992{uninterpreted integer} & \href{../binary/values.html#binary-int}{\def\mathdef1004#1{{\mathtt{i}#1}}\mathdef1004{N}} &::=&
n{:}\href{../binary/values.html#binary-int}{\def\mathdef999#1{{\mathtt{s}#1}}\mathdef999{N}} &\Rightarrow& i & (\mathrel{\mbox{if}} n = \href{../exec/numerics.html#aux-signed}{\mathrm{signed}}_{\href{../syntax/values.html#syntax-int}{\mathit{i}N}}(i))
\end{array}\]
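The side condition relates the decoded signed value n to the uninterpreted bit pattern i. The sketch below assumes the usual two's complement definition of the signed function, which is specified in the numerics section rather than here; both function names are hypothetical.

```python
def signed(n_bits: int, i: int) -> int:
    """Interpret the N-bit pattern i (0 <= i < 2^N) as a two's complement value.

    Assumed to match the auxiliary 'signed' function referenced above.
    """
    if not 0 <= i < (1 << n_bits):
        raise ValueError("i is not an N-bit pattern")
    return i if i < (1 << (n_bits - 1)) else i - (1 << n_bits)


def uninterpreted_from_signed(n_bits: int, n: int) -> int:
    """Return the unique i such that signed(n_bits, i) == n."""
    half = 1 << (n_bits - 1)
    if not -half <= n < half:
        raise ValueError("n is out of range for sN")
    # Python's % with a positive modulus always yields a non-negative result.
    return n % (1 << n_bits)
```

For instance, a decoded signed value of -1 at N = 32 corresponds to the bit pattern 0xFFFFFFFF.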
Note
The side conditions \(N > 7\) in the productions for non-terminal bytes of the \(\def\mathdef1035#1{{\mathit{u#1}}}\mathdef1035{}\) and \(\def\mathdef1036#1{{\mathit{s#1}}}\mathdef1036{}\) encodings restrict the encoding’s length.
However, “trailing zeros” are still allowed within these bounds.
For example, \(\def\mathdef1037#1{\mathtt{0x#1}}\mathdef1037{03}\) and \(\def\mathdef1038#1{\mathtt{0x#1}}\mathdef1038{83}~\def\mathdef1039#1{\mathtt{0x#1}}\mathdef1039{00}\) are both well-formed encodings for the value \(3\) as a \(\href{../syntax/values.html#syntax-int}{\mathit{u8}}\).
Similarly, \(\def\mathdef1040#1{\mathtt{0x#1}}\mathdef1040{7E}\), \(\def\mathdef1041#1{\mathtt{0x#1}}\mathdef1041{FE}~\def\mathdef1042#1{\mathtt{0x#1}}\mathdef1042{7F}\), and \(\def\mathdef1043#1{\mathtt{0x#1}}\mathdef1043{FE}~\def\mathdef1044#1{\mathtt{0x#1}}\mathdef1044{FF}~\def\mathdef1045#1{\mathtt{0x#1}}\mathdef1045{7F}\) are all well-formed encodings of the value \(-2\) as an \(\href{../syntax/values.html#syntax-int}{\mathit{s16}}\).
The side conditions on the value \(n\) of terminal bytes further enforce that
any unused bits in these bytes must be \(0\) for non-negative values and \(1\) for negative ones.
For example, \(\def\mathdef1046#1{\mathtt{0x#1}}\mathdef1046{83}~\def\mathdef1047#1{\mathtt{0x#1}}\mathdef1047{10}\) is malformed as a \(\href{../syntax/values.html#syntax-int}{\mathit{u8}}\) encoding.
Similarly, both \(\def\mathdef1048#1{\mathtt{0x#1}}\mathdef1048{83}~\def\mathdef1049#1{\mathtt{0x#1}}\mathdef1049{3E}\) and \(\def\mathdef1050#1{\mathtt{0x#1}}\mathdef1050{FF}~\def\mathdef1051#1{\mathtt{0x#1}}\mathdef1051{7B}\) are malformed as \(\href{../syntax/values.html#syntax-int}{\mathit{s8}}\) encodings.
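As a worked illustration of how the productions apply (an editorial sketch, not part of the grammar), the two-byte encoding \(\mathtt{0xFE}~\mathtt{0x7F}\) read as an \(\mathit{s16}\) unfolds as follows, with the second byte read as an \(\mathit{s9}\):
\[\begin{array}{lcll}
\mathtt{0x7F} : \mathtt{s}9 &\Rightarrow& 127 - 2^7 = -1 & (\mathrel{\mbox{since}} 2^6 \leq 127 < 2^7 \wedge 127 \geq 2^7 - 2^8) \\
\mathtt{0xFE}~\mathtt{0x7F} : \mathtt{s}16 &\Rightarrow& 2^7\cdot(-1) + (254 - 2^7) = -2 & (\mathrel{\mbox{since}} 254 \geq 2^7 \wedge 16 > 7) \\
\end{array}\]
By contrast, for \(\mathtt{0x83}~\mathtt{0x10}\) read as a \(\mathit{u8}\), the first byte \(131 \geq 2^7\) leaves the second byte to be read as a \(\mathit{u1}\). There, \(n = 16\) fails the terminal condition \(n < 2^1\) and also fails the continuation condition \(n \geq 2^7\), so no production applies.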
Names
Names are encoded as a vector of bytes containing the UTF-8 encoding (Unicode, Section 3.9) of the name’s character sequence.
\[\begin{split}\begin{array}{llclllll}
\def\mathdef992#1{{}}\mathdef992{name} & \href{../binary/values.html#binary-name}{\mathtt{name}} &::=&
b^\ast{:}\href{../binary/conventions.html#binary-vec}{\mathtt{vec}}(\href{../binary/values.html#binary-byte}{\mathtt{byte}}) &\Rightarrow& \href{../syntax/values.html#syntax-name}{\mathit{name}}
&& (\mathrel{\mbox{if}} \href{../binary/values.html#binary-utf8}{\mathrm{utf8}}(\href{../syntax/values.html#syntax-name}{\mathit{name}}) = b^\ast) \\
\end{array}\end{split}\]
The auxiliary \(\href{../binary/values.html#binary-utf8}{\mathrm{utf8}}\) function expressing this encoding is defined as follows:
\[\begin{split}\begin{array}{@{}l@{}}
\begin{array}{@{}lcl@{\qquad}l@{}}
\href{../binary/values.html#binary-utf8}{\mathrm{utf8}}(c^\ast) &=& (\href{../binary/values.html#binary-utf8}{\mathrm{utf8}}(c))^\ast \\[1ex]
\href{../binary/values.html#binary-utf8}{\mathrm{utf8}}(c) &=& b &
(\begin{array}[t]{@{}c@{~}l@{}}
\mathrel{\mbox{if}} & c < \def\mathdef1052#1{\mathrm{U{+}#1}}\mathdef1052{80} \\
\wedge & c = b) \\
\end{array} \\
\href{../binary/values.html#binary-utf8}{\mathrm{utf8}}(c) &=& b_1~b_2 &
(\begin{array}[t]{@{}c@{~}l@{}}
\mathrel{\mbox{if}} & \def\mathdef1053#1{\mathrm{U{+}#1}}\mathdef1053{80} \leq c < \def\mathdef1054#1{\mathrm{U{+}#1}}\mathdef1054{800} \\
\wedge & c = 2^6(b_1-\def\mathdef1055#1{\mathtt{0x#1}}\mathdef1055{C0})+(b_2-\def\mathdef1056#1{\mathtt{0x#1}}\mathdef1056{80})) \\
\end{array} \\
\href{../binary/values.html#binary-utf8}{\mathrm{utf8}}(c) &=& b_1~b_2~b_3 &
(\begin{array}[t]{@{}c@{~}l@{}}
\mathrel{\mbox{if}} & \def\mathdef1057#1{\mathrm{U{+}#1}}\mathdef1057{800} \leq c < \def\mathdef1058#1{\mathrm{U{+}#1}}\mathdef1058{D800} \vee \def\mathdef1059#1{\mathrm{U{+}#1}}\mathdef1059{E000} \leq c < \def\mathdef1060#1{\mathrm{U{+}#1}}\mathdef1060{10000} \\
\wedge & c = 2^{12}(b_1-\def\mathdef1061#1{\mathtt{0x#1}}\mathdef1061{E0})+2^6(b_2-\def\mathdef1062#1{\mathtt{0x#1}}\mathdef1062{80})+(b_3-\def\mathdef1063#1{\mathtt{0x#1}}\mathdef1063{80})) \\
\end{array} \\
\href{../binary/values.html#binary-utf8}{\mathrm{utf8}}(c) &=& b_1~b_2~b_3~b_4 &
(\begin{array}[t]{@{}c@{~}l@{}}
\mathrel{\mbox{if}} & \def\mathdef1064#1{\mathrm{U{+}#1}}\mathdef1064{10000} \leq c < \def\mathdef1065#1{\mathrm{U{+}#1}}\mathdef1065{110000} \\
\wedge & c = 2^{18}(b_1-\def\mathdef1066#1{\mathtt{0x#1}}\mathdef1066{F0})+2^{12}(b_2-\def\mathdef1067#1{\mathtt{0x#1}}\mathdef1067{80})+2^6(b_3-\def\mathdef1068#1{\mathtt{0x#1}}\mathdef1068{80})+(b_4-\def\mathdef1069#1{\mathtt{0x#1}}\mathdef1069{80})) \\
\end{array} \\
\end{array} \\
\mathrel{\mbox{where}} b_2, b_3, b_4 < \def\mathdef1070#1{\mathtt{0x#1}}\mathdef1070{C0} \\
\end{array}\end{split}\]
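The four cases above can be transcribed into a small encoder. The following Python sketch treats a character as an integer code point; the names `utf8` and `utf8_name` are chosen here to echo the auxiliary function and are otherwise hypothetical.

```python
def utf8(c: int) -> bytes:
    """Encode one code point c following the four cases of the definition."""
    if c < 0:
        raise ValueError("negative code point")
    if c < 0x80:
        return bytes([c])
    if c < 0x800:
        return bytes([0xC0 + (c >> 6), 0x80 + (c & 0x3F)])
    if c < 0xD800 or 0xE000 <= c < 0x10000:
        return bytes([0xE0 + (c >> 12),
                      0x80 + ((c >> 6) & 0x3F),
                      0x80 + (c & 0x3F)])
    if 0x10000 <= c < 0x110000:
        return bytes([0xF0 + (c >> 18),
                      0x80 + ((c >> 12) & 0x3F),
                      0x80 + ((c >> 6) & 0x3F),
                      0x80 + (c & 0x3F)])
    # Surrogates (U+D800 to U+DFFF) and values >= U+110000 have no encoding.
    raise ValueError("not a Unicode scalar value")


def utf8_name(name: str) -> bytes:
    """Encode a whole name by concatenating the per-character encodings."""
    return b"".join(utf8(ord(ch)) for ch in name)
```

Masking each continuation byte to six bits keeps b_2, b_3, and b_4 below 0xC0, as the side condition requires.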
Note
Unlike in some other formats, name strings are not 0-terminated.
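To illustrate why no terminator is needed, the sketch below reads a name under the assumption that a byte vector is a u32 length followed by that many bytes (the vector encoding is defined in the conventions section, not here). The helper names `read_u32` and `read_name` are hypothetical, and Python's built-in strict UTF-8 decoder stands in for the utf8 side condition.

```python
def read_u32(data: bytes, pos: int) -> tuple:
    """Read an unsigned LEB128 u32; returns (value, next_pos)."""
    result = 0
    for i in range(5):  # ceil(32/7) = 5 bytes at most
        if pos >= len(data):
            raise ValueError("unexpected end of input")
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << (7 * i)
        if byte < 0x80:
            if result >= (1 << 32):
                raise ValueError("unused bits set in terminal byte")
            return result, pos
    raise ValueError("u32 encoding too long")


def read_name(data: bytes, pos: int = 0) -> tuple:
    """Read a length-prefixed UTF-8 name; returns (name, next_pos)."""
    length, pos = read_u32(data, pos)
    end = pos + length
    if end > len(data):
        raise ValueError("name extends past end of input")
    try:
        # Strict decoding rejects surrogates and overlong forms.
        text = data[pos:end].decode("utf-8")
    except UnicodeDecodeError as exc:
        raise ValueError("malformed UTF-8 in name") from exc
    return text, end
```

Since the length prefix delimits the name, a zero byte inside it is ordinary content: `read_name(b"\x03a\x00b")` yields a three-character name containing U+0000.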