# Numerics¶

Numeric primitives are defined in a generic manner, by operators indexed over a bit width $$N$$.

Some operators are non-deterministic, because they can return one of several possible results (such as different NaN values). Technically, each operator thus returns a set of allowed values. For convenience, deterministic results are expressed as plain values, which are assumed to be identified with a respective singleton set.

Some operators are partial, because they are not defined on certain inputs. Technically, an empty set of results is returned for these inputs.

In formal notation, each operator is defined by equational clauses that apply in decreasing order of precedence. That is, the first clause that is applicable to the given arguments defines the result. In some cases, similar clauses are combined into one by using the notation $$\pm$$ or $$\mp$$. When several of these placeholders occur in a single clause, then they must be resolved consistently: either the upper sign is chosen for all of them or the lower sign.

Note

For example, the $$\href{../exec/numerics.html#op-fcopysign}{\mathrm{fcopysign}}$$ operator is defined as follows:

$\begin{split}\begin{array}{@{}lcll} \href{../exec/numerics.html#op-fcopysign}{\mathrm{fcopysign}}_N(\pm p_1, \pm p_2) &=& \pm p_1 \\ \href{../exec/numerics.html#op-fcopysign}{\mathrm{fcopysign}}_N(\pm p_1, \mp p_2) &=& \mp p_1 \\ \end{array}\end{split}$

This definition is to be read as a shorthand for the following expansion of each clause into two separate ones:

$\begin{split}\begin{array}{@{}lcll} \href{../exec/numerics.html#op-fcopysign}{\mathrm{fcopysign}}_N(+ p_1, + p_2) &=& + p_1 \\ \href{../exec/numerics.html#op-fcopysign}{\mathrm{fcopysign}}_N(- p_1, - p_2) &=& - p_1 \\ \href{../exec/numerics.html#op-fcopysign}{\mathrm{fcopysign}}_N(+ p_1, - p_2) &=& - p_1 \\ \href{../exec/numerics.html#op-fcopysign}{\mathrm{fcopysign}}_N(- p_1, + p_2) &=& + p_1 \\ \end{array}\end{split}$

Numeric operators are lifted to input sequences by applying the operator element-wise, returning a sequence of results. When there are multiple inputs, they must be of equal length.

$\begin{array}{lll@{\qquad}l} op(c_1^n, \dots, c_k^n) &=& op(c_1^n, \dots, c_k^n)~\dots~op(c_1^n[n-1], \dots, c_k^n[n-1]) \end{array}$

Note

For example, the unary operator $$\href{../exec/numerics.html#op-fabs}{\mathrm{fabs}}$$, when given a sequence of floating-point values, return a sequence of floating-point results:

$\begin{array}{lll@{\qquad}l} \href{../exec/numerics.html#op-fabs}{\mathrm{fabs}}_N(z^n) &=& \href{../exec/numerics.html#op-fabs}{\mathrm{fabs}}_N(z)~\dots~\href{../exec/numerics.html#op-fabs}{\mathrm{fabs}}_N(z[n]) \end{array}$

The binary operator $$\href{../exec/numerics.html#op-iadd}{\mathrm{iadd}}$$, when given two sequences of integers of the same length, $$n$$, return a sequence of integer results:

$\begin{array}{lll@{\qquad}l} \href{../exec/numerics.html#op-iadd}{\mathrm{iadd}}_N(i_1^n, i_2^n) &=& \href{../exec/numerics.html#op-iadd}{\mathrm{iadd}}_N(i_1, i_2)~\dots~\href{../exec/numerics.html#op-iadd}{\mathrm{iadd}}_N(i_1[n], i_2[n]) \end{array}$

Conventions:

• The meta variable $$d$$ is used to range over single bits.

• The meta variable $$p$$ is used to range over (signless) magnitudes of floating-point values, including $$\href{../syntax/values.html#syntax-float}{\mathsf{nan}}$$ and $$\infty$$.

• The meta variable $$q$$ is used to range over (signless) rational magnitudes, excluding $$\href{../syntax/values.html#syntax-float}{\mathsf{nan}}$$ or $$\infty$$.

• The notation $$f^{-1}$$ denotes the inverse of a bijective function $$f$$.

• Truncation of rational values is written $$\href{../exec/numerics.html#aux-trunc}{\mathrm{trunc}}(\pm q)$$, with the usual mathematical definition:

$\begin{split}\begin{array}{lll@{\qquad}l} \href{../exec/numerics.html#aux-trunc}{\mathrm{trunc}}(\pm q) &=& \pm i & (\mathrel{\mbox{if}} i \in \mathbb{N} \wedge +q - 1 < i \leq +q) \\ \end{array}\end{split}$
• Saturation of integers is written $$\href{../exec/numerics.html#aux-sat-u}{\mathrm{sat}^{\mathsf{u}}}_N(i)$$ and $$\href{../exec/numerics.html#aux-sat-s}{\mathrm{sat}^{\mathsf{s}}}_N(i)$$. The arguments to these two functions range over arbitrary signed integers.

• Unsigned saturation, $$\href{../exec/numerics.html#aux-sat-u}{\mathrm{sat}^{\mathsf{u}}}_N(i)$$ clamps $$i$$ to between $$0$$ and $$2^N-1$$:

$\begin{split}\begin{array}{lll@{\qquad}l} \href{../exec/numerics.html#aux-sat-u}{\mathrm{sat}^{\mathsf{u}}}_N(i) &=& 2^N-1 & (\mathrel{\mbox{if}} i > 2^N-1)\\ \href{../exec/numerics.html#aux-sat-u}{\mathrm{sat}^{\mathsf{u}}}_N(i) &=& 0 & (\mathrel{\mbox{if}} i < 0) \\ \href{../exec/numerics.html#aux-sat-u}{\mathrm{sat}^{\mathsf{u}}}_N(i) &=& i & (\mathrel{\mbox{otherwise}}) \\ \end{array}\end{split}$
• Signed saturation, $$\href{../exec/numerics.html#aux-sat-s}{\mathrm{sat}^{\mathsf{s}}}_N(i)$$ clamps $$i$$ to between $$-2^{N-1}$$ and $$2^{N-1}-1$$:

$\begin{split}\begin{array}{lll@{\qquad}l} \href{../exec/numerics.html#aux-sat-s}{\mathrm{sat}^{\mathsf{s}}}_N(i) &=& \href{../exec/numerics.html#aux-signed}{\mathrm{signed}}_N^{-1}(-2^{N-1}) & (\mathrel{\mbox{if}} i < -2^{N-1})\\ \href{../exec/numerics.html#aux-sat-s}{\mathrm{sat}^{\mathsf{s}}}_N(i) &=& \href{../exec/numerics.html#aux-signed}{\mathrm{signed}}_N^{-1}(2^{N-1}-1) & (\mathrel{\mbox{if}} i > 2^{N-1}-1)\\ \href{../exec/numerics.html#aux-sat-s}{\mathrm{sat}^{\mathsf{s}}}_N(i) &=& i & (\mathrel{\mbox{otherwise}}) \end{array}\end{split}$

## Representations¶

Numbers have an underlying binary representation as a sequence of bits:

$\begin{split}\begin{array}{lll@{\qquad}l} \href{../exec/numerics.html#aux-bits}{\mathrm{bits}}_{\mathsf{i}N}(i) &=& \href{../exec/numerics.html#aux-ibits}{\mathrm{ibits}}_N(i) \\ \href{../exec/numerics.html#aux-bits}{\mathrm{bits}}_{\mathsf{f}N}(z) &=& \href{../exec/numerics.html#aux-fbits}{\mathrm{fbits}}_N(z) \\ \end{array}\end{split}$

Each of these functions is a bijection, hence they are invertible.

### Integers¶

Integers are represented as base two unsigned numbers:

$\begin{split}\begin{array}{lll@{\qquad}l} \href{../exec/numerics.html#aux-ibits}{\mathrm{ibits}}_N(i) &=& d_{N-1}~\dots~d_0 & (i = 2^{N-1}\cdot d_{N-1} + \dots + 2^0\cdot d_0) \\ \end{array}\end{split}$

Boolean operators like $$\wedge$$, $$\vee$$, or $$\veebar$$ are lifted to bit sequences of equal length by applying them pointwise.

### Floating-Point¶

Floating-point values are represented in the respective binary format defined by IEEE 754-2019 (Section 3.4):

$\begin{split}\begin{array}{lll@{\qquad}l} \href{../exec/numerics.html#aux-fbits}{\mathrm{fbits}}_N(\pm (1+m\cdot 2^{-M})\cdot 2^e) &=& \href{../exec/numerics.html#aux-fsign}{\mathrm{fsign}}({\pm})~\href{../exec/numerics.html#aux-ibits}{\mathrm{ibits}}_E(e+\href{../exec/numerics.html#aux-fbias}{\mathrm{fbias}}_N)~\href{../exec/numerics.html#aux-ibits}{\mathrm{ibits}}_M(m) \\ \href{../exec/numerics.html#aux-fbits}{\mathrm{fbits}}_N(\pm (0+m\cdot 2^{-M})\cdot 2^e) &=& \href{../exec/numerics.html#aux-fsign}{\mathrm{fsign}}({\pm})~(0)^E~\href{../exec/numerics.html#aux-ibits}{\mathrm{ibits}}_M(m) \\ \href{../exec/numerics.html#aux-fbits}{\mathrm{fbits}}_N(\pm \infty) &=& \href{../exec/numerics.html#aux-fsign}{\mathrm{fsign}}({\pm})~(1)^E~(0)^M \\ \href{../exec/numerics.html#aux-fbits}{\mathrm{fbits}}_N(\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n)) &=& \href{../exec/numerics.html#aux-fsign}{\mathrm{fsign}}({\pm})~(1)^E~\href{../exec/numerics.html#aux-ibits}{\mathrm{ibits}}_M(n) \\[1ex] \href{../exec/numerics.html#aux-fbias}{\mathrm{fbias}}_N &=& 2^{E-1}-1 \\ \href{../exec/numerics.html#aux-fsign}{\mathrm{fsign}}({+}) &=& 0 \\ \href{../exec/numerics.html#aux-fsign}{\mathrm{fsign}}({-}) &=& 1 \\ \end{array}\end{split}$

where $$M = \href{../syntax/values.html#aux-significand}{\mathrm{signif}}(N)$$ and $$E = \href{../syntax/values.html#aux-exponent}{\mathrm{expon}}(N)$$.

### Storage¶

When a number is stored into memory, it is converted into a sequence of bytes in little endian byte order:

$\begin{split}\begin{array}{lll@{\qquad}l} \href{../exec/numerics.html#aux-bytes}{\mathrm{bytes}}_t(i) &=& \href{../exec/numerics.html#aux-littleendian}{\mathrm{littleendian}}(\href{../exec/numerics.html#aux-bits}{\mathrm{bits}}_t(i)) \\[1ex] \href{../exec/numerics.html#aux-littleendian}{\mathrm{littleendian}}(\epsilon) &=& \epsilon \\ \href{../exec/numerics.html#aux-littleendian}{\mathrm{littleendian}}(d^8~{d'}^\ast~) &=& \href{../exec/numerics.html#aux-littleendian}{\mathrm{littleendian}}({d'}^\ast)~\href{../exec/numerics.html#aux-ibits}{\mathrm{ibits}}_8^{-1}(d^8) \\ \end{array}\end{split}$

Again these functions are invertible bijections.

### Vectors¶

Numeric vectors have the same underlying representation as an $$\href{../syntax/values.html#syntax-int}{\mathit{i128}}$$. They can also be interpreted as a sequence of numeric values packed into a $$\href{../syntax/types.html#syntax-valtype}{\mathsf{v128}}$$ with a particular $$\href{../syntax/instructions.html#syntax-shape}{\mathit{shape}}$$.

$\begin{split}\begin{array}{l} \begin{array}{lll@{\qquad}l} \href{../exec/numerics.html#aux-lanes}{\mathrm{lanes}}_{t\mathsf{x}N}(c) &=& c_0~\dots~c_{N-1} \\ \end{array} \\ \qquad \begin{array}[t]{@{}r@{~}l@{}} (\mathrel{\mbox{where}} & B = |t| / 8 \\ \wedge & b^{16} = \href{../exec/numerics.html#aux-bytes}{\mathrm{bytes}}_{\href{../syntax/values.html#syntax-int}{\mathit{i128}}}(c) \\ \wedge & c_i = \href{../exec/numerics.html#aux-bytes}{\mathrm{bytes}}_{t}^{-1}(b^{16}[i \cdot B \href{../syntax/conventions.html#notation-slice}{\mathrel{\mathbf{:}}} B])) \end{array} \end{array}\end{split}$

These functions are bijections, so they are invertible.

## Integer Operations¶

Integer operators are defined on $$\href{../syntax/values.html#syntax-int}{\mathit{i}N}$$ values. Operators that use a signed interpretation convert the value using the following definition, which takes the two’s complement when the value lies in the upper half of the value range (i.e., its most significant bit is $$1$$):

$\begin{split}\begin{array}{lll@{\qquad}l} \href{../exec/numerics.html#aux-signed}{\mathrm{signed}}_N(i) &=& i & (0 \leq i < 2^{N-1}) \\ \href{../exec/numerics.html#aux-signed}{\mathrm{signed}}_N(i) &=& i - 2^N & (2^{N-1} \leq i < 2^N) \\ \end{array}\end{split}$

This function is bijective, and hence invertible.

### Boolean Interpretation¶

The integer result of predicates – i.e., tests and relational operators – is defined with the help of the following auxiliary function producing the value $$1$$ or $$0$$ depending on a condition.

$\begin{split}\begin{array}{lll@{\qquad}l} \href{../exec/numerics.html#aux-bool}{\mathrm{bool}}(C) &=& 1 & (\mathrel{\mbox{if}} C) \\ \href{../exec/numerics.html#aux-bool}{\mathrm{bool}}(C) &=& 0 & (\mathrel{\mbox{otherwise}}) \\ \end{array}\end{split}$

### $$\href{../exec/numerics.html#op-iadd}{\mathrm{iadd}}_N(i_1, i_2)$$¶

• Return the result of adding $$i_1$$ and $$i_2$$ modulo $$2^N$$.

$\begin{array}{@{}lcll} \href{../exec/numerics.html#op-iadd}{\mathrm{iadd}}_N(i_1, i_2) &=& (i_1 + i_2) \mathbin{\mathrm{mod}} 2^N \end{array}$

### $$\href{../exec/numerics.html#op-isub}{\mathrm{isub}}_N(i_1, i_2)$$¶

• Return the result of subtracting $$i_2$$ from $$i_1$$ modulo $$2^N$$.

$\begin{array}{@{}lcll} \href{../exec/numerics.html#op-isub}{\mathrm{isub}}_N(i_1, i_2) &=& (i_1 - i_2 + 2^N) \mathbin{\mathrm{mod}} 2^N \end{array}$

### $$\href{../exec/numerics.html#op-imul}{\mathrm{imul}}_N(i_1, i_2)$$¶

• Return the result of multiplying $$i_1$$ and $$i_2$$ modulo $$2^N$$.

$\begin{array}{@{}lcll} \href{../exec/numerics.html#op-imul}{\mathrm{imul}}_N(i_1, i_2) &=& (i_1 \cdot i_2) \mathbin{\mathrm{mod}} 2^N \end{array}$

### $$\href{../exec/numerics.html#op-idiv-u}{\mathrm{idiv\_u}}_N(i_1, i_2)$$¶

• If $$i_2$$ is $$0$$, then the result is undefined.

• Else, return the result of dividing $$i_1$$ by $$i_2$$, truncated toward zero.

$\begin{split}\begin{array}{@{}lcll} \href{../exec/numerics.html#op-idiv-u}{\mathrm{idiv\_u}}_N(i_1, 0) &=& \{\} \\ \href{../exec/numerics.html#op-idiv-u}{\mathrm{idiv\_u}}_N(i_1, i_2) &=& \href{../exec/numerics.html#aux-trunc}{\mathrm{trunc}}(i_1 / i_2) \\ \end{array}\end{split}$

Note

This operator is partial.

### $$\href{../exec/numerics.html#op-idiv-s}{\mathrm{idiv\_s}}_N(i_1, i_2)$$¶

• Let $$j_1$$ be the signed interpretation of $$i_1$$.

• Let $$j_2$$ be the signed interpretation of $$i_2$$.

• If $$j_2$$ is $$0$$, then the result is undefined.

• Else if $$j_1$$ divided by $$j_2$$ is $$2^{N-1}$$, then the result is undefined.

• Else, return the result of dividing $$j_1$$ by $$j_2$$, truncated toward zero.

$\begin{split}\begin{array}{@{}lcll} \href{../exec/numerics.html#op-idiv-s}{\mathrm{idiv\_s}}_N(i_1, 0) &=& \{\} \\ \href{../exec/numerics.html#op-idiv-s}{\mathrm{idiv\_s}}_N(i_1, i_2) &=& \{\} \qquad\qquad (\mathrel{\mbox{if}} \href{../exec/numerics.html#aux-signed}{\mathrm{signed}}_N(i_1) / \href{../exec/numerics.html#aux-signed}{\mathrm{signed}}_N(i_2) = 2^{N-1}) \\ \href{../exec/numerics.html#op-idiv-s}{\mathrm{idiv\_s}}_N(i_1, i_2) &=& \href{../exec/numerics.html#aux-signed}{\mathrm{signed}}_N^{-1}(\href{../exec/numerics.html#aux-trunc}{\mathrm{trunc}}(\href{../exec/numerics.html#aux-signed}{\mathrm{signed}}_N(i_1) / \href{../exec/numerics.html#aux-signed}{\mathrm{signed}}_N(i_2))) \\ \end{array}\end{split}$

Note

This operator is partial. Besides division by $$0$$, the result of $$(-2^{N-1})/(-1) = +2^{N-1}$$ is not representable as an $$N$$-bit signed integer.

### $$\href{../exec/numerics.html#op-irem-u}{\mathrm{irem\_u}}_N(i_1, i_2)$$¶

• If $$i_2$$ is $$0$$, then the result is undefined.

• Else, return the remainder of dividing $$i_1$$ by $$i_2$$.

$\begin{split}\begin{array}{@{}lcll} \href{../exec/numerics.html#op-irem-u}{\mathrm{irem\_u}}_N(i_1, 0) &=& \{\} \\ \href{../exec/numerics.html#op-irem-u}{\mathrm{irem\_u}}_N(i_1, i_2) &=& i_1 - i_2 \cdot \href{../exec/numerics.html#aux-trunc}{\mathrm{trunc}}(i_1 / i_2) \\ \end{array}\end{split}$

Note

This operator is partial.

As long as both operators are defined, it holds that $$i_1 = i_2\cdot\href{../exec/numerics.html#op-idiv-u}{\mathrm{idiv\_u}}(i_1, i_2) + \href{../exec/numerics.html#op-irem-u}{\mathrm{irem\_u}}(i_1, i_2)$$.

### $$\href{../exec/numerics.html#op-irem-s}{\mathrm{irem\_s}}_N(i_1, i_2)$$¶

• Let $$j_1$$ be the signed interpretation of $$i_1$$.

• Let $$j_2$$ be the signed interpretation of $$i_2$$.

• If $$i_2$$ is $$0$$, then the result is undefined.

• Else, return the remainder of dividing $$j_1$$ by $$j_2$$, with the sign of the dividend $$j_1$$.

$\begin{split}\begin{array}{@{}lcll} \href{../exec/numerics.html#op-irem-s}{\mathrm{irem\_s}}_N(i_1, 0) &=& \{\} \\ \href{../exec/numerics.html#op-irem-s}{\mathrm{irem\_s}}_N(i_1, i_2) &=& \href{../exec/numerics.html#aux-signed}{\mathrm{signed}}_N^{-1}(j_1 - j_2 \cdot \href{../exec/numerics.html#aux-trunc}{\mathrm{trunc}}(j_1 / j_2)) \\ && (\mathrel{\mbox{where}} j_1 = \href{../exec/numerics.html#aux-signed}{\mathrm{signed}}_N(i_1) \wedge j_2 = \href{../exec/numerics.html#aux-signed}{\mathrm{signed}}_N(i_2)) \\ \end{array}\end{split}$

Note

This operator is partial.

As long as both operators are defined, it holds that $$i_1 = i_2\cdot\href{../exec/numerics.html#op-idiv-s}{\mathrm{idiv\_s}}(i_1, i_2) + \href{../exec/numerics.html#op-irem-s}{\mathrm{irem\_s}}(i_1, i_2)$$.

### $$\href{../exec/numerics.html#op-inot}{\mathrm{inot}}_N(i)$$¶

• Return the bitwise negation of $$i$$.

$\begin{array}{@{}lcll} \href{../exec/numerics.html#op-inot}{\mathrm{inot}}_N(i) &=& \href{../exec/numerics.html#aux-ibits}{\mathrm{ibits}}_N^{-1}(\href{../exec/numerics.html#aux-ibits}{\mathrm{ibits}}_N(i) \veebar \href{../exec/numerics.html#aux-ibits}{\mathrm{ibits}}_N(2^N-1)) \end{array}$

### $$\href{../exec/numerics.html#op-iand}{\mathrm{iand}}_N(i_1, i_2)$$¶

• Return the bitwise conjunction of $$i_1$$ and $$i_2$$.

$\begin{array}{@{}lcll} \href{../exec/numerics.html#op-iand}{\mathrm{iand}}_N(i_1, i_2) &=& \href{../exec/numerics.html#aux-ibits}{\mathrm{ibits}}_N^{-1}(\href{../exec/numerics.html#aux-ibits}{\mathrm{ibits}}_N(i_1) \wedge \href{../exec/numerics.html#aux-ibits}{\mathrm{ibits}}_N(i_2)) \end{array}$

### $$\href{../exec/numerics.html#op-iandnot}{\mathrm{iandnot}}_N(i_1, i_2)$$¶

• Return the bitwise conjunction of $$i_1$$ and the bitwise negation of $$i_2$$.

$\begin{array}{@{}lcll} \href{../exec/numerics.html#op-iandnot}{\mathrm{iandnot}}_N(i_1, i_2) &=& \href{../exec/numerics.html#op-iand}{\mathrm{iand}}_N(i_1, \href{../exec/numerics.html#op-inot}{\mathrm{inot}}_N(i_2)) \end{array}$

### $$\href{../exec/numerics.html#op-ior}{\mathrm{ior}}_N(i_1, i_2)$$¶

• Return the bitwise disjunction of $$i_1$$ and $$i_2$$.

$\begin{array}{@{}lcll} \href{../exec/numerics.html#op-ior}{\mathrm{ior}}_N(i_1, i_2) &=& \href{../exec/numerics.html#aux-ibits}{\mathrm{ibits}}_N^{-1}(\href{../exec/numerics.html#aux-ibits}{\mathrm{ibits}}_N(i_1) \vee \href{../exec/numerics.html#aux-ibits}{\mathrm{ibits}}_N(i_2)) \end{array}$

### $$\href{../exec/numerics.html#op-ixor}{\mathrm{ixor}}_N(i_1, i_2)$$¶

• Return the bitwise exclusive disjunction of $$i_1$$ and $$i_2$$.

$\begin{array}{@{}lcll} \href{../exec/numerics.html#op-ixor}{\mathrm{ixor}}_N(i_1, i_2) &=& \href{../exec/numerics.html#aux-ibits}{\mathrm{ibits}}_N^{-1}(\href{../exec/numerics.html#aux-ibits}{\mathrm{ibits}}_N(i_1) \veebar \href{../exec/numerics.html#aux-ibits}{\mathrm{ibits}}_N(i_2)) \end{array}$

### $$\href{../exec/numerics.html#op-ishl}{\mathrm{ishl}}_N(i_1, i_2)$$¶

• Let $$k$$ be $$i_2$$ modulo $$N$$.

• Return the result of shifting $$i_1$$ left by $$k$$ bits, modulo $$2^N$$.

$\begin{array}{@{}lcll} \href{../exec/numerics.html#op-ishl}{\mathrm{ishl}}_N(i_1, i_2) &=& \href{../exec/numerics.html#aux-ibits}{\mathrm{ibits}}_N^{-1}(d_2^{N-k}~0^k) & (\mathrel{\mbox{if}} \href{../exec/numerics.html#aux-ibits}{\mathrm{ibits}}_N(i_1) = d_1^k~d_2^{N-k} \wedge k = i_2 \mathbin{\mathrm{mod}} N) \end{array}$

### $$\href{../exec/numerics.html#op-ishr-u}{\mathrm{ishr\_u}}_N(i_1, i_2)$$¶

• Let $$k$$ be $$i_2$$ modulo $$N$$.

• Return the result of shifting $$i_1$$ right by $$k$$ bits, extended with $$0$$ bits.

$\begin{array}{@{}lcll} \href{../exec/numerics.html#op-ishr-u}{\mathrm{ishr\_u}}_N(i_1, i_2) &=& \href{../exec/numerics.html#aux-ibits}{\mathrm{ibits}}_N^{-1}(0^k~d_1^{N-k}) & (\mathrel{\mbox{if}} \href{../exec/numerics.html#aux-ibits}{\mathrm{ibits}}_N(i_1) = d_1^{N-k}~d_2^k \wedge k = i_2 \mathbin{\mathrm{mod}} N) \end{array}$

### $$\href{../exec/numerics.html#op-ishr-s}{\mathrm{ishr\_s}}_N(i_1, i_2)$$¶

• Let $$k$$ be $$i_2$$ modulo $$N$$.

• Return the result of shifting $$i_1$$ right by $$k$$ bits, extended with the most significant bit of the original value.

$\begin{array}{@{}lcll} \href{../exec/numerics.html#op-ishr-s}{\mathrm{ishr\_s}}_N(i_1, i_2) &=& \href{../exec/numerics.html#aux-ibits}{\mathrm{ibits}}_N^{-1}(d_0^{k+1}~d_1^{N-k-1}) & (\mathrel{\mbox{if}} \href{../exec/numerics.html#aux-ibits}{\mathrm{ibits}}_N(i_1) = d_0~d_1^{N-k-1}~d_2^k \wedge k = i_2 \mathbin{\mathrm{mod}} N) \end{array}$

### $$\href{../exec/numerics.html#op-irotl}{\mathrm{irotl}}_N(i_1, i_2)$$¶

• Let $$k$$ be $$i_2$$ modulo $$N$$.

• Return the result of rotating $$i_1$$ left by $$k$$ bits.

$\begin{array}{@{}lcll} \href{../exec/numerics.html#op-irotl}{\mathrm{irotl}}_N(i_1, i_2) &=& \href{../exec/numerics.html#aux-ibits}{\mathrm{ibits}}_N^{-1}(d_2^{N-k}~d_1^k) & (\mathrel{\mbox{if}} \href{../exec/numerics.html#aux-ibits}{\mathrm{ibits}}_N(i_1) = d_1^k~d_2^{N-k} \wedge k = i_2 \mathbin{\mathrm{mod}} N) \end{array}$

### $$\href{../exec/numerics.html#op-irotr}{\mathrm{irotr}}_N(i_1, i_2)$$¶

• Let $$k$$ be $$i_2$$ modulo $$N$$.

• Return the result of rotating $$i_1$$ right by $$k$$ bits.

$\begin{array}{@{}lcll} \href{../exec/numerics.html#op-irotr}{\mathrm{irotr}}_N(i_1, i_2) &=& \href{../exec/numerics.html#aux-ibits}{\mathrm{ibits}}_N^{-1}(d_2^k~d_1^{N-k}) & (\mathrel{\mbox{if}} \href{../exec/numerics.html#aux-ibits}{\mathrm{ibits}}_N(i_1) = d_1^{N-k}~d_2^k \wedge k = i_2 \mathbin{\mathrm{mod}} N) \end{array}$

### $$\href{../exec/numerics.html#op-iclz}{\mathrm{iclz}}_N(i)$$¶

• Return the count of leading zero bits in $$i$$; all bits are considered leading zeros if $$i$$ is $$0$$.

$\begin{array}{@{}lcll} \href{../exec/numerics.html#op-iclz}{\mathrm{iclz}}_N(i) &=& k & (\mathrel{\mbox{if}} \href{../exec/numerics.html#aux-ibits}{\mathrm{ibits}}_N(i) = 0^k~(1~d^\ast)^?) \end{array}$

### $$\href{../exec/numerics.html#op-ictz}{\mathrm{ictz}}_N(i)$$¶

• Return the count of trailing zero bits in $$i$$; all bits are considered trailing zeros if $$i$$ is $$0$$.

$\begin{array}{@{}lcll} \href{../exec/numerics.html#op-ictz}{\mathrm{ictz}}_N(i) &=& k & (\mathrel{\mbox{if}} \href{../exec/numerics.html#aux-ibits}{\mathrm{ibits}}_N(i) = (d^\ast~1)^?~0^k) \end{array}$

### $$\href{../exec/numerics.html#op-ipopcnt}{\mathrm{ipopcnt}}_N(i)$$¶

• Return the count of non-zero bits in $$i$$.

$\begin{array}{@{}lcll} \href{../exec/numerics.html#op-ipopcnt}{\mathrm{ipopcnt}}_N(i) &=& k & (\mathrel{\mbox{if}} \href{../exec/numerics.html#aux-ibits}{\mathrm{ibits}}_N(i) = (0^\ast~1)^k~0^\ast) \end{array}$

### $$\href{../exec/numerics.html#op-ieqz}{\mathrm{ieqz}}_N(i)$$¶

• Return $$1$$ if $$i$$ is zero, $$0$$ otherwise.

$\begin{array}{@{}lcll} \href{../exec/numerics.html#op-ieqz}{\mathrm{ieqz}}_N(i) &=& \href{../exec/numerics.html#aux-bool}{\mathrm{bool}}(i = 0) \end{array}$

### $$\href{../exec/numerics.html#op-ieq}{\mathrm{ieq}}_N(i_1, i_2)$$¶

• Return $$1$$ if $$i_1$$ equals $$i_2$$, $$0$$ otherwise.

$\begin{array}{@{}lcll} \href{../exec/numerics.html#op-ieq}{\mathrm{ieq}}_N(i_1, i_2) &=& \href{../exec/numerics.html#aux-bool}{\mathrm{bool}}(i_1 = i_2) \end{array}$

### $$\href{../exec/numerics.html#op-ine}{\mathrm{ine}}_N(i_1, i_2)$$¶

• Return $$1$$ if $$i_1$$ does not equal $$i_2$$, $$0$$ otherwise.

$\begin{array}{@{}lcll} \href{../exec/numerics.html#op-ine}{\mathrm{ine}}_N(i_1, i_2) &=& \href{../exec/numerics.html#aux-bool}{\mathrm{bool}}(i_1 \neq i_2) \end{array}$

### $$\href{../exec/numerics.html#op-ilt-u}{\mathrm{ilt\_u}}_N(i_1, i_2)$$¶

• Return $$1$$ if $$i_1$$ is less than $$i_2$$, $$0$$ otherwise.

$\begin{array}{@{}lcll} \href{../exec/numerics.html#op-ilt-u}{\mathrm{ilt\_u}}_N(i_1, i_2) &=& \href{../exec/numerics.html#aux-bool}{\mathrm{bool}}(i_1 < i_2) \end{array}$

### $$\href{../exec/numerics.html#op-ilt-s}{\mathrm{ilt\_s}}_N(i_1, i_2)$$¶

$\begin{array}{@{}lcll} \href{../exec/numerics.html#op-ilt-s}{\mathrm{ilt\_s}}_N(i_1, i_2) &=& \href{../exec/numerics.html#aux-bool}{\mathrm{bool}}(\href{../exec/numerics.html#aux-signed}{\mathrm{signed}}_N(i_1) < \href{../exec/numerics.html#aux-signed}{\mathrm{signed}}_N(i_2)) \end{array}$

### $$\href{../exec/numerics.html#op-igt-u}{\mathrm{igt\_u}}_N(i_1, i_2)$$¶

• Return $$1$$ if $$i_1$$ is greater than $$i_2$$, $$0$$ otherwise.

$\begin{array}{@{}lcll} \href{../exec/numerics.html#op-igt-u}{\mathrm{igt\_u}}_N(i_1, i_2) &=& \href{../exec/numerics.html#aux-bool}{\mathrm{bool}}(i_1 > i_2) \end{array}$

### $$\href{../exec/numerics.html#op-igt-s}{\mathrm{igt\_s}}_N(i_1, i_2)$$¶

$\begin{array}{@{}lcll} \href{../exec/numerics.html#op-igt-s}{\mathrm{igt\_s}}_N(i_1, i_2) &=& \href{../exec/numerics.html#aux-bool}{\mathrm{bool}}(\href{../exec/numerics.html#aux-signed}{\mathrm{signed}}_N(i_1) > \href{../exec/numerics.html#aux-signed}{\mathrm{signed}}_N(i_2)) \end{array}$

### $$\href{../exec/numerics.html#op-ile-u}{\mathrm{ile\_u}}_N(i_1, i_2)$$¶

• Return $$1$$ if $$i_1$$ is less than or equal to $$i_2$$, $$0$$ otherwise.

$\begin{array}{@{}lcll} \href{../exec/numerics.html#op-ile-u}{\mathrm{ile\_u}}_N(i_1, i_2) &=& \href{../exec/numerics.html#aux-bool}{\mathrm{bool}}(i_1 \leq i_2) \end{array}$

### $$\href{../exec/numerics.html#op-ile-s}{\mathrm{ile\_s}}_N(i_1, i_2)$$¶

$\begin{array}{@{}lcll} \href{../exec/numerics.html#op-ile-s}{\mathrm{ile\_s}}_N(i_1, i_2) &=& \href{../exec/numerics.html#aux-bool}{\mathrm{bool}}(\href{../exec/numerics.html#aux-signed}{\mathrm{signed}}_N(i_1) \leq \href{../exec/numerics.html#aux-signed}{\mathrm{signed}}_N(i_2)) \end{array}$

### $$\href{../exec/numerics.html#op-ige-u}{\mathrm{ige\_u}}_N(i_1, i_2)$$¶

• Return $$1$$ if $$i_1$$ is greater than or equal to $$i_2$$, $$0$$ otherwise.

$\begin{array}{@{}lcll} \href{../exec/numerics.html#op-ige-u}{\mathrm{ige\_u}}_N(i_1, i_2) &=& \href{../exec/numerics.html#aux-bool}{\mathrm{bool}}(i_1 \geq i_2) \end{array}$

### $$\href{../exec/numerics.html#op-ige-s}{\mathrm{ige\_s}}_N(i_1, i_2)$$¶

$\begin{array}{@{}lcll} \href{../exec/numerics.html#op-ige-s}{\mathrm{ige\_s}}_N(i_1, i_2) &=& \href{../exec/numerics.html#aux-bool}{\mathrm{bool}}(\href{../exec/numerics.html#aux-signed}{\mathrm{signed}}_N(i_1) \geq \href{../exec/numerics.html#aux-signed}{\mathrm{signed}}_N(i_2)) \end{array}$

### $$\href{../exec/numerics.html#op-iextendn-s}{\mathrm{iextend}M\mathrm{\_s}}_N(i)$$¶

• Return $$\href{../exec/numerics.html#op-extend-s}{\mathrm{extend}^{\mathsf{s}}}_{M,N}(i)$$.

$\begin{split}\begin{array}{lll@{\qquad}l} \href{../exec/numerics.html#op-iextendn-s}{\mathrm{iextend}M\mathrm{\_s}}_{N}(i) &=& \href{../exec/numerics.html#op-extend-s}{\mathrm{extend}^{\mathsf{s}}}_{M,N}(i) \\ \end{array}\end{split}$

### $$\href{../exec/numerics.html#op-ibitselect}{\mathrm{ibitselect}}_N(i_1, i_2, i_3)$$¶

• Let $$j_1$$ be the bitwise conjunction of $$i_1$$ and $$i_3$$.

• Let $$j_3'$$ be the bitwise negation of $$i_3$$.

• Let $$j_2$$ be the bitwise conjunction of $$i_2$$ and $$j_3'$$.

• Return the bitwise disjunction of $$j_1$$ and $$j_2$$.

$\begin{array}{@{}lcll} \href{../exec/numerics.html#op-ibitselect}{\mathrm{ibitselect}}_N(i_1, i_2, i_3) &=& \href{../exec/numerics.html#op-ior}{\mathrm{ior}}_N(\href{../exec/numerics.html#op-iand}{\mathrm{iand}}_N(i_1, i_3), \href{../exec/numerics.html#op-iand}{\mathrm{iand}}_N(i_2, \href{../exec/numerics.html#op-inot}{\mathrm{inot}}_N(i_3))) \end{array}$

### $$\href{../exec/numerics.html#op-iabs}{\mathrm{iabs}}_N(i)$$¶

• Let $$j$$ be the signed interpretation of $$i$$.

• If $$j$$ is greater than or equal to $$0$$, then return $$i$$.

• Else return the negation of j, modulo $$2^N$$.

$\begin{split}\begin{array}{@{}lcll} \href{../exec/numerics.html#op-iabs}{\mathrm{iabs}}_N(i) &=& i & (\mathrel{\mbox{if}} \href{../exec/numerics.html#aux-signed}{\mathrm{signed}}_N(i) \ge 0) \\ \href{../exec/numerics.html#op-iabs}{\mathrm{iabs}}_N(i) &=& -\href{../exec/numerics.html#aux-signed}{\mathrm{signed}}_N(i) \mathbin{\mathrm{mod}} 2^N & (\mathrel{\mbox{otherwise}}) \\ \end{array}\end{split}$

### $$\href{../exec/numerics.html#op-ineg}{\mathrm{ineg}}_N(i)$$¶

• Return the result of negating $$i$$, modulo $$2^N$$.

$\begin{array}{@{}lcll} \href{../exec/numerics.html#op-ineg}{\mathrm{ineg}}_N(i) &=& (2^N - i) \mathbin{\mathrm{mod}} 2^N \end{array}$

### $$\href{../exec/numerics.html#op-imin-u}{\mathrm{imin\_u}}_N(i_1, i_2)$$¶

• Return $$i_1$$ if $$\href{../exec/numerics.html#op-ilt-u}{\mathrm{ilt\_u}}_N(i_1, i_2)$$ is $$1$$, return $$i_2$$ otherwise.

$\begin{split}\begin{array}{@{}lcll} \href{../exec/numerics.html#op-imin-u}{\mathrm{imin\_u}}_N(i_1, i_2) &=& i_1 & (\mathrel{\mbox{if}} \href{../exec/numerics.html#op-ilt-u}{\mathrm{ilt\_u}}_N(i_1, i_2) = 1)\\ \href{../exec/numerics.html#op-imin-u}{\mathrm{imin\_u}}_N(i_1, i_2) &=& i_2 & (\mathrel{\mbox{otherwise}}) \end{array}\end{split}$

### $$\href{../exec/numerics.html#op-imin-s}{\mathrm{imin\_s}}_N(i_1, i_2)$$¶

• Return $$i_1$$ if $$\href{../exec/numerics.html#op-ilt-s}{\mathrm{ilt\_s}}_N(i_1, i_2)$$ is $$1$$, return $$i_2$$ otherwise.

$\begin{split}\begin{array}{@{}lcll} \href{../exec/numerics.html#op-imin-s}{\mathrm{imin\_s}}_N(i_1, i_2) &=& i_1 & (\mathrel{\mbox{if}} \href{../exec/numerics.html#op-ilt-s}{\mathrm{ilt\_s}}_N(i_1, i_2) = 1)\\ \href{../exec/numerics.html#op-imin-s}{\mathrm{imin\_s}}_N(i_1, i_2) &=& i_2 & (\mathrel{\mbox{otherwise}}) \end{array}\end{split}$

### $$\href{../exec/numerics.html#op-imax-u}{\mathrm{imax\_u}}_N(i_1, i_2)$$¶

• Return $$i_1$$ if $$\href{../exec/numerics.html#op-igt-u}{\mathrm{igt\_u}}_N(i_1, i_2)$$ is $$1$$, return $$i_2$$ otherwise.

$\begin{split}\begin{array}{@{}lcll} \href{../exec/numerics.html#op-imax-u}{\mathrm{imax\_u}}_N(i_1, i_2) &=& i_1 & (\mathrel{\mbox{if}} \href{../exec/numerics.html#op-igt-u}{\mathrm{igt\_u}}_N(i_1, i_2) = 1)\\ \href{../exec/numerics.html#op-imax-u}{\mathrm{imax\_u}}_N(i_1, i_2) &=& i_2 & (\mathrel{\mbox{otherwise}}) \end{array}\end{split}$

### $$\href{../exec/numerics.html#op-imax-s}{\mathrm{imax\_s}}_N(i_1, i_2)$$¶

• Return $$i_1$$ if $$\href{../exec/numerics.html#op-igt-s}{\mathrm{igt\_s}}_N(i_1, i_2)$$ is $$1$$, return $$i_2$$ otherwise.

$\begin{split}\begin{array}{@{}lcll} \href{../exec/numerics.html#op-imax-s}{\mathrm{imax\_s}}_N(i_1, i_2) &=& i_1 & (\mathrel{\mbox{if}} \href{../exec/numerics.html#op-igt-s}{\mathrm{igt\_s}}_N(i_1, i_2) = 1)\\ \href{../exec/numerics.html#op-imax-s}{\mathrm{imax\_s}}_N(i_1, i_2) &=& i_2 & (\mathrel{\mbox{otherwise}}) \end{array}\end{split}$

### $$\href{../exec/numerics.html#op-iaddsat-u}{\mathrm{iaddsat\_u}}_N(i_1, i_2)$$¶

• Let $$i$$ be the result of adding $$i_1$$ and $$i_2$$.

• Return $$\href{../exec/numerics.html#aux-sat-u}{\mathrm{sat}^{\mathsf{u}}}_N(i)$$.

$\begin{array}{lll@{\qquad}l} \href{../exec/numerics.html#op-iaddsat-u}{\mathrm{iaddsat\_u}}_N(i_1, i_2) &=& \href{../exec/numerics.html#aux-sat-u}{\mathrm{sat}^{\mathsf{u}}}_N(i_1 + i_2) \end{array}$

### $$\href{../exec/numerics.html#op-iaddsat-s}{\mathrm{iaddsat\_s}}_N(i_1, i_2)$$¶

• Let $$j_1$$ be the signed interpretation of $$i_1$$

• Let $$j_2$$ be the signed interpretation of $$i_2$$

• Let $$j$$ be the result of adding $$j_1$$ and $$j_2$$.

• Return $$\href{../exec/numerics.html#aux-sat-s}{\mathrm{sat}^{\mathsf{s}}}_N(j)$$.

$\begin{array}{lll@{\qquad}l} \href{../exec/numerics.html#op-iaddsat-s}{\mathrm{iaddsat\_s}}_N(i_1, i_2) &=& \href{../exec/numerics.html#aux-sat-s}{\mathrm{sat}^{\mathsf{s}}}_N(\href{../exec/numerics.html#aux-signed}{\mathrm{signed}}_N(i_1) + \href{../exec/numerics.html#aux-signed}{\mathrm{signed}}_N(i_2)) \end{array}$

### $$\href{../exec/numerics.html#op-isubsat-u}{\mathrm{isubsat\_u}}_N(i_1, i_2)$$¶

• Let $$i$$ be the result of subtracting $$i_2$$ from $$i_1$$.

• Return $$\href{../exec/numerics.html#aux-sat-u}{\mathrm{sat}^{\mathsf{u}}}_N(i)$$.

$\begin{array}{lll@{\qquad}l} \href{../exec/numerics.html#op-isubsat-u}{\mathrm{isubsat\_u}}_N(i_1, i_2) &=& \href{../exec/numerics.html#aux-sat-u}{\mathrm{sat}^{\mathsf{u}}}_N(i_1 - i_2) \end{array}$

### $$\href{../exec/numerics.html#op-isubsat-s}{\mathrm{isubsat\_s}}_N(i_1, i_2)$$¶

• Let $$j_1$$ be the signed interpretation of $$i_1$$

• Let $$j_2$$ be the signed interpretation of $$i_2$$

• Let $$j$$ be the result of subtracting $$j_2$$ from $$j_1$$.

• Return $$\href{../exec/numerics.html#aux-sat-s}{\mathrm{sat}^{\mathsf{s}}}_N(j)$$.

$\begin{array}{lll@{\qquad}l} \href{../exec/numerics.html#op-isubsat-s}{\mathrm{isubsat\_s}}_N(i_1, i_2) &=& \href{../exec/numerics.html#aux-sat-s}{\mathrm{sat}^{\mathsf{s}}}_N(\href{../exec/numerics.html#aux-signed}{\mathrm{signed}}_N(i_1) - \href{../exec/numerics.html#aux-signed}{\mathrm{signed}}_N(i_2)) \end{array}$

### $$\href{../exec/numerics.html#op-iavgr-u}{\mathrm{iavgr\_u}}_N(i_1, i_2)$$¶

• Let $$j$$ be the result of adding $$i_1$$, $$i_2$$, and $$1$$.

• Return the result of dividing $$j$$ by $$2$$, truncated toward zero.

$\begin{array}{lll@{\qquad}l} \href{../exec/numerics.html#op-iavgr-u}{\mathrm{iavgr\_u}}_N(i_1, i_2) &=& \href{../exec/numerics.html#aux-trunc}{\mathrm{trunc}}((i_1 + i_2 + 1) / 2) \end{array}$

### $$\href{../exec/numerics.html#op-iq15mulrsat-s}{\mathrm{iq15mulrsat\_s}}_N(i_1, i_2)$$¶

• Return the result of $$\href{../exec/numerics.html#aux-sat-s}{\mathrm{sat}^{\mathsf{s}}}_N(\href{../exec/numerics.html#op-ishr-s}{\mathrm{ishr\_s}}_N(i_1 \cdot i_2 + 2^{14}, 15))$$.

$\begin{array}{lll@{\qquad}l} \href{../exec/numerics.html#op-iq15mulrsat-s}{\mathrm{iq15mulrsat\_s}}_N(i_1, i_2) &=& \href{../exec/numerics.html#aux-sat-s}{\mathrm{sat}^{\mathsf{s}}}_N(\href{../exec/numerics.html#op-ishr-s}{\mathrm{ishr\_s}}_N(i_1 \cdot i_2 + 2^{14}, 15)) \end{array}$

## Floating-Point Operations¶

Floating-point arithmetic follows the IEEE 754-2019 standard, with the following qualifications:

• All operators use round-to-nearest ties-to-even, except where otherwise specified. Non-default directed rounding attributes are not supported.

• Following the recommendation that operators propagate NaN payloads from their operands is permitted but not required.

• All operators use “non-stop” mode, and floating-point exceptions are not otherwise observable. In particular, neither alternate floating-point exception handling attributes nor operators on status flags are supported. There is no observable difference between quiet and signalling NaNs.

Note

Some of these limitations may be lifted in future versions of WebAssembly.

### Rounding¶

Rounding always is round-to-nearest ties-to-even, in correspondence with IEEE 754-2019 (Section 4.3.1).

An exact floating-point number is a rational number that is exactly representable as a floating-point number of given bit width $$N$$.

A limit number for a given floating-point bit width $$N$$ is a positive or negative number whose magnitude is the smallest power of $$2$$ that is not exactly representable as a floating-point number of width $$N$$ (that magnitude is $$2^{128}$$ for $$N = 32$$ and $$2^{1024}$$ for $$N = 64$$).

A candidate number is either an exact floating-point number or a positive or negative limit number for the given bit width $$N$$.

A candidate pair is a pair $$z_1,z_2$$ of candidate numbers, such that no candidate number exists that lies between the two.

A real number $$r$$ is converted to a floating-point value of bit width $$N$$ as follows:

• If $$r$$ is $$0$$, then return $$+0$$.

• Else if $$r$$ is an exact floating-point number, then return $$r$$.

• Else if $$r$$ greater than or equal to the positive limit, then return $$+\infty$$.

• Else if $$r$$ is less than or equal to the negative limit, then return $$-\infty$$.

• Else if $$z_1$$ and $$z_2$$ are a candidate pair such that $$z_1 < r < z_2$$, then:

• If $$|r - z_1| < |r - z_2|$$, then let $$z$$ be $$z_1$$.

• Else if $$|r - z_1| > |r - z_2|$$, then let $$z$$ be $$z_2$$.

• Else if $$|r - z_1| = |r - z_2|$$ and the significand of $$z_1$$ is even, then let $$z$$ be $$z_1$$.

• Else, let $$z$$ be $$z_2$$.

• If $$z$$ is $$0$$, then:

• If $$r < 0$$, then return $$-0$$.

• Else, return $$+0$$.

• Else if $$z$$ is a limit number, then:

• If $$r < 0$$, then return $$-\infty$$.

• Else, return $$+\infty$$.

• Else, return $$z$$.

$\begin{split}\begin{array}{lll@{\qquad}l} \href{../exec/numerics.html#aux-ieee}{\mathrm{float}}_N(0) &=& +0 \\ \href{../exec/numerics.html#aux-ieee}{\mathrm{float}}_N(r) &=& r & (\mathrel{\mbox{if}} r \in \mathrm{exact}_N) \\ \href{../exec/numerics.html#aux-ieee}{\mathrm{float}}_N(r) &=& +\infty & (\mathrel{\mbox{if}} r \geq +\mathrm{limit}_N) \\ \href{../exec/numerics.html#aux-ieee}{\mathrm{float}}_N(r) &=& -\infty & (\mathrel{\mbox{if}} r \leq -\mathrm{limit}_N) \\ \href{../exec/numerics.html#aux-ieee}{\mathrm{float}}_N(r) &=& \mathrm{closest}_N(r, z_1, z_2) & (\mathrel{\mbox{if}} z_1 < r < z_2 \wedge (z_1,z_2) \in \mathrm{candidatepair}_N) \\[1ex] \mathrm{closest}_N(r, z_1, z_2) &=& \mathrm{rectify}_N(r, z_1) & (\mathrel{\mbox{if}} |r-z_1|<|r-z_2|) \\ \mathrm{closest}_N(r, z_1, z_2) &=& \mathrm{rectify}_N(r, z_2) & (\mathrel{\mbox{if}} |r-z_1|>|r-z_2|) \\ \mathrm{closest}_N(r, z_1, z_2) &=& \mathrm{rectify}_N(r, z_1) & (\mathrel{\mbox{if}} |r-z_1|=|r-z_2| \wedge \mathrm{even}_N(z_1)) \\ \mathrm{closest}_N(r, z_1, z_2) &=& \mathrm{rectify}_N(r, z_2) & (\mathrel{\mbox{if}} |r-z_1|=|r-z_2| \wedge \mathrm{even}_N(z_2)) \\[1ex] \mathrm{rectify}_N(r, \pm \mathrm{limit}_N) &=& \pm \infty \\ \mathrm{rectify}_N(r, 0) &=& +0 \qquad (r \geq 0) \\ \mathrm{rectify}_N(r, 0) &=& -0 \qquad (r < 0) \\ \mathrm{rectify}_N(r, z) &=& z \\ \end{array}\end{split}$

where:

$\begin{split}\begin{array}{lll@{\qquad}l} \mathrm{exact}_N &=& \href{../syntax/values.html#syntax-float}{\mathit{f}N} \cap \mathbb{Q} \\ \mathrm{limit}_N &=& 2^{2^{\href{../syntax/values.html#aux-exponent}{\mathrm{expon}}(N)-1}} \\ \mathrm{candidate}_N &=& \mathrm{exact}_N \cup \{+\mathrm{limit}_N, -\mathrm{limit}_N\} \\ \mathrm{candidatepair}_N &=& \{ (z_1, z_2) \in \mathrm{candidate}_N^2 ~|~ z_1 < z_2 \wedge \forall z \in \mathrm{candidate}_N, z \leq z_1 \vee z \geq z_2\} \\[1ex] \mathrm{even}_N((d + m\cdot 2^{-M}) \cdot 2^e) &\Leftrightarrow& m \mathbin{\mathrm{mod}} 2 = 0 \\ \mathrm{even}_N(\pm \mathrm{limit}_N) &\Leftrightarrow& \mathrm{true} \\ \end{array}\end{split}$

### NaN Propagation¶

When the result of a floating-point operator other than $$\href{../exec/numerics.html#op-fneg}{\mathrm{fneg}}$$, $$\href{../exec/numerics.html#op-fabs}{\mathrm{fabs}}$$, or $$\href{../exec/numerics.html#op-fcopysign}{\mathrm{fcopysign}}$$ is a NaN, then its sign is non-deterministic and the payload is computed as follows:

• If the payload of all NaN inputs to the operator is canonical (including the case that there are no NaN inputs), then the payload of the output is canonical as well.

• Otherwise the payload is picked non-deterministically among all arithmetic NaNs; that is, its most significant bit is $$1$$ and all others are unspecified.

This non-deterministic result is expressed by the following auxiliary function producing a set of allowed outputs from a set of inputs:

$\begin{split}\begin{array}{lll@{\qquad}l} \href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{z^\ast\} &=& \{ + \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n), - \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n) ~|~ n = \href{../syntax/values.html#aux-canon}{\mathrm{canon}}_N \} & (\mathrel{\mbox{if}} \forall \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n) \in z^\ast,~ n = \href{../syntax/values.html#aux-canon}{\mathrm{canon}}_N) \\ \href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{z^\ast\} &=& \{ + \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n), - \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n) ~|~ n \geq \href{../syntax/values.html#aux-canon}{\mathrm{canon}}_N \} & (\mathrel{\mbox{otherwise}}) \\ \end{array}\end{split}$

### $$\href{../exec/numerics.html#op-fadd}{\mathrm{fadd}}_N(z_1, z_2)$$¶

• If either $$z_1$$ or $$z_2$$ is a NaN, then return an element of $$\href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{z_1, z_2\}$$.

• Else if both $$z_1$$ and $$z_2$$ are infinities of opposite signs, then return an element of $$\href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{\}$$.

• Else if both $$z_1$$ and $$z_2$$ are infinities of equal sign, then return that infinity.

• Else if either $$z_1$$ or $$z_2$$ is an infinity, then return that infinity.

• Else if both $$z_1$$ and $$z_2$$ are zeroes of opposite sign, then return positive zero.

• Else if both $$z_1$$ and $$z_2$$ are zeroes of equal sign, then return that zero.

• Else if either $$z_1$$ or $$z_2$$ is a zero, then return the other operand.

• Else if both $$z_1$$ and $$z_2$$ are values with the same magnitude but opposite signs, then return positive zero.

• Else return the result of adding $$z_1$$ and $$z_2$$, rounded to the nearest representable value.

$\begin{split}\begin{array}{@{}lcll} \href{../exec/numerics.html#op-fadd}{\mathrm{fadd}}_N(\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n), z_2) &=& \href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n), z_2\} \\ \href{../exec/numerics.html#op-fadd}{\mathrm{fadd}}_N(z_1, \pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n)) &=& \href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n), z_1\} \\ \href{../exec/numerics.html#op-fadd}{\mathrm{fadd}}_N(\pm \infty, \mp \infty) &=& \href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{\} \\ \href{../exec/numerics.html#op-fadd}{\mathrm{fadd}}_N(\pm \infty, \pm \infty) &=& \pm \infty \\ \href{../exec/numerics.html#op-fadd}{\mathrm{fadd}}_N(z_1, \pm \infty) &=& \pm \infty \\ \href{../exec/numerics.html#op-fadd}{\mathrm{fadd}}_N(\pm \infty, z_2) &=& \pm \infty \\ \href{../exec/numerics.html#op-fadd}{\mathrm{fadd}}_N(\pm 0, \mp 0) &=& +0 \\ \href{../exec/numerics.html#op-fadd}{\mathrm{fadd}}_N(\pm 0, \pm 0) &=& \pm 0 \\ \href{../exec/numerics.html#op-fadd}{\mathrm{fadd}}_N(z_1, \pm 0) &=& z_1 \\ \href{../exec/numerics.html#op-fadd}{\mathrm{fadd}}_N(\pm 0, z_2) &=& z_2 \\ \href{../exec/numerics.html#op-fadd}{\mathrm{fadd}}_N(\pm q, \mp q) &=& +0 \\ \href{../exec/numerics.html#op-fadd}{\mathrm{fadd}}_N(z_1, z_2) &=& \href{../exec/numerics.html#aux-ieee}{\mathrm{float}}_N(z_1 + z_2) \\ \end{array}\end{split}$

### $$\href{../exec/numerics.html#op-fsub}{\mathrm{fsub}}_N(z_1, z_2)$$¶

• If either $$z_1$$ or $$z_2$$ is a NaN, then return an element of $$\href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{z_1, z_2\}$$.

• Else if both $$z_1$$ and $$z_2$$ are infinities of equal signs, then return an element of $$\href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{\}$$.

• Else if both $$z_1$$ and $$z_2$$ are infinities of opposite sign, then return $$z_1$$.

• Else if $$z_1$$ is an infinity, then return that infinity.

• Else if $$z_2$$ is an infinity, then return that infinity negated.

• Else if both $$z_1$$ and $$z_2$$ are zeroes of equal sign, then return positive zero.

• Else if both $$z_1$$ and $$z_2$$ are zeroes of opposite sign, then return $$z_1$$.

• Else if $$z_2$$ is a zero, then return $$z_1$$.

• Else if $$z_1$$ is a zero, then return $$z_2$$ negated.

• Else if both $$z_1$$ and $$z_2$$ are the same value, then return positive zero.

• Else return the result of subtracting $$z_2$$ from $$z_1$$, rounded to the nearest representable value.

$\begin{split}\begin{array}{@{}lcll} \href{../exec/numerics.html#op-fsub}{\mathrm{fsub}}_N(\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n), z_2) &=& \href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n), z_2\} \\ \href{../exec/numerics.html#op-fsub}{\mathrm{fsub}}_N(z_1, \pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n)) &=& \href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n), z_1\} \\ \href{../exec/numerics.html#op-fsub}{\mathrm{fsub}}_N(\pm \infty, \pm \infty) &=& \href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{\} \\ \href{../exec/numerics.html#op-fsub}{\mathrm{fsub}}_N(\pm \infty, \mp \infty) &=& \pm \infty \\ \href{../exec/numerics.html#op-fsub}{\mathrm{fsub}}_N(z_1, \pm \infty) &=& \mp \infty \\ \href{../exec/numerics.html#op-fsub}{\mathrm{fsub}}_N(\pm \infty, z_2) &=& \pm \infty \\ \href{../exec/numerics.html#op-fsub}{\mathrm{fsub}}_N(\pm 0, \pm 0) &=& +0 \\ \href{../exec/numerics.html#op-fsub}{\mathrm{fsub}}_N(\pm 0, \mp 0) &=& \pm 0 \\ \href{../exec/numerics.html#op-fsub}{\mathrm{fsub}}_N(z_1, \pm 0) &=& z_1 \\ \href{../exec/numerics.html#op-fsub}{\mathrm{fsub}}_N(\pm 0, \pm q_2) &=& \mp q_2 \\ \href{../exec/numerics.html#op-fsub}{\mathrm{fsub}}_N(\pm q, \pm q) &=& +0 \\ \href{../exec/numerics.html#op-fsub}{\mathrm{fsub}}_N(z_1, z_2) &=& \href{../exec/numerics.html#aux-ieee}{\mathrm{float}}_N(z_1 - z_2) \\ \end{array}\end{split}$

Note

Up to the non-determinism regarding NaNs, it always holds that $$\href{../exec/numerics.html#op-fsub}{\mathrm{fsub}}_N(z_1, z_2) = \href{../exec/numerics.html#op-fadd}{\mathrm{fadd}}_N(z_1, \href{../exec/numerics.html#op-fneg}{\mathrm{fneg}}_N(z_2))$$.

### $$\href{../exec/numerics.html#op-fmul}{\mathrm{fmul}}_N(z_1, z_2)$$¶

• If either $$z_1$$ or $$z_2$$ is a NaN, then return an element of $$\href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{z_1, z_2\}$$.

• Else if one of $$z_1$$ and $$z_2$$ is a zero and the other an infinity, then return an element of $$\href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{\}$$.

• Else if both $$z_1$$ and $$z_2$$ are infinities of equal sign, then return positive infinity.

• Else if both $$z_1$$ and $$z_2$$ are infinities of opposite sign, then return negative infinity.

• Else if either $$z_1$$ or $$z_2$$ is an infinity and the other a value with equal sign, then return positive infinity.

• Else if either $$z_1$$ or $$z_2$$ is an infinity and the other a value with opposite sign, then return negative infinity.

• Else if both $$z_1$$ and $$z_2$$ are zeroes of equal sign, then return positive zero.

• Else if both $$z_1$$ and $$z_2$$ are zeroes of opposite sign, then return negative zero.

• Else return the result of multiplying $$z_1$$ and $$z_2$$, rounded to the nearest representable value.

$\begin{split}\begin{array}{@{}lcll} \href{../exec/numerics.html#op-fmul}{\mathrm{fmul}}_N(\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n), z_2) &=& \href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n), z_2\} \\ \href{../exec/numerics.html#op-fmul}{\mathrm{fmul}}_N(z_1, \pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n)) &=& \href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n), z_1\} \\ \href{../exec/numerics.html#op-fmul}{\mathrm{fmul}}_N(\pm \infty, \pm 0) &=& \href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{\} \\ \href{../exec/numerics.html#op-fmul}{\mathrm{fmul}}_N(\pm \infty, \mp 0) &=& \href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{\} \\ \href{../exec/numerics.html#op-fmul}{\mathrm{fmul}}_N(\pm 0, \pm \infty) &=& \href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{\} \\ \href{../exec/numerics.html#op-fmul}{\mathrm{fmul}}_N(\pm 0, \mp \infty) &=& \href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{\} \\ \href{../exec/numerics.html#op-fmul}{\mathrm{fmul}}_N(\pm \infty, \pm \infty) &=& +\infty \\ \href{../exec/numerics.html#op-fmul}{\mathrm{fmul}}_N(\pm \infty, \mp \infty) &=& -\infty \\ \href{../exec/numerics.html#op-fmul}{\mathrm{fmul}}_N(\pm q_1, \pm \infty) &=& +\infty \\ \href{../exec/numerics.html#op-fmul}{\mathrm{fmul}}_N(\pm q_1, \mp \infty) &=& -\infty \\ \href{../exec/numerics.html#op-fmul}{\mathrm{fmul}}_N(\pm \infty, \pm q_2) &=& +\infty \\ \href{../exec/numerics.html#op-fmul}{\mathrm{fmul}}_N(\pm \infty, \mp q_2) &=& -\infty \\ \href{../exec/numerics.html#op-fmul}{\mathrm{fmul}}_N(\pm 0, \pm 0) &=& + 0 \\ \href{../exec/numerics.html#op-fmul}{\mathrm{fmul}}_N(\pm 0, \mp 0) &=& - 0 \\ \href{../exec/numerics.html#op-fmul}{\mathrm{fmul}}_N(z_1, z_2) &=& \href{../exec/numerics.html#aux-ieee}{\mathrm{float}}_N(z_1 \cdot z_2) \\ \end{array}\end{split}$

### $$\href{../exec/numerics.html#op-fdiv}{\mathrm{fdiv}}_N(z_1, z_2)$$¶

• If either $$z_1$$ or $$z_2$$ is a NaN, then return an element of $$\href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{z_1, z_2\}$$.

• Else if both $$z_1$$ and $$z_2$$ are infinities, then return an element of $$\href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{\}$$.

• Else if both $$z_1$$ and $$z_2$$ are zeroes, then return an element of $$\href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{z_1, z_2\}$$.

• Else if $$z_1$$ is an infinity and $$z_2$$ a value with equal sign, then return positive infinity.

• Else if $$z_1$$ is an infinity and $$z_2$$ a value with opposite sign, then return negative infinity.

• Else if $$z_2$$ is an infinity and $$z_1$$ a value with equal sign, then return positive zero.

• Else if $$z_2$$ is an infinity and $$z_1$$ a value with opposite sign, then return negative zero.

• Else if $$z_1$$ is a zero and $$z_2$$ a value with equal sign, then return positive zero.

• Else if $$z_1$$ is a zero and $$z_2$$ a value with opposite sign, then return negative zero.

• Else if $$z_2$$ is a zero and $$z_1$$ a value with equal sign, then return positive infinity.

• Else if $$z_2$$ is a zero and $$z_1$$ a value with opposite sign, then return negative infinity.

• Else return the result of dividing $$z_1$$ by $$z_2$$, rounded to the nearest representable value.

$\begin{split}\begin{array}{@{}lcll} \href{../exec/numerics.html#op-fdiv}{\mathrm{fdiv}}_N(\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n), z_2) &=& \href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n), z_2\} \\ \href{../exec/numerics.html#op-fdiv}{\mathrm{fdiv}}_N(z_1, \pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n)) &=& \href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n), z_1\} \\ \href{../exec/numerics.html#op-fdiv}{\mathrm{fdiv}}_N(\pm \infty, \pm \infty) &=& \href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{\} \\ \href{../exec/numerics.html#op-fdiv}{\mathrm{fdiv}}_N(\pm \infty, \mp \infty) &=& \href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{\} \\ \href{../exec/numerics.html#op-fdiv}{\mathrm{fdiv}}_N(\pm 0, \pm 0) &=& \href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{\} \\ \href{../exec/numerics.html#op-fdiv}{\mathrm{fdiv}}_N(\pm 0, \mp 0) &=& \href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{\} \\ \href{../exec/numerics.html#op-fdiv}{\mathrm{fdiv}}_N(\pm \infty, \pm q_2) &=& +\infty \\ \href{../exec/numerics.html#op-fdiv}{\mathrm{fdiv}}_N(\pm \infty, \mp q_2) &=& -\infty \\ \href{../exec/numerics.html#op-fdiv}{\mathrm{fdiv}}_N(\pm q_1, \pm \infty) &=& +0 \\ \href{../exec/numerics.html#op-fdiv}{\mathrm{fdiv}}_N(\pm q_1, \mp \infty) &=& -0 \\ \href{../exec/numerics.html#op-fdiv}{\mathrm{fdiv}}_N(\pm 0, \pm q_2) &=& +0 \\ \href{../exec/numerics.html#op-fdiv}{\mathrm{fdiv}}_N(\pm 0, \mp q_2) &=& -0 \\ \href{../exec/numerics.html#op-fdiv}{\mathrm{fdiv}}_N(\pm q_1, \pm 0) &=& +\infty \\ \href{../exec/numerics.html#op-fdiv}{\mathrm{fdiv}}_N(\pm q_1, \mp 0) &=& -\infty \\ \href{../exec/numerics.html#op-fdiv}{\mathrm{fdiv}}_N(z_1, z_2) &=& \href{../exec/numerics.html#aux-ieee}{\mathrm{float}}_N(z_1 / z_2) \\ \end{array}\end{split}$

### $$\href{../exec/numerics.html#op-fmin}{\mathrm{fmin}}_N(z_1, z_2)$$¶

• If either $$z_1$$ or $$z_2$$ is a NaN, then return an element of $$\href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{z_1, z_2\}$$.

• Else if either $$z_1$$ or $$z_2$$ is a negative infinity, then return negative infinity.

• Else if either $$z_1$$ or $$z_2$$ is a positive infinity, then return the other value.

• Else if both $$z_1$$ and $$z_2$$ are zeroes of opposite signs, then return negative zero.

• Else return the smaller value of $$z_1$$ and $$z_2$$.

$\begin{split}\begin{array}{@{}lcll} \href{../exec/numerics.html#op-fmin}{\mathrm{fmin}}_N(\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n), z_2) &=& \href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n), z_2\} \\ \href{../exec/numerics.html#op-fmin}{\mathrm{fmin}}_N(z_1, \pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n)) &=& \href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n), z_1\} \\ \href{../exec/numerics.html#op-fmin}{\mathrm{fmin}}_N(+ \infty, z_2) &=& z_2 \\ \href{../exec/numerics.html#op-fmin}{\mathrm{fmin}}_N(- \infty, z_2) &=& - \infty \\ \href{../exec/numerics.html#op-fmin}{\mathrm{fmin}}_N(z_1, + \infty) &=& z_1 \\ \href{../exec/numerics.html#op-fmin}{\mathrm{fmin}}_N(z_1, - \infty) &=& - \infty \\ \href{../exec/numerics.html#op-fmin}{\mathrm{fmin}}_N(\pm 0, \mp 0) &=& -0 \\ \href{../exec/numerics.html#op-fmin}{\mathrm{fmin}}_N(z_1, z_2) &=& z_1 & (\mathrel{\mbox{if}} z_1 \leq z_2) \\ \href{../exec/numerics.html#op-fmin}{\mathrm{fmin}}_N(z_1, z_2) &=& z_2 & (\mathrel{\mbox{if}} z_2 \leq z_1) \\ \end{array}\end{split}$

### $$\href{../exec/numerics.html#op-fmax}{\mathrm{fmax}}_N(z_1, z_2)$$¶

• If either $$z_1$$ or $$z_2$$ is a NaN, then return an element of $$\href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{z_1, z_2\}$$.

• Else if either $$z_1$$ or $$z_2$$ is a positive infinity, then return positive infinity.

• Else if either $$z_1$$ or $$z_2$$ is a negative infinity, then return the other value.

• Else if both $$z_1$$ and $$z_2$$ are zeroes of opposite signs, then return positive zero.

• Else return the larger value of $$z_1$$ and $$z_2$$.

$\begin{split}\begin{array}{@{}lcll} \href{../exec/numerics.html#op-fmax}{\mathrm{fmax}}_N(\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n), z_2) &=& \href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n), z_2\} \\ \href{../exec/numerics.html#op-fmax}{\mathrm{fmax}}_N(z_1, \pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n)) &=& \href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n), z_1\} \\ \href{../exec/numerics.html#op-fmax}{\mathrm{fmax}}_N(+ \infty, z_2) &=& + \infty \\ \href{../exec/numerics.html#op-fmax}{\mathrm{fmax}}_N(- \infty, z_2) &=& z_2 \\ \href{../exec/numerics.html#op-fmax}{\mathrm{fmax}}_N(z_1, + \infty) &=& + \infty \\ \href{../exec/numerics.html#op-fmax}{\mathrm{fmax}}_N(z_1, - \infty) &=& z_1 \\ \href{../exec/numerics.html#op-fmax}{\mathrm{fmax}}_N(\pm 0, \mp 0) &=& +0 \\ \href{../exec/numerics.html#op-fmax}{\mathrm{fmax}}_N(z_1, z_2) &=& z_1 & (\mathrel{\mbox{if}} z_1 \geq z_2) \\ \href{../exec/numerics.html#op-fmax}{\mathrm{fmax}}_N(z_1, z_2) &=& z_2 & (\mathrel{\mbox{if}} z_2 \geq z_1) \\ \end{array}\end{split}$

### $$\href{../exec/numerics.html#op-fcopysign}{\mathrm{fcopysign}}_N(z_1, z_2)$$¶

• If $$z_1$$ and $$z_2$$ have the same sign, then return $$z_1$$.

• Else return $$z_1$$ with negated sign.

$\begin{split}\begin{array}{@{}lcll} \href{../exec/numerics.html#op-fcopysign}{\mathrm{fcopysign}}_N(\pm p_1, \pm p_2) &=& \pm p_1 \\ \href{../exec/numerics.html#op-fcopysign}{\mathrm{fcopysign}}_N(\pm p_1, \mp p_2) &=& \mp p_1 \\ \end{array}\end{split}$

### $$\href{../exec/numerics.html#op-fabs}{\mathrm{fabs}}_N(z)$$¶

• If $$z$$ is a NaN, then return $$z$$ with positive sign.

• Else if $$z$$ is an infinity, then return positive infinity.

• Else if $$z$$ is a zero, then return positive zero.

• Else if $$z$$ is a positive value, then $$z$$.

• Else return $$z$$ negated.

$\begin{split}\begin{array}{@{}lcll} \href{../exec/numerics.html#op-fabs}{\mathrm{fabs}}_N(\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n)) &=& +\href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n) \\ \href{../exec/numerics.html#op-fabs}{\mathrm{fabs}}_N(\pm \infty) &=& +\infty \\ \href{../exec/numerics.html#op-fabs}{\mathrm{fabs}}_N(\pm 0) &=& +0 \\ \href{../exec/numerics.html#op-fabs}{\mathrm{fabs}}_N(\pm q) &=& +q \\ \end{array}\end{split}$

### $$\href{../exec/numerics.html#op-fneg}{\mathrm{fneg}}_N(z)$$¶

• If $$z$$ is a NaN, then return $$z$$ with negated sign.

• Else if $$z$$ is an infinity, then return that infinity negated.

• Else if $$z$$ is a zero, then return that zero negated.

• Else return $$z$$ negated.

$\begin{split}\begin{array}{@{}lcll} \href{../exec/numerics.html#op-fneg}{\mathrm{fneg}}_N(\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n)) &=& \mp \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n) \\ \href{../exec/numerics.html#op-fneg}{\mathrm{fneg}}_N(\pm \infty) &=& \mp \infty \\ \href{../exec/numerics.html#op-fneg}{\mathrm{fneg}}_N(\pm 0) &=& \mp 0 \\ \href{../exec/numerics.html#op-fneg}{\mathrm{fneg}}_N(\pm q) &=& \mp q \\ \end{array}\end{split}$

### $$\href{../exec/numerics.html#op-fsqrt}{\mathrm{fsqrt}}_N(z)$$¶

• If $$z$$ is a NaN, then return an element of $$\href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{z\}$$.

• Else if $$z$$ is negative infinity, then return an element of $$\href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{\}$$.

• Else if $$z$$ is positive infinity, then return positive infinity.

• Else if $$z$$ is a zero, then return that zero.

• Else if $$z$$ has a negative sign, then return an element of $$\href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{\}$$.

• Else return the square root of $$z$$.

$\begin{split}\begin{array}{@{}lcll} \href{../exec/numerics.html#op-fsqrt}{\mathrm{fsqrt}}_N(\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n)) &=& \href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n)\} \\ \href{../exec/numerics.html#op-fsqrt}{\mathrm{fsqrt}}_N(- \infty) &=& \href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{\} \\ \href{../exec/numerics.html#op-fsqrt}{\mathrm{fsqrt}}_N(+ \infty) &=& + \infty \\ \href{../exec/numerics.html#op-fsqrt}{\mathrm{fsqrt}}_N(\pm 0) &=& \pm 0 \\ \href{../exec/numerics.html#op-fsqrt}{\mathrm{fsqrt}}_N(- q) &=& \href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{\} \\ \href{../exec/numerics.html#op-fsqrt}{\mathrm{fsqrt}}_N(+ q) &=& \href{../exec/numerics.html#aux-ieee}{\mathrm{float}}_N\left(\sqrt{q}\right) \\ \end{array}\end{split}$

### $$\href{../exec/numerics.html#op-fceil}{\mathrm{fceil}}_N(z)$$¶

• If $$z$$ is a NaN, then return an element of $$\href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{z\}$$.

• Else if $$z$$ is an infinity, then return $$z$$.

• Else if $$z$$ is a zero, then return $$z$$.

• Else if $$z$$ is smaller than $$0$$ but greater than $$-1$$, then return negative zero.

• Else return the smallest integral value that is not smaller than $$z$$.

$\begin{split}\begin{array}{@{}lcll} \href{../exec/numerics.html#op-fceil}{\mathrm{fceil}}_N(\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n)) &=& \href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n)\} \\ \href{../exec/numerics.html#op-fceil}{\mathrm{fceil}}_N(\pm \infty) &=& \pm \infty \\ \href{../exec/numerics.html#op-fceil}{\mathrm{fceil}}_N(\pm 0) &=& \pm 0 \\ \href{../exec/numerics.html#op-fceil}{\mathrm{fceil}}_N(- q) &=& -0 & (\mathrel{\mbox{if}} -1 < -q < 0) \\ \href{../exec/numerics.html#op-fceil}{\mathrm{fceil}}_N(\pm q) &=& \href{../exec/numerics.html#aux-ieee}{\mathrm{float}}_N(i) & (\mathrel{\mbox{if}} \pm q \leq i < \pm q + 1) \\ \end{array}\end{split}$

### $$\href{../exec/numerics.html#op-ffloor}{\mathrm{ffloor}}_N(z)$$¶

• If $$z$$ is a NaN, then return an element of $$\href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{z\}$$.

• Else if $$z$$ is an infinity, then return $$z$$.

• Else if $$z$$ is a zero, then return $$z$$.

• Else if $$z$$ is greater than $$0$$ but smaller than $$1$$, then return positive zero.

• Else return the largest integral value that is not larger than $$z$$.

$\begin{split}\begin{array}{@{}lcll} \href{../exec/numerics.html#op-ffloor}{\mathrm{ffloor}}_N(\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n)) &=& \href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n)\} \\ \href{../exec/numerics.html#op-ffloor}{\mathrm{ffloor}}_N(\pm \infty) &=& \pm \infty \\ \href{../exec/numerics.html#op-ffloor}{\mathrm{ffloor}}_N(\pm 0) &=& \pm 0 \\ \href{../exec/numerics.html#op-ffloor}{\mathrm{ffloor}}_N(+ q) &=& +0 & (\mathrel{\mbox{if}} 0 < +q < 1) \\ \href{../exec/numerics.html#op-ffloor}{\mathrm{ffloor}}_N(\pm q) &=& \href{../exec/numerics.html#aux-ieee}{\mathrm{float}}_N(i) & (\mathrel{\mbox{if}} \pm q - 1 < i \leq \pm q) \\ \end{array}\end{split}$

### $$\href{../exec/numerics.html#op-ftrunc}{\mathrm{ftrunc}}_N(z)$$¶

• If $$z$$ is a NaN, then return an element of $$\href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{z\}$$.

• Else if $$z$$ is an infinity, then return $$z$$.

• Else if $$z$$ is a zero, then return $$z$$.

• Else if $$z$$ is greater than $$0$$ but smaller than $$1$$, then return positive zero.

• Else if $$z$$ is smaller than $$0$$ but greater than $$-1$$, then return negative zero.

• Else return the integral value with the same sign as $$z$$ and the largest magnitude that is not larger than the magnitude of $$z$$.

$\begin{split}\begin{array}{@{}lcll} \href{../exec/numerics.html#op-ftrunc}{\mathrm{ftrunc}}_N(\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n)) &=& \href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n)\} \\ \href{../exec/numerics.html#op-ftrunc}{\mathrm{ftrunc}}_N(\pm \infty) &=& \pm \infty \\ \href{../exec/numerics.html#op-ftrunc}{\mathrm{ftrunc}}_N(\pm 0) &=& \pm 0 \\ \href{../exec/numerics.html#op-ftrunc}{\mathrm{ftrunc}}_N(+ q) &=& +0 & (\mathrel{\mbox{if}} 0 < +q < 1) \\ \href{../exec/numerics.html#op-ftrunc}{\mathrm{ftrunc}}_N(- q) &=& -0 & (\mathrel{\mbox{if}} -1 < -q < 0) \\ \href{../exec/numerics.html#op-ftrunc}{\mathrm{ftrunc}}_N(\pm q) &=& \href{../exec/numerics.html#aux-ieee}{\mathrm{float}}_N(\pm i) & (\mathrel{\mbox{if}} +q - 1 < i \leq +q) \\ \end{array}\end{split}$

### $$\href{../exec/numerics.html#op-fnearest}{\mathrm{fnearest}}_N(z)$$¶

• If $$z$$ is a NaN, then return an element of $$\href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{z\}$$.

• Else if $$z$$ is an infinity, then return $$z$$.

• Else if $$z$$ is a zero, then return $$z$$.

• Else if $$z$$ is greater than $$0$$ but smaller than or equal to $$0.5$$, then return positive zero.

• Else if $$z$$ is smaller than $$0$$ but greater than or equal to $$-0.5$$, then return negative zero.

• Else return the integral value that is nearest to $$z$$; if two values are equally near, return the even one.

$\begin{split}\begin{array}{@{}lcll} \href{../exec/numerics.html#op-fnearest}{\mathrm{fnearest}}_N(\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n)) &=& \href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n)\} \\ \href{../exec/numerics.html#op-fnearest}{\mathrm{fnearest}}_N(\pm \infty) &=& \pm \infty \\ \href{../exec/numerics.html#op-fnearest}{\mathrm{fnearest}}_N(\pm 0) &=& \pm 0 \\ \href{../exec/numerics.html#op-fnearest}{\mathrm{fnearest}}_N(+ q) &=& +0 & (\mathrel{\mbox{if}} 0 < +q \leq 0.5) \\ \href{../exec/numerics.html#op-fnearest}{\mathrm{fnearest}}_N(- q) &=& -0 & (\mathrel{\mbox{if}} -0.5 \leq -q < 0) \\ \href{../exec/numerics.html#op-fnearest}{\mathrm{fnearest}}_N(\pm q) &=& \href{../exec/numerics.html#aux-ieee}{\mathrm{float}}_N(\pm i) & (\mathrel{\mbox{if}} |i - q| < 0.5) \\ \href{../exec/numerics.html#op-fnearest}{\mathrm{fnearest}}_N(\pm q) &=& \href{../exec/numerics.html#aux-ieee}{\mathrm{float}}_N(\pm i) & (\mathrel{\mbox{if}} |i - q| = 0.5 \wedge i~\mbox{even}) \\ \end{array}\end{split}$

### $$\href{../exec/numerics.html#op-feq}{\mathrm{feq}}_N(z_1, z_2)$$¶

• If either $$z_1$$ or $$z_2$$ is a NaN, then return $$0$$.

• Else if both $$z_1$$ and $$z_2$$ are zeroes, then return $$1$$.

• Else if both $$z_1$$ and $$z_2$$ are the same value, then return $$1$$.

• Else return $$0$$.

$\begin{split}\begin{array}{@{}lcll} \href{../exec/numerics.html#op-feq}{\mathrm{feq}}_N(\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n), z_2) &=& 0 \\ \href{../exec/numerics.html#op-feq}{\mathrm{feq}}_N(z_1, \pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n)) &=& 0 \\ \href{../exec/numerics.html#op-feq}{\mathrm{feq}}_N(\pm 0, \mp 0) &=& 1 \\ \href{../exec/numerics.html#op-feq}{\mathrm{feq}}_N(z_1, z_2) &=& \href{../exec/numerics.html#aux-bool}{\mathrm{bool}}(z_1 = z_2) \\ \end{array}\end{split}$

### $$\href{../exec/numerics.html#op-fne}{\mathrm{fne}}_N(z_1, z_2)$$¶

• If either $$z_1$$ or $$z_2$$ is a NaN, then return $$1$$.

• Else if both $$z_1$$ and $$z_2$$ are zeroes, then return $$0$$.

• Else if both $$z_1$$ and $$z_2$$ are the same value, then return $$0$$.

• Else return $$1$$.

$\begin{split}\begin{array}{@{}lcll} \href{../exec/numerics.html#op-fne}{\mathrm{fne}}_N(\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n), z_2) &=& 1 \\ \href{../exec/numerics.html#op-fne}{\mathrm{fne}}_N(z_1, \pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n)) &=& 1 \\ \href{../exec/numerics.html#op-fne}{\mathrm{fne}}_N(\pm 0, \mp 0) &=& 0 \\ \href{../exec/numerics.html#op-fne}{\mathrm{fne}}_N(z_1, z_2) &=& \href{../exec/numerics.html#aux-bool}{\mathrm{bool}}(z_1 \neq z_2) \\ \end{array}\end{split}$

### $$\href{../exec/numerics.html#op-flt}{\mathrm{flt}}_N(z_1, z_2)$$¶

• If either $$z_1$$ or $$z_2$$ is a NaN, then return $$0$$.

• Else if $$z_1$$ and $$z_2$$ are the same value, then return $$0$$.

• Else if $$z_1$$ is positive infinity, then return $$0$$.

• Else if $$z_1$$ is negative infinity, then return $$1$$.

• Else if $$z_2$$ is positive infinity, then return $$1$$.

• Else if $$z_2$$ is negative infinity, then return $$0$$.

• Else if both $$z_1$$ and $$z_2$$ are zeroes, then return $$0$$.

• Else if $$z_1$$ is smaller than $$z_2$$, then return $$1$$.

• Else return $$0$$.

$\begin{split}\begin{array}{@{}lcll} \href{../exec/numerics.html#op-flt}{\mathrm{flt}}_N(\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n), z_2) &=& 0 \\ \href{../exec/numerics.html#op-flt}{\mathrm{flt}}_N(z_1, \pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n)) &=& 0 \\ \href{../exec/numerics.html#op-flt}{\mathrm{flt}}_N(z, z) &=& 0 \\ \href{../exec/numerics.html#op-flt}{\mathrm{flt}}_N(+ \infty, z_2) &=& 0 \\ \href{../exec/numerics.html#op-flt}{\mathrm{flt}}_N(- \infty, z_2) &=& 1 \\ \href{../exec/numerics.html#op-flt}{\mathrm{flt}}_N(z_1, + \infty) &=& 1 \\ \href{../exec/numerics.html#op-flt}{\mathrm{flt}}_N(z_1, - \infty) &=& 0 \\ \href{../exec/numerics.html#op-flt}{\mathrm{flt}}_N(\pm 0, \mp 0) &=& 0 \\ \href{../exec/numerics.html#op-flt}{\mathrm{flt}}_N(z_1, z_2) &=& \href{../exec/numerics.html#aux-bool}{\mathrm{bool}}(z_1 < z_2) \\ \end{array}\end{split}$

### $$\href{../exec/numerics.html#op-fgt}{\mathrm{fgt}}_N(z_1, z_2)$$¶

• If either $$z_1$$ or $$z_2$$ is a NaN, then return $$0$$.

• Else if $$z_1$$ and $$z_2$$ are the same value, then return $$0$$.

• Else if $$z_1$$ is positive infinity, then return $$1$$.

• Else if $$z_1$$ is negative infinity, then return $$0$$.

• Else if $$z_2$$ is positive infinity, then return $$0$$.

• Else if $$z_2$$ is negative infinity, then return $$1$$.

• Else if both $$z_1$$ and $$z_2$$ are zeroes, then return $$0$$.

• Else if $$z_1$$ is larger than $$z_2$$, then return $$1$$.

• Else return $$0$$.

$\begin{split}\begin{array}{@{}lcll} \href{../exec/numerics.html#op-fgt}{\mathrm{fgt}}_N(\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n), z_2) &=& 0 \\ \href{../exec/numerics.html#op-fgt}{\mathrm{fgt}}_N(z_1, \pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n)) &=& 0 \\ \href{../exec/numerics.html#op-fgt}{\mathrm{fgt}}_N(z, z) &=& 0 \\ \href{../exec/numerics.html#op-fgt}{\mathrm{fgt}}_N(+ \infty, z_2) &=& 1 \\ \href{../exec/numerics.html#op-fgt}{\mathrm{fgt}}_N(- \infty, z_2) &=& 0 \\ \href{../exec/numerics.html#op-fgt}{\mathrm{fgt}}_N(z_1, + \infty) &=& 0 \\ \href{../exec/numerics.html#op-fgt}{\mathrm{fgt}}_N(z_1, - \infty) &=& 1 \\ \href{../exec/numerics.html#op-fgt}{\mathrm{fgt}}_N(\pm 0, \mp 0) &=& 0 \\ \href{../exec/numerics.html#op-fgt}{\mathrm{fgt}}_N(z_1, z_2) &=& \href{../exec/numerics.html#aux-bool}{\mathrm{bool}}(z_1 > z_2) \\ \end{array}\end{split}$

### $$\href{../exec/numerics.html#op-fle}{\mathrm{fle}}_N(z_1, z_2)$$¶

• If either $$z_1$$ or $$z_2$$ is a NaN, then return $$0$$.

• Else if $$z_1$$ and $$z_2$$ are the same value, then return $$1$$.

• Else if $$z_1$$ is positive infinity, then return $$0$$.

• Else if $$z_1$$ is negative infinity, then return $$1$$.

• Else if $$z_2$$ is positive infinity, then return $$1$$.

• Else if $$z_2$$ is negative infinity, then return $$0$$.

• Else if both $$z_1$$ and $$z_2$$ are zeroes, then return $$1$$.

• Else if $$z_1$$ is smaller than or equal to $$z_2$$, then return $$1$$.

• Else return $$0$$.

$\begin{split}\begin{array}{@{}lcll} \href{../exec/numerics.html#op-fle}{\mathrm{fle}}_N(\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n), z_2) &=& 0 \\ \href{../exec/numerics.html#op-fle}{\mathrm{fle}}_N(z_1, \pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n)) &=& 0 \\ \href{../exec/numerics.html#op-fle}{\mathrm{fle}}_N(z, z) &=& 1 \\ \href{../exec/numerics.html#op-fle}{\mathrm{fle}}_N(+ \infty, z_2) &=& 0 \\ \href{../exec/numerics.html#op-fle}{\mathrm{fle}}_N(- \infty, z_2) &=& 1 \\ \href{../exec/numerics.html#op-fle}{\mathrm{fle}}_N(z_1, + \infty) &=& 1 \\ \href{../exec/numerics.html#op-fle}{\mathrm{fle}}_N(z_1, - \infty) &=& 0 \\ \href{../exec/numerics.html#op-fle}{\mathrm{fle}}_N(\pm 0, \mp 0) &=& 1 \\ \href{../exec/numerics.html#op-fle}{\mathrm{fle}}_N(z_1, z_2) &=& \href{../exec/numerics.html#aux-bool}{\mathrm{bool}}(z_1 \leq z_2) \\ \end{array}\end{split}$

### $$\href{../exec/numerics.html#op-fge}{\mathrm{fge}}_N(z_1, z_2)$$¶

• If either $$z_1$$ or $$z_2$$ is a NaN, then return $$0$$.

• Else if $$z_1$$ and $$z_2$$ are the same value, then return $$1$$.

• Else if $$z_1$$ is positive infinity, then return $$1$$.

• Else if $$z_1$$ is negative infinity, then return $$0$$.

• Else if $$z_2$$ is positive infinity, then return $$0$$.

• Else if $$z_2$$ is negative infinity, then return $$1$$.

• Else if both $$z_1$$ and $$z_2$$ are zeroes, then return $$1$$.

• Else if $$z_1$$ is smaller than or equal to $$z_2$$, then return $$1$$.

• Else return $$0$$.

$\begin{split}\begin{array}{@{}lcll} \href{../exec/numerics.html#op-fge}{\mathrm{fge}}_N(\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n), z_2) &=& 0 \\ \href{../exec/numerics.html#op-fge}{\mathrm{fge}}_N(z_1, \pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n)) &=& 0 \\ \href{../exec/numerics.html#op-fge}{\mathrm{fge}}_N(z, z) &=& 1 \\ \href{../exec/numerics.html#op-fge}{\mathrm{fge}}_N(+ \infty, z_2) &=& 1 \\ \href{../exec/numerics.html#op-fge}{\mathrm{fge}}_N(- \infty, z_2) &=& 0 \\ \href{../exec/numerics.html#op-fge}{\mathrm{fge}}_N(z_1, + \infty) &=& 0 \\ \href{../exec/numerics.html#op-fge}{\mathrm{fge}}_N(z_1, - \infty) &=& 1 \\ \href{../exec/numerics.html#op-fge}{\mathrm{fge}}_N(\pm 0, \mp 0) &=& 1 \\ \href{../exec/numerics.html#op-fge}{\mathrm{fge}}_N(z_1, z_2) &=& \href{../exec/numerics.html#aux-bool}{\mathrm{bool}}(z_1 \geq z_2) \\ \end{array}\end{split}$

### $$\href{../exec/numerics.html#op-fpmin}{\mathrm{fpmin}}_N(z_1, z_2)$$¶

• If $$z_2$$ is less than $$z_1$$ then return $$z_2$$.

• Else return $$z_1$$.

$\begin{split}\begin{array}{@{}lcll} \href{../exec/numerics.html#op-fpmin}{\mathrm{fpmin}}_N(z_1, z_2) &=& z_2 & (\mathrel{\mbox{if}} \href{../exec/numerics.html#op-flt}{\mathrm{flt}}_N(z_2, z_1) = 1) \\ \href{../exec/numerics.html#op-fpmin}{\mathrm{fpmin}}_N(z_1, z_2) &=& z_1 & (\mathrel{\mbox{otherwise}}) \end{array}\end{split}$

### $$\href{../exec/numerics.html#op-fpmax}{\mathrm{fpmax}}_N(z_1, z_2)$$¶

• If $$z_1$$ is less than $$z_2$$ then return $$z_2$$.

• Else return $$z_1$$.

$\begin{split}\begin{array}{@{}lcll} \href{../exec/numerics.html#op-fpmax}{\mathrm{fpmax}}_N(z_1, z_2) &=& z_2 & (\mathrel{\mbox{if}} \href{../exec/numerics.html#op-flt}{\mathrm{flt}}_N(z_1, z_2) = 1) \\ \href{../exec/numerics.html#op-fpmax}{\mathrm{fpmax}}_N(z_1, z_2) &=& z_1 & (\mathrel{\mbox{otherwise}}) \end{array}\end{split}$

## Conversions¶

### $$\href{../exec/numerics.html#op-extend-u}{\mathrm{extend}^{\mathsf{u}}}_{M,N}(i)$$¶

• Return $$i$$.

$\begin{split}\begin{array}{lll@{\qquad}l} \href{../exec/numerics.html#op-extend-u}{\mathrm{extend}^{\mathsf{u}}}_{M,N}(i) &=& i \\ \end{array}\end{split}$

Note

In the abstract syntax, unsigned extension just reinterprets the same value.

### $$\href{../exec/numerics.html#op-extend-s}{\mathrm{extend}^{\mathsf{s}}}_{M,N}(i)$$¶

• Let $$j$$ be the signed interpretation of $$i$$ of size $$M$$.

• Return the two’s complement of $$j$$ relative to size $$N$$.

$\begin{split}\begin{array}{lll@{\qquad}l} \href{../exec/numerics.html#op-extend-s}{\mathrm{extend}^{\mathsf{s}}}_{M,N}(i) &=& \href{../exec/numerics.html#aux-signed}{\mathrm{signed}}_N^{-1}(\href{../exec/numerics.html#aux-signed}{\mathrm{signed}}_M(i)) \\ \end{array}\end{split}$

### $$\href{../exec/numerics.html#op-wrap}{\mathrm{wrap}}_{M,N}(i)$$¶

• Return $$i$$ modulo $$2^N$$.

$\begin{split}\begin{array}{lll@{\qquad}l} \href{../exec/numerics.html#op-wrap}{\mathrm{wrap}}_{M,N}(i) &=& i \mathbin{\mathrm{mod}} 2^N \\ \end{array}\end{split}$

### $$\href{../exec/numerics.html#op-trunc-u}{\mathrm{trunc}^{\mathsf{u}}}_{M,N}(z)$$¶

• If $$z$$ is a NaN, then the result is undefined.

• Else if $$z$$ is an infinity, then the result is undefined.

• Else if $$z$$ is a number and $$\href{../exec/numerics.html#aux-trunc}{\mathrm{trunc}}(z)$$ is a value within range of the target type, then return that value.

• Else the result is undefined.

$\begin{split}\begin{array}{lll@{\qquad}l} \href{../exec/numerics.html#op-trunc-u}{\mathrm{trunc}^{\mathsf{u}}}_{M,N}(\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n)) &=& \{\} \\ \href{../exec/numerics.html#op-trunc-u}{\mathrm{trunc}^{\mathsf{u}}}_{M,N}(\pm \infty) &=& \{\} \\ \href{../exec/numerics.html#op-trunc-u}{\mathrm{trunc}^{\mathsf{u}}}_{M,N}(\pm q) &=& \href{../exec/numerics.html#aux-trunc}{\mathrm{trunc}}(\pm q) & (\mathrel{\mbox{if}} -1 < \href{../exec/numerics.html#aux-trunc}{\mathrm{trunc}}(\pm q) < 2^N) \\ \href{../exec/numerics.html#op-trunc-u}{\mathrm{trunc}^{\mathsf{u}}}_{M,N}(\pm q) &=& \{\} & (\mathrel{\mbox{otherwise}}) \\ \end{array}\end{split}$

Note

This operator is partial. It is not defined for NaNs, infinities, or values for which the result is out of range.

### $$\href{../exec/numerics.html#op-trunc-s}{\mathrm{trunc}^{\mathsf{s}}}_{M,N}(z)$$¶

• If $$z$$ is a NaN, then the result is undefined.

• Else if $$z$$ is an infinity, then the result is undefined.

• If $$z$$ is a number and $$\href{../exec/numerics.html#aux-trunc}{\mathrm{trunc}}(z)$$ is a value within range of the target type, then return that value.

• Else the result is undefined.

$\begin{split}\begin{array}{lll@{\qquad}l} \href{../exec/numerics.html#op-trunc-s}{\mathrm{trunc}^{\mathsf{s}}}_{M,N}(\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n)) &=& \{\} \\ \href{../exec/numerics.html#op-trunc-s}{\mathrm{trunc}^{\mathsf{s}}}_{M,N}(\pm \infty) &=& \{\} \\ \href{../exec/numerics.html#op-trunc-s}{\mathrm{trunc}^{\mathsf{s}}}_{M,N}(\pm q) &=& \href{../exec/numerics.html#aux-trunc}{\mathrm{trunc}}(\pm q) & (\mathrel{\mbox{if}} -2^{N-1} - 1 < \href{../exec/numerics.html#aux-trunc}{\mathrm{trunc}}(\pm q) < 2^{N-1}) \\ \href{../exec/numerics.html#op-trunc-s}{\mathrm{trunc}^{\mathsf{s}}}_{M,N}(\pm q) &=& \{\} & (\mathrel{\mbox{otherwise}}) \\ \end{array}\end{split}$

Note

This operator is partial. It is not defined for NaNs, infinities, or values for which the result is out of range.

### $$\href{../exec/numerics.html#op-trunc-sat-u}{\mathrm{trunc\_sat\_u}}_{M,N}(z)$$¶

• If $$z$$ is a NaN, then return $$0$$.

• Else if $$z$$ is negative infinity, then return $$0$$.

• Else if $$z$$ is positive infinity, then return $$2^N - 1$$.

• Else, return $$\href{../exec/numerics.html#aux-sat-u}{\mathrm{sat}^{\mathsf{u}}}_N(\href{../exec/numerics.html#aux-trunc}{\mathrm{trunc}}(z))$$.

$\begin{split}\begin{array}{lll@{\qquad}l} \href{../exec/numerics.html#op-trunc-sat-u}{\mathrm{trunc\_sat\_u}}_{M,N}(\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n)) &=& 0 \\ \href{../exec/numerics.html#op-trunc-sat-u}{\mathrm{trunc\_sat\_u}}_{M,N}(- \infty) &=& 0 \\ \href{../exec/numerics.html#op-trunc-sat-u}{\mathrm{trunc\_sat\_u}}_{M,N}(+ \infty) &=& 2^N - 1 \\ \href{../exec/numerics.html#op-trunc-sat-u}{\mathrm{trunc\_sat\_u}}_{M,N}(z) &=& \href{../exec/numerics.html#aux-sat-u}{\mathrm{sat}^{\mathsf{u}}}_N(\href{../exec/numerics.html#aux-trunc}{\mathrm{trunc}}(z)) \\ \end{array}\end{split}$

### $$\href{../exec/numerics.html#op-trunc-sat-s}{\mathrm{trunc\_sat\_s}}_{M,N}(z)$$¶

• If $$z$$ is a NaN, then return $$0$$.

• Else if $$z$$ is negative infinity, then return $$-2^{N-1}$$.

• Else if $$z$$ is positive infinity, then return $$2^{N-1} - 1$$.

• Else, return $$\href{../exec/numerics.html#aux-sat-s}{\mathrm{sat}^{\mathsf{s}}}_N(\href{../exec/numerics.html#aux-trunc}{\mathrm{trunc}}(z))$$.

$\begin{split}\begin{array}{lll@{\qquad}l} \href{../exec/numerics.html#op-trunc-sat-s}{\mathrm{trunc\_sat\_s}}_{M,N}(\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n)) &=& 0 \\ \href{../exec/numerics.html#op-trunc-sat-s}{\mathrm{trunc\_sat\_s}}_{M,N}(- \infty) &=& -2^{N-1} \\ \href{../exec/numerics.html#op-trunc-sat-s}{\mathrm{trunc\_sat\_s}}_{M,N}(+ \infty) &=& 2^{N-1}-1 \\ \href{../exec/numerics.html#op-trunc-sat-s}{\mathrm{trunc\_sat\_s}}_{M,N}(z) &=& \href{../exec/numerics.html#aux-sat-s}{\mathrm{sat}^{\mathsf{s}}}_N(\href{../exec/numerics.html#aux-trunc}{\mathrm{trunc}}(z)) \\ \end{array}\end{split}$

### $$\href{../exec/numerics.html#op-promote}{\mathrm{promote}}_{M,N}(z)$$¶

• If $$z$$ is a canonical NaN, then return an element of $$\href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{\}$$ (i.e., a canonical NaN of size $$N$$).

• Else if $$z$$ is a NaN, then return an element of $$\href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(1)\}$$ (i.e., any arithmetic NaN of size $$N$$).

• Else, return $$z$$.

$\begin{split}\begin{array}{lll@{\qquad}l} \href{../exec/numerics.html#op-promote}{\mathrm{promote}}_{M,N}(\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n)) &=& \href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{\} & (\mathrel{\mbox{if}} n = \href{../syntax/values.html#aux-canon}{\mathrm{canon}}_N) \\ \href{../exec/numerics.html#op-promote}{\mathrm{promote}}_{M,N}(\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n)) &=& \href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{+ \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(1)\} & (\mathrel{\mbox{otherwise}}) \\ \href{../exec/numerics.html#op-promote}{\mathrm{promote}}_{M,N}(z) &=& z \\ \end{array}\end{split}$

### $$\href{../exec/numerics.html#op-demote}{\mathrm{demote}}_{M,N}(z)$$¶

• If $$z$$ is a canonical NaN, then return an element of $$\href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{\}$$ (i.e., a canonical NaN of size $$N$$).

• Else if $$z$$ is a NaN, then return an element of $$\href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(1)\}$$ (i.e., any NaN of size $$N$$).

• Else if $$z$$ is an infinity, then return that infinity.

• Else if $$z$$ is a zero, then return that zero.

• Else, return $$\href{../exec/numerics.html#aux-ieee}{\mathrm{float}}_N(z)$$.

$\begin{split}\begin{array}{lll@{\qquad}l} \href{../exec/numerics.html#op-demote}{\mathrm{demote}}_{M,N}(\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n)) &=& \href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{\} & (\mathrel{\mbox{if}} n = \href{../syntax/values.html#aux-canon}{\mathrm{canon}}_N) \\ \href{../exec/numerics.html#op-demote}{\mathrm{demote}}_{M,N}(\pm \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(n)) &=& \href{../exec/numerics.html#aux-nans}{\mathrm{nans}}_N\{+ \href{../syntax/values.html#syntax-float}{\mathsf{nan}}(1)\} & (\mathrel{\mbox{otherwise}}) \\ \href{../exec/numerics.html#op-demote}{\mathrm{demote}}_{M,N}(\pm \infty) &=& \pm \infty \\ \href{../exec/numerics.html#op-demote}{\mathrm{demote}}_{M,N}(\pm 0) &=& \pm 0 \\ \href{../exec/numerics.html#op-demote}{\mathrm{demote}}_{M,N}(\pm q) &=& \href{../exec/numerics.html#aux-ieee}{\mathrm{float}}_N(\pm q) \\ \end{array}\end{split}$

### $$\href{../exec/numerics.html#op-convert-u}{\mathrm{convert}^{\mathsf{u}}}_{M,N}(i)$$¶

• Return $$\href{../exec/numerics.html#aux-ieee}{\mathrm{float}}_N(i)$$.

$\begin{split}\begin{array}{lll@{\qquad}l} \href{../exec/numerics.html#op-convert-u}{\mathrm{convert}^{\mathsf{u}}}_{M,N}(i) &=& \href{../exec/numerics.html#aux-ieee}{\mathrm{float}}_N(i) \\ \end{array}\end{split}$

### $$\href{../exec/numerics.html#op-convert-s}{\mathrm{convert}^{\mathsf{s}}}_{M,N}(i)$$¶

• Let $$j$$ be the signed interpretation of $$i$$.

• Return $$\href{../exec/numerics.html#aux-ieee}{\mathrm{float}}_N(j)$$.

$\begin{split}\begin{array}{lll@{\qquad}l} \href{../exec/numerics.html#op-convert-s}{\mathrm{convert}^{\mathsf{s}}}_{M,N}(i) &=& \href{../exec/numerics.html#aux-ieee}{\mathrm{float}}_N(\href{../exec/numerics.html#aux-signed}{\mathrm{signed}}_M(i)) \\ \end{array}\end{split}$

### $$\href{../exec/numerics.html#op-reinterpret}{\mathrm{reinterpret}}_{t_1,t_2}(c)$$¶

• Let $$d^\ast$$ be the bit sequence $$\href{../exec/numerics.html#aux-bits}{\mathrm{bits}}_{t_1}(c)$$.

• Return the constant $$c'$$ for which $$\href{../exec/numerics.html#aux-bits}{\mathrm{bits}}_{t_2}(c') = d^\ast$$.

$\begin{split}\begin{array}{lll@{\qquad}l} \href{../exec/numerics.html#op-reinterpret}{\mathrm{reinterpret}}_{t_1,t_2}(c) &=& \href{../exec/numerics.html#aux-bits}{\mathrm{bits}}_{t_2}^{-1}(\href{../exec/numerics.html#aux-bits}{\mathrm{bits}}_{t_1}(c)) \\ \end{array}\end{split}$

### $$\href{../exec/numerics.html#op-narrow-s}{\mathrm{narrow}^{\mathsf{s}}}_{M,N}(i)$$¶

• Let $$j$$ be the signed interpretation of $$i$$ of size $$M$$.

• Return $$\href{../exec/numerics.html#aux-sat-s}{\mathrm{sat}^{\mathsf{s}}}_N(j)$$.

$\begin{array}{lll@{\qquad}l} \href{../exec/numerics.html#op-narrow-s}{\mathrm{narrow}^{\mathsf{s}}}_{M,N}(i) &=& \href{../exec/numerics.html#aux-sat-s}{\mathrm{sat}^{\mathsf{s}}}_N(\href{../exec/numerics.html#aux-signed}{\mathrm{signed}}_M(i)) \end{array}$

### $$\href{../exec/numerics.html#op-narrow-u}{\mathrm{narrow}^{\mathsf{u}}}_{M,N}(i)$$¶

• Let $$j$$ be the signed interpretation of $$i$$ of size $$M$$.

• Return $$\href{../exec/numerics.html#aux-sat-u}{\mathrm{sat}^{\mathsf{u}}}_N(j)$$.

$\begin{array}{lll@{\qquad}l} \href{../exec/numerics.html#op-narrow-u}{\mathrm{narrow}^{\mathsf{u}}}_{M,N}(i) &=& \href{../exec/numerics.html#aux-sat-u}{\mathrm{sat}^{\mathsf{u}}}_N(\href{../exec/numerics.html#aux-signed}{\mathrm{signed}}_M(i)) \end{array}$