An implementation of exactly evaluated floating point predicates for arbitrary expressions composed of additions and multiplications.
This provides an expression template system for representing arbitrary expressions at compile-time, and eval functions for evaluating the expressions with various guarantees.
Example:
auto expr = (mult_expr(5, 8.0) + 12.0 - 5.6) * 0.23;
std::optional<float> evaluatable = correct_eval<float>(expr * 32.2);
if(!evaluatable) {
float exact_sign = adaptive_eval<float>(expr * 32.0);
float rounded_exact_val = exact_eval<float>(expr * 32.0);
}
There are 5 primary eval_methods:
correct_eval: Returns anstd::optionalafter evaluating with the specified floating point type if the result is guaranteed to have the correct sign based on relative error analysis. This useseval_checked_fastandeval_with_errinternally; if both of those fail to give a result with the correct sign, it returnsstd::nullopt. If you need to run on vector types, useeval_checked_fastoreval_with_errdirectly instead.eval_checked_fast: Returns a pair with the approximate result as the first parameter, and a bool indicating whether relative error analysis is useful on the result. If the result isn't guaranteed to have the correct sign based on relative error analysis, then the first return is NaN. This works withvector_type. Note that this is not as discerning aseval_with_err, and it can only function with at most one addition of opposite signs (or subtraction of the same signs) above the 2nd level of the expression tree, though when it works it is much faster with expressions with only a few additions/subtractions above the 2nd level of the expression tree.eval_with_err: Returns a pair containing the approximate result and the estimated error if the result is guaranteed to have the correct sign based on interval arithmetic. If not, then it returns NaN. This works with vector types with an associatedselectmethod.adaptive_eval: Returns a value with the correct sign, but not necessarily a good representative of the exact result. For latency sensitive applications, use this. This runseval_checked_fastbefore attempting to perform adaptive evaluation, and if that fails, uses interval arithmetic to ensure that no sign errors can occur. After that, it uses a simple compile-time latency analysis to try to choose the sub-tree to evaluate. This does not support vector types, and adaptive evaluation that ends up exactly evaluating the entire expression will be more expensive thanexact_eval.exact_eval: Computes the result exactly, and then rounds it to the specified output format. This is consistently very slow, however, unlike adaptive_eval, it supports running on SIMD types that satisfy thevector_typeconcept. For high throughput applications, use this on expressions after determining they need exact evaluation.
Note: To run efficiently on a GPU, use the provided GPUVec type as the eval_type parameter to exact_eval
The arith_expr class represents the expression template as a binary tree; it takes 3 template parameters:
Op: A functor representing the arithmetic expression to be applied; generally one ofstd::plus,std::minus, orstd::multipliesLHS: The expression (or number) on the left hand side of the operatorRHS: The expression (or number) on the right hand side of the operator and provides 2 methods:lhs: Returns the expression (or number) on the left hand side of the operatorrhs: Returns the expression (or number) on the right hand side of the operator
This computes exactly by representing intermediate results as a finite series. For example, 435.75 might be represented as (430.0 + 5.0 + 0.75), or as (195.0 - 200.0 + 195.0 + 200.75 - 55.0). Then, results are computed exactly by appending the necessary values to the series. Addition and subtraction are straight-forward, they just append the operand to the series. Multiplication computes the exact product of every pair in the two series being multiplied and creates a new series from the result. For example, (30.0 + 2.0) * (10.0 + 0.5) = (30.0 * 10.0 + 30.0 * 0.5 + 2.0 * 10.0 + 2.0 * 0.5), which might expand to (300.0 + 10.0 + 5.0 + 20.0 + 1.0). Division can't currently be computed exactly; but todo in the future, it will be handled by rewriting the expression in terms of multiplications. Todo in the future, square-roots will be handled with the repeated squaring method; rewriting (a + sqrt(b)) as (a - sqrt(b)) * (a + sqrt(b)) / (a - sqrt(b)) = (a * a - b) / (a - sqrt(b)).
There are three concepts that are important to this library:
arith_number: Any data-type that has+,-,*, operators and functionsabsandmul_sub.absis the regular absolute value function, andmul_subcomputesa * b - crounded to1/2 epsilonfor your data-type.mul_subis typically implemented withfmaasfma(a, b, -c)scalar_type: Anyarith_numberthat does not have an[]operator.vector_type: Anyarith_numberthat also has an[]operator and aselectmethod.selecttakes 3 parameters: the first a parameterpwith the type returned byv1 > v2forvector_typesv1andv2, and 2vector_typeparameters, and returns avector_typeparameter containing elements fromv1when the corresponding element inpis true or the element fromv2when false.