Merged
13 changes: 9 additions & 4 deletions docs/simfil-language.md
@@ -136,7 +136,7 @@ count(mylist.*)

## Types

-Simfil supports the following scalar types: `null`, `bool`, `int`, `float` (double precision), `string` and `re`.
+Simfil supports the following scalar types: `null`, `bool`, `int`, `float` (double precision), `string`, `bytes` and `re`.
Additionally, the `model` type represents compound object/array container nodes.
All values but `null` and `false` are considered `true`, implicit boolean conversion takes place for operators
`and` and `or` only.
@@ -151,6 +151,11 @@ The following types can be target types for a cast:
* `int` - Converts the value to an integer. Returns 0 on failure.
* `float` - Converts the value to a float. Returns 0 on failure.
* `string` - Converts the value to a string. Boolean values are converted to either "true" or "false".
* `bytes` - Converts the value to bytes.

Byte literals are written using the `b` prefix, e.g. `b"hello"` or `b'hello'`.
Escape sequences `\n`, `\r`, `\t`, `\\`, `\"`, and `\'` are supported.
Bytes can also be written explicitly using `\xNN` (hex), e.g. `b"\x41\x00"`.
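
The escape rules above can be sketched as a small standalone decoder. This is an illustration only, not the parser added by this PR: `decodeByteLiteralBody` and `hexValue` are hypothetical names, the input is assumed to be the text between the quotes, and rejecting unknown or truncated escapes is an assumption, since the documentation does not say how simfil treats them.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper: value of one hex digit, or -1 if `c` is not a hex digit.
inline int hexValue(char c)
{
    if ('0' <= c && c <= '9') return c - '0';
    if ('a' <= c && c <= 'f') return c - 'a' + 10;
    if ('A' <= c && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical helper: decode the text between the quotes of a byte literal.
// Returns std::nullopt for malformed input (assumed behavior, see lead-in).
inline std::optional<std::string> decodeByteLiteralBody(std::string_view body)
{
    std::string out;
    out.reserve(body.size());
    for (std::size_t i = 0; i < body.size(); ++i) {
        if (body[i] != '\\') {
            out.push_back(body[i]); // ordinary byte, copied verbatim
            continue;
        }
        if (++i >= body.size())
            return std::nullopt; // dangling backslash
        switch (body[i]) {
        case 'n': out.push_back('\n'); break;
        case 'r': out.push_back('\r'); break;
        case 't': out.push_back('\t'); break;
        case '\\': out.push_back('\\'); break;
        case '"': out.push_back('"'); break;
        case '\'': out.push_back('\''); break;
        case 'x': {
            if (i + 2 >= body.size())
                return std::nullopt; // \x needs exactly two hex digits
            const int upper = hexValue(body[i + 1]);
            const int lower = hexValue(body[i + 2]);
            if (upper < 0 || lower < 0)
                return std::nullopt;
            out.push_back(static_cast<char>((upper << 4) | lower));
            i += 2;
            break;
        }
        default:
            return std::nullopt; // unknown escape
        }
    }
    return out;
}

// Usage: the body of b"\x41\x00" decodes to the two bytes 0x41 ('A') and 0x00.
inline void exampleDecode()
{
    const auto decoded = decodeByteLiteralBody(R"(\x41\x00)");
    assert(decoded && decoded->size() == 2 && (*decoded)[0] == 'A');
}
```

Returning `std::optional` keeps the error path explicit, which matches the style of `ByteArray::fromHex` in this change.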

## Operators

@@ -161,12 +166,12 @@ The following types can be target types for a cast:
| `[ a ]` | Array/Object subscript, index expression can be of type `int` or `string`. |
| `{ a }` | Sub-Query (inside sub-query `_` represents the value the query is applied to). |
| `. b` or `a . b` | Direct field access; returns the value of field `b` or `null`. |
-| `a as b` | Cast a to type b (one of `bool`, `int`, `float` or `string`). |
+| `a as b` | Cast a to type b (one of `bool`, `int`, `float`, `string` or `bytes`). |
| `a ?` | Get boolean value of `a` (see ##Types). |
| `a ...` | Unpacks `a` to a list of values (see function `range` under [Functions](#Functions) for example) |
-| `typeof a` | Returns the type of the value of its expression (`"null"`, `"bool"`, `"int"`, `"float"` or `"string"`). |
+| `typeof a` | Returns the type of the value of its expression (`"null"`, `"bool"`, `"int"`, `"float"`, `"string"` or `"bytes"`). |
| `not a` | Boolean not. |
-| `# a` | Returns the length of a string or array value. |
+| `# a` | Returns the length of a string, bytes, or array value. |
| `~ a` | Bitwise not. |
| `- a` | Unary minus. |
| `a * b` | Multiplication. |
129 changes: 129 additions & 0 deletions include/simfil/byte-array.h
@@ -0,0 +1,129 @@
// Copyright (c) Navigation Data Standard e.V. - See "LICENSE" file.
#pragma once

#include <cstdint>
#include <cstring>
#include <iterator>
#include <optional>
#include <string>
#include <string_view>
#include <utility>

#include <fmt/format.h>

namespace simfil
{

struct ByteArray
{
    std::string bytes;

    ByteArray() = default;

    explicit ByteArray(const char* data)
        : bytes(data)
    {}

    explicit ByteArray(std::string_view data)
        : bytes(data)
    {}

    explicit ByteArray(std::string data)
        : bytes(std::move(data))
    {}

    auto operator==(const ByteArray&) const -> bool = default;

    [[nodiscard]] static std::optional<ByteArray> fromHex(std::string_view hex)
    {
        if (hex.size() % 2 != 0)
            return std::nullopt;

        std::string decoded;
        decoded.reserve(hex.size() / 2);
        for (size_t i = 0; i < hex.size(); i += 2) {
            const auto upper = decodeHexNibble(hex[i]);
            const auto lower = decodeHexNibble(hex[i + 1]);
            if (upper < 0 || lower < 0)
                return std::nullopt;
            decoded.push_back(static_cast<char>((upper << 4) | lower));
        }

        return ByteArray{std::move(decoded)};
    }

    [[nodiscard]] std::optional<int64_t> decodeBigEndianI64() const
    {
        if (bytes.size() > 8) {
            for (size_t i = 8; i < bytes.size(); ++i) {
                if (static_cast<unsigned char>(bytes[i]) != 0)
                    return std::nullopt;
            }
        }

Big-endian decode logic truncates wrong bytes for overflow

High Severity

The decodeBigEndianI64 function incorrectly handles byte arrays longer than 8 bytes. It checks if trailing bytes (indices 8+) are zero and then uses the leading 8 bytes. In big-endian, trailing bytes are the least significant, so this logic rejects valid values that fit in 64 bits (like 256 in a 10-byte array) while accepting values that overflow (like 2^72 truncated to its high bytes). The check for overflow needs to verify the leading excess bytes are zero (or proper sign extension), not the trailing ones.
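
One way the check described in this comment could look is sketched below. It is a standalone illustration, not the fix that was merged: `decodeBigEndianI64Checked` is a hypothetical free function, and it assumes excess leading bytes must all be zero (sign-extended `0xFF` padding is rejected). As in the original, the final eight bytes are reinterpreted bit-for-bit as an `int64_t`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical variant of ByteArray::decodeBigEndianI64 that treats the
// *leading* bytes as the excess ones.
inline std::optional<std::int64_t> decodeBigEndianI64Checked(std::string_view bytes)
{
    std::size_t first = 0;
    if (bytes.size() > 8) {
        first = bytes.size() - 8;
        // In big-endian order the most significant bytes come first, so any
        // non-zero byte before the final eight means the value needs more
        // than 64 bits.
        for (std::size_t i = 0; i < first; ++i) {
            if (static_cast<unsigned char>(bytes[i]) != 0)
                return std::nullopt;
        }
    }

    // Accumulate the (at most eight) least significant bytes.
    std::uint64_t value = 0;
    for (std::size_t i = first; i < bytes.size(); ++i)
        value = (value << 8) | static_cast<unsigned char>(bytes[i]);

    // Reinterpret the bit pattern as signed, as the original code does.
    std::int64_t signedValue = 0;
    std::memcpy(&signedValue, &value, sizeof(signedValue));
    return signedValue;
}

// Usage: the two cases named in the review comment.
inline void exampleDecodeBigEndian()
{
    std::string tenBytes(10, '\0');
    tenBytes[8] = 0x01; // low two bytes are 0x01 0x00, i.e. 256
    assert(decodeBigEndianI64Checked(tenBytes) == std::optional<std::int64_t>(256));

    tenBytes[0] = 0x01; // sets a bit above the 64-bit range
    assert(!decodeBigEndianI64Checked(tenBytes));
}
```

Whether sign-extended padding should also be accepted is a design decision for the PR author; the sketch takes the stricter option.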


        const size_t count = bytes.size() <= 8 ? bytes.size() : 8;
        uint64_t value = 0;
        for (size_t i = 0; i < count; ++i) {
            value = (value << 8) | static_cast<unsigned char>(bytes[i]);
        }

        int64_t signedValue = 0;
        std::memcpy(&signedValue, &value, sizeof(signedValue));
        return signedValue;
    }

    [[nodiscard]] std::string toHex(bool uppercase = true) const
    {
        std::string out;
        out.reserve(bytes.size() * 2);

        if (uppercase) {
            for (unsigned char byte : bytes)
                fmt::format_to(std::back_inserter(out), FMT_STRING("{:02X}"), byte);
        } else {
            for (unsigned char byte : bytes)
                fmt::format_to(std::back_inserter(out), FMT_STRING("{:02x}"), byte);
        }

        return out;
    }

    [[nodiscard]] std::string toLiteral() const
    {
        std::string out;
        out.reserve(bytes.size() + 3);
        out += "b\"";

        for (unsigned char byte : bytes) {
            switch (byte) {
            case '\\': out += "\\\\"; break;
            case '"': out += "\\\""; break;
            case '\n': out += "\\n"; break;
            case '\r': out += "\\r"; break;
            case '\t': out += "\\t"; break;
            default:
                if (byte < 0x20 || byte >= 0x7f)
                    fmt::format_to(std::back_inserter(out), FMT_STRING("\\x{:02X}"), byte);
                else
                    out.push_back(static_cast<char>(byte));
                break;
            }
        }

        out.push_back('"');
        return out;
    }

    [[nodiscard]] static auto decodeHexNibble(char c) -> int
    {
        if ('0' <= c && c <= '9')
            return c - '0';
        if ('a' <= c && c <= 'f')
            return c - 'a' + 10;
        if ('A' <= c && c <= 'F')
            return c - 'A' + 10;
        return -1;
    }
};

} // namespace simfil
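
For readers who want to try the hex encoding without the `fmt` dependency, a lookup-table variant is sketched below. It is an alternative technique, not the `toHex` implementation in this header, which formats each byte with `fmt::format_to`; `toHexSketch` is a hypothetical name.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical standalone counterpart to ByteArray::toHex: each byte is
// split into two nibbles and mapped through a digit table.
inline std::string toHexSketch(std::string_view bytes, bool uppercase = true)
{
    const char* digits = uppercase ? "0123456789ABCDEF" : "0123456789abcdef";
    std::string out;
    out.reserve(bytes.size() * 2);
    for (unsigned char byte : bytes) {
        out.push_back(digits[byte >> 4]);   // high nibble
        out.push_back(digits[byte & 0x0F]); // low nibble
    }
    return out;
}

// Usage: an embedded NUL byte is encoded like any other byte.
inline void exampleToHex()
{
    const std::string_view raw("\x41\x00", 2);
    assert(toHexSketch(raw) == "4100");
}
```

Iterating as `unsigned char`, as the header does, avoids negative indices for bytes at or above `0x80` on platforms where `char` is signed.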
100 changes: 94 additions & 6 deletions include/simfil/model/model.h
@@ -2,14 +2,18 @@
#pragma once

#include "simfil/model/string-pool.h"
#include "simfil/byte-array.h"
#include "tl/expected.hpp"
#if defined(SIMFIL_WITH_MODEL_JSON)
# include "nlohmann/json.hpp"
#endif

#include <memory>
#include <string_view>
#include <type_traits>
#include <vector>
#include <utility>
#include <cassert>
#include <istream>
#include <ostream>

@@ -19,6 +23,38 @@
namespace simfil
{

namespace res
{
// Tag type for ADL-based resolve hooks implemented by model libraries.
template<typename Target>
struct tag {};
}

namespace detail
{
template<class T>
concept HasModelType = requires { typename T::ModelType; };

template<HasModelType T>
using ModelTypeOf = typename T::ModelType;
}

/**
* ADL customization point for typed node resolution.
* Libraries define resolveInternal(tag, model, node) in their namespace.
*/
template<typename Target, typename ModelType>
model_ptr<Target> resolveInternal(res::tag<Target>, ModelType const&, ModelNode const&) = delete;

class ModelPool;

// Built-in resolve hooks for core node types. Declared here so ADL sees them
// across translation units without relying on friend injection.
template<>
model_ptr<Object> resolveInternal(res::tag<Object>, ModelPool const&, ModelNode const&);
template<>
model_ptr<Array> resolveInternal(res::tag<Array>, ModelPool const&, ModelNode const&);

/**
* Basic node model which only resolves trivial node types.
*/
@@ -58,6 +94,60 @@ class Model : public std::enable_shared_from_this<Model>
     */
    virtual tl::expected<void, Error> resolve(ModelNode const& n, ResolveFn const& cb) const;

    /**
     * Resolve a node to a specific ModelNode subtype using ADL hooks.
     * This provides a clean cast API without exposing model internals.
     */
    template<typename Target = ModelNode>
    model_ptr<Target> resolve(ModelNodeAddress const& address) const
    {
        if constexpr (std::is_same_v<Target, ModelNode>) {
            return ModelNode::Ptr::make(shared_from_this(), address);
        }
        return resolve<Target>(*ModelNode::Ptr::make(shared_from_this(), address));
    }

    template<typename Target = ModelNode>
    model_ptr<Target> resolve(ModelNodeAddress const& address, ScalarValueType data) const
    {
        if constexpr (std::is_same_v<Target, ModelNode>) {
            return ModelNode::Ptr::make(shared_from_this(), address, std::move(data));
        }
        return resolve<Target>(*ModelNode::Ptr::make(shared_from_this(), address, std::move(data)));
    }

    template<typename Target>
    model_ptr<Target> resolve(ModelNode::Ptr const& node) const
    {
        return resolve<Target>(*node);
    }

    template<typename Target>
    model_ptr<Target> resolve(ModelNode const& node) const
    {
        if constexpr (std::is_same_v<Target, ModelNode>) {
            return model_ptr<ModelNode>(node);
        }
        else {
            if constexpr (!detail::HasModelType<Target>) {
                static_assert(detail::HasModelType<Target>, "Target must provide a ModelType alias.");
                return {};
            }
            else {
                using ModelType = detail::ModelTypeOf<Target>;
#if !defined(NDEBUG)
                // In debug builds, validate the model type to catch misuse early.
                auto typedModel = dynamic_cast<ModelType const*>(this);
                assert(typedModel && "resolve<T> called on incompatible model type.");
                return resolveInternal(res::tag<Target>{}, *typedModel, node);
#else
                // In release builds, avoid RTTI overhead on this hot path.
                return resolveInternal(res::tag<Target>{}, *static_cast<ModelType const*>(this), node);
#endif
            }
        }
    }

    /** Add a small scalar value and get its model node view */
    ModelNode::Ptr newSmallValue(bool value);
    ModelNode::Ptr newSmallValue(int16_t value);
@@ -88,6 +178,8 @@ class ModelPool : public Model
    template<typename, typename> friend struct BaseArray;

public:
    // Keep Model::resolve<T> overloads visible alongside the virtual resolve override.
    using Model::resolve;
    /**
     * The pool consists of multiple ModelNode columns,
     * each for a different data type. Each column
@@ -100,6 +192,7 @@ class ModelPool : public Model
        Double,
        String,
        PooledString,
        ByteArray,

        FirstCustomColumnId = 128,
    };
@@ -154,14 +247,9 @@ class ModelPool : public Model
     ModelNode::Ptr newValue(int64_t const& value);
     ModelNode::Ptr newValue(double const& value);
     ModelNode::Ptr newValue(std::string_view const& value);
+    ModelNode::Ptr newValue(simfil::ByteArray const& value);
     ModelNode::Ptr newValue(StringId handle);
 
-    /** Node-type-specific resolve-functions */
-    [[nodiscard]]
-    model_ptr<Object> resolveObject(ModelNode::Ptr const& n) const;
-    [[nodiscard]]
-    model_ptr<Array> resolveArray(ModelNode::Ptr const& n) const;
-
     /** Access the field name storage */
     [[nodiscard]]
     std::shared_ptr<StringPool> strings() const;
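
The tag-based resolve hook that replaces `resolveObject`/`resolveArray` in this header can be illustrated with a reduced, self-contained sketch. Everything in it is hypothetical (`core`, `lib`, `Node`, `Feature`, `FeatureModel`), and it simplifies the real code in two ways: the library hook is a plain overload rather than an explicit specialization, and `std::shared_ptr` stands in for `model_ptr`.

```cpp
#include <cassert>
#include <memory>
#include <string>

namespace core {

// Tag type carrying the requested target type into overload resolution.
template <typename Target>
struct tag {};

struct Node { int id; };

// Fallback hook, deleted on purpose: resolving a target that has no hook
// of its own fails at compile time instead of at run time.
template <typename Target, typename ModelT>
std::shared_ptr<Target> resolveInternal(tag<Target>, ModelT const&, Node const&) = delete;

struct Model
{
    template <typename Target>
    std::shared_ptr<Target> resolve(Node const& node) const
    {
        // The target names the model type it belongs to.
        using ModelT = typename Target::ModelType;
        // Unqualified call: argument-dependent lookup also searches the
        // namespaces of Target and ModelT, so library hooks are found.
        return resolveInternal(tag<Target>{}, static_cast<ModelT const&>(*this), node);
    }
};

} // namespace core

namespace lib {

struct FeatureModel;

struct Feature
{
    using ModelType = FeatureModel;
    std::string name;
};

struct FeatureModel : core::Model
{
    std::string prefix = "feature-";
};

// Library hook, found via ADL and preferred over the deleted template
// because a non-template overload wins when both match exactly.
inline std::shared_ptr<Feature> resolveInternal(
    core::tag<Feature>, FeatureModel const& model, core::Node const& node)
{
    return std::make_shared<Feature>(Feature{model.prefix + std::to_string(node.id)});
}

} // namespace lib

// Usage: the caller names only the target type; the hook is picked by ADL.
inline void exampleResolve()
{
    lib::FeatureModel model;
    const auto feature = model.resolve<lib::Feature>(core::Node{7});
    assert(feature && feature->name == "feature-7");
}
```

The sketch always uses `static_cast` for the downcast; the header shown above additionally validates the model type with `dynamic_cast` in debug builds.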