Merged
23 commits
22155ff
Release 1.0.0
josephbirkner May 12, 2026
275f500
Basic Schema Implementation
johannes-wolf Apr 20, 2026
7ffc2f3
Harden ConstExpr Constructor
johannes-wolf May 9, 2026
3b79989
Add Root Pruning Test
johannes-wolf May 9, 2026
2965362
Improve AST Rewrites for Schema Support
johannes-wolf May 17, 2026
ea6e02f
Address schema pruning review findings
josephbirkner May 19, 2026
06c9fab
Allow late schema ids for external model arrays
josephbirkner May 19, 2026
dc1d429
Cache schema-guided wildcard field plans
josephbirkner May 20, 2026
2f4bb3f
Add sparse wide schema wildcard benchmark
josephbirkner May 20, 2026
5be874c
Benchmark wildcard field plans against basic pruning
josephbirkner May 21, 2026
e715d5e
Benchmark sparse wildcard against exact path
josephbirkner May 21, 2026
054b4b0
Merge pull request #144 from Klebert-Engineering/schema-field-pruning
josephbirkner May 21, 2026
9e9dfa1
Add enum symbols to schema metadata
josephbirkner May 21, 2026
b37fa78
Deduplicate schema metadata collection
josephbirkner May 21, 2026
65482b8
Complete schema fields and enum symbols
josephbirkner May 22, 2026
7a27ead
Merge pull request #148 from Klebert-Engineering/issue-146-schema-enums
josephbirkner May 22, 2026
a58b5dc
Rewrite auto-wildcard queries with schema paths
josephbirkner May 28, 2026
6961c23
Expose referenced schema paths for queries
josephbirkner May 29, 2026
d4c3fa7
Fix simfil string pool copy.
josephbirkner Jun 3, 2026
13ce569
Add schema-aware query rewrites
josephbirkner Jun 4, 2026
b85bb62
Avoid Helgrind false positive in StringPool copy
josephbirkner Jun 5, 2026
9c328f0
Add schema-aware query term extraction
josephbirkner Jun 8, 2026
003b8b4
Support schema-backed enum query shorthand
josephbirkner Jun 9, 2026
67 changes: 60 additions & 7 deletions docs/simfil-language.md
@@ -34,14 +34,66 @@ Example: `*.["field name with spaces"]`.

### Symbols

Simfil parses identifiers containing only uppercase letters and underscores
as strings, but only if not on either side of a path operator `.`.
This means, that expressions like `**.field = ABC` get parsed as
`**.field = "ABC"`. Note that this is not the case if a symbol appears on
either side of `.`, such as `ABC.field`!
Simfil parses unquoted identifiers as field names. String values should be
written as quoted literals, for example `field = "ABC"`.

To force parsing a symbol as a field, you can put it in a path expression:
`_.FIELD` or use the subscript operator `[FIELD]`.
When schema metadata is available, the compiler may reinterpret an unquoted
standalone token as a schema symbol. This is a schema rewrite, not a parser
rule: without schema metadata, `ABC` is the field `ABC`.

### Schema-Aware Field and Enum Resolution

When the caller supplies a schema for the current model, simfil can use that schema while compiling, completing, and evaluating path expressions. This keeps short queries practical without changing the core path syntax.

Schema-aware behavior is conservative:

- A standalone scalar field name can resolve to the concrete schema path that owns that field.
- A recursive wildcard such as `**.speedLimitKmh` can skip schema branches that cannot contain `speedLimitKmh`, which avoids scanning arbitrary object branches when the schema is precise.
- An unquoted operand can resolve to a string constant when the schema proves it belongs to an enum domain.
- A standalone enum-like string literal can resolve to an equality comparison against the schema path that owns that enum value.
- If the same token can mean both a field and an enum-like value, field access wins. This keeps schema shorthand aligned with normal unquoted identifiers.

Examples:

```simfil
speedLimitKmh > 80
```

can be compiled against a schema as the concrete path that owns `speedLimitKmh`, while:

```simfil
SPEED_LIMIT_END
```

can be compiled as a comparison against schema paths whose enum domain contains `SPEED_LIMIT_END`.

Other common schema-aware patterns:

```simfil
**.speedLimitKmh > 80
```

uses the recursive wildcard syntax while still allowing schema-guided pruning, and:

```simfil
"SPEED_LIMIT_END"
```

uses the quoted enum value inserted by schema-aware completion. Schema mode can
also resolve unquoted enum tokens when the schema proves they are enum values,
but quoted strings are the explicit representation.

Use explicit path syntax when you want to force field access:

```simfil
_.WARNING_SIGN
```

or:

```simfil
["WARNING_SIGN"]
```
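
The precedence rules above can be sketched in C++ as follows. This is a minimal illustration under stated assumptions, not simfil's implementation: `ToySchema`, `FieldAccess`, `EnumConstant`, and `resolveToken` are hypothetical names, and the real compiler presumably rewrites AST nodes rather than plain strings.

```cpp
#include <set>
#include <string>
#include <variant>

// Hypothetical stand-in for schema metadata: the set of known field names
// and the set of known enum symbols. Not part of the simfil API.
struct ToySchema {
    std::set<std::string> fields;
    std::set<std::string> enumSymbols;
};

struct FieldAccess { std::string name; };    // token is read as a field
struct EnumConstant { std::string value; };  // token is read as a string constant
using Resolved = std::variant<FieldAccess, EnumConstant>;

// Resolve an unquoted standalone token.
// - Without a schema, the token is always a field (the parser rule).
// - With a schema, a field match wins over an enum match.
// - A token unknown to the schema stays a field.
inline Resolved resolveToken(const std::string& token, const ToySchema* schema)
{
    if (!schema)
        return FieldAccess{token};
    if (schema->fields.count(token) > 0)
        return FieldAccess{token};
    if (schema->enumSymbols.count(token) > 0)
        return EnumConstant{token};
    return FieldAccess{token};
}
```

With a schema in which `WARNING_SIGN` is both a field and an enum symbol, this sketch resolves `WARNING_SIGN` to a field access and `SPEED_LIMIT_END` to a string constant, matching the "field access wins" rule described above.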

### Sub-Queries

@@ -224,6 +276,7 @@ of `expr` are stored for debugging purposes; see `limit`.
*Example*
```
trace(a.**.b{trace("sub", c == "test")})
trace(**.speedLimitKmh, name="speed limits")
```

Arguments:
19 changes: 19 additions & 0 deletions include/simfil/environment.h
@@ -22,6 +22,7 @@ namespace simfil
class Expr;
class Function;
class Diagnostics;
class Schema;
struct ResultFn;
struct Debug;

@@ -61,6 +62,8 @@ struct Trace
struct Environment
{
public:
using QuerySchemaCallback = std::function<const Schema*(SchemaId)>;

/**
* Construct a SIMFIL execution environment with a string cache,
* which is used to map field names to short integer IDs.
@@ -116,6 +119,12 @@ struct Environment
[[nodiscard]]
auto strings() const -> std::shared_ptr<StringPool>;

/**
* Query an object schema by its schema id.
* Returns nullptr if no callback is configured or the schema is unknown.
*/
auto querySchema(SchemaId schemaId) const -> const Schema*;

public:
std::unique_ptr<std::mutex> warnMtx;
std::vector<std::pair<std::string, std::string>> warnings;
@@ -129,6 +138,16 @@ struct Environment
/* constant ident -> value */
std::map<std::string, Value, CaseInsensitiveCompare> constants;

QuerySchemaCallback querySchemaCallback;

/**
* Enable cached schema-guided wildcard field traversal plans.
*
* Disabling this keeps the older behavior where wildcard field lookups only
* ask each node schema whether the requested field can appear below it.
*/
bool enableWildcardFieldPlans = true;

Debug* debug = nullptr;
std::shared_ptr<StringPool> stringPool;
};
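
The documented contract of `querySchema` (return `nullptr` when no callback is configured or the schema is unknown) and its use for wildcard pruning can be sketched as below. This is an illustration only: `ToyNodeSchema`, `ToyEnvironment`, `canContain`, and `shouldDescend` are hypothetical names, `SchemaId` is assumed to be an integer type, and treating an unknown schema as "cannot prune" is an assumption rather than confirmed behavior.

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <set>
#include <string>

using SchemaId = std::uint32_t; // assumption: the real SchemaId type may differ

// Hypothetical schema stand-in: the set of field names that may appear
// anywhere below a node with this schema.
struct ToyNodeSchema {
    std::set<std::string> reachableFields;
    bool canContain(const std::string& field) const
    {
        return reachableFields.count(field) > 0;
    }
};

struct ToyEnvironment {
    using QuerySchemaCallback = std::function<const ToyNodeSchema*(SchemaId)>;
    QuerySchemaCallback querySchemaCallback;

    // Returns nullptr if no callback is configured or the schema is unknown.
    const ToyNodeSchema* querySchema(SchemaId id) const
    {
        if (!querySchemaCallback)
            return nullptr;
        return querySchemaCallback(id);
    }
};

// Decide whether a recursive wildcard such as `**.field` needs to descend
// into a node. Without schema information the branch cannot be pruned.
inline bool shouldDescend(const ToyEnvironment& env, SchemaId id, const std::string& field)
{
    const ToyNodeSchema* schema = env.querySchema(id);
    if (!schema)
        return true;
    return schema->canContain(field);
}
```

A caller would typically back the callback with a registry lookup (for example a `std::map<SchemaId, ToyNodeSchema>`) that returns `nullptr` for ids it does not know.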
4 changes: 4 additions & 0 deletions include/simfil/expression-visitor.h
@@ -21,7 +21,9 @@ class UnpackExpr;
class UnaryWordOpExpr;
class BinaryWordOpExpr;
class FieldExpr;
class WildcardFieldExpr;
class PathExpr;
class PathAlternativesExpr;
class AndExpr;
class OrExpr;
struct OperatorEq;
@@ -53,7 +55,9 @@ class ExprVisitor
virtual void visit(const EachExpr& expr);
virtual void visit(const CallExpression& expr);
virtual void visit(const PathExpr& expr);
virtual void visit(const PathAlternativesExpr& expr);
virtual void visit(const FieldExpr& expr);
virtual void visit(const WildcardFieldExpr& expr);
virtual void visit(const UnpackExpr& expr);
virtual void visit(const UnaryWordOpExpr& expr);
virtual void visit(const BinaryWordOpExpr& expr);
47 changes: 34 additions & 13 deletions include/simfil/expression.h
@@ -8,12 +8,16 @@
#include "simfil/result.h"

#include <memory>
#include <stdexcept>

namespace simfil
{

class Expr;
class ExprVisitor;

using ExprPtr = std::unique_ptr<Expr>;

class Expr
{
friend class AST;
@@ -31,17 +35,16 @@ class Expr
VALUE,
};

Expr() = delete;
explicit Expr(ExprId id)
: id_(id)
{}
explicit Expr(ExprId id, const Token& token)
: id_(id)
Expr() = default;
explicit Expr(const Token& token)
{
assert(token.end >= token.begin);
sourceLocation_.offset = token.begin;
sourceLocation_.size = token.end - token.begin;
}
explicit Expr(SourceLocation location)
: sourceLocation_(location)
{}

virtual ~Expr() = default;

@@ -56,6 +59,26 @@ class Expr
return false;
}

/* Accept expression visitor */
virtual auto accept(ExprVisitor& v) const -> void = 0;

/* Get the number of child expressions */
virtual auto numChildren() const -> std::size_t
{
return 0;
}

/* Get the n-th child expression */
virtual auto childAt(std::size_t) -> ExprPtr&
{
throw std::out_of_range("AST child index out of range");
}

virtual auto childAt(std::size_t) const -> const ExprPtr&
{
throw std::out_of_range("AST child index out of range");
}

/* Debug */
virtual auto toString() const -> std::string = 0;

@@ -90,11 +113,7 @@ class Expr
return ieval(ctx, std::move(val), res);
}

/* Accept expression visitor */
virtual auto accept(ExprVisitor& v) const -> void = 0;

/* Source location the expression got parsed from */
[[nodiscard]]
auto sourceLocation() const -> SourceLocation
{
return sourceLocation_;
@@ -110,12 +129,10 @@ class Expr
return ieval(ctx, value, result);
}

ExprId id_;
ExprId id_ = 0;
SourceLocation sourceLocation_;
};

using ExprPtr = std::unique_ptr<Expr>;

class AST
{
public:
@@ -126,6 +143,8 @@ class AST

~AST();

auto reenumerate() -> void;

auto expr() const -> const Expr&
{
return *expr_;
@@ -137,6 +156,8 @@ class AST
}

private:
static auto reenumerate(Expr& expr, Expr::ExprId& nextId) -> void;

/* The original query string of the AST */
std::string queryString_;
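
The new `numChildren()`/`childAt()` accessors make a generic tree walk possible, which is presumably what `AST::reenumerate` relies on after schema rewrites change the tree. The sketch below shows one way such a walk could assign ids. `ToyExpr` is a hypothetical stand-in for `Expr`, and the pre-order numbering starting at the caller-supplied `nextId` is an assumption; the actual traversal order is not shown in this diff.

```cpp
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <vector>

struct ToyExpr;
using ToyExprPtr = std::unique_ptr<ToyExpr>;

// Hypothetical stand-in for simfil::Expr with the same child-access shape.
struct ToyExpr {
    using ExprId = std::size_t;
    ExprId id = 0;
    std::vector<ToyExprPtr> children;

    std::size_t numChildren() const { return children.size(); }

    ToyExprPtr& childAt(std::size_t i)
    {
        if (i >= children.size())
            throw std::out_of_range("AST child index out of range");
        return children[i];
    }
};

// Assign consecutive ids in pre-order: parent first, then each child subtree.
// Null child slots are skipped.
inline void reenumerate(ToyExpr& expr, ToyExpr::ExprId& nextId)
{
    expr.id = nextId++;
    for (std::size_t i = 0; i < expr.numChildren(); ++i) {
        if (auto& child = expr.childAt(i))
            reenumerate(*child, nextId);
    }
}
```

Because the walk only uses the two accessors, it needs no knowledge of concrete expression types, which is the main benefit of exposing children on the base class.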

23 changes: 22 additions & 1 deletion include/simfil/model/model.h
@@ -2,6 +2,7 @@
#pragma once

#include "simfil/model/string-pool.h"
#include "simfil/model/schema.h"
#include "simfil/byte-array.h"
#include "tl/expected.hpp"
#if defined(SIMFIL_WITH_MODEL_JSON)
@@ -135,6 +136,16 @@
}
else {
using ModelType = detail::ModelTypeOf<Target>;

// Merged views may expose nodes owned by another model, e.g.
// overlay feature references returned through a base tile. The
// node address must be interpreted by the model that created it.
if (auto owner = node.owningModel(); owner && owner.get() != this) {
if (auto typedOwner = dynamic_cast<ModelType const*>(owner.get())) {

// [Review annotation, SonarQubeCloud / SonarCloud Code Analysis: check failure on line 144 in include/simfil/model/model.h]
// Refactor this code to not nest more than 3 if|for|do|while|switch statements.
// See https://sonarcloud.io/project/issues?id=Klebert-Engineering_simfil&issues=AZ6mDWdr03Dbzg5UyKgh&open=AZ6mDWdr03Dbzg5UyKgh&pullRequest=145
return resolveInternal(res::tag<Target>{}, *typedOwner, node);
}
}

#if !defined(NDEBUG)
// In debug builds, validate the model type to catch misuse early.
auto typedModel = dynamic_cast<ModelType const*>(this);
@@ -274,7 +285,9 @@
size_t stringDataBytes = 0;
size_t stringRangeBytes = 0;
size_t objectMemberBytes = 0;
size_t objectSchemaBytes = 0;
size_t arrayMemberBytes = 0;
size_t arraySchemaBytes = 0;

[[nodiscard]] size_t totalBytes() const
{
@@ -284,7 +297,9 @@
+ stringDataBytes
+ stringRangeBytes
+ objectMemberBytes
+ arrayMemberBytes;
+ objectSchemaBytes
+ arrayMemberBytes
+ arraySchemaBytes;
}
};

@@ -299,12 +314,18 @@
struct Impl;
std::unique_ptr<Impl> impl_;

[[nodiscard]] SchemaId objectSchemaId(ArrayIndex members) const;
auto setObjectSchemaId(ArrayIndex members, SchemaId schemaId) -> tl::expected<void, Error>;
[[nodiscard]] SchemaId arraySchemaId(ArrayIndex members) const;
auto setArraySchemaId(ArrayIndex members, SchemaId schemaId) -> tl::expected<void, Error>;

/**
* Protected object/array member storage access,
* so derived ModelPools can create Object/Array-derived nodes.
*/
Object::Storage& objectMemberStorage();
[[nodiscard]] Object::Storage const& objectMemberStorage() const;

Array::Storage& arrayMemberStorage();
[[nodiscard]] Array::Storage const& arrayMemberStorage() const;
};
19 changes: 17 additions & 2 deletions include/simfil/model/nodes.h
@@ -8,6 +8,7 @@
#include <utility>

#include "arena.h"
#include "schema.h"
#include "string-pool.h"
#include "simfil/byte-array.h"
#include "simfil/error.h"
@@ -56,8 +57,9 @@ enum class ValueType
Bytes,
TransientObject,
Object,
Array
// If you add types, update TypeFlags::flags bit size!
Array,
// End
LAST_
};

using ScalarValueType = std::variant<
@@ -276,6 +278,9 @@ struct ModelNode
/// Get an Object model's field names
[[nodiscard]] virtual StringId keyAt(int64_t i) const;

/// Get the schema id for schema-aware container nodes, or NoSchemaId otherwise.
[[nodiscard]] virtual SchemaId schema() const;

/// Get the number of children
[[nodiscard]] virtual uint32_t size() const;

@@ -288,6 +293,9 @@ struct ModelNode
/// True if the node points at a valid model and address.
[[nodiscard]] inline bool isResolved() const {return model_ && addr_;}

/// Return the model that owns this node address.
[[nodiscard]] inline ModelConstPtr owningModel() const {return model_;}

/// Virtual destructor to allow polymorphism
virtual ~ModelNode() = default;

@@ -431,6 +439,7 @@ struct ModelNodeBase : public ModelNode
[[nodiscard]] ModelNode::Ptr get(const StringId&) const override;
[[nodiscard]] ModelNode::Ptr at(int64_t) const override;
[[nodiscard]] StringId keyAt(int64_t) const override;
[[nodiscard]] SchemaId schema() const override;
[[nodiscard]] uint32_t size() const override;
bool iterate(IterCallback const&) const override {return true;} // NOLINT (allow discard)

@@ -547,6 +556,9 @@ struct BaseArray : public MandatoryDerivedModelNodeBase<ModelType>

bool forEach(std::function<bool(ModelNodeType const&)> const& callback) const;

[[nodiscard]] SchemaId schema() const override;
auto setSchema(SchemaId schemaId) -> tl::expected<void, Error>;

[[nodiscard]] ValueType type() const override;
[[nodiscard]] ModelNode::Ptr at(int64_t) const override;
[[nodiscard]] uint32_t size() const override;
@@ -610,6 +622,9 @@ struct BaseObject : public MandatoryDerivedModelNodeBase<ModelType>
return addFieldInternal(name, static_cast<ModelNode::Ptr>(value));
}

[[nodiscard]] SchemaId schema() const override;
auto setSchema(SchemaId schemaId) -> tl::expected<void, Error>;

[[nodiscard]] ValueType type() const override;
[[nodiscard]] ModelNode::Ptr at(int64_t) const override;
[[nodiscard]] uint32_t size() const override;
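
In the `ValueType` hunk of this file, the manual reminder to update the `TypeFlags::flags` bit size is replaced by a trailing `LAST_` enumerator. One plausible use of such a sentinel, shown here purely as a sketch, is to derive the flag width from the enum at compile time. `ToyValueType`, `kTypeCount`, and `ToyTypeFlags` are hypothetical; the enumerators are abbreviated and the real `TypeFlags` may be implemented differently.

```cpp
#include <bitset>
#include <cstddef>

// Hypothetical, abbreviated enum with a trailing sentinel, mirroring the
// shape of the change to ValueType. LAST_ is not a real value type.
enum class ToyValueType { Null, Int, String, Object, Array, LAST_ };

// Number of real enumerators, derived from the sentinel.
constexpr std::size_t kTypeCount = static_cast<std::size_t>(ToyValueType::LAST_);

// One bit per value type. The width follows the enum automatically,
// so adding a type before LAST_ needs no manual size update.
struct ToyTypeFlags {
    std::bitset<kTypeCount> flags;

    void set(ToyValueType t) { flags.set(static_cast<std::size_t>(t)); }

    bool test(ToyValueType t) const
    {
        return flags.test(static_cast<std::size_t>(t));
    }
};
```

The design trade-off is that `LAST_` becomes a visible enumerator, so any `switch` over the enum has to handle or explicitly ignore it.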