-
Notifications
You must be signed in to change notification settings - Fork 6
Entities Design Notes
A few chores have been kicked down the street.
- Make
PartialPropertyChangeEvent::emit_in_contextandPartialPropertyChangeEventCore::to_eventconsumeself. - Now that non-derived properties must either be required or else have a default, we can simplify the
Propertytrait and implementation.PropertyInitializationKind::Explicitis now equivalent torequired. - Check whether processing property change events in
Context::set_propertyjustifies use ofsmall_vecor similar to prevent allocation of aVec. Similarlysmallbox. See below section for details. - Make a decision either way about
unsafeinentity::entity_store::get_entity_metadata_static. - Make a decision either way about
unsafeinentity::property_store::get_property_metadata_static. -
define_derived_property!Macro rough edges:- allow omitting empty global properties list
- Minimize or eliminate "type annotations needed"; rustc isn't good at inferring types for closures, which is how we supply function implementation to the macro. There are alternative patterns (e.g. passing name to macro)
-
define_propertymacro in case of no default AND not required, link to docs, more descriptive error message—also, emit generated code so there isn't a long list of unrelated errors.
There are two ways to iterate over all EntityId<E>s:
// Call `query_entities` with an empty query
let result_iterator: QueryResultIterator<Person> = context.query_entities::<Person>(());
// Ask for an `EntityIterator<E>` directly
let person_iterator: EntityIterator<Person> = context.get_population_iterator::<E>();The difference is their types. Both implement the trait Iterator< Item=EntityId<E> >. The QueryResultIterator<Person> type wraps an (unboxed) EntityIterator<Person> internally and is a few bytes larger. This might suggest context.get_population_iterator::<E>() should be preferred for this use case, but it probably doesn't matter in practice.
Suppose we have
define_multi_property!((Age, Status));
// And then later, possibly in a different module
define_multi_property!((Status, Age));Suppose Age changes. Do we emit PropertyChangeEvents for BOTH (Age, Status) AND (Status, Age)?
context.subscribe_to_event::<PropertyChangeEvent<(Age, Status)>>(my_event_handler);
context.subscribe_to_event::<PropertyChangeEvent<(Status, age)>>(another_event_handler);How do we share indexes but emit events separately?
Answer: Rename Property::index() to Property::id(). Every property has its own Property::id() initialized in its ctor (as originally imagined). Then implement Property::index_id(), which returns the id of the PropertyStore managing the value index on behalf of the property.
When a non-derived property is updated, all (derived) properties that depend on it need to be updated:
- Their index needs to be updated
- A property change event needs to be emitted
The property being set is completely known, while the dependents are completely type erased (represented by their index).
Several things make this process annoying:
-
The entity type is known unambiguously, but we store properties with the entity type erased, which makes passing theUPDATE: We decided to include the entity typeEntityId<Entity>to the dependent annoying.Ein thePropertyStore. - The both the old value and the new value of the dependent need to be computed, which involves looking up the values of their dependencies, which requires (immutable) access to not just the
PropertyStorebut theContext(in order to fetch global properties as well).- ...which means we need immutable access to some parts of the
PropertyStore(to look up dependencies) while we mutate other parts of thePropertyStore(to update the index) - ...which means we need to do potentially a lot of potentially redundant property lookups of values
- ...which means we need immutable access to some parts of the
Mechanism to update
The allocation within Context::set_property and related methods is really unfortunate. The smallbox is like Box but stores small enough values inline. (Similarly for the smallvec crate.) We should check if it's worth doing.
Issue #624.
We still have a problem with initialization / registration of properties.
To store properties, we use a Vec<OnceCell<Box<dyn Any>>>; the OnceCell is used because we don't want to bother initializing anything for properties that are never used. Only when client code attempts to actually use a property is the OnceCell initialized.
Except what if the property is derived? It might never be used by client code, but if its dependencies change, internally ixa needs to emit a change event for the property, which means (at minimum) computing the value of the property.
- Our current implementation calls
register_propertyeverywhere we need the property to exist. This effectively initializes all dependents of a property whenever a property is initialized.
What does it mean to "register" / "initialize" a property?
- Creation of a
StoredPeoplePropertiesobject (but not allocation of backing vector) - Determination of
is_required(by virtue of (1)) - Determination of dependencies and dependents
- Creation of
Indexobject (but not allocation of storage for nor computation of index) - Determination of
is_indexed(by virtue of (4))
Observations:
- Indexing is always deferred to the first query.
- Allocation of storage is always done on an as-needed basis.
- A property must be explicitly set as indexed, so (5) only actually sets the default
is_indexedtofalse. It doesn't actually do anything. In fact, (4) is just a convenience. - (1) and (4) are partly about instantiating a mechanism that can perform functions on behalf of the (typed) property in a type-erased way.
Let's tease out registration/initialization that is independent of a particular Context instance vs. registration / initialization that is Context-specific.
Solution so far:
- We collect metadata via ctors before
main(), computing- Property Dependents (NOT dependencies, which are known statically)
- List of properties associated with each entity
- List of required properties associated with each entity
- An
indexfor each property which we use for fast lookup. - Storing a constructor (function pointer) that knows how to construct a
Box<PropertyValueStoreCore<E, P>>type-erased as aBox<dyn Any>.
My strategy has been to put implementation that depends on knowledge of types in Box<dyn PropertyValueStore<E>>. In set_value I can then look up the relevant PropertyValueStore<E> and call the method. So far this has made a lot of sense, because the implementation generally requires access to data stored in the underlying PropertyValueStoreCore<E, P>. But if it's implementation that doesn't depend on the data in the PropertyValueStoreCore<E, P>, maybe the function pointer should live somewhere else?
Should we replace all uses of TypeId with RegisteredItem::id()?
-
entity::entity_store::ENTITY_METADATAmapsTypeIdto the typle(Vec<TypeId>, Vec<TypeId>): the entityTypeIdis mapped to lists ofProperty<E>TypeIds.This makes sense, because the use case is checking that a
PropertyListcontains all required properties, so we're checking equality of types. -
entity::property_store::PROPERTY_METADATAmapsusizetoVec<usize>: theProperty::id()is mapped to a vector containing theProperty::id()'s of all of that property's dependents.This makes sense, because the use case is looking up the properties in the property store, and they are addressable via the id.
-
Each
Entitytype now has its own series of property IDs. Thus, it's possible forProperty<E1>::id() == Property<E2>::id()ifE1!=E2. Using aTypeIDtherefore provides a little extra type safety. (Using the pair(E::id(), P::id())is an alternative—we do this forixa::entity::property_store::PROPERTY_METADATA, for example.)
Q: Should add_entity<E: Entity> return a Result<EntityId<E>, IxaError>? Right now it fails only if InitializationList is invalid.
pub fn add_entity<E: Entity, PL: PropertyList<E>>(&mut self, property_list: PL) -> Result<EntityId<E>, IxaError>;- Makes sense to return a
Resultif we let user add an entity, say, at debug console or via web API. - Makes sense to just panic if we only care about calling the method from client code directly.
Q: Should we expose Context::is_property_indexed::<E, P>() in the public API? (Returns true if the property is being indexed.) Right now the only use case is in unit tests.
- It's possible for
Context::is_property_indexed::<E, P>()to returntrueeven ifcontext.index_property::<E, P>()was never called, becausePmight be "equivalent" to some other propertyQthat is indexed. Right now this only happens for multi-properties, e.g.(Age, Height)is equivalent to(Height, Age). But presumably client code knows which properties its indexing. - It's just a noop—not an error—to call
context.index_property::<E, P>()multiple times, so client code doesn't need to check before indexing. - It's not exactly adding complexity, especially since we use it for tests anyway.
- If we make it public API, should it be part of the
PluginContext?
Q: What methods on Context should be a part of the PluginContext trait? Since PluginContext defines the basic API available to data plugins, do we want data plugins to have access to the Entity subsystem?
Edit: It's easier to have PropertyStore<E> know the entity type E, so we store the PropertyStore<E> (in a type-erased way) in EntityStore
Arguments for making the EntityStore and a field of PropertyStoreContext:
- Not every subsystem cares about entities/properties, but a lot of them do.
(
PluginContextdefines the minimal interface allDataPlugins can assume exists. Half of the trait extension constraints inPluginContextare related toPeopleContext.) - Accesses to these stores are often in the hottest paths / tightest loops, which recommends minimizing indirection (one fewer pointer dereference).
Arguments against making the EntityStore and PropertyStore fields of Context:
- Historically the only intrinsic properties and functions of the concrete
Contexttype is managing the timeline (events and plans). All other functionality is (was) provided by plugins. Adding entities (EntityStoreand) toPropertyStoreContextexpands its responsibilities from "events that happen in time" to also include "state of the world". Philosophically, one could argue this violates the "separation of concerns" / "single responsibility" principle. (Counterargument: A good software engineer is not constrained by rules of thumb.) - Is the indirection required for accessing a
DataPlugineven measurable? I've attempted to benchmark this, but it's difficult to measure. The overhead is somewhere between 0%-10% in the tightest loops, but obviously Amdahl's Law applies.
- The entity ID should know the
Entitytype so that aPersonIdcan't be used as aSettingId. - The original
PersonIdtype is opaque–it cannot be created or destructured outside of theixacrate. To achieve the same thing, we do this:EntityId<E: Entity>(usize, PhantomData<E>) -
Entitytype itself does not store the entity count, because we don't want client code to be able to create a new entity (or modify the entity count), and theEntitytypes are implemented in client code. So we store the entity count in theEntityStore(orEntityRecord.
Right now I just make all fields of properties defined with define_property (and related macros) pub to simplify parsing. But we could instead let the user specify visibility of fields.
Related: Default visibility of synthesized types. Right now I make Entity and Property<Entity> types pub. We could let the user specify.
Right now, something like define_property!(struct Age(u8), Person, is_required=true) generates a default string implementation using format!("{:?}", self). But we special-case new types that just wrap an Option<T>:
define_property!(struct Vaccine(Option<u32>), Person, default_const=None);
let no_vaccine = Vaccine(None);
no_vaccine.get_display(); // "None"
let some_vaccine = Vaccine(Some(11));
some_vaccine.get_display(); // "Vaccine(11)"An alternative is to unwrap every type for display:
define_property!(struct Age(u8), Person, is_required=true);
let age = Age(14);
age.get_display(); // "14"The wrapped Option<T> case becomes
define_property!(struct Vaccine(Option<u32>), Person, default_const=None);
let no_vaccine = Vaccine(None);
no_vaccine.get_display(); // "None"
let some_vaccine = Vaccine(Some(11));
some_vaccine.get_display(); // "11"The downside to this scheme is that Age(11) prints the same as Vaccine(Some(11)).
As currently implemented, client code can supply their own display_impl:
define_property!(
struct Height(u8),
Person,
is_required=true,
display_impl = |val: &Height| {
let feet = val.0 / 12;
let inches = val.0 - 12*feet;
format!("{}' {}\"", feet, inches)
}
);Originally it appears we assumed that it is not possible to have a non-required property without a default value. (I think we just let an internal unwrap fail.) Should we enforce this in the macro implementations? (Related to the next section as well.)
Answer: Yes, as per discussion. New implementation requires a property either have a default value or is required. Enforced at compile time with macros.
Rationale: https://app.excalidraw.com/s/7Clp38IaTj4/6og9r5Urrhu
- If you try to set a property to the value it already has, is a property change event emitted? (Previously, yes.)
- If you try to set an unset property to its default value, is a property change event emitted? (Previously, yes.)
I think we emit a property change even unconditionally, because it was hard to keep track of whether derived properties actually changed value due to type erasure. And since we emit an event for derived properties, we also do so for non-derived properties.
EDIT: There doesn't seem to be strong opinions about this either way, but there's consensus that it's not much of a burden on client code to do a check. Because a change event already isn't reporting on the instantaneous state of the world but rather a report of a particular transaction, it's already incumbant on client code to check whether the event is actionable.
See PropertyValueStore::replace().
Also: Previously, it appears we assumed that it is not possible to have a non-required property without a default value. (EDIT: Sort of. We panic in this case.)
There are several places in ixa internals where we have checks for
things that should never happen unless there's a bug in ixa itself. These
checks could be conditionally removed depending on the build profile:
#[inline(always)]
fn expect_debug_only<T>(opt: Option<T>, msg: &str) -> T {
if cfg!(debug_assertions) {
opt.expect(msg)
} else {
unsafe { opt.unwrap_unchecked() }
}
}
// or the plain vanilla inline version:
let value = if cfg!(debug_assertions) {
property_store
.get(entity_id)
.expect("getting a property value with \"constant\" initialization should never fail")
} else {
unsafe { property_store.get(entity_id).unwrap_unchecked() }
};The actual performance impact of this is probably negligible. It might be worth running some experiments to see if it's worth it.
We have this "rng_name_duplication_guard_" thing that I don't understand. Some research shows it was a pre-Rust 2018 hack to prevent global singletons with the same name.
- It doesn't work anymore (maybe?).
- Why not have RNGs with the same name?
- The macro makes the RNG private, so it's not accessible outside of the module anyway.
Alternatives (that I am also suspicious of):
- Just remove it. We trust client code to know how to refer to the correct symbol in other places.
- "private trait" trick:
$crate::paste::paste! {
trait [<__rng_guard_ $random_id>] {}
impl [<__rng_guard_ $random_id>] for () {}
}- unique
constgeneric:
struct __RngNameDupGuard<const NAME: &'static str>;
type _ = __RngNameDupGuard<{ stringify!($random_id) }>;We synthesize these types with macros without a pub visibility (or any other visibility).
DataPlugin- RNGs
EntityProperty<E: Entity>
Floating point property values are annoying, because the primitive f64 value does not implement Hash or Eq. We get around this by using the hash of the serialization of the value, hash_serialized_128, and we compare the 128 bit hash when we need equality.
But for most use cases we don't need—or even want—to use an f64. Instead, we want a finite real number. For these cases we should be using something like the decorum crate, which offers the Real type and ExtendedReal type.
I still think we should use hash_serialized_128, because it gives us the power to compute the same hash for an array of serialized values and a tuple of those same values, which allows us to query multi-properties dynamically (potentially from the debug console or web API). The performance penalty seems to be not very high.
This stuff is probably fine, and where it isn't fine, it's probably fixable over time. But if we can avoid any obvious tech debt now, we should.
- Sometimes we pass around a
Contextin places where we really just want aPropertyStoreor at most anEntityStore. (Related to how deep/shallow various internal API are in the ownership model, the next bullet.) - It's not always clear where some implementation lives. One philosophy is, all implementation that accesses a struct's data lives on the struct. But some structs are naturally just data, and some imlementation is cross-cutting. (Purists would argue for a refactor for such cases.)
QueryResultIterator