-
-
Notifications
You must be signed in to change notification settings - Fork 3
Tree crossover for SLP individuals (preserving donor structure while minimising exon disruption) #26
Description
Summary
The current TREE crossover implementation for ultra::gp::individual copies a donor subtree from one parent (from) into the other (to) by recursively overwriting genes at the same loci used in the donor. This guarantees validity, because donor dependencies are copied together, but it also causes an important side effect:
many active genes (exons) in the destination are overwritten unnecessarily.
In practice, the current operator behaves as a location-preserving subtree transplant. What is missing is a relocation-aware graft: the donor subtree should be copied into the destination while preserving semantics, but internal donor nodes should be placed into introns of the receiver whenever possible.
The proposal described below rewrites TREE crossover so that:
- the donor subtree root still replaces an active locus in the destination;
- donor dependencies are relocated into introns of the destination as much as possible;
- addresses are rewritten consistently;
- SLP validity constraints are preserved;
- fallback behaviour remains well defined when introns are insufficient.
This should reduce disruption, preserve more of the recipient’s active structure, and make introns serve as intended: storage capacity for latent code.
Context
gp::individual uses a Straight Line Program representation:
- each gene is stored at a
locus; - arguments may reference earlier loci via
D_ADDRESS; - therefore every valid individual respects a strict acyclicity / forward-evaluation constraint: if a gene at row
iuses an address, that address must refer to a row< i; - only a subset of genes are active (
cexons()), while the rest are introns.
Current TREE crossover (simple and correct):
auto crossover_ = [&](const locus &l, const auto &lambda) -> void
{
to.genome_(l) = from[l];
for (const auto &al : from[l].args)
if (std::holds_alternative<D_ADDRESS>(al))
lambda(from[l].locus_of_argument(al), lambda);
};
crossover_(random_locus(from), crossover_);This recursively copies the donor root and all of its dependencies at their original donor loci.
Design goals
Design goals
The proposed TREE crossover aims to satisfy the following goals.
Preserve donor subtree semantics
The inserted material must remain a complete dependency-closed subgraph. No dangling addresses. No partial dependency loss.
Minimise disruption to the receiver
The graft root must affect the active program, but internal donor nodes should go into introns whenever possible, rather than overwriting arbitrary exons.
Preserve SLP validity
The resulting individual must satisfy current is_valid() checks:
- correct arity;
- category consistency;
- valid address ordering (
address.index < current.index); - valid hash/signature.
####Avoid changes to gp::individual representation
The improvement should fit the current storage model and should not require:
- genome resizing;
- new gene types;
- proxy primitives;
- special runtime support.
Keep fallback behaviour explicit
If the destination does not have enough introns, the operator must degrade in a controlled way rather than silently reverting to broad exon destruction.
Proposed solution
High-level idea
The new TREE crossover should work in four logical phases:
- extract the donor subtree rooted at a random active donor locus;
- topologically order the subtree so that dependencies come before users;
- remap donor loci to destination loci, preferring introns for internal nodes;
- rewrite and insert donor genes with corrected addresses.
This separates what is copied from where it is placed.
Example
Suppose the donor subtree is:
[3] ADD(1, 2)
[7] MUL([3], 2)
and the selected donor root is [7].
Suppose the receiver has available introns at:
[12], [15]
A suitable remapping is:
[7] -> [7] // root overwrites its destination locus
[3] -> [12] // dependency goes into an intron
The rewritten graft becomes:
[12] ADD(1, 2)
[7] MUL([12], 2)
The resulting active effect is the same as in the donor subtree, but only one active receiver locus was necessarily overwritten.