Skip to content

Tree crossover for SLP individuals (preserving donor structure while minimising exon disruption) #26

@morinim

Description

@morinim

Summary

The current TREE crossover implementation for ultra::gp::individual copies a donor subtree from one parent (from) into the other (to) by recursively overwriting genes at the same loci used in the donor. This guarantees validity, because donor dependencies are copied together, but it also causes an important side effect:

many active genes (exons) in the destination are overwritten unnecessarily.

In practice, the current operator behaves as a location-preserving subtree transplant. What is missing is a relocation-aware graft: the donor subtree should be copied into the destination while preserving semantics, but internal donor nodes should be placed into introns of the receiver whenever possible.

The proposal described below rewrites TREE crossover so that:

  • the donor subtree root still replaces an active locus in the destination;
  • donor dependencies are relocated into introns of the destination as much as possible;
  • addresses are rewritten consistently;
  • SLP validity constraints are preserved;
  • fallback behaviour remains well defined when introns are insufficient.

This should reduce disruption, preserve more of the recipient’s active structure, and make introns serve as intended: storage capacity for latent code.

Context

gp::individual uses a Straight Line Program representation:

  • each gene is stored at a locus;
  • arguments may reference earlier loci via D_ADDRESS;
  • therefore every valid individual respects a strict acyclicity / forward-evaluation constraint: if a gene at row i uses an address, that address must refer to a row < i;
  • only a subset of genes are active (cexons()), while the rest are introns.

Current TREE crossover (simple and correct):

auto crossover_ = [&](const locus &l, const auto &lambda) -> void
{
  to.genome_(l) = from[l];

  for (const auto &al : from[l].args)
    if (std::holds_alternative<D_ADDRESS>(al))
      lambda(from[l].locus_of_argument(al), lambda);
};

crossover_(random_locus(from), crossover_);

This recursively copies the donor root and all of its dependencies at their original donor loci.

Design goals

Design goals

The proposed TREE crossover aims to satisfy the following goals.

Preserve donor subtree semantics

The inserted material must remain a complete dependency-closed subgraph. No dangling addresses. No partial dependency loss.

Minimise disruption to the receiver

The graft root must affect the active program, but internal donor nodes should go into introns whenever possible, rather than overwriting arbitrary exons.

Preserve SLP validity

The resulting individual must satisfy current is_valid() checks:

  • correct arity;
  • category consistency;
  • valid address ordering (address.index < current.index);
  • valid hash/signature.

####Avoid changes to gp::individual representation
The improvement should fit the current storage model and should not require:

  • genome resizing;
  • new gene types;
  • proxy primitives;
  • special runtime support.

Keep fallback behaviour explicit

If the destination does not have enough introns, the operator must degrade in a controlled way rather than silently reverting to broad exon destruction.

Proposed solution

High-level idea

The new TREE crossover should work in four logical phases:

  • extract the donor subtree rooted at a random active donor locus;
  • topologically order the subtree so that dependencies come before users;
  • remap donor loci to destination loci, preferring introns for internal nodes;
  • rewrite and insert donor genes with corrected addresses.

This separates what is copied from where it is placed.

Example

Suppose the donor subtree is:

[3] ADD(1, 2)
[7] MUL([3], 2)

and the selected donor root is [7].

Suppose the receiver has available introns at:

[12], [15]

A suitable remapping is:

[7] -> [7]     // root overwrites its destination locus
[3] -> [12]    // dependency goes into an intron

The rewritten graft becomes:

[12] ADD(1, 2)
[7]  MUL([12], 2)

The resulting active effect is the same as in the donor subtree, but only one active receiver locus was necessarily overwritten.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions