Merged
4 changes: 2 additions & 2 deletions README.md
@@ -35,7 +35,7 @@ engine.simplify(('/', '<constant>', '*', '/', '*', 'x3', '<constant>', 'x3', 'lo

# Simplify infix expressions
engine.simplify('x3 * sin(<constant> + 1) / (x3 * x3)')
# > '(<constant> / x3)'
# > '<constant> / x3'
```

More examples can be found in the [documentation](https://simplipy.readthedocs.io/).
@@ -88,7 +88,7 @@ pytest tests --cov src --cov-report html -m "not integration"
title = {Efficient Simplification of Mathematical Expressions},
year = 2025,
publisher = {GitHub},
version = {0.2.8},
version = {0.2.9},
url = {https://github.com/psaegert/simplipy}
}
```
105 changes: 102 additions & 3 deletions docs/index.md
@@ -1,4 +1,103 @@
# Home
# SimpliPy Documentation

This page is under construction.
Check out the [API Reference](api.md) in the meantime.
SimpliPy is a high-throughput symbolic simplifier built for workloads where
classic tools like SymPy struggle—think millions of expressions in the pre-training of
Flash-ANSR's prefix-based transformer models. Instead of converting tokens into
heavyweight objects and back again, SimpliPy keeps expressions as lightweight
prefix lists, enabling rapid rewriting and direct integration with machine
learning pipelines.


## Why SimpliPy Exists

SymPy excels at exact algebra, but its object graph and string parsing introduce
costs that dominate at scale. SimpliPy was created to remove those bottlenecks:

- **Prefix-first representation** – Expressions stay as token lists the entire
time, so there's no repeated parsing or AST allocation.
- **Deterministic pipelines** – Rule application, operand sorting, and literal
masking always produce the same layout, which keeps downstream caches warm.
- **GPU-friendly integration** – Outputs map directly into Flash-ANSR's input
space without any conversion step, making it practical to simplify millions of
candidates per minute.
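
Working directly on prefix token lists is cheap because the tree structure can be recovered on the fly. The toy reader below (not part of the SimpliPy API; the `UNARY`/`BINARY` arity tables are assumptions made for this example) shows how a flat prefix list maps onto a nested tree:

```python
# Illustrative only: a minimal prefix reader with hard-coded arities.
# SimpliPy derives arities from its operator configuration instead.
UNARY = {"sin", "cos", "log", "neg"}
BINARY = {"+", "-", "*", "/"}

def prefix_to_tree(tokens):
    """Consume tokens front-to-back; return (tree, remaining_tokens)."""
    head, rest = tokens[0], tokens[1:]
    if head in BINARY:
        left, rest = prefix_to_tree(rest)
        right, rest = prefix_to_tree(rest)
        return (head, left, right), rest
    if head in UNARY:
        arg, rest = prefix_to_tree(rest)
        return (head, arg), rest
    return head, rest  # leaf: a variable or <constant>

tree, _ = prefix_to_tree(['/', '<constant>', 'log', 'x3'])
# tree == ('/', '<constant>', ('log', 'x3'))
```

Because the flat list and the tree carry the same information, SimpliPy can defer (or skip) tree construction entirely and rewrite the token list in place.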


## Simplification Pipeline (Pseudo-Algorithm)

```text
function simplify(expr, max_iter=5):
tokens = parse(expr) # infix→prefix or validate existing prefix
tokens = normalize(tokens) # power folding, unary handling

for _ in range(max_iter):
tokens = cancel_terms(tokens) # additive/multiplicative multiplicities
tokens = apply_rules(tokens) # compiled rewrite patterns
tokens = sort_operands(tokens) # canonical order for commutative ops
tokens = mask_literals(tokens) # collapse trivial numerics to <constant>

if converged(tokens):
break

return finalize(tokens) # prefix list or infix string, caller’s choice
```

This loop is intentionally lightweight: each pass performs a handful of pure
list transformations, giving you predictable performance even on nested or noisy
expressions.
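
The driver loop above can be sketched in a few lines of Python. Everything here is illustrative: `run_until_converged` and the two toy passes stand in for SimpliPy's real `cancel_terms`/`apply_rules`/`sort_operands`/`mask_literals` stages, which are not reproduced:

```python
# Hedged sketch of the fixed-point loop: apply every pass, stop once a
# full sweep leaves the token list unchanged.
def run_until_converged(tokens, passes, max_iter=5):
    for _ in range(max_iter):
        before = list(tokens)
        for p in passes:
            tokens = p(tokens)
        if tokens == before:  # converged: nothing changed this sweep
            break
    return tokens

# Toy passes for demonstration (hypothetical token names):
def drop_pow1(ts):
    """Remove redundant identity-power markers."""
    return [t for t in ts if t != 'pow1']

def fold_neg(ts):
    """Collapse doubled unary negation."""
    out = []
    for t in ts:
        if out and out[-1] == 'neg' and t == 'neg':
            out.pop()
        else:
            out.append(t)
    return out

run_until_converged(['neg', 'neg', 'pow1', 'x1'], [drop_pow1, fold_neg])
# -> ['x1']
```

The convergence check is what keeps `max_iter` a safety bound rather than a cost you always pay: well-behaved inputs usually settle after one or two sweeps.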


## Key Components

- **Parsing & normalization** – `SimpliPyEngine.parse` and
`convert_expression` convert infix input, harmonize power operators, and
propagate unary negation without losing prefix fidelity.
- **Term cancellation** – `collect_multiplicities` and `cancel_terms` identify
subtrees that appear with opposite parity or redundant factors, pruning them
before any rules run.
- **Rule execution** – `compile_rules` turns machine-discovered or human-authored
simplifications into tree patterns. `apply_simplifcation_rules` then performs
fast top-down matching in each iteration.
- **Canonical ordering** – `sort_operands` imposes a stable ordering for
commutative operators, ensuring identical expressions share identical token
layouts.
- **Rule discovery workflow** – `find_rules` explores expression space in
parallel worker processes, confirms identities with numeric sampling, and
writes back deduplicated rulesets that future engines can load instantly.
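
The effect of canonical ordering can be demonstrated with a small tree-based sketch. SimpliPy's own `sort_operands` works on prefix token lists; the `canonical` helper below is a hypothetical illustration of the same idea, namely that a stable sort key makes equal expressions byte-identical:

```python
# Illustrative only: recursively sort operands of commutative operators
# by a deterministic key so equivalent trees compare equal.
def canonical(node):
    if isinstance(node, tuple) and node[0] in {'+', '*'}:
        op, *args = node
        args = sorted((canonical(a) for a in args), key=repr)
        return (op, *args)
    if isinstance(node, tuple):  # non-commutative: recurse, keep order
        return (node[0], *[canonical(a) for a in node[1:]])
    return node  # leaf

canonical(('*', 'x3', 'x1')) == canonical(('*', 'x1', 'x3'))  # True
```

Identical layouts are what keep downstream caches warm: two candidate expressions that differ only in operand order hash to the same key after canonicalization.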


## Quickstart

```bash
pip install simplipy
```

```python
import simplipy as sp

engine = sp.SimpliPyEngine.load("dev_7-3", install=True)

# Simplify prefix expressions
engine.simplify(['/', '<constant>', '*', '/', '*', 'x3', '<constant>', 'x3', 'log', 'x3'])
# -> ['/', '<constant>', 'log', 'x3']

# Simplify infix expressions
engine.simplify('x3 * sin(<constant> + 1) / (x3 * x3)')
# -> '<constant> / x3'
```

Available engines can be browsed and downloaded from Hugging Face.
The SimpliPy Asset Manager handles listing, installing, and uninstalling assets:

```python
sp.list_assets("engine")
# --- Available Assets ---
# - dev_7-3 [installed] Development engine 7-3 for mathematical expression simplification.
# - dev_7-2 Development engine 7-2 for mathematical expression simplification.
```

## Where to go next

- Explore the [API reference](api.md) for function-level details.
- Read the [rule authoring guide](rules.md) to build simplification rule sets.

Happy simplifying!
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -7,7 +7,7 @@ authors = [
readme = "README.md"
requires-python = ">=3.11"
dynamic = ["dependencies"]
version = "0.2.8"
version = "0.2.9"
license = "MIT"
license-files = ["LICEN[CS]E*"]

75 changes: 56 additions & 19 deletions src/simplipy/engine.py
@@ -69,7 +69,7 @@ class SimpliPyEngine:
A compiled version of explicit rules without pattern variables.
"""
def __init__(self, operators: dict[str, dict[str, Any]], rules: list[tuple] | None = None) -> None:
# This part, which sets up all the operator properties, is unchanged.
# Cache operator metadata for quick access during parsing and evaluation.
self.operator_tokens = list(operators.keys())
self.operator_aliases = {alias: operator for operator, properties in operators.items() for alias in properties['alias']}
self.operator_inverses = {k: v["inverse"] for k, v in operators.items() if v.get("inverse") is not None}
@@ -105,16 +105,14 @@ def __init__(self, operators: dict[str, dict[str, Any]], rules: list[tuple] | No
self.connection_classes_hyper = {'add': "mult", 'mult': "pow"}
self.binary_connectable_operators = {'+', '-', '*', '/'}

# This is the simplified rules handling logic.
# It no longer checks if `rules` is a string or performs any file I/O.
# It only accepts a list of rules or None.
# Normalize the incoming rule list and eliminate duplicate patterns.
dummy_variables = [f'x{i}' for i in range(100)]
if rules is None:
self.simplification_rules = []
else:
self.simplification_rules = deduplicate_rules(rules, dummy_variables=dummy_variables)

# This part is also unchanged.
# Build the compiled lookup tables that power rule application.
self.compile_rules()
self.rule_application_statistics: defaultdict[tuple, int] = defaultdict(int)

@@ -268,7 +266,31 @@ def is_valid(self, prefix_expression: list[str], verbose: bool = False) -> bool:
return True

def prefix_to_infix(self, tokens: list[str], power: Literal['func', '**'] = 'func', realization: bool = False) -> str:
"""Converts a prefix expression to a human-readable infix string with minimal parentheses."""
"""Converts a prefix expression to an infix string with minimal parentheses.

Parameters
----------
tokens : list[str]
The prefix expression to render.
power : {'func', '**'}, optional
Controls how power operators are emitted. ``'func'`` keeps canonical
engine names such as ``pow3(x)``, while ``'**'`` renders Python-style
exponentiation.
realization : bool, optional
If True, operator tokens are replaced with their runtime
realizations (for example, ``'sin'`` becomes ``'np.sin'``), so the
output can be compiled directly.

Returns
-------
str
The formatted infix expression.

Raises
------
ValueError
If the provided tokens do not form a well-formed prefix expression.
"""

if not tokens:
return ''
@@ -688,7 +710,9 @@ def parse(
"""Parses an infix string into a standardized prefix expression.

This is a high-level parsing utility that combines `infix_to_prefix`
with optional conversion and number masking steps.
with optional canonicalization and number masking. The resulting token
list is additionally cleaned up via `remove_pow1` to drop redundant
``pow1_1`` occurrences.

Parameters
----------
@@ -704,7 +728,8 @@ def parse(
Returns
-------
list[str]
The final processed prefix expression.
The processed prefix expression after conversion, masking (if
enabled), and `remove_pow1` cleanup.
"""

parsed_expression = self.infix_to_prefix(infix_expression)
@@ -1023,11 +1048,15 @@ def collect_multiplicities(self, expression: list[str] | tuple[str, ...], verbos
Returns
-------
expression_tree : list
The expression represented as a tree.
A stack-based representation of the expression tree. Each entry is a
nested list of the form ``[operator, operands]`` mirroring the
structure consumed by `cancel_terms`.
annotations_tree : list
A parallel tree containing the multiplicity counts for each subtree.
A parallel stack holding multiplicity annotations for each subtree,
organized by connection class.
labels_tree : list
A parallel tree containing unique identifiers for each subtree.
A parallel stack containing stable identifiers for every subtree,
used to detect duplicates during cancellation.
"""
stack: list = []
stack_annotations: list = []
@@ -1133,18 +1162,22 @@ def cancel_terms(self, expression_tree: list, expression_annotations_tree: list,
Parameters
----------
expression_tree : list
The nested list representation of the expression.
The stack produced by `collect_multiplicities`, containing the
nested expression structure.
expression_annotations_tree : list
The corresponding tree of multiplicity annotations.
The parallel stack of multiplicity annotations returned by
`collect_multiplicities`.
stack_labels : list
The corresponding tree of subtree labels.
The parallel stack of subtree labels returned by
`collect_multiplicities`.
verbose : bool, optional
If True, prints detailed debugging information. Defaults to False.

Returns
-------
list[str]
A new prefix expression with terms cancelled.
A simplified prefix expression with the detected duplicates merged
or removed.
"""
stack = expression_tree
stack_annotations = expression_annotations_tree
@@ -1637,11 +1670,14 @@ def find_rule_worker(
constants_fit_retries: int) -> None:
"""A worker process for discovering simplification rules in parallel.

This function runs in a separate process. It fetches an expression from
the `work_queue`, evaluates it on a set of random numerical data, and
This function runs in a separate process. It fetches work items of the
form ``(expression, simplified_length, allowed_candidate_lengths)`` from
`work_queue`, evaluates the expression on shared random data, and
compares the result against a library of simpler candidate expressions.
If a numerical equivalence is found, it is considered a potential new
simplification rule and is placed on the `result_queue`.
simplification rule and is placed on the `result_queue`; otherwise ``None``
is queued to signal that no rule was discovered. A sentinel ``None`` work
item triggers a graceful shutdown.

Notes
-----
@@ -1803,7 +1839,8 @@ def find_rules(
Equivalences are found by evaluating both expressions on random
numerical data.

Discovered rules are added to the engine and can be saved to a file.
Discovered rules are deduplicated, compiled into the running engine, and
can optionally be saved to disk.

Parameters
----------