Canonical reference for all contributors (human and AI). Before writing code, designing a module, or proposing a change — read this file.
Discover new mathematical formulas for fundamental constants by systematically searching inside Conservative Matrix Fields (CMFs).
A successful outcome is a new polynomial continued fraction (PCF) or
recurrence relation that converges to a constant like
The system takes a mathematical constant and one or more inspiration
functions (hypergeometric
The system is a four-stage pipeline. Every stage is modular: you can swap in a new implementation without touching the others.
Constant + Inspiration Functions
│
▼
┌─────────────────────────┐
│ 1. LOADING │ Map constants → CMFs (from DB, JSON, or code)
└────────────┬────────────┘
▼
┌─────────────────────────┐
│ 2. EXTRACTION │ Partition CMF space into bounded shards (Ax < b)
└────────────┬────────────┘
▼
┌─────────────────────────┐
│ 3. ANALYSIS │ Sample trajectories, compute δ, filter & rank shards
└────────────┬────────────┘
▼
┌─────────────────────────┐
│ 4. SEARCH │ Deep search in promising shards → discover PCFs
└─────────────────────────┘
| Responsibility | Details |
|---|---|
| Input | Constant objects + Formatter objects (pFq, MeijerG, BaseCMF) |
| Output | Dict[Constant, List[ShiftCMF]] — each constant mapped to its CMFs with shifts |
| Storage | SQLite database (families_v1.db), pickle files, or in-memory |
| Key classes | Formatter, pFq, MeijerG, BaseCMF, DB, BasicDBMod |
| Responsibility | Details |
|---|---|
| Input | `Dict[Constant, List[ShiftCMF]]` |
| Output | `Dict[Constant, List[Shard]]`: bounded convex regions of the CMF lattice |
| How | Enumerate hyperplanes (matrix zeros & poles) → sign-vector encoding → build inequality system |
| Key classes | `ShardExtractorMod`, `ShardExtractor`, `Shard`, `Hyperplane` |
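The sign-vector step of Extraction can be illustrated with a minimal sketch. It assumes each hyperplane is stored as a pair (normal vector, offset) and each sign is +1 or -1; the names `inequalities_from_signs` and `contains` are hypothetical and are not taken from the Dreamer code base.

```python
from typing import List, Sequence, Tuple

# Hypothetical representation: the hyperplane a·x = c is the pair (a, c).
Hyperplane = Tuple[Sequence[int], int]


def inequalities_from_signs(
    hyperplanes: Sequence[Hyperplane], signs: Sequence[int]
) -> Tuple[List[List[int]], List[int]]:
    """Return (A, b) such that the cell with this sign vector is {x : A x < b}."""
    A: List[List[int]] = []
    b: List[int] = []
    for (normal, offset), sign in zip(hyperplanes, signs):
        if sign not in (1, -1):
            raise ValueError("signs must be +1 or -1")
        # sign = +1 encodes a·x > c, rewritten as (-a)·x < -c.
        # sign = -1 encodes a·x < c, which is already in the target form.
        A.append([-sign * coeff for coeff in normal])
        b.append(-sign * offset)
    return A, b


def contains(A: List[List[int]], b: List[int], point: Sequence[int]) -> bool:
    """Strict membership: points on a boundary (A x = b) are outside."""
    return all(
        sum(coeff * x for coeff, x in zip(row, point)) < rhs
        for row, rhs in zip(A, b)
    )


# Three hyperplanes in the plane: x = 0, y = 0, x + y = 10.
hyperplanes = [((1, 0), 0), ((0, 1), 0), ((1, 1), 10)]
# Sign vector (+, +, -) selects the open triangle x > 0, y > 0, x + y < 10.
A, b = inequalities_from_signs(hyperplanes, (1, 1, -1))
```

Integer arithmetic keeps the membership test exact, so a lattice point on a boundary is rejected reliably.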
| Responsibility | Details |
|---|---|
| Input | `Dict[Constant, List[Shard]]` |
| Output | `Dict[Constant, List[Shard]]`: filtered and ranked by promise |
| How | For each shard: sample ~$10^d$ trajectories (d = dimension), walk the CMF, compute convergence δ, keep shards above `IDENTIFY_THRESHOLD` |
| Key classes | `AnalyzerModV1`, `Analyzer`, `SerialSearcher` |
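This document does not define δ. The sketch below assumes the convention common in the Ramanujan Machine literature, where δ solves $|L - p/q| = q^{-(1+\delta)}$ for a convergent $p/q$ of the limit $L$. It uses the standard-library `decimal` module at 120 digits only to stay dependency-free; Dreamer itself mandates `mpmath`. The function names `delta` and `keep_shard` are hypothetical.

```python
from decimal import Decimal, getcontext
from typing import Iterable

getcontext().prec = 120  # above the 100-digit floor; no float arithmetic is used


def delta(p: int, q: int, limit: Decimal) -> Decimal:
    """Solve |limit - p/q| = q**-(1 + delta) for delta (assumed convention)."""
    error = abs(limit - Decimal(p) / Decimal(q))
    if error == 0 or q < 2:
        raise ValueError("delta is undefined for an exact hit or q < 2")
    return -error.ln() / Decimal(q).ln() - 1


def keep_shard(deltas: Iterable[Decimal], threshold: Decimal) -> bool:
    """Keep a shard when its best sampled trajectory exceeds the threshold."""
    return max(deltas) > threshold


# Demo: consecutive Fibonacci numbers are the convergents of the golden ratio.
phi = (1 + Decimal(5).sqrt()) / 2
fib_prev, fib = 1, 1
for _ in range(60):
    fib_prev, fib = fib, fib_prev + fib
d = delta(fib, fib_prev, phi)
```

Under this convention any convergent with error below 1 has δ > -1, so a threshold of -1 lets every such shard through.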
| Responsibility | Details |
|---|---|
| Input | Prioritized shards from Analysis |
| Output | Dict[Searchable, DataManager] — discovered PCFs and search vectors |
| How | Deeper walks (depth up to 1500), exact rational convergent extraction, LIReC/RIES identification |
| Key classes | SearcherModV1, SerialSearcher, DataManager |
A target mathematical constant (e.g.,
The central algebraic structure. A CMF assigns a matrix
Implementation lives in ramanujantools. Dreamer wraps it via Formatter
subclasses for serialization and shift management.
A bounded convex region of the CMF's integer lattice, defined by strict linear
inequalities $Ax < b$.
An integer direction vector inside a shard. The system walks the CMF along this direction, multiplying matrices, to compute a convergent. If the convergent approaches the target constant, the trajectory is "identified."
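A trajectory walk can be sketched with exact integer matrices. The example below is a toy under stated assumptions: it uses a one-dimensional stand-in for a CMF (the step matrices of the known continued fraction $e = 3 - \cfrac{1}{4 - \cfrac{2}{5 - \cdots}}$), the helper names `walk`, `matmul`, and `e_step` are hypothetical, and it does not use the `ramanujantools` API. The `decimal` module appears only for the final comparison against $e$.

```python
from decimal import Decimal, getcontext
from typing import Callable, List, Sequence, Tuple

getcontext().prec = 120  # high precision for the final comparison only

Matrix = List[List[int]]
Position = Tuple[int, ...]
StepMatrix = Callable[[Position], Matrix]


def matmul(left: Matrix, right: Matrix) -> Matrix:
    """Exact 2x2 integer matrix product."""
    return [
        [sum(left[i][k] * right[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


def walk(
    steps: Sequence[StepMatrix],
    start: Position,
    direction: Sequence[int],
    depth: int,
) -> Matrix:
    """Multiply step matrices along `direction` (non-negative components only)."""
    position = list(start)
    product: Matrix = [[1, 0], [0, 1]]
    for _ in range(depth):
        for axis, count in enumerate(direction):
            for _step in range(count):
                product = matmul(product, steps[axis](tuple(position)))
                position[axis] += 1
    return product


def e_step(position: Position) -> Matrix:
    """Step matrix [[0, b_n], [1, a_n]] with a_n = n + 3 and b_n = -n."""
    n = position[0]
    return [[0, -n], [1, n + 3]]


# The initial matrix encodes a_0 = 3; the last column holds p_n and q_n.
initial: Matrix = [[1, 3], [0, 1]]
total = matmul(initial, walk([e_step], start=(1,), direction=(1,), depth=40))
p, q = total[0][1], total[1][1]
error = abs(Decimal(p) / Decimal(q) - Decimal(1).exp())
```

The product stays in exact integers until the last line, so no precision is lost while walking.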
The end product. A continued fraction
These must hold at all times. Any code change that violates them is incorrect.
- All numerical verification uses `mpmath` at ≥ 100 digits. Python `float` is forbidden for mathematical computation.
- Every `Formatter` can round-trip through JSON. `from_json_obj(to_json_obj(x))` must reconstruct an equivalent object.
- Shard inequalities are strict. A point on the boundary ($Ax = b$) is outside the shard.
- The Constant registry is the single source of truth. Two constants with the same name are the same constant.
- Stage outputs are stage inputs. Loading → Extraction → Analysis → Search. Each stage takes the previous stage's output without transformation.
- Modules are substitutable. Any `AnalyzerModScheme` subclass can replace `AnalyzerModV1` without changing `System`.
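The JSON round-trip invariant lends itself to a mechanical check. The sketch below uses a hypothetical `ToyFormatter` dataclass as a stand-in for a real `Formatter` subclass. Only the method names `to_json_obj` and `from_json_obj` come from this document; their signatures and the field layout are assumptions.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, Tuple


@dataclass(frozen=True)
class ToyFormatter:
    """Hypothetical stand-in for a Formatter subclass; fields are illustrative."""

    const: str
    a: Tuple[int, ...]
    b: Tuple[int, ...]

    def to_json_obj(self) -> Dict[str, Any]:
        # Tuples become lists because JSON has no tuple type.
        return {
            "type": type(self).__name__,
            "data": {"const": self.const, "a": list(self.a), "b": list(self.b)},
        }

    @classmethod
    def from_json_obj(cls, obj: Dict[str, Any]) -> "ToyFormatter":
        data = obj["data"]
        return cls(const=data["const"], a=tuple(data["a"]), b=tuple(data["b"]))


def assert_round_trip(x: ToyFormatter) -> None:
    """Serialize to real JSON text and back, then compare for equality."""
    wire = json.dumps(x.to_json_obj())
    restored = type(x).from_json_obj(json.loads(wire))
    assert restored == x, f"round trip changed the object: {restored!r} != {x!r}"


example = ToyFormatter(const="pi", a=(1, 2), b=(3,))
assert_round_trip(example)
```

Passing through `json.dumps` and `json.loads` rather than comparing dictionaries directly catches values that are not JSON-serializable.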
Be aware of these when contributing. They are opportunities, not just problems.
| Area | Limitation | Impact |
|---|---|---|
| Windows / LIReC | LIReC has installation issues on Windows; `cysignals`/`fpylll` are Linux-only | Tests and runs may need WSL or Linux |
| Parallelism | Extraction parallelism is commented out; search has basic chunking | Single-threaded bottleneck on large CMFs |
| Trajectory sampling | `EndToEndSamplingEngine` works but uniformity degrades in high dimensions | Thin cones may be undersampled |
| Database | SQLite is local-only; no shared/remote DB | Multi-user workflows need manual merging |
| Search depth | Max depth capped at 1500 | Some PCFs need deeper walks to converge |
| Ascent logic | Mentioned in README but not implemented | Cannot yet climb from a PCF to a higher-level identity |
| Proof generation | System finds formulas but does not prove them | Discovered PCFs are conjectures until proven |
Ordered by impact. When choosing what to work on, prefer items higher on this list.
- Fix all critical bugs before adding features. See open bug-fix branches.
- Increase test coverage for core modules (Constant, Formatter, Shard, DB, System).
- CI pipeline: automated `pytest` on every push; block merges on failure.
- Cross-platform installation: resolve LIReC/Windows issues or provide a Docker image.
- Adaptive depth control: automatically increase walk depth when δ is improving.
- Improved trajectory sampling: better coverage of thin cones in high dimensions.
- New CMF families: integrate additional inspiration function types beyond pFq and MeijerG.
- Shard prioritization heuristics: ML-based ranking of shards by likely discovery yield.
- Parallel extraction and search: multi-process/distributed shard evaluation.
- Remote database: shared SQLite → PostgreSQL or similar for team-wide deduplication.
- Result deduplication: detect coboundary-equivalent PCFs automatically.
- Ascent logic: given a PCF, find the CMF it belongs to and generate related formulas.
- Automated proof sketches: generate symbolic proofs or proof obligations for discovered formulas.
- Paper-ready output: auto-generate LaTeX summaries of discovered formulas with full verification.
dreamer/
├── configs/ # Dataclass-based configuration (one file per stage)
├── loading/ # Stage 1: DB, formatters, JSON serialization
├── extraction/ # Stage 2: hyperplanes, shards, samplers
├── analysis/ # Stage 3: analyzers, prioritization
├── search/ # Stage 4: searchers, search methods
├── system/ # System orchestrator (System.run)
└── utils/ # Shared: constants, schemes, storage, logging, types
- Modules (`*_mod.py`): orchestrate a stage, implement `execute()`.
- Methods (e.g., `serial_searcher.py`): the algorithmic core, called by modules.
- Schemes (`*_scheme.py`): abstract base classes defining the interface.
- Formatters (`*_fmt.py`): JSON-serializable wrappers around `ramanujantools` objects.
- Create a class inheriting from the relevant `*Scheme` (e.g., `SearcherModScheme`).
- Implement the required methods (`execute()`, etc.).
- Place the method in `dreamer/<stage>/methods/<name>/` and the module in `dreamer/<stage>/searchers/<name>/`.
- Add tests in `tests/test_<name>.py`.
- Register it by importing it in the stage's `__init__.py`.
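The steps above can be sketched as follows. The real `SearcherModScheme` interface is not shown in this document, so `ToySearcherModScheme`, its `execute()` signature, and the string-based shard and result types are all hypothetical placeholders.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class ToySearcherModScheme(ABC):
    """Hypothetical stand-in for SearcherModScheme; the real interface may differ."""

    @abstractmethod
    def execute(self, shards: Dict[str, List[str]]) -> Dict[str, str]:
        """Map each constant's prioritized shards to a search result."""


class FirstShardSearcherMod(ToySearcherModScheme):
    """Trivial module: report the first shard of every constant that has one."""

    def execute(self, shards: Dict[str, List[str]]) -> Dict[str, str]:
        return {const: found[0] for const, found in shards.items() if found}


def run_search(mod: ToySearcherModScheme, shards: Dict[str, List[str]]) -> Dict[str, str]:
    """The caller depends only on the scheme, so any subclass can be swapped in."""
    return mod.execute(shards)


results = run_search(FirstShardSearcherMod(), {"pi": ["s1", "s2"], "e": []})
```

Because the caller is typed against the abstract scheme, replacing `FirstShardSearcherMod` with another subclass needs no change to `run_search`.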
- Framework: `pytest`. Tests live in `tests/`.
- Every public function must have at least one test (see `COVERAGE_POLICY.md`).
- Mathematical tests must use `mpmath` with ≥ 100 digits of precision.
- Run: `pytest tests/ -v`
| Repo | Role | How Dreamer Uses It |
|---|---|---|
| ramanujantools | Core math library: CMF, PCF, Matrix, Position, Limit | Primary dependency; all CMF operations go through it |
| LIReC | Library of Integer Relations and Constants | Used for constant identification in analysis/search |
| RamanujanMachine | Original discovery algorithms (MITM-RF, ESMA) | Reference implementations; not a direct dependency |
| euler2ai | Formula harvesting from arXiv | Future integration for seeding inspiration functions |
Record important architectural decisions here so future contributors understand why, not just what.
| Date | Decision | Rationale |
|---|---|---|
| 2024 | Use `ramanujantools` as the CMF engine, not a custom implementation | Avoid duplication; leverage tested group library |
| 2024 | SQLite for local DB | Simplicity; no server needed for single-user runs |
| 2024 | Modular stage architecture with abstract schemes | Allow independent development of new analyzers/searchers |
| 2025 | `IDENTIFY_THRESHOLD = -1` means "accept all shards" | Analysis stage filters shards; -1 disables filtering for exploratory runs |
| 2026-04 | Fix `Constant.value_mpmath` infinite recursion | `@cached_property` was calling itself; use `_explicit_mpmath` backing field |
| 2026-04 | Fix `MeijerG.__init__` `super()` call argument order | `use_inv_t` was being passed as `shifts` positional arg |
| 2026-04 | Fix `SearcherModV1.execute()` empty return | `dms` dict was never populated with search results |
- Before starting any task: re-read sections 3–6 to understand the pipeline, objects, invariants, and limitations.
- Before proposing an architecture change: check section 10 (Decision Log) — someone may have already considered and rejected your idea.
- After completing a significant change: update sections 6, 7, or 10 as appropriate.
- When unsure what to work on: consult section 7 (Development Priorities).
This document is living. If something is wrong or missing, fix it in the same PR as the related code change.
This is a modular pipeline system for discovering polynomial continued fractions (PCFs) of mathematical constants via Conservative Matrix Fields (CMFs). The four-stage pipeline (Loading → Extraction → Analysis → Search) processes constants through CMF parameter spaces, partitioning into "shards" for parallelizable search. See SYSTEM_SPEC.md for full details.
Key components:
- `dreamer/system/system.py`: Orchestrates the pipeline via `System.run()`.
- `dreamer/configs/`: Dataclass-based configs for each stage (e.g., `analysis.py` for `IDENTIFY_THRESHOLD`).
- `dreamer/loading/`, `extraction/`, `analysis/`, `search/`: Modular stages with swappable implementations.
Data flows: Constant → Dict[Constant, List[ShiftCMF]] → Dict[Constant, List[Shard]] → prioritized shards → Dict[Searchable, DataManager] with discovered PCFs.
Reference: SYSTEM_SPEC.md (canonical), README.md (usage), pyproject.toml (deps), DEFINITION_OF_DONE.md (completion criteria), COVERAGE_POLICY.md (testing).