Add Noosphere Engine foundation (Phase 0 + Phase 1) #187
user1303836 wants to merge 5 commits into main
Phase 0 - Core package structure:
- SQLAlchemy models (10 tables): guild state, embeddings, egregore snapshots, soundscape, attractors, archive, crystal rooms, baselines
- Shared dataclasses: ProcessedMessage, CommunityStateVector
- Constants: PHI, Fibonacci, enums (ComputationMode, PathologyType, MessageClassification)
- EmbeddingService: async sentence-transformers wrapper (384-dim)
- OutputGovernor: O(1) sidechain gain, token bucket, cooldown
- SoundscapeMonitor: anthrophony/biophony/geophony classification

Phase 1 - Metrics and computation:
- MetricsComputer: three-tier scheduling (hourly/daily/weekly)
- WelfordAccumulator: online mean/variance with z-score normalization
- Egregore Index: weighted coherence + convergence + concentration
- Archive decay: reference-extended half-life (base 168h)
- FibonacciScheduler: golden-angle jitter for quasiperiodic scheduling
- PhiParameter: golden ratio mode weight oscillator

90 tests covering all modules.
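For readers unfamiliar with the token bucket mentioned for OutputGovernor, here is a minimal sketch of the general technique. It is not the PR's implementation: the class name TokenBucket, its parameters, and the injectable clock are all assumptions made for illustration.

```python
import time
from typing import Callable


class TokenBucket:
    """Hypothetical token bucket; illustrates the technique, not OutputGovernor itself."""

    def __init__(
        self,
        capacity: float,
        refill_per_sec: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self._clock = clock  # injectable so tests can control time
        self._tokens = capacity  # start full
        self._last = clock()

    def try_consume(self, cost: float = 1.0) -> bool:
        """Refill based on elapsed time, then spend `cost` tokens if available."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill is O(1): no per-token timers, just arithmetic on elapsed time.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_per_sec)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

A caller would check try_consume() before emitting output and skip the output when it returns False; a cooldown can be layered on top by also recording the time of the last successful call.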
Code Review: PR #187 -- Noosphere Engine Foundation (Phase 0 + Phase 1)
Verdict: Changes Requested -- 7 issues to address before merge, 5 positive observations.
Issues Requiring Changes
1. Enums use enum.Enum. Per our alignment agreement, all enums should use (str, enum.Enum):
# Current
class MessageClassification(enum.Enum):
    ANTHROPHONY = "anthrophony"
# Expected
class MessageClassification(str, enum.Enum):
    ANTHROPHONY = "anthrophony"
2. The dataclass is missing fields that dev-analytics and dev-features depend on:
These were agreed upon during interface alignment. Use
3.
from datetime import UTC, datetime
# ...
now = datetime.now(UTC)
4.
5.
6. Duplicate There are two
7.
def compute_hourly(self, guild_id: int) -> CommunityStateVector:
    messages = self._messages.pop(guild_id, [])
    embeddings = self._embeddings.pop(guild_id, [])
    # ... compute ...
Or cap at a configurable max size (e.g., last 1000 messages per guild).
Positive Observations
@greptile
Thanks for the detailed review! All 7 issues are clear and actionable. I'll address them before requesting re-review:
Appreciate the positive feedback on the implementation patterns! Will push fixes shortly.
- Change guild_id, channel_id, user_id, message_id from str to int in ProcessedMessage and CommunityStateVector (Discord IDs are integers)
- Add missing Phase 4 fields to CommunityStateVector: sentiment_alignment, interaction_modularity, fractal_dimension, lyapunov_exponent, gromov_curvature (default to math.nan)
- Update all dict key types in MetricsComputer, SoundscapeMonitor, OutputGovernor to match
- Update all tests to use int IDs
Per team-lead correction: existing codebase uses str for Discord IDs throughout (database models, cogs, repository layer). Reverting the int change to maintain consistency.
- Revert guild_id/channel_id/user_id/message_id back to str in ProcessedMessage, CommunityStateVector, and all dependent modules
- Change enum base from enum.Enum to (str, enum.Enum) for ComputationMode, PathologyType, MessageClassification to enable direct JSON serialization
- Keep Phase 4 fields (sentiment_alignment, interaction_modularity, fractal_dimension, lyapunov_exponent, gromov_curvature) with math.nan defaults
- Add test verifying enum JSON serialization
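The difference the (str, enum.Enum) base makes for JSON serialization can be seen in a small standalone sketch. The class names PlainClassification and StrClassification and the helper dumps_ok are hypothetical, chosen only to contrast the two bases; they are not the PR's definitions.

```python
import enum
import json


class PlainClassification(enum.Enum):
    """Plain Enum: members are not JSON serializable by default."""
    ANTHROPHONY = "anthrophony"


class StrClassification(str, enum.Enum):
    """str-mixin Enum: members are str instances, so json.dumps accepts them."""
    ANTHROPHONY = "anthrophony"


def dumps_ok(value: object) -> bool:
    """Return True if json.dumps can serialize the value without a custom encoder."""
    try:
        json.dumps({"classification": value})
        return True
    except TypeError:
        return False


# The str-based member serializes as its underlying string value.
encoded = json.dumps({"classification": StrClassification.ANTHROPHONY})
# Loading the value back through the enum recovers the same member.
decoded = StrClassification(json.loads(encoded)["classification"])
```

With the plain base, dumps_ok(PlainClassification.ANTHROPHONY) is False, which is the failure mode the review item guards against.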
- constants.py: Match foundation's enum.Enum base (not str, Enum), add MessageClassification and Phase 0 constants (embedding, archive, output governor), keep Crystal Room enums as additive-only additions
- shared/phi_parameter.py: Replace with foundation's canonical version (no tick_count or set_phase -- engine tracks its own tick count)
- test_constants.py: Keep only Crystal Room enum tests, defer shared constant/enum tests to foundation
- test_phi_parameter.py: Remove tick_count and set_phase tests
- test_engine.py: Use engine._tick_count instead of phi.tick_count
- test_serendipity.py: Fix flaky range test by using zero-noise injector
- Replace datetime.utcnow() with datetime.now(UTC) (item 3)
- Move TYPE_CHECKING imports to runtime with noqa comments (item 4)
- Remove unused chromadb dependency and mypy override (item 5)
- Remove duplicate [dependency-groups] section (item 6)
- Add MAX_MESSAGES/EMBEDDINGS_PER_GUILD bounds to MetricsComputer (item 7)
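One way to bound per-guild buffers, as item 7 asks, is a deque with maxlen. The sketch below is an assumption about the general approach, not the code in this PR: BoundedGuildBuffer and its methods are hypothetical names, and the default of 1000 only echoes the reviewer's "last 1000 messages per guild" example.

```python
from collections import defaultdict, deque

# Assumed default; the real MAX_MESSAGES_PER_GUILD value is not shown in the PR text.
MAX_MESSAGES_PER_GUILD = 1000


class BoundedGuildBuffer:
    """Hypothetical per-guild buffer that keeps only the newest max_items entries."""

    def __init__(self, max_items: int = MAX_MESSAGES_PER_GUILD) -> None:
        # deque(maxlen=...) silently drops the oldest entry once the cap is reached.
        self._items = defaultdict(lambda: deque(maxlen=max_items))

    def add(self, guild_id: str, item: object) -> None:
        self._items[guild_id].append(item)

    def drain(self, guild_id: str) -> list:
        """Return and clear the buffered items, mirroring the pop() in the review snippet."""
        return list(self._items.pop(guild_id, ()))
```

Draining on each hourly computation and capping between computations are complementary: the cap protects against a guild whose computation never runs, while draining keeps steady-state memory small.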
Re-review: All 7 Items Addressed
I've verified every fix against the original review. All items are resolved.
Original Issues -- Status
Verification Checklist
Minor Note (non-blocking)
Foundation uses
Verdict: Approve
This PR is clean and ready to merge.
Merge Crystal Room enums from dev-features into the canonical constants.py so PR #189 can import them after rebase. Both use str, enum.Enum base class for JSON serialization consistency.
- Engine now dispatches CommunityStateVector and ProcessedMessage objects instead of kwargs, matching Phase 2 cog listener signatures
- Add data_models.py with CommunityStateVector and ProcessedMessage (matching dev-foundation PR #187 canonical definitions)
- Cap ModeManager._history at 100 entries to prevent unbounded growth
- Split noosphere loading exception handling: ImportError -> debug, other exceptions -> warning with traceback
Summary
Implements the foundational infrastructure for the Noosphere Engine feature, covering Phase 0 (core package structure) and Phase 1 (metrics computation layer).
Phase 0 - Core Infrastructure:
- SQLAlchemy models (10 tables): NoosphereGuildState, MessageEmbedding, EgregoreSnapshot, UserBioelectricState, SoundscapeSnapshot, AttractorSnapshot, ArchiveEntry, ArchiveLink, CrystalRoom, GuildMetricsBaseline
- Shared dataclasses: ProcessedMessage, CommunityStateVector
- Enums: ComputationMode (10 modes), PathologyType (10 types), MessageClassification
- EmbeddingService: async wrapper around sentence-transformers (paraphrase-multilingual-MiniLM-L12-v2, 384-dim)
- OutputGovernor: O(1) sidechain gain computation with token bucket and cooldown
- SoundscapeMonitor: anthrophony/biophony/geophony message classification

Phase 1 - Metrics and Computation:
- MetricsComputer: three-tier scheduling (hourly/daily/weekly) producing CommunityStateVector
- WelfordAccumulator: online mean/variance with z-score normalization and sigmoid mapping
- FibonacciScheduler: golden-angle jitter for quasiperiodic scheduling
- PhiParameter: golden ratio mode weight oscillator

Dependencies added: sentence-transformers, chromadb, numpy, scipy, networkx
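As background for the WelfordAccumulator entry, the following is a minimal sketch of Welford's online mean/variance algorithm with z-score normalization and a sigmoid mapping into (0, 1). It is an illustration under stated assumptions, not the PR's class: the name WelfordSketch, the use of sample variance (n - 1), the zero fallback for degenerate variance, and the clamp at +/-60 are all choices made here.

```python
import math


class WelfordSketch:
    """Hypothetical online accumulator using Welford's update rule."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        """Fold one observation into the running statistics in O(1)."""
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (x - self.mean)  # uses the updated mean

    @property
    def variance(self) -> float:
        """Sample variance (assumed here); 0.0 until two observations exist."""
        return self._m2 / (self.count - 1) if self.count > 1 else 0.0

    def z_score(self, x: float) -> float:
        std = math.sqrt(self.variance)
        return (x - self.mean) / std if std > 0 else 0.0

    def normalized(self, x: float) -> float:
        """Map the z-score through a logistic sigmoid into (0, 1)."""
        z = max(-60.0, min(60.0, self.z_score(x)))  # clamp to avoid exp overflow
        return 1.0 / (1.0 + math.exp(-z))
```

A value equal to the running mean maps to 0.5, and values above or below the mean move toward 1 or 0 respectively, which keeps metrics with different scales comparable.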
Test plan