
Demos #187

Open
mikezangus wants to merge 21 commits into main from demos

Conversation

@mikezangus
Collaborator

No description provided.

@blacksmith-sh
Contributor

blacksmith-sh bot commented Jan 29, 2026

Found 1 test failure on Blacksmith runners:

Failure: test_rlm_v1/test_rlm_v1_basic_completion (View Logs)

@greptile-apps
Contributor

greptile-apps bot commented Jan 29, 2026

Greptile Overview

Greptile Summary

This PR adds comprehensive progress tracking and demo utilities for GEPA optimization and eval workflows. The changes improve user experience with real-time progress updates, better terminal event detection, and infrastructure improvements for tunnel management.

Key Changes

  • Progress Tracking System: New synth_ai/sdk/optimization/progress/handlers.py (777 lines) provides comprehensive progress handlers with clock tracking, idle status ticking, and formatted output for optimization and eval jobs
  • Eval Job Improvements: Enhanced poll_until_complete() with retry logic for fetching results, better terminal state handling, and integration with progress printers. Default polling interval reduced from 15s to 1s for faster feedback
  • Stream Event Detection: Improved terminal event detection in streamer.py with flexible suffix matching for job.completed and job.failed patterns, plus grace periods and stale timeouts
  • Tunnel Infrastructure:
    • Replaced blocking Rust health checks with pure Python async httpx to avoid GIL contention
    • Added tunnel_close() function for proper lease cleanup
    • Fixed regex escaping in cloudflared URL pattern
    • Improved DNS verification with retry logic
    • Disabled DNS verification for quick tunnels due to reliability issues
  • Utility Functions: New seed sampling (stratified_seed_sample, split_seed_slices) and stats utilities (confidence_band) exported from main package
  • Demo Support: Added pre-commit hook and script to auto-generate Jupyter notebooks from demo .py files with cell markers
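The flexible suffix matching described for streamer.py can be sketched as follows. This is a hypothetical helper, not the actual implementation; it simply treats any event type whose name ends in `job.completed` or `job.failed` (bare or namespaced, e.g. `eval.job.completed`) as terminal:

```python
def is_terminal_event(event_type: str) -> bool:
    """Return True for any event type ending in job.completed or job.failed.

    Suffix matching lets both bare ("job.completed") and namespaced
    ("eval.job.completed") event types count as terminal.
    """
    return event_type.endswith("job.completed") or event_type.endswith("job.failed")
```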

Notable Implementation Details

  • The Rust connector refactoring properly releases locks before awaiting process cleanup to avoid deadlocks
  • Progress handlers merge best_prompt/best_score from both handlers and API results without clobbering
  • Eval jobs retry fetching results for up to 10s after terminal status, handling the lag between status and results availability
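The retry-with-deadline behaviour for fetching results can be sketched like this. `get_results` is a placeholder callable standing in for the SDK's results fetch; the actual API may differ:

```python
import time


def fetch_results_with_deadline(get_results, deadline_s: float = 10.0, interval_s: float = 1.0):
    """Poll get_results() until it returns a non-empty value or the deadline passes.

    After a job reports a terminal status, results may lag behind, so keep
    retrying for up to deadline_s seconds before giving up and letting the
    caller fall back to status metadata.
    """
    deadline = time.monotonic() + deadline_s
    while True:
        results = get_results()
        if results:
            return results
        if time.monotonic() >= deadline:
            return results  # give up; caller falls back to status metadata
        time.sleep(interval_s)
```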

Confidence Score: 4/5

  • This PR is safe to merge with minor considerations around testing the new progress tracking system
  • The changes are well-structured and address real issues (GIL contention, DNS reliability, progress feedback). The code quality is high with proper error handling and retry logic. Score of 4 (not 5) reflects the significant scope of new functionality (777-line progress handlers module) that would benefit from thorough testing in production-like scenarios, particularly the new polling retry logic and progress printer integration
  • synth_ai/sdk/optimization/progress/handlers.py and synth_ai/sdk/eval/job.py need thorough testing with real workloads to verify the new progress tracking and retry logic work correctly across different edge cases

Important Files Changed

  • synth_ai/core/streaming/streamer.py: Enhanced terminal event detection with flexible suffix matching, added grace periods and stale timeouts, improved SSE streaming robustness
  • synth_ai/core/tunnels/rust.py: Replaced Rust-based health check with pure Python async httpx to avoid GIL contention with uvicorn
  • synth_ai/sdk/eval/job.py: Major improvements to polling: added progress printer, retry logic for fetching results, better handling of terminal states, reduced default interval to 1s
  • synth_ai/sdk/optimization/progress/handlers.py: New comprehensive progress handlers for GEPA optimization and eval with clock, idle ticker, and status printing
  • synth_ai_core/src/tunnels/connector.rs: Refactored stop() and idle timeout logic to properly release lock before awaiting process cleanup
  • synth_ai_py/src/lib.rs: Added tunnel_close function to properly close tunnel leases
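The lock-release pattern from the connector.rs refactor can be illustrated with a toy asyncio analogue (hypothetical class, not the SDK's code): take ownership of the resource while holding the lock, then await cleanup only after the lock is released, so other tasks that need the same lock during shutdown cannot deadlock.

```python
import asyncio


class Connector:
    """Toy analogue of the Rust connector's stop() refactor."""

    def __init__(self, proc):
        self._lock = asyncio.Lock()
        self._proc = proc

    async def stop(self):
        async with self._lock:
            # Take ownership of the process under the lock...
            proc, self._proc = self._proc, None
        # ...but await its cleanup only after the lock is released.
        if proc is not None:
            await proc.wait()
```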

Sequence Diagram

```mermaid
sequenceDiagram
    participant User
    participant EvalJob
    participant Streamer
    participant ProgressPrinter
    participant Backend
    participant Tunnel
    
    User->>EvalJob: poll_until_complete(progress=True)
    EvalJob->>ProgressPrinter: create printer with label
    EvalJob->>ProgressPrinter: log_start()
    
    loop Until terminal state
        EvalJob->>Backend: get_status()
        Backend-->>EvalJob: status + results
        EvalJob->>ProgressPrinter: handle_status(status)
        ProgressPrinter->>ProgressPrinter: update clock, tick idle timer
        ProgressPrinter-->>User: print progress update
        
        alt Status is terminal (completed)
            loop Retry with deadline
                EvalJob->>Backend: get_results()
                alt Results available
                    Backend-->>EvalJob: full results
                else Results not ready
                    EvalJob->>EvalJob: wait 1s, retry
                end
            end
            
            alt Results empty
                EvalJob->>Backend: get_status() (fresh)
                Backend-->>EvalJob: status with results in metadata
            end
            
            EvalJob->>ProgressPrinter: log_terminal(status, mean_reward)
            ProgressPrinter-->>User: print final status
            EvalJob-->>User: return EvalResult
        end
        
        EvalJob->>EvalJob: sleep(interval)
    end
    
    Note over User,Tunnel: Tunnel Management
    User->>Tunnel: create tunnel via tunnel_open()
    Tunnel->>Tunnel: skip DNS verification for quick tunnels
    Tunnel->>Backend: wait_for_health_check() via httpx
    
    loop Health check polling
        Tunnel->>Backend: GET /health
        alt Health check passes
            Backend-->>Tunnel: 200 OK
            Tunnel-->>User: tunnel ready
        else Check fails/timeout
            Tunnel->>Tunnel: sleep 0.5s, retry
        end
    end
    
    User->>Tunnel: tunnel_close(lease_id)
    Tunnel->>Tunnel: cleanup connector state
```


@greptile-apps greptile-apps bot left a comment


6 files reviewed, 2 comments


Comment on synth_ai/core/streaming/streamer.py, lines +75 to +76
if event_type.endswith("job.completed") or event_type.endswith(".job.completed"):
return True


The second condition endswith(".job.completed") is redundant: any string ending in ".job.completed" already matches endswith("job.completed"). Consider simplifying to event_type.endswith("job.completed").

Suggested change
```diff
-if event_type.endswith("job.completed") or event_type.endswith(".job.completed"):
+if event_type.endswith("job.completed"):
     return True
```

Comment on synth_ai/core/streaming/streamer.py, lines +90 to +91
if event_type.endswith("job.failed") or event_type.endswith(".job.failed"):
return True


Same redundancy: endswith(".job.failed") is a subset of endswith("job.failed").

Suggested change
```diff
-if event_type.endswith("job.failed") or event_type.endswith(".job.failed"):
+if event_type.endswith("job.failed"):
     return True
```
