Profiler Agent --- Full Technical Implementation Plan (v1.0.0)

Purpose Convert a CodeMessage (C++ 17 source) into a ProfileReport that contains precise runtime and memory telemetry across logarithmically‑spaced input scales, executed inside an isolated Docker sandbox with strict resource limits.

Reader Senior Python/C++ developer; after reading, you can implement the agent end‑to‑end with zero follow‑up questions.

1 High‑Level Responsibilities

Prepare input cases based on plan‑derived input_bounds (log‑scale + corner cases).
Compile the C++ code with deterministic flags.
Execute the binary under /usr/bin/time -v in a Docker container with cgroup & resource.setrlimit caps.
Parse the time & memory output into numeric lists aligned with input_sizes.
Collect (optional) hotspot information via gprof when debug=true.
Emit a strict ProfileReport Pydantic model.

2 Public API

class Profiler(Agent):
    """Empirical runtime/memory measurement for a single CodeMessage."""

    def run(self, code: CodeMessage, *, debug: bool = False) -> ProfileReport:
        """Compile & execute, returning ProfileReport.

        Raises:
            SandboxError: on compilation or runtime error that persists after 1 retry.
        """

2.1 Configurable Constants (utils.config)

3 Internal Architecture

Profiler.run()
 ├─ _prepare_inputs()   # deterministic input strings
 ├─ _compile_cpp()      # g++ -O2 -std=c++17
 │    └─ returns path to binary
 ├─ for each n in input_sizes:
 │      └─ _execute_binary()   # wraps docker + /usr/bin/time -v
 │            ├─ stdout
 │            └─ _parse_time_output()
 ├─ if debug:
 │      └─ _collect_gprof()
 └─ Build ProfileReport → return

3.1 Module Break‑Down

4 Docker Sandbox Details

4.1 Base Image

FROM ubuntu:20.04
RUN apt-get update && apt-get install -y build-essential gprof time
RUN useradd -m sandbox
USER sandbox
WORKDIR /sandbox

Image tag: ghcr.io/swiftsolve/sandbox:20.04-gcc10

4.2 Entrypoint Script `sandbox_entry.sh`

#!/usr/bin/env bash
ulimit -v $(( ${SANDBOX_MEM_MB} * 1024 ))   # RLIMIT_AS
ulimit -s 262144                             # 256 MB stack
exec /usr/bin/time -v "$@"

Mount path /code for source + binary.

4.3 Resource Enforcement

Container memory limit = ${SANDBOX_MEM_MB}m.
--pids-limit 64 to avoid fork bombs.
No network (--network none).

5 Input Generation Strategy

For an integer bound n_max extracted from PlanMessage.input_bounds:

sizes_log = [int(x) for x in [1e3, 5e3, 1e4, 5e4, 1e5] if x <= n_max]
input_sizes = [0,1] + sizes_log

5.1 Generic Generator (placeholder)

If task type unknown, feed {n}\n where task likely reads n and exits. TODO: once datasets/task_format.py matured, switch to task‑specific input constructors provided by dataset metadata.

6 Telemetry Parsing

Regex patterns:

_RX_WALL  = re.compile(r"Elapsed \(wall clock\) time.*?:\s*(\d+:\d+\.\d+)")
_RX_RSS   = re.compile(r"Maximum resident set size \(kbytes\):\s*(\d+)")

Conversion:

mins,secs = wall.split(':'); runtime_ms = (int(mins)*60+float(secs))*1000
peak_mb = int(rss_kb)//1024

Edge cases: if regex missing → raise ParseError and mark run as failure.

7 Failure Handling & Retries

| Stage | Retry Count | On Failure Action | | Compilation | 1 | Re‑run _compile_cpp; if still fails → raise SandboxError(code="compile") | | Execution | 0 (deterministic) | Capture stderr; if non‑zero exit or timeout → mark runtime ∞, memory ∞, continue loop; set hotspot {"_crash":"SIGSEGV"} | | Parsing | 0 | Raise ParseError → controller counts as agent failure |

8 ProfileReport Construction

report = ProfileReport(
    task_id=code.task_id,
    iteration=code.iteration,
    input_sizes=input_sizes,
    runtime_ms=runtimes,
    peak_memory_mb=memories,
    hotspots=hotspots,
)

Validation: len(runtime_ms) == len(input_sizes) == len(peak_memory_mb).

9 Logging

Profiler logger name: Profiler.
INFO: compile command, each input size runtime/mem.
WARNING: compile/runtime failures.
DEBUG (if LOG_LEVEL=DEBUG): full /usr/bin/time -v dump.

10 Unit & Integration Tests

Coverage ≥ 90 % for agents/profiler.py.

11 Acceptance Criteria

Given a syntactically correct CodeMessage, Profiler.run() returns a valid ProfileReport within 2 × len(input_sizes) seconds total wall time.
Report numbers differ < 5 % across two consecutive runs on same host (determinism).
Memory is within ± 2 MB of /usr/bin/time manual run.

End of Profiler Agent Technical Plan

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Profiler Agent --- Full Technical Implementation Plan (v1.0.0)

1 High‑Level Responsibilities

2 Public API

2.1 Configurable Constants (utils.config)

3 Internal Architecture

3.1 Module Break‑Down

4 Docker Sandbox Details

4.1 Base Image

4.2 Entrypoint Script `sandbox_entry.sh`

4.3 Resource Enforcement

5 Input Generation Strategy

5.1 Generic Generator (placeholder)

6 Telemetry Parsing

7 Failure Handling & Retries

8 ProfileReport Construction

9 Logging

10 Unit & Integration Tests

11 Acceptance Criteria

FilesExpand file tree

PROFILER.md

Latest commit

History

PROFILER.md

File metadata and controls

Profiler Agent --- Full Technical Implementation Plan (v1.0.0)

1 High‑Level Responsibilities

2 Public API

2.1 Configurable Constants (utils.config)

3 Internal Architecture

3.1 Module Break‑Down

4 Docker Sandbox Details

4.1 Base Image

4.2 Entrypoint Script sandbox_entry.sh

4.3 Resource Enforcement

5 Input Generation Strategy

5.1 Generic Generator (placeholder)

6 Telemetry Parsing

7 Failure Handling & Retries

8 ProfileReport Construction

9 Logging

10 Unit & Integration Tests

11 Acceptance Criteria

4.2 Entrypoint Script `sandbox_entry.sh`