@olesho olesho commented Dec 12, 2025

  • First simple eval test is passing

@olesho olesho requested a review from tysonthomas9 December 12, 2025 15:15

 COPY submodules/kernel-images/server/ .
-RUN GOOS=${TARGETOS:-linux} GOARCH=${TARGETARCH:-arm64} \
+RUN GOOS=${TARGETOS:-linux} GOARCH=${TARGETARCH:-amd64} \

Do we need the amd64 flag or does it auto detect?
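
On the auto-detect question, here is a minimal sketch of how the `${TARGETARCH:-amd64}` fallback in the changed `RUN` line behaves under ordinary shell parameter expansion. The helper name `resolve_goarch` is hypothetical. The sketch assumes the Dockerfile stage declares `ARG TARGETARCH`; in that case BuildKit supplies the target architecture, and the `amd64` default applies only when the variable is unset or empty.

```shell
#!/bin/sh
# resolve_goarch mirrors the ${TARGETARCH:-amd64} expansion from the RUN line.
# $1 stands in for the TARGETARCH build argument (hypothetical helper).
resolve_goarch() {
  TARGETARCH=$1
  # ":-" falls back to amd64 when TARGETARCH is unset or empty.
  printf '%s\n' "${TARGETARCH:-amd64}"
}

resolve_goarch ""       # no build arg supplied: falls back to the default
resolve_goarch "arm64"  # a value supplied by the builder wins over the default
```

Under these assumptions the flag is a fallback rather than a hard pin: a build that passes `--platform` (or an explicit `--build-arg TARGETARCH=...`) should override it.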

@tysonthomas9

@claude

@tysonthomas9 tysonthomas9 requested a review from Copilot December 13, 2025 17:34

Copilot AI left a comment

Pull request overview

This PR sets up CircleCI for continuous integration and deployment, including build pipelines, testing, and evaluation workflows. The changes enable automated testing of browser operator functionality with Docker-based build stages and API validation.

Key changes:

  • Added comprehensive CircleCI configuration with build, test, and deployment workflows
  • Switched from platform-specific (arm64) to platform-agnostic Docker base images
  • Migrated submodule URL from SSH to HTTPS for CI compatibility
  • Updated submodule pointer for browser-operator-core

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| .circleci/config.yml | New CircleCI pipeline with Docker builds, API tests, eval tests, and Cloud Run deployment |
| deployments/local/Dockerfile | Removed platform-specific constraints and updated default architecture from arm64 to amd64 |
| .gitmodules | Changed browser-operator-core URL to HTTPS and removed webarena submodule |
| submodules/browser-operator-core | Updated submodule commit reference |

Comment on lines +256 to +265
curl -v -X POST http://localhost:8080/v1/responses \
-H "Content-Type: application/json" \
-d '{
"input": "How much is 5x7? Just respond with the number.",
"url": "about:blank",
"wait_timeout": 30000,
"model": {
"main_model": {"provider": "openai", "model": "gpt-4o-mini", "api_key": "'"${OPENAI_API_KEY}"'"}
}
}' 2>&1 || true

Copilot AI Dec 13, 2025

API key is being passed directly in command line arguments, which can be logged in process listings and CI logs. Consider using environment variables or secure file-based configuration instead.

Suggested change
-curl -v -X POST http://localhost:8080/v1/responses \
--H "Content-Type: application/json" \
--d '{
-"input": "How much is 5x7? Just respond with the number.",
-"url": "about:blank",
-"wait_timeout": 30000,
-"model": {
-"main_model": {"provider": "openai", "model": "gpt-4o-mini", "api_key": "'"${OPENAI_API_KEY}"'"}
-}
-}' 2>&1 || true
+cat > payload.json <<EOF
+{
+"input": "How much is 5x7? Just respond with the number.",
+"url": "about:blank",
+"wait_timeout": 30000,
+"model": {
+"main_model": {
+"provider": "openai",
+"model": "gpt-4o-mini",
+"api_key": "${OPENAI_API_KEY}"
+}
+}
+}
+EOF
+curl -v -X POST http://localhost:8080/v1/responses \
+-H "Content-Type: application/json" \
+--data-binary @payload.json 2>&1 || true
+rm -f payload.json
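
A possible variant of the suggested change that also avoids the temporary file: the request body is produced by a function and piped to curl on stdin with `--data-binary @-`, so the key appears neither in curl's argument list nor on disk. The function names `build_payload` and `send_request` are hypothetical, and the sketch assumes the key contains no characters that would need JSON escaping.

```shell
#!/bin/sh
# build_payload prints the JSON request body. The key is read from the
# OPENAI_API_KEY environment variable when the heredoc is expanded.
build_payload() {
  cat <<EOF
{
  "input": "How much is 5x7? Just respond with the number.",
  "url": "about:blank",
  "wait_timeout": 30000,
  "model": {
    "main_model": {
      "provider": "openai",
      "model": "gpt-4o-mini",
      "api_key": "${OPENAI_API_KEY}"
    }
  }
}
EOF
}

# send_request pipes the body to curl on stdin ("@-"). It is only defined
# here, not called, so loading this sketch sends nothing.
send_request() {
  build_payload | curl -sS -X POST http://localhost:8080/v1/responses \
    -H "Content-Type: application/json" \
    --data-binary @-
}
```

In a CI step this would be invoked as `send_request || true` to keep the original non-failing behavior.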

olesho and others added 7 commits December 13, 2025 19:24
Reports are saved to evals/reports/ (relative to config.yml),
not evals/native/reports/

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Create Dockerfile.devtools.amd64 for Intel/AMD (CircleCI)
- Create Dockerfile.devtools.arm64 for Apple Silicon (local Mac)
- Update Makefile to auto-detect platform and use appropriate Dockerfile
- Update CircleCI to explicitly use amd64 Dockerfile
- Remove hardcoded --platform flag from base Dockerfile.devtools

This fixes the build error on Mac M1/M2/M3 where the gn build tool
was failing because it was trying to use Linux x64 binaries.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Usage:
  PLATFORM=amd64 make rebuild-devtools
  PLATFORM=arm64 make rebuild-devtools

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
DevTools build toolchain only supports AMD64, so we use a single
Dockerfile with --platform=linux/amd64 for the build stages.
On Apple Silicon, Docker Desktop uses Rosetta 2 emulation.

- Remove platform-specific Dockerfiles (amd64, arm64)
- Update unified Dockerfile.devtools with clear documentation
- Simplify Makefile (remove platform detection/override)
- Update CircleCI to use unified Dockerfile

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Go cross-compilation uses TARGETARCH from Docker BuildKit
- Makefile auto-detects platform (arm64 on Apple Silicon, amd64 otherwise)
- Override with PLATFORM=amd64 or PLATFORM=arm64
- Passes --platform flag to docker build for kernel-browser image

Example:
  make rebuild                    # Uses detected platform
  PLATFORM=amd64 make rebuild     # Force AMD64

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
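
The platform auto-detection described in the commit above could look roughly like the following in shell. This is a sketch, not the repository's Makefile logic, which is not shown in this thread. The function name `detect_platform` is hypothetical, and treating `aarch64` as arm64 is an added assumption for Linux ARM hosts.

```shell
#!/bin/sh
# detect_platform maps a machine name (as printed by `uname -m`) to a
# Docker platform suffix. Hypothetical helper, not taken from the Makefile.
detect_platform() {
  case "$1" in
    # Apple Silicon reports arm64; aarch64 is an added assumption.
    arm64|aarch64) echo arm64 ;;
    # Everything else is treated as amd64, as the commit message describes.
    *) echo amd64 ;;
  esac
}

# An explicit PLATFORM value wins; otherwise fall back to detection.
PLATFORM="${PLATFORM:-$(detect_platform "$(uname -m)")}"
echo "building for linux/${PLATFORM}"
```

The resolved value would then be passed on as `docker build --platform "linux/${PLATFORM}"`.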
The previous approach of removing out/Default broke gn initialization.
Now we properly regenerate build files with 'gn gen' and rebuild with
'autoninja' which is the standard Chromium build approach.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
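
The regenerate-then-rebuild sequence from the last commit, sketched as a shell helper. `gn gen` and `autoninja -C` are the standard Chromium commands the commit names; the `run` and `rebuild_devtools` helpers and the `DRY_RUN` switch are hypothetical and not taken from the repository.

```shell
#!/bin/sh
# run executes a command, or only prints it when DRY_RUN is set.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# rebuild_devtools regenerates the build files in place instead of deleting
# the output directory, then rebuilds. $1 defaults to out/Default.
rebuild_devtools() {
  out_dir=${1:-out/Default}
  run gn gen "$out_dir" && run autoninja -C "$out_dir"
}
```

Running `DRY_RUN=1 rebuild_devtools` prints the two commands without executing them, which is a cheap way to check the sequence on a machine without the Chromium toolchain.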