
feat: claude-web-computer-agent — web search + computer use with BMasterAI telemetry#51

Merged
travis-burmaster merged 3 commits into main from feat/claude-web-computer-agent
Mar 15, 2026

Conversation

@travis-burmaster
Owner

Summary

  • Adds examples/claude-web-computer-agent/ — a bare-metal Claude tool-use agent combining web search (Tavily) and computer use (screenshot/click/type/key/scroll), fully instrumented with BMasterAI logging and monitoring
  • No frameworks — just the raw Anthropic SDK tool_use / tool_result cycle, making this the foundational reference before the LangGraph examples
  • Screenshots are returned as multimodal image blocks so Claude can actually see the screen
  • BMasterAI telemetry on every LLM call, tool dispatch, decision point, and error path — output to logs/agent.log, logs/agent.jsonl, and logs/reasoning/
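The raw tool_use / tool_result cycle described above can be sketched roughly as follows. This is an illustrative skeleton, not the example's actual code: the model id is a placeholder, the tool schemas are elided, and the real agent.py wraps every step in BMasterAI logging.

```python
# Sketch of the bare-metal Anthropic tool-use loop (names illustrative).
def run_agent(client, run_tool, user_prompt,
              model="claude-3-5-sonnet-20241022",  # placeholder model id
              tools=(), max_turns=20):
    """Drive the tool-use loop until Claude stops requesting tools."""
    messages = [{"role": "user", "content": user_prompt}]
    for _ in range(max_turns):  # turn limit guards against runaway loops
        response = client.messages.create(
            model=model, max_tokens=4096,
            tools=list(tools),  # web_search / computer_use schemas
            messages=messages,
        )
        messages.append({"role": "assistant", "content": response.content})
        if response.stop_reason != "tool_use":
            # Final answer: concatenate the text blocks.
            return "".join(b.text for b in response.content if b.type == "text")
        # Execute every tool_use block and echo back matching tool_results.
        results = [
            {"type": "tool_result", "tool_use_id": b.id,
             "content": run_tool(b.name, b.input)}
            for b in response.content if b.type == "tool_use"
        ]
        messages.append({"role": "user", "content": results})
    return "(max turns reached)"
```

Each `tool_result` must carry the `tool_use_id` of the request it answers, and all results for a turn go back in a single user message.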

Files

| File | Purpose |
| --- | --- |
| tools.py | web_search and computer_use schemas + dispatch |
| agent.py | WebComputerAgent class — tool-use loop with full BMasterAI instrumentation |
| main.py | CLI entry point with env checks and interactive fallback |
| requirements.txt | bmasterai>=0.2.3, anthropic>=0.40.0, tavily-python |
| .env.example | ANTHROPIC_API_KEY + TAVILY_API_KEY |

Test plan

  • pip install -r requirements.txt installs cleanly
  • python main.py "Search for latest Anthropic news" runs without errors
  • python main.py "Take a screenshot" returns a computer_use result
  • logs/agent.jsonl is written with structured telemetry after each run
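For the screenshot case in the test plan, the capture comes back to Claude as a multimodal image block rather than text. A minimal sketch of that wrapping (the helper name is hypothetical; the field names follow the Anthropic Messages API image content block):

```python
import base64

def screenshot_tool_result(tool_use_id: str, png_bytes: bytes) -> dict:
    """Wrap raw PNG bytes (e.g. scrot's stdout) as a multimodal
    tool_result so Claude can actually see the screen."""
    return {
        "type": "tool_result",
        "tool_use_id": tool_use_id,
        "content": [{
            "type": "image",
            "source": {
                "type": "base64",
                "media_type": "image/png",
                "data": base64.b64encode(png_bytes).decode("ascii"),
            },
        }],
    }
```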

Generated with Claude (Cowork mode)

…terAI telemetry

Adds a bare-metal Claude tool-use agent combining web search (Tavily)
and computer use (screenshot/click/type/key/scroll) — no frameworks,
just the Anthropic SDK — fully instrumented with BMasterAI logging
and monitoring.

What it demonstrates:
- The raw Anthropic tool_use / tool_result message cycle
- Multimodal tool_result (screenshots sent back as image blocks)
- BMasterAI telemetry on every LLM call, tool dispatch, and decision point
- Structured JSONL telemetry output for downstream analytics

Stack: Anthropic SDK, Tavily, xdotool + scrot, BMasterAI

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@claude

claude bot commented Mar 15, 2026

Claude finished @travis-burmaster's task.


Code Review Complete ✅

Todo List

  • Read repository guidelines (CLAUDE.md) - No CLAUDE.md found
  • Review all 7 new files in the PR
  • Analyze code quality and best practices
  • Check for potential bugs and security issues
  • Evaluate performance considerations
  • Assess test coverage
  • Provide comprehensive feedback

Overall Assessment

This is a well-structured and thoroughly documented example implementation that demonstrates the core tool-use pattern for Claude agents. The code quality is high with comprehensive BMasterAI telemetry integration.

✅ Strengths

Code Quality & Architecture

  • Clean separation of concerns: Tools, agent logic, and CLI are properly separated
  • Comprehensive documentation: Excellent README with clear examples and setup instructions
  • Robust error handling: Graceful degradation when dependencies are missing
  • Defensive programming: Proper validation of required parameters before tool execution
  • Consistent code style: Well-formatted with clear docstrings

Security & Best Practices

  • Environment variable handling: Proper use of environment variables for API keys
  • No hardcoded secrets: Template provided in .env.example
  • Input validation: Parameters validated before tool execution
  • Subprocess security: Uses check=True and proper argument handling

Performance Considerations

  • Efficient token usage: Truncates long content (snippets to 500 chars, logs to 200 chars)
  • Resource management: Proper cleanup in finally blocks
  • Caching: Base64 images stripped from logs to reduce memory usage
  • Timeout handling: MAX_TURNS prevents infinite loops

⚠️ Areas for Improvement

Security Issues

  1. Command Injection Risk (tools.py:164, 170, 176, 186)

    # Current - vulnerable to injection
    subprocess.run(["xdotool", "type", "--clearmodifiers", text], check=True)
    
    # Safer approach
    subprocess.run(["xdotool", "type", "--clearmodifiers", "--", text], check=True)

    Impact: Malicious input could execute arbitrary commands if text contains shell metacharacters.

  2. Path Traversal (tools.py:148-149)

    # Consider validating output path for screenshot
    result = subprocess.run(["scrot", "-", "--quality", "80"], ...)

    Recommendation: While current usage is safe, consider validating any file paths in future extensions.
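Applying the `--` fix from item 1 while keeping argument-array execution can be centralized in one place. A sketch under the assumption that tools.py currently inlines these calls (the helper name is hypothetical):

```python
def xdotool_argv(subcommand: str, text: str, *flags: str) -> list[str]:
    """Build an xdotool argv with an end-of-options '--' sentinel so
    user-supplied text (e.g. '--delay' or '-x') is always treated as
    data, never parsed as a CLI flag."""
    return ["xdotool", subcommand, *flags, "--", text]

# Usage (sketch):
#   subprocess.run(xdotool_argv("type", user_text, "--clearmodifiers"),
#                  check=True, timeout=10)
```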

Error Handling & Reliability

  1. Incomplete Error Context (agent.py:171-172)

    except Exception as e:
        # Missing context about which operation failed
        self.monitor.track_error(AGENT_ID, type(e).__name__)

    Recommendation: Include more context (turn number, operation type, input parameters).

  2. Resource Cleanup (tools.py:147-159)

    # No cleanup if scrot process hangs
    result = subprocess.run(["scrot", "-", "--quality", "80"], capture_output=True, check=True)

    Recommendation: Add timeout parameter to prevent hanging processes.
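The timeout recommendation above could be applied with a small wrapper; `subprocess.run` kills the child process itself when `timeout` expires. This is a sketch, not the example's code, and the helper name is illustrative:

```python
import subprocess

def run_with_timeout(cmd: list[str], timeout: float = 10.0) -> bytes:
    """Run a command with a hard timeout. In tools.py this would wrap
    e.g. ["scrot", "-", "--quality", "80"]."""
    try:
        result = subprocess.run(cmd, capture_output=True, check=True,
                                timeout=timeout)
    except subprocess.TimeoutExpired as e:
        # subprocess.run has already killed the child; re-raise as a
        # tool-level error the agent can report back to Claude.
        raise RuntimeError(f"{cmd[0]} timed out after {timeout}s") from e
    return result.stdout
```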

Code Quality Enhancements

  1. Type Hints (tools.py:139, agent.py:189)

    # Current
    def _run_computer_use(action: str, x: int = None, ...)
    
    # Better (requires: from typing import Optional)
    def _run_computer_use(action: str, x: Optional[int] = None, ...)
  2. Constants (agent.py:37-39)

    # Consider making these configurable
    DEFAULT_MODEL = "claude-opus-4-6"
    MAX_TURNS = 20
    MAX_TOKENS = 4096
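Making those constants configurable is a one-line change each via environment overrides. The AGENT_* variable names and the fallback model id below are illustrative, not taken from the example:

```python
import os

# Environment-overridable defaults, as suggested above.
DEFAULT_MODEL = os.environ.get("AGENT_MODEL", "claude-3-5-sonnet-20241022")
MAX_TURNS = int(os.environ.get("AGENT_MAX_TURNS", "20"))
MAX_TOKENS = int(os.environ.get("AGENT_MAX_TOKENS", "4096"))
```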

Test Coverage

  1. Missing Tests: No test files included
    • Recommendation: Add unit tests for:
      • Tool dispatch functions
      • Error handling scenarios
      • Environment variable validation
      • BMasterAI telemetry integration
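Tests for the tool dispatcher could look roughly like this. The `dispatch_tool` stand-in below only mirrors the tools.py pattern; its signature is an assumption, not the actual API:

```python
# Sketch of pytest-style unit tests for tool dispatch (names assumed).
def dispatch_tool(name: str, args: dict) -> str:
    """Stand-in dispatcher mirroring the tools.py pattern: unknown
    tools return an error string rather than raising, so the error
    flows back to Claude as a tool_result."""
    handlers = {"web_search": lambda a: f"searched: {a['query']}"}
    if name not in handlers:
        return f"error: unknown tool {name}"
    return handlers[name](args)

def test_known_tool_dispatches():
    assert dispatch_tool("web_search", {"query": "x"}) == "searched: x"

def test_unknown_tool_returns_error_not_raises():
    assert dispatch_tool("no_such_tool", {}).startswith("error:")
```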

Documentation

  1. API Model Reference (agent.py:37, main.py:70)
    DEFAULT_MODEL = "claude-opus-4-6"  # This model may not exist
    Recommendation: Use a stable model reference like claude-3-5-sonnet-20241022 or make model configurable.

🔧 Specific Recommendations

Immediate (Security)

  1. Add -- separator to xdotool commands to prevent command injection
  2. Add subprocess timeouts to prevent hanging processes

Short Term (Reliability)

  1. Enhance error messages with operation context
  2. Add input validation for tool parameters
  3. Make model and limits configurable via environment variables

Long Term (Maintainability)

  1. Add comprehensive test suite
  2. Consider adding retry logic for failed tool operations
  3. Add configuration validation on startup

📊 Performance Analysis

  • Token efficiency: ✅ Good truncation strategies
  • Memory usage: ✅ Base64 data properly stripped from logs
  • Network efficiency: ✅ Reasonable API call patterns
  • Resource limits: ✅ MAX_TURNS prevents runaway execution

🏆 Best Practices Followed

  • ✅ Comprehensive telemetry and monitoring
  • ✅ Clear error messages and logging
  • ✅ Modular, testable design
  • ✅ Proper dependency management
  • ✅ Excellent documentation and examples

Recommendation: APPROVE with suggestions

This is a solid foundational example that demonstrates best practices for Claude tool-use agents. The security issues are minor and easily addressable. The comprehensive BMasterAI integration makes this an excellent reference implementation.


Lists the new example at the top of the 2026 section as the
foundational tool-use reference. Also removes stale `NEW` badge
from deep-research-agent since it's no longer the newest entry.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

tools.py:
- Add -- end-of-options sentinel to xdotool type and key calls to
  prevent text/key names from being interpreted as CLI flags
- Add timeout=10s to all subprocess calls to prevent hangs if
  the display is unresponsive
- Fix Optional[int] / Optional[str] type hints on _run_computer_use

agent.py:
- Include error_type, message_count, and traceback (limit 5 frames)
  in the exception handler metadata for better observability

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
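The enriched handler metadata this commit describes might be built like so (helper and field names are illustrative, not the commit's actual code):

```python
import traceback

def error_metadata(e: Exception, turn: int, operation: str,
                   message_count: int) -> dict:
    """Build richer telemetry metadata for the exception handler."""
    return {
        "error_type": type(e).__name__,
        "error": str(e),
        "turn": turn,
        "operation": operation,
        "message_count": message_count,
        # Keep only the last 5 stack frames so log entries stay bounded.
        "traceback": traceback.format_exception(
            type(e), e, e.__traceback__)[-5:],
    }
```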
@claude

claude bot commented Mar 15, 2026

Claude finished @travis-burmaster's task.


Claude Code Review Complete

Todo List

  • Read repository guidelines (CLAUDE.md) - No CLAUDE.md found
  • Review all 8 new files in the PR
  • Analyze code quality and best practices
  • Check for potential bugs and security issues
  • Evaluate performance considerations
  • Assess test coverage
  • Provide comprehensive feedback

Overall Assessment

This is a well-structured and thoroughly documented example implementation that demonstrates the core tool-use pattern for Claude agents. The code quality is high with comprehensive BMasterAI telemetry integration.

✅ Strengths

Code Quality & Architecture

  • Clean separation of concerns: Tools, agent logic, and CLI are properly separated
  • Comprehensive documentation: Excellent README with clear examples and setup instructions
  • Robust error handling: Graceful degradation when dependencies are missing
  • Defensive programming: Proper validation of required parameters before tool execution
  • Consistent code style: Well-formatted with clear docstrings

Security & Best Practices

  • Environment variable handling: Proper use of environment variables for API keys
  • No hardcoded secrets: Template provided in .env.example
  • Input validation: Parameters validated before tool execution
  • Subprocess security: Uses check=True and proper argument handling

Performance Considerations

  • Efficient token usage: Truncates long content (snippets to 500 chars, logs to 200 chars)
  • Resource management: Proper cleanup in finally blocks
  • Caching: Base64 images stripped from logs to reduce memory usage
  • Timeout handling: MAX_TURNS prevents infinite loops

⚠️ Areas for Improvement

Security Issues

  1. Command Injection Risk (tools.py:177, 188)

    # Current - vulnerable to injection
    subprocess.run(["xdotool", "type", "--clearmodifiers", "--", text], check=True)

    Status: ✅ Already properly protected with "--" separator

  2. Subprocess Timeout Protection (tools.py:153, 169, 179, 189, 201)

    # Good: timeout parameter already added
    timeout=_SUBPROCESS_TIMEOUT,

    Status: ✅ Already implemented properly

Error Handling & Reliability

  1. Model Reference (agent.py:37, main.py:70)

    DEFAULT_MODEL = "claude-opus-4-6"  # May not exist

    Recommendation: Consider using a stable model reference like claude-3-5-sonnet-20241022 or make model configurable via environment variable.

  2. Exception Context (agent.py:178)

    self.monitor.track_error(AGENT_ID, type(e).__name__)

    Recommendation: Could include more context like turn number and operation type.

Code Quality Enhancements

  1. Type Hints (tools.py:141)

    def _run_computer_use(action: str, x: Optional[int] = None, ...)

    Status: ✅ Type hints are properly used throughout

  2. Constants Configuration (agent.py:37-39)

    # Consider making these configurable
    DEFAULT_MODEL = "claude-opus-4-6"
    MAX_TURNS = 20
    MAX_TOKENS = 4096

    Recommendation: Make these configurable via environment variables or constructor parameters.

Test Coverage

  1. Missing Tests: No test files included
    • Recommendation: Add unit tests for:
      • Tool dispatch functions
      • Error handling scenarios
      • Environment variable validation
      • BMasterAI telemetry integration

🔧 Specific Recommendations

Short Term (Reliability)

  1. Use a stable model reference or make model configurable
  2. Add configuration validation on startup
  3. Consider making MAX_TURNS and MAX_TOKENS configurable

Long Term (Maintainability)

  1. Add comprehensive test suite
  2. Consider adding retry logic for failed tool operations
  3. Add integration tests with mock services

📊 Performance Analysis

  • Token efficiency: ✅ Good truncation strategies (tools.py:128, agent.py:266)
  • Memory usage: ✅ Base64 data properly stripped from logs (agent.py:338)
  • Network efficiency: ✅ Reasonable API call patterns
  • Resource limits: ✅ MAX_TURNS prevents runaway execution
  • Subprocess management: ✅ Proper timeouts and error handling

🛡️ Security Analysis

  • API key management: ✅ Environment variables, no hardcoded secrets
  • Subprocess execution: ✅ Proper use of check=True and -- separators
  • Input validation: ✅ Required parameters validated before execution
  • Path traversal: ✅ No user-controllable file paths
  • Command injection: ✅ Properly protected with argument arrays and separators

🏆 Best Practices Followed

  • ✅ Comprehensive telemetry and monitoring
  • ✅ Clear error messages and logging
  • ✅ Modular, testable design
  • ✅ Proper dependency management
  • ✅ Excellent documentation and examples
  • ✅ Graceful degradation for missing dependencies
  • ✅ Resource cleanup and timeout handling

Recommendation: APPROVE

This is an excellent foundational example that demonstrates best practices for Claude tool-use agents. The implementation is secure, well-documented, and follows good software engineering practices. The comprehensive BMasterAI integration makes this a valuable reference implementation.

Minor improvements around model configuration and test coverage would enhance the codebase, but the current implementation is production-ready and serves as an excellent learning resource.


@travis-burmaster travis-burmaster merged commit 1253c4a into main Mar 15, 2026
1 check passed