Binary file modified .coverage
Binary file not shown.
36 changes: 19 additions & 17 deletions .github/PULL_REQUEST_TEMPLATE.md
@@ -1,22 +1,24 @@
# Pull Request Template

## Description
Please include a summary of the change and which issue is fixed.
Provide a clear summary of the changes and the problem being solved.

Fixes # (issue)
## Related Issues
Fixes # (issue number)

## Type of change
- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
- [ ] This change requires a documentation update
## Type of Change
- [ ] Bug fix
- [ ] New feature
- [ ] Breaking change
- [ ] Documentation update

## How Has This Been Tested?
Please describe the tests that you ran to verify your changes.
- [ ] Added/Updated Unit Tests
- [ ] Tested locally with `uv run pytest`
## Checklist
- [ ] I have followed the `AGENTS.md` rules.
- [ ] My code follows the project style (black/isort).
- [ ] I have added tests that prove my fix is effective or that my feature works.
- [ ] New and existing unit tests pass locally.
- [ ] I have updated the documentation.
- [ ] Coverage is >= 90% for new code.

## Checklist:
- [ ] My code follows the style guidelines of this project (`uv run black`, `uv run isort`)
- [ ] I have performed a self-review of my own code
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] My changes generate no new warnings/linting errors (passed `uv run flake8`, `uv run mypy`)
- [ ] New and existing unit tests pass locally with my changes
## Screenshots (if applicable)
Add screenshots or recordings here.
35 changes: 14 additions & 21 deletions CHANGELOG.md
@@ -2,26 +2,19 @@

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]

## [1.1.0] - 2026-03-22
### Added
- **Web Dashboard**: Streamlit interface to monitor agents and workflows.
- **Monitoring**: Prometheus metrics wrapper.
- **Integrations**: Slack, Linear, and GitHub OAuth integrations.
- **Core Tests**: Comprehensive test suite covering orchestrator and workflows.
- **Real Training**: Unsloth fine-tuning via `SFTTrainer`.
- **Real Inference**: CPU-optimized `llama-cpp` BitNet inference.
- **CI/CD**: Expanded GitHub Actions using `uv`.
- **Simulation**: Support for Newton physics simulations.
- **Templates**: Add bug/feature reporting templates and PR guidelines.
- **Security**: Added `SECURITY.md`.

### Changed
- Migrated dependency management from `requirements.txt` to `pyproject.toml` (via `uv`).
- Standardized imports, removing `sys.path` injection hacks.
- GitHub Integration (OAuth, PRs, Commits).
- Slack Bot (Threads, Interactive, Slash).
- Linear Integration (Issue creation, Status updates).
- Error Recovery (Snapshots, Retries).
- Advanced Superpowers Skills (Research, Audit, Optimization).
- Advanced Context Management (Token tracking, Summarization).
- Daytona Sandbox Integration.
- Comprehensive Testing Suite (50+ tests).

### Fixed
- Logging configuration compatibility across different `python-json-logger` versions.
## [1.0.0] - 2024-01-01
### Initial Release
- Core orchestrator engine.
- Modal and Docker sandbox support.
- Basic brainstorming and planning skills.
29 changes: 19 additions & 10 deletions CONTRIBUTING.md
@@ -1,15 +1,24 @@
# Contributing to AI Dev OS

First off, thanks for taking the time to contribute! 🎉
Welcome! We are excited that you want to contribute to the AI Dev OS project.

## Development Process
## Workflow

1. **Fork** the repository and clone it to your local machine.
2. **Branch**: Create a feature branch `git checkout -b feature/your-feature-name`.
3. **Commit**: Make sure to test your code. See the testing section.
4. **Push**: Submit a Pull Request.
1. Fork the repository.
2. Create a feature branch: `feature/your-feature-name`.
3. Follow the rules in `AGENTS.md` (mandatory).
4. Implement your changes using TDD.
5. Ensure all tests pass: `uv run pytest`.
6. Submit a Pull Request.

## Rules
- You MUST follow the core rules outlined in `AGENTS.md`.
- Test-Driven Development (TDD) is required for any logic change.
- Ensure you run `black`, `isort`, and `mypy` before submitting your PR.
## Code Style
- Use `black` for formatting.
- Use `isort` for imports.
- Maintain 90%+ test coverage for new code.

## Submitting a PR
Your PR must include:
- A clear description of changes.
- Reference to any related issues.
- Updated documentation.
- All tests green.
25 changes: 9 additions & 16 deletions SECURITY.md
@@ -2,27 +2,20 @@

## Supported Versions

Only the latest `main` branch is actively supported with security updates.

| Version | Supported |
| ------- | ------------------ |
| v0.1.x | :white_check_mark: |
| legacy | :x: |
We are currently only supporting security fixes for the `main` branch.

## Reporting a Vulnerability

If you discover a security vulnerability within AI Dev OS, please **do not open a public issue**.

Instead, please send an e-mail to the maintainers privately or use GitHub's private vulnerability reporting feature (if enabled). We will work with you to assess and resolve the vulnerability as quickly as possible.
If you discover a security vulnerability within this project, please do NOT open a public issue. Instead, send an email to security@example.com (replace with real email).

### What to include
Include as much information as possible:
- A description of the vulnerability.
- Steps to reproduce.
- Potential impact.
- Any mitigation strategies you've identified.

### Scope
- Core orchestration engine
- Web Dashboard
- Docker/Modal Sandbox isolations
- Authentication flows
We will acknowledge your report within 48 hours and provide a timeline for a fix.

## Critical Protections
- DO NOT commit API keys to this repository. Use environment variables.
- All code must run in sandboxed environments (Modal, Daytona, Docker).
- Review all third-party code before integration.
41 changes: 41 additions & 0 deletions baseline_roadmap_tests.txt
@@ -0,0 +1,41 @@
============================= test session starts =============================
platform win32 -- Python 3.12.0, pytest-9.0.2, pluggy-1.6.0
rootdir: C:\Users\HASSA\Desktop\AI-DEV-OS
configfile: pyproject.toml
testpaths: tests
plugins: anyio-4.12.1, langsmith-0.7.22, asyncio-1.3.0, cov-7.0.0
asyncio: mode=Mode.AUTO, debug=False, asyncio_default_fixture_loop_scope=None, asyncio_default_test_loop_scope=function
collected 38 items

tests\test_core.py ... [ 7%]
tests\test_core_comprehensive.py ..................... [ 63%]
tests\test_integrations.py ... [ 71%]
tests\test_models.py .. [ 76%]
tests\test_sandbox.py .. [ 81%]
tests\test_skills.py ... [ 89%]
tests\test_utils.py .... [100%]

============================== warnings summary ===============================
tests/test_core.py::test_workflow_state_logging
tests/test_core_comprehensive.py::TestWorkflowState::test_state_initialization
tests/test_core_comprehensive.py::TestWorkflowState::test_add_log
tests/test_core_comprehensive.py::TestWorkflowState::test_state_transitions
tests/test_core_comprehensive.py::TestWorkflowState::test_context_usage
tests/test_core_comprehensive.py::TestClaudeHUDIntegration::test_hud_update_creates_file
tests/test_core_comprehensive.py::TestClaudeHUDIntegration::test_hud_update_empty_agents
C:\Users\HASSA\Desktop\AI-DEV-OS\src\ai_dev_os\core.py:94: DeprecationWarning: datetime.datetime.utcnow() is deprecated and scheduled for removal in a future version. Use timezone-aware objects to represent datetimes in UTC: datetime.datetime.now(datetime.UTC).
self.created_at = datetime.utcnow().isoformat()

tests/test_core.py::test_workflow_state_logging
tests/test_core_comprehensive.py::TestWorkflowState::test_add_log
tests/test_core_comprehensive.py::TestWorkflowState::test_add_log
C:\Users\HASSA\Desktop\AI-DEV-OS\src\ai_dev_os\core.py:98: DeprecationWarning: datetime.datetime.utcnow() is deprecated and scheduled for removal in a future version. Use timezone-aware objects to represent datetimes in UTC: datetime.datetime.now(datetime.UTC).
self.logs.append(f"[{datetime.utcnow().isoformat()}] {message}")

tests/test_core_comprehensive.py::TestClaudeHUDIntegration::test_hud_update_creates_file
tests/test_core_comprehensive.py::TestClaudeHUDIntegration::test_hud_update_empty_agents
C:\Users\HASSA\Desktop\AI-DEV-OS\src\ai_dev_os\core.py:181: DeprecationWarning: datetime.datetime.utcnow() is deprecated and scheduled for removal in a future version. Use timezone-aware objects to represent datetimes in UTC: datetime.datetime.now(datetime.UTC).
"timestamp": datetime.utcnow().isoformat(),

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
======================= 38 passed, 12 warnings in 5.81s =======================
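
Note on the warnings above: every one of the 12 warnings comes from `datetime.utcnow()` in `core.py`, and the warning text itself names the replacement, `datetime.datetime.now(datetime.UTC)`. A minimal sketch of that change follows. The helper name `utc_now_iso` is hypothetical and not part of this PR. `timezone.utc` is used here because it also works on Python versions older than 3.11, where the `datetime.UTC` alias is not available.

```python
from datetime import datetime, timezone


def utc_now_iso() -> str:
    """Return the current UTC time as a timezone-aware ISO-8601 string.

    Hypothetical helper: a possible drop-in for the
    `datetime.utcnow().isoformat()` calls flagged in the warnings.
    """
    return datetime.now(timezone.utc).isoformat()


if __name__ == "__main__":
    print(utc_now_iso())
```

One behavioral difference to keep in mind: the aware timestamp ends with a `+00:00` offset, whereas the naive `utcnow().isoformat()` string has no offset. Anything that parses or compares the stored strings may need to account for that.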
120 changes: 120 additions & 0 deletions coverage_report.txt
@@ -0,0 +1,120 @@
============================= test session starts =============================
platform win32 -- Python 3.12.0, pytest-9.0.2, pluggy-1.6.0
rootdir: C:\Users\HASSA\Desktop\AI-DEV-OS
configfile: pyproject.toml
testpaths: tests
plugins: anyio-4.12.1, langsmith-0.7.22, asyncio-1.3.0, cov-7.0.0
asyncio: mode=Mode.AUTO, debug=False, asyncio_default_fixture_loop_scope=None, asyncio_default_test_loop_scope=function
collected 51 items

tests\test_core.py ... [ 5%]
tests\test_core_comprehensive.py ..................... [ 47%]
tests\test_core_snapshot.py .F [ 50%]
tests\test_github_real.py .... [ 58%]
tests\test_integrations.py ... [ 64%]
tests\test_models.py .. [ 68%]
tests\test_sandbox.py .. [ 72%]
tests\test_skills.py ... [ 78%]
tests\test_slack_bot.py ... [ 84%]
tests\test_snapshot.py .... [ 92%]
tests\test_utils.py .... [100%]

================================== FAILURES ===================================
__________________________ test_retry_on_api_failure __________________________

orchestrator = <ai_dev_os.core.AIDevOSOrchestrator object at 0x0000019B0EC0AF00>

    @pytest.mark.asyncio
    async def test_retry_on_api_failure(orchestrator):
        mock_resp = MagicMock()
        mock_resp.content = [MagicMock(text="success result")]
        mock_resp.usage.output_tokens = 10

        orchestrator.mock_anth.messages.create.side_effect = [
            Exception("transient error"),
            mock_resp
        ]

        # Ensure a different request to avoid any potential (unpatched) cache hits
        with patch("builtins.input", return_value="no"):
            state = await orchestrator.run("unique request for retry")

        assert state.design_doc == "success result"
>       assert orchestrator.mock_anth.messages.create.call_count == 2
E       AssertionError: assert 0 == 2
E        +  where 0 = <MagicMock name='Anthropic().messages.create' id='1765479203376'>.call_count
E        +    where <MagicMock name='Anthropic().messages.create' id='1765479203376'> = <MagicMock name='Anthropic().messages' id='1765479183184'>.create
E        +      where <MagicMock name='Anthropic().messages' id='1765479183184'> = <MagicMock name='Anthropic()' id='1765479145520'>.messages
E        +        where <MagicMock name='Anthropic()' id='1765479145520'> = <ai_dev_os.core.AIDevOSOrchestrator object at 0x0000019B0EC0AF00>.mock_anth

tests\test_core_snapshot.py:58: AssertionError
---------------------------- Captured stdout call -----------------------------
\n[HUD] Phase: brainstorming | Context: 0.0% | Agents: none\n\n\U0001f4cb DESIGN DOCUMENT:\n\nsuccess result\n\n============================================================
============================== warnings summary ===============================
tests/test_core.py::test_workflow_state_logging
tests/test_core_comprehensive.py::TestWorkflowState::test_state_initialization
tests/test_core_comprehensive.py::TestWorkflowState::test_add_log
tests/test_core_comprehensive.py::TestWorkflowState::test_state_transitions
tests/test_core_comprehensive.py::TestWorkflowState::test_context_usage
tests/test_core_comprehensive.py::TestClaudeHUDIntegration::test_hud_update_creates_file
tests/test_core_comprehensive.py::TestClaudeHUDIntegration::test_hud_update_empty_agents
tests/test_core_snapshot.py::test_run_generates_snapshots
tests/test_core_snapshot.py::test_retry_on_api_failure
C:\Users\HASSA\Desktop\AI-DEV-OS\src\ai_dev_os\core.py:97: DeprecationWarning: datetime.datetime.utcnow() is deprecated and scheduled for removal in a future version. Use timezone-aware objects to represent datetimes in UTC: datetime.datetime.now(datetime.UTC).
self.created_at = datetime.utcnow().isoformat()

tests/test_core.py: 1 warning
tests/test_core_comprehensive.py: 2 warnings
tests/test_core_snapshot.py: 17 warnings
C:\Users\HASSA\Desktop\AI-DEV-OS\src\ai_dev_os\core.py:101: DeprecationWarning: datetime.datetime.utcnow() is deprecated and scheduled for removal in a future version. Use timezone-aware objects to represent datetimes in UTC: datetime.datetime.now(datetime.UTC).
self.logs.append(f"[{datetime.utcnow().isoformat()}] {message}")

tests/test_core_comprehensive.py::TestClaudeHUDIntegration::test_hud_update_creates_file
tests/test_core_comprehensive.py::TestClaudeHUDIntegration::test_hud_update_empty_agents
tests/test_core_snapshot.py::test_run_generates_snapshots
tests/test_core_snapshot.py::test_run_generates_snapshots
tests/test_core_snapshot.py::test_run_generates_snapshots
tests/test_core_snapshot.py::test_retry_on_api_failure
C:\Users\HASSA\Desktop\AI-DEV-OS\src\ai_dev_os\core.py:185: DeprecationWarning: datetime.datetime.utcnow() is deprecated and scheduled for removal in a future version. Use timezone-aware objects to represent datetimes in UTC: datetime.datetime.now(datetime.UTC).
"timestamp": datetime.utcnow().isoformat(),

tests/test_integrations.py::test_github_webhook
C:\Users\HASSA\Desktop\AI-DEV-OS\src\ai_dev_os\integrations\github.py:29: DeprecationWarning: Argument login_or_token is deprecated, please use auth=github.Auth.Token(...) instead
self.client = Github(token) if HAS_GITHUB else None

tests/test_snapshot.py::test_save_snapshot_creates_file
tests/test_snapshot.py::test_load_latest_snapshot
tests/test_snapshot.py::test_load_latest_snapshot
tests/test_snapshot.py::test_list_snapshots
tests/test_snapshot.py::test_list_snapshots
C:\Users\HASSA\Desktop\AI-DEV-OS\src\ai_dev_os\utils\snapshot.py:26: DeprecationWarning: datetime.datetime.utcnow() is deprecated and scheduled for removal in a future version. Use timezone-aware objects to represent datetimes in UTC: datetime.datetime.now(datetime.UTC).
timestamp = datetime.utcnow().strftime("%Y%m%d_%H%M%S")

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=============================== tests coverage ================================
_______________ coverage: platform win32, python 3.12.0-final-0 _______________

Name Stmts Miss Cover Missing
----------------------------------------------------------------------
src\ai_dev_os\__init__.py 0 0 100%
src\ai_dev_os\agents.py 0 0 100%
src\ai_dev_os\core.py 270 36 87% 116, 153-172, 211, 281-283, 342, 397-398, 517-518, 560-594, 598
src\ai_dev_os\hud.py 0 0 100%
src\ai_dev_os\integrations\__init__.py 0 0 100%
src\ai_dev_os\integrations\github.py 56 18 68% 15-17, 27, 34-35, 43-45, 57, 63-65, 70, 77-79, 90
src\ai_dev_os\integrations\linear.py 26 14 46% 17, 26-43, 49-55
src\ai_dev_os\integrations\slack.py 34 10 71% 16, 38-40, 46-50, 65
src\ai_dev_os\models.py 194 114 41% 72-73, 95-97, 106, 117-152, 185-187, 191-210, 214-232, 239-241, 245-266, 275-293, 299-311, 318-319, 323-337, 341-343, 347-351, 355-363, 371-380, 385-389
src\ai_dev_os\monitoring_metrics.py 43 43 0% 7-107
src\ai_dev_os\sandbox.py 220 141 36% 60, 68, 73, 78, 83, 96-123, 137-140, 144-150, 154-160, 164-170, 178-187, 191-196, 200-205, 209-214, 218-224, 232-260, 264-278, 282-299, 303-315, 319-327, 343-351, 356, 362-363
src\ai_dev_os\simulation.py 77 77 0% 7-147
src\ai_dev_os\skills.py 22 0 100%
src\ai_dev_os\utils\__init__.py 0 0 100%
src\ai_dev_os\utils\error_handling.py 21 0 100%
src\ai_dev_os\utils\monitoring.py 23 9 61% 19-22, 28-36
src\ai_dev_os\utils\snapshot.py 28 0 100%
----------------------------------------------------------------------
TOTAL 1014 462 54%
=========================== short test summary info ===========================
FAILED tests/test_core_snapshot.py::test_retry_on_api_failure - AssertionErro...
================== 1 failed, 50 passed, 41 warnings in 8.71s ==================
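
Note on the failure above: `state.design_doc` equals `"success result"`, yet `messages.create.call_count` is 0, so the result did not come from a call recorded on `orchestrator.mock_anth`. The report does not show why. One possibility, hinted at by the test's own comment about cache hits, is a cached response. The sketch below is not the orchestrator's code. It only illustrates, with a hypothetical `call_with_retry` helper and a hypothetical model name, what the test expects: when the retry path actually reaches the mock, a `side_effect` list of one exception followed by one response yields `call_count == 2`.

```python
from unittest.mock import MagicMock


def call_with_retry(fn, attempts=2):
    """Call fn() up to `attempts` times and return the first success.

    Hypothetical helper for illustration; the real retry logic in
    ai_dev_os is not shown in this diff.
    """
    last_exc = None
    for _ in range(attempts):
        try:
            return fn()
        except Exception as exc:  # broad on purpose: mirrors the test's Exception
            last_exc = exc
    raise last_exc


# Same mock shape as in the failing test.
client = MagicMock()
mock_resp = MagicMock()
mock_resp.content = [MagicMock(text="success result")]
client.messages.create.side_effect = [Exception("transient error"), mock_resp]

# "hypothetical-model" is a placeholder argument, not a real model name.
result = call_with_retry(lambda: client.messages.create(model="hypothetical-model"))

assert result.content[0].text == "success result"
assert client.messages.create.call_count == 2
```

If the equivalent assertion fails with a count of 0 in the real test, a reasonable next step is to check whether the object the orchestrator calls is the same mock the test configures, and whether any response cache is bypassed or cleared in the fixture.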