Fix Win32Exception when stopping Elastic Agent service in transient state during test cleanup #348

Draft
Copilot wants to merge 2 commits into main from copilot/fix-test-failures

Copilot AI commented Mar 12, 2026

`Clean-ElasticAgentService` called `Stop-Service` unconditionally. When the service is in `StartPending` or `StopPending`, that call throws `Win32Exception: The service cannot accept control messages at this time`. The exception propagated through `Clean-ElasticAgent`, so `Clean-ElasticAgentProcess` and `Clean-ElasticAgentDirectory` never ran; the binary stayed on disk and the `Check-AgentRemnants` assertion failed.
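
The failure mode described above can be reproduced in miniature. The sketch below is illustrative only: `Invoke-CleanupSequence` and the three placeholder steps are hypothetical names, not code from `helpers.ps1`. It shows how one terminating error in the first step skips the remaining steps unless each step is guarded.

```powershell
# Hypothetical helper (not part of helpers.ps1): runs cleanup steps in order
# and returns how many of them completed.
function Invoke-CleanupSequence {
    param(
        [Parameter(Mandatory)] [scriptblock[]] $Steps,
        [switch] $ContinueOnError
    )

    $completed = 0
    foreach ($step in $Steps) {
        if ($ContinueOnError) {
            try {
                & $step | Out-Null
                $completed++
            } catch {
                # Log and move on so the later steps still run.
                Write-Warning "Cleanup step failed: $_"
            }
        } else {
            # A terminating error here propagates and skips the remaining steps.
            & $step | Out-Null
            $completed++
        }
    }
    return $completed
}

# Placeholder steps standing in for the service, process, and directory cleanup.
$steps = @(
    { throw 'The service cannot accept control messages at this time.' },
    { 'process cleaned' },
    { 'directory cleaned' }
)

# Guarded run: the first step fails, the other two still complete.
$guarded = Invoke-CleanupSequence -Steps $steps -ContinueOnError
```

Without `-ContinueOnError`, the same call throws on the first step and the process and directory placeholders never run, which mirrors the skipped cleanup in the CI log.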

Changes

  • `src/agent-qa/helpers.ps1`: `Clean-ElasticAgentService`
    • Wait up to 30 seconds for the service to reach a stable state (`Running` or `Stopped`) before attempting to stop it
    • Emit a `Write-Warning` if the timeout expires with the service still in a transient state
    • Wrap `Stop-Service -Force` in `try`/`catch` so stop failures are logged as warnings instead of propagating exceptions that abort the cleanup sequence
    • Skip `Stop-Service` entirely if the service is already `Stopped`

```powershell
# Before: throws Win32Exception if the service is StartPending/StopPending
Stop-Service 'Elastic Agent'

# After: waits for a stable state, then stops the service.
# $service and $timeout are not defined in this excerpt.
while ($service.Status -notin @('Running', 'Stopped') -and (Get-Date) -lt $timeout) {
    Start-Sleep -Seconds 1
    $service.Refresh()   # re-read the status from the service controller
}
if ($service.Status -ne 'Stopped') {
    try {
        Stop-Service 'Elastic Agent' -Force
    } catch {
        # Log the failure and let the rest of the cleanup continue
        Write-Warning "Failed to stop 'Elastic Agent' service: $_"
    }
}
```

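
The excerpt above does not show how `$service` and `$timeout` are initialised. As a self-contained sketch of the same wait-for-stable-state idea, assuming the 30-second limit from the change list, the polling can be factored into a helper. `Wait-StableServiceState` and its parameters are hypothetical names, not taken from the PR; the status provider is passed in as a script block so the loop can be exercised without a real Windows service.

```powershell
# Hypothetical helper (not from the PR): polls a status provider until it
# reports a stable state or the deadline passes, then returns the last status.
function Wait-StableServiceState {
    param(
        [Parameter(Mandatory)] [scriptblock] $GetStatus,   # returns the current status
        [string[]] $StableStates = @('Running', 'Stopped'),
        [int] $TimeoutSeconds = 30,
        [int] $PollMilliseconds = 1000
    )

    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    $status = [string](& $GetStatus)
    while ($status -notin $StableStates -and (Get-Date) -lt $deadline) {
        Start-Sleep -Milliseconds $PollMilliseconds
        $status = [string](& $GetStatus)
    }
    if ($status -notin $StableStates) {
        Write-Warning "Timed out after $TimeoutSeconds seconds; state is still '$status'."
    }
    return $status
}

# Simulated service that reports StopPending twice and then Stopped.
$script:polls = 0
$fakeStatus = {
    $script:polls++
    if ($script:polls -lt 3) { 'StopPending' } else { 'Stopped' }
}
$final = Wait-StableServiceState -GetStatus $fakeStatus -PollMilliseconds 10
```

On a Windows host the provider could be `{ (Get-Service 'Elastic Agent').Status }`, with `Stop-Service` called only when the returned status is not `Stopped`. One caveat worth checking in the real function: `try`/`catch` only sees terminating errors, so the `catch` block will handle a failed `Stop-Service` only if the call uses `-ErrorAction Stop` or the script sets `$ErrorActionPreference = 'Stop'`.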
Original prompt

This section details the original issue you should resolve.

<issue_title>There are test failures</issue_title>
<issue_description>See https://buildkite.com/elastic/elastic-stack-installers/builds/13650 (changes are not related to the project itself but adding some file to enable some new CI capabilities)

| Timestamp | Message |
| -- | -- |
| 2025-10-22 17:54:48 UTC | [-] Can be installed in Fleet mode and uninstalled via GUID in default mode 27ms (26ms\|1ms) |
| 2025-10-22 17:54:48 UTC | [0] Win32Exception: The service cannot accept control messages at this time. |
| 2025-10-22 17:54:48 UTC | InvalidOperationException: Cannot stop 'Elastic Agent' service on computer '.'. |
| 2025-10-22 17:54:48 UTC | ServiceCommandException: Service 'Elastic Agent (Elastic Agent)' cannot be stopped due to the following error: Cannot stop 'Elastic Agent' service on computer '.'. |
| 2025-10-22 17:54:48 UTC | at Clean-ElasticAgentService, C:\buildkite-agent\builds\bk-agent-prod-gcp-1761153718403188678\elastic\elastic-stack-installers\src\agent-qa\helpers.ps1:338 |
| 2025-10-22 17:54:48 UTC | at Clean-ElasticAgent, C:\buildkite-agent\builds\bk-agent-prod-gcp-1761153718403188678\elastic\elastic-stack-installers\src\agent-qa\helpers.ps1:320 |
| 2025-10-22 17:54:48 UTC | at <ScriptBlock>, C:\buildkite-agent\builds\bk-agent-prod-gcp-1761153718403188678\elastic\elastic-stack-installers\src\agent-qa\msi.tests.ps1:99 |
| 2025-10-22 17:54:48 UTC | [1] Expected $false, because The agent should have been cleaned up already, but got $true. |
| 2025-10-22 17:54:48 UTC | at Is-AgentBinaryPresent \| Should -BeFalse -Because "The agent should have been cleaned up already", C:\buildkite-agent\builds\bk-agent-prod-gcp-1761153718403188678\elastic\elastic-stack-installers\src\agent-qa\msi.tests.ps1:72 |
| 2025-10-22 17:54:48 UTC | at Check-AgentRemnants, C:\buildkite-agent\builds\bk-agent-prod-gcp-1761153718403188678\elastic\elastic-stack-installers\src\agent-qa\msi.tests.ps1:72 |
| 2025-10-22 17:54:48 UTC | at <ScriptBlock>, C:\buildkite-agent\builds\bk-agent-prod-gcp-1761153718403188678\elastic\elastic-stack-installers\src\agent-qa\msi.tests.ps1:109 |
| 2025-10-22 17:54:48 UTC | Tests completed in 855.61s |
| 2025-10-22 17:54:48 UTC | Tests Passed: 4, Failed: 14, Skipped: 0, Inconclusive: 0, NotRun: 0 |
| 2025-10-22 17:54:48 UTC | Exception: Pester run failed, because 14 tests failed |
| 2025-10-22 17:54:49 UTC | Child pwsh process returned 1, a non zero exit code |
| 2025-10-22 17:54:49 UTC | Testing Failed |
| 2025-10-22 17:54:49 UTC | C:\buildkite-agent\builds\bk-agent-prod-gcp-1761153718403188678\elastic\elastic-stack-installers\.buildkite\scripts\bui |
| 2025-10-22 17:54:49 UTC | ld.ps1 : Tests failed. |
| 2025-10-22 17:54:49 UTC | At line:1 char:1 |
| 2025-10-22 17:54:49 UTC | + .\.buildkite\scripts\build.ps1 |
| 2025-10-22 17:54:49 UTC | + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 2025-10-22 17:54:49 UTC | + CategoryInfo          : NotSpecified: (:) [Write-Error], WriteErrorException |
| 2025-10-22 17:54:49 UTC | + FullyQualifiedErrorId : Microsoft.PowerShell.Commands.WriteErrorException,build.ps1 |
| 2025-10-22 17:54:49 UTC |  |
| 2025-10-22 17:54:49 UTC | Error: The command exited with status 1 |
| 2025-10-22 17:54:49 UTC | user command error: exit status 1 |


Reproduction

Create a feature branch and its PR.

Then wait for the CI to run.
</issue_description>

<agent_instructions>Review the GH issue description and comments, which contain all the information about the errors happening in the CI.

Try to reproduce the error locally and find what's causing the problem.</agent_instructions>

Comments on the Issue (you are @copilot in this section)

@v1v
```
Creating "C:\buildkite-agent\builds\bk-agent-prod-gcp-1773340373614086112\elastic\elastic-stack-installers"
cd C:\buildkite-agent\builds\bk-agent-prod-gcp-1773340373614086112\elastic\elastic-stack-installers
github.com:22 SSH-2.0-a73f77f
choose_kex: unsupported KEX method sntrup761x25519-sha512@openssh.com
github.com:22 SSH-2.0-a73f77f
choose_kex: unsupported KEX method sntrup761x25519-sha512@openssh.com
github.com:22 SSH-2.0-a73f77f
choose_kex: unsupported KEX method sntrup761x25519-sha512@openssh.com
github.com:22 SSH-2.0-a73f77f
choose_kex: unsupported KEX method sntrup761x25519-sha512@openssh.com
github.com:22 SSH-2.0-a73f77f
choose_kex: unsupported KEX method sntrup761x25519-sha512@openssh.com
Warning: ssh-keyscan "github.com" failed (Attempt 1/3 Retrying in 2s)
github.com:22 SSH-2.0-a73f77f
choose_kex: unsupported KEX method sntrup761x25519-sha512@openssh.com
github.com:22 SSH-2.0-a73f77f
choose_kex: unsupported KEX method sntrup761x25519-sha512@openssh.com
github.com:22 SSH-2.0-a73f77f
choose_kex: unsupported KEX method sntrup761x25519-sha512@openssh.com
github.com:22 SSH-2.0-a73f77f
choose_kex: unsupported KEX method sntrup761x25519-sha512@openssh.com
github.com:22 SSH-2.0-a73f77f
c...
```


…ansient state

Co-authored-by: v1v <2871786+v1v@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix test failures related to CI changes" to "Fix Win32Exception when stopping Elastic Agent service in transient state during test cleanup" Mar 12, 2026
Copilot AI requested a review from v1v March 12, 2026 20:19
Copilot finished work on behalf of v1v March 12, 2026 20:19
v1v marked this pull request as ready for review March 12, 2026 20:33
v1v requested a review from a team as a code owner March 12, 2026 20:33
v1v marked this pull request as draft March 12, 2026 20:33

v1v (Member) commented Mar 12, 2026

buildkite test

Development

Successfully merging this pull request may close these issues.

There are test failures
