Contributor

@rene-oromtz rene-oromtz commented Nov 27, 2025

Description

This PR makes the migration from the old logging mechanism to the new logging introduced by #666 smoother.

Note: This branch is waiting for #666 to be merged on Monday, Dec 1. After that merge, I will rebase so that only the relevant additions are displayed.

Server Changes:

  1. Restore the old (now legacy) /output and /serial_output endpoints on the server. This is required so that both the CLI and agents can temporarily write to and read from these endpoints.
  2. Add a polymorphic schema to handle both types of results
  3. Add a TTL so that logs are removed from the database after 7 days
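
The 7-day retention rule in item 3 can be sketched as follows. This is only an illustration of the rule itself: the helper name `is_expired` and the `created_at` timestamp are hypothetical, and the description does not say how the server implements expiry (a database-level TTL index would be one option, but that is an assumption).

```python
from datetime import datetime, timedelta, timezone

# Retention window taken from the PR description: logs are removed after 7 days.
LOG_TTL = timedelta(days=7)


def is_expired(created_at: datetime, now: datetime, ttl: timedelta = LOG_TTL) -> bool:
    """Return True once a log entry is at least `ttl` old (hypothetical helper)."""
    return now - created_at >= ttl


# Hypothetical creation time for a stored log entry.
created = datetime(2025, 11, 26, 22, 5, 33, tzinfo=timezone.utc)

print(is_expired(created, created + timedelta(days=6, hours=23)))  # still retained
print(is_expired(created, created + timedelta(days=7)))            # eligible for removal
```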

Agent Changes

  1. Allow agents to write logs to legacy and new endpoints
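
A minimal sketch of what "write to both endpoints" could look like on the agent side. The writer callables, `make_dual_writer`, and the error policy (fail only when both writes fail) are assumptions made for illustration and are not taken from the agent code.

```python
from typing import Callable, Dict, List, Tuple

# A writer takes (job_id, phase, text) and sends it somewhere. Both writers are
# injected so the sketch does not depend on any real HTTP client.
Writer = Callable[[str, str, str], None]


def make_dual_writer(legacy: Writer, new: Writer) -> Writer:
    """Return a writer that forwards every log chunk to both endpoints."""

    def write(job_id: str, phase: str, text: str) -> None:
        errors: List[Exception] = []
        for target in (legacy, new):
            try:
                target(job_id, phase, text)
            except Exception as exc:  # keep going so one failure does not block the other
                errors.append(exc)
        if len(errors) == 2:
            # Assumed policy: only give up when neither endpoint accepted the chunk.
            raise errors[0]

    return write


# In-memory stand-ins for the two endpoints (hypothetical storage shapes).
legacy_store: Dict[str, str] = {}
new_store: Dict[Tuple[str, str], List[str]] = {}


def legacy_writer(job_id: str, phase: str, text: str) -> None:
    legacy_store[job_id] = legacy_store.get(job_id, "") + text


def new_writer(job_id: str, phase: str, text: str) -> None:
    new_store.setdefault((job_id, phase), []).append(text)


write = make_dual_writer(legacy_writer, new_writer)
write("job-1", "reserve", "line 1\n")
print(legacy_store)
print(new_store)
```

Injecting the writers keeps the legacy path easy to drop in the final upgrade step: the agent would simply be given the new writer alone.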

CLI

  1. Minor change to polling: print a "waiting for output" message after 90 seconds without receiving any output
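
The polling behaviour can be sketched like this. Only the 90-second threshold comes from the description; the function names, the 10-second poll interval, the notice text, and the convention that `fetch` returns `None` when the job is finished are assumptions for the sake of a runnable example.

```python
import time
from typing import Callable, List, Optional

WAIT_NOTICE_AFTER = 90.0  # seconds without output, from the PR description


def poll(
    fetch: Callable[[], Optional[str]],
    emit: Callable[[str], None],
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
    interval: float = 10.0,  # assumed poll interval
) -> None:
    """Emit new output as it arrives and a notice after a long silence.

    `fetch` returns new text, "" when nothing new arrived, or None when the
    job is finished (a hypothetical convention for this sketch).
    """
    last_output = clock()
    while True:
        chunk = fetch()
        if chunk is None:
            return
        if chunk:
            emit(chunk)
            last_output = clock()
        elif clock() - last_output >= WAIT_NOTICE_AFTER:
            emit("Waiting for output...")
            last_output = clock()  # restart the timer so the notice is not repeated on every poll
        sleep(interval)


class FakeClock:
    """Deterministic clock so the example runs instantly."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now

    def sleep(self, seconds: float) -> None:
        self.now += seconds


# One chunk, then ten empty polls (100 s of silence), then a final chunk.
responses = iter(["hello\n"] + [""] * 10 + ["done\n", None])
printed: List[str] = []
fake = FakeClock()
poll(lambda: next(responses), printed.append, clock=fake, sleep=fake.sleep)
print(printed)
```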

During the transition, the upgrade flow would be:

  1. Update the server to use these new changes. Agents and CLI will use legacy endpoints only
  2. Update the agent code. Agents will now write to both endpoints
  3. Update the CLI. The new CLI will only read from the new endpoints.
  4. Remove legacy endpoints from server/agent
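
To see why the order of these steps matters, the flow can be modelled as a small compatibility check: at every stage the CLI should only read from endpoints that agents write to, and agents should only write to endpoints the server still serves. The `Stage` class and the labels below are hypothetical; the stage contents follow the four steps above.

```python
from dataclasses import dataclass
from typing import FrozenSet, List

LEGACY, NEW = "legacy", "new"


@dataclass(frozen=True)
class Stage:
    name: str
    served: FrozenSet[str]   # endpoints the server exposes
    written: FrozenSet[str]  # endpoints agents write to
    read: FrozenSet[str]     # endpoints the CLI reads from


STAGES: List[Stage] = [
    Stage("1. server updated", frozenset({LEGACY, NEW}), frozenset({LEGACY}), frozenset({LEGACY})),
    Stage("2. agents updated", frozenset({LEGACY, NEW}), frozenset({LEGACY, NEW}), frozenset({LEGACY})),
    Stage("3. CLI updated", frozenset({LEGACY, NEW}), frozenset({LEGACY, NEW}), frozenset({NEW})),
    Stage("4. legacy removed", frozenset({NEW}), frozenset({NEW}), frozenset({NEW})),
]


def is_compatible(stage: Stage) -> bool:
    """CLI reads only what agents write; agents write only what the server serves."""
    return stage.read <= stage.written <= stage.served


for stage in STAGES:
    print(stage.name, is_compatible(stage))

# Counter-example: updating the CLI before the agents would leave it reading
# from an endpoint that no agent writes to yet.
premature_cli = Stage("CLI before agents", frozenset({LEGACY, NEW}), frozenset({LEGACY}), frozenset({NEW}))
print(premature_cli.name, is_compatible(premature_cli))
```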

Resolved issues

CERTTF-757

Documentation

Web service API changes

Tests

Tested in Staging

The following scenarios were validated successfully in staging:


  1. Server charm updated and running this branch's changes. Agent running code from main and CLI using the stable channel
    Live Polling was available through /output endpoint
    Agents were able to send results back to server
[25-11-26 22:05:33]    INFO: (agent.py:304)| Starting job e21382bb-5d36-482e-acee-fbe9135f534d
....
[25-11-26 22:21:46]    INFO: (client.py:275)| Submitting job outcome for job: e21382bb-5d36-482e-acee-fbe9135f534d
[25-11-26 22:21:47]    INFO: (__init__.py:147)| Sleeping for 10

Results retrievable by CLI through the /result endpoint

testflinger-cli --server=https://testflinger-staging.canonical.com results e21382bb-5d36-482e-acee-fbe9135f534d |jq .reserve_output
"************************************************\n* Starting testflinger reserve phase on audino *\n************************************************\n2025-11-26 22:19:57,703 audino INFO: DEVICE CONNECTOR: BEGIN reservation\n2025-11-26 22:19:58,107 audino INFO: DEVICE CONNECTOR: Successfully imported key: lp:rene-orozco\n/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: \"key.pub\"\n\nNumber of key(s) added: 1\n\nNow try logging into the machine, with:   \"ssh -o 'StrictHostKeyChecking=no' -o 'UserKnownHostsFile=/dev/null' 'ubuntu@10.241.2.13'\"\nand check to make sure that only the key(s) you wanted were added.\n\n*** TESTFLINGER SYSTEM RESERVED ***\nYou can now connect to ubuntu@10.241.2.13\nCurrent time:           [2025-11-26T22:19:58.723899+00:00]\nReservation expires at: [2025-11-26T22:20:58.723936+00:00]\nReservation will automatically timeout in 60 seconds\nTo end the reservation sooner use: testflinger-cli cancel e21382bb-5d36-482e-acee-fbe9135f534d\n"

The new endpoint is available but does not hold any data yet, because the agent is not yet aware of it

curl -X GET https://testflinger-staging.canonical.com/v1/result/e21382bb-5d36-482e-acee-fbe9135f534d/log/output?phase=reserve
{"output": {"reserve": {"last_fragment_number": -1, "log_data": ""}}}
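
For readers who want to script this check, here is a small sketch that builds the same URL and interprets the response. The helper names are hypothetical, and treating `last_fragment_number == -1` as "nothing written yet" is an inference from the response above, not documented behaviour.

```python
import json
from typing import Optional
from urllib.parse import quote, urlencode


def log_output_url(server: str, job_id: str, phase: str) -> str:
    """Build the per-phase log URL used in the curl call above."""
    query = urlencode({"phase": phase})
    return f"{server}/v1/result/{quote(job_id)}/log/output?{query}"


def phase_log(body: str, phase: str) -> Optional[str]:
    """Return the log text for a phase, or None when no fragment is stored.

    Assumption: a negative last_fragment_number means the agent has not
    written anything to the new endpoint for this phase.
    """
    entry = json.loads(body)["output"][phase]
    if entry["last_fragment_number"] < 0:
        return None
    return entry["log_data"]


url = log_output_url(
    "https://testflinger-staging.canonical.com",
    "e21382bb-5d36-482e-acee-fbe9135f534d",
    "reserve",
)
print(url)

# Response body copied from the curl output above.
empty_body = '{"output": {"reserve": {"last_fragment_number": -1, "log_data": ""}}}'
print(phase_log(empty_body, "reserve"))
```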

  2. Server and Agent running this branch's changes. CLI using the stable channel
    Live Polling was available through /output endpoint
    Agents were able to send results back to server
[25-11-26 22:27:08]    INFO: (__init__.py:145)| Checking jobs
[25-11-26 22:27:08]    INFO: (agent.py:304)| Starting job 41a5f3a0-700a-47e7-8426-389d94a7d67d
....
[25-11-26 22:44:24]    INFO: (client.py:260)| Submitting job outcome for job: 41a5f3a0-700a-47e7-8426-389d94a7d67d
[25-11-26 22:44:25]    INFO: (__init__.py:147)| Sleeping for 10

Results retrievable by CLI through the /result endpoint

testflinger-cli --server=https://testflinger-staging.canonical.com results 41a5f3a0-700a-47e7-8426-389d94a7d67d |jq .reserve_output
"************************************************\n* Starting testflinger reserve phase on audino *\n************************************************\n2025-11-26 22:42:32,848 audino INFO: DEVICE CONNECTOR: BEGIN reservation\n2025-11-26 22:42:33,379 audino INFO: DEVICE CONNECTOR: Successfully imported key: lp:rene-orozco\n/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: \"key.pub\"\n\nNumber of key(s) added: 1\n\nNow try logging into the machine, with:   \"ssh -o 'StrictHostKeyChecking=no' -o 'UserKnownHostsFile=/dev/null' 'ubuntu@10.241.2.13'\"\nand check to make sure that only the key(s) you wanted were added.\n\n*** TESTFLINGER SYSTEM RESERVED ***\nYou can now connect to ubuntu@10.241.2.13\nCurrent time:           [2025-11-26T22:42:33.995728+00:00]\nReservation expires at: [2025-11-26T22:43:33.995787+00:00]\nReservation will automatically timeout in 60 seconds\nTo end the reservation sooner use: testflinger-cli cancel 41a5f3a0-700a-47e7-8426-389d94a7d67d\n"

Additionally, the agent now writes to both the legacy and new endpoints, so output data is also available from the new endpoint (not retrievable by CLI):

curl -X GET https://testflinger-staging.canonical.com/v1/result/41a5f3a0-700a-47e7-8426-389d94a7d67d/log/output?phase=reserve |jq
{
  "output": {
    "reserve": {
      "last_fragment_number": 31,
      "log_data": "************************************************\n* Starting testflinger reserve phase on audino *\n************************************************\n2025-11-26 22:42:32,848 audino INFO: DEVICE CONNECTOR: BEGIN reservation\n2025-11-26 22:42:33,379 audino INFO: DEVICE CONNECTOR: Successfully imported key: lp:rene-orozco\n/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: \"key.pub\"\n\nNumber of key(s) added: 1\n\nNow try logging into the machine, with:   \"ssh -o 'StrictHostKeyChecking=no' -o 'UserKnownHostsFile=/dev/null' 'ubuntu@10.241.2.13'\"\nand check to make sure that only the key(s) you wanted were added.\n\n*** TESTFLINGER SYSTEM RESERVED ***\nYou can now connect to ubuntu@10.241.2.13\nCurrent time:           [2025-11-26T22:42:33.995728+00:00]\nReservation expires at: [2025-11-26T22:43:33.995787+00:00]\nReservation will automatically timeout in 60 seconds\nTo end the reservation sooner use: testflinger-cli cancel 41a5f3a0-700a-47e7-8426-389d94a7d67d\n"
    }
  }
}

  3. Server, agent and CLI all running this PR's code
    Live Polling was available through the /log/output endpoint
    CLI can now poll specific phases
uv run testflinger-cli --server=https://testflinger-staging.canonical.com poll 7c576240-2373-4431-809a-7cd95b8cb9aa --phase reserve
************************************************
* Starting testflinger reserve phase on audino *
************************************************
2025-11-26 23:48:34,397 audino INFO: DEVICE CONNECTOR: BEGIN reservation
2025-11-26 23:48:34,826 audino INFO: DEVICE CONNECTOR: Successfully imported key: lp:rene-orozco
....
Reservation will automatically timeout in 60 seconds
To end the reservation sooner use: testflinger-cli cancel 7c576240-2373-4431-809a-7cd95b8cb9aa
complete

codecov bot commented Nov 27, 2025

Codecov Report

❌ Patch coverage is 76.22951% with 29 lines in your changes missing coverage. Please review.
✅ Project coverage is 70.88%. Comparing base (35aa402) to head (3ec2c99).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #836      +/-   ##
==========================================
+ Coverage   70.82%   70.88%   +0.06%     
==========================================
  Files         108      108              
  Lines        9443     9549     +106     
  Branches      841      850       +9     
==========================================
+ Hits         6688     6769      +81     
- Misses       2585     2609      +24     
- Partials      170      171       +1     
Flag      Coverage Δ                        *Carryforward flag
agent     72.41% <81.81%> (-0.32%) ⬇️
cli       87.78% <100.00%> (+0.24%) ⬆️
device    54.22% <ø> (ø)                    Carriedforward from 05373ac
server    87.58% <70.65%> (-0.83%) ⬇️

*This pull request uses carry forward flags.

Components           Coverage Δ
Agent                72.41% <81.81%> (-0.32%) ⬇️
CLI                  87.78% <100.00%> (+0.24%) ⬆️
Common               ∅ <ø> (∅)
Device Connectors    54.22% <ø> (ø)
Server               87.58% <70.65%> (-0.83%) ⬇️

@rene-oromtz rene-oromtz changed the title from "Non breaking logging" to "feat: implement legacy endpoints for non breaking logging" Nov 27, 2025
@rene-oromtz rene-oromtz marked this pull request as ready for review December 1, 2025 19:25
Collaborator

@pedro-avalos pedro-avalos left a comment

Great work; this will definitely make the transition much smoother. I have left some suggestions and questions regarding some of the changes.

pedro-avalos previously approved these changes Dec 1, 2025
Collaborator

@pedro-avalos pedro-avalos left a comment

Thanks for the changes!
Great work!

@rene-oromtz rene-oromtz merged commit a53aa32 into main Dec 1, 2025
14 checks passed
@rene-oromtz rene-oromtz deleted the non-breaking-logging branch December 1, 2025 22:27