-
Notifications
You must be signed in to change notification settings - Fork 50
💥Enable Worker heartbeating, update Core to 44a6576 #570
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
23d3dc6 to
fbc46ea
Compare
|
|
||
| <PropertyGroup Condition="$([MSBuild]::IsOSPlatform('Windows'))"> | ||
| <BridgeLibraryFile>temporal_sdk_core_c_bridge.dll</BridgeLibraryFile> | ||
| <BridgeLibraryFile>temporalio_sdk_core_c_bridge.dll</BridgeLibraryFile> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is considered a backwards incompatible change. We have to mark the PR with a 💥 in the title of the PR at least so we know to update release notes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, updated PR title and desc
|
|
||
| <PropertyGroup Condition="$([MSBuild]::IsOSPlatform('Windows'))"> | ||
| <BridgeLibraryFile>temporal_sdk_core_c_bridge.dll</BridgeLibraryFile> | ||
| <BridgeLibraryFile>temporalio_sdk_core_c_bridge.dll</BridgeLibraryFile> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change was not made to the PackBridgeRuntimes from below. Should probably also temporarily enable the package.yml GH workflow in this branch to confirm it still works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
package.yml now passes after a fixup, https://github.com/temporalio/sdk-dotnet/actions/runs/19974335695
| options.NexusTaskPollerBehavior ??= new PollerBehavior.SimpleMaximum(options.MaxConcurrentNexusTaskPolls); | ||
| var workerPluginNames = options.Plugins? | ||
| .Select(p => p.Name) | ||
| .Where(name => !string.IsNullOrEmpty(name)) ?? Enumerable.Empty<string>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pedantic, but I don't think we need the IsNullOrEmpty check, it is a required property. If we're concerned about empty plugin names, we should validate that somewhere else, but I don't think we're really that concerned about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
| /// </summary> | ||
| /// <param name="telemetry"><see cref="Telemetry" />.</param> | ||
| /// <param name="workerHeartbeatInterval">Worker heartbeat interval. If 0 is specified, heartbeat is disabled.</param> | ||
| public TemporalRuntimeOptions(TelemetryOptions telemetry, TimeSpan workerHeartbeatInterval) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you need a constructor for this, workerHeartbeatInterval would be rarely set, they can use the parameterless or post-construction set approaches
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good, removed
| /// <summary> | ||
| /// Gets or sets the worker heartbeat duration in milliseconds. | ||
| /// </summary> | ||
| public TimeSpan WorkerHeartbeatInterval { get; set; } = TimeSpan.FromSeconds(60); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be nullable IMO (i.e. ?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good, added in
| return new Interop.TemporalCoreRuntimeOptions() | ||
| { | ||
| telemetry = scope.Pointer(options.Telemetry.ToInteropOptions(scope)), | ||
| worker_heartbeat_interval_millis = (ulong)options.WorkerHeartbeatInterval.TotalMilliseconds, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should support unset (if 0 means unset in bridge, then can convert here from null to unset)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
converted null and unset to 0, lmk if I misunderstood the comment
|
It looks like something about the Core upgrade broke the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note, you'll have to update the TelemetryFilterOptions class to account for the crate name changes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ty, I didn't do a thorough search through the repo to update crate names. Kinda thought AI woulda taken care of that, my bad
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All good, I missed this in Ruby which was pointed out to me at temporalio/sdk-ruby#368 (comment) which reminded me to point it out here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Client plugin names not passed to Core
The TemporalCoreClientOptions struct now has a plugin_names field added by this PR, but the ToInteropOptions method for TemporalConnectionOptions doesn't set this field. The PR description mentions "plumb plugin names (both client and worker) to core", and while worker plugin names are properly passed (via the worker options), client plugin names are not being sent to Core during connection. The field will be default-initialized to an empty array, which may cause the worker heartbeat feature to not properly report client-side plugins.
src/Temporalio/Bridge/OptionsExtensions.cs#L246-L269
sdk-dotnet/src/Temporalio/Bridge/OptionsExtensions.cs
Lines 246 to 269 in 0d016b3
| var scheme = options.Tls == null ? "http" : "https"; | |
| return new Interop.TemporalCoreClientOptions() | |
| { | |
| target_url = scope.ByteArray($"{scheme}://{options.TargetHost}"), | |
| client_name = ClientName.Ref, | |
| client_version = ClientVersion.Ref, | |
| metadata = scope.Metadata(options.RpcMetadata), | |
| api_key = scope.ByteArray(options.ApiKey), | |
| identity = scope.ByteArray(options.Identity), | |
| tls_options = | |
| options.Tls == null ? null : scope.Pointer(options.Tls.ToInteropOptions(scope)), | |
| retry_options = | |
| options.RpcRetry == null | |
| ? null | |
| : scope.Pointer(options.RpcRetry.ToInteropOptions()), | |
| keep_alive_options = | |
| options.KeepAlive == null | |
| ? null | |
| : scope.Pointer(options.KeepAlive.ToInteropOptions()), | |
| http_connect_proxy_options = | |
| options.HttpConnectProxy == null | |
| ? null | |
| : scope.Pointer(options.HttpConnectProxy.ToInteropOptions(scope)), | |
| }; |
src/Temporalio/Bridge/Interop/Interop.cs#L217-L219
sdk-dotnet/src/Temporalio/Bridge/Interop/Interop.cs
Lines 217 to 219 in 0d016b3
| [NativeTypeName("struct TemporalCoreByteArrayRefArray")] | |
| public TemporalCoreByteArrayRefArray plugin_names; |
cretz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, only blocking concern is to remove the temporary enablement of the package.yml GH workflow
| } | ||
| var pluginNames = options.Plugins? | ||
| .Select(p => p.Name) | ||
| .Where(name => !string.IsNullOrEmpty(name)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As cursor bot mentions above, probably don't need this line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All good, I missed this in Ruby which was pointed out to me at temporalio/sdk-ruby#368 (comment) which reminded me to point it out here
| // We'll also test the metric prefix | ||
| Metrics = new() { Prometheus = new(promAddr), MetricPrefix = "foo_" }, | ||
| }, | ||
| WorkerHeartbeatInterval = null, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious why these had to be set to null in these tests (just trying to make sure our users won't have to do this in their tests)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ahh this is due to the metrics test failing from the Worker heartbeating configured for runtime, but server version does not support it warning message, the solution is either to disable it for now, or we can set the dynamic config flag to support heartbeating
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the test fails from seeing an additional log statement
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hrmm, I don't see where these tests are checking logs. Can you explain a bit more how these tests were failing before setting this to null? I want to make sure our users don't have to do the same thing in the same situations. I am not seeing where the metrics would clash with worker heartbeat metrics, but I may be missing it. If unclear where it is failing, I can help debug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed off-PR, provided diff for TestUtils.cs to help this situation
What was changed
Why?
New worker heartbeating feature
Checklist
Closes [Feature Request] Enable Worker Heartbeating #551
How was this tested:
Manually tested
Any docs updates needed?
Note
Enables worker heartbeating, plumbs plugin names to Core, updates interop/bindings and build paths, and renames the native bridge library to temporalio_sdk_core_c_bridge across code, CI, and docs.
TemporalRuntimeOptions.WorkerHeartbeatIntervaland propagate to Core (worker_heartbeat_interval_millis).TemporalCoreByteArrayRefArrayon client/worker options.no_remote_activitieswithTemporalCoreWorkerTaskTypesinTemporalCoreWorkerOptionsand populate based on registered work.Bridge.Workerconstructor acceptsclientPluginsand forwards to interop conversion.sdk-core/crates/sdk-core-c-bridgeand adjust allDllImportnames and library path totemporalio_sdk_core_c_bridge.TemporalCoreWorkerTaskTypes, extra fields on client/runtime/worker options.temporal_*totemporalio_*.temporalio_sdk_core_c_bridge.*; update.csproj,build/net462/Temporalio.targets, and GitHub Actions workflow to new paths and Cargo manifest.Written by Cursor Bugbot for commit 2bbb1f4. This will update automatically on new commits. Configure here.