Add sticky sessions to premium target groups #989
Closed
milesAraya wants to merge 1731 commits into develop-main from
Fixing error during account delete
Run enabled before S3 upload finished
…lock session scoping
…concurrency constants
… appropriate one (RemoteStorageSimpleReader -> RemoteStorageReader)
….com/arayabrain/optinist-for-cloud into fix/records-page-missing-experiments
…mUser && and add endpoint to send to cloudwatch log
…riments Publish experiments on Records page regardless of lock
Premium popup fix and logging
Remove outdated docstrings with references to development cases
…ain/optinist-for-cloud into hotfix/storage-quota-fix
Fix storage quota after upgrade premium
- Added dedicated reproduce api for private dataview
- Adjust the frontend to call the reproduce api of the private dataview
- Generalized find_dataview_record.DataviewService
- Add `Depends(is_workspace_available)` to dataview reproduce API
- Change the return type of `_ensure_experiment_downloaded` to the appropriate one
Fix/private dataview reproduce
Fix/premium popup 2
…d skip showing the assigning popup if the async call resolved quickly
Add flushSync to premium popup
Update frontend package.json to v1.0.0
Content
Summary
- … `compute.tf`. Retroactively apply stickiness to pre-existing TGs via the `DuplicateTargetGroupName` fallback and `assign_premium_user` inline creation.
- `fix_incorrect_is_shared_flags()` in scheduled monitoring before `process_shared_instance_optimization()` so stale flags are corrected before the optimizer attempts unnecessary migrations.
- `ModifyTargetGroupAttributes` IAM permission required by the new `modify_target_group_attributes` API call.

Design Decisions
- `_enable_sticky_sessions()` instead of inline calls. Three call sites need the same `modify_target_group_attributes` invocation. A helper keeps them consistent and avoids duplicating the attribute list.
- `try/except` in the helper, not at each call site. A TG without stickiness is better than no TG at all. If the API call fails (throttling, transient error), TG creation still succeeds and stickiness is retried on the next assignment via the `DuplicateTargetGroupName` handler.
- `DuplicateTargetGroupName` handler. Pre-existing TGs created before this change lack stickiness. The handler now applies it transparently, so no manual migration is needed.
- `fix_incorrect_is_shared_flags()` called before `process_shared_instance_optimization()`. Correcting stale flags first prevents the optimizer from attempting unnecessary migrations for users already alone on their instance.
- `try/except` for `fix_incorrect_is_shared_flags()`. Failure doesn't break the existing optimization step. Follows the file's convention of inline `import traceback`.
- `STICKY_SESSION_DURATION_SECONDS = 300`. Follows codebase convention of extracting magic numbers into named constants alongside `DEFAULT_IDLE_TIMEOUT_HOURS`, `LOCK_TIMEOUT_SECONDS`, etc.

Evidence
- `compute.tf:85-89` defines the main TG stickiness config (`lb_cookie`, 300s); this change mirrors it for premium TGs.
- `fix_incorrect_is_shared_flags()` (line ~4558) is already implemented with `@with_transaction` safety.

Files changed
- `infrastructure/terraform/premium_manager.tf` -- Add `ModifyTargetGroupAttributes` IAM permission to the Lambda's ELB policy; existing resource scoping (`targetgroup/premium-*`) already covers the target groups
- `infrastructure/terraform/premium_manager_package/premium_manager.py` -- Add `STICKY_SESSION_DURATION_SECONDS` constant; add typed `_enable_sticky_sessions()` helper; call it after TG creation in `create_or_get_target_group()`, in the `DuplicateTargetGroupName` handler, and in `assign_premium_user()` inline creation; call `fix_incorrect_is_shared_flags()` before `process_shared_instance_optimization()` in step 10 of `handle_scheduled_monitoring()`

Manual Testcases
- … `aws elbv2 describe-target-group-attributes --target-group-arn <arn>`)
- `modify_target_group_attributes` failure (temporarily remove IAM permission): verify TG creation still succeeds and a WARNING is logged
- `is_shared=1` for a user who is alone on their instance: trigger scheduled monitoring and verify the flag is corrected to 0
- `process_shared_instance_optimization()` still runs after `fix_incorrect_is_shared_flags()` succeeds or fails

Unit, Integration, Contract Test Coverage
- … `premium_manager.py` Lambda code. No tests added or broken by this change.

Others
Difficulties
Risk Assessment
- `ModifyTargetGroupAttributes` is scoped to existing `targetgroup/premium-*` and `targetgroup/subscr-*` resource ARNs. Must be applied via `terraform apply` before deploying the Lambda code.
- `fix_incorrect_is_shared_flags()` with `@with_transaction`. Isolated `try/except`: failure doesn't break step 10. Idempotent.
- … `modify_target_group_attributes` calls will fail with AccessDenied. The helper's `try/except` makes this a soft failure (logged warning), but stickiness won't take effect until IAM is in place.