Enhanced Stability, Pause Logic, and Sentinel Improvements #59

Open
chideat wants to merge 14 commits into main from feat/1.1.0

chideat (Owner) commented Sep 19, 2025

This release introduces significant improvements to the Valkey operator, focusing on stability enhancements, refined pause/resume functionality, and robust sentinel failover handling. Key features include improved configuration management, better resource cleanup, and enhanced monitoring capabilities.

Changes

🚀 New Features

  • Enhanced Pause Logic: Improved pause status handling that sets pause status only after all pods are deleted
  • Force Failover Support: Added support for force failover to refresh sentinel node announcements
  • Annotation Refactoring: Major refactor of annotation merging for restart annotations
  • Service Comparison: Major service comparison improvements and actor enhancements
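
The pause behaviour listed above (set the paused status only once every pod is gone, and requeue otherwise) can be sketched roughly as follows. This is a minimal, dependency-free illustration, not the operator's code: `pauseDecision`, `decidePause`, and the 5-second requeue interval are hypothetical names and values chosen for the example.

```go
package main

import (
	"fmt"
	"time"
)

// pauseDecision is a hypothetical result type for this sketch; it is not
// taken from the operator's code base.
type pauseDecision struct {
	MarkPaused   bool          // true when the paused status may be recorded
	RequeueAfter time.Duration // non-zero means "reconcile again later"
}

// decidePause reports what a reconcile pass should do when a pause has been
// requested. remainingPods is the number of pods that still exist.
func decidePause(pauseRequested bool, remainingPods int) pauseDecision {
	if !pauseRequested {
		return pauseDecision{}
	}
	if remainingPods > 0 {
		// Pods are still terminating: requeue rather than marking paused.
		// The 5s interval is an assumption made for this example.
		return pauseDecision{RequeueAfter: 5 * time.Second}
	}
	// All pods are gone, so the paused status can be set safely.
	return pauseDecision{MarkPaused: true}
}

func main() {
	for _, pods := range []int{3, 1, 0} {
		d := decidePause(true, pods)
		fmt.Printf("pods=%d paused=%t requeueAfter=%s\n", pods, d.MarkPaused, d.RequeueAfter)
	}
}
```

Keeping the decision in a pure function like this makes the "pause only after scale-down" rule easy to unit test separately from the controller.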

🐛 Bug Fixes

  • Config Update Fix: Resolved bug in config updates during Redis version upgrades
  • Resource Cleanup: Fixed duplicate resource settings cleanup
  • Pod Annotation: Corrected IsPodAnnotationDiff function name across RDS controllers
  • Cluster Node Handling: Improved handling of failed cluster nodes

🔧 Technical Improvements

  • Helper Commands: Updated helper commands and initialization scripts
  • Sentinel Startup: Increased sentinel startup probe delay for better stability
  • Major Refactor: Comprehensive service comparison and actor system improvements

Test Plan

  • Verify pause/resume functionality works correctly
  • Test force failover operations with sentinel clusters
  • Validate config updates during version upgrades
  • Confirm resource cleanup eliminates duplicate settings
  • Test annotation merging for restart operations
  • Verify cluster node healing and rebalancing

Files Changed

The changes span 50+ files, including controller logic, builder patterns, operator engines, and Valkey client implementations, with a focus on cluster, failover, and sentinel operations.

This release significantly improves the operator's reliability and handling of edge cases in production environments.

chideat and others added 10 commits September 17, 2025 08:05
- Update function call from IsPodAnnonationDiff to IsPodAnnotationDiff
- Maintain all other resource cleanup and finalizer functionality
- Add comprehensive IsServiceChanged function for detailed service comparison
- Refactor actor ensure resource ordering and method names
- Improve service change detection with proper label/annotation comparison
- Enhance statefulset handling with better error checking
- Add utility functions for service port and spec comparison
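
As an illustration of the service comparison described in the bullets above, here is a minimal sketch using only the standard library. The `service` and `servicePort` types are simplified, hypothetical stand-ins for the Kubernetes API types, and `isServiceChanged` is not the PR's `IsServiceChanged`; the sketch only shows the idea of treating empty values as equal (nil versus empty maps, and an unset protocol versus Kubernetes' `TCP` default).

```go
package main

import "fmt"

// servicePort and service are simplified stand-ins for the Kubernetes
// Service types, defined here only to keep the sketch self-contained.
type servicePort struct {
	Name       string
	Port       int32
	TargetPort int32
	Protocol   string
}

type service struct {
	Labels      map[string]string
	Annotations map[string]string
	Selector    map[string]string
	Type        string
	Ports       []servicePort
}

// sameStringMap treats nil and empty maps as equal.
func sameStringMap(a, b map[string]string) bool {
	if len(a) != len(b) {
		return false
	}
	for k, v := range a {
		if w, ok := b[k]; !ok || w != v {
			return false
		}
	}
	return true
}

// normalizePort fills in the default that Kubernetes applies when the
// protocol is left empty.
func normalizePort(p servicePort) servicePort {
	if p.Protocol == "" {
		p.Protocol = "TCP"
	}
	return p
}

// samePorts compares ports position by position after normalization.
func samePorts(a, b []servicePort) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if normalizePort(a[i]) != normalizePort(b[i]) {
			return false
		}
	}
	return true
}

// isServiceChanged reports whether desired differs from current in any
// field this sketch tracks.
func isServiceChanged(current, desired service) bool {
	return current.Type != desired.Type ||
		!sameStringMap(current.Labels, desired.Labels) ||
		!sameStringMap(current.Annotations, desired.Annotations) ||
		!sameStringMap(current.Selector, desired.Selector) ||
		!samePorts(current.Ports, desired.Ports)
}

func main() {
	current := service{Type: "ClusterIP", Ports: []servicePort{{Name: "client", Port: 6379, TargetPort: 6379}}}
	desired := service{Type: "ClusterIP", Labels: map[string]string{}, Ports: []servicePort{{Name: "client", Port: 6379, TargetPort: 6379, Protocol: "TCP"}}}
	fmt.Println("changed:", isServiceChanged(current, desired))
}
```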
Refactored the annotation merging logic to specifically handle the `RestartAnnotationKey`.

- Introduced `MergeRestartAnnotation` to compare and merge restart annotations based on timestamps.
- Replaced the generic `MergeAnnotations` function with the new specialized function.
- Ensured that the restart annotation is correctly propagated during updates.
- Modified the pause logic in the cluster, failover, and sentinel controllers to requeue the resource if nodes still exist, allowing them to scale down gracefully. The operator will now pause reconciliation only after all pods have been terminated.
- Increased the initial delay for the sentinel startup probe to 30 seconds to prevent premature failures on slower systems.
- Modified cluster, failover, and sentinel command implementations
- Updated initialization scripts for different deployment modes
- Improved helper functionality across command modules
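
The timestamp-based restart annotation merge mentioned in the commit notes above could look roughly like the sketch below. It is an assumption-laden illustration rather than the PR's `MergeRestartAnnotation`: the annotation key `example.com/restartedAt`, the RFC 3339 timestamp format, and the function name `mergeRestartAnnotation` are all hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// restartAnnotationKey is a hypothetical key used only in this sketch.
const restartAnnotationKey = "example.com/restartedAt"

// mergeRestartAnnotation returns a copy of desired in which the restart
// annotation holds the newer of the current and desired timestamps.
// Timestamps are assumed to be RFC 3339 strings.
func mergeRestartAnnotation(current, desired map[string]string) map[string]string {
	merged := make(map[string]string, len(desired)+1)
	for k, v := range desired {
		merged[k] = v
	}

	curVal, ok := current[restartAnnotationKey]
	if !ok {
		return merged // nothing to carry over
	}
	curTime, err := time.Parse(time.RFC3339, curVal)
	if err != nil {
		return merged // an unparseable current value is ignored
	}

	if desVal, ok := desired[restartAnnotationKey]; ok {
		desTime, err := time.Parse(time.RFC3339, desVal)
		if err == nil && !curTime.After(desTime) {
			return merged // desired is the same age or newer: keep it
		}
	}

	// Desired is missing, invalid, or older: keep the current restart mark
	// so that an update does not silently undo a requested restart.
	merged[restartAnnotationKey] = curVal
	return merged
}

func main() {
	current := map[string]string{restartAnnotationKey: "2025-09-19T10:00:00Z"}
	desired := map[string]string{restartAnnotationKey: "2025-09-18T10:00:00Z", "team": "db"}
	fmt.Println(mergeRestartAnnotation(current, desired))
}
```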

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings September 19, 2025 10:47
Copilot AI left a comment

Pull Request Overview

This pull request introduces comprehensive enhancements to the Valkey operator focusing on stability improvements, refined pause/resume functionality, and robust sentinel failover handling. The changes modernize string operations, improve service comparison logic, enhance config management with better version tracking, and strengthen resource cleanup mechanisms.

  • Major refactor of string operations from strings.Replace to strings.ReplaceAll and adoption of strings.SplitSeq for better performance
  • Enhanced service comparison using go-cmp library with proper empty value handling and comprehensive field validation
  • Improved configuration management with last-applied-config tracking and better version upgrade handling

Reviewed Changes

Copilot reviewed 60 out of 60 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| pkg/valkey/valkey.go | Updates string operations to use strings.SplitSeq for improved iteration |
| pkg/kubernetes/clientset/service.go | Enhances service comparison logic with go-cmp library |
| internal/valkey/node.go | Modernizes string replacement using strings.ReplaceAll |
| internal/util/kubernetes.go | Major refactor of comparison logic and addition of comprehensive service change detection |
| internal/ops//actor/.go | Updates resource management with better service handling and configuration tracking |
| internal/builder/*.go | Improves annotation merging and configuration management |
| cmd/helper/commands//.go | Simplifies service access logic and removes redundant environment variables |

chideat and others added 3 commits September 19, 2025 22:05
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Seer <kvcnow@gmail.com>
Copilot AI left a comment

Pull request overview

Copilot reviewed 62 out of 63 changed files in this pull request and generated 6 comments.

