-
Notifications
You must be signed in to change notification settings - Fork 1
feature/SP-7272, pki-authority service for swarm #134
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: swarm
Are you sure you want to change the base?
Conversation
… 10.13.0.0/16 subnet, improving network security configurations for the VM environment.
…c on TCP port 443 from all sources, enhancing accessibility for the API server while maintaining security configurations.
…nce security configurations while maintaining existing network access rules.
…incoming traffic for the API server, improving accessibility while maintaining security configurations.
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the WalkthroughThe changes migrate PKI Authority from a systemd-based service to a containerized LXC-based architecture with Python-driven orchestration. Key modifications include removing legacy systemd units, introducing comprehensive Python modules for container lifecycle and networking management, restructuring PCCS configuration with SSL key generation, adding SSL passthrough routing to OpenResty, and integrating PKI Authority into the SwarmDB framework via a new plugin system. Changes
Sequence Diagram(s)sequenceDiagram
participant Admin as Admin/Orchestrator
participant Plugin as PKI EventHandler
participant LXC as LXC Daemon
participant Container as PKI Container
participant Redis as Redis Cluster
participant Net as Network/iptables
Admin->>Plugin: apply() event
activate Plugin
Plugin->>Plugin: detect_cpu_type()
Plugin->>Plugin: detect_vm_mode()
Plugin->>Plugin: get_pki_domain()
Plugin->>LXC: create(archive_path)
activate LXC
LXC->>Container: extract & initialize
LXC-->>Plugin: container created
deactivate LXC
Plugin->>Plugin: patch_yaml_config(cpu_type, vm_mode, domain)
Plugin->>Plugin: patch_lxc_config(cpu_type)
Plugin->>Container: copy patched configs
Plugin->>Net: setup_iptables(wg_ip)
activate Net
Net->>Net: ensure NAT/DNAT/MASQUERADE rules
Net-->>Plugin: iptables configured
deactivate Net
Plugin->>LXC: start(timeout=30)
activate LXC
LXC->>Container: lxc-start
Container->>Container: systemd service startup
LXC-->>Plugin: container started
deactivate LXC
Plugin->>Container: health check loop
Container-->>Plugin: service healthy (uptime + HTTP /healthcheck)
Plugin->>Plugin: create_gateway_endpoints()
Plugin->>Redis: store route config (routes:<domain>)
activate Redis
Redis-->>Plugin: route stored
deactivate Redis
Plugin->>Plugin: create_output() & save properties
Plugin-->>Admin: completed
deactivate Plugin
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 3❌ Failed checks (1 warning, 2 inconclusive)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/services/apps/openresty/main.py (1)
679-686: Dead code:total_weightis computed but never used.The
total_weightis calculated by summing weights from all targets, but the actual round-robin selection at line 684 uses simple modulo with#targets, ignoring the weights entirely. Either remove the unused computation or implement weighted round-robin.🔎 Option 1: Remove unused code if weights aren't needed
if policy == "rr" then -- Round-robin local counters = ngx.shared.rr_counters local counter = counters:get(domain) or 0 - local total_weight = 0 - for _, t in ipairs(targets) do - total_weight = total_weight + (t.weight or 1) - end local idx = (counter % #targets) + 1 target_url = targets[idx].url counters:incr(domain, 1, 0)
🧹 Nitpick comments (13)
src/services/apps/openresty/main.py (2)
429-446: Consider Redis connection efficiency in stream context.Each new TLS connection triggers a sequential Redis connection attempt across all hosts without connection pooling. While the 30-second cache mitigates this for repeated requests to the same domain, high-traffic scenarios with many unique domains could create significant Redis connection overhead.
For improved performance, consider using
lua-resty-redis-connectoror implementing connection pooling withset_keepalive()instead ofclose(). However, since this is in the stream context (not HTTP), connection pooling options are more limited.
1037-1041: TODO: Graceful shutdown not implemented.The finalize handler currently does nothing. For production use, consider implementing connection draining using OpenResty's
ngx.worker.exit()or sending a QUIT signal to nginx workers to allow in-flight requests to complete.Do you want me to help implement graceful shutdown logic, or open an issue to track this task?
src/rootfs/files/scripts/install_pccs.sh (2)
110-118: SSL key generation has security limitations for production use.The function generates a self-signed certificate with the following characteristics:
- CN=localhost (not suitable for multi-host deployments)
- 2048-bit RSA key (acceptable, but 4096 is more secure)
- 365-day validity (requires annual renewal)
- Private key without passphrase protection
While this is appropriate for development/testing environments, production deployments should consider:
- Using CA-signed certificates with proper domain names
- Implementing certificate rotation/renewal processes
- Protecting private keys with stronger access controls
Optional: Add check for existing SSL keys
function generate_ssl_keys() { log_info "generating SSL keys for PCCS"; + + if [ -d "${OUTPUTDIR}${PCCS_ORIGINAL_LOCATION}/${PCCS_DIRNAME}/ssl_key" ]; then + log_info "SSL keys directory already exists, skipping generation"; + return 0; + fi + mkdir -p "${OUTPUTDIR}${PCCS_ORIGINAL_LOCATION}/${PCCS_DIRNAME}/ssl_key";
16-20: Consider making credentials configurable via environment variables.The PCCS API key and password are hardcoded in the script. While this may be acceptable for build-time configuration, consider making these values configurable via environment variables with fallback defaults for improved security and flexibility.
Recommended: Make credentials configurable
# Configuration variables -PCCS_API_KEY="aecd5ebb682346028d60c36131eb2d92" -PCCS_PORT="8081" -PCCS_PASSWORD="pccspassword123" +PCCS_API_KEY="${PCCS_API_KEY:-aecd5ebb682346028d60c36131eb2d92}" +PCCS_PORT="${PCCS_PORT:-8081}" +PCCS_PASSWORD="${PCCS_PASSWORD:-pccspassword123}" # Generate SHA512 hash of the password USER_TOKEN=$(echo -n "${PCCS_PASSWORD}" | sha512sum | awk '{print $1}')This allows users to override defaults by setting environment variables before running the script.
src/Dockerfile (1)
251-252: UseCOPYinstead ofADDfor local files.Hadolint correctly flags that
ADDshould not be used for local file/folder copying.COPYis preferred as it's more explicit and doesn't have the implicit behaviors ofADD(like auto-extracting archives).Proposed fix
-ADD rootfs/files/configs/pki-service/lxc-swarm-template.yaml "${OUTPUTDIR}/etc/super/containers/pki-authority/lxc-swarm-template.yaml" -ADD rootfs/files/configs/pki-service/lxc-legacy-vm-template.yaml "${OUTPUTDIR}/etc/super/containers/pki-authority/lxc-legacy-vm-template.yaml" +COPY rootfs/files/configs/pki-service/lxc-swarm-template.yaml "${OUTPUTDIR}/etc/super/containers/pki-authority/lxc-swarm-template.yaml" +COPY rootfs/files/configs/pki-service/lxc-legacy-vm-template.yaml "${OUTPUTDIR}/etc/super/containers/pki-authority/lxc-legacy-vm-template.yaml"src/swarm-scripts/80.setup-pki-authority.sh (1)
35-35: Remove unused variableCLI.The
CLIvariable is defined but never used in the script. Shellcheck correctly identifies this as dead code.Proposed fix
-CLI="$(dirname "$0")/swarm-cli.sh" echo "Creating/Updating ClusterPolicies '$CLUSTER_POLICY'..."src/services/apps/pki-authority/manifest.yaml (1)
59-67: Potential null access in node_id resolution.Lines 66 and 80 use a complex pattern to resolve
node_idfrom clusternodes. If no matching clusternode is found (e.g., due to timing issues or data inconsistency), the select will return nothing and.nodewill fail silently or return null.Consider adding a fallback or using
// nullto handle missing clusternodes gracefully:node_id: ((.cluster_node as $cn | $swarmdb.clusternodes[] | select(.id == $cn)) | .node) // nullsrc/services/apps/pki-authority/main.py (2)
134-154: Consider closing Redis connection after use.The
RedisClusterclient is created but not explicitly closed after operations. While Python's garbage collection will eventually clean up, explicitly closing the connection is a best practice to release resources promptly, especially in long-running services.Proposed fix
try: redis_client = RedisCluster( startup_nodes=startup_nodes, decode_responses=True, skip_full_coverage_check=True, socket_connect_timeout=5, ) redis_client.ping() redis_client.set(route_key, route_json) log(LogLevel.INFO, f"Successfully set gateway route {route_key} in Redis Cluster") + redis_client.close() if self.cluster_properties is None: self.cluster_properties = {} self.cluster_properties[self.PROP_REGISTERED_ENDPOINTS] = ";".join(current_endpoints) except Exception as e: error_msg = f"Failed to set route in Redis Cluster: {str(e)}"
159-159: Minor: Missing space in comparison.Proposed fix
- elif self.status =="postponed": + elif self.status == "postponed":src/services/apps/pki-authority/helpers.py (4)
457-460: Remove extraneousfprefix from strings without placeholders.These f-strings don't contain any placeholders, so the
fprefix is unnecessary.Proposed fix
if check_result.returncode == 0: - log(LogLevel.INFO, f"Rule already exists") + log(LogLevel.INFO, "Rule already exists") else: subprocess.run(add_args, check=True) - log(LogLevel.INFO, f"Rule added") + log(LogLevel.INFO, "Rule added")
156-225: Consider simplifying uptime calculation.The current implementation shells out to
datecommand to parse the timestamp. Since you're already in Python, consider usingdatetime.strptimefor parsing, which would be more portable and avoid the extra subprocess call.Example approach
from datetime import datetime # Instead of shelling out to date, parse directly: # ActiveEnterTimestamp format is typically like "Wed 2025-01-01 12:00:00 UTC" try: start_dt = datetime.strptime(timestamp_str, "%a %Y-%m-%d %H:%M:%S %Z") uptime_seconds = int((datetime.now() - start_dt).total_seconds()) except ValueError: # Handle parsing error passNote: The current approach works correctly; this is just a cleanliness suggestion.
443-448: Consider using iterable unpacking instead of concatenation.Ruff suggests using iterable unpacking for cleaner code.
Proposed fix
for rule in rules: # Delete rules that contain our comment if IPTABLES_RULE_COMMENT in rule: delete_rule = rule.replace("-A", "-D", 1) - subprocess.run(["iptables", "-t", "nat"] + delete_rule.split()[1:], check=True) + subprocess.run(["iptables", "-t", "nat", *delete_rule.split()[1:]], check=True) log(LogLevel.INFO, f"Deleted iptables rule: {delete_rule}")
310-312: Consider raising an exception instead ofsys.exit(1).Using
sys.exit(1)in a helper function prevents proper error handling by callers. The calling code inmain.pywraps operations in try/except, so raising an exception would allow for better error propagation and logging.Proposed fix
if not src_yaml.exists(): - log(LogLevel.ERROR, f"Error: {src_yaml} not found.") - sys.exit(1) + error_msg = f"Template not found: {src_yaml}" + log(LogLevel.ERROR, error_msg) + raise FileNotFoundError(error_msg)The same applies to similar
sys.exit(1)calls inget_bridge_ip()(lines 401, 407) andupdate_pccs_url()(line 567).
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Jira integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (11)
src/Dockerfilesrc/rootfs/files/configs/pki-service/create-and-configure-pki.shsrc/rootfs/files/configs/pki-service/lxc-legacy-vm-template.yamlsrc/rootfs/files/configs/pki-service/lxc-swarm-template.yamlsrc/rootfs/files/configs/pki-service/pki-authority.servicesrc/rootfs/files/scripts/install_pccs.shsrc/services/apps/openresty/main.pysrc/services/apps/pki-authority/helpers.pysrc/services/apps/pki-authority/main.pysrc/services/apps/pki-authority/manifest.yamlsrc/swarm-scripts/80.setup-pki-authority.sh
💤 Files with no reviewable changes (2)
- src/rootfs/files/configs/pki-service/pki-authority.service
- src/rootfs/files/configs/pki-service/create-and-configure-pki.sh
🧰 Additional context used
🧬 Code graph analysis (1)
src/services/apps/pki-authority/helpers.py (1)
src/services/apps/pki-authority/main.py (1)
destroy(350-384)
🪛 Hadolint (2.14.0)
src/Dockerfile
[error] 251-251: Use COPY instead of ADD for files and folders
(DL3020)
[error] 252-252: Use COPY instead of ADD for files and folders
(DL3020)
🪛 Ruff (0.14.10)
src/services/apps/openresty/main.py
376-736: Possible SQL injection vector through string-based query construction
(S608)
src/services/apps/pki-authority/main.py
40-40: missing closing quote in string literal
(invalid-syntax)
src/services/apps/pki-authority/helpers.py
61-61: subprocess call: check for execution of untrusted input
(S603)
62-62: Starting a process with a partial executable path
(S607)
72-72: subprocess call: check for execution of untrusted input
(S603)
73-73: Starting a process with a partial executable path
(S607)
83-83: subprocess call: check for execution of untrusted input
(S603)
84-84: Starting a process with a partial executable path
(S607)
99-99: Starting a process with a partial executable path
(S607)
106-106: Consider moving this statement to an else block
(TRY300)
107-107: Do not catch blind exception: Exception
(BLE001)
114-114: subprocess call: check for execution of untrusted input
(S603)
115-115: Starting a process with a partial executable path
(S607)
120-120: Consider moving this statement to an else block
(TRY300)
121-121: Do not catch blind exception: Exception
(BLE001)
128-128: subprocess call: check for execution of untrusted input
(S603)
129-129: Starting a process with a partial executable path
(S607)
140-140: subprocess call: check for execution of untrusted input
(S603)
141-147: Starting a process with a partial executable path
(S607)
151-151: Consider moving this statement to an else block
(TRY300)
160-160: subprocess call: check for execution of untrusted input
(S603)
161-161: Starting a process with a partial executable path
(S607)
173-173: subprocess call: check for execution of untrusted input
(S603)
174-175: Starting a process with a partial executable path
(S607)
187-187: subprocess call: check for execution of untrusted input
(S603)
188-188: Starting a process with a partial executable path
(S607)
208-208: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
214-214: Do not catch blind exception: Exception
(BLE001)
217-217: Do not catch blind exception: Exception
(BLE001)
221-221: Consider moving this statement to an else block
(TRY300)
223-223: Do not catch blind exception: Exception
(BLE001)
279-279: Abstract raise to an inner function
(TRY301)
285-285: Abstract raise to an inner function
(TRY301)
288-288: Consider moving this statement to an else block
(TRY300)
295-295: Create your own exception
(TRY002)
393-393: subprocess call: check for execution of untrusted input
(S603)
394-394: Starting a process with a partial executable path
(S607)
416-416: subprocess call: check for execution of untrusted input
(S603)
417-417: Starting a process with a partial executable path
(S607)
425-425: subprocess call: check for execution of untrusted input
(S603)
426-426: Starting a process with a partial executable path
(S607)
436-436: subprocess call: check for execution of untrusted input
(S603)
437-437: Starting a process with a partial executable path
(S607)
447-447: subprocess call: check for execution of untrusted input
(S603)
447-447: Consider iterable unpacking instead of concatenation
Replace with iterable unpacking
(RUF005)
454-454: subprocess call: check for execution of untrusted input
(S603)
457-457: f-string without any placeholders
Remove extraneous f prefix
(F541)
459-459: subprocess call: check for execution of untrusted input
(S603)
460-460: f-string without any placeholders
Remove extraneous f prefix
(F541)
🪛 Shellcheck (0.11.0)
src/swarm-scripts/80.setup-pki-authority.sh
[warning] 35-35: CLI appears unused. Verify use (or export if used externally).
(SC2034)
🔇 Additional comments (10)
src/services/apps/openresty/main.py (2)
376-522: LGTM: SSL passthrough stream block is well-designed.The new stream block correctly implements:
- SNI-based routing with
ssl_preread- Caching with 30-second TTL to reduce Redis load
- Multiple routing policies (round-robin, IP hash)
- Proper fallback to SSL termination mode via 127.0.0.1:8443
The static analysis warning about SQL injection (S608) is a false positive — this generates Nginx configuration, not SQL queries.
606-608: Verify: HTTPS server now binds to localhost only.The HTTPS server has been changed from listening on all interfaces (port 443) to listening only on
127.0.0.1:8443. This is correct for the new architecture where the stream block handles external port 443 and proxies SSL-termination traffic internally. Ensure no external clients expect direct HTTPS access on port 443 without going through the stream proxy.src/rootfs/files/scripts/install_pccs.sh (3)
70-73: Configuration path updated correctly for the new flow.The config directory and file are now created at
PCCS_ORIGINAL_LOCATIONbefore the move operation, which aligns with the new execution order wheremove_pccs_to_custom_locationis called later (line 158).
83-83: Verify the caching mode change aligns with requirements.The
CachingFillModehas been changed from"REQ"to"LAZY". This modifies PCCS caching behavior—LAZY mode defers certificate fetching until requested, which may increase initial request latency but reduce startup overhead.
155-158: The execution order is not problematic.The service is not executed during the build phase—
enable_pccs_serviceonly creates a symlink in the build chroot. At runtime, thepccs-init.service(marked withBefore=pccs.service) runs first and ensures PCCS files are present at/opt/intelby copying them from/usr/localif needed (seepccs-init.shlines 9-11). Therefore, thepccs.servicewill always find PCCS at the expected/opt/intellocation, regardless of where files are stored during the build.Likely an incorrect or invalid review comment.
src/swarm-scripts/80.setup-pki-authority.sh (1)
1-44: LGTM!The script is well-structured with proper error handling (
set -euo pipefail), sensible defaults for environment variables, and validation of the manifest file before proceeding. The bootstrap logic for creating ClusterPolicies and ClusterServices is clean.src/rootfs/files/configs/pki-service/lxc-swarm-template.yaml (1)
1-33: LGTM!The template correctly defines the PKI subroot configuration with appropriate challenge types (token, tdx, sev-snp), GitHub-based signature verification for attestation, and file-based storage. The structure aligns with the patching logic in
helpers.pywhich will overrideownChallenge.type,ownDomain, andattestationServiceSource.modeat runtime.src/services/apps/pki-authority/main.py (2)
255-263: Safe list modification during iteration.The code iterates over
missing_properties[:](a copy) while modifying the original list with.remove(). This is correct and avoids the pitfall of modifying a list while iterating over it directly. Good pattern usage.
350-385: LGTM!The destroy logic properly handles the teardown sequence: stop container → destroy container → clean up iptables → conditionally delete Redis route. The check
len(self.pki_cluster_nodes) <= 1correctly identifies the last node scenario, and the exception handling for Redis deletion prevents the destroy operation from failing due to network issues.src/services/apps/pki-authority/helpers.py (1)
1-44: LGTM!The helper module provides a comprehensive set of utilities for PKI Authority container management. Key strengths:
- Clean separation of concerns with dedicated functions for iptables, LXC operations, and configuration patching
- Proper use of enums for log levels and VM modes
- Consistent logging throughout
- Type hints for better code documentation
|
|
||
|
|
||
| class EventHandler: | ||
| """Handler for plugin events with unified exit point."""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Critical: Syntax error in docstring.
There are four double quotes instead of three, which causes a syntax error. The static analyzer correctly flagged "missing closing quote in string literal".
Proposed fix
class EventHandler:
- """Handler for plugin events with unified exit point.""""
+ """Handler for plugin events with unified exit point."""📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| """Handler for plugin events with unified exit point."""" | |
| class EventHandler: | |
| """Handler for plugin events with unified exit point.""" |
🧰 Tools
🪛 Ruff (0.14.10)
40-40: missing closing quote in string literal
(invalid-syntax)
🤖 Prompt for AI Agents
In src/services/apps/pki-authority/main.py around line 40, the docstring uses
four double quotes causing a syntax error; fix it by replacing the four double
quotes with a proper triple-quoted string delimiter (""" or '''), ensuring the
opening and closing quotes match and the docstring content remains unchanged so
the string literal is valid.
Summary by CodeRabbit
Release Notes
New Features
Refactor
✏️ Tip: You can customize this high-level summary in your review settings.