From 59a2ab28199ed8f42443ac50c9d15c2df8924f74 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?= Date: Tue, 18 Mar 2025 15:55:50 -0400 Subject: [PATCH 01/52] Update documentation --- docs/README.md | 11 +++++------ docs/terraform_cloud.md | 2 +- 2 files changed, 6 insertions(+), 7 deletions(-) diff --git a/docs/README.md b/docs/README.md index 44e5c189c..5d7fee4bf 100644 --- a/docs/README.md +++ b/docs/README.md @@ -1274,16 +1274,15 @@ It is possible to destroy only the instances and keep the rest of the infrastruc like the floating ip, the volumes, the generated SSH host key, etc. To do so, set the count value of the instance type you wish to destroy to 0. -### 9.2 Reset +### 9.2 Instance Replacement On some occasions, it is desirable to rebuild some of the instances from scratch. -Using `terraform taint`, you can designate resources that will be rebuilt at -next application of the plan. +Using the `-replace` option of `terraform apply`, you can designate resources +that will be rebuilt at next application of the plan. -To rebuild the first login node : +For example, to rebuild the first login node : ``` -terraform taint 'module.openstack.openstack_compute_instance_v2.instances["login1"]' -terraform apply +terraform apply -replace='module.openstack.openstack_compute_instance_v2.instances["login1"]' ``` ## 10. Customize Cluster Software Configuration diff --git a/docs/terraform_cloud.md b/docs/terraform_cloud.md index 412380233..9f7a90dfe 100644 --- a/docs/terraform_cloud.md +++ b/docs/terraform_cloud.md @@ -188,7 +188,7 @@ plan will then be automatically applied. Terraform cloud only allows to apply or destroy the plan as stated in the main.tf, but sometimes it can be useful to run some other terraform commands that are only -available through the command-line interface, for example `terraform taint`. +available through the command-line interface. 
It is possible to import the terraform state of a cluster on your local computer and then use the CLI on it. From 91ecc0d118eb52f114849ce61f1fd644ab6c65b7 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?= Date: Wed, 26 Mar 2025 10:28:44 -0400 Subject: [PATCH 02/52] Replace list of specific subdomains by * This leaves the ability to explicitly specify the A records, but by default all traffic that falls in the wildcard will now be directed to the reverse proxy. --- dns/cloudflare/variables.tf | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/dns/cloudflare/variables.tf b/dns/cloudflare/variables.tf index f021da3bd..46385c820 100644 --- a/dns/cloudflare/variables.tf +++ b/dns/cloudflare/variables.tf @@ -7,7 +7,7 @@ variable "domain" { variable "vhosts" { description = "List of vhost dns records to create as vhost.name.domain_name." type = list(string) - default = ["ipa", "jupyter", "mokey", "explore"] + default = ["*"] } variable "domain_tag" { From f9e2713e1cb4a19e4290c14a7dee44a98cd51bf2 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?= Date: Thu, 27 Mar 2025 09:03:54 -0400 Subject: [PATCH 03/52] Update changelog --- CHANGELOG.md | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/CHANGELOG.md b/CHANGELOG.md index b72d0279c..78b581078 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -3,6 +3,13 @@ All notable changes to this project will be documented in this file. The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/). +## [14.3.0] UNRELEASED + +### Changed + +- [dns] The default list of vhost subdomains has been replaced by `["*"]`. +This simplifies configuration of new virtual hosts in the reverse proxy. + ## [14.2.1] 2025-02-21 No changes to infrastructure code.
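As a hedged illustration of the `vhosts` change in PATCH 02/52: the wildcard default can still be overridden with an explicit list of A records. The sketch below assumes a `dns` module block in the user's `main.tf`; the `source` path is an assumption and the module's other arguments are deliberately left out. Only the `vhosts` variable and its former default come from the diff.

```hcl
module "dns" {
  # Assumed path; point this at the DNS module shipped with your release.
  source = "./dns/cloudflare"

  # The module's other required arguments are omitted from this sketch.

  # Restore an explicit list of vhost A records instead of the ["*"] default.
  vhosts = ["ipa", "jupyter", "mokey", "explore"]
}
```

Leaving `vhosts` unset keeps the wildcard record, so any new virtual host defined in the reverse proxy resolves without a DNS change.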
From f909189f097b793d77ed561558169a4a2f797a22 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?= Date: Thu, 27 Mar 2025 09:28:27 -0400 Subject: [PATCH 04/52] Add docs on prometheus --- docs/README.md | 22 ++++++++++++++++++++++ 1 file changed, 22 insertions(+) diff --git a/docs/README.md b/docs/README.md index 5d7fee4bf..a340c1e2a 100644 --- a/docs/README.md +++ b/docs/README.md @@ -1687,6 +1687,28 @@ a volume, add `enable_resize = true` to its specs map. You can then increase the The corresponding volume will be expanded by the cloud provider and the filesystem will be extended by Puppet. +### 10.15 Access Prometheus' expression browser + +Prometheus is an open-source systems monitoring and alerting toolkit. It is installed by default +in Magic Castle. Every instance exposes its usage metrics, and some services do too. To explore +and visualize this data, it is possible to access the [expression browser](https://prometheus.io/docs/visualization/browser/). + +From inside the cluster, it is typically available at `http://mgmt1:9090`. Provided DNS is configured +for your cluster, you can add the following snippet to your [hieradata](#413-hieradata-optional) to access the expression browser +from the Internet. + +```yaml +lookup_options: + profile::reverse_proxy::subdomains: + merge: 'hash' +profile::reverse_proxy::subdomains: + metrics: "%{lookup('terraform.tag_ip.mgmt.0')}:9090" +profile::reverse_proxy::remote_ips: + metrics: [''] +``` + +Prometheus will then be available at `http://metrics.your-cluster.yourdomain.tld/`. + ## 11. 
Customize Magic Castle Terraform Files You can modify the Terraform module files in the folder named after your cloud From 77855c9effff1f65fe0abb8cdf535a67450cc05b Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?= Date: Mon, 7 Apr 2025 15:23:42 -0400 Subject: [PATCH 05/52] Fix typo in docs --- docs/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/README.md b/docs/README.md index a340c1e2a..ed2cf7d06 100644 --- a/docs/README.md +++ b/docs/README.md @@ -1111,7 +1111,7 @@ Refer to the subsection [6.3](#63-unsupported-providers) for more details. #### 6.1.2 Cloudflare API Token -If you prefer using an API token instead of the global API key, you will need to configure a token with the following four permissions with the [Cloudflare API Token interface](https://dash.cloudflare.com/profile/api-tokens). +If you prefer using an API token instead of the global API key, you will need to configure a token with the following permissions using the [Cloudflare API Token interface](https://dash.cloudflare.com/profile/api-tokens). 
| Section | Subsection | Permission| | :------ |:---------- | :-------- | From 50736543f2cb14aee9909fb2869c03aacf88e101 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?= Date: Thu, 10 Apr 2025 12:33:38 -0400 Subject: [PATCH 06/52] Make sure ssh keys do not have whitespace prefix or suffix Otherwise, puppet risks erroring when applying the catalog with this message: Failed to apply catalog: Parameter key failed on Ssh_authorized_key[centos_0]: Key must not contain whitespace: --- common/configuration/main.tf | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/common/configuration/main.tf b/common/configuration/main.tf index 7592d6b6a..6c8420dc8 100644 --- a/common/configuration/main.tf +++ b/common/configuration/main.tf @@ -47,6 +47,7 @@ resource "random_pet" "guest_passwd" { locals { puppet_passwd = random_string.puppet_passwd.result guest_passwd = var.guest_passwd != "" ? var.guest_passwd : try(random_pet.guest_passwd[0].id, "") + public_keys = [for key in var.public_keys: trimspace(key)] puppetservers = { for host, values in var.inventory: host => values.local_ip if contains(values.tags, "puppet")} all_tags = toset(flatten([for key, values in var.inventory : values.tags])) @@ -70,7 +71,7 @@ locals { tag_ip = local.tag_ip data = { sudoer_username = var.sudoer_username - public_keys = var.public_keys + public_keys = local.public_keys cluster_name = lower(var.cluster_name) domain_name = var.domain_name guest_passwd = local.guest_passwd @@ -98,7 +99,7 @@ locals { puppetservers = local.puppetservers, puppetserver_password = local.puppet_passwd, sudoer_username = var.sudoer_username, - ssh_authorized_keys = var.public_keys + ssh_authorized_keys = local.public_keys tf_ssh_public_key = tls_private_key.ssh.public_key_openssh # If there is no bastion, the terraform data has to be packed with the user_data of the puppetserver. 
# We do not packed it systematically because it increases the user-data size to a value that can be From e898422040e9b2dc33c55c7df18d1e135c5bab3f Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?= Date: Thu, 10 Apr 2025 16:03:31 -0400 Subject: [PATCH 07/52] Reduce choices of availability zones in AWS Limit the choices to only what is possible given the selected instance types instead of all available AZs. --- aws/infrastructure.tf | 60 ++++++++++++++++++++++++++++++++++++------- 1 file changed, 51 insertions(+), 9 deletions(-) diff --git a/aws/infrastructure.tf b/aws/infrastructure.tf index 0a567fb72..ec5955aaf 100644 --- a/aws/infrastructure.tf +++ b/aws/infrastructure.tf @@ -48,22 +48,58 @@ module "provision" { data "aws_availability_zones" "available" { state = "available" + lifecycle { + postcondition { + condition = var.availability_zone == "" || contains(self.names, var.availability_zone) + error_message = "var.availability_zone must be one of ${jsonencode(self.names)}" + } + } +} + +# Retrieve the availability zones in which each unique instance type is available +data "aws_ec2_instance_type_offerings" "inst_az" { + filter { + name = "instance-type" + values = distinct([for instance in module.design.instances: instance.type]) + } + location_type = "availability-zone" +} + +# Build a set of availability zones that offer all selected instance types +locals { + instance_types = distinct([for instance in module.design.instances: instance.type]) + az_choices = setintersection(data.aws_availability_zones.available.names, values({ + for type in local.instance_types: type => + [ for idx, zone in data.aws_ec2_instance_type_offerings.inst_az.locations: zone + if data.aws_ec2_instance_type_offerings.inst_az.instance_types[idx] == type + ] + })...) 
+} + +resource "terraform_data" "az_check" { + lifecycle { + precondition { + condition = length(local.az_choices) > 0 + error_message = "There is not a single availability zone in ${var.region} that provides all instance types you have selected." + } + precondition { + condition = var.availability_zone == "" || contains(local.az_choices, var.availability_zone) + error_message = < Date: Thu, 3 Apr 2025 12:28:22 -0400 Subject: [PATCH 08/52] Enable puppet prometheus reporting --- common/configuration/puppet.yaml | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/common/configuration/puppet.yaml b/common/configuration/puppet.yaml index e3f5a48fd..d087049b5 100644 --- a/common/configuration/puppet.yaml +++ b/common/configuration/puppet.yaml @@ -72,6 +72,8 @@ runcmd: - chown puppet:puppet /var/log/autosign.log - /opt/puppetlabs/bin/puppet config set autosign /opt/puppetlabs/puppet/bin/autosign-validator --section server - /opt/puppetlabs/bin/puppet config set allow_duplicate_certs true --section server + - /opt/puppetlabs/bin/puppet config set reports prometheus --section server + - install -d -m 0755 -o puppet -g puppet /var/lib/node_exporter # allow puppet to write report as prometheus metrics on first run # Generate bootstrap hieradata asymmetric encryption key - mkdir -p /etc/puppetlabs/puppet/eyaml - "(cd /etc/puppetlabs/puppet/eyaml; openssl req -x509 -nodes -newkey rsa:2048 -keyout boot_private_key.pkcs7.pem -out boot_public_key.pkcs7.pem -batch)" @@ -110,7 +112,7 @@ runcmd: %{ endif ~} - /opt/puppetlabs/bin/puppet config set certname ${node_name} - /opt/puppetlabs/bin/puppet config set waitforcert 15s - - /opt/puppetlabs/bin/puppet config set report false + - /opt/puppetlabs/bin/puppet config set report true - /opt/puppetlabs/bin/puppet config set postrun_command /opt/puppetlabs/bin/postrun - systemctl enable puppet # Remove all ifcfg configuration files that have no corresponding network interface in ip link show. 
From d8f17436af98dd4e1dd9816c55213560e71a73c5 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?= Date: Tue, 22 Apr 2025 14:16:54 -0400 Subject: [PATCH 09/52] Fix typo in changelog --- CHANGELOG.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 78b581078..ad97f2f81 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -26,7 +26,7 @@ Refer to [puppet-magic_castle changelog](https://github.com/ComputeCanada/puppet - Generalized definition of instance's specs (PR #341) - Made tf user a system user (PR #343) -- Splited sshd config so that Match directives are in their own files (PR #345) +- Split sshd config so that Match directives are in their own files (PR #345) ## [14.1.3] 2025-01-29 From 4471777fcd93ba4349f76778e35cb28f0481f196 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?= Date: Tue, 22 Apr 2025 14:35:35 -0400 Subject: [PATCH 10/52] Fix codespell check --- .github/workflows/spelling.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.github/workflows/spelling.yaml b/.github/workflows/spelling.yaml index 13f4e220d..8b45f7443 100644 --- a/.github/workflows/spelling.yaml +++ b/.github/workflows/spelling.yaml @@ -16,5 +16,5 @@ jobs: - uses: codespell-project/actions-codespell@master with: check_filenames: true - ignore_words_list: keypair + ignore_words_list: keypair, te only_warn: 1 From a4da6decde131f6dcf87fd2f765845b8c0f22f8f Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?= Date: Thu, 17 Apr 2025 11:56:01 -0400 Subject: [PATCH 11/52] Move definition of minimum instance root disk size in design module Also define a map of minimum disk size per tag. Together with the provider's minimum root disk size, it is used to compute the disk size of each instance in the design module.
--- aws/infrastructure.tf | 3 ++- azure/infrastructure.tf | 3 ++- common/design/main.tf | 5 +++++ common/design/variables.tf | 3 ++- gcp/infrastructure.tf | 3 ++- openstack/infrastructure.tf | 5 +++-- 6 files changed, 16 insertions(+), 6 deletions(-) diff --git a/aws/infrastructure.tf b/aws/infrastructure.tf index ec5955aaf..1459b77e8 100644 --- a/aws/infrastructure.tf +++ b/aws/infrastructure.tf @@ -8,6 +8,7 @@ module "design" { cluster_name = var.cluster_name domain = var.domain instances = var.instances + min_disk_size = 20 pool = var.pool volumes = var.volumes firewall_rules = var.firewall_rules @@ -141,7 +142,7 @@ resource "aws_instance" "instances" { ebs_optimized = true root_block_device { volume_type = lookup(each.value, "disk_type", "gp2") - volume_size = lookup(each.value, "disk_size", 20) + volume_size = each.value.disk_size } tags = { diff --git a/azure/infrastructure.tf b/azure/infrastructure.tf index c7a3f8326..938e79306 100644 --- a/azure/infrastructure.tf +++ b/azure/infrastructure.tf @@ -8,6 +8,7 @@ module "design" { cluster_name = var.cluster_name domain = var.domain instances = var.instances + min_disk_size = 30 pool = var.pool volumes = var.volumes firewall_rules = var.firewall_rules @@ -73,7 +74,7 @@ resource "azurerm_linux_virtual_machine" "instances" { name = format("%s-%s-disk", var.cluster_name, each.key) caching = "ReadWrite" storage_account_type = lookup(each.value, "disk_type", "Premium_LRS") - disk_size_gb = lookup(each.value, "disk_size", 30) + disk_size_gb = each.value.disk_size } dynamic "plan" { diff --git a/common/design/main.tf b/common/design/main.tf index acb71aded..3f28c2444 100644 --- a/common/design/main.tf +++ b/common/design/main.tf @@ -5,6 +5,10 @@ data "http" "agent_ip" { locals { domain_name = "${lower(var.cluster_name)}.${lower(var.domain)}" + min_disk_size_per_tags = { + "mgmt": 20 + } + instances = merge( flatten([ for prefix, attrs in var.instances : [ @@ -14,6 +18,7 @@ locals { { prefix = prefix, specs = { for attr, 
value in attrs : attr => value if ! contains(["count", "tags", "image"], attr) } + disk_size = max(var.min_disk_size, [for tag in attrs.tags: local.min_disk_size_per_tags[tag]]...) }, ) } diff --git a/common/design/variables.tf b/common/design/variables.tf index 922c999c2..1aa8bbc26 100644 --- a/common/design/variables.tf +++ b/common/design/variables.tf @@ -3,4 +3,5 @@ variable "domain" { } variable "instances" { } variable "volumes" { } variable "pool" { } -variable "firewall_rules" { } \ No newline at end of file +variable "firewall_rules" { } +variable "min_disk_size" { } \ No newline at end of file diff --git a/gcp/infrastructure.tf b/gcp/infrastructure.tf index 575871be2..79a458c06 100644 --- a/gcp/infrastructure.tf +++ b/gcp/infrastructure.tf @@ -8,6 +8,7 @@ module "design" { cluster_name = var.cluster_name domain = var.domain instances = var.instances + min_disk_size = 20 pool = var.pool volumes = var.volumes firewall_rules = var.firewall_rules @@ -95,7 +96,7 @@ resource "google_compute_instance" "instances" { initialize_params { image = lookup(each.value, "image", var.image) type = lookup(each.value, "disk_type", "pd-ssd") - size = lookup(each.value, "disk_size", 20) + size = each.value.disk_size } } diff --git a/openstack/infrastructure.tf b/openstack/infrastructure.tf index 7644ff982..1f0d3d1ad 100644 --- a/openstack/infrastructure.tf +++ b/openstack/infrastructure.tf @@ -3,6 +3,7 @@ module "design" { cluster_name = var.cluster_name domain = var.domain instances = var.instances + min_disk_size = 10 pool = var.pool volumes = var.volumes firewall_rules = var.firewall_rules @@ -58,7 +59,7 @@ data "openstack_compute_flavor_v2" "flavors" { resource "openstack_compute_instance_v2" "instances" { for_each = module.design.instances_to_build name = format("%s-%s", var.cluster_name, each.key) - image_id = lookup(each.value, "disk_size", 10) > data.openstack_compute_flavor_v2.flavors[each.value.prefix].disk ? 
null : data.openstack_images_image_v2.image[each.value.prefix].id + image_id = each.value.disk_size > data.openstack_compute_flavor_v2.flavors[each.value.prefix].disk ? null : data.openstack_images_image_v2.image[each.value.prefix].id flavor_name = each.value.type user_data = base64gzip(module.configuration.user_data[each.key]) @@ -76,7 +77,7 @@ resource "openstack_compute_instance_v2" "instances" { } dynamic "block_device" { - for_each = lookup(each.value, "disk_size", 10) > data.openstack_compute_flavor_v2.flavors[each.value.prefix].disk ? [{ volume_size = lookup(each.value, "disk_size", 10) }] : [] + for_each = each.value.disk_size > data.openstack_compute_flavor_v2.flavors[each.value.prefix].disk ? [{ volume_size = each.value.disk_size }] : [] content { uuid = data.openstack_images_image_v2.image[each.value.prefix].id source_type = "image" From be04440e0b12930752a09db8558053336082c5cf Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?= Date: Thu, 17 Apr 2025 12:36:15 -0400 Subject: [PATCH 12/52] Add warnings on low disk size per tags and cloud provider --- common/design/main.tf | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/common/design/main.tf b/common/design/main.tf index 3f28c2444..81941fda9 100644 --- a/common/design/main.tf +++ b/common/design/main.tf @@ -14,11 +14,11 @@ locals { for prefix, attrs in var.instances : [ for i in range(lookup(attrs, "count", 1)) : { (format("%s%d", prefix, i + 1)) = merge( + { disk_size = max(var.min_disk_size, [for tag in attrs.tags: lookup(local.min_disk_size_per_tags, tag, 0)]...)}, { for attr, value in attrs : attr => value if ! contains(["count"], attr) }, { prefix = prefix, specs = { for attr, value in attrs : attr => value if ! contains(["count", "tags", "image"], attr) } - disk_size = max(var.min_disk_size, [for tag in attrs.tags: local.min_disk_size_per_tags[tag]]...) 
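A minimal sketch of the default `disk_size` rule introduced in PATCH 11/52, written as a standalone configuration. The variable default of 10 and the `tags` local are illustrative values, not part of the patch, and `lookup` with a fallback of 0 is used here so that tags without an entry in the map do not raise an error.

```hcl
# Provider-wide minimum root disk size in GiB (illustrative default).
variable "min_disk_size" {
  type    = number
  default = 10
}

locals {
  # Minimum root disk size in GiB recommended for specific tags.
  min_disk_size_per_tags = {
    mgmt = 20
  }

  # Hypothetical tag list for a single instance.
  tags = ["mgmt", "puppet"]

  # Largest of the provider minimum and every per-tag minimum;
  # tags absent from the map contribute 0.
  disk_size = max(
    var.min_disk_size,
    [for tag in local.tags : lookup(local.min_disk_size_per_tags, tag, 0)]...
  )
}

output "disk_size" {
  value = local.disk_size
}
```

With these values the `mgmt` minimum of 20 GiB wins over the provider minimum of 10 GiB. An instance carrying no tag listed in the map would fall back to `var.min_disk_size`.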
}, ) } @@ -78,3 +78,14 @@ locals { 0), "") } + +check "disk_space_per_tag" { + assert { + condition = alltrue(flatten([for inst in local.instances: [for tag in inst.tags: lookup(local.min_disk_size_per_tags, tag, var.min_disk_size) <= inst.disk_size ]])) + error_message = "At least one instance's disk_size is smaller than what is recommended given its set of tags.\nMinimum disk size per tags: ${jsonencode(local.min_disk_size_per_tags)}" + } + assert { + condition = alltrue([for inst in local.instances: var.min_disk_size <= inst.disk_size ]) + error_message = "At least one instance's disk_size is smaller than what is recommended by the cloud provider.\nMinimum disk size for provider: ${var.min_disk_size}" + } +} From 3dbdb1cbd8c48d8b5ebadcd7d5f22c489d7b8aca Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?= Date: Thu, 17 Apr 2025 15:07:39 -0400 Subject: [PATCH 13/52] Bump terraform version to 1.5.7 --- .github/workflows/release.yaml | 2 +- .github/workflows/test.yaml | 2 +- README.md | 2 +- aws/versions.tf | 2 +- azure/versions.tf | 2 +- common/configuration/versions.tf | 2 +- dns/cloudflare/versions.tf | 2 +- dns/gcloud/versions.tf | 2 +- dns/record_generator/versions.tf | 2 +- docs/README.md | 2 +- docs/developers.md | 2 +- examples/advanced/basic_puppet/openstack/main.tf | 2 +- examples/advanced/elk/openstack/main.tf | 2 +- examples/advanced/k8s/openstack/main.tf | 2 +- examples/advanced/lustre/openstack/main.tf | 2 +- examples/advanced/spark/openstack/main.tf | 2 +- examples/advanced/spot_instances/aws/main.tf | 2 +- examples/advanced/spot_instances/azure/main.tf | 2 +- examples/advanced/spot_instances/gcp/main.tf | 2 +- examples/aws/main.tf | 2 +- examples/azure/main.tf | 2 +- examples/gcp/main.tf | 2 +- examples/openstack/main.tf | 2 +- examples/ovh/main.tf | 2 +- gcp/versions.tf | 2 +- openstack/versions.tf | 2 +- ovh/versions.tf | 2 +- 27 files changed, 27 insertions(+), 27 deletions(-) diff --git a/.github/workflows/release.yaml 
b/.github/workflows/release.yaml index c502800bb..7aa1306f5 100644 --- a/.github/workflows/release.yaml +++ b/.github/workflows/release.yaml @@ -19,7 +19,7 @@ jobs: - uses: hashicorp/setup-terraform@v1 with: - terraform_version: 1.4.0 + terraform_version: 1.5.7 - name: Create tarballs and zips if: startsWith(github.ref, 'refs/tags/') diff --git a/.github/workflows/test.yaml b/.github/workflows/test.yaml index 61d7af8cc..70896edb3 100644 --- a/.github/workflows/test.yaml +++ b/.github/workflows/test.yaml @@ -11,7 +11,7 @@ on: jobs: test: env: - TF_VERSION: 1.4.0 + TF_VERSION: 1.5.7 runs-on: ubuntu-latest diff --git a/README.md b/README.md index 32e0e6d3e..2e5c39642 100644 --- a/README.md +++ b/README.md @@ -12,7 +12,7 @@ From these new possibilities emerged an open-source software project named Magic ## Setup -- Install [Terraform](https://releases.hashicorp.com/terraform/) (>= 1.4.0) +- Install [Terraform](https://releases.hashicorp.com/terraform/) (>= 1.5.7) - Download the [latest release of Magic Castle](https://github.com/ComputeCanada/magic_castle/releases) for the cloud provider you wish to use. 
- Uncompress the release - Follow the instructions diff --git a/aws/versions.tf b/aws/versions.tf index 4b8cfb472..f621f113d 100644 --- a/aws/versions.tf +++ b/aws/versions.tf @@ -1,6 +1,6 @@ terraform { - required_version = ">= 1.4.0" + required_version = ">= 1.5.7" required_providers { aws = { source = "hashicorp/aws" diff --git a/azure/versions.tf b/azure/versions.tf index cb2d9bae3..ce57ed791 100644 --- a/azure/versions.tf +++ b/azure/versions.tf @@ -1,4 +1,4 @@ terraform { - required_version = ">= 1.4.0" + required_version = ">= 1.5.7" } diff --git a/common/configuration/versions.tf b/common/configuration/versions.tf index adb9abffc..9d01be053 100644 --- a/common/configuration/versions.tf +++ b/common/configuration/versions.tf @@ -1,6 +1,6 @@ terraform { - required_version = ">= 1.4.0" + required_version = ">= 1.5.7" required_providers { random = { source = "hashicorp/random" diff --git a/dns/cloudflare/versions.tf b/dns/cloudflare/versions.tf index 8cf00d458..01dc1a90d 100644 --- a/dns/cloudflare/versions.tf +++ b/dns/cloudflare/versions.tf @@ -1,6 +1,6 @@ terraform { - required_version = ">= 1.4.0" + required_version = ">= 1.5.7" required_providers { cloudflare = { source = "cloudflare/cloudflare" diff --git a/dns/gcloud/versions.tf b/dns/gcloud/versions.tf index 582096c43..ec19d1607 100644 --- a/dns/gcloud/versions.tf +++ b/dns/gcloud/versions.tf @@ -1,5 +1,5 @@ terraform { - required_version = ">= 1.4.0" + required_version = ">= 1.5.7" required_providers { google = { source = "hashicorp/google" diff --git a/dns/record_generator/versions.tf b/dns/record_generator/versions.tf index 45c61689f..4d9570d0a 100644 --- a/dns/record_generator/versions.tf +++ b/dns/record_generator/versions.tf @@ -1,5 +1,5 @@ terraform { - required_version = ">= 1.4.0" + required_version = ">= 1.5.7" required_providers { external = { source = "hashicorp/external" diff --git a/docs/README.md b/docs/README.md index ed2cf7d06..9259371d3 100644 --- a/docs/README.md +++ b/docs/README.md 
@@ -6,7 +6,7 @@ To use Magic Castle you will need: -1. Terraform (>= 1.4.0) +1. Terraform (>= 1.5.7) 2. Authenticated access to a cloud 3. Ability to communicate with the cloud provider API from your computer 4. A project with operational limits meeting the requirements described in _Quotas_ subsection. diff --git a/docs/developers.md b/docs/developers.md index a0274d478..15db2cb0e 100644 --- a/docs/developers.md +++ b/docs/developers.md @@ -12,7 +12,7 @@ ## 1. Setup To develop for Magic Castle you will need: -* Terraform (>= 1.4.0) +* Terraform (>= 1.5.7) * git * Access to a Cloud (e.g.: Compute Canada Arbutus) * Ability to communicate with the cloud provider API from your computer diff --git a/examples/advanced/basic_puppet/openstack/main.tf b/examples/advanced/basic_puppet/openstack/main.tf index ee7bb1359..9700f4785 100644 --- a/examples/advanced/basic_puppet/openstack/main.tf +++ b/examples/advanced/basic_puppet/openstack/main.tf @@ -1,5 +1,5 @@ terraform { - required_version = ">= 1.4.0" + required_version = ">= 1.5.7" } module "openstack" { diff --git a/examples/advanced/elk/openstack/main.tf b/examples/advanced/elk/openstack/main.tf index b1a4d66ca..3f3bfc9a5 100644 --- a/examples/advanced/elk/openstack/main.tf +++ b/examples/advanced/elk/openstack/main.tf @@ -1,5 +1,5 @@ terraform { - required_version = ">= 1.4.0" + required_version = ">= 1.5.7" } module "openstack" { diff --git a/examples/advanced/k8s/openstack/main.tf b/examples/advanced/k8s/openstack/main.tf index 46b7ca223..4359aaf7d 100644 --- a/examples/advanced/k8s/openstack/main.tf +++ b/examples/advanced/k8s/openstack/main.tf @@ -1,5 +1,5 @@ terraform { - required_version = ">= 1.4.0" + required_version = ">= 1.5.7" } module "openstack" { diff --git a/examples/advanced/lustre/openstack/main.tf b/examples/advanced/lustre/openstack/main.tf index eca16c04b..b22be985b 100644 --- a/examples/advanced/lustre/openstack/main.tf +++ b/examples/advanced/lustre/openstack/main.tf @@ -1,5 +1,5 @@ terraform { - 
required_version = ">= 1.4.0" + required_version = ">= 1.5.7" } module "openstack" { diff --git a/examples/advanced/spark/openstack/main.tf b/examples/advanced/spark/openstack/main.tf index 532a29a92..edc71e57e 100644 --- a/examples/advanced/spark/openstack/main.tf +++ b/examples/advanced/spark/openstack/main.tf @@ -1,5 +1,5 @@ terraform { - required_version = ">= 1.4.0" + required_version = ">= 1.5.7" } module "openstack" { diff --git a/examples/advanced/spot_instances/aws/main.tf b/examples/advanced/spot_instances/aws/main.tf index bd91cc078..6432f499e 100644 --- a/examples/advanced/spot_instances/aws/main.tf +++ b/examples/advanced/spot_instances/aws/main.tf @@ -1,5 +1,5 @@ terraform { - required_version = ">= 1.4.0" + required_version = ">= 1.5.7" } module "aws" { diff --git a/examples/advanced/spot_instances/azure/main.tf b/examples/advanced/spot_instances/azure/main.tf index 3bc5bbdcf..a3f7f2051 100644 --- a/examples/advanced/spot_instances/azure/main.tf +++ b/examples/advanced/spot_instances/azure/main.tf @@ -1,5 +1,5 @@ terraform { - required_version = ">= 1.4.0" + required_version = ">= 1.5.7" } module "azure" { diff --git a/examples/advanced/spot_instances/gcp/main.tf b/examples/advanced/spot_instances/gcp/main.tf index 7d870c7e3..2a3c98866 100644 --- a/examples/advanced/spot_instances/gcp/main.tf +++ b/examples/advanced/spot_instances/gcp/main.tf @@ -1,5 +1,5 @@ terraform { - required_version = ">= 1.4.0" + required_version = ">= 1.5.7" } module "gcp" { diff --git a/examples/aws/main.tf b/examples/aws/main.tf index 12970d0a7..eb9edea30 100644 --- a/examples/aws/main.tf +++ b/examples/aws/main.tf @@ -1,5 +1,5 @@ terraform { - required_version = ">= 1.4.0" + required_version = ">= 1.5.7" } variable "pool" { diff --git a/examples/azure/main.tf b/examples/azure/main.tf index 1270e23fd..f5bec05c1 100644 --- a/examples/azure/main.tf +++ b/examples/azure/main.tf @@ -1,5 +1,5 @@ terraform { - required_version = ">= 1.4.0" + required_version = ">= 1.5.7" } 
variable "pool" { diff --git a/examples/gcp/main.tf b/examples/gcp/main.tf index 2b2ca4ac8..a2a70c3c1 100644 --- a/examples/gcp/main.tf +++ b/examples/gcp/main.tf @@ -1,5 +1,5 @@ terraform { - required_version = ">= 1.4.0" + required_version = ">= 1.5.7" } variable "pool" { diff --git a/examples/openstack/main.tf b/examples/openstack/main.tf index 3312548d4..cddbf2e60 100644 --- a/examples/openstack/main.tf +++ b/examples/openstack/main.tf @@ -1,5 +1,5 @@ terraform { - required_version = ">= 1.4.0" + required_version = ">= 1.5.7" } variable "pool" { diff --git a/examples/ovh/main.tf b/examples/ovh/main.tf index 7c39f881f..3242e754b 100644 --- a/examples/ovh/main.tf +++ b/examples/ovh/main.tf @@ -1,5 +1,5 @@ terraform { - required_version = ">= 1.4.0" + required_version = ">= 1.5.7" } variable "pool" { diff --git a/gcp/versions.tf b/gcp/versions.tf index cb2d9bae3..ce57ed791 100644 --- a/gcp/versions.tf +++ b/gcp/versions.tf @@ -1,4 +1,4 @@ terraform { - required_version = ">= 1.4.0" + required_version = ">= 1.5.7" } diff --git a/openstack/versions.tf b/openstack/versions.tf index 64dd2480f..1270ee941 100644 --- a/openstack/versions.tf +++ b/openstack/versions.tf @@ -1,6 +1,6 @@ terraform { - required_version = ">= 1.4.0" + required_version = ">= 1.5.7" required_providers { openstack = { source = "terraform-provider-openstack/openstack" diff --git a/ovh/versions.tf b/ovh/versions.tf index 10c008a81..8d769ed8e 100644 --- a/ovh/versions.tf +++ b/ovh/versions.tf @@ -1,6 +1,6 @@ terraform { - required_version = ">= 1.4.0" + required_version = ">= 1.5.7" required_providers { openstack = { source = "terraform-provider-openstack/openstack" From c45ded6fb4a09581fe643bd59f0bba27d8763e81 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?= Date: Thu, 17 Apr 2025 15:35:13 -0400 Subject: [PATCH 14/52] Update docs --- docs/README.md | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/docs/README.md b/docs/README.md index 
9259371d3..632ee1d2d 100644 --- a/docs/README.md +++ b/docs/README.md @@ -532,14 +532,20 @@ be leveraged to accelerate compute node configuration. | Provider | `disk_type` | `disk_size` (GiB) | | -------- | :---------- | ----------------: | | Azure |`Premium_LRS`| 30 | - | AWS | `gp2` | 10 | + | AWS | `gp2` | 20 | | GCP | `pd-ssd` | 20 | | OpenStack| `null` | 10 | | OVH | `null` | 10 | 4. `disk_size`: size in gibibytes (GiB) of the instance's root disk containing -the operating system and service software -(default: see the previous table). +the operating system and service software. The default value is computed as the +maximum of the cloud provider default size (see previous table) and the +recommended minimum size per tag as specified in the following table. + + | Tag | min `disk_size` (GiB) | + | -------- | --------------------: | + | `mgmt` | 20 | + 5. `mig`: map of [NVIDIA Multi-Instance GPU (MIG)](https://docs.nvidia.com/datacenter/tesla/mig-user-guide/index.html) short profile names and count used to partition the instances' GPU, example for an A100: ``` mig = { "1g.5gb" = 2, "2g.10gb" = 1, "3g.20gb" = 1 } From 54a89f4e2a34ca920bf1ae83a5666d7bb0942131 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?= Date: Fri, 25 Apr 2025 10:36:39 -0400 Subject: [PATCH 15/52] Make mkdocs display warning for invalid anchor links --- docs/README.md | 2 +- mkdocs.yml | 2 ++ 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/docs/README.md b/docs/README.md index 632ee1d2d..16f06460d 100644 --- a/docs/README.md +++ b/docs/README.md @@ -1413,7 +1413,7 @@ By default, instances tagged `login` have their port 22 opened to entire world. If you know the range of ip addresses that will connect to your cluster, we strongly recommend that you limit the access to port 22 to this range. 
-To limit the access to port 22, refer to [section 4.14 firewall_rules](#414-firewall_rules-optional), +To limit the access to port 22, refer to [section 4.16 firewall_rules](#416-firewall_rules-optional), and replace the `cidr` of the `ssh` rule to match the range of ip addresses that have be the allowed to connect to the cluster. If there are more than one range, create multiple rules with distinct names. diff --git a/mkdocs.yml b/mkdocs.yml index fbc14f626..31549ab85 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -1,4 +1,6 @@ site_name: Magic Castle +validation: + anchors: warn theme: name: material logo: img/logo.png From 6604eb5ddc716ff62de36b26274c7df606105af8 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?= Date: Fri, 25 Apr 2025 11:12:35 -0400 Subject: [PATCH 16/52] Fix anchor link in docs --- docs/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/README.md b/docs/README.md index 16f06460d..0f977cee6 100644 --- a/docs/README.md +++ b/docs/README.md @@ -498,7 +498,7 @@ instance, while in Puppet code tags are used to identify roles of the instances. Terraform tags: - `login`: identify instances accessible with SSH from Internet and pointed by the domain name A records -- `pool`: identify instances created only when their hostname appears in the [`var.pool`](#417-pool-optional) list. +- `pool`: identify instances created only when their hostname appears in the [`var.pool`](#419-pool-optional) list. 
- `proxy`: identify instances accessible with HTTP/HTTPS and pointed by the vhost A records - `public`: identify instances that need to have a public ip address reachable from Internet - `puppet`: identify instances configured as Puppet servers From 5ab29def8423ffcae436502073e623abbb7a4dae Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?= Date: Fri, 25 Apr 2025 11:18:45 -0400 Subject: [PATCH 17/52] Fix developer doc anchor link --- docs/developers.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/developers.md b/docs/developers.md index 15db2cb0e..6942598b3 100644 --- a/docs/developers.md +++ b/docs/developers.md @@ -16,7 +16,7 @@ To develop for Magic Castle you will need: * git * Access to a Cloud (e.g.: Compute Canada Arbutus) * Ability to communicate with the cloud provider API from your computer -* A cloud project with enough room for the resource described in section [Magic Caslte Doc 1.1](README.md#11-quotas). +* A cloud project with enough room for the resources described in [section 1.4](README.md#14-quotas). * [optional] [Puppet Development Kit (PDK)](https://www.puppet.com/docs/pdk/latest/pdk.html) ## 2. 
Where to start From 363c61cd7f667d61df5e9e0d0283921e2137a71a Mon Sep 17 00:00:00 2001 From: Samuel Richard <81189385+Scirelgar@users.noreply.github.com> Date: Fri, 28 Mar 2025 14:03:30 -0400 Subject: [PATCH 18/52] Add Trivy misconfiguration scan github workflow --- .github/workflows/trivy_scan.yaml | 51 +++++++++++++++++++++++++++++++ 1 file changed, 51 insertions(+) create mode 100644 .github/workflows/trivy_scan.yaml diff --git a/.github/workflows/trivy_scan.yaml b/.github/workflows/trivy_scan.yaml new file mode 100644 index 000000000..0d56de7f1 --- /dev/null +++ b/.github/workflows/trivy_scan.yaml @@ -0,0 +1,51 @@ +name: Trivy Vulnerabilities Scan + +on: + pull_request: + +jobs: + trivy-vuln-scan: + name: Running Trivy Scan + runs-on: ubuntu-latest + steps: + - name: Checkout code + uses: actions/checkout@v4 + + - name: Resolve symbolic links + run: | + rm {aws,azure,gcp,openstack}/{outputs.tf,variables.tf} + for cloud in aws azure gcp openstack; do + cp common/outputs.tf common/variables.tf $cloud/; + done + + - name: Manual Trivy Setup + uses: aquasecurity/setup-trivy@v0.2.2 + with: + version: v0.61.1 + cache: true + + - name: Run Trivy on providers + shell: bash + run: trivy config --misconfig-scanners terraform --tf-exclude-downloaded-modules --skip-dirs examples --format json -o trivy-results.json . + + - name: Convert Trivy JSON output to SARIF + run: trivy convert --format sarif --output trivy-results.sarif trivy-results.json + + - name: Upload Trivy scan results to GitHub Security tab + uses: github/codeql-action/upload-sarif@v3 + with: + sarif_file: "trivy-results.sarif" + + - name: Publish Trivy Output to Summary + run: | + if [[ -s trivy-results.json ]]; then + { + echo "### Security Output" + echo "
<details><summary>Click to expand</summary>" + echo "" + echo '```terraform' + cat trivy-results.json + echo '```' + echo "</details>
" + } >> $GITHUB_STEP_SUMMARY + fi From 6d5ba7e381a4f6ba60a6c7a69b26fab3e57154bc Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?= Date: Wed, 23 Apr 2025 14:30:14 -0400 Subject: [PATCH 19/52] Add examples to scanning (excluding advanced) --- .github/workflows/trivy_scan.yaml | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/.github/workflows/trivy_scan.yaml b/.github/workflows/trivy_scan.yaml index 0d56de7f1..8fa9ca3a2 100644 --- a/.github/workflows/trivy_scan.yaml +++ b/.github/workflows/trivy_scan.yaml @@ -11,12 +11,15 @@ jobs: - name: Checkout code uses: actions/checkout@v4 - - name: Resolve symbolic links + - name: Resolve symbolic links and fix source run: | rm {aws,azure,gcp,openstack}/{outputs.tf,variables.tf} for cloud in aws azure gcp openstack; do cp common/outputs.tf common/variables.tf $cloud/; - done + done + for example in examples/*/*.tf; do + sed 's;git::https://github.com/ComputeCanada/magic_castle.git//;../../;g' $example + done - name: Manual Trivy Setup uses: aquasecurity/setup-trivy@v0.2.2 @@ -26,7 +29,7 @@ jobs: - name: Run Trivy on providers shell: bash - run: trivy config --misconfig-scanners terraform --tf-exclude-downloaded-modules --skip-dirs examples --format json -o trivy-results.json . + run: trivy config --misconfig-scanners terraform --tf-exclude-downloaded-modules --skip-dirs examples/advanced --format json -o trivy-results.json . 
- name: Convert Trivy JSON output to SARIF run: trivy convert --format sarif --output trivy-results.sarif trivy-results.json From e7338fa679e3dc0b7f369f7d12869f6a47e39e6e Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?= Date: Wed, 23 Apr 2025 14:31:11 -0400 Subject: [PATCH 20/52] Change trivy workflow name --- .github/workflows/trivy_scan.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/.github/workflows/trivy_scan.yaml b/.github/workflows/trivy_scan.yaml index 8fa9ca3a2..56d56d550 100644 --- a/.github/workflows/trivy_scan.yaml +++ b/.github/workflows/trivy_scan.yaml @@ -1,4 +1,4 @@ -name: Trivy Vulnerabilities Scan +name: Trivy Misconfiguration Scan on: pull_request: @@ -18,7 +18,7 @@ jobs: cp common/outputs.tf common/variables.tf $cloud/; done for example in examples/*/*.tf; do - sed 's;git::https://github.com/ComputeCanada/magic_castle.git//;../../;g' $example + sed -i 's;git::https://github.com/ComputeCanada/magic_castle.git//;../../;g' $example done - name: Manual Trivy Setup From 31b42be8b56456dfefe1f8b153bf57d146a14881 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?= Date: Wed, 23 Apr 2025 15:07:06 -0400 Subject: [PATCH 21/52] Add .trivyignore file --- .trivyignore | 6 ++++++ 1 file changed, 6 insertions(+) create mode 100644 .trivyignore diff --git a/.trivyignore b/.trivyignore new file mode 100644 index 000000000..39e04c9c1 --- /dev/null +++ b/.trivyignore @@ -0,0 +1,6 @@ +# Some instance should have public ip addresses +AVD-GCP-0031 + +# Magic Castle does not handle VPC flow logs +AVD-GCP-0029 +AVD-AWS-0178 From f2098c084c9663ea2706ccdf55f93ac160ed6b39 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?= Date: Wed, 23 Apr 2025 15:35:28 -0400 Subject: [PATCH 22/52] Filter sarif duplicated results --- .github/workflows/trivy_scan.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/.github/workflows/trivy_scan.yaml 
b/.github/workflows/trivy_scan.yaml index 56d56d550..3facb5fea 100644 --- a/.github/workflows/trivy_scan.yaml +++ b/.github/workflows/trivy_scan.yaml @@ -31,8 +31,8 @@ jobs: shell: bash run: trivy config --misconfig-scanners terraform --tf-exclude-downloaded-modules --skip-dirs examples/advanced --format json -o trivy-results.json . - - name: Convert Trivy JSON output to SARIF - run: trivy convert --format sarif --output trivy-results.sarif trivy-results.json + - name: Convert Trivy JSON output to SARIF and filter duplicated results + run: trivy convert --format sarif trivy-results.json | jq 'reduce .runs[0].results[] as $a ([]; if IN(.[]; $a) then . else . += [$a] end)' > trivy-results.sarif - name: Upload Trivy scan results to GitHub Security tab uses: github/codeql-action/upload-sarif@v3 From 0204d5ad38bbda0395121a7e334f4d3db3f69f6f Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?= Date: Wed, 23 Apr 2025 16:04:06 -0400 Subject: [PATCH 23/52] Improve sarif filtering --- .github/workflows/trivy_scan.yaml | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/.github/workflows/trivy_scan.yaml b/.github/workflows/trivy_scan.yaml index 3facb5fea..5f916e869 100644 --- a/.github/workflows/trivy_scan.yaml +++ b/.github/workflows/trivy_scan.yaml @@ -32,7 +32,14 @@ jobs: run: trivy config --misconfig-scanners terraform --tf-exclude-downloaded-modules --skip-dirs examples/advanced --format json -o trivy-results.json . - name: Convert Trivy JSON output to SARIF and filter duplicated results - run: trivy convert --format sarif trivy-results.json | jq 'reduce .runs[0].results[] as $a ([]; if IN(.[]; $a) then . else . += [$a] end)' > trivy-results.sarif + run: | + trivy convert --format sarif trivy-results.json --output trivy-results.sarif + # When converting from JSON to SARIF, some information, like origin of the misconfiguration, is lost. + # The lost information results in duplicated issues. 
We filter these issues with jq and create a new + # sarif file that will be uploaded to the security tab. + jq 'reduce .runs[0].results[] as $a ([]; if IN(.[]; $a) then . else . += [$a] end)' trivy-results.sarif > trivy-results-filtered.sarif + jq ".runs[0].results |= $(cat trivy-results-filtered.sarif)" trivy-results.sarif > trivy-results-final.sarif + mv trivy-results-final.sarif trivy-results.sarif - name: Upload Trivy scan results to GitHub Security tab uses: github/codeql-action/upload-sarif@v3 From c2abdc226ef88356a252c11d7ea87e0bfd998abb Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?= Date: Thu, 24 Apr 2025 15:04:16 -0400 Subject: [PATCH 24/52] Change output format --- .github/workflows/trivy_scan.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.github/workflows/trivy_scan.yaml b/.github/workflows/trivy_scan.yaml index 5f916e869..49fbd2286 100644 --- a/.github/workflows/trivy_scan.yaml +++ b/.github/workflows/trivy_scan.yaml @@ -54,7 +54,7 @@ jobs: echo "
<details><summary>Click to expand</summary>" echo "" echo '```terraform' - cat trivy-results.json + trivy convert --format table trivy-results.json echo '```' echo "</details>
" } >> $GITHUB_STEP_SUMMARY From 7c6874f3f581808d77557c6b8ae99d9c4804215c Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?= Date: Wed, 23 Apr 2025 16:48:56 -0400 Subject: [PATCH 25/52] Change highlight language in security summary --- .github/workflows/trivy_scan.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.github/workflows/trivy_scan.yaml b/.github/workflows/trivy_scan.yaml index 49fbd2286..949e5a1d6 100644 --- a/.github/workflows/trivy_scan.yaml +++ b/.github/workflows/trivy_scan.yaml @@ -53,7 +53,7 @@ jobs: echo "### Security Output" echo "
<details><summary>Click to expand</summary>" echo "" - echo '```terraform' + echo '```bash' trivy convert --format table trivy-results.json echo '```' echo "</details>
" From ad444064909eb46fb7d1315159772a1e7252786f Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?= Date: Wed, 23 Apr 2025 16:55:32 -0400 Subject: [PATCH 26/52] Add on push any branch to trivy scan --- .github/workflows/trivy_scan.yaml | 3 +++ 1 file changed, 3 insertions(+) diff --git a/.github/workflows/trivy_scan.yaml b/.github/workflows/trivy_scan.yaml index 949e5a1d6..3481dd7ef 100644 --- a/.github/workflows/trivy_scan.yaml +++ b/.github/workflows/trivy_scan.yaml @@ -1,7 +1,10 @@ name: Trivy Misconfiguration Scan on: + push: pull_request: + branches: + - main jobs: trivy-vuln-scan: From a39becb7911c80b3ed1b0156bc7f247116bc3f2c Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?= Date: Thu, 24 Apr 2025 09:55:17 -0400 Subject: [PATCH 27/52] Change output format for trivy in action --- .github/workflows/trivy_scan.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/.github/workflows/trivy_scan.yaml b/.github/workflows/trivy_scan.yaml index 3481dd7ef..969b0f213 100644 --- a/.github/workflows/trivy_scan.yaml +++ b/.github/workflows/trivy_scan.yaml @@ -53,10 +53,10 @@ jobs: run: | if [[ -s trivy-results.json ]]; then { - echo "### Security Output" + echo "### Trivy Misconfiguration Scan Output" echo "
<details><summary>Click to expand</summary>" echo "" - echo '```bash' + echo '```console' trivy convert --format table trivy-results.json echo '```' echo "</details>
" From bff00179b4fa2890ff3cd1c27bed6c3dd6101ba5 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?= Date: Thu, 24 Apr 2025 10:09:29 -0400 Subject: [PATCH 28/52] Add trivy command --- .github/workflows/trivy_scan.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/.github/workflows/trivy_scan.yaml b/.github/workflows/trivy_scan.yaml index 969b0f213..9d482bbdd 100644 --- a/.github/workflows/trivy_scan.yaml +++ b/.github/workflows/trivy_scan.yaml @@ -57,6 +57,7 @@ jobs: echo "
<details><summary>Click to expand</summary>" echo "" echo '```console' + echo '$ trivy config --misconfig-scanners terraform --tf-exclude-downloaded-modules --skip-dirs examples/advanced .' trivy convert --format table trivy-results.json echo '```' echo "</details>
" From de698ae70a7e1dc592c1ae8c591ecdd9092ea13a Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?= Date: Thu, 24 Apr 2025 16:31:08 -0400 Subject: [PATCH 29/52] Add documentation on trivy --- docs/README.md | 45 ++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 44 insertions(+), 1 deletion(-) diff --git a/docs/README.md b/docs/README.md index 0f977cee6..4fe8ebdb8 100644 --- a/docs/README.md +++ b/docs/README.md @@ -1188,7 +1188,7 @@ described by the `main.tf` configuration file. Terraform should now be able to communicate with your cloud provider. To test your configuration file, enter the following command ``` -terraform plan +terraform plan -out tfplan ``` This command will validate the syntax of your configuration file and @@ -1197,6 +1197,49 @@ is only a dry-run. If Terraform does not report any error, you can move to the next step. Otherwise, read the errors and fix your configuration file accordingly. +### 7.1 Scanning plan for misconfiguration (optional) + +[Trivy](https://trivy.dev/latest/) is an open source security scanner +that scans Terraform files (code and plan) and reports about potential issues. +Magic Castle development team has integrated Trivy in its +[CI/CD pipeline](https://github.com/ComputeCanada/magic_castle/blob/main/.github/workflows/trivy_scan.yaml) +to prevent misconfiguration and security issues that could be introduced +by commits or a pull-requests. You too can use Trivy to verify your Terraform plan +before applying it. + +After [installing Trivy](https://trivy.dev/latest/getting-started/), you can +scan the Terraform plan produced in section 7, like this: +``` +trivy conf tfplan +``` + +Trivy then produces a report about configuration issues like this: +```console +AVD-OPNSTK-0003 (MEDIUM): Security group rule allows ingress to multiple public addresses. +═════════════════════════════════════════════════════════════════════════ +Opening up ports to the public internet is generally to be avoided. 
You should +restrict access to IP addresses or ranges that explicitly require it where possible. + +See https://avd.aquasec.com/misconfig/avd-opnstk-0003 +───────────────────────────────────────────────────────────────────────── + ./openstack/openstack/network-2.tf:53 + via ./openstack/openstack/network-2.tf:45-56 (openstack_networking_secgroup_rule_v2.rule["ssh"]) + via main.tf:10-44 (module.openstack) +───────────────────────────────────────────────────────────────────────── + 45 resource openstack_networking_secgroup_rule_v2 "rule" { + .. + 53 [ remote_ip_prefix = each.value.cidr + .. + 56 } +───────────────────────────────────────────────────────────────────────── +``` + +The most common configuration issues identified by Trivy in Magic Castle plans +(illustrated in the previous output example) are firewall rules allowing access to ports from +the public internet. If you know which IP addresses should have access to the cluster, +you can harden the firewall rules. Refer to section [4.16 firewall_rules](#416-firewall_rules-optional) +for more information. + ## 8. 
Deployment To create the resources defined by your main, enter the following command From 32a17b3d44160cf6c3d45b31d49b673befa71c9b Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?= Date: Fri, 25 Apr 2025 12:09:23 -0400 Subject: [PATCH 30/52] Replace steps by strategy matrix in terraform CI/CD --- .github/actions/test_provider/action.yaml | 11 ++-- .github/workflows/test.yaml | 80 ++++++----------------- 2 files changed, 24 insertions(+), 67 deletions(-) diff --git a/.github/actions/test_provider/action.yaml b/.github/actions/test_provider/action.yaml index 2eb1f0c90..4c0970e28 100644 --- a/.github/actions/test_provider/action.yaml +++ b/.github/actions/test_provider/action.yaml @@ -4,15 +4,14 @@ inputs: provider: description: 'name of the provider' required: true - path: - description: 'path to terraform' + runs: using: "composite" steps: - - run: ${{ inputs.path }}/terraform -chdir=${{ inputs.provider }} init + - run: terraform -chdir=${{ inputs.provider }} init shell: bash id: init - - run: ${{ inputs.path }}/terraform -chdir=${{ inputs.provider }} validate + - run: terraform -chdir=${{ inputs.provider }} validate shell: bash id: validate - run: find examples -name ${{ inputs.provider }} -type d -not -path '*/\.*' @@ -21,9 +20,9 @@ runs: - run: sed -E -i 's;(source)\s*=.*${{ inputs.provider }}.*;\1 = "../../${{ inputs.provider }}";g' examples/${{ inputs.provider }}/main.tf; shell: bash id: sed-example - - run: ${{ inputs.path }}/terraform -chdir=examples/${{ inputs.provider }} init + - run: terraform -chdir=examples/${{ inputs.provider }} init shell: bash id: init-example - - run: ${{ inputs.path }}/terraform -chdir=examples/${{ inputs.provider }} validate + - run: terraform -chdir=examples/${{ inputs.provider }} validate shell: bash id: validate-example diff --git a/.github/workflows/test.yaml b/.github/workflows/test.yaml index 70896edb3..e7624e461 100644 --- a/.github/workflows/test.yaml +++ b/.github/workflows/test.yaml @@ -9,72 +9,30 @@ 
on: - main jobs: - test: - env: - TF_VERSION: 1.5.7 - + test_cloud_provider: runs-on: ubuntu-latest - + strategy: + matrix: + provider: ['aws', 'azure', 'gcp', 'openstack', 'ovh'] steps: - name: Checkout code uses: actions/checkout@main - - - name: Cache Terraform - id: cache-terraform - uses: actions/cache@v4 + - uses: hashicorp/setup-terraform@v3 with: - path: ~/bin - key: terraform-${{ env.TF_VERSION }} - - - name: Download terraform - if: steps.cache-terraform.outputs.cache-hit != 'true' - run: | - mkdir -p "${HOME}/bin" - curl -sSL -o terraform.zip "https://releases.hashicorp.com/terraform/${TF_VERSION}/terraform_${TF_VERSION}_linux_amd64.zip" - unzip terraform.zip - mv -v terraform "${HOME}/bin/terraform" - ~/bin/terraform version - + terraform_version: "1.5.7" - name: Create SSH keys - run: | - ssh-keygen -b 2048 -t rsa -q -N "" -f ~/.ssh/id_rsa - - - name: Test AWS - uses: ./.github/actions/test_provider - with: - path: ~/bin - provider: 'aws' - - - name: Test Azure - uses: ./.github/actions/test_provider - with: - path: ~/bin - provider: 'azure' - - - name: Test GCP - uses: ./.github/actions/test_provider - with: - path: ~/bin - provider: 'gcp' - - - name: Test OpenStack + run: ssh-keygen -b 2048 -t rsa -q -N "" -f ~/.ssh/id_rsa + - name: Test ${{ matrix.provider }} uses: ./.github/actions/test_provider with: - path: ~/bin - provider: 'openstack' - - - name: Test OVH - uses: ./.github/actions/test_provider - with: - path: ~/bin - provider: 'ovh' - - - name: Test CloudFlare DNS - run: | - ~/bin/terraform -chdir=dns/cloudflare init - ~/bin/terraform -chdir=dns/cloudflare validate - - - name: Test Google Cloud DNS - run: | - ~/bin/terraform -chdir=dns/gcloud init - ~/bin/terraform -chdir=dns/gcloud validate + provider: ${{ matrix.provider }} + test_dns_provider: + runs-on: ubuntu-latest + strategy: + matrix: + provider: ['cloudflare', 'gcloud', 'txt'] + steps: + - name: Init ${{ matrix.provider }} DNS + run: terraform -chdir=dns/${{ matrix.provider }} init + 
- name: Validate ${{ matrix.provider }} DNS + run: terraform -chdir=dns/${{ matrix.provider }} validate From 98654032b7193e2f53f59df5e4d25fb1cae76860 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?= Date: Fri, 25 Apr 2025 12:14:02 -0400 Subject: [PATCH 31/52] Add missing terraform setup to dns test --- .github/workflows/test.yaml | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/.github/workflows/test.yaml b/.github/workflows/test.yaml index e7624e461..679706187 100644 --- a/.github/workflows/test.yaml +++ b/.github/workflows/test.yaml @@ -32,6 +32,11 @@ jobs: matrix: provider: ['cloudflare', 'gcloud', 'txt'] steps: + - name: Checkout code + uses: actions/checkout@main + - uses: hashicorp/setup-terraform@v3 + with: + terraform_version: "1.5.7" - name: Init ${{ matrix.provider }} DNS run: terraform -chdir=dns/${{ matrix.provider }} init - name: Validate ${{ matrix.provider }} DNS From 6c0116b6e93e4f918c6b332fb59677978640f226 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?= Date: Fri, 25 Apr 2025 14:43:55 -0400 Subject: [PATCH 32/52] Remove shell bash from trivy run --- .github/workflows/trivy_scan.yaml | 1 - 1 file changed, 1 deletion(-) diff --git a/.github/workflows/trivy_scan.yaml b/.github/workflows/trivy_scan.yaml index 9d482bbdd..398695b61 100644 --- a/.github/workflows/trivy_scan.yaml +++ b/.github/workflows/trivy_scan.yaml @@ -31,7 +31,6 @@ jobs: cache: true - name: Run Trivy on providers - shell: bash run: trivy config --misconfig-scanners terraform --tf-exclude-downloaded-modules --skip-dirs examples/advanced --format json -o trivy-results.json . 
- name: Convert Trivy JSON output to SARIF and filter duplicated results From 4d2e4fb58a71a5286ca1fb785435d4dfa1812eaa Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?= Date: Fri, 25 Apr 2025 15:02:18 -0400 Subject: [PATCH 33/52] Merge trivy and terraform testing job --- .github/workflows/test.yaml | 55 ++++++++++++++++++++++++++ .github/workflows/trivy_scan.yaml | 64 ------------------------------- 2 files changed, 55 insertions(+), 64 deletions(-) delete mode 100644 .github/workflows/trivy_scan.yaml diff --git a/.github/workflows/test.yaml b/.github/workflows/test.yaml index 679706187..743208f70 100644 --- a/.github/workflows/test.yaml +++ b/.github/workflows/test.yaml @@ -41,3 +41,58 @@ jobs: run: terraform -chdir=dns/${{ matrix.provider }} init - name: Validate ${{ matrix.provider }} DNS run: terraform -chdir=dns/${{ matrix.provider }} validate + trivy-vuln-scan: + name: Running Trivy Scan + runs-on: ubuntu-latest + steps: + - name: Checkout code + uses: actions/checkout@v4 + + - name: Resolve symbolic links and fix source + run: | + rm {aws,azure,gcp,openstack}/{outputs.tf,variables.tf} + for cloud in aws azure gcp openstack; do + cp common/outputs.tf common/variables.tf $cloud/; + done + for example in examples/*/*.tf; do + sed -i 's;git::https://github.com/ComputeCanada/magic_castle.git//;../../;g' $example + done + + - name: Manual Trivy Setup + uses: aquasecurity/setup-trivy@v0.2.2 + with: + version: v0.61.1 + cache: true + + - name: Run Trivy on providers + run: trivy config --misconfig-scanners terraform --tf-exclude-downloaded-modules --skip-dirs examples/advanced --format json -o trivy-results.json . + + - name: Convert Trivy JSON output to SARIF and filter duplicated results + run: | + trivy convert --format sarif trivy-results.json --output trivy-results.sarif + # When converting from JSON to SARIF, some information, like origin of the misconfiguration, is lost. + # The lost information results in duplicated issues. 
We filter these issues with jq and create a new + # sarif file that will be uploaded to the security tab. + jq 'reduce .runs[0].results[] as $a ([]; if IN(.[]; $a) then . else . += [$a] end)' trivy-results.sarif > trivy-results-filtered.sarif + jq ".runs[0].results |= $(cat trivy-results-filtered.sarif)" trivy-results.sarif > trivy-results-final.sarif + mv trivy-results-final.sarif trivy-results.sarif + + - name: Upload Trivy scan results to GitHub Security tab + uses: github/codeql-action/upload-sarif@v3 + with: + sarif_file: "trivy-results.sarif" + + - name: Publish Trivy Output to Summary + run: | + if [[ -s trivy-results.json ]]; then + { + echo "### Trivy Misconfiguration Scan Output" + echo "
<details><summary>Click to expand</summary>" + echo "" + echo '```console' + echo '$ trivy config --misconfig-scanners terraform --tf-exclude-downloaded-modules --skip-dirs examples/advanced .' + trivy convert --format table trivy-results.json + echo '```' + echo "</details>
" + } >> $GITHUB_STEP_SUMMARY + fi diff --git a/.github/workflows/trivy_scan.yaml b/.github/workflows/trivy_scan.yaml deleted file mode 100644 index 398695b61..000000000 --- a/.github/workflows/trivy_scan.yaml +++ /dev/null @@ -1,64 +0,0 @@ -name: Trivy Misconfiguration Scan - -on: - push: - pull_request: - branches: - - main - -jobs: - trivy-vuln-scan: - name: Running Trivy Scan - runs-on: ubuntu-latest - steps: - - name: Checkout code - uses: actions/checkout@v4 - - - name: Resolve symbolic links and fix source - run: | - rm {aws,azure,gcp,openstack}/{outputs.tf,variables.tf} - for cloud in aws azure gcp openstack; do - cp common/outputs.tf common/variables.tf $cloud/; - done - for example in examples/*/*.tf; do - sed -i 's;git::https://github.com/ComputeCanada/magic_castle.git//;../../;g' $example - done - - - name: Manual Trivy Setup - uses: aquasecurity/setup-trivy@v0.2.2 - with: - version: v0.61.1 - cache: true - - - name: Run Trivy on providers - run: trivy config --misconfig-scanners terraform --tf-exclude-downloaded-modules --skip-dirs examples/advanced --format json -o trivy-results.json . - - - name: Convert Trivy JSON output to SARIF and filter duplicated results - run: | - trivy convert --format sarif trivy-results.json --output trivy-results.sarif - # When converting from JSON to SARIF, some information, like origin of the misconfiguration, is lost. - # The lost information results in duplicated issues. We filter these issues with jq and create a new - # sarif file that will be uploaded to the security tab. - jq 'reduce .runs[0].results[] as $a ([]; if IN(.[]; $a) then . else . 
+= [$a] end)' trivy-results.sarif > trivy-results-filtered.sarif - jq ".runs[0].results |= $(cat trivy-results-filtered.sarif)" trivy-results.sarif > trivy-results-final.sarif - mv trivy-results-final.sarif trivy-results.sarif - - - name: Upload Trivy scan results to GitHub Security tab - uses: github/codeql-action/upload-sarif@v3 - with: - sarif_file: "trivy-results.sarif" - - - name: Publish Trivy Output to Summary - run: | - if [[ -s trivy-results.json ]]; then - { - echo "### Trivy Misconfiguration Scan Output" - echo "
<details><summary>Click to expand</summary>" - echo "" - echo '```console' - echo '$ trivy config --misconfig-scanners terraform --tf-exclude-downloaded-modules --skip-dirs examples/advanced .' - trivy convert --format table trivy-results.json - echo '```' - echo "</details>
" - } >> $GITHUB_STEP_SUMMARY - fi From 65da305c8ea28ce379a40190be193e89cf223919 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?= Date: Fri, 25 Apr 2025 15:12:53 -0400 Subject: [PATCH 34/52] Normalize checkout version --- .github/workflows/test.yaml | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/.github/workflows/test.yaml b/.github/workflows/test.yaml index 743208f70..695e31cff 100644 --- a/.github/workflows/test.yaml +++ b/.github/workflows/test.yaml @@ -16,7 +16,7 @@ jobs: provider: ['aws', 'azure', 'gcp', 'openstack', 'ovh'] steps: - name: Checkout code - uses: actions/checkout@main + uses: actions/checkout@v4 - uses: hashicorp/setup-terraform@v3 with: terraform_version: "1.5.7" @@ -26,6 +26,7 @@ jobs: uses: ./.github/actions/test_provider with: provider: ${{ matrix.provider }} + test_dns_provider: runs-on: ubuntu-latest strategy: @@ -33,7 +34,7 @@ jobs: provider: ['cloudflare', 'gcloud', 'txt'] steps: - name: Checkout code - uses: actions/checkout@main + uses: actions/checkout@v4 - uses: hashicorp/setup-terraform@v3 with: terraform_version: "1.5.7" @@ -41,6 +42,7 @@ jobs: run: terraform -chdir=dns/${{ matrix.provider }} init - name: Validate ${{ matrix.provider }} DNS run: terraform -chdir=dns/${{ matrix.provider }} validate + trivy-vuln-scan: name: Running Trivy Scan runs-on: ubuntu-latest From e5ae1c264c6cdb73db0279303353b30c7eb3d574 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?= Date: Fri, 25 Apr 2025 15:39:38 -0400 Subject: [PATCH 35/52] Remove usage of test_provider --- .github/workflows/test.yaml | 46 ++++++++++++++++++++++++------------- 1 file changed, 30 insertions(+), 16 deletions(-) diff --git a/.github/workflows/test.yaml b/.github/workflows/test.yaml index 695e31cff..ee1a438ae 100644 --- a/.github/workflows/test.yaml +++ b/.github/workflows/test.yaml @@ -9,25 +9,20 @@ on: - main jobs: - test_cloud_provider: + validate_cloud_providers: runs-on: ubuntu-latest strategy: matrix: 
         provider: ['aws', 'azure', 'gcp', 'openstack', 'ovh']
     steps:
-      - name: Checkout code
-        uses: actions/checkout@v4
+      - uses: actions/checkout@v4
       - uses: hashicorp/setup-terraform@v3
         with:
           terraform_version: "1.5.7"
-      - name: Create SSH keys
-        run: ssh-keygen -b 2048 -t rsa -q -N "" -f ~/.ssh/id_rsa
-      - name: Test ${{ matrix.provider }}
-        uses: ./.github/actions/test_provider
-        with:
-          provider: ${{ matrix.provider }}
+      - run: terraform -chdir=${{ matrix.provider }} init
+      - run: terraform -chdir=${{ matrix.provider }} validate
 
-  test_dns_provider:
+  validate_dns_providers:
     runs-on: ubuntu-latest
     strategy:
       matrix:
@@ -38,17 +33,36 @@ jobs:
       - uses: hashicorp/setup-terraform@v3
         with:
           terraform_version: "1.5.7"
-      - name: Init ${{ matrix.provider }} DNS
-        run: terraform -chdir=dns/${{ matrix.provider }} init
-      - name: Validate ${{ matrix.provider }} DNS
-        run: terraform -chdir=dns/${{ matrix.provider }} validate
+      - run: terraform -chdir=dns/${{ matrix.provider }} init
+      - run: terraform -chdir=dns/${{ matrix.provider }} validate
+
+  validate_examples:
+    runs-on: ubuntu-latest
+    strategy:
+      matrix:
+        example:
+          - aws
+          - azure
+          - gcp
+          - openstack
+          - ovh
+          - advanced/spot_instance/aws
+    steps:
+      - uses: actions/checkout@v4
+      - uses: hashicorp/setup-terraform@v3
+        with:
+          terraform_version: "1.5.7"
+      - name: Generate an SSH key
+        run: ssh-keygen -b 2048 -t rsa -q -N "" -f ~/.ssh/id_rsa
+      - run: sed -i "s;git::${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}//;../../;g" examples/${{ matrix.example }}/main.tf;
+      - run: terraform -chdir=examples/${{ matrix.example }} init
+      - run: terraform -chdir=examples/${{ matrix.example }} validate
 
   trivy-vuln-scan:
     name: Running Trivy Scan
     runs-on: ubuntu-latest
     steps:
-      - name: Checkout code
-        uses: actions/checkout@v4
+      - uses: actions/checkout@v4
 
       - name: Resolve symbolic links and fix source
         run: |

From 90f72b451c989692552f22f1962ec00ea68ddaad Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?=
Date: Fri, 25 Apr 2025 15:40:19 -0400
Subject: [PATCH 36/52] Remove action test_provider no longer used

---
 .github/actions/test_provider/action.yaml | 28 -----------------------
 1 file changed, 28 deletions(-)
 delete mode 100644 .github/actions/test_provider/action.yaml

diff --git a/.github/actions/test_provider/action.yaml b/.github/actions/test_provider/action.yaml
deleted file mode 100644
index 4c0970e28..000000000
--- a/.github/actions/test_provider/action.yaml
+++ /dev/null
@@ -1,28 +0,0 @@
-name: 'Test provider'
-description: 'Try to initialize a Magic Castle provider folder'
-inputs:
-  provider:
-    description: 'name of the provider'
-    required: true
-
-runs:
-  using: "composite"
-  steps:
-    - run: terraform -chdir=${{ inputs.provider }} init
-      shell: bash
-      id: init
-    - run: terraform -chdir=${{ inputs.provider }} validate
-      shell: bash
-      id: validate
-    - run: find examples -name ${{ inputs.provider }} -type d -not -path '*/\.*'
-      shell: bash
-      id: find-examples
-    - run: sed -E -i 's;(source)\s*=.*${{ inputs.provider }}.*;\1 = "../../${{ inputs.provider }}";g' examples/${{ inputs.provider }}/main.tf;
-      shell: bash
-      id: sed-example
-    - run: terraform -chdir=examples/${{ inputs.provider }} init
-      shell: bash
-      id: init-example
-    - run: terraform -chdir=examples/${{ inputs.provider }} validate
-      shell: bash
-      id: validate-example

From b051f17c76ba20b4045bac9bff2a9a44bc2e02e6 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?=
Date: Fri, 25 Apr 2025 15:44:57 -0400
Subject: [PATCH 37/52] Fix example name

---
 .github/workflows/test.yaml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.github/workflows/test.yaml b/.github/workflows/test.yaml
index ee1a438ae..aa0f0b103 100644
--- a/.github/workflows/test.yaml
+++ b/.github/workflows/test.yaml
@@ -46,7 +46,7 @@ jobs:
           - gcp
           - openstack
           - ovh
-          - advanced/spot_instance/aws
+          - advanced/spot_instances/aws
     steps:
       - uses: actions/checkout@v4
       - uses: hashicorp/setup-terraform@v3

From 747628523cebe59c450e8be183f54f0e92d1757e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?=
Date: Fri, 25 Apr 2025 15:48:06 -0400
Subject: [PATCH 38/52] Fix sed

---
 .github/workflows/test.yaml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.github/workflows/test.yaml b/.github/workflows/test.yaml
index aa0f0b103..3c31c141d 100644
--- a/.github/workflows/test.yaml
+++ b/.github/workflows/test.yaml
@@ -54,7 +54,7 @@ jobs:
           terraform_version: "1.5.7"
       - name: Generate an SSH key
         run: ssh-keygen -b 2048 -t rsa -q -N "" -f ~/.ssh/id_rsa
-      - run: sed -i "s;git::${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}//;../../;g" examples/${{ matrix.example }}/main.tf;
+      - run: sed -i "s;git::${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}.git//;../../;g" examples/${{ matrix.example }}/main.tf;
       - run: terraform -chdir=examples/${{ matrix.example }} init
       - run: terraform -chdir=examples/${{ matrix.example }} validate
 

From 52a05b157087c5c6e2746cc7bb4808780bddd2d8 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?=
Date: Fri, 25 Apr 2025 15:59:27 -0400
Subject: [PATCH 39/52] Remove advanced example for now

---
 .github/workflows/test.yaml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.github/workflows/test.yaml b/.github/workflows/test.yaml
index 3c31c141d..7a3c33028 100644
--- a/.github/workflows/test.yaml
+++ b/.github/workflows/test.yaml
@@ -46,7 +46,7 @@ jobs:
           - gcp
           - openstack
           - ovh
-          - advanced/spot_instances/aws
+          # - advanced/spot_instances/aws
     steps:
       - uses: actions/checkout@v4
       - uses: hashicorp/setup-terraform@v3

From 18caf3982dd7c5eb3d73d484261a544d49181f4b Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?=
Date: Fri, 25 Apr 2025 16:05:32 -0400
Subject: [PATCH 40/52] Disable fail-fast in github action

---
 .github/workflows/test.yaml | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/.github/workflows/test.yaml b/.github/workflows/test.yaml
index 7a3c33028..69baecc9e 100644
--- a/.github/workflows/test.yaml
+++ b/.github/workflows/test.yaml
@@ -12,6 +12,7 @@ jobs:
   validate_cloud_providers:
     runs-on: ubuntu-latest
     strategy:
+      fail-fast: false
       matrix:
         provider: ['aws', 'azure', 'gcp', 'openstack', 'ovh']
     steps:
@@ -25,6 +26,7 @@ jobs:
   validate_dns_providers:
     runs-on: ubuntu-latest
     strategy:
+      fail-fast: false
       matrix:
         provider: ['cloudflare', 'gcloud', 'txt']
     steps:
@@ -39,6 +41,7 @@ jobs:
   validate_examples:
     runs-on: ubuntu-latest
     strategy:
+      fail-fast: false
       matrix:
         example:
           - aws

From df27d7e0c253dc539ce9e533ac7e8694427e5fe6 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?=
Date: Fri, 25 Apr 2025 16:17:57 -0400
Subject: [PATCH 41/52] Limit execution of workflows on paths for docs and terraform

---
 .github/workflows/mkdocs_test.yml |  9 ++++++++-
 .github/workflows/test.yaml       | 20 ++++++++++++++++++--
 2 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/.github/workflows/mkdocs_test.yml b/.github/workflows/mkdocs_test.yml
index ce7a9e945..c7405949a 100644
--- a/.github/workflows/mkdocs_test.yml
+++ b/.github/workflows/mkdocs_test.yml
@@ -1,6 +1,13 @@
 # documentation: https://help.github.com/en/articles/workflow-syntax-for-github-actions
 name: build documentation
-on: [push, pull_request]
+on:
+  push:
+    paths:
+      - docs/*
+  pull_request:
+    paths:
+      - docs/*
+
 # Declare default permissions as read only.
 permissions: read-all
 jobs:
diff --git a/.github/workflows/test.yaml b/.github/workflows/test.yaml
index 69baecc9e..18079c173 100644
--- a/.github/workflows/test.yaml
+++ b/.github/workflows/test.yaml
@@ -2,11 +2,27 @@ name: Validate Terraform code
 
 on:
   push:
-    branches:
-      - '*'
+    paths:
+      - aws/*
+      - azure/*
+      - common/*
+      - dns/*
+      - examples/*
+      - openstack/*
+      - ovh/*
+      - .github/workflows/test.yaml
   pull_request:
     branches:
       - main
+    paths:
+      - aws/*
+      - azure/*
+      - common/*
+      - dns/*
+      - examples/*
+      - openstack/*
+      - ovh/*
+      - .github/workflows/test.yaml
 
 jobs:
   validate_cloud_providers:

From 89eaeba97526b100e61700193f3aedf5c66246ea Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?=
Date: Fri, 25 Apr 2025 16:26:39 -0400
Subject: [PATCH 42/52] Drop usage of github-action-markdown-link-check

It is deprecated and we used it only a small number of files
---
 .github/workflows/docs.yaml       | 24 ------------------------
 .github/workflows/mlc_config.json | 14 --------------
 2 files changed, 38 deletions(-)
 delete mode 100644 .github/workflows/docs.yaml
 delete mode 100644 .github/workflows/mlc_config.json

diff --git a/.github/workflows/docs.yaml b/.github/workflows/docs.yaml
deleted file mode 100644
index 843696c1d..000000000
--- a/.github/workflows/docs.yaml
+++ /dev/null
@@ -1,24 +0,0 @@
-name: Check Markdown links
-
-on:
-  push:
-    branches:
-      - main
-  pull_request:
-    branches:
-      - main
-  schedule:
-    - cron: "0 9 * * *"
-
-jobs:
-  markdown-link-check:
-    runs-on: ubuntu-latest
-    steps:
-      - uses: actions/checkout@master
-      - uses: gaurav-nelson/github-action-markdown-link-check@v1
-        with:
-          config-file: './.github/workflows/mlc_config.json'
-          use-quiet-mode: 'yes'
-          use-verbose-mode: 'yes'
-          folder-path: '.'
-          file-path: './README.md, ./CHANGELOG.md, ./LICENSE'
\ No newline at end of file
diff --git a/.github/workflows/mlc_config.json b/.github/workflows/mlc_config.json
deleted file mode 100644
index 95cf07e9e..000000000
--- a/.github/workflows/mlc_config.json
+++ /dev/null
@@ -1,14 +0,0 @@
-{
-  "ignorePatterns": [
-    {
-      "pattern": "^https://dash.cloudflare.com"
-    }
-  ],
-  "replacementPatterns": [
-    {
-      "pattern": "^/",
-      "replacement": "https://github.com/ComputeCanada/magic_castle/tree/main/"
-    }
-  ],
-  "aliveStatusCodes": [200, 206, 429]
-}
\ No newline at end of file

From b69968dfbc70b1911105dfeefe67d539cdd3b5e7 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?=
Date: Tue, 29 Apr 2025 12:58:05 -0400
Subject: [PATCH 43/52] Update action versions

---
 .github/workflows/release.yaml  | 4 ++--
 .github/workflows/spelling.yaml | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/.github/workflows/release.yaml b/.github/workflows/release.yaml
index 7aa1306f5..7ea387ffb 100644
--- a/.github/workflows/release.yaml
+++ b/.github/workflows/release.yaml
@@ -10,14 +10,14 @@ jobs:
     runs-on: ubuntu-latest
     steps:
       - name: Checkout code
-        uses: actions/checkout@main
+        uses: actions/checkout@v4
 
       - name: Retrieve tag name
         id: tag_name
         run: |
           echo ::set-output name=SOURCE_TAG::${GITHUB_REF#refs/tags/}
 
-      - uses: hashicorp/setup-terraform@v1
+      - uses: hashicorp/setup-terraform@v3
         with:
           terraform_version: 1.5.7
 
diff --git a/.github/workflows/spelling.yaml b/.github/workflows/spelling.yaml
index 8b45f7443..f1d8ea789 100644
--- a/.github/workflows/spelling.yaml
+++ b/.github/workflows/spelling.yaml
@@ -12,8 +12,8 @@ jobs:
   codespell:
     runs-on: ubuntu-latest
     steps:
-      - uses: actions/checkout@master
-      - uses: codespell-project/actions-codespell@master
+      - uses: actions/checkout@v4
+      - uses: codespell-project/actions-codespell@v2.1
         with:
           check_filenames: true
           ignore_words_list: keypair, te

From fd2627c1c31cd7e79f6a24de130e5d0b46593e87 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?=
Date: Tue, 29 Apr 2025 12:58:44 -0400
Subject: [PATCH 44/52] Run trivy action only if provider and example code is valid

---
 .github/workflows/test.yaml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.github/workflows/test.yaml b/.github/workflows/test.yaml
index 18079c173..a279d0f95 100644
--- a/.github/workflows/test.yaml
+++ b/.github/workflows/test.yaml
@@ -80,6 +80,7 @@ jobs:
   trivy-vuln-scan:
     name: Running Trivy Scan
     runs-on: ubuntu-latest
+    needs: [validate_cloud_providers, validate_examples]
     steps:
       - uses: actions/checkout@v4
 

From 1256be6c4ab457178d3fb28e60c33f97ad53ee79 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?=
Date: Tue, 29 Apr 2025 13:21:21 -0400
Subject: [PATCH 45/52] Replace for loop by single sed call

---
 .github/workflows/test.yaml | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/.github/workflows/test.yaml b/.github/workflows/test.yaml
index a279d0f95..af98985cc 100644
--- a/.github/workflows/test.yaml
+++ b/.github/workflows/test.yaml
@@ -90,9 +90,7 @@ jobs:
           for cloud in aws azure gcp openstack; do
             cp common/outputs.tf common/variables.tf $cloud/;
           done
-          for example in examples/*/*.tf; do
-            sed -i 's;git::https://github.com/ComputeCanada/magic_castle.git//;../../;g' $example
-          done
+          sed -i 's;git::https://github.com/ComputeCanada/magic_castle.git//;../../;g' examples/*/*.tf
 
       - name: Manual Trivy Setup
         uses: aquasecurity/setup-trivy@v0.2.2

From 8a10392cc8d4870d6e442bc229645bbc19711dec Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?=
Date: Tue, 29 Apr 2025 14:19:07 -0400
Subject: [PATCH 46/52] Make count optional in validation

---
 common/variables.tf | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/common/variables.tf b/common/variables.tf
index 8edaf9424..1fde95565 100644
--- a/common/variables.tf
+++ b/common/variables.tf
@@ -16,7 +16,7 @@ variable "nb_users" {
 variable "instances" {
   description = "Map that defines the parameters for each type of instance of the cluster"
   validation {
-    condition = alltrue([for key, values in var.instances: can(regex("^[a-z][0-9a-z-]{1,63}$", "${key}${values.count}"))])
+    condition = alltrue([for key, values in var.instances: can(regex("^[a-z][0-9a-z-]{1,63}$", "${key}${lookup(values, "count", 1)}"))])
     error_message = "Instances' prefix plus index must be at most 63 lowercase alphanumeric characters and start with a letter. It can include dashes."
   }
   validation {
@@ -24,7 +24,7 @@ variable "instances" {
     error_message = "Each entry in var.instances needs to have at least a type and a list of tags."
   }
   validation {
-    condition = sum([for key, values in var.instances: contains(values["tags"], "proxy") ? values["count"] : 0]) < 2
+    condition = sum([for key, values in var.instances: contains(values["tags"], "proxy") ? lookup(values, "count", 1) : 0]) < 2
     error_message = "At most one instance in var.instances can have the _proxy_ tag"
   }
   validation {

From e431789bfaa3ffabbea07212fd6624117202401e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?=
Date: Tue, 29 Apr 2025 14:38:18 -0400
Subject: [PATCH 47/52] Add advanced examples to validation in CI/CD

---
 .github/workflows/test.yaml | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/.github/workflows/test.yaml b/.github/workflows/test.yaml
index af98985cc..a18a3121e 100644
--- a/.github/workflows/test.yaml
+++ b/.github/workflows/test.yaml
@@ -77,6 +77,31 @@ jobs:
       - run: terraform -chdir=examples/${{ matrix.example }} init
       - run: terraform -chdir=examples/${{ matrix.example }} validate
 
+  validate_advanced_examples:
+    runs-on: ubuntu-latest
+    strategy:
+      fail-fast: false
+      matrix:
+        example:
+          - spot_instances/aws
+          - spot_instances/azure
+          - spot_instances/gcp
+          - basic_puppet/openstack
+          - elk/openstack
+          - k8s/openstack
+          - lustre/openstack
+          - spark/openstack
+    steps:
+      - uses: actions/checkout@v4
+      - uses: hashicorp/setup-terraform@v3
+        with:
+          terraform_version: "1.5.7"
+      - name: Generate an SSH key
+        run: ssh-keygen -b 2048 -t rsa -q -N "" -f ~/.ssh/id_rsa
+      - run: sed -i "s;git::${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}.git//;../../../../;g" examples/advanced/${{ matrix.example }}/main.tf;
+      - run: terraform -chdir=examples/advanced/${{ matrix.example }} init
+      - run: terraform -chdir=examples/advanced/${{ matrix.example }} validate
+
   trivy-vuln-scan:
     name: Running Trivy Scan
     runs-on: ubuntu-latest

From 91ecbc1ea999446650ac7f9060d1f35e8aa8a744 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?=
Date: Wed, 30 Apr 2025 13:37:40 -0400
Subject: [PATCH 48/52] Add main branch push

---
 .github/workflows/test.yaml | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/.github/workflows/test.yaml b/.github/workflows/test.yaml
index a18a3121e..7a5ee7166 100644
--- a/.github/workflows/test.yaml
+++ b/.github/workflows/test.yaml
@@ -2,6 +2,8 @@ name: Validate Terraform code
 
 on:
   push:
+    branches:
+      - main
     paths:
       - aws/*
       - azure/*

From cca5be075aca443c3fab03025a9b62f58a8e7c11 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?=
Date: Mon, 5 May 2025 10:56:51 -0400
Subject: [PATCH 49/52] Remove aws_key_pair

It is optional and has no use in Magic Castle as we supply our own cloud-init.
---
 aws/infrastructure.tf | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/aws/infrastructure.tf b/aws/infrastructure.tf
index 1459b77e8..f196e3221 100644
--- a/aws/infrastructure.tf
+++ b/aws/infrastructure.tf
@@ -108,11 +108,6 @@ resource "aws_placement_group" "efa_group" {
   strategy = "cluster"
 }
 
-resource "aws_key_pair" "key" {
-  key_name = "${var.cluster_name}-key"
-  public_key = var.public_keys[0]
-}
-
 data "aws_ec2_instance_type" "instance_type" {
   for_each = var.instances
   instance_type = each.value.type
@@ -132,8 +127,6 @@ resource "aws_instance" "instances" {
   availability_zone = local.availability_zone
   placement_group = contains(each.value.tags, "efa") ? aws_placement_group.efa_group.id : null
 
-  key_name = aws_key_pair.key.key_name
-
   network_interface {
     network_interface_id = aws_network_interface.nic[each.key].id
     device_index = 0

From 27a2be4939ee68d20953dd846fd2bd2af2642502 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?=
Date: Thu, 22 May 2025 10:24:38 -0400
Subject: [PATCH 50/52] Move puppet server inclusion etc/hosts to earlier steps

---
 common/configuration/puppet.yaml | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/common/configuration/puppet.yaml b/common/configuration/puppet.yaml
index d087049b5..ec28b649f 100644
--- a/common/configuration/puppet.yaml
+++ b/common/configuration/puppet.yaml
@@ -32,6 +32,10 @@ runcmd:
   - chmod 644 /etc/ssh/ssh_host_*_key.pub
   - chgrp ssh_keys /etc/ssh/ssh_host_*_key.pub
   - systemctl restart sshd
+# Make sure puppet server can be reached by name early in the process if we need to debug.
+%{ for host, ip in puppetservers ~}
+  - echo "${ip} ${host}" >> /etc/hosts
+%{ endfor ~}
 # Enable fastest mirror for distribution using dnf package manager
   - dnf config-manager --setopt=fastestmirror=True --save
 # Install package and configure kernel only if building from a "vanilla" linux image
@@ -103,10 +107,6 @@ runcmd:
 %{ endif }
   - chgrp puppet /etc/puppetlabs/puppet/csr_attributes.yaml
 %{ endif }
-# Setup puppet servers
-%{ for host, ip in puppetservers ~}
-  - echo "${ip} ${host}" >> /etc/hosts
-%{ endfor ~}
 %{ if length(puppetservers) > 0 ~}
   - /opt/puppetlabs/bin/puppet config set server ${keys(puppetservers)[0]}
 %{ endif ~}

From 569363ad44e305b66deb1036c38438588e375fd8 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?=
Date: Thu, 22 May 2025 15:51:57 -0400
Subject: [PATCH 51/52] Update changelog

---
 CHANGELOG.md | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index ad97f2f81..d69792f59 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -5,10 +5,26 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).
 
 ## [14.3.0] UNRELEASED
 
+### Added
+- [github] Added Trivy misconfiguration scan of Terraform code (PR #355)
+- [github] Added advanced examples to validation in CI/CD (PR #358)
+
 ### Changed
 
 - [dns] The default list of vhost subdomains has been replaced by a `["*"]`.
-This simplifies configuration of new virtual hosts in the reverse proxy.
+This simplifies configuration of new virtual hosts in the reverse proxy. (PR #347)
+- [common] Made sure ssh keys do not have whitespace prefix or suffix (PR #350)
+- [aws] Reduced choices of availablity zones in AWS (PR #351)
+- [common] Bumped terraform minimum version to 1.5.7
+- [common] Improved instance root disk size computation and warnings (PR #353)
+- [github] Modernized github workflows (PR #356)
+- [common] Made `count` optional in validation (PR #357)
+- [cloud-init] Enabled puppet prometheus reporting (PR #349)
+- [cloud-init] Moved puppet server inclusion in /etc/hosts to earlier steps
+
+### Removed
+
+- [aws] Removed key pair resource (PR #359)
 
 ## [14.2.1] 2025-02-21
 

From c1f2fdab4d6ec2090eed3990abe4da67afc2a460 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?F=C3=A9lix-Antoine=20Fortin?=
Date: Thu, 22 May 2025 16:01:21 -0400
Subject: [PATCH 52/52] Add release date to 14.3.0

---
 CHANGELOG.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index d69792f59..ab3f2b2eb 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -3,7 +3,7 @@ All notable changes to this project will be documented in this file.
 
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).
 
-## [14.3.0] UNRELEASED
+## [14.3.0] 2025-05-22
 
 ### Added
 - [github] Added Trivy misconfiguration scan of Terraform code (PR #355)