This repository was archived by the owner on Mar 20, 2023. It is now read-only.

Description
Problem Description
Creating a multi-instance pool with NC24rs_v3 nodes fails during node prep because shipyard_nodeprep.sh hardcodes the mlx5_0 device at lines 1609-1612:
export_ib_pkey()
{
    key0=$(cat /sys/class/infiniband/mlx5_0/ports/1/pkeys/0)
    key1=$(cat /sys/class/infiniband/mlx5_0/ports/1/pkeys/1)
The NC24rs_v3 has a ConnectX-3 card, which is identified as mlx4_0, not mlx5_0. Manually modifying shipyard_nodeprep.sh each time a pool is created works around the issue.
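One possible way to avoid the hardcoded device name is to pick up whichever mlx device is present under /sys/class/infiniband. The following is only a sketch of that idea, not the actual Batch Shipyard code or fix. The function names find_ib_device and read_ib_pkeys and the IB_SYSFS_ROOT override are hypothetical and introduced here for illustration. It assumes the pkeys live on port 1, as in the lines quoted above.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: discover the InfiniBand device instead of hardcoding mlx5_0.
# IB_SYSFS_ROOT is an assumed override so the logic can be tried off-node.
IB_SYSFS_ROOT="${IB_SYSFS_ROOT:-/sys/class/infiniband}"

find_ib_device() {
    local dev
    for dev in "$IB_SYSFS_ROOT"/mlx*; do
        # Accept only a device that exposes both pkey entries on port 1.
        # If the glob matches nothing, the literal pattern fails this check.
        if [ -r "$dev/ports/1/pkeys/0" ] && [ -r "$dev/ports/1/pkeys/1" ]; then
            basename "$dev"
            return 0
        fi
    done
    echo "no InfiniBand device found under $IB_SYSFS_ROOT" >&2
    return 1
}

read_ib_pkeys() {
    local dev
    dev=$(find_ib_device) || return 1
    # Same reads as the quoted export_ib_pkey lines, with the device name resolved at runtime.
    key0=$(cat "$IB_SYSFS_ROOT/$dev/ports/1/pkeys/0")
    key1=$(cat "$IB_SYSFS_ROOT/$dev/ports/1/pkeys/1")
}
```

On an NC24rs_v3 node, find_ib_device would be expected to print mlx4_0, while sizes that expose mlx5_0 would keep working unchanged. If a node exposes more than one mlx device, this sketch simply takes the first match, which may not be the right choice.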
Batch Shipyard Version
3.9.1 (Mac)
Steps to Reproduce
Resize a multi-instance pool containing NC24rs_v3 and wait for it to fail.
Expected Results
Each node finds its PKEYs and boots normally without manual intervention.
Actual Results
Manual intervention is required each time a pool is created or modified.
Redacted Configuration
pool_specification:
  id: arvinas-relion-pool-NCv3
  vm_configuration:
    platform_image:
      offer: CentOS-HPC
      publisher: OpenLogic
      sku: '7.7'
      version: '7.7.2020062600'
  vm_count:
    dedicated: 0
    low_priority: 0
  vm_size: STANDARD_NC24rs_v3
  autoscale:
    evaluation_interval: 00:05:00
    scenario:
      name: active_tasks
      maximum_vm_count:
        dedicated: 4
        low_priority: 4
      maximum_vm_increment_per_evaluation:
        dedicated: -1
        low_priority: -1
      bias_node_type: low_priority
  inter_node_communication_enabled: true
  virtual_network:
    arm_subnet_id: /subscriptions/{sub}/resourceGroups/{RG}/providers/Microsoft.Network/virtualNetworks/{Vnet}/subnets/{sn}
  ssh:
    username: shipyard