Using the :upgrade action with the package provider in this cookbook results in excessive re-installation of packages because install_package is invoked unconditionally.
R will replace the installed package even when no newer version is available, which can break any job that tries to use the package while it is being re-installed. Since Chef runs every half hour, this can happen frequently on a busy server.
I've written a short Ruby method to generate a bash command that returns true if the package needs to be updated. It does this by comparing the PACKAGES manifest against the DESCRIPTION of the package in the site-library.
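def r_package_needs_update(package_name)
  # Hard-coded paths: the local repository index and the installed package's DESCRIPTION file.
  available_packages_file = "/opt/R/src/contrib/PACKAGES"
  installed_package_manifest = "/usr/local/lib/R/site-library/#{package_name}/DESCRIPTION"
  # Match the package name exactly so that e.g. "Rcpp" does not also match "RcppArmadillo".
  sh_available_version = "awk '$1 == \"Package:\" && $2 == \"#{package_name}\" {P=1} P==1 && /^Version:/ {print $2; exit}' #{available_packages_file}"
  sh_installed_version = "awk '/^Version:/ {print $2; exit}' #{installed_package_manifest}"
  # Quote both substitutions so the test stays valid when one side is empty (e.g. package not installed).
  "[ \"$(#{sh_available_version})\" != \"$(#{sh_installed_version})\" ]"
end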
This seems to work, but would need to be generalized instead of hard-coding the paths. The check executes very quickly (~0.01s) compared to calling into R (~0.50s) but I'm not sure if this is a durable approach for checking package versions.
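For discussion, here is a rough pure-Ruby sketch of the same check with the paths passed in rather than hard-coded. The helper names (`dcf_version`, `version_parts`, `r_package_outdated?`) and the keyword arguments are hypothetical and not part of the cookbook. It assumes PACKAGES and DESCRIPTION are simple "Field: value" records separated by blank lines, and that treating `.` and `-` as equivalent separators is good enough for comparing versions. Unlike the inequality test above, it only reports an update when the available version is strictly greater.

```ruby
# Hypothetical helpers; names and parsing rules are illustrative assumptions.

# Return the Version field from DCF-style text ("Field: value" lines,
# records separated by blank lines). With package_name given, only the
# record whose Package field matches exactly is considered.
def dcf_version(text, package_name = nil)
  text.split(/\n\s*\n/).each do |record|
    fields = record.scan(/^(\w+):[ \t]*(.*)$/).to_h
    next if package_name && fields["Package"] != package_name
    return fields["Version"]
  end
  nil
end

# Split an R-style version such as "1.0-3" into integers for comparison.
def version_parts(version)
  version.split(/[.-]/).map(&:to_i)
end

# True when the package is not installed or the repository lists a newer version.
def r_package_outdated?(package_name, packages_file:, library_dir:)
  available = dcf_version(File.read(packages_file), package_name)
  return false if available.nil? # not in the repository, nothing to upgrade to

  description = File.join(library_dir, package_name, "DESCRIPTION")
  return true unless File.exist?(description)

  installed = dcf_version(File.read(description))
  return true if installed.nil?

  (version_parts(available) <=> version_parts(installed)) > 0
end
```

With the paths from the method above, the call would look like `r_package_outdated?("Rcpp", packages_file: "/opt/R/src/contrib/PACKAGES", library_dir: "/usr/local/lib/R/site-library")`, and the result could feed a Ruby-block guard instead of a shell string.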