local_cache returner fails on missing .minions.p on master of masters #60251

@onmeac

Description of Issue

In a master-of-masters setup with one or more syndic servers, the .minions.p file is absent from the job's cache directory on the master of masters when a salt command is started from a syndic. This causes salt-run jobs.lookup_jid <jid> to fail on the master of masters.

Setup

  • a "master of masters" with at least one "syndic" connected
  • the master of masters uses the default returner configuration (e.g. local_cache as returner)
  • the syndic is configured with any master_job_cache other than local_cache, e.g. copy local_cache.py to test_issue.py and set master_job_cache: test_issue
  • two or more minions are connected to the syndic

Steps to Reproduce Issue

  • from master of masters [1]: salt \* test.ping --async
  • from syndic [2]: salt \* test.ping --async

[1]: .minions.p file is created in /var/cache/salt/master/jobs/<some random jid dir>
[2]: .minions.p file is absent from /var/cache/salt/master/jobs/<some random jid dir>

Both random jid directories will have a .minions.<name of syndic>.p file.
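
To check which cached jobs are affected, the file layout described above can be scanned with a few lines of Python. This is only a sketch: the function name is hypothetical, it uses nothing from Salt, and it assumes the file names are exactly as quoted in this issue.

```python
import os

MINIONS_P = ".minions.p"  # file name as quoted in this issue


def jid_dirs_missing_minions_p(jobs_root):
    """Return job directories that hold a syndic minion list but no .minions.p."""
    affected = []
    for dirpath, _dirnames, filenames in os.walk(jobs_root):
        # Syndic lists are named .minions.<name of syndic>.p
        syndic_lists = [
            name
            for name in filenames
            if name.startswith(".minions.")
            and name.endswith(".p")
            and name != MINIONS_P
        ]
        if syndic_lists and MINIONS_P not in filenames:
            affected.append(dirpath)
    return sorted(affected)
```

On the master of masters this could be called as jid_dirs_missing_minions_p("/var/cache/salt/master/jobs"); every directory it returns should correspond to a job that was started from a syndic.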

When looking up a jid result, the local_cache returner creates a list containing the path to .minions.p (MINIONS_P) and then extends that list with every .minions.<name of syndic>.p (SYNDIC_MINIONS_P) path it finds.

Code from returners/local_cache.py:

317     minions_cache = [os.path.join(jid_dir, MINIONS_P)]
318     minions_cache.extend(
319         glob.glob(os.path.join(jid_dir, SYNDIC_MINIONS_P.format('*')))
320     )
321     all_minions = set()
322     for minions_path in minions_cache:
323         log.debug('Reading minion list from %s', minions_path)
324         try:
325             with salt.utils.files.fopen(minions_path, 'rb') as rfh:
326                 all_minions.update(serial.load(rfh))
327         except IOError as exc:
328             salt.utils.files.process_read_exception(exc, minions_path)

Because the MINIONS_P file does not exist, process_read_exception raises a CommandExecutionError.
Example exception:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/salt/client/mixins.py", line 374, in low
    data['return'] = func(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/salt/runners/jobs.py", line 128, in lookup_jid
    display_progress=display_progress
  File "/usr/lib/python3.6/site-packages/salt/runners/jobs.py", line 198, in list_job
    job = mminion.returners['{0}.get_load'.format(returner)](jid)
  File "/usr/lib/python3.6/site-packages/salt/returners/local_cache.py", line 328, in get_load
    salt.utils.files.process_read_exception(exc, minions_path)
  File "/usr/lib/python3.6/site-packages/salt/utils/files.py", line 225, in process_read_exception
    raise CommandExecutionError('{0} does not exist'.format(path))
salt.exceptions.CommandExecutionError: /var/cache/salt/master/jobs/ff/29df13854b66d66262bbb9484de6dc180c140489b3e75871f34f0a6e5c957c/.minions.p does not exist

Possible solutions:

process_read_exception takes an optional argument to ignore certain error codes; could that be an option?
Or perhaps add an if statement that checks whether os.path.join(jid_dir, MINIONS_P) exists before trying to read it?
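
A minimal, Salt-free sketch of the idea behind both suggestions: read every minion list that exists and skip the ones that are missing. Everything here is an assumption for illustration. read_all_minions and demo are hypothetical names, JSON stands in for Salt's serializer so the snippet runs without Salt installed, and the two constants are given the values implied by the file names in this issue.

```python
import errno
import glob
import json
import os
import tempfile

# Values assumed from the file names quoted in this issue.
MINIONS_P = ".minions.p"
SYNDIC_MINIONS_P = ".minions.{0}.p"


def read_all_minions(jid_dir):
    """Collect minion IDs from every minion-list file present in jid_dir."""
    minions_cache = [os.path.join(jid_dir, MINIONS_P)]
    minions_cache.extend(
        glob.glob(os.path.join(jid_dir, SYNDIC_MINIONS_P.format("*")))
    )
    all_minions = set()
    for minions_path in minions_cache:
        try:
            with open(minions_path, "rb") as rfh:
                all_minions.update(json.load(rfh))
        except OSError as exc:
            # A missing file is expected for jobs started from a syndic;
            # any other read error is still raised.
            if exc.errno != errno.ENOENT:
                raise
    return all_minions


def demo():
    """Build a job dir that only has a syndic list, as in case [2]."""
    with tempfile.TemporaryDirectory() as jid_dir:
        path = os.path.join(jid_dir, SYNDIC_MINIONS_P.format("syndic1"))
        with open(path, "w") as wfh:
            json.dump(["minion1", "minion2"], wfh)
        return read_all_minions(jid_dir)
```

In local_cache.py itself the same effect might be reached by passing the "file not found" error code through the optional ignore argument of process_read_exception mentioned above; that variant is untested here.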

Versions Report

Salt Version:
           Salt: 3000.9
 
Dependency Versions:
           cffi: Not Installed
       cherrypy: Not Installed
       dateutil: Not Installed
      docker-py: Not Installed
          gitdb: Not Installed
      gitpython: Not Installed
         Jinja2: 2.11.1
        libgit2: Not Installed
       M2Crypto: 0.35.2
           Mako: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.6.2
   mysql-python: Not Installed
      pycparser: Not Installed
       pycrypto: 2.6.1
   pycryptodome: 3.9.7
         pygit2: Not Installed
         Python: 3.6.8 (default, Nov 16 2020, 16:55:22)
   python-gnupg: Not Installed
         PyYAML: 3.13
          PyZMQ: 15.3.0
          smmap: Not Installed
        timelib: Not Installed
        Tornado: 4.5.3
            ZMQ: 4.1.4
 
System Versions:
           dist: centos 7.9.2009 Core
         locale: UTF-8
        machine: x86_64
        release: 3.10.0-1160.24.1.el7.x86_64
         system: Linux
        version: CentOS Linux 7.9.2009 Core

Metadata

Assignees: No one assigned
Labels: Salt-Syndic, bug (broken, incorrect, or confusing behavior)