Allow mixing map calls #74

@sybrenjansen

Currently, mixing (i)map calls is not allowed because it corrupts the internal state of the workers.

PR #70 is already a good step forward for allowing this, but it needs a bit more work.

Current situation:

  • Workers receive a map parameters object that tells them the current settings, e.g., the function to execute and the worker lifespan
  • When a new map call is issued, these map parameters are overwritten
  • An apply job temporarily sets a new function; the next task uses the map function again. This is what allows map calls to be mixed with apply jobs
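The current behaviour described above can be illustrated with a small toy model. This is only a sketch: `MapParams` and `SingleSlotWorker` are hypothetical names made up for illustration and do not mirror the library's actual internals. It shows why an apply job mixes fine with a map call, while a second map call breaks the first one.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class MapParams:
    # Hypothetical stand-in for the map parameters object
    func: Callable[[Any], Any]
    worker_lifespan: Optional[int] = None


class SingleSlotWorker:
    """Toy worker that holds a single set of map parameters at a time."""

    def __init__(self, map_params: MapParams) -> None:
        self.map_params = map_params

    def set_map_params(self, map_params: MapParams) -> None:
        # A new map call overwrites whatever was stored before
        self.map_params = map_params

    def run_map_task(self, arg: Any) -> Any:
        return self.map_params.func(arg)

    def run_apply_task(self, func: Callable[[Any], Any], arg: Any) -> Any:
        # An apply job uses its own function for this one task only;
        # the stored map parameters are left untouched
        return func(arg)


worker = SingleSlotWorker(MapParams(func=lambda x: x * 2))
first = worker.run_map_task(3)
applied = worker.run_apply_task(lambda x: x + 100, 3)
after_apply = worker.run_map_task(3)  # still uses the map function

# A second map call replaces the parameters, so a remaining task of the
# first map call would now run with the wrong function
worker.set_map_params(MapParams(func=lambda x: x * 10))
after_overwrite = worker.run_map_task(3)
```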

Changes required:

  • Associate the map parameters with the corresponding job ID and store them within the cache
  • Whenever a worker is started we send the currently operational map parameters to it
  • Within a worker we store these, such that we can use the right map parameters based on the task job ID
  • Whenever a new map call is started while workers are still alive, send these map parameters to the worker (this is already done) and store them together with the rest
  • Whenever a non-poisonous pill is received by the worker (which means a map call is done), also send the job ID of the completed map, such that each worker can remove the corresponding map parameters. This should reduce the memory usage of the workers when many map calls are being made while keep_alive=True
  • For apply jobs we don't want to store the map parameters in the worker, because these parameters are only used once. They should, however, be stored in the WorkerPool cache for checking the timeout parameter
  • When two or more map functions are live and have different worker_init and/or worker_exit functions defined, call them all in succession. The user needs to ensure that keys in the worker state aren't reused
  • worker_lifespan can be different for different map calls. I guess we will take the minimum value in that case
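A rough sketch of how the worker side of these changes could look. All names here (`MapParams`, `JobAwareWorker`, and its methods) are hypothetical and only meant to illustrate the idea, not the library's real classes. It assumes a lifespan of `None` means unlimited and that init functions are called in job ID order.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional


@dataclass
class MapParams:
    # Hypothetical stand-in for the map parameters object
    func: Callable[[Any], Any]
    worker_lifespan: Optional[int] = None  # assumption: None means unlimited
    worker_init: Optional[Callable[[Dict[str, Any]], None]] = None


class JobAwareWorker:
    """Toy worker that keeps map parameters per job ID."""

    def __init__(self, active_params: Dict[int, MapParams]) -> None:
        # On start, the worker receives all currently operational parameters
        self.params_by_job = dict(active_params)

    def add_map_params(self, job_id: int, params: MapParams) -> None:
        # A new map call adds its parameters instead of overwriting
        self.params_by_job[job_id] = params

    def run_task(self, job_id: int, arg: Any) -> Any:
        # Pick the right parameters based on the job ID of the task
        return self.params_by_job[job_id].func(arg)

    def run_apply_task(self, func: Callable[[Any], Any], arg: Any) -> Any:
        # Apply parameters are used once, so nothing is stored in the worker
        return func(arg)

    def map_done(self, job_id: int) -> None:
        # Called when the completed job ID arrives with the non-poisonous pill
        self.params_by_job.pop(job_id, None)

    def effective_lifespan(self) -> Optional[int]:
        # Take the minimum lifespan over all live map calls
        lifespans = [p.worker_lifespan for p in self.params_by_job.values()
                     if p.worker_lifespan is not None]
        return min(lifespans) if lifespans else None

    def run_worker_inits(self, worker_state: Dict[str, Any]) -> None:
        # Call every defined init function in succession
        for job_id in sorted(self.params_by_job):
            init = self.params_by_job[job_id].worker_init
            if init is not None:
                init(worker_state)


worker = JobAwareWorker({
    1: MapParams(lambda x: x * 2, worker_lifespan=5,
                 worker_init=lambda state: state.update(a=1)),
})
worker.add_map_params(
    2, MapParams(lambda x: x * 10, worker_lifespan=3,
                 worker_init=lambda state: state.update(b=2)))

state: Dict[str, Any] = {}
worker.run_worker_inits(state)
results = [worker.run_task(1, 3), worker.run_task(2, 3), worker.run_task(1, 4)]
lifespan_both = worker.effective_lifespan()

worker.map_done(2)
lifespan_after = worker.effective_lifespan()
```

Note that the two init functions write to different keys (`a` and `b`). If both wrote to the same key, the later one would silently win, which is why the user needs to keep keys in the worker state distinct.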

Open questions:

  • Multiple progress bars may be required. This means a worker should also store progress info per job ID. Not sure yet how to enable this
  • Exception handling may need to change. Right now, all workers are stopped (terminated) whenever any worker raises an exception (except when this happens during an apply function). We might want to change this, such that other maps can continue? But we can also leave this for another issue.
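For the progress bar question, one possible direction is a per-job tally inside each worker that is flushed to the main process now and then. This is just a sketch of the bookkeeping: `ProgressTally` and its methods are made-up names, and how the counts would reach the actual progress bars is left open.

```python
from collections import Counter
from typing import Dict


class ProgressTally:
    """Toy per-worker counter of completed tasks, keyed by job ID."""

    def __init__(self) -> None:
        self.completed: Counter = Counter()

    def task_done(self, job_id: int, n: int = 1) -> None:
        self.completed[job_id] += n

    def flush(self) -> Dict[int, int]:
        # Return the counts gathered so far and reset them, as a worker
        # might do when reporting progress to the main process
        snapshot = dict(self.completed)
        self.completed.clear()
        return snapshot


tally = ProgressTally()
tally.task_done(1)
tally.task_done(2)
tally.task_done(1)
first_flush = tally.flush()
second_flush = tally.flush()
```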

Labels: enhancement
