A fromconfig Launcher for yarn execution.
```bash
pip install fromconfig_yarn
```

Once installed, the launcher is available under the name `yarn`.
Given the following module

```python
# foo.py
class Model:
    def __init__(self, learning_rate: float):
        self.learning_rate = learning_rate

    def train(self):
        print(f"Training model with learning_rate {self.learning_rate}")
```

and config files
```yaml
# config.yaml
model:
  _attr_: foo.Model
  learning_rate: "${params.learning_rate}"
```

```yaml
# params.yaml
params:
  learning_rate: 0.001
```

```yaml
# launcher.yaml
yarn:
  name: test-fromconfig
logging:
  level: 20
launcher:
  run: yarn
```

Run (assuming you are in a Hadoop environment)

```bash
fromconfig config.yaml params.yaml launcher.yaml - model - train
```

which prints
```
INFO:fromconfig.launcher.logger:- yarn.name: test-fromconfig
INFO:fromconfig.launcher.logger:- logging.level: 20
INFO:fromconfig.launcher.logger:- params.learning_rate: 0.001
INFO:fromconfig.launcher.logger:- model._attr_: foo.Model
INFO:fromconfig.launcher.logger:- model.learning_rate: 0.001
INFO skein.Driver: Driver started, listening on 12345
INFO:fromconfig_yarn.launcher:Uploading pex to viewfs://root/user/path/to/pex
INFO:cluster_pack.filesystem:Resolved base filesystem: <class 'pyarrow.hdfs.HadoopFileSystem'>
INFO:cluster_pack.uploader:Zipping and uploading your env to viewfs://root/user/path/to/pex
INFO skein.Driver: Uploading application resources to viewfs://root/user/...
INFO skein.Driver: Submitting application...
INFO impl.YarnClientImpl: Submitted application application_12345
INFO:fromconfig_yarn.launcher:TRACKING_URL: http://12.34.56/application_12345
```
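The logged keys above show the fully resolved config: the `${params.learning_rate}` reference has been replaced by `0.001`, and each `_attr_` entry is resolved to an import path and instantiated with the remaining keys as keyword arguments. A minimal sketch of that mechanism (illustrative only; `instantiate` is a hypothetical helper, not fromconfig's actual implementation):

```python
# Sketch of how an `_attr_` entry becomes an object: import the module,
# look up the attribute, and call it with the remaining keys as keyword
# arguments. This illustrates the mechanism only; `instantiate` is a
# hypothetical helper, not fromconfig's actual code.
import importlib


def instantiate(config: dict):
    """Instantiate an object from a config dict carrying an _attr_ key."""
    kwargs = dict(config)
    module_name, _, attr_name = kwargs.pop("_attr_").rpartition(".")
    attr = getattr(importlib.import_module(module_name), attr_name)
    return attr(**kwargs)


model = instantiate({"_attr_": "fractions.Fraction", "numerator": 1, "denominator": 3})
print(model)  # 1/3
```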
You can also monkeypatch the relevant functions to "fake" the Hadoop environment with

```bash
python monkeypatch_fromconfig.py config.yaml params.yaml launcher.yaml - model - train
```

This example can be found in `docs/examples/quickstart`.
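The contents of `monkeypatch_fromconfig.py` are not shown here, but the general monkeypatching pattern it relies on is standard: replace the functions that would talk to Hadoop with local fakes before running anything. A minimal sketch using `unittest.mock` (the patched target here, `getpass.getuser`, is just a stand-in for illustration):

```python
# General monkeypatching pattern: swap a function for a fake within a scope.
# getpass.getuser is only a stand-in target; a real script would patch the
# fromconfig_yarn functions that upload to and submit against Hadoop.
import getpass
from unittest import mock


def fake_getuser():
    """Fake replacement that never touches the real environment."""
    return "fake-user"


with mock.patch("getpass.getuser", side_effect=fake_getuser):
    # Inside this block, any code calling getpass.getuser() gets the fake.
    print(getpass.getuser())  # fake-user
```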
To configure Yarn, add a `yarn` entry to your config. You can set the following parameters:

- `env_vars`: A list of environment variables to forward to the container(s)
- `hadoop_file_systems`: The list of available filesystems
- `ignored_packages`: The list of packages not to include in the environment
- `jvm_memory_in_gb`: The JVM memory (default: 8)
- `memory`: The executor's memory (default: 32 GiB)
- `num_cores`: The executor's number of cores (default: 8)
- `package_path`: The HDFS location where to save the environment
- `zip_file`: The path to an existing pex file, either local or on HDFS
- `name`: The application name
- `queue`: The yarn queue to submit the application to
- `node_label`: The label of the hadoop node to be scheduled
- `pre_script_hook`: A script to be executed before python is invoked
- `extra_env_vars`: A mapping of extra environment variables to forward to the container(s)
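For example, a `launcher.yaml` combining several of these parameters might look like the following (all values here are illustrative assumptions, not defaults):

```yaml
# launcher.yaml
yarn:
  name: my-training-job
  queue: default
  num_cores: 4
  jvm_memory_in_gb: 4
  package_path: viewfs://root/user/me/envs
  extra_env_vars:
    LOG_LEVEL: "INFO"
launcher:
  run: yarn
```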