I think that overall we are right that tasks should run as containers.
However, in this world we've created for ourselves, the only interface we are leaving people is CLI tools.
While this is good in some cases, in other cases, this can be a burden. I think it is a problem we need to address to ensure enthusiastic adoption.
The airflow-docker side of this implementation we can hash out in that repo, but on the airflow-docker-helper side I think we need two things:
- A "helper" codec to ensure we can serialize/deserialize data consistently
- A CLI that takes a path to a Python callable, function args, and function kwargs, and then imports and calls that callable.
Furthermore, I think a good first pass at this is:
- A "codec" that deals with JSON data that is base64 encoded
- A very simple wrapper that uses `argparse` and `__import__`
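A minimal sketch of such a codec, using only the standard library so it stays Python 2/3 compatible (function names here are illustrative, not a committed API):

```python
import base64
import json


def encode(obj):
    """Serialize a JSON-compatible object to a base64-encoded JSON string."""
    return base64.b64encode(json.dumps(obj).encode("utf-8")).decode("ascii")


def decode(data):
    """Deserialize a base64-encoded JSON string back into a Python object."""
    return json.loads(base64.b64decode(data).decode("utf-8"))
```

Base64 keeps the payload safe to pass through shell arguments and environment variables, which matters since everything here crosses a `docker run` boundary.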
This will permit an abstraction around:
```
docker run -it some-image airflow-docker-call \
    --call foo.bar.baz:my_func \
    --args fasdfasdfadsf= \
    --kwargs afasdf3wrwfssadf3=
```
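A wrapper along these lines could be as small as the following sketch (flag names follow the example above; the `--args`/`--kwargs` payloads are assumed to be the base64-encoded JSON described earlier, and nothing here reflects a finalized interface):

```python
import argparse
import base64
import json


def load_callable(path):
    """Resolve a 'package.module:func' path to the callable it names."""
    module_path, func_name = path.split(":")
    module = __import__(module_path, fromlist=[func_name])
    return getattr(module, func_name)


def main(argv=None):
    parser = argparse.ArgumentParser(description="Import and call a Python callable.")
    parser.add_argument("--call", required=True, help="Path like foo.bar.baz:my_func")
    parser.add_argument("--args", default=None, help="base64-encoded JSON list")
    parser.add_argument("--kwargs", default=None, help="base64-encoded JSON object")
    ns = parser.parse_args(argv)

    func = load_callable(ns.call)
    args = json.loads(base64.b64decode(ns.args)) if ns.args else []
    kwargs = json.loads(base64.b64decode(ns.kwargs)) if ns.kwargs else {}
    return func(*args, **kwargs)
```

The `module:func` split mirrors the setuptools entry-point convention, and `__import__` with `fromlist` handles nested packages without any third-party dependency.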
Since airflow-docker-helper is Python 2/3 compatible and has no external dependencies, this will let any legacy or new code use this method to safely call Python functions without resorting to the PythonOperator.