50 changes: 12 additions & 38 deletions FAQ.md
@@ -18,25 +18,9 @@ pipeline, Kale sets the notebook server's image as the steps' base image (or a
custom user-defined image), so all those incremental changes (e.g. new
installations) will be lost.

You will notice this does not happen in our CodeLab because, when running in
MiniKF, Kale integrates with Rok, a data management platform that snapshots
the mounted volumes and makes them available to the pipeline steps, thus
preserving the exact development environment found in the notebook.

### Pod has unbound immediate PersistentVolumeClaim

In order to pass data between steps, Kale mounts a data volume on each
pipeline step. Since steps can run concurrently, your storage class needs to
support `RWX` (`ReadWriteMany`) volumes. If it does not, the pod will be left
unschedulable, as it won't find this kind of resource.

In this case, you can either install a storage class that supports `RWX`
volumes, or:

1. Retrieve the `.py` file generated by Kale (it should be next to the `.ipynb`)
2. Search for the `marshal_vop` definition (`marshal_vop = dsl.VolumeOp...`)
3. Change `modes=dsl.VOLUME_MODE_RWM` to `modes=dsl.VOLUME_MODE_RWO` (see the sketch below)
4. Run the `.py` file
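
For reference, here is a minimal sketch of what the edited `marshal_vop`
definition could look like. It assumes the KFP v1 DSL used by generated files
of that era; the name and size are illustrative, so keep the values from your
own generated `.py` file.

```python
# Hypothetical excerpt from a Kale-generated .py file (KFP v1 DSL assumed).
import kfp.dsl as dsl

marshal_vop = dsl.VolumeOp(
    name="kale-marshal-volume",       # illustrative name
    resource_name="kale-marshal-pvc",
    size="1Gi",                       # keep the size from your generated file
    modes=dsl.VOLUME_MODE_RWO,        # was dsl.VOLUME_MODE_RWM (ReadWriteMany)
)
```
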
To solve this, you can either:

1. Build a custom Docker image with all your dependencies pre-installed, or
2. List additional packages in the cell tags; Kale will include them in `packages_to_install` (see the sketch below)
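
As a rough illustration of the second option, this is how
`packages_to_install` behaves on a KFP v2 component. The package list and
component body are hypothetical, and the mapping from cell tags to this
parameter is an assumption here.

```python
# Minimal sketch (KFP v2), assuming Kale forwards cell-tag packages
# to the component's packages_to_install parameter.
from kfp import dsl

@dsl.component(
    base_image="python:3.12",
    packages_to_install=["pandas==2.2.2"],  # hypothetical extra dependency
)
def preprocess() -> str:
    # Packages are pip-installed in the container before this body runs.
    import pandas as pd
    return pd.__version__
```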

### Data passing and pickle errors

@@ -61,27 +45,17 @@ implemented.

### Compiler errors

When compiling your notebook, you may encounter the following error:
```
Internal compiler error: Compiler has produced Argo-incompatible workflow.
Please create a new issue at https://github.com/kubeflow/pipelines/issues attaching the pipeline code and the pipeline package.
```
followed by some explanation. For example:
```
Error: time="2020-10-12T17:57:45-07:00" level=fatal msg="/dev/stdin failed to parse: error unmarshaling JSON: while decoding JSON: json: unknown field \"volumes\""
```

This is an error raised by the KFP compiler. Kale's compile process converts
the notebook to KFP DSL and then compiles it, which is why the KFP compiler is
triggered.
If you encounter compiler errors, ensure you're using a compatible version of
KFP (v2.4.0+). The KFP v2 compiler produces IR YAML that is submitted to the
Kubeflow Pipelines backend.
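
To debug such errors outside Kale, you can compile the pipeline to IR YAML
directly with the KFP v2 SDK and inspect the output. The pipeline below is a
hypothetical stand-in for the one in your generated file.

```python
from kfp import compiler, dsl

@dsl.component
def hello() -> str:
    return "hello"

@dsl.pipeline(name="repro-pipeline")  # stand-in for the generated pipeline
def repro_pipeline():
    hello()

# Writes the IR YAML; attach it when filing a compiler issue.
compiler.Compiler().compile(
    pipeline_func=repro_pipeline,
    package_path="pipeline.yaml",
)
```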

The KFP compiler runs `argo lint` on the generated workflow if it finds the
`argo` executable in your environment's `PATH`.
Common issues:
- **Missing dependencies**: Ensure all required packages are listed in your imports cell
- **Invalid Python syntax**: Check that your notebook cells contain valid Python 3.12+ code
- **Type mismatches**: KFP v2 uses typed artifacts; ensure inputs/outputs match the expected types (see the sketch below)
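
For the last point, a minimal sketch of matching typed artifacts in KFP v2;
the component names and the CSV payload are made up.

```python
from kfp import dsl
from kfp.dsl import Dataset, Input, Output

@dsl.component
def make_dataset(out_data: Output[Dataset]):
    # Write the artifact payload to the path KFP provides.
    with open(out_data.path, "w") as f:
        f.write("a,b\n1,2\n")

@dsl.component
def read_dataset(in_data: Input[Dataset]) -> str:
    with open(in_data.path) as f:
        return f.read()

@dsl.pipeline(name="typed-artifacts")
def typed_pipeline():
    data_task = make_dataset()
    # Passing anything other than a Dataset here is a type mismatch.
    read_dataset(in_data=data_task.outputs["out_data"])
```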

To overcome this issue, you can either remove `argo` from your `PATH` or
replace it with a version that is supported by KFP. At the time of writing,
the recommended version is 2.4.3. Follow [this
link](https://github.com/argoproj/argo/releases/tag/v2.4.3) to get the proper
binary.
If issues persist, check the generated `.kale.py` file in the `.kale/` directory
and file an issue at https://github.com/kubeflow-kale/kale/issues.

## Limitations

4 changes: 3 additions & 1 deletion README.md
@@ -52,11 +52,13 @@ See the `Kale v2.0 Demo` video at the bottom of the `README` for more details.
Read more about Kale and how it works in this Medium post:
[Automating Jupyter Notebook Deployments to Kubeflow Pipelines with Kale](https://medium.com/kubeflow/automating-jupyter-notebook-deployments-to-kubeflow-pipelines-with-kale-a4ede38bea1f)

For a detailed technical overview, see the [Architecture Documentation](docs/ARCHITECTURE.md).

## Getting started

### Requirements

- **Python 3.10+**
- **Python 3.12+**
- **Kubeflow Pipelines v2.4.0+** - Install as recommended in the official [Kubeflow Pipelines Installation](https://www.kubeflow.org/docs/components/pipelines/operator-guides/installation/) documentation
- A Kubernetes cluster (`minikube`, `kind`, or any K8s cluster)
