From 67cae052d4b9b1fdc5e05328d911fbd182a383d0 Mon Sep 17 00:00:00 2001
From: Andreas Motl
Date: Tue, 16 Sep 2025 01:33:50 +0200
Subject: [PATCH 1/2] MLflow: Refactor section to dedicated page

---
 docs/integrate/index.md        |  1 +
 docs/integrate/mlflow/index.md | 78 ++++++++++++++++++++++++++++++++++
 docs/topic/ml/index.md         | 76 +---------------------------------
 3 files changed, 81 insertions(+), 74 deletions(-)
 create mode 100644 docs/integrate/mlflow/index.md

diff --git a/docs/integrate/index.md b/docs/integrate/index.md
index e068b961..123a045c 100644
--- a/docs/integrate/index.md
+++ b/docs/integrate/index.md
@@ -47,6 +47,7 @@ marquez/index
 meltano/index
 metabase/index
 mindsdb/index
+mlflow/index
 mongodb/index
 mqtt/index
 mysql/index
diff --git a/docs/integrate/mlflow/index.md b/docs/integrate/mlflow/index.md
new file mode 100644
index 00000000..d8cb59df
--- /dev/null
+++ b/docs/integrate/mlflow/index.md
@@ -0,0 +1,78 @@
+(mlflow)=
+# MLflow
+
+```{div}
+:style: "float: right; margin-left: 1em"
+[![MLflow logo](https://github.com/crate/crate-clients-tools/assets/453543/d1d4f4ac-1b44-46b8-ba6f-4a82607c57d3){height=60px loading=lazy}][MLflow]
+```
+```{div}
+:style: "clear: both"
+```
+
+:::{rubric} About
+:::
+
+[MLflow] is an open source platform to manage the whole ML lifecycle, including
+experimentation, reproducibility, deployment, and a central model registry.
+
+The [MLflow adapter for CrateDB], available through the [mlflow-cratedb] package
+on PyPI, provides support to use CrateDB as a storage database for the
+[MLflow Tracking] subsystem, which is about recording and querying experiments,
+across code, data, config, and results.
+
+:::{rubric} Learn
+:::
+Tutorials and Notebooks about using [MLflow] together with CrateDB.
+
+::::{info-card}
+:::{grid-item}
+:columns: 9
+**Blog: Running Time Series Models in Production using CrateDB**
+
+Part 1: Introduction to [Time Series Modeling using Machine Learning]
+
+The article will introduce you to the concept of time series modeling,
+discussing the main obstacles running it in production.
+It will introduce you to CrateDB, highlighting its key features and
+benefits, why it stands out in managing time series data, and why it is
+an especially good fit for supporting machine learning models in production.
+:::
+:::{grid-item}
+:columns: 3
+{tags-primary}`Fundamentals` \
+{tags-secondary}`Time Series Modeling`
+:::
+::::
+
+
+::::{info-card}
+:::{grid-item}
+:columns: 9
+**Notebook: Create a Time Series Anomaly Detection Model**
+
+Guidelines and runnable code to get started with MLflow and
+CrateDB, exercising time series anomaly detection and time series forecasting /
+prediction using NumPy, Salesforce Merlion, and Matplotlib.
+
+[![README](https://img.shields.io/badge/Open-README-darkblue?logo=GitHub)][MLflow and CrateDB]
+[![Notebook on GitHub](https://img.shields.io/badge/Open-Notebook%20on%20GitHub-darkgreen?logo=GitHub)][tracking-merlion-github]
+[![Notebook on Colab](https://img.shields.io/badge/Open-Notebook%20on%20Colab-blue?logo=Google%20Colab)][tracking-merlion-colab]
+:::
+:::{grid-item}
+:columns: 3
+{tags-primary}`Fundamentals` \
+{tags-secondary}`Time Series` \
+{tags-secondary}`Anomaly Detection` \
+{tags-secondary}`Prediction / Forecasting`
+:::
+::::
+
+
+[MLflow]: https://mlflow.org/
+[MLflow adapter for CrateDB]: https://github.com/crate/mlflow-cratedb
+[MLflow and CrateDB]: https://github.com/crate/cratedb-examples/tree/main/topic/machine-learning/mlflow
+[mlflow-cratedb]: https://pypi.org/project/mlflow-cratedb/
+[MLflow Tracking]: https://mlflow.org/docs/latest/tracking.html
+[Time Series Modeling using Machine Learning]: https://cratedb.com/blog/introduction-to-time-series-modeling-with-cratedb-machine-learning-time-series-data
+[tracking-merlion-colab]: https://colab.research.google.com/github/crate/cratedb-examples/blob/main/topic/machine-learning/mlflow/tracking_merlion.ipynb
+[tracking-merlion-github]: https://github.com/crate/cratedb-examples/blob/main/topic/machine-learning/mlflow/tracking_merlion.ipynb
diff --git a/docs/topic/ml/index.md b/docs/topic/ml/index.md
index 8ab191f3..568bf387 100644
--- a/docs/topic/ml/index.md
+++ b/docs/topic/ml/index.md
@@ -58,74 +58,10 @@ generation (RAG), and other applications.
 ## Anomaly Detection and Forecasting
 
 
-(mlflow)=
 ### MLflow
-
-:::{rubric} About
-:::
-```{div}
-:style: "float: right; margin-left: 1em"
-[![](https://github.com/crate/crate-clients-tools/assets/453543/d1d4f4ac-1b44-46b8-ba6f-4a82607c57d3){w=180px}](https://mlflow.org/)
-```
-
-[MLflow] is an open source platform to manage the whole ML lifecycle, including
-experimentation, reproducibility, deployment, and a central model registry.
-
-The [MLflow adapter for CrateDB], available through the [mlflow-cratedb] package,
-provides support to use CrateDB as a storage database for the [MLflow Tracking]
-subsystem, which is about recording and querying experiments, across code, data,
-config, and results.
-
-```{div}
-:style: "clear: both"
-```
-
-:::{rubric} Learn
-:::
-Tutorials and Notebooks about using [MLflow] together with CrateDB.
-
-::::{info-card}
-:::{grid-item}
-:columns: 9
-**Blog: Running Time Series Models in Production using CrateDB**
-
-Part 1: Introduction to [Time Series Modeling using Machine Learning]
-
-The article will introduce you to the concept of time series modeling,
-discussing the main obstacles running it in production.
-It will introduce you to CrateDB, highlighting its key features and
-benefits, why it stands out in managing time series data, and why it is
-an especially good fit for supporting machine learning models in production.
-:::
-:::{grid-item}
-:columns: 3
-{tags-primary}`Fundamentals` \
-{tags-secondary}`Time Series Modeling`
-:::
-::::
-
-
-::::{info-card}
-:::{grid-item}
-:columns: 9
-**Notebook: Create a Time Series Anomaly Detection Model**
-
-Guidelines and runnable code to get started with MLflow and
-CrateDB, exercising time series anomaly detection and time series forecasting /
-prediction using NumPy, Salesforce Merlion, and Matplotlib.
-
-[![README](https://img.shields.io/badge/Open-README-darkblue?logo=GitHub)][MLflow and CrateDB]
-[![Notebook on GitHub](https://img.shields.io/badge/Open-Notebook%20on%20GitHub-darkgreen?logo=GitHub)][tracking-merlion-github]
-[![Notebook on Colab](https://img.shields.io/badge/Open-Notebook%20on%20Colab-blue?logo=Google%20Colab)][tracking-merlion-colab]
-:::
-:::{grid-item}
-:columns: 3
-{tags-primary}`Fundamentals` \
-{tags-secondary}`Time Series` \
-{tags-secondary}`Anomaly Detection` \
-{tags-secondary}`Prediction / Forecasting`
+:::{seealso}
+Please navigate to the dedicated page about {ref}`mlflow`.
 :::
-::::
 
 
 ### PyCaret
@@ -284,14 +220,6 @@ solution.
 [Machine Learning and CrateDB: An introduction]: https://cratedb.com/blog/machine-learning-and-cratedb-part-one
 [Machine Learning and CrateDB: Getting Started With Jupyter]: https://cratedb.com/blog/machine-learning-cratedb-jupyter
 [Machine Learning and CrateDB: Experiment Design & Linear Regression]: https://cratedb.com/blog/machine-learning-and-cratedb-part-three-experiment-design-and-linear-regression
-[MLflow]: https://mlflow.org/
-[MLflow adapter for CrateDB]: https://github.com/crate/mlflow-cratedb
-[MLflow and CrateDB]: https://github.com/crate/cratedb-examples/tree/main/topic/machine-learning/mlflow
-[mlflow-cratedb]: https://pypi.org/project/mlflow-cratedb/
-[MLflow Tracking]: https://mlflow.org/docs/latest/tracking.html
 [MLOps]: https://en.wikipedia.org/wiki/MLOps
 [pandas]: https://pandas.pydata.org/
 [scikit-learn]: https://scikit-learn.org/
-[Time Series Modeling using Machine Learning]: https://cratedb.com/blog/introduction-to-time-series-modeling-with-cratedb-machine-learning-time-series-data
-[tracking-merlion-colab]: https://colab.research.google.com/github/crate/cratedb-examples/blob/main/topic/machine-learning/mlflow/tracking_merlion.ipynb
-[tracking-merlion-github]: https://github.com/crate/cratedb-examples/blob/main/topic/machine-learning/mlflow/tracking_merlion.ipynb

From 499487ce114ac589b9b3e8cecd1d729aac39406a Mon Sep 17 00:00:00 2001
From: Andreas Motl
Date: Tue, 16 Sep 2025 01:51:06 +0200
Subject: [PATCH 2/2] MLflow: Implement suggestions by CodeRabbit

---
 docs/integrate/mlflow/index.md | 2 +-
 docs/topic/ml/index.md         | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/docs/integrate/mlflow/index.md b/docs/integrate/mlflow/index.md
index d8cb59df..72326a02 100644
--- a/docs/integrate/mlflow/index.md
+++ b/docs/integrate/mlflow/index.md
@@ -12,7 +12,7 @@
 :::{rubric} About
 :::
 
-[MLflow] is an open source platform to manage the whole ML lifecycle, including
+[MLflow] is an open-source platform to manage the whole ML lifecycle, including
 experimentation, reproducibility, deployment, and a central model registry.
 
 The [MLflow adapter for CrateDB], available through the [mlflow-cratedb] package
diff --git a/docs/topic/ml/index.md b/docs/topic/ml/index.md
index 568bf387..adfed8d0 100644
--- a/docs/topic/ml/index.md
+++ b/docs/topic/ml/index.md
@@ -59,6 +59,7 @@ generation (RAG), and other applications.
 
 
 ### MLflow
+Use MLflow with CrateDB for experiment tracking and model registry.
 :::{seealso}
 Please navigate to the dedicated page about {ref}`mlflow`.
 :::