
Free threading? #161

@tyralla


We are currently creating a model that covers a huge river basin (the Rhine) at a detailed spatial resolution (2.5 km). We want to calibrate it automatically "in one piece" based on Multiscale Parameter Regionalization (MPR), which will hardly be possible without some kind of parallelisation. Now that HydPy runs on Python 3.13, the first Python version that provides a (still experimental) free-threaded build, we asked ourselves whether this is already the right way to introduce parallelism into HydPy.

(Parallelising HydPy-MPR based on multi-threading or multi-processing might be simpler but would be a less general solution.)

For a start, I installed Python 3.13.1-t and tried to pip-install our requirements. Unfortunately, netCDF4 (version 1.7.2) does not seem to be ready:

      reading from setup.cfg...
          HDF5_DIR environment variable not set, checking some standard locations ..
...
        File "<string>", line 277, in <module>
        File "<string>", line 226, in _populate_hdf5_info
      ValueError: did not find HDF5 headers
      [end of output]

I could not find any information directly related to netCDF4. However, I found this issue related to h5py, so there generally seems to be some movement. I also tried to install h5netcdf (version 1.4.1), an alternative to netCDF4 that we already considered when starting to add NetCDF features to HydPy, but this did not work either:

      running build_ext
      Loading library to get build settings and version: hdf5.dll
      error: Unable to load dependency HDF5, ensure HDF5 is installed properly
      on sys.platform='win32' with platform.machine()='AMD64'
      Library dirs checked: []
      error: Could not find module 'hdf5.dll' (or one of its dependencies). Try using the full path with constructor syntax.

So, at least out of the box, Python 3.13-t and NetCDF support do not currently go hand in hand.
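To repeat this check quickly after future releases, a small probe script can try to import each NetCDF-related package under the interpreter in question. This is only a sketch: the function name `probe_imports` is made up for illustration, and a successful import says nothing about whether the package is actually safe to use without the GIL.

```python
import importlib


def probe_imports(module_names):
    """Try to import each module and report "ok" or the error raised."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as exc:  # missing packages and broken builds alike
            results[name] = f"{type(exc).__name__}: {exc}"
        else:
            results[name] = "ok"
    return results


if __name__ == "__main__":
    # The packages discussed above; extend the list as needed.
    for name, status in probe_imports(["netCDF4", "h5netcdf", "h5py"]).items():
        print(f"{name}: {status}")
```

Running the script with the free-threaded interpreter and with the regular one side by side shows which packages are only missing for the free-threaded build.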

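In the next step, I removed netCDF4 from our requirements. The installation then failed due to problems with building the 'msgpack._cmsgpack' extension. Strangely, Visual Studio reports some severe errors, including syntax errors (the compiler output is in German: "Syntaxfehler" means "syntax error" and "nichtdeklarierter Bezeichner" means "undeclared identifier"):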

      msgpack/_cmsgpack.c(2343): error C2146: Syntaxfehler: Fehlendes ")" vor Bezeichner "vc"
      msgpack/_cmsgpack.c(2343): error C2081: "__pyx_vectorcallfunc": Name in der formalen Parameterliste ist ungültig
      msgpack/_cmsgpack.c(2343): error C2061: Syntaxfehler: Bezeichner "vc"
      msgpack/_cmsgpack.c(2343): error C2059: Syntaxfehler: ";"
      msgpack/_cmsgpack.c(2343): error C2059: Syntaxfehler: ","
      msgpack/_cmsgpack.c(2343): error C2059: Syntaxfehler: ")"
      msgpack/_cmsgpack.c(6941): warning C4996: 'Py_OptimizeFlag': deprecated in 3.12
      msgpack/_cmsgpack.c(8188): warning C4244: "Funktion": Konvertierung von "Py_ssize_t" in "unsigned int", möglicher Datenverlust
      msgpack/_cmsgpack.c(8343): warning C4244: "Funktion": Konvertierung von "long" in "char", möglicher Datenverlust
      msgpack/_cmsgpack.c(8493): warning C4244: "Funktion": Konvertierung von "Py_ssize_t" in "unsigned int", möglicher Datenverlust
      msgpack/_cmsgpack.c(9882): warning C4244: "Funktion": Konvertierung von "__int64" in "unsigned int", möglicher Datenverlust
      msgpack/_cmsgpack.c(10124): warning C4244: "Funktion": Konvertierung von "__int64" in "unsigned int", möglicher Datenverlust
      msgpack/_cmsgpack.c(10386): warning C4244: "Funktion": Konvertierung von "Py_ssize_t" in "unsigned int", möglicher Datenverlust
      msgpack/_cmsgpack.c(21391): error C2146: Syntaxfehler: Fehlendes ")" vor Bezeichner "vc"
      msgpack/_cmsgpack.c(21391): error C2081: "__pyx_vectorcallfunc": Name in der formalen Parameterliste ist ungültig
      msgpack/_cmsgpack.c(21391): error C2061: Syntaxfehler: Bezeichner "vc"
      msgpack/_cmsgpack.c(21391): error C2059: Syntaxfehler: ";"
      msgpack/_cmsgpack.c(21391): error C2059: Syntaxfehler: ","
      msgpack/_cmsgpack.c(21391): error C2059: Syntaxfehler: ")"
      msgpack/_cmsgpack.c(21436): error C2146: Syntaxfehler: Fehlendes ")" vor Bezeichner "vc"
      msgpack/_cmsgpack.c(21436): error C2081: "__pyx_vectorcallfunc": Name in der formalen Parameterliste ist ungültig
      msgpack/_cmsgpack.c(21436): error C2061: Syntaxfehler: Bezeichner "vc"
      msgpack/_cmsgpack.c(21436): error C2059: Syntaxfehler: ";"
      msgpack/_cmsgpack.c(21436): error C2059: Syntaxfehler: ","
      msgpack/_cmsgpack.c(21436): error C2059: Syntaxfehler: ")"
      msgpack/_cmsgpack.c(22125): error C2065: "__pyx_vectorcallfunc": nichtdeklarierter Bezeichner
      msgpack/_cmsgpack.c(22125): error C2146: Syntaxfehler: Fehlendes ";" vor Bezeichner "vc"
      msgpack/_cmsgpack.c(22125): error C2065: "vc": nichtdeklarierter Bezeichner
      msgpack/_cmsgpack.c(22125): warning C4047: "=": Anzahl der Dereferenzierungen bei "int" und "vectorcallfunc" unterschiedlich
      msgpack/_cmsgpack.c(22126): error C2065: "vc": nichtdeklarierter Bezeichner
      msgpack/_cmsgpack.c(22128): warning C4013: "__Pyx_PyVectorcall_FastCallDict" undefiniert; Annahme: extern mit Rückgabetyp int
      msgpack/_cmsgpack.c(22128): error C2065: "vc": nichtdeklarierter Bezeichner
      msgpack/_cmsgpack.c(22128): warning C4047: "return": Anzahl der Dereferenzierungen bei "PyObject *" und "int" unterschiedlich

I did not even know what msgpack is. According to johnnydep, it is a dependency of cachecontrol, which in turn is a dependency of lastversion. However, lastversion is "only" a helper tool for making the Installer for Windows, so it is irrelevant for a possible parallelisation of a separate version of HydPy. If I also remove lastversion from requirements.txt, pip install finishes successfully.

Here is a tracking issue that documents the free-threading support of some critical Python packages. NumPy should be ready since version 2.1 (so there is no compatibility with ArcPy environments, but this is not overly important). For matplotlib and pandas, there are also proper releases on PyPI. For Cython and SciPy, however, there are not. According to this issue, one would have to work with Cython's master branch. The situation may be similar for SciPy; here is the preliminary documentation on this topic.

SciPy should not concern us too much for now because we do not use its functionality during simulation runs (only, for example, when preparing them). Cython is definitely important, so I started a build with Cython 3.0.11. python prepare_build.py finished successfully, but python -m build fails with errors similar to those for msgpack:

      hydpy\cythons\autogen\annutils.c(2765): error C2146: Syntaxfehler: Fehlendes ")" vor Bezeichner "vc"
      hydpy\cythons\autogen\annutils.c(2765): error C2081: "__pyx_vectorcallfunc": Name in der formalen Parameterliste ist ungültig
      hydpy\cythons\autogen\annutils.c(2765): error C2061: Syntaxfehler: Bezeichner "vc"
      hydpy\cythons\autogen\annutils.c(2765): error C2059: Syntaxfehler: ";"
      hydpy\cythons\autogen\annutils.c(2765): error C2059: Syntaxfehler: ","
      hydpy\cythons\autogen\annutils.c(2765): error C2059: Syntaxfehler: ")"
      hydpy\cythons\autogen\annutils.c(7612): warning C4996: 'Py_OptimizeFlag': deprecated in 3.12
      hydpy\cythons\autogen\annutils.c(12342): warning C4996: 'Py_OptimizeFlag': deprecated in 3.12
      hydpy\cythons\autogen\annutils.c(29791): error C2146: Syntaxfehler: Fehlendes ")" vor Bezeichner "vc"
      hydpy\cythons\autogen\annutils.c(29791): error C2081: "__pyx_vectorcallfunc": Name in der formalen Parameterliste ist ungültig
      hydpy\cythons\autogen\annutils.c(29791): error C2061: Syntaxfehler: Bezeichner "vc"
      hydpy\cythons\autogen\annutils.c(29791): error C2059: Syntaxfehler: ";"
      hydpy\cythons\autogen\annutils.c(29791): error C2059: Syntaxfehler: ","
      hydpy\cythons\autogen\annutils.c(29791): error C2059: Syntaxfehler: ")"
      hydpy\cythons\autogen\annutils.c(29836): error C2146: Syntaxfehler: Fehlendes ")" vor Bezeichner "vc"
      hydpy\cythons\autogen\annutils.c(29836): error C2081: "__pyx_vectorcallfunc": Name in der formalen Parameterliste ist ungültig
      hydpy\cythons\autogen\annutils.c(29836): error C2061: Syntaxfehler: Bezeichner "vc"
      hydpy\cythons\autogen\annutils.c(29836): error C2059: Syntaxfehler: ";"
      hydpy\cythons\autogen\annutils.c(29836): error C2059: Syntaxfehler: ","
      hydpy\cythons\autogen\annutils.c(29836): error C2059: Syntaxfehler: ")"
      hydpy\cythons\autogen\annutils.c(30525): error C2065: "__pyx_vectorcallfunc": nichtdeklarierter Bezeichner
      hydpy\cythons\autogen\annutils.c(30525): error C2146: Syntaxfehler: Fehlendes ";" vor Bezeichner "vc"
      hydpy\cythons\autogen\annutils.c(30525): error C2065: "vc": nichtdeklarierter Bezeichner
      hydpy\cythons\autogen\annutils.c(30525): warning C4047: "=": Anzahl der Dereferenzierungen bei "int" und "vectorcallfunc" unterschiedlich
      hydpy\cythons\autogen\annutils.c(30526): error C2065: "vc": nichtdeklarierter Bezeichner
      hydpy\cythons\autogen\annutils.c(30528): warning C4013: "__Pyx_PyVectorcall_FastCallDict" undefiniert; Annahme: extern mit Rückgabetyp int
      hydpy\cythons\autogen\annutils.c(30528): error C2065: "vc": nichtdeklarierter Bezeichner
      hydpy\cythons\autogen\annutils.c(30528): warning C4047: "return": Anzahl der Dereferenzierungen bei "PyObject *" und "int" unterschiedlich

So, the next necessary step seems to be installing Cython from its master branch, which, according to these instructions, is easy. Running python prepare_build.py with the current Cython version, 3.1.0a1, succeeds but emits the following warning:

<frozen importlib._bootstrap>:488: RuntimeWarning: The global interpreter lock (GIL) has been enabled to load module 'Cython.Utils', which has not declared that it can run safely without the GIL. To override this behaviour and keep the GIL disabled (at your own risk), run with PYTHON_GIL=0 or -Xgil=0.

python -m build still fails. According to this comment, the reason is that we have to build NumPy with the previously built Cython version. NumPy's documentation indicates that this requires more effort.
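The RuntimeWarning quoted above shows that importing a single unprepared extension module is enough to switch the GIL back on. Whether that has happened can be checked at runtime. The following is a minimal sketch: `gil_status` is a made-up helper name, `sys._is_gil_enabled()` is only available from Python 3.13 on, and I assume that `sysconfig.get_config_var("Py_GIL_DISABLED")` is also populated on Windows.

```python
import sys
import sysconfig


def gil_status():
    """Describe the build type and whether the GIL is currently active."""
    # Truthy only for free-threaded builds (e.g. python3.13t); None or 0 otherwise.
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # Older Python versions lack the function and always run with the GIL.
    check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = bool(check()) if check is not None else True
    return {"free_threaded_build": free_threaded_build, "gil_enabled": gil_enabled}


if __name__ == "__main__":
    print(gil_status())
```

Calling the function once directly after interpreter start and once after importing the compiled extension modules would show which import re-enables the GIL.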

I will stop here. Getting everything running and then maintaining it in a fast-changing environment appears to be too much work, so we should wait at least for an official release of Cython 3.1 and a matching NumPy release. However, implementing HydPy-internal parallelisation in a different way now, when free threading seems within reach, does not seem right either. So, maybe we should postpone it and "just" try to parallelise HydPy-MPR for now.

@BGWKlein @GernotBelger
