[BUG] Incompatible Package Versions #962

@lumeilevel

Describe the bug
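The default versions of datasets and pyarrow appear to be incompatible, which causes problems during finetuning. A fresh install following README.md resolves datasets==2.14.6 (pinned by lmflow==1.0.0) together with pyarrow 21.0.0, because datasets 2.14.6 only requires pyarrow>=8.0.0 and places no upper bound on it.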
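To confirm which versions of the two packages named above ended up in an environment, a small check along these lines may help. It is only a sketch: the helper names `installed_versions` and `major` are made up for illustration and are not part of LMFlow, and it relies solely on the standard-library `importlib.metadata` module.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment.
            found[name] = None
    return found


def major(version_string):
    """Return the leading integer of a version string such as '21.0.0'."""
    return int(version_string.split(".")[0])


if __name__ == "__main__":
    # Hypothetical helper usage: report the two packages this issue is about.
    for name, ver in installed_versions(["datasets", "pyarrow"]).items():
        print(f"{name}: {ver if ver is not None else 'not installed'}")
```

Run inside the testlmflow environment, this should report the same versions that `pip show datasets` and `pip show pyarrow` print; `major` is only a convenience for comparing against a chosen pyarrow major version.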

To Reproduce

  1. Basic info:
(testlmflow) exouser@proj-signsgd:~/test/LMFlow$ cat /etc/os-release && uname -r
PRETTY_NAME="Ubuntu 22.04.5 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.5 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy
6.8.0-90-generic
(testlmflow) exouser@proj-signsgd:~/test/LMFlow$ python --version
Python 3.9.23
(testlmflow) exouser@proj-signsgd:~/test/LMFlow$ nvidia-smi
Thu Jan 15 03:46:52 2026
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.95.05              Driver Version: 580.95.05      CUDA Version: 13.0     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A100-SXM4-40GB          On  |   00000000:04:00.0 Off |                    0 |
| N/A   25C    P0             47W /  400W |       0MiB /  40960MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
  2. Set up the environment according to README.md.
(base) exouser@proj-signsgd:~/test$ git clone -b v1.0.0 https://github.com/OptimalScale/LMFlow.git
Cloning into 'LMFlow'...
remote: Enumerating objects: 19864, done.
remote: Counting objects: 100% (411/411), done.
remote: Compressing objects: 100% (140/140), done.
remote: Total 19864 (delta 308), reused 277 (delta 271), pack-reused 19453 (from 2)
Receiving objects: 100% (19864/19864), 57.10 MiB | 31.18 MiB/s, done.
Resolving deltas: 100% (11180/11180), done.
Note: switching to 'a6b97aa3a2efbfa16c71c9628ad2506839bf4bf9'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:

  git switch -c <new-branch-name>

Or undo this operation with:

  git switch -

Turn off this advice by setting config variable advice.detachedHead to false

(base) exouser@proj-signsgd:~/test$ cd LMFlow
(base) exouser@proj-signsgd:~/test/LMFlow$ conda create -n testlmflow python=3.9 -y
Retrieving notices: done
Channels:
 - conda-forge
Platform: linux-64
Collecting package metadata (repodata.json): done
Solving environment: done


==> WARNING: A newer version of conda exists. <==
    current version: 25.11.0
    latest version: 25.11.1

Please update conda by running

    $ conda update -n base -c conda-forge conda



## Package Plan ##

  environment location: /home/exouser/miniforge3/envs/testlmflow

  added / updated specs:
    - python=3.9


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    icu-78.2                   |       h33c6efd_0        12.1 MB  conda-forge
    libsqlite-3.51.2           |       hf4e2dac_0         921 KB  conda-forge
    ------------------------------------------------------------
                                           Total:        13.0 MB

The following NEW packages will be INSTALLED:

  _libgcc_mutex      conda-forge/linux-64::_libgcc_mutex-0.1-conda_forge
  _openmp_mutex      conda-forge/linux-64::_openmp_mutex-4.5-2_gnu
  bzip2              conda-forge/linux-64::bzip2-1.0.8-hda65f42_8
  ca-certificates    conda-forge/noarch::ca-certificates-2026.1.4-hbd8a1cb_0
  icu                conda-forge/linux-64::icu-78.2-h33c6efd_0
  ld_impl_linux-64   conda-forge/linux-64::ld_impl_linux-64-2.45-default_hbd61a6d_105
  libexpat           conda-forge/linux-64::libexpat-2.7.3-hecca717_0
  libffi             conda-forge/linux-64::libffi-3.5.2-h9ec8514_0
  libgcc             conda-forge/linux-64::libgcc-15.2.0-he0feb66_16
  libgcc-ng          conda-forge/linux-64::libgcc-ng-15.2.0-h69a702a_16
  libgomp            conda-forge/linux-64::libgomp-15.2.0-he0feb66_16
  liblzma            conda-forge/linux-64::liblzma-5.8.1-hb9d3cd8_2
  libnsl             conda-forge/linux-64::libnsl-2.0.1-hb9d3cd8_1
  libsqlite          conda-forge/linux-64::libsqlite-3.51.2-hf4e2dac_0
  libstdcxx          conda-forge/linux-64::libstdcxx-15.2.0-h934c35e_16
  libuuid            conda-forge/linux-64::libuuid-2.41.3-h5347b49_0
  libxcrypt          conda-forge/linux-64::libxcrypt-4.4.36-hd590300_1
  libzlib            conda-forge/linux-64::libzlib-1.3.1-hb9d3cd8_2
  ncurses            conda-forge/linux-64::ncurses-6.5-h2d0b736_3
  openssl            conda-forge/linux-64::openssl-3.6.0-h26f9b46_0
  pip                conda-forge/noarch::pip-25.2-pyh8b19718_0
  python             conda-forge/linux-64::python-3.9.23-hc30ae73_0_cpython
  readline           conda-forge/linux-64::readline-8.3-h853b02a_0
  setuptools         conda-forge/noarch::setuptools-80.9.0-pyhff2d567_0
  tk                 conda-forge/linux-64::tk-8.6.13-noxft_ha0e22de_103
  tzdata             conda-forge/noarch::tzdata-2025c-hc9c84f9_1
  wheel              conda-forge/noarch::wheel-0.45.1-pyhd8ed1ab_1
  zstd               conda-forge/linux-64::zstd-1.5.7-hb78ec9c_6



Downloading and Extracting Packages:

Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
#     $ conda activate testlmflow
#
# To deactivate an active environment, use
#
#     $ conda deactivate

(base) exouser@proj-signsgd:~/test/LMFlow$ conda activate testlmflow
(testlmflow) exouser@proj-signsgd:~/test/LMFlow$ conda install mpi4py
Channels:
 - conda-forge
Platform: linux-64
Collecting package metadata (repodata.json): done
Solving environment: done


==> WARNING: A newer version of conda exists. <==
    current version: 25.11.0
    latest version: 25.11.1

Please update conda by running

    $ conda update -n base -c conda-forge conda



## Package Plan ##

  environment location: /home/exouser/miniforge3/envs/testlmflow

  added / updated specs:
    - mpi4py


The following NEW packages will be INSTALLED:

  attr               conda-forge/linux-64::attr-2.5.2-h39aace5_0
  libcap             conda-forge/linux-64::libcap-2.77-h3ff7636_0
  libevent           conda-forge/linux-64::libevent-2.1.12-hf998b51_1
  libfabric          conda-forge/linux-64::libfabric-2.4.0-ha770c72_0
  libfabric1         conda-forge/linux-64::libfabric1-2.4.0-h6c8fc0a_0
  libgfortran        conda-forge/linux-64::libgfortran-15.2.0-h69a702a_16
  libgfortran5       conda-forge/linux-64::libgfortran5-15.2.0-h68bc16d_16
  libhwloc           conda-forge/linux-64::libhwloc-2.12.1-default_hafda6a7_1003
  libiconv           conda-forge/linux-64::libiconv-1.18-h3b78370_2
  libnl              conda-forge/linux-64::libnl-3.11.0-hb9d3cd8_0
  libpmix            conda-forge/linux-64::libpmix-5.0.8-h4bd6b51_2
  libsystemd0        conda-forge/linux-64::libsystemd0-258.3-h6569c3e_0
  libudev1           conda-forge/linux-64::libudev1-258.3-h6569c3e_0
  libxml2            conda-forge/linux-64::libxml2-2.15.1-he237659_1
  libxml2-16         conda-forge/linux-64::libxml2-16-2.15.1-hca6bf5a_1
  mpi                conda-forge/noarch::mpi-1.0.1-openmpi
  mpi4py             conda-forge/linux-64::mpi4py-4.1.0-py39h62d117e_101
  openmpi            conda-forge/linux-64::openmpi-5.0.8-h2fe1745_110
  python_abi         conda-forge/noarch::python_abi-3.9-8_cp39
  rdma-core          conda-forge/linux-64::rdma-core-60.0-hecca717_0
  ucc                conda-forge/linux-64::ucc-1.6.0-hb729f83_1
  ucx                conda-forge/linux-64::ucx-1.19.1-h567e125_0


Proceed ([y]/n)? y


Downloading and Extracting Packages:

Preparing transaction: done
Verifying transaction: done
Executing transaction: -
To enable CUDA support, UCX requires the CUDA Runtime library (libcudart).
The library can be installed with the appropriate command below:

* For CUDA 12, run:    conda install cuda-cudart cuda-version=12
* For CUDA 13, run:    conda install cuda-cudart cuda-version=13

If any of the packages you requested use CUDA then CUDA should already
have been installed for you.


|
To enable CUDA support, please follow UCX's instruction above.

To additionally enable NCCL support, run:    conda install nccl


/
On Linux, Open MPI is built with CUDA awareness but it is disabled by default.
To enable it, please set the environment variable
OMPI_MCA_opal_cuda_support=true
before launching your MPI processes.
Equivalently, you can set the MCA parameter in the command line:
mpiexec --mca opal_cuda_support 1 ...
Note that you might also need to set UCX_MEMTYPE_CACHE=n for CUDA awareness via
UCX. Please consult UCX documentation for further details.


done
(testlmflow) exouser@proj-signsgd:~/test/LMFlow$ pip install -e .
Obtaining file:///home/exouser/test/LMFlow
  Installing build dependencies ... done
  Checking if build backend supports build_editable ... done
  Getting requirements to build editable ... done
  Preparing editable metadata (pyproject.toml) ... done
Collecting packaging (from lmflow==1.0.0)
  Using cached packaging-25.0-py3-none-any.whl.metadata (3.3 kB)
Collecting numpy (from lmflow==1.0.0)
  Using cached numpy-2.0.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (60 kB)
Collecting datasets==2.14.6 (from lmflow==1.0.0)
  Using cached datasets-2.14.6-py3-none-any.whl.metadata (19 kB)
Collecting tokenizers>=0.13.3 (from lmflow==1.0.0)
  Using cached tokenizers-0.22.2-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (7.3 kB)
Collecting peft>=0.10.0 (from lmflow==1.0.0)
  Using cached peft-0.17.1-py3-none-any.whl.metadata (14 kB)
Collecting torch>=2.0.1 (from lmflow==1.0.0)
  Using cached torch-2.8.0-cp39-cp39-manylinux_2_28_x86_64.whl.metadata (30 kB)
Collecting wandb (from lmflow==1.0.0)
  Downloading wandb-0.24.0-py3-none-manylinux_2_28_x86_64.whl.metadata (12 kB)
Collecting sentencepiece (from lmflow==1.0.0)
  Using cached sentencepiece-0.2.1-cp39-cp39-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.metadata (10 kB)
Collecting transformers>=4.31.0 (from lmflow==1.0.0)
  Downloading transformers-4.57.5-py3-none-any.whl.metadata (43 kB)
Collecting cpm_kernels==1.0.11 (from lmflow==1.0.0)
  Using cached cpm_kernels-1.0.11-py3-none-any.whl.metadata (1.2 kB)
Collecting evaluate==0.4.0 (from lmflow==1.0.0)
  Using cached evaluate-0.4.0-py3-none-any.whl.metadata (9.4 kB)
Collecting bitsandbytes>=0.40.0 (from lmflow==1.0.0)
  Using cached bitsandbytes-0.48.2-py3-none-manylinux_2_24_x86_64.whl.metadata (10 kB)
Collecting pydantic (from lmflow==1.0.0)
  Using cached pydantic-2.12.5-py3-none-any.whl.metadata (90 kB)
Collecting accelerate>=0.27.2 (from lmflow==1.0.0)
  Using cached accelerate-1.10.1-py3-none-any.whl.metadata (19 kB)
Collecting einops>=0.6.1 (from lmflow==1.0.0)
  Using cached einops-0.8.1-py3-none-any.whl.metadata (13 kB)
Collecting pyarrow>=8.0.0 (from datasets==2.14.6->lmflow==1.0.0)
  Using cached pyarrow-21.0.0-cp39-cp39-manylinux_2_28_x86_64.whl.metadata (3.3 kB)
Collecting dill<0.3.8,>=0.3.0 (from datasets==2.14.6->lmflow==1.0.0)
  Using cached dill-0.3.7-py3-none-any.whl.metadata (9.9 kB)
Collecting pandas (from datasets==2.14.6->lmflow==1.0.0)
  Using cached pandas-2.3.3-cp39-cp39-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl.metadata (91 kB)
Collecting requests>=2.19.0 (from datasets==2.14.6->lmflow==1.0.0)
  Using cached requests-2.32.5-py3-none-any.whl.metadata (4.9 kB)
Collecting tqdm>=4.62.1 (from datasets==2.14.6->lmflow==1.0.0)
  Using cached tqdm-4.67.1-py3-none-any.whl.metadata (57 kB)
Collecting xxhash (from datasets==2.14.6->lmflow==1.0.0)
  Using cached xxhash-3.6.0-cp39-cp39-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl.metadata (13 kB)
Collecting multiprocess (from datasets==2.14.6->lmflow==1.0.0)
  Using cached multiprocess-0.70.18-py39-none-any.whl.metadata (7.5 kB)
Collecting fsspec<=2023.10.0,>=2023.1.0 (from fsspec[http]<=2023.10.0,>=2023.1.0->datasets==2.14.6->lmflow==1.0.0)
  Using cached fsspec-2023.10.0-py3-none-any.whl.metadata (6.8 kB)
Collecting aiohttp (from datasets==2.14.6->lmflow==1.0.0)
  Using cached aiohttp-3.13.3-cp39-cp39-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl.metadata (8.1 kB)
Collecting huggingface-hub<1.0.0,>=0.14.0 (from datasets==2.14.6->lmflow==1.0.0)
  Using cached huggingface_hub-0.36.0-py3-none-any.whl.metadata (14 kB)
Collecting pyyaml>=5.1 (from datasets==2.14.6->lmflow==1.0.0)
  Using cached pyyaml-6.0.3-cp39-cp39-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl.metadata (2.4 kB)
Collecting responses<0.19 (from evaluate==0.4.0->lmflow==1.0.0)
  Using cached responses-0.18.0-py3-none-any.whl.metadata (29 kB)
Collecting filelock (from huggingface-hub<1.0.0,>=0.14.0->datasets==2.14.6->lmflow==1.0.0)
  Using cached filelock-3.19.1-py3-none-any.whl.metadata (2.1 kB)
Collecting typing-extensions>=3.7.4.3 (from huggingface-hub<1.0.0,>=0.14.0->datasets==2.14.6->lmflow==1.0.0)
  Using cached typing_extensions-4.15.0-py3-none-any.whl.metadata (3.3 kB)
Collecting hf-xet<2.0.0,>=1.1.3 (from huggingface-hub<1.0.0,>=0.14.0->datasets==2.14.6->lmflow==1.0.0)
  Using cached hf_xet-1.2.0-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (4.9 kB)
Collecting urllib3>=1.25.10 (from responses<0.19->evaluate==0.4.0->lmflow==1.0.0)
  Downloading urllib3-2.6.3-py3-none-any.whl.metadata (6.9 kB)
Collecting charset_normalizer<4,>=2 (from requests>=2.19.0->datasets==2.14.6->lmflow==1.0.0)
  Using cached charset_normalizer-3.4.4-cp39-cp39-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl.metadata (37 kB)
Collecting idna<4,>=2.5 (from requests>=2.19.0->datasets==2.14.6->lmflow==1.0.0)
  Using cached idna-3.11-py3-none-any.whl.metadata (8.4 kB)
Collecting certifi>=2017.4.17 (from requests>=2.19.0->datasets==2.14.6->lmflow==1.0.0)
  Using cached certifi-2026.1.4-py3-none-any.whl.metadata (2.5 kB)
Collecting psutil (from accelerate>=0.27.2->lmflow==1.0.0)
  Using cached psutil-7.2.1-cp36-abi3-manylinux2010_x86_64.manylinux_2_12_x86_64.manylinux_2_28_x86_64.whl.metadata (22 kB)
Collecting safetensors>=0.4.3 (from accelerate>=0.27.2->lmflow==1.0.0)
  Using cached safetensors-0.7.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (4.1 kB)
Collecting aiohappyeyeballs>=2.5.0 (from aiohttp->datasets==2.14.6->lmflow==1.0.0)
  Using cached aiohappyeyeballs-2.6.1-py3-none-any.whl.metadata (5.9 kB)
Collecting aiosignal>=1.4.0 (from aiohttp->datasets==2.14.6->lmflow==1.0.0)
  Using cached aiosignal-1.4.0-py3-none-any.whl.metadata (3.7 kB)
Collecting async-timeout<6.0,>=4.0 (from aiohttp->datasets==2.14.6->lmflow==1.0.0)
  Using cached async_timeout-5.0.1-py3-none-any.whl.metadata (5.1 kB)
Collecting attrs>=17.3.0 (from aiohttp->datasets==2.14.6->lmflow==1.0.0)
  Using cached attrs-25.4.0-py3-none-any.whl.metadata (10 kB)
Collecting frozenlist>=1.1.1 (from aiohttp->datasets==2.14.6->lmflow==1.0.0)
  Using cached frozenlist-1.8.0-cp39-cp39-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl.metadata (20 kB)
Collecting multidict<7.0,>=4.5 (from aiohttp->datasets==2.14.6->lmflow==1.0.0)
  Using cached multidict-6.7.0-cp39-cp39-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl.metadata (5.3 kB)
Collecting propcache>=0.2.0 (from aiohttp->datasets==2.14.6->lmflow==1.0.0)
  Using cached propcache-0.4.1-cp39-cp39-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl.metadata (13 kB)
Collecting yarl<2.0,>=1.17.0 (from aiohttp->datasets==2.14.6->lmflow==1.0.0)
  Using cached yarl-1.22.0-cp39-cp39-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl.metadata (75 kB)
Collecting sympy>=1.13.3 (from torch>=2.0.1->lmflow==1.0.0)
  Using cached sympy-1.14.0-py3-none-any.whl.metadata (12 kB)
Collecting networkx (from torch>=2.0.1->lmflow==1.0.0)
  Using cached networkx-3.2.1-py3-none-any.whl.metadata (5.2 kB)
Collecting jinja2 (from torch>=2.0.1->lmflow==1.0.0)
  Using cached jinja2-3.1.6-py3-none-any.whl.metadata (2.9 kB)
Collecting nvidia-cuda-nvrtc-cu12==12.8.93 (from torch>=2.0.1->lmflow==1.0.0)
  Using cached nvidia_cuda_nvrtc_cu12-12.8.93-py3-none-manylinux2010_x86_64.manylinux_2_12_x86_64.whl.metadata (1.7 kB)
Collecting nvidia-cuda-runtime-cu12==12.8.90 (from torch>=2.0.1->lmflow==1.0.0)
  Using cached nvidia_cuda_runtime_cu12-12.8.90-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (1.7 kB)
Collecting nvidia-cuda-cupti-cu12==12.8.90 (from torch>=2.0.1->lmflow==1.0.0)
  Using cached nvidia_cuda_cupti_cu12-12.8.90-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (1.7 kB)
Collecting nvidia-cudnn-cu12==9.10.2.21 (from torch>=2.0.1->lmflow==1.0.0)
  Using cached nvidia_cudnn_cu12-9.10.2.21-py3-none-manylinux_2_27_x86_64.whl.metadata (1.8 kB)
Collecting nvidia-cublas-cu12==12.8.4.1 (from torch>=2.0.1->lmflow==1.0.0)
  Using cached nvidia_cublas_cu12-12.8.4.1-py3-none-manylinux_2_27_x86_64.whl.metadata (1.7 kB)
Collecting nvidia-cufft-cu12==11.3.3.83 (from torch>=2.0.1->lmflow==1.0.0)
  Using cached nvidia_cufft_cu12-11.3.3.83-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (1.7 kB)
Collecting nvidia-curand-cu12==10.3.9.90 (from torch>=2.0.1->lmflow==1.0.0)
  Using cached nvidia_curand_cu12-10.3.9.90-py3-none-manylinux_2_27_x86_64.whl.metadata (1.7 kB)
Collecting nvidia-cusolver-cu12==11.7.3.90 (from torch>=2.0.1->lmflow==1.0.0)
  Using cached nvidia_cusolver_cu12-11.7.3.90-py3-none-manylinux_2_27_x86_64.whl.metadata (1.8 kB)
Collecting nvidia-cusparse-cu12==12.5.8.93 (from torch>=2.0.1->lmflow==1.0.0)
  Using cached nvidia_cusparse_cu12-12.5.8.93-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (1.8 kB)
Collecting nvidia-cusparselt-cu12==0.7.1 (from torch>=2.0.1->lmflow==1.0.0)
  Using cached nvidia_cusparselt_cu12-0.7.1-py3-none-manylinux2014_x86_64.whl.metadata (7.0 kB)
Collecting nvidia-nccl-cu12==2.27.3 (from torch>=2.0.1->lmflow==1.0.0)
  Using cached nvidia_nccl_cu12-2.27.3-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (2.0 kB)
Collecting nvidia-nvtx-cu12==12.8.90 (from torch>=2.0.1->lmflow==1.0.0)
  Using cached nvidia_nvtx_cu12-12.8.90-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (1.8 kB)
Collecting nvidia-nvjitlink-cu12==12.8.93 (from torch>=2.0.1->lmflow==1.0.0)
  Using cached nvidia_nvjitlink_cu12-12.8.93-py3-none-manylinux2010_x86_64.manylinux_2_12_x86_64.whl.metadata (1.7 kB)
Collecting nvidia-cufile-cu12==1.13.1.3 (from torch>=2.0.1->lmflow==1.0.0)
  Using cached nvidia_cufile_cu12-1.13.1.3-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (1.7 kB)
Collecting triton==3.4.0 (from torch>=2.0.1->lmflow==1.0.0)
  Using cached triton-3.4.0-cp39-cp39-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.metadata (1.7 kB)
Requirement already satisfied: setuptools>=40.8.0 in /home/exouser/miniforge3/envs/testlmflow/lib/python3.9/site-packages (from triton==3.4.0->torch>=2.0.1->lmflow==1.0.0) (80.9.0)
Collecting importlib-metadata (from triton==3.4.0->torch>=2.0.1->lmflow==1.0.0)
  Using cached importlib_metadata-8.7.1-py3-none-any.whl.metadata (4.7 kB)
Collecting mpmath<1.4,>=1.1.0 (from sympy>=1.13.3->torch>=2.0.1->lmflow==1.0.0)
  Using cached mpmath-1.3.0-py3-none-any.whl.metadata (8.6 kB)
Collecting regex!=2019.12.17 (from transformers>=4.31.0->lmflow==1.0.0)
  Downloading regex-2026.1.15-cp39-cp39-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl.metadata (40 kB)
Collecting zipp>=3.20 (from importlib-metadata->triton==3.4.0->torch>=2.0.1->lmflow==1.0.0)
  Using cached zipp-3.23.0-py3-none-any.whl.metadata (3.6 kB)
Collecting MarkupSafe>=2.0 (from jinja2->torch>=2.0.1->lmflow==1.0.0)
  Using cached markupsafe-3.0.3-cp39-cp39-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl.metadata (2.7 kB)
INFO: pip is looking at multiple versions of multiprocess to determine which version is compatible with other requirements. This could take a while.
Collecting multiprocess (from datasets==2.14.6->lmflow==1.0.0)
  Using cached multiprocess-0.70.17-py39-none-any.whl.metadata (7.2 kB)
  Using cached multiprocess-0.70.16-py39-none-any.whl.metadata (7.2 kB)
  Using cached multiprocess-0.70.15-py39-none-any.whl.metadata (7.2 kB)
Collecting python-dateutil>=2.8.2 (from pandas->datasets==2.14.6->lmflow==1.0.0)
  Using cached python_dateutil-2.9.0.post0-py2.py3-none-any.whl.metadata (8.4 kB)
Collecting pytz>=2020.1 (from pandas->datasets==2.14.6->lmflow==1.0.0)
  Using cached pytz-2025.2-py2.py3-none-any.whl.metadata (22 kB)
Collecting tzdata>=2022.7 (from pandas->datasets==2.14.6->lmflow==1.0.0)
  Using cached tzdata-2025.3-py2.py3-none-any.whl.metadata (1.4 kB)
Collecting six>=1.5 (from python-dateutil>=2.8.2->pandas->datasets==2.14.6->lmflow==1.0.0)
  Using cached six-1.17.0-py2.py3-none-any.whl.metadata (1.7 kB)
Collecting annotated-types>=0.6.0 (from pydantic->lmflow==1.0.0)
  Using cached annotated_types-0.7.0-py3-none-any.whl.metadata (15 kB)
Collecting pydantic-core==2.41.5 (from pydantic->lmflow==1.0.0)
  Using cached pydantic_core-2.41.5-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (7.3 kB)
Collecting typing-inspection>=0.4.2 (from pydantic->lmflow==1.0.0)
  Using cached typing_inspection-0.4.2-py3-none-any.whl.metadata (2.6 kB)
Collecting click>=8.0.1 (from wandb->lmflow==1.0.0)
  Using cached click-8.1.8-py3-none-any.whl.metadata (2.3 kB)
Collecting eval-type-backport (from wandb->lmflow==1.0.0)
  Using cached eval_type_backport-0.3.1-py3-none-any.whl.metadata (2.4 kB)
Collecting gitpython!=3.1.29,>=1.0.0 (from wandb->lmflow==1.0.0)
  Using cached gitpython-3.1.46-py3-none-any.whl.metadata (13 kB)
Collecting platformdirs (from wandb->lmflow==1.0.0)
  Using cached platformdirs-4.4.0-py3-none-any.whl.metadata (12 kB)
Collecting protobuf!=4.21.0,!=5.28.0,<7,>=3.15.0 (from wandb->lmflow==1.0.0)
  Downloading protobuf-6.33.4-cp39-abi3-manylinux2014_x86_64.whl.metadata (593 bytes)
Collecting sentry-sdk>=2.0.0 (from wandb->lmflow==1.0.0)
  Downloading sentry_sdk-2.49.0-py2.py3-none-any.whl.metadata (10 kB)
Collecting gitdb<5,>=4.0.1 (from gitpython!=3.1.29,>=1.0.0->wandb->lmflow==1.0.0)
  Using cached gitdb-4.0.12-py3-none-any.whl.metadata (1.2 kB)
Collecting smmap<6,>=3.0.1 (from gitdb<5,>=4.0.1->gitpython!=3.1.29,>=1.0.0->wandb->lmflow==1.0.0)
  Using cached smmap-5.0.2-py3-none-any.whl.metadata (4.3 kB)
Using cached cpm_kernels-1.0.11-py3-none-any.whl (416 kB)
Using cached datasets-2.14.6-py3-none-any.whl (493 kB)
Using cached evaluate-0.4.0-py3-none-any.whl (81 kB)
Using cached dill-0.3.7-py3-none-any.whl (115 kB)
Using cached fsspec-2023.10.0-py3-none-any.whl (166 kB)
Using cached huggingface_hub-0.36.0-py3-none-any.whl (566 kB)
Using cached hf_xet-1.2.0-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.3 MB)
Using cached responses-0.18.0-py3-none-any.whl (38 kB)
Using cached requests-2.32.5-py3-none-any.whl (64 kB)
Using cached charset_normalizer-3.4.4-cp39-cp39-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl (153 kB)
Using cached idna-3.11-py3-none-any.whl (71 kB)
Downloading urllib3-2.6.3-py3-none-any.whl (131 kB)
Using cached accelerate-1.10.1-py3-none-any.whl (374 kB)
Using cached numpy-2.0.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (19.5 MB)
Using cached aiohttp-3.13.3-cp39-cp39-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl (1.7 MB)
Using cached async_timeout-5.0.1-py3-none-any.whl (6.2 kB)
Using cached multidict-6.7.0-cp39-cp39-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl (239 kB)
Using cached yarl-1.22.0-cp39-cp39-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl (346 kB)
Using cached aiohappyeyeballs-2.6.1-py3-none-any.whl (15 kB)
Using cached aiosignal-1.4.0-py3-none-any.whl (7.5 kB)
Using cached attrs-25.4.0-py3-none-any.whl (67 kB)
Using cached bitsandbytes-0.48.2-py3-none-manylinux_2_24_x86_64.whl (59.4 MB)
Using cached torch-2.8.0-cp39-cp39-manylinux_2_28_x86_64.whl (888.0 MB)
Using cached nvidia_cublas_cu12-12.8.4.1-py3-none-manylinux_2_27_x86_64.whl (594.3 MB)
Using cached nvidia_cuda_cupti_cu12-12.8.90-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (10.2 MB)
Using cached nvidia_cuda_nvrtc_cu12-12.8.93-py3-none-manylinux2010_x86_64.manylinux_2_12_x86_64.whl (88.0 MB)
Using cached nvidia_cuda_runtime_cu12-12.8.90-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (954 kB)
Using cached nvidia_cudnn_cu12-9.10.2.21-py3-none-manylinux_2_27_x86_64.whl (706.8 MB)
Using cached nvidia_cufft_cu12-11.3.3.83-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (193.1 MB)
Using cached nvidia_cufile_cu12-1.13.1.3-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (1.2 MB)
Using cached nvidia_curand_cu12-10.3.9.90-py3-none-manylinux_2_27_x86_64.whl (63.6 MB)
Using cached nvidia_cusolver_cu12-11.7.3.90-py3-none-manylinux_2_27_x86_64.whl (267.5 MB)
Using cached nvidia_cusparse_cu12-12.5.8.93-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (288.2 MB)
Using cached nvidia_cusparselt_cu12-0.7.1-py3-none-manylinux2014_x86_64.whl (287.2 MB)
Using cached nvidia_nccl_cu12-2.27.3-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (322.4 MB)
Using cached nvidia_nvjitlink_cu12-12.8.93-py3-none-manylinux2010_x86_64.manylinux_2_12_x86_64.whl (39.3 MB)
Using cached nvidia_nvtx_cu12-12.8.90-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (89 kB)
Using cached triton-3.4.0-cp39-cp39-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (155.4 MB)
Using cached certifi-2026.1.4-py3-none-any.whl (152 kB)
Using cached einops-0.8.1-py3-none-any.whl (64 kB)
Using cached frozenlist-1.8.0-cp39-cp39-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl (219 kB)
Using cached packaging-25.0-py3-none-any.whl (66 kB)
Using cached peft-0.17.1-py3-none-any.whl (504 kB)
Using cached propcache-0.4.1-cp39-cp39-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl (197 kB)
Using cached pyarrow-21.0.0-cp39-cp39-manylinux_2_28_x86_64.whl (42.7 MB)
Using cached pyyaml-6.0.3-cp39-cp39-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl (750 kB)
Using cached safetensors-0.7.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (507 kB)
Using cached sympy-1.14.0-py3-none-any.whl (6.3 MB)
Using cached mpmath-1.3.0-py3-none-any.whl (536 kB)
Using cached tokenizers-0.22.2-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.3 MB)
Using cached tqdm-4.67.1-py3-none-any.whl (78 kB)
Downloading transformers-4.57.5-py3-none-any.whl (12.0 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12.0/12.0 MB 115.2 MB/s  0:00:00
Downloading regex-2026.1.15-cp39-cp39-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl (791 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 791.3/791.3 kB 36.9 MB/s  0:00:00
Using cached typing_extensions-4.15.0-py3-none-any.whl (44 kB)
Using cached filelock-3.19.1-py3-none-any.whl (15 kB)
Using cached importlib_metadata-8.7.1-py3-none-any.whl (27 kB)
Using cached zipp-3.23.0-py3-none-any.whl (10 kB)
Using cached jinja2-3.1.6-py3-none-any.whl (134 kB)
Using cached markupsafe-3.0.3-cp39-cp39-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl (20 kB)
Using cached multiprocess-0.70.15-py39-none-any.whl (133 kB)
Using cached networkx-3.2.1-py3-none-any.whl (1.6 MB)
Using cached pandas-2.3.3-cp39-cp39-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (12.8 MB)
Using cached python_dateutil-2.9.0.post0-py2.py3-none-any.whl (229 kB)
Using cached pytz-2025.2-py2.py3-none-any.whl (509 kB)
Using cached six-1.17.0-py2.py3-none-any.whl (11 kB)
Using cached tzdata-2025.3-py2.py3-none-any.whl (348 kB)
Using cached psutil-7.2.1-cp36-abi3-manylinux2010_x86_64.manylinux_2_12_x86_64.manylinux_2_28_x86_64.whl (154 kB)
Using cached pydantic-2.12.5-py3-none-any.whl (463 kB)
Using cached pydantic_core-2.41.5-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.1 MB)
Using cached annotated_types-0.7.0-py3-none-any.whl (13 kB)
Using cached typing_inspection-0.4.2-py3-none-any.whl (14 kB)
Using cached sentencepiece-0.2.1-cp39-cp39-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (1.4 MB)
Downloading wandb-0.24.0-py3-none-manylinux_2_28_x86_64.whl (22.8 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 22.8/22.8 MB 109.8 MB/s  0:00:00
Downloading protobuf-6.33.4-cp39-abi3-manylinux2014_x86_64.whl (323 kB)
Using cached click-8.1.8-py3-none-any.whl (98 kB)
Using cached gitpython-3.1.46-py3-none-any.whl (208 kB)
Using cached gitdb-4.0.12-py3-none-any.whl (62 kB)
Using cached smmap-5.0.2-py3-none-any.whl (24 kB)
Downloading sentry_sdk-2.49.0-py2.py3-none-any.whl (415 kB)
Using cached eval_type_backport-0.3.1-py3-none-any.whl (6.1 kB)
Using cached platformdirs-4.4.0-py3-none-any.whl (18 kB)
Using cached xxhash-3.6.0-cp39-cp39-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl (192 kB)
Building wheels for collected packages: lmflow
  Building editable for lmflow (pyproject.toml) ... done
  Created wheel for lmflow: filename=lmflow-1.0.0-0.editable-py3-none-any.whl size=13648 sha256=cbbf95a6c709082f5025e64a6a1f12826b8ed5f27c76daf4d3f91baff35ea644
  Stored in directory: /tmp/pip-ephem-wheel-cache-uyovwgsa/wheels/a1/5d/af/bee2727493a4a4ad17fda9a57e5ddb51ac4cbc709d79c9e669
Successfully built lmflow
Installing collected packages: pytz, nvidia-cusparselt-cu12, mpmath, cpm_kernels, zipp, xxhash, urllib3, tzdata, typing-extensions, tqdm, sympy, smmap, six, sentencepiece, safetensors, regex, pyyaml, pyarrow, psutil, protobuf, propcache, platformdirs, packaging, nvidia-nvtx-cu12, nvidia-nvjitlink-cu12, nvidia-nccl-cu12, nvidia-curand-cu12, nvidia-cufile-cu12, nvidia-cuda-runtime-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-cupti-cu12, nvidia-cublas-cu12, numpy, networkx, MarkupSafe, idna, hf-xet, fsspec, frozenlist, filelock, eval-type-backport, einops, dill, click, charset_normalizer, certifi, attrs, async-timeout, annotated-types, aiohappyeyeballs, typing-inspection, sentry-sdk, requests, python-dateutil, pydantic-core, nvidia-cusparse-cu12, nvidia-cufft-cu12, nvidia-cudnn-cu12, multiprocess, multidict, jinja2, importlib-metadata, gitdb, aiosignal, yarl, triton, responses, pydantic, pandas, nvidia-cusolver-cu12, huggingface-hub, gitpython, wandb, torch, tokenizers, aiohttp, transformers, bitsandbytes, accelerate, peft, datasets, evaluate, lmflow
Successfully installed MarkupSafe-3.0.3 accelerate-1.10.1 aiohappyeyeballs-2.6.1 aiohttp-3.13.3 aiosignal-1.4.0 annotated-types-0.7.0 async-timeout-5.0.1 attrs-25.4.0 bitsandbytes-0.48.2 certifi-2026.1.4 charset_normalizer-3.4.4 click-8.1.8 cpm_kernels-1.0.11 datasets-2.14.6 dill-0.3.7 einops-0.8.1 eval-type-backport-0.3.1 evaluate-0.4.0 filelock-3.19.1 frozenlist-1.8.0 fsspec-2023.10.0 gitdb-4.0.12 gitpython-3.1.46 hf-xet-1.2.0 huggingface-hub-0.36.0 idna-3.11 importlib-metadata-8.7.1 jinja2-3.1.6 lmflow-1.0.0 mpmath-1.3.0 multidict-6.7.0 multiprocess-0.70.15 networkx-3.2.1 numpy-2.0.2 nvidia-cublas-cu12-12.8.4.1 nvidia-cuda-cupti-cu12-12.8.90 nvidia-cuda-nvrtc-cu12-12.8.93 nvidia-cuda-runtime-cu12-12.8.90 nvidia-cudnn-cu12-9.10.2.21 nvidia-cufft-cu12-11.3.3.83 nvidia-cufile-cu12-1.13.1.3 nvidia-curand-cu12-10.3.9.90 nvidia-cusolver-cu12-11.7.3.90 nvidia-cusparse-cu12-12.5.8.93 nvidia-cusparselt-cu12-0.7.1 nvidia-nccl-cu12-2.27.3 nvidia-nvjitlink-cu12-12.8.93 nvidia-nvtx-cu12-12.8.90 packaging-25.0 pandas-2.3.3 peft-0.17.1 platformdirs-4.4.0 propcache-0.4.1 protobuf-6.33.4 psutil-7.2.1 pyarrow-21.0.0 pydantic-2.12.5 pydantic-core-2.41.5 python-dateutil-2.9.0.post0 pytz-2025.2 pyyaml-6.0.3 regex-2026.1.15 requests-2.32.5 responses-0.18.0 safetensors-0.7.0 sentencepiece-0.2.1 sentry-sdk-2.49.0 six-1.17.0 smmap-5.0.2 sympy-1.14.0 tokenizers-0.22.2 torch-2.8.0 tqdm-4.67.1 transformers-4.57.5 triton-3.4.0 typing-extensions-4.15.0 typing-inspection-0.4.2 tzdata-2025.3 urllib3-2.6.3 wandb-0.24.0 xxhash-3.6.0 yarl-1.22.0 zipp-3.23.0
(testlmflow) exouser@proj-signsgd:~/test/LMFlow$ pip show datasets
Name: datasets
Version: 2.14.6
Summary: HuggingFace community-driven open-source library of datasets
Home-page: https://github.com/huggingface/datasets
Author: HuggingFace Inc.
Author-email: thomas@huggingface.co
License: Apache 2.0
Location: /home/exouser/miniforge3/envs/testlmflow/lib/python3.9/site-packages
Requires: aiohttp, dill, fsspec, huggingface-hub, multiprocess, numpy, packaging, pandas, pyarrow, pyyaml, requests, tqdm, xxhash
Required-by: evaluate, lmflow
(testlmflow) exouser@proj-signsgd:~/test/LMFlow$ cd data && ./download.sh alpaca && cd -
downloading alpaca dataset
--2026-01-15 03:24:24--  http://lmflow.org:5000/alpaca.tar.gz
Resolving lmflow.org (lmflow.org)... 107.23.182.175
Connecting to lmflow.org (lmflow.org)|107.23.182.175|:5000... connected.
HTTP request sent, awaiting response... 200 OK
Length: 14508617 (14M) [application/gzip]
Saving to: ‘alpaca.tar.gz’

alpaca.tar.gz                 100%[=================================================>]  13.84M  42.3MB/s    in 0.3s

2026-01-15 03:24:24 (42.3 MB/s) - ‘alpaca.tar.gz’ saved [14508617/14508617]

alpaca/
alpaca/train_conversation/
alpaca/train_conversation/.train_52002.json.swp
alpaca/train_conversation/train_52002.json
alpaca/train/
alpaca/train/train_52002.json
alpaca/test_conversation/
alpaca/test_conversation/test_252.json
alpaca/test/
alpaca/test/test_252.json
/home/exouser/test/LMFlow
(testlmflow) exouser@proj-signsgd:~/test/LMFlow$ bash ./scripts/run_finetune.sh \
  --model_name_or_path /home/exouser/huggingface/hub/models--Qwen--Qwen2.5-7B \
  --dataset_path data/alpaca/train_conversation \
  --output_model_path output_models/finetuned_qwen
/home/exouser/test/LMFlow/src/lmflow/utils/versioning.py:7: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
  import pkg_resources
Traceback (most recent call last):
  File "/home/exouser/test/LMFlow/examples/finetune.py", line 22, in <module>
    from lmflow.args import (
  File "/home/exouser/test/LMFlow/src/lmflow/__init__.py", line 8, in <module>
    from lmflow import args, datasets, models, pipeline, utils
  File "/home/exouser/test/LMFlow/src/lmflow/datasets/__init__.py", line 8, in <module>
    from lmflow.datasets.dataset import Dataset
  File "/home/exouser/test/LMFlow/src/lmflow/datasets/dataset.py", line 16, in <module>
    from datasets import Dataset as HFDataset
  File "/home/exouser/miniforge3/envs/testlmflow/lib/python3.9/site-packages/datasets/__init__.py", line 22, in <module>
    from .arrow_dataset import Dataset
  File "/home/exouser/miniforge3/envs/testlmflow/lib/python3.9/site-packages/datasets/arrow_dataset.py", line 67, in <module>
    from .arrow_writer import ArrowWriter, OptimizedTypedSequence
  File "/home/exouser/miniforge3/envs/testlmflow/lib/python3.9/site-packages/datasets/arrow_writer.py", line 27, in <module>
    from .features import Features, Image, Value
  File "/home/exouser/miniforge3/envs/testlmflow/lib/python3.9/site-packages/datasets/features/__init__.py", line 18, in <module>
    from .features import Array2D, Array3D, Array4D, Array5D, ClassLabel, Features, Sequence, Value
  File "/home/exouser/miniforge3/envs/testlmflow/lib/python3.9/site-packages/datasets/features/features.py", line 634, in <module>
    class _ArrayXDExtensionType(pa.PyExtensionType):
AttributeError: module 'pyarrow' has no attribute 'PyExtensionType'
W0115 03:26:52.862906 102408 site-packages/torch/distributed/elastic/multiprocessing/api.py:900] Sending process 102520 closing signal SIGTERM
W0115 03:26:52.863691 102408 site-packages/torch/distributed/elastic/multiprocessing/api.py:900] Sending process 102521 closing signal SIGTERM
W0115 03:26:52.863948 102408 site-packages/torch/distributed/elastic/multiprocessing/api.py:900] Sending process 102538 closing signal SIGTERM
W0115 03:26:52.864148 102408 site-packages/torch/distributed/elastic/multiprocessing/api.py:900] Sending process 102547 closing signal SIGTERM
W0115 03:26:52.864333 102408 site-packages/torch/distributed/elastic/multiprocessing/api.py:900] Sending process 102558 closing signal SIGTERM
W0115 03:26:52.864527 102408 site-packages/torch/distributed/elastic/multiprocessing/api.py:900] Sending process 102573 closing signal SIGTERM
E0115 03:26:52.978914 102408 site-packages/torch/distributed/elastic/multiprocessing/api.py:874] failed (exitcode: 1) local_rank: 2 (pid: 102524) of binary: /home/exouser/miniforge3/envs/testlmflow/bin/python3.9
Traceback (most recent call last):
  File "/home/exouser/miniforge3/envs/testlmflow/bin/accelerate", line 7, in <module>
    sys.exit(main())
  File "/home/exouser/miniforge3/envs/testlmflow/lib/python3.9/site-packages/accelerate/commands/accelerate_cli.py", line 50, in main
    args.func(args)
  File "/home/exouser/miniforge3/envs/testlmflow/lib/python3.9/site-packages/accelerate/commands/launch.py", line 1222, in launch_command
    multi_gpu_launcher(args)
  File "/home/exouser/miniforge3/envs/testlmflow/lib/python3.9/site-packages/accelerate/commands/launch.py", line 853, in multi_gpu_launcher
    distrib_run.run(args)
  File "/home/exouser/miniforge3/envs/testlmflow/lib/python3.9/site-packages/torch/distributed/run.py", line 892, in run
    elastic_launch(
  File "/home/exouser/miniforge3/envs/testlmflow/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 143, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/exouser/miniforge3/envs/testlmflow/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 277, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
examples/finetune.py FAILED
------------------------------------------------------------
Failures:
[1]:
  time      : 2026-01-15_03:26:52
  host      : host-10-1-70-190.js2local
  rank      : 3 (local_rank: 3)
  exitcode  : 1 (pid: 102529)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2026-01-15_03:26:52
  host      : host-10-1-70-190.js2local
  rank      : 2 (local_rank: 2)
  exitcode  : 1 (pid: 102524)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================
  1. Package info:
(testlmflow) exouser@proj-signsgd:~/test/LMFlow$ pip show pyarrow
Name: pyarrow
Version: 21.0.0
Summary: Python library for Apache Arrow
Home-page: https://arrow.apache.org/
Author:
Author-email:
License: Apache Software License
Location: /home/exouser/miniforge3/envs/testlmflow/lib/python3.9/site-packages
Requires:
Required-by: datasets
(testlmflow) exouser@proj-signsgd:~/test/LMFlow$ pip show datasets
Name: datasets
Version: 2.14.6
Summary: HuggingFace community-driven open-source library of datasets
Home-page: https://github.com/huggingface/datasets
Author: HuggingFace Inc.
Author-email: thomas@huggingface.co
License: Apache 2.0
Location: /home/exouser/miniforge3/envs/testlmflow/lib/python3.9/site-packages
Requires: aiohttp, dill, fsspec, huggingface-hub, multiprocess, numpy, packaging, pandas, pyarrow, pyyaml, requests, tqdm, xxhash
Required-by: evaluate, lmflow

This is odd, since requirements.txt explicitly pins datasets==3.6.0, yet pip installed datasets 2.14.6 alongside pyarrow 21.0.0.
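For anyone trying to confirm the same mismatch, here is a minimal diagnostic sketch. It assumes only what the traceback above shows: datasets 2.14.6 subclasses `pa.PyExtensionType`, and the installed pyarrow 21.0.0 does not expose that attribute. The helper names (`installed_version`, `has_attribute`, `matches_pin`, `report`) are hypothetical and not part of LMFlow; the script uses only the standard library.

```python
import importlib
from importlib import metadata


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def has_attribute(module_name, attribute):
    """Return True/False if the module imports, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return hasattr(module, attribute)


def matches_pin(pin, installed):
    """Check an exact 'name==version' pin against an installed version string."""
    _, _, wanted = pin.strip().partition("==")
    return installed is not None and installed == wanted


def report():
    """Collect the facts relevant to this issue in one dictionary."""
    datasets_version = installed_version("datasets")
    return {
        "datasets": datasets_version,
        "pyarrow": installed_version("pyarrow"),
        # The pin below is the one quoted from requirements.txt in this issue.
        "datasets matches pin": matches_pin("datasets==3.6.0", datasets_version),
        # datasets 2.14.6 needs this attribute at import time (see traceback).
        "pyarrow.PyExtensionType": has_attribute("pyarrow", "PyExtensionType"),
    }


if __name__ == "__main__":
    for key, value in report().items():
        print(f"{key}: {value}")
```

If `datasets matches pin` is `False` and `pyarrow.PyExtensionType` is `False`, the environment is in the state described in this report.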

Expected behavior
I do not recall this problem occurring in older versions of LMFlow (before the recent major update).

Additional context
After this major update, src/lmflow/optim/optimizers.py now reads:

#!/usr/bin/env python
"""All optimizers."""

To use custom optimizers such as Muon (already implemented), we may need to add their names to this file manually.
