Python 3.12.1 with pandarallel==1.6.5: parallel_apply takes almost 3x longer

## General

- **Operating System**: Red Hat Enterprise Linux 8.9 (Ootpa)
- **Python version**: 3.12.1
- **Pandas version**: 2.1.3
- **Pandarallel version**: 1.6.5

## Acknowledgement

After upgrading from Python 3.10 to Python 3.12, `parallel_apply` takes almost 3x longer to run.
The code runs in a Docker container on Red Hat Enterprise Linux 8.9 (Ootpa).

OS information for the Docker container:

```
NAME="Red Hat Enterprise Linux"
VERSION="8.9 (Ootpa)"
ID="rhel"
ID_LIKE="fedora"
VERSION_ID="8.9"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Red Hat Enterprise Linux 8.9 (Ootpa)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:8::baseos"
HOME_URL="https://www.redhat.com/"
DOCUMENTATION_URL="https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8"
BUG_REPORT_URL="https://bugzilla.redhat.com/"

REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 8"
REDHAT_BUGZILLA_PRODUCT_VERSION=8.9
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="8.9"
```

Python 3.12 packages

```
annotated-types==0.6.0
astroid==3.0.1
attrs==23.1.0
Cerberus==1.3.5
certifi==2023.11.17
charset-normalizer==3.3.2
contourpy==1.2.0
coverage==7.3.2
cycler==0.12.1
debugpy==1.8.0
dill==0.3.7
distlib==0.3.7
docopt==0.6.2
execnet==2.0.2
fonttools==4.46.0
idna==3.6
iniconfig==2.0.0
isort==5.13.0
Jinja2==3.1.2
joblib==1.3.2
jsonschema==4.20.0
jsonschema-specifications==2023.11.2
kiwisolver==1.4.5
MarkupSafe==2.1.3
matplotlib==3.8.2
mccabe==0.7.0
mlxtend==0.23.0
numpy==1.26.2
packaging==23.2
pandarallel==1.6.5
pandas==2.1.3
pep517==0.13.1
pika==1.3.2
Pillow==10.1.0
pip-api==0.0.30
pipreqs==0.4.13
platformdirs==4.1.0
plette==0.4.4
pluggy==1.3.0
psutil==5.9.6
py-cpuinfo==9.0.0
pydantic==2.5.2
pydantic_core==2.14.5
pylint==3.0.2
pyparsing==3.1.1
pytest==7.4.3
pytest-benchmark==4.0.0
pytest-cov==4.1.0
pytest-html==4.1.1
pytest-metadata==3.0.0
pytest-mock==3.12.0
pytest-order==1.2.0
pytest-ordering==0.6
pytest-timeout==2.2.0
pytest-xdist==3.4.0
python-dateutil==2.8.2
pytz==2023.3.post1
redis==5.0.1
referencing==0.32.0
requests==2.31.0
requirementslib==3.0.0
rpds-py==0.13.2
scikit-learn==1.3.2
scipy==1.11.4
seaborn==0.13.0
setuptools==68.2.2
six==1.16.0
threadpoolctl==3.2.0
tomlkit==0.12.3
typing_extensions==4.9.0
tzdata==2023.3
urllib3==2.1.0
yarg==0.1.9
```

I can't share all of my code, but this is part of it:

```
results = combined.groupby(by='NewGroup').parallel_apply(
    lambda group: TestClass(data=group.drop(columns=columns)).run()
)
```

- `TestClass`: initialized with the group's data after the columns are dropped
- `columns`: a list of the columns to drop
- `run`: the method that runs on each group
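One way to narrow this down is to time plain `apply` against `parallel_apply` on the same grouped data under each Python version, to see whether the extra time comes from the parallel layer or from the per-group work itself. The following is a minimal sketch, not the code from this project: `time_call` and `compare_apply` are hypothetical helper names, and it assumes `combined`, `columns`, and `TestClass` are defined as in the snippet above.

```python
import time


def time_call(fn, repeats=3):
    """Return the best wall-clock time in seconds over `repeats` calls of fn()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def compare_apply(combined, func, by="NewGroup", repeats=3):
    """Time serial groupby.apply against pandarallel's parallel_apply.

    The pandarallel import is local so that time_call stays usable
    on its own, without pandarallel installed.
    """
    from pandarallel import pandarallel

    pandarallel.initialize(progress_bar=False)
    grouped = combined.groupby(by=by)
    return {
        "apply": time_call(lambda: grouped.apply(func), repeats),
        "parallel_apply": time_call(lambda: grouped.parallel_apply(func), repeats),
    }
```

Calling `compare_apply(combined, lambda group: TestClass(data=group.drop(columns=columns)).run())` in both environments would show whether the serial path slows down as well, which would point away from pandarallel itself.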

The servers are the same and the code didn't change, but the run time still increased almost 3x.

With Python 3.10.11, pandarallel==1.6.5, and pandas==2.0.0, the same data frame takes 2.49 min; with Python 3.12.1 it takes 7.22 min.
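For reference, the slowdown factor implied by these two timings can be worked out directly. This small sketch assumes the figures are decimal minutes; if they are actually minutes and seconds (2:49 and 7:22), the ratio is lower but still well above 2x.

```python
def slowdown(before, after):
    """Return how many times longer `after` is than `before`."""
    return after / before


# Timings reported in this issue, read as decimal minutes (an assumption).
decimal_ratio = slowdown(2.49, 7.22)

# The same figures read as minutes:seconds, converted to seconds.
mmss_ratio = slowdown(2 * 60 + 49, 7 * 60 + 22)

print(f"decimal minutes: {decimal_ratio:.2f}x")
print(f"minutes:seconds: {mmss_ratio:.2f}x")
```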

