Changes from all commits — 59 commits
96eb6eb
Use literal version instead of scm version. (#2)
pitt-liang Dec 4, 2023
99df4ed
Host documentation in ReadTheDocs. (#3)
pitt-liang Dec 4, 2023
124429a
Update README.md (#4)
pitt-liang Dec 6, 2023
b1d0a42
Add github workflow (#5)
pitt-liang Dec 27, 2023
10c9d54
Fix Github workflow (#6)
pitt-liang Dec 27, 2023
0dd26a4
fix: Hacky way to wait prediction service to be ready (#8)
pitt-liang Jan 2, 2024
f972256
fix: update ModelScopeEstimator image (#7)
YuZ1225 Jan 2, 2024
05e391f
release v0.4.5 (#9)
pitt-liang Jan 3, 2024
9ff250e
Supports dedicated resource && fix list training job logs. (#11)
pitt-liang Feb 26, 2024
3ea7b77
feat: add Processor for submitting ProcessingJob
firgavin Jan 24, 2024
487b212
misc: disable warnings in nox integration test and fix nox test.ini.t…
firgavin Jan 24, 2024
22c68a4
feat: build evaluation processor for registered model
firgavin Jan 26, 2024
67874dc
chore: bump aiworkspace client to v3.0.2
firgavin Jan 30, 2024
f82074d
fix: fix outdated unit tests
firgavin Feb 20, 2024
7161b10
feat: add processor user guide
firgavin Mar 27, 2024
c071679
feat: add list datasets utils and processor waiting all job done
firgavin Apr 9, 2024
a63b181
chore: bump paistudio client to v1.1.7
firgavin Mar 22, 2024
a01c992
feat: estimator and processor support to set environments and require…
firgavin Mar 22, 2024
8534dff
feat: increase service gateway readiness threshold
pitt-liang Apr 18, 2024
47b45d9
release: 0.4.6 (#13)
pitt-liang Apr 22, 2024
0f7d2e5
feat: implement experiment feature with api, estimator, and processor
yangmianmian Apr 24, 2024
45f6f39
feat: enhance CI/CD, add features and fixes to improve model deployme…
pitt-liang Apr 26, 2024
9e1a08f
add labels for service deploy via quickstart model (#16)
pitt-liang Apr 29, 2024
786057c
release: 0.4.7 (#18)
pitt-liang Apr 29, 2024
094092d
feat: add logging utils for the library. (#20)
pitt-liang May 20, 2024
7f3c8d7
fix session setup and command line utilities for configuration. (#21)
pitt-liang May 20, 2024
5705a75
feat: RegisteredModel.get_estimator supports selecting training metho…
YuZ1225 May 20, 2024
cd5dcbc
fix pai.toolkit.config in dsw notebook binding with DefaultRole (#22)
pitt-liang Jun 3, 2024
d4238ed
release 0.4.7.post0 (#23)
pitt-liang Jun 3, 2024
50ad53b
feat: add network parameter for session (#24)
pitt-liang Jun 11, 2024
130c99b
Refactor training job submit. (#28)
pitt-liang Jun 28, 2024
fb932ab
feat: add ModelTrainingRecipe/ModelRecipe (#29)
pitt-liang Jun 28, 2024
af857a7
feat: add `url_suffix` parameter to `predictor.openai` (#30)
pitt-liang Jul 1, 2024
be7962b
release: 0.4.8 (#31)
pitt-liang Jul 1, 2024
1dd66c0
feat: spot instance/job settings supports. (#33)
pitt-liang Jul 11, 2024
ca7322c
feat: Storage/SharedMemory configuration supports in EAS service (#34)
pitt-liang Jul 12, 2024
3c3b162
release: 0.4.9 (#35)
pitt-liang Jul 12, 2024
cbe8ac7
feat: Add support for spot instance in Lingjun environment
luoyy82 Jul 15, 2024
9386329
build: support both `pai` and `alipai` package release (#37)
pitt-liang Jul 18, 2024
46b3200
release: 0.4.9.post0 (#38)
pitt-liang Jul 18, 2024
bf4dba2
feat: setup default session in DSW environment (#40)
pitt-liang Aug 22, 2024
0c12db6
doc: remove tutorial notebook, add model recipe document (#39)
pitt-liang Aug 22, 2024
d847a6f
feat: support model compression spec (#42)
YuZ1225 Oct 9, 2024
2e3cf64
release: 0.4.10 (#43)
pitt-liang Oct 9, 2024
45ac6e9
fix: hotfix code upload in training job (#44)
pitt-liang Oct 21, 2024
4bb3cef
release: 0.4.10.post0 (#45)
pitt-liang Oct 21, 2024
57f7579
update aiworkspace openapi to 5.0.1
everettli Nov 7, 2024
dd73703
feat: support log_lineage
everettli Nov 13, 2024
932b6ca
lint: code format
everettli Nov 13, 2024
6ea65e6
fix: unittest fail
everettli Nov 14, 2024
f636fa0
fix: unittest fail
everettli Nov 14, 2024
1458d1b
lint: code format
everettli Nov 14, 2024
86e4773
feat: support log_lineage in dlc
everettli Nov 15, 2024
7abec53
fix: lineage integration test fail
everettli Nov 15, 2024
a8c342e
fix: lineage integration test fail
everettli Nov 21, 2024
5ac2ae2
feat: support pvc and nas-file lineage
everettli Dec 26, 2024
33182af
feat: support oss-file lineage
everettli Jan 8, 2025
61034f1
feat: Adapt to new data source configuration of lineage.
everettli Jan 4, 2026
6743e11
fix: Fix lint error.
everettli Jan 4, 2026
35 changes: 35 additions & 0 deletions .github/workflows/lint.yaml
@@ -0,0 +1,35 @@
name: Lint test

on: [push]

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  common-lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python 3.8
        uses: actions/setup-python@v5
        with:
          python-version: "3.8"
      - name: Install pre-commit hook
        run: |
          pip install pre-commit
      - name: Linting
        run: pre-commit run --all-files
  doc-lint:
Comment on lines +11 to +23

Check warning — Code scanning / CodeQL: Workflow does not contain permissions (Medium)

Actions job or workflow does not limit the permissions of the GITHUB_TOKEN. Consider setting an explicit permissions block, using the following as a minimal starting point: {contents: read}

Copilot Autofix (AI, 2 months ago)

In general, the fix is to explicitly declare a restrictive permissions: block for the workflow or for each job, instead of relying on the potentially broad repository default. For a pure lint workflow that only checks out code and runs local tools, the minimal required permission is contents: read.

The best way to fix this without changing existing functionality is to add a top-level permissions: block after the on: section in .github/workflows/lint.yaml, setting contents: read. It applies to both the common-lint and doc-lint jobs, neither of which needs to write to the repository or modify other GitHub resources. No changes to the individual jobs or steps are required, and no imports or external dependencies are involved, because this is a YAML workflow definition.

Concretely, in .github/workflows/lint.yaml, insert

permissions:
  contents: read

between the existing on: [push] and concurrency: keys.

Suggested changeset: .github/workflows/lint.yaml. Run the following command in your local git repository to apply the patch:

cat << 'EOF' | git apply
diff --git a/.github/workflows/lint.yaml b/.github/workflows/lint.yaml
--- a/.github/workflows/lint.yaml
+++ b/.github/workflows/lint.yaml
@@ -2,6 +2,9 @@
 
 on: [push]
 
+permissions:
+  contents: read
+
 concurrency:
   group: ${{ github.workflow }}-${{ github.ref }}
   cancel-in-progress: true
EOF
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python 3.8
        uses: actions/setup-python@v5
        with:
          python-version: "3.8"
      - name: Install Nox
        run: |
          pip install nox
      - name: Linting
        run: nox -s doc
Comment on lines +24 to +35

Check warning — Code scanning / CodeQL: Workflow does not contain permissions (Medium)

Actions job or workflow does not limit the permissions of the GITHUB_TOKEN. Consider setting an explicit permissions block, using the following as a minimal starting point: {contents: read}

Copilot Autofix (AI, 2 months ago)

In general, the fix is to explicitly define a permissions block, either at the workflow root (so it applies to all jobs) or per job, granting only the minimal scopes required. For this workflow, the jobs only need to read repository contents so that actions/checkout can function; they do not need to write to contents, issues, or pull requests.

The minimal fix that preserves behavior is to add a root-level permissions block just under the workflow name (or above on:), setting contents: read. It applies to both the common-lint and doc-lint jobs, which continue to function exactly as before, but with the GITHUB_TOKEN restricted to read-only repository contents. No other scopes (such as pull-requests or issues) are needed, because the jobs do not interact with those APIs.

Concretely, edit .github/workflows/lint.yaml near the top: insert

permissions:
  contents: read

after name: Lint test (line 1) and before on: [push] (line 3). No imports or additional methods are required, because this is YAML configuration only.

Suggested changeset: .github/workflows/lint.yaml. Run the following command in your local git repository to apply the patch:

cat << 'EOF' | git apply
diff --git a/.github/workflows/lint.yaml b/.github/workflows/lint.yaml
--- a/.github/workflows/lint.yaml
+++ b/.github/workflows/lint.yaml
@@ -1,5 +1,8 @@
 name: Lint test
 
+permissions:
+  contents: read
+
 on: [push]
 
 concurrency:
EOF
41 changes: 41 additions & 0 deletions .github/workflows/publish.yaml
@@ -0,0 +1,41 @@
name: Publish Package
on:
  push:
    tags:
      - 'v*'

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  publish:
    name: Publish Package
    runs-on: ubuntu-latest
    env:
      GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
      PAI_PYPI_TOKEN: ${{ secrets.PAI_PYPI_TOKEN }}
      ALIPAI_PYPI_TOKEN: ${{ secrets.ALIPAI_PYPI_TOKEN }}
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python 3.8
        uses: actions/setup-python@v5
        with:
          python-version: '3.8'
      - name: Install dependencies
        run: pip install wheel setuptools twine
      # build and upload package pai
      - name: Build package for pai
        run: python setup.py sdist bdist_wheel
      - name: Publish package to PyPI (pai)
        run: twine upload dist/* --skip-existing -u __token__ -p $PAI_PYPI_TOKEN
      - name: cleanup
        run: |
          rm -rf dist
          rm -rf build
          rm -rf pai.egg-info
      # build and upload package alipai
      - name: Build package for alipai
        run: PACKAGE_NAME=alipai python setup.py sdist bdist_wheel
      - name: Publish package to PyPI (alipai)
        run: twine upload dist/* --skip-existing -u __token__ -p $ALIPAI_PYPI_TOKEN
59 changes: 59 additions & 0 deletions .github/workflows/release_trigger.yaml
@@ -0,0 +1,59 @@
name: Release Trigger
on:
  pull_request:
    types: [closed]
    branches:
      - master
    paths:
      - 'pai/version.py'

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  publish:
    name: Release Trigger
    runs-on: ubuntu-latest
    if: github.event.pull_request.merged == true && startsWith(github.head_ref, 'releases/v')
    env:
      PYPI_TOKEN: ${{ secrets.PYPI_TOKEN }}
      PAI_PYPI_TOKEN: ${{ secrets.PAI_PYPI_TOKEN }}
      ALIPAI_PYPI_TOKEN: ${{ secrets.ALIPAI_PYPI_TOKEN }}
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python 3.8
        uses: actions/setup-python@v5
        with:
          python-version: '3.8'
      - name: Check version match
        id: check_version
        run: |
          BRANCH_VERSION=${{ github.head_ref }}
          BRANCH_VERSION=${BRANCH_VERSION#releases/v}
          FILE_VERSION=$(python -c "from pai.version import VERSION; print(VERSION)")
          if [[ "$BRANCH_VERSION" != "$FILE_VERSION" ]]; then
            echo "Version in branch name ($BRANCH_VERSION) does not match version in file ($FILE_VERSION)"
            exit 1
          fi
      - name: Get version and create version tag
        run: |
          VERSION=$(python -c "from pai.version import VERSION; print(VERSION)")
          git tag v$VERSION
          git push origin v$VERSION
      # a git tag pushed by the GitHub Actions bot will not trigger another workflow run
      - name: Install dependencies
        run: pip install wheel setuptools twine
      - name: Build package for pai
        run: python setup.py sdist bdist_wheel
      - name: Publish package to PyPI (pai)
        run: twine upload dist/* --skip-existing -u __token__ -p $PAI_PYPI_TOKEN
      - name: cleanup
        run: |
          rm -rf dist
          rm -rf build
          rm -rf pai.egg-info
      - name: Build package for alipai
        run: PACKAGE_NAME=alipai python setup.py sdist bdist_wheel
      - name: Publish package to PyPI (alipai)
        run: twine upload dist/* --skip-existing -u __token__ -p $ALIPAI_PYPI_TOKEN
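The "Check version match" step above refuses to tag a release unless the branch name releases/vX.Y.Z agrees with the version declared in pai/version.py. The same gate can be sketched in Python (a hedged illustration; the workflow itself performs this check in shell):

```python
def versions_match(branch: str, file_version: str) -> bool:
    """Return True when a release branch name such as "releases/v0.4.9"
    matches the version declared in pai/version.py (file_version)."""
    prefix = "releases/v"
    if not branch.startswith(prefix):
        # The job only runs for merged releases/v* branches.
        return False
    # Strip the prefix, mirroring ${BRANCH_VERSION#releases/v} in the workflow.
    return branch[len(prefix):] == file_version


# The tag v0.4.9 is only created when both sides agree.
print(versions_match("releases/v0.4.9", "0.4.9"))  # True
```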
22 changes: 22 additions & 0 deletions .github/workflows/unit.yaml
@@ -0,0 +1,22 @@
name: Unit test

on: [push]

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  unit-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python 3.8
        uses: actions/setup-python@v5
        with:
          python-version: "3.8"
      - name: Install Nox
        run: |
          pip install nox
      - name: Linting
        run: nox -s unit
Comment on lines +11 to +22

Check warning — Code scanning / CodeQL: Workflow does not contain permissions (Medium)

Actions job or workflow does not limit the permissions of the GITHUB_TOKEN. Consider setting an explicit permissions block, using the following as a minimal starting point: {contents: read}

Copilot Autofix (AI, 2 months ago)

In general, to fix this problem you should explicitly declare the minimal required GITHUB_TOKEN permissions for the workflow or for each job. For a unit-test workflow that only checks out code and runs tests, contents: read is usually sufficient. Defining it at the top level makes it apply to all jobs unless overridden.

The minimal fix here is to add a permissions: block at the root of the workflow, just under the on: trigger. It constrains GITHUB_TOKEN for the unit-test job (and any future jobs) without changing any steps; only YAML configuration is added.

Concretely, in .github/workflows/unit.yaml, insert

permissions:
  contents: read

between the on: [push] line and the existing concurrency: block. No other files or regions need to change.

Suggested changeset: .github/workflows/unit.yaml. Run the following command in your local git repository to apply the patch:

cat << 'EOF' | git apply
diff --git a/.github/workflows/unit.yaml b/.github/workflows/unit.yaml
--- a/.github/workflows/unit.yaml
+++ b/.github/workflows/unit.yaml
@@ -2,6 +2,9 @@
 
 on: [push]
 
+permissions:
+  contents: read
+
 concurrency:
   group: ${{ github.workflow }}-${{ github.ref }}
   cancel-in-progress: true
EOF
7 changes: 6 additions & 1 deletion .pre-commit-config.yaml
@@ -37,7 +37,7 @@ repos:
         - -w
 
   - repo: https://github.com/pycqa/isort
-    rev: 5.10.1
+    rev: 5.12.0
     hooks:
       - id: isort
         name: isort (python)
@@ -53,3 +53,8 @@ repos:
     rev: 0.6.1
     hooks:
      - id: nbstripout
+
+  - repo: https://github.com/gitleaks/gitleaks
+    rev: v8.16.1
+    hooks:
+      - id: gitleaks
99 changes: 68 additions & 31 deletions README.md
This PR replaces the English README.md with the updated Chinese version (translated below); the English text moves to README_CN.md. The updated README:

# PAI Python SDK

[English](./README_CN.md) | 简体中文

The PAI Python SDK is provided by Alibaba Cloud's [Platform for Artificial Intelligence (PAI)](https://www.aliyun.com/product/bigdata/learn). It offers an easy-to-use high-level API that lets machine learning engineers train and deploy models on PAI with Python, connecting the stages of the machine learning workflow.

## 🔧 Installation

Install the PAI Python SDK with the following command (Python >= 3.8 is supported):

```shell
python -m pip install pai
```

## 📖 Documentation

For detailed documentation, including user guides and the API reference, visit the [PAI Python SDK documentation](https://pai.readthedocs.io/) or see the files under the [docs](./docs) directory.

## 🛠 Usage Examples

- Submit a custom training job

The following code shows how to submit a custom training job through the SDK:

```python
from pai.estimator import Estimator
from pai.image import retrieve

est = Estimator(
    # Retrieve the latest PyTorch image provided by PAI
    image_uri=retrieve(
        framework_name="PyTorch", framework_version="latest"
    ).image_uri,
    command="echo hello",
    # Optionally, specify source_dir to upload your training code:
    # source_dir="./train_src",
    instance_type="ecs.c6.large",
)

# Submit the training job
est.fit()

print(est.model_data())
```

- Deploy a large language model

PAI provides many pretrained models that can be deployed easily with the PAI Python SDK:

```python
from pai.model import RegisteredModel

# Retrieve the Qwen1.5-7B model provided by PAI
qwen_model = RegisteredModel("qwen1.5-7b-chat", model_provider="pai")

# Deploy the model
p = qwen_model.deploy(service_name="qwen_service")

# Call the service
p.predict(
    data={
        "prompt": "What is the purpose of life?",
        "system_prompt": "You are a helpful assistant.",
        "temperature": 0.8,
    }
)

# Large language models provided by PAI support the OpenAI API and can be
# called through the openai SDK
openai_client = p.openai()
res = openai_client.chat.completions.create(
    model="default",
    max_tokens=1024,
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the purpose of life?"},
    ],
)
print(res.choices[0].message.content)
```

- Fine-tune a pretrained model

Submit a model fine-tuning job using the fine-tuning scripts provided by PAI:

```python
from pai.model import ModelTrainingRecipe

training_recipe = ModelTrainingRecipe(
    model_name="qwen2-0.5b-instruct",
    model_provider="pai",
    instance_type="ecs.gn6e-c12g1.3xlarge",
)

training_recipe.train(
    inputs={
        # A local path, or a path on Alibaba Cloud OSS (oss://<bucketname>/path/to/data)
        "train": "<YourTrainingDataPath>"
    }
)
```

More usage examples are available in the examples repository provided by PAI: [pai-examples](https://github.com/aliyun/pai-examples/tree/master/pai-python-sdk)

## 🤝 Contributing

Contributions to the PAI Python SDK are welcome. Please read the [CONTRIBUTING](./CONTRIBUTING.md) file to learn how to contribute to this project.

## 📝 License

The PAI Python SDK is developed by Alibaba Cloud and licensed under the Apache License, Version 2.0.

## 📬 Contact

For support or inquiries, open an issue on the GitHub repository, or contact us via the DingTalk group:

<img src="./assets/dingtalk-group.png" alt="DingTalkGroup" width="500"/>