docs/source_en/Usage Guide/Server and Client/Server.md
Lines changed: 0 additions & 7 deletions
@@ -55,12 +55,9 @@ This configuration starts 3 nodes:
 Before starting the Server, you need to set the following environment variables:

 ```bash
-export DEVICE_COUNT_PER_PHYSICAL_NODE=8 # Specify the total number of GPUs on each physical machine
 export TWINKLE_TRUST_REMOTE_CODE=0 # Whether to trust remote code (security consideration)
 ```

-> **Important Note**: `DEVICE_COUNT_PER_PHYSICAL_NODE` must be set to the actual number of physical GPUs on the machine, which is crucial for correctly parsing the `ranks` configuration.
-
 ### Node Rank in YAML Configuration

 In the YAML configuration file, **each component needs to occupy a separate Node**.
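
For the environment setup in the hunk above, a pre-flight check along the following lines might help when migrating existing launch scripts. It is only a sketch: `check_stale_env` is a hypothetical helper, not part of the repository. The diff touches documentation only, so whether the server still reads `DEVICE_COUNT_PER_PHYSICAL_NODE` is an assumption that should be verified against the code.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helper; not part of the repository.

# Default to not trusting remote code unless the caller already chose a value.
export TWINKLE_TRUST_REMOTE_CODE="${TWINKLE_TRUST_REMOTE_CODE:-0}"

# Prints "ok" and returns 0 when the variable dropped from the docs is unset;
# prints a notice and returns 1 when it is still set to a non-empty value.
check_stale_env() {
  if [ -n "${DEVICE_COUNT_PER_PHYSICAL_NODE:-}" ]; then
    echo "note: DEVICE_COUNT_PER_PHYSICAL_NODE is set but no longer documented"
    return 1
  fi
  echo "ok"
  return 0
}
```

Calling `check_stale_env` before starting the server would flag launch scripts that still export the old variable, without changing any behavior.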
@@ -117,7 +114,6 @@ applications:
 **Important notes:**
 - The `ranks` configuration uses **physical GPU card numbers**, directly corresponding to the actual GPU devices on the machine
 - The `device_mesh` configuration uses parameters like `dp_size`, `tp_size`, `pp_size`, `ep_size` instead of the original `mesh` and `mesh_dim_names`
-- The environment variable `DEVICE_COUNT_PER_PHYSICAL_NODE` must be set to inform the system of the total number of physical GPUs on each machine
 - Different components will be automatically assigned to different Nodes
 - Ray will automatically schedule to the appropriate Node based on resource requirements (`num_gpus`, `num_cpus` in `ray_actor_options`)

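
The bullet removed in this hunk is one of several places where the variable was mentioned. One way to confirm that no mentions remain after the change is a recursive search, sketched below. `find_stale_refs` is a hypothetical helper name, and the directory to scan is left to the caller.

```shell
#!/usr/bin/env bash
# Hypothetical helper; not part of the repository.

# Lists every remaining mention of the variable under a directory.
# Prints matching file:line:text entries, or "clean" if none are found.
find_stale_refs() {
  local dir="$1"
  local hits
  # grep exits non-zero when nothing matches; "|| true" keeps that from
  # aborting scripts that run with "set -e".
  hits="$(grep -rn -- 'DEVICE_COUNT_PER_PHYSICAL_NODE' "$dir" || true)"
  if [ -n "$hits" ]; then
    printf '%s\n' "$hits"
  else
    echo "clean"
  fi
}
```

For example, `find_stale_refs docs/` would list any documentation lines that still mention the variable.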
@@ -393,7 +389,6 @@ applications:
 num_cpus: 0.1
 runtime_env:
 env_vars:
-DEVICE_COUNT_PER_PHYSICAL_NODE: "8" # Total number of physical GPUs on each machine

 # 3. Sampler service (optional, for inference sampling)
 - name: sampler-Qwen2.5-0.5B-Instruct
@@ -425,7 +420,6 @@ applications:
 num_gpus: 1 # Sampler needs independent GPU
 runtime_env:
 env_vars:
-DEVICE_COUNT_PER_PHYSICAL_NODE: "8" # Total number of physical GPUs on each machine
 ```

 ## Configuration Item Description
@@ -471,6 +465,5 @@ device_mesh:
 **Environment variables:**

 ```bash
-export DEVICE_COUNT_PER_PHYSICAL_NODE=8 # Total number of GPUs on each physical machine (must be set)
 export TWINKLE_TRUST_REMOTE_CODE=0 # Whether to trust remote code