forked from kubeedge/sedna
[Incremental learning] Support training/evaluation on different nodes #7
Open
Description
PlantUML sequence diagram
@startuml
'https://plantuml.com/sequence-diagram
'autonumber
actor User
participant "K8S API" as API
participant GM
participant "LC at dataset-node" as LC0
participant "LC at train-node" as LC1
participant "LC at eval-node" as LC2
GM -> API: list/watch dataset / incremental job
User -> API: Create a dataset with: \n1. an S3 URL\n2. nodeName: dataset-node
API --> User:
GM -> LC0 : sync the dataset info to the LC located in dataset-node
LC0 -> LC0 : monitor the dataset and update the dataset's status
User -> API: Create an incremental job with:\n \
1. train worker spec with train-node\n \
2. eval worker spec with eval-node\n \
3. infer worker spec with nodeSelector
API --> User:
API --> GM: watched new job
GM -> API: create infer-worker
loop incremental training
GM -> API: set the job state to train-waiting
GM -> LC0: sync the job info
loop train-trigger is not satisfied
LC0 -> LC0: append new incremental samples to the job, if any
end
LC0 -> GM: triggered, transition the job state to train-ready
GM -> API: create train-worker
GM -> LC1: sync the job info
LC1 -> GM: get the message from train-worker, \ntransition the job state to eval-ready
GM -> API: create eval-worker
GM -> LC2: sync the job info
LC2 -> LC2: handle eval result:
alt deploy-trigger is satisfied
LC2 -> LC2: update the deploy model,\ntransition state to deploy-ready
GM -> API: restart the infer-worker (cold model-update)
else not satisfied
LC2 -> GM: transition state to no-deploy
end
end
@enduml
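
The job states named in the diagram (train-waiting, train-ready, eval-ready, deploy-ready, no-deploy) can be read as a small state machine. The sketch below is only an illustration of that reading, not Sedna's actual GM/LC code: the event names, the `next_state` helper, and the `REPORTING_NODE` table are hypothetical, and the `next_round` event is an assumption that models the outer loop setting the job back to train-waiting.

```python
from enum import Enum


class JobState(str, Enum):
    """Job states as they are named in the sequence diagram."""
    TRAIN_WAITING = "train-waiting"
    TRAIN_READY = "train-ready"
    EVAL_READY = "eval-ready"
    DEPLOY_READY = "deploy-ready"
    NO_DEPLOY = "no-deploy"


# (current state, event) -> next state.
# Event names are hypothetical labels for the arrows in the diagram.
TRANSITIONS = {
    (JobState.TRAIN_WAITING, "train_trigger_satisfied"): JobState.TRAIN_READY,
    (JobState.TRAIN_READY, "train_worker_finished"): JobState.EVAL_READY,
    (JobState.EVAL_READY, "deploy_trigger_satisfied"): JobState.DEPLOY_READY,
    (JobState.EVAL_READY, "deploy_trigger_not_satisfied"): JobState.NO_DEPLOY,
    # Assumption: the outer loop resets the job to train-waiting each round.
    (JobState.DEPLOY_READY, "next_round"): JobState.TRAIN_WAITING,
    (JobState.NO_DEPLOY, "next_round"): JobState.TRAIN_WAITING,
}

# Node whose LC reports the transition into each state, per the diagram.
REPORTING_NODE = {
    JobState.TRAIN_READY: "dataset-node",
    JobState.EVAL_READY: "train-node",
    JobState.DEPLOY_READY: "eval-node",
    JobState.NO_DEPLOY: "eval-node",
}


def next_state(state: JobState, event: str) -> JobState:
    """Return the state that follows `state` on `event`, or raise ValueError."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(
            f"event {event!r} is not valid in state {state.value!r}"
        ) from None


if __name__ == "__main__":
    # Walk one round in which the deploy trigger is satisfied.
    state = JobState.TRAIN_WAITING
    for event in (
        "train_trigger_satisfied",
        "train_worker_finished",
        "deploy_trigger_satisfied",
        "next_round",
    ):
        state = next_state(state, event)
        print(event, "->", state.value)
```

Keeping the transitions in a single table makes it easy to check that each state change is reported by exactly one LC, which is the point of running the dataset, train, and eval steps on different nodes.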