forked from kubeedge/sedna
Add pod template like support #2
Pod-template-like support for workers:
current state
The current spec definition of the worker:
```go
type WorkerSpec struct {
	ScriptDir        string     `json:"scriptDir"`
	ScriptBootFile   string     `json:"scriptBootFile"`
	FrameworkType    string     `json:"frameworkType"`
	FrameworkVersion string     `json:"frameworkVersion"`
	Parameters       []ParaSpec `json:"parameters"`
}

// ParaSpec is a description of a parameter
type ParaSpec struct {
	Key   string `json:"key"`
	Value string `json:"value"`
}
```

`ScriptDir`/`ScriptBootFile` is the entrypoint of the worker, either a local path or central storage (e.g. s3). `FrameworkType`/`FrameworkVersion` specifies the base container image of the worker. `Parameters` specifies the environment variables of the worker.
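A minimal sketch of how `Parameters` could be turned into container environment variables. The `envFromParameters` helper and the standalone `ParaSpec` mirror here are hypothetical stand-ins for illustration, not Sedna code:

```go
package main

import "fmt"

// Hypothetical mirror of the ParaSpec type above.
type ParaSpec struct {
	Key   string `json:"key"`
	Value string `json:"value"`
}

// envFromParameters sketches how worker Parameters could be translated
// into KEY=VALUE environment strings for a container.
func envFromParameters(params []ParaSpec) []string {
	env := make([]string, 0, len(params))
	for _, p := range params {
		env = append(env, fmt.Sprintf("%s=%s", p.Key, p.Value))
	}
	return env
}

func main() {
	params := []ParaSpec{
		{Key: "nms_threshold", Value: "0.6"},
	}
	fmt.Println(envFromParameters(params)) // [nms_threshold=0.6]
}
```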
pros
- simple for the demo
cons
- doesn't support docker-container capabilities: code version management, distribution, etc.
- doesn't support k8s-pod features: resource limits, user-defined volumes, etc.
- needs central storage (e.g. s3) for the code if it is not a local path.
- needs a new base image to be built if the current base image can't satisfy the user's requirements (user-defined code package dependencies, or a new framework), after which the GM configuration must be re-edited and the GM restarted.
proposals: Add pod template support for workers
proposal 1: just pod template
This would deprecate the current `ScriptDir`-based spec.
```go
import v1 "k8s.io/api/core/v1"

type WorkerSpec struct {
	v1.PodTemplateSpec `json:",inline"`
}
```

examples and discussions
joint-inference-service
So in this proposal, the joint-inference example here would become:
```yaml
apiVersion: sedna.io/v1alpha1
kind: JointInferenceService
metadata:
  name: example
spec:
  edgeWorker:
    model:
      name: "small-model"
    nodeName: "edge0"
    hardExampleMining:
      name: "IBT"
    workerSpec:
      containers:
        - image: edge-inference-worker:latest
          imagePullPolicy: Always
          env:  # user defined environments
            - name: nms_threshold
              value: "0.6"
          ports:  # user defined ports
            - containerPort: 80
              protocol: TCP
          resources:  # user defined resources
            requests:
              memory: 64Mi
              cpu: 100m
            limits:
              memory: 512Mi
          volumeMounts:
            - name: localvideo
              mountPath: /data/
      volumes:  # user defined volumes
        - name: localvideo
          emptyDir: {}
  cloudWorker:
    model:
      name: "big-model"
    nodeName: "solar-corona-cloud"
    workerSpec:
      containers:
        - image: cloud-inference-worker:latest
          imagePullPolicy: Always
          env:  # user defined environments
            - name: nms_threshold
              value: "0.6"
          ports:  # user defined ports
            - containerPort: 80
              protocol: TCP
          resources:  # user defined resources
            limits:
              memory: 2Gi
```

Some things need to be discussed for the joint inference service:
- Where are the resource limits of the model? Shared with the container resource limits?
- Where is the serving container-side port of the cloudWorker?
- Is the cloudWorker's workerSpec needed? The user may only specify the big model.
federated-learning-job
So in this proposal, the federated-learning example here would become:
```yaml
apiVersion: sedna.io/v1alpha1
kind: FederatedLearningJob
metadata:
  name: surface-defect-detection
spec:
  aggregationWorker:
    model:
      name: "surface-defect-detection-model"
    nodeName: "cloud0"
    # where's the serving port of the aggregator worker?
    workerSpec:
      containers:
        - image: aggregator-worker:latest
          imagePullPolicy: Always
          env:  # user defined environments
            - name: exit_round
              value: "0.3"
          ports:
            - containerPort: 80
              protocol: TCP
          resources:  # user defined resources
            requests:
              memory: 64Mi
              cpu: 100m
            limits:
              memory: 512Mi
  trainingWorkers:
    - nodeName: "edge1"
      dataset:
        name: "edge-1-surface-defect-detection-dataset"
      workerSpec:
        containers:
          - image: training-worker:latest
            imagePullPolicy: Always
            env:  # user defined environments
              - name: batch_size
                value: "0.3"
              - name: learning_rate
                value: "0.001"
              - name: epochs
                value: "1"
            resources:  # user defined resources
              requests:
                memory: 64Mi
                cpu: 100m
              limits:
                memory: 512Mi
    - nodeName: "edge2"
      dataset:
        name: "edge-2-surface-defect-detection-dataset"
      workerSpec:
        containers:
          - image: training-worker:latest
            imagePullPolicy: Always
            env:  # user defined environments
              - name: batch_size
                value: "0.3"
              - name: learning_rate
                value: "0.001"
              - name: epochs
                value: "1"
            resources:  # user defined resources
              requests:
                memory: 64Mi
                cpu: 100m
              limits:
                memory: 512Mi
```

incremental-learning-job
The common problem:
- finding a good way to write the OpenAPI schema of the CRD, since PodSpec has a lot of fields.
deployment support
Using the features of Deployment:
- replica pods in case of pod failure
Alternative: using a ReplicaSet.

```go
type DeploymentSpec struct {
	Replicas *int32     `json:"replicas,omitempty"`
	Template WorkerSpec `json:"template"`
	// etc.
}
```

daemonset support
use case:
- running the training worker of federated learning on every node of a group.

```go
type DaemonsetSpec struct {
	Selector *metav1.LabelSelector `json:"selector"`
	Template WorkerSpec            `json:"template"`
	// etc.
}
```
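A minimal sketch of how a daemonset-style controller could use the selector to pick the nodes of a group, assuming plain matchLabels semantics (the real `metav1.LabelSelector` also supports matchExpressions). The `LabelSelector` stand-in and `selectorMatches` helper here are hypothetical, not Sedna code:

```go
package main

import "fmt"

// Stand-in for metav1.LabelSelector, restricted to matchLabels.
type LabelSelector struct {
	MatchLabels map[string]string
}

// selectorMatches reports whether a node's labels satisfy the selector:
// every matchLabels entry must be present with the same value.
func selectorMatches(sel LabelSelector, nodeLabels map[string]string) bool {
	for k, v := range sel.MatchLabels {
		if nodeLabels[k] != v {
			return false
		}
	}
	return true
}

func main() {
	sel := LabelSelector{MatchLabels: map[string]string{"group": "edge-training"}}
	nodes := map[string]map[string]string{
		"edge1":  {"group": "edge-training"},
		"edge2":  {"group": "edge-training"},
		"cloud0": {"group": "cloud"},
	}
	for name, labels := range nodes {
		if selectorMatches(sel, labels) {
			fmt.Println("schedule training worker on", name)
		}
	}
}
```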