ArgoWorkflow Tutorial (7): Efficient File Sharing Between Steps

Previously we looked at sharing files between steps with artifacts. This time we cover how to share files between steps efficiently with a PVC.

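1. Overview

The artifact article showed how to pass files between steps with artifacts. This article introduces a simpler way to pass files: sharing a PVC.

Artifacts are relayed through S3, so they are slower than sharing a PVC directly. Artifacts are also generally meant for result output, that is, saving final results to S3, rather than purely for sharing files.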

2. Sharing files with artifacts

We have already covered how to pass files between steps with artifacts. Here is a quick recap.

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: artifact-passing-
spec:
  entrypoint: artifact-example
  templates:
  - name: artifact-example
    steps:
    - - name: generate-artifact
        template: whalesay
    - - name: consume-artifact
        template: print-message
        arguments:
          artifacts:
          # bind message to the hello-art artifact
          # generated by the generate-artifact step
          - name: message
            from: "{{steps.generate-artifact.outputs.artifacts.hello-art}}"

  - name: whalesay
    container:
      image: docker/whalesay:latest
      command: [sh, -c]
      args: ["cowsay hello world | tee /tmp/hello_world.txt"]
    outputs:
      artifacts:
      # generate hello-art artifact from /tmp/hello_world.txt
      # artifacts can be directories as well as files
      - name: hello-art
        path: /tmp/hello_world.txt

  - name: print-message
    inputs:
      artifacts:
      # unpack the message input artifact
      # and put it at /tmp/message
      - name: message
        path: /tmp/message
    container:
      image: alpine:latest
      command: [sh, -c]
      args: ["cat /tmp/message"]

As you can see, sharing files through artifacts works much like passing parameters between steps.

Exporting an artifact

outputs:
  artifacts:
  # generate hello-art artifact from /tmp/hello_world.txt
  # artifacts can be directories as well as files
  - name: hello-art
    path: /tmp/hello_world.txt

Referencing the exported artifact in a later step

arguments:
  artifacts:
  # bind message to the hello-art artifact
  # generated by the generate-artifact step
  - name: message
    from: "{{steps.generate-artifact.outputs.artifacts.hello-art}}"

Finally, this is how a step brings the artifact in. In the demo below, the artifact is unpacked and placed at /tmp/message inside the Pod.

inputs:
  artifacts:
  # unpack the message input artifact
  # and put it at /tmp/message
  - name: message
    path: /tmp/message

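3. Sharing files efficiently with a PVC

As the name suggests, the different steps all mount the same PVC, which naturally gives them shared files.

Each step in ArgoWorkflow runs in its own separately started Pod, so the steps have no common local filesystem to rely on.

There are two ways to do this:

  • 1) Create the PVC dynamically: the PVC is created when the Workflow runs and deleted when it finishes
  • 2) Use an existing PVC: nothing is created and nothing is deleted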

Creating the PVC dynamically

The complete demo:

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: volumes-pvc-
spec:
  entrypoint: volumes-pvc-example
  volumeClaimTemplates:                 # define volume, same syntax as k8s Pod spec
  - metadata:
      name: workdir                     # name of volume claim
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi                  # Gi => 1024 * 1024 * 1024

  templates:
  - name: volumes-pvc-example
    steps:
    - - name: generate
        template: whalesay
    - - name: print
        template: print-message

  - name: whalesay
    container:
      image: docker/whalesay:latest
      command: [sh, -c]
      args: ["echo generating message in volume; cowsay hello world | tee /mnt/vol/hello_world.txt"]
      # Mount workdir volume at /mnt/vol before invoking docker/whalesay
      volumeMounts:                     # same syntax as k8s Pod spec
      - name: workdir
        mountPath: /mnt/vol

  - name: print-message
    container:
      image: alpine:latest
      command: [sh, -c]
      args: ["echo getting message from volume; find /mnt/vol; cat /mnt/vol/hello_world.txt"]
      # Mount workdir volume at /mnt/vol before invoking docker/whalesay
      volumeMounts:                     # same syntax as k8s Pod spec
      - name: workdir
        mountPath: /mnt/vol

First, a PVC template is defined. When the Workflow runs, a PVC is created from this template.

spec:
  entrypoint: volumes-pvc-example
  volumeClaimTemplates:                 # define volume, same syntax as k8s Pod spec
  - metadata:
      name: workdir                     # name of volume claim
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi                  # Gi => 1024 * 1024 * 1024

Then each step mounts that PVC at the directory it needs.

  - name: whalesay
    container:
      image: docker/whalesay:latest
      command: [sh, -c]
      args: ["echo generating message in volume; cowsay hello world | tee /mnt/vol/hello_world.txt"]
      # Mount workdir volume at /mnt/vol before invoking docker/whalesay
      volumeMounts:                     # same syntax as k8s Pod spec
      - name: workdir
        mountPath: /mnt/vol

  - name: print-message
    container:
      image: alpine:latest
      command: [sh, -c]
      args: ["echo getting message from volume; find /mnt/vol; cat /mnt/vol/hello_world.txt"]
      # Mount workdir volume at /mnt/vol before invoking docker/whalesay
      volumeMounts:                     # same syntax as k8s Pod spec
      - name: workdir
        mountPath: /mnt/vol

That is all it takes to share files between the steps.

Once the Workflow finishes, Argo automatically deletes the PVC it created.
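When the dynamically created PVC is deleted can be tuned through the Workflow's volumeClaimGC field. The fragment below is a minimal sketch, not part of the original demo: the Workflow name, template name, and file path are made up for illustration. As far as I know the default strategy is OnWorkflowSuccess, which would leave the PVC in place after a failed run; it is worth confirming against the field reference for your Argo version.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: volumes-pvc-gc-          # hypothetical name for this sketch
spec:
  entrypoint: write-file
  volumeClaimGC:
    # Delete the PVC when the Workflow completes, whether it succeeded or failed.
    # The other accepted value is OnWorkflowSuccess.
    strategy: OnWorkflowCompletion
  volumeClaimTemplates:
  - metadata:
      name: workdir
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi
  templates:
  - name: write-file                     # hypothetical template
    container:
      image: alpine:latest
      command: [sh, -c]
      args: ["echo hello > /mnt/vol/hello.txt"]
      volumeMounts:
      - name: workdir
        mountPath: /mnt/vol
```

Keeping the PVC after a failure can be handy for debugging, since the intermediate files stay available for inspection.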

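Using an existing PVC

In some cases we may want to access a volume that already exists instead of creating and destroying one dynamically.

The complete demo: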

# Define Kubernetes PVC
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: my-existing-volume
spec:
  accessModes: [ "ReadWriteOnce" ]
  resources:
    requests:
      storage: 1Gi

---
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: volumes-existing-
spec:
  entrypoint: volumes-existing-example
  volumes:
  # Pass my-existing-volume as an argument to the volumes-existing-example template
  # Same syntax as k8s Pod spec
  - name: workdir
    persistentVolumeClaim:
      claimName: my-existing-volume

  templates:
  - name: volumes-existing-example
    steps:
    - - name: generate
        template: whalesay
    - - name: print
        template: print-message

  - name: whalesay
    container:
      image: docker/whalesay:latest
      command: [sh, -c]
      args: ["echo generating message in volume; cowsay hello world | tee /mnt/vol/hello_world.txt"]
      volumeMounts:
      - name: workdir
        mountPath: /mnt/vol

  - name: print-message
    container:
      image: alpine:latest
      command: [sh, -c]
      args: ["echo getting message from volume; find /mnt/vol; cat /mnt/vol/hello_world.txt"]
      volumeMounts:
      - name: workdir
        mountPath: /mnt/vol

First, create a PVC by hand.

# Define Kubernetes PVC
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: my-existing-volume
spec:
  accessModes: [ "ReadWriteOnce" ]
  resources:
    requests:
      storage: 1Gi

Then declare in the Workflow that this PVC should be used.

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: volumes-existing-
spec:
  entrypoint: volumes-existing-example
  volumes:
  # Pass my-existing-volume as an argument to the volumes-existing-example template
  # Same syntax as k8s Pod spec
  - name: workdir
    persistentVolumeClaim:
      claimName: my-existing-volume

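You can think of this as replacing the earlier volumeClaimTemplates with a volumes entry that points at the PVC through persistentVolumeClaim.

After that, each step mounts the PVC at the directory it needs.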

  - name: whalesay
    container:
      image: docker/whalesay:latest
      command: [sh, -c]
      args: ["echo generating message in volume; cowsay hello world | tee /mnt/vol/hello_world.txt"]
      volumeMounts:
      - name: workdir
        mountPath: /mnt/vol

  - name: print-message
    container:
      image: alpine:latest
      command: [sh, -c]
      args: ["echo getting message from volume; find /mnt/vol; cat /mnt/vol/hello_world.txt"]
      volumeMounts:
      - name: workdir
        mountPath: /mnt/vol

This part is exactly the same as with a dynamically created PVC.
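The demos in this article run their steps one after another, so a ReadWriteOnce volume is enough. If several steps run in parallel, their Pods may be scheduled onto different nodes, and a ReadWriteOnce volume can only be mounted read-write by a single node. The sketch below is an assumption-laden variant for that case: it presumes the cluster has a StorageClass that supports ReadWriteMany (an NFS-backed one, for example), and the PVC name, Workflow name, template names, and file names are all hypothetical.

```yaml
# Assumption: the cluster's storage supports the ReadWriteMany access mode.
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: my-shared-volume                 # hypothetical PVC name
spec:
  accessModes: [ "ReadWriteMany" ]
  resources:
    requests:
      storage: 1Gi

---
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: volumes-parallel-        # hypothetical name for this sketch
spec:
  entrypoint: parallel-writers
  volumes:
  - name: workdir
    persistentVolumeClaim:
      claimName: my-shared-volume

  templates:
  - name: parallel-writers
    steps:
    # Entries in the same inner list run in parallel
    - - name: write-a
        template: write
        arguments:
          parameters: [{name: file, value: a.txt}]
      - name: write-b
        template: write
        arguments:
          parameters: [{name: file, value: b.txt}]
    # This group starts only after both writers have finished
    - - name: list
        template: list-files

  - name: write
    inputs:
      parameters:
      - name: file
    container:
      image: alpine:latest
      command: [sh, -c]
      args: ["echo hello > /mnt/vol/{{inputs.parameters.file}}"]
      volumeMounts:
      - name: workdir
        mountPath: /mnt/vol

  - name: list-files
    container:
      image: alpine:latest
      command: [sh, -c]
      args: ["ls -l /mnt/vol; cat /mnt/vol/a.txt /mnt/vol/b.txt"]
      volumeMounts:
      - name: workdir
        mountPath: /mnt/vol
```

The two writers use different file names on purpose: a shared volume gives no protection against concurrent writes to the same file.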

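4. Summary

This article covered how to share files through a PVC in an Argo Workflow.

  • 1) Define a PVC template, or point the Workflow at an existing PVC
  • 2) Mount the PVC at the required directory in each step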