diff --git a/README.md b/README.md index fa16dadd..a000eace 100644 --- a/README.md +++ b/README.md @@ -11,6 +11,11 @@ English version|[中文版](README_cn.md) - [About](#about) - [When to use](#when-to-use) +- [Prerequisites](#prerequisites) +- [Quick Start](#quick-start) + - [Preparing your GPU Nodes](#preparing-your-gpu-nodes) + - [Enabling vGPU Support in Kubernetes](#enabling-vGPU-support-in-kubernetes) + - [Running GPU Jobs](#running-gpu-jobs) - [Stategy](#strategy) - [Benchmarks](#Benchmarks) - [Features](#Features) @@ -18,10 +23,6 @@ English version|[中文版](README_cn.md) - [Known Issues](#Known-Issues) - [TODO](#TODO) - [Prerequisites](#prerequisites) -- [Quick Start](#quick-start) - - [Preparing your GPU Nodes](#preparing-your-gpu-nodes) - - [Enabling vGPU Support in Kubernetes](#enabling-vGPU-support-in-kubernetes) - - [Running GPU Jobs](#running-gpu-jobs) - [Uninstall](#Uninstall) - [Tests](#Tests) - [Issues and Contributing](#issues-and-contributing) @@ -38,86 +39,6 @@ The **k8s vGPU scheduler** is based on 4pd-k8s-device-plugin ([4paradigm/k8s-dev 4. Situations that require a large number of small GPUs, such as teaching scenarios where one GPU is provided for multiple students to use, and the cloud platform provides small GPU instances. 5. In the case of insufficient physical device memory, virtual device memory can be turned on, such as training of large batches and large models. -## Scheduling - -Current schedule strategy is to select GPU with lowest task, thus balance the loads across mutiple GPUs - -## Benchmarks - -Three instances from ai-benchmark have been used to evaluate vGPU-device-plugin performance as follows - -| Test Environment | description | -| ---------------- | :------------------------------------------------------: | -| Kubernetes version | v1.12.9 | -| Docker version | 18.09.1 | -| GPU Type | Tesla V100 | -| GPU Num | 2 | - -| Test instance | description | -| ------------- | :---------------------------------------------------------: | -| nvidia-device-plugin | k8s + nvidia k8s-device-plugin | -| vGPU-device-plugin | k8s + VGPU k8s-device-plugin,without virtual device memory | -| vGPU-device-plugin(virtual device memory) | k8s + VGPU k8s-device-plugin,with virtual device memory | - -Test Cases: - -| test id | case | type | params | -| ------- | :-----------: | :-------: | :---------------------: | -| 1.1 | Resnet-V2-50 | inference | batch=50,size=346*346 | -| 1.2 | Resnet-V2-50 | training | batch=20,size=346*346 | -| 2.1 | Resnet-V2-152 | inference | batch=10,size=256*256 | -| 2.2 | Resnet-V2-152 | training | batch=10,size=256*256 | -| 3.1 | VGG-16 | inference | batch=20,size=224*224 | -| 3.2 | VGG-16 | training | batch=2,size=224*224 | -| 4.1 | DeepLab | inference | batch=2,size=512*512 | -| 4.2 | DeepLab | training | batch=1,size=384*384 | -| 5.1 | LSTM | inference | batch=100,size=1024*300 | -| 5.2 | LSTM | training | batch=10,size=1024*300 | - -Test Result: ![img](./imgs/benchmark_inf.png) - -![img](./imgs/benchmark_train.png) - -To reproduce: - -1. install vGPU-nvidia-device-plugin,and configure properly -2. run benchmark job - -``` -$ kubectl apply -f benchmarks/ai-benchmark/ai-benchmark.yml -``` - -3. View the result by using kubctl logs - -``` -$ kubectl logs [pod id] -``` - -## Features - -- Specify the number of vGPUs divided by each physical GPU. -- Limit vGPU's Device Memory. -- Allows vGPU allocation by specifying device memory -- Limit vGPU's Streaming Multiprocessor. 
-- Allows vGPU allocation by specifying device core usage
-- Zero changes to existing programs
-
-## Experimental Features
-
-- Virtual Device Memory
-
-  The device memory of the vGPU can exceed the physical device memory of the GPU. At this time, the excess part will be put in the RAM, which will have a certain impact on the performance.
-
-## Known Issues
-
-- Currently, A100 MIG not supported
-- Currently, only computing tasks are supported, and video codec processing is not supported.
-
-## TODO
-
-- Support video codec processing
-- Support Multi-Instance GPUs (MIG)
-
 ## Prerequisites
 
 The list of prerequisites for running the NVIDIA device plugin is described below:
@@ -172,24 +93,42 @@ Then, you need to label your GPU nodes which can be scheduled by 4pd-k8s-schedul
 kubectl label nodes {nodeid} gpu=on
 ```
 
-### Enabling vGPU Support in Kubernetes
+### Download
 
 Once you have configured the options above on all the GPU nodes in your cluster, remove existing NVIDIA device plugin for Kubernetes if it already exists. Then, you need to clone our project, and enter deployments folder
 
 ```
 $ git clone https://github.com/4paradigm/k8s-vgpu-scheduler.git
-$ cd k8s-vgpu/deployments
+$ cd k8s-vgpu-scheduler/deployments
+```
+
+### Set scheduler image version
+
+Check your Kubernetes server version using the following command
+
+```
+kubectl version
+```
+
+Then set the `scheduler.kubeScheduler.image` key in the `deployments/vgpu/values.yaml` file to match your Kubernetes server version. For example, if your cluster server version is 1.16.8, change the image tag to v1.16.8
+
+```
+scheduler:
+  kubeScheduler:
+    image: "registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.16.8"
 ```
 
-In the deployments folder, you can customize your vGPU support by modifying following values in `values.yaml/devicePlugin/extraArgs` :
+### Enabling vGPU Support in Kubernetes
+
+In the deployments folder, you can customize your vGPU support by modifying the following keys under `devicePlugin.extraArgs` in the `values.yaml` file:
 
 * `device-memory-scaling:`
-  Float type, by default: 1. The ratio for NVIDIA device memory scaling, can be greater than 1 (enable virtual device memory, experimental feature). For NVIDIA GPU with *M* memory, if we set `device-memory-scaling` argument to *S*, vGPUs splitted by this GPU will totaly get *S \* M* memory in Kubernetes with our device plugin.
+  Float type, by default: 1. The ratio for NVIDIA device memory scaling, which can be greater than 1 (enables virtual device memory, an experimental feature). For an NVIDIA GPU with *M* memory, if we set the `device-memory-scaling` argument to *S*, the vGPUs split from this GPU will get a total of `S * M` memory in Kubernetes with our device plugin.
 * `device-split-count:`
   Integer type, by default: equals 10. Maxinum tasks assigned to a simple GPU device.
-Besides, you can customize the follwing values in `values.yaml/scheduler/extender/extraArgs`:
+Besides, you can customize the following keys under `scheduler.extender.extraArgs` in the `values.yaml` file:
 
 * `default-mem:`
   Integer type, by default: 5000.
The default device memory of the current task, in MB @@ -199,16 +138,16 @@ Besides, you can customize the follwing values in `values.yaml/scheduler/extende After configure those optional arguments, you can enable the vGPU support by following command: ``` -$ helm install vgpu vgpu +$ helm install vgpu vgpu -n kube-system ``` You can verify your install by following command: ``` -$ kubectl get pods +$ kubectl get pods -n kube-system ``` -If the following two pods `vgpu-device-plugin` and `vgpu-scheduler` are in running state, then your installation is successful. +If the following two pods `vgpu-device-plugin` and `vgpu-scheduler` are in *Running* state, then your installation is successful. ### Running GPU Jobs @@ -252,6 +191,86 @@ $ helm install vgpu vgpu -n kube-system helm uninstall vgpu -n kube-system ``` +## Scheduling + +Current schedule strategy is to select GPU with lowest task, thus balance the loads across mutiple GPUs + +## Benchmarks + +Three instances from ai-benchmark have been used to evaluate vGPU-device-plugin performance as follows + +| Test Environment | description | +| ---------------- | :------------------------------------------------------: | +| Kubernetes version | v1.12.9 | +| Docker version | 18.09.1 | +| GPU Type | Tesla V100 | +| GPU Num | 2 | + +| Test instance | description | +| ------------- | :---------------------------------------------------------: | +| nvidia-device-plugin | k8s + nvidia k8s-device-plugin | +| vGPU-device-plugin | k8s + VGPU k8s-device-plugin,without virtual device memory | +| vGPU-device-plugin(virtual device memory) | k8s + VGPU k8s-device-plugin,with virtual device memory | + +Test Cases: + +| test id | case | type | params | +| ------- | :-----------: | :-------: | :---------------------: | +| 1.1 | Resnet-V2-50 | inference | batch=50,size=346*346 | +| 1.2 | Resnet-V2-50 | training | batch=20,size=346*346 | +| 2.1 | Resnet-V2-152 | inference | batch=10,size=256*256 | +| 2.2 | Resnet-V2-152 | training | batch=10,size=256*256 | +| 3.1 | VGG-16 | inference | batch=20,size=224*224 | +| 3.2 | VGG-16 | training | batch=2,size=224*224 | +| 4.1 | DeepLab | inference | batch=2,size=512*512 | +| 4.2 | DeepLab | training | batch=1,size=384*384 | +| 5.1 | LSTM | inference | batch=100,size=1024*300 | +| 5.2 | LSTM | training | batch=10,size=1024*300 | + +Test Result: ![img](./imgs/benchmark_inf.png) + +![img](./imgs/benchmark_train.png) + +To reproduce: + +1. install vGPU-nvidia-device-plugin,and configure properly +2. run benchmark job + +``` +$ kubectl apply -f benchmarks/ai-benchmark/ai-benchmark.yml +``` + +3. View the result by using kubctl logs + +``` +$ kubectl logs [pod id] +``` + +## Features + +- Specify the number of vGPUs divided by each physical GPU. +- Limit vGPU's Device Memory. +- Allows vGPU allocation by specifying device memory +- Limit vGPU's Streaming Multiprocessor. +- Allows vGPU allocation by specifying device core usage +- Zero changes to existing programs + +## Experimental Features + +- Virtual Device Memory + + The device memory of the vGPU can exceed the physical device memory of the GPU. At this time, the excess part will be put in the RAM, which will have a certain impact on the performance. + +## Known Issues + +- Currently, A100 MIG not supported +- Currently, only computing tasks are supported, and video codec processing is not supported. 
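+
+Despite these limitations, ordinary CUDA compute workloads can request vGPU resources directly from their pod spec. The sketch below is a minimal, illustrative example of the Features listed above: the `nvidia.com/gpu` resource name matches the resource managed by the scheduler extender in this chart, while `nvidia.com/gpumem` and `nvidia.com/gpucores` are assumed names for the device-memory and core-usage resources and should be checked against the chart's `resourceMem` and `resourceCores` values; the container image is likewise only a placeholder.
+
+```
+apiVersion: v1
+kind: Pod
+metadata:
+  name: vgpu-example
+spec:
+  containers:
+    - name: cuda
+      image: nvidia/cuda:11.0-base   # placeholder image for a pure compute workload
+      command: ["sleep", "infinity"]
+      resources:
+        limits:
+          nvidia.com/gpu: 1          # number of vGPUs requested
+          nvidia.com/gpumem: 3000    # assumed resource name: device memory in MB
+          nvidia.com/gpucores: 30    # assumed resource name: device core usage
+```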
+ +## TODO + +- Support video codec processing +- Support Multi-Instance GPUs (MIG) + ## Tests - TensorFlow 1.14.0/2.4.1 diff --git a/README_cn.md b/README_cn.md index cb6b87af..5d81d485 100644 --- a/README_cn.md +++ b/README_cn.md @@ -9,17 +9,17 @@ - [关于](#关于) - [使用场景](#使用场景) +- [安装要求](#安装要求) +- [快速入门](#快速入门) + - [GPU节点准备](#GPU节点准备) + - [Kubernetes开启vGPU支持](#Kubernetes开启vGPU支持) + - [运行GPU任务](#运行GPU任务 - [调度策略](#调度策略) - [性能测试](#性能测试) - [功能](#功能) - [实验性功能](#实验性功能) - [已知问题](#已知问题) - [开发计划](#开发计划) -- [安装要求](#安装要求) -- [快速入门](#快速入门) - - [GPU节点准备](#GPU节点准备) - - [Kubernetes开启vGPU支持](#Kubernetes开启vGPU支持) - - [运行GPU任务](#运行GPU任务) - [测试](#测试) - [卸载](#卸载) - [问题反馈及代码贡献](#问题反馈及代码贡献) @@ -36,93 +36,6 @@ 4. 需要大量小显卡的情况,如教学场景把一张GPU提供给多个学生使用、云平台提供小GPU实例。 5. 物理显存不足的情况,可以开启虚拟显存,如大batch、大模型的训练。 -## 调度策略 - -调度策略为,在保证显存和算力满足需求的GPU中,优先选择任务数最少的GPU执行任务,这样做可以使任务均匀分配到所有的GPU中 - -## 性能测试 - -## 使用场景 - -1. 显存、计算单元利用率低的情况,如在一张GPU卡上运行10个tf-serving。 -2. 需要大量小显卡的情况,如教学场景把一张GPU提供给多个学生使用、云平台提供小GPU实例。 -3. 物理显存不足的情况,可以开启虚拟显存,如大batch、大模型的训练。 - -## 性能测试 - -在测试报告中,我们一共在下面五种场景都执行了ai-benchmark 测试脚本,并汇总最终结果: - -| 测试环境 | 环境描述 | -| ---------------- | :------------------------------------------------------: | -| Kubernetes version | v1.12.9 | -| Docker version | 18.09.1 | -| GPU Type | Tesla V100 | -| GPU Num | 2 | - -| 测试名称 | 测试用例 | -| -------- | :------------------------------------------------: | -| Nvidia-device-plugin | k8s + nvidia官方k8s-device-plugin | -| vGPU-device-plugin | k8s + VGPU k8s-device-plugin,无虚拟显存 | -| vGPU-device-plugin(virtual device memory) | k8s + VGPU k8s-device-plugin,高负载,开启虚拟显存 | - -测试内容 - -| test id | 名称 | 类型 | 参数 | -| ------- | :-----------: | :-------: | :---------------------: | -| 1.1 | Resnet-V2-50 | inference | batch=50,size=346*346 | -| 1.2 | Resnet-V2-50 | training | batch=20,size=346*346 | -| 2.1 | Resnet-V2-152 | inference | batch=10,size=256*256 | -| 2.2 | Resnet-V2-152 | training | batch=10,size=256*256 | -| 3.1 | VGG-16 | inference | batch=20,size=224*224 | -| 3.2 | VGG-16 | training | batch=2,size=224*224 | -| 4.1 | DeepLab | inference | batch=2,size=512*512 | -| 4.2 | DeepLab | training | batch=1,size=384*384 | -| 5.1 | LSTM | inference | batch=100,size=1024*300 | -| 5.2 | LSTM | training | batch=10,size=1024*300 | - -测试结果: ![img](./imgs/benchmark_inf.png) - -![img](./imgs/benchmark_train.png) - -测试步骤: - -1. 安装nvidia-device-plugin,并配置相应的参数 -2. 运行benchmark任务 - -``` -$ kubectl apply -f benchmarks/ai-benchmark/ai-benchmark.yml -``` - -3. 
通过kubctl logs 查看结果 - -``` -$ kubectl logs [pod id] -``` - -## 功能 - -- 指定每张物理GPU切分的最大vGPU的数量 -- 限制vGPU的显存 -- 允许通过指定显存来申请GPU -- 限制vGPU的计算单元 -- 允许通过指定vGPU使用比例来申请GPU -- 对已有程序零改动 - -## 实验性功能 - -- 虚拟显存 - - vGPU的显存总和可以超过GPU实际的显存,这时候超过的部分会放到内存里,对性能有一定的影响。 - -## 已知问题 - -- 目前仅支持计算任务,不支持视频编解码处理。 -- 暂时不支持MIG - -## 开发计划 - -- 支持视频编解码处理 -- 支持Multi-Instance GPUs (MIG) ## 安装要求 @@ -136,6 +49,7 @@ $ kubectl logs [pod id] ## 快速入门 + ### GPU节点准备 以下步骤要在所有GPU节点执行。这份README文档假定GPU节点已经安装NVIDIA驱动和`nvidia-docker`套件。 @@ -174,24 +88,42 @@ $ sudo systemctl restart docker $ kubectl label nodes {nodeid} gpu=on ``` - -### Kubernetes开启vGPU支持 +### 下载项目并进入deployments文件夹 当你在所有GPU节点完成前面提到的准备动作,如果Kubernetes有已经存在的NVIDIA装置插件,需要先将它移除。然后,你需要下载整个项目,并进入deployments文件夹 ``` $ git clone https://github.com/4paradigm/k8s-vgpu-scheduler.git -$ cd k8s-vgpu/deployments +$ cd k8s-vgpu-scheduler/deployments ``` -在这个deployments文件中, 你可以在 `values.yaml/devicePlugin/extraArgs` 中使用以下的客制化参数: +### 设置调度器镜像版本 + +使用下列执行获取集群服务端版本 + +``` +kubectl version +``` + +随后,根据获得的集群服务端版本,修改 `vgpu/values.yaml` 文件的 `scheduler.kubeScheduler.image` 中调度器镜像版本。例如,如果你的服务端版本为1.16.8,则你需要将镜像版本修改为1.16.8 + +``` +scheduler: + kubeScheduler: + image: "registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.16.8" +``` + + +### Kubernetes开启vGPU支持 + +在这个deployments文件中, 你可以在 `vgpu/values.yaml` 文件的 `devicePlugin.extraArgs` 中使用以下的客制化参数: * `device-split-count:` 整数类型,预设值是10。GPU的分割数,每一张GPU都不能分配超过其配置数目的任务。若其配置为N的话,每个GPU上最多可以同时存在N个任务。 * `device-memory-scaling:` - 浮点数类型,预设值是1。NVIDIA装置显存使用比例,可以大于1(启用虚拟显存,实验功能)。对于有*M​*显存大小的NVIDIA GPU,如果我们配置`device-memory-scaling`参数为*S*,在部署了我们装置插件的Kubenetes集群中,这张GPU分出的vGPU将总共包含 *S \* M*显存。 + 浮点数类型,预设值是1。NVIDIA装置显存使用比例,可以大于1(启用虚拟显存,实验功能)。对于有*M*显存大小的NVIDIA GPU,如果我们配置`device-memory-scaling`参数为*S*,在部署了我们装置插件的Kubenetes集群中,这张GPU分出的vGPU将总共包含 `S * M` 显存。 -除此之外,你可以在 `values.yaml/scheduler/extender/extraArgs` 中使用以下客制化参数: +除此之外,你可以在 `vgpu/values.yaml` 文件的 `devicePlugin.extraArgs` 中使用以下客制化参数: * `default-mem:` 整数类型,预设值为5000,表示不配置显存时使用的默认显存大小,单位为MB @@ -202,13 +134,13 @@ $ cd k8s-vgpu/deployments 配置完成后,随后使用helm安装整个chart ``` -$ helm install vgpu vgpu +$ helm install vgpu vgpu -n kube-system ``` -通过kubectl get pods指令看到vgpu-device-plugin与vgpu-scheduler两个pod即为安装成功 +通过kubectl get pods指令看到 `vgpu-device-plugin` 与 `vgpu-scheduler` 两个pod 状态为*Running* 即为安装成功 ``` -$ kubectl get pods +$ kubectl get pods -n kube-system ``` ### 运行GPU任务 @@ -249,7 +181,95 @@ $ helm install vgpu vgpu -n kube-system $ helm uninstall vgpu -n kube-system ``` -## 测试 +## 调度策略 + +调度策略为,在保证显存和算力满足需求的GPU中,优先选择任务数最少的GPU执行任务,这样做可以使任务均匀分配到所有的GPU中 + +## 性能测试 + +## 使用场景 + +1. 显存、计算单元利用率低的情况,如在一张GPU卡上运行10个tf-serving。 +2. 需要大量小显卡的情况,如教学场景把一张GPU提供给多个学生使用、云平台提供小GPU实例。 +3. 
物理显存不足的情况,可以开启虚拟显存,如大batch、大模型的训练。 + +## 性能测试 + +在测试报告中,我们一共在下面五种场景都执行了ai-benchmark 测试脚本,并汇总最终结果: + +| 测试环境 | 环境描述 | +| ---------------- | :------------------------------------------------------: | +| Kubernetes version | v1.12.9 | +| Docker version | 18.09.1 | +| GPU Type | Tesla V100 | +| GPU Num | 2 | + +| 测试名称 | 测试用例 | +| -------- | :------------------------------------------------: | +| Nvidia-device-plugin | k8s + nvidia官方k8s-device-plugin | +| vGPU-device-plugin | k8s + VGPU k8s-device-plugin,无虚拟显存 | +| vGPU-device-plugin(virtual device memory) | k8s + VGPU k8s-device-plugin,高负载,开启虚拟显存 | + +测试内容 + +| test id | 名称 | 类型 | 参数 | +| ------- | :-----------: | :-------: | :---------------------: | +| 1.1 | Resnet-V2-50 | inference | batch=50,size=346*346 | +| 1.2 | Resnet-V2-50 | training | batch=20,size=346*346 | +| 2.1 | Resnet-V2-152 | inference | batch=10,size=256*256 | +| 2.2 | Resnet-V2-152 | training | batch=10,size=256*256 | +| 3.1 | VGG-16 | inference | batch=20,size=224*224 | +| 3.2 | VGG-16 | training | batch=2,size=224*224 | +| 4.1 | DeepLab | inference | batch=2,size=512*512 | +| 4.2 | DeepLab | training | batch=1,size=384*384 | +| 5.1 | LSTM | inference | batch=100,size=1024*300 | +| 5.2 | LSTM | training | batch=10,size=1024*300 | + +测试结果: ![img](./imgs/benchmark_inf.png) + +![img](./imgs/benchmark_train.png) + +测试步骤: + +1. 安装nvidia-device-plugin,并配置相应的参数 +2. 运行benchmark任务 + +``` +$ kubectl apply -f benchmarks/ai-benchmark/ai-benchmark.yml +``` + +3. 通过kubctl logs 查看结果 + +``` +$ kubectl logs [pod id] +``` + +## 功能 + +- 指定每张物理GPU切分的最大vGPU的数量 +- 限制vGPU的显存 +- 允许通过指定显存来申请GPU +- 限制vGPU的计算单元 +- 允许通过指定vGPU使用比例来申请GPU +- 对已有程序零改动 + +## 实验性功能 + +- 虚拟显存 + + vGPU的显存总和可以超过GPU实际的显存,这时候超过的部分会放到内存里,对性能有一定的影响。 + +## 已知问题 + +- 目前仅支持计算任务,不支持视频编解码处理。 +- 暂时不支持MIG + +## 开发计划 + +- 支持视频编解码处理 +- 支持Multi-Instance GPUs (MIG) + +# 测试 - TensorFlow 1.14.0/2.4.1 - torch1.1.0 diff --git a/cmd/nvidia-container-runtime/nvcr.go b/cmd/nvidia-container-runtime/nvcr.go index 0930e3d8..21e15262 100644 --- a/cmd/nvidia-container-runtime/nvcr.go +++ b/cmd/nvidia-container-runtime/nvcr.go @@ -19,6 +19,7 @@ package main import ( "errors" "fmt" + "io/ioutil" "os" "os/exec" "strings" @@ -163,55 +164,53 @@ func (r nvidiaContainerRuntime) addNVIDIAHook(spec *specs.Spec) error { } } - /* - r.logger.Printf("prestart hook path: %s %s\n", path) - envmap, newuuids, err := GetNvidiaUUID(r, spec.Process.Env) - if err != nil { - r.logger.Println("GetNvidiaUUID failed") - } else { - if len(envmap) > 0 { - restr := "" - for idx, val := range envmap { - restr = appendtofilestr(idx, val, restr) - - tmp1 := idx + "=" + val - found := false - for idx1, val1 := range spec.Process.Env { - if strings.Compare(strings.Split(val1, "=")[0], idx) == 0 { - spec.Process.Env[idx1] = tmp1 - found = true - r.logger.Println("modified env", tmp1) - continue - } - } - if !found { - spec.Process.Env = append(spec.Process.Env, tmp1) - r.logger.Println("appended env", tmp1) + r.logger.Printf("prestart hook path: %s %s\n", path) + envmap, newuuids, err := GetNvidiaUUID(r, spec.Process.Env) + if err != nil { + r.logger.Println("GetNvidiaUUID failed") + } else { + if len(envmap) > 0 { + restr := "" + for idx, val := range envmap { + restr = appendtofilestr(idx, val, restr) + + tmp1 := idx + "=" + val + found := false + for idx1, val1 := range spec.Process.Env { + if strings.Compare(strings.Split(val1, "=")[0], idx) == 0 { + spec.Process.Env[idx1] = tmp1 + found = true + r.logger.Println("modified env", tmp1) + continue } } - restr = 
appendtofilestr("CUDA_DEVICE_MEMORY_SHARED_CACHE", "/tmp/vgpu/cudevshr.cache", restr) - ioutil.WriteFile("envfile.vgpu", []byte(restr), os.ModePerm) - dir, _ := os.Getwd() - sharedmnt := specs.Mount{ - Destination: "/tmp/envfile.vgpu", - Source: dir + "/envfile.vgpu", - Type: "bind", - Options: []string{"rbind", "rw"}, + if !found { + spec.Process.Env = append(spec.Process.Env, tmp1) + r.logger.Println("appended env", tmp1) } - spec.Mounts = append(spec.Mounts, sharedmnt) - - //spec.Mounts = append(spec.Mounts, ) } - if len(newuuids) > 0 { - //r.logger.Println("Get new uuids", newuuids) - //spec.Process.Env = append(spec.Process.Env, newuuids[0]) - err1 := r.addMonitor(newuuids, spec) - if err1 != nil { - r.logger.Println("addMonitorPath failed", err1.Error()) - } + restr = appendtofilestr("CUDA_DEVICE_MEMORY_SHARED_CACHE", "/tmp/vgpu/cudevshr.cache", restr) + ioutil.WriteFile("envfile.vgpu", []byte(restr), os.ModePerm) + dir, _ := os.Getwd() + sharedmnt := specs.Mount{ + Destination: "/tmp/envfile.vgpu", + Source: dir + "/envfile.vgpu", + Type: "bind", + Options: []string{"rbind", "rw"}, } + spec.Mounts = append(spec.Mounts, sharedmnt) + + //spec.Mounts = append(spec.Mounts, ) } - */ + if len(newuuids) > 0 { + //r.logger.Println("Get new uuids", newuuids) + //spec.Process.Env = append(spec.Process.Env, newuuids[0]) + err1 := r.addMonitor(newuuids, spec) + if err1 != nil { + r.logger.Println("addMonitorPath failed", err1.Error()) + } + } + } args := []string{path} if spec.Hooks == nil { spec.Hooks = &specs.Hooks{} diff --git a/deployments/vgpu/templates/scheduler/configmap.yaml b/deployments/vgpu/templates/scheduler/configmap.yaml index cf96ad25..69554151 100644 --- a/deployments/vgpu/templates/scheduler/configmap.yaml +++ b/deployments/vgpu/templates/scheduler/configmap.yaml @@ -6,28 +6,28 @@ metadata: app.kubernetes.io/component: 4pd-scheduler {{- include "4pd-vgpu.labels" . 
| nindent 4 }}
 data:
-  config.yaml: |
-    apiVersion: kubescheduler.config.k8s.io/v1beta1
-    kind: KubeSchedulerConfiguration
-    healthzBindAddress: 0.0.0.0:10251
-    leaderElection:
-      leaderElect: false
-    metricsBindAddress: 0.0.0.0:10251
-    profiles:
-    - schedulerName: {{ .Values.schedulerName }}
-    extenders:
-    - urlPrefix: "https://127.0.0.1:443"
-      filterVerb: filter
-      nodeCacheCapable: true
-      weight: 1
-      httpTimeout: 30s
-      enableHTTPS: true
-      tlsConfig:
-        insecure: true
-      managedResources:
-      - name: {{ .Values.resourceName }}
-        ignoredByScheduler: true
-      - name: {{ .Values.resourceMem }}
-        ignoredByScheduler: true
-      - name: {{ .Values.resourceCores }}
-        ignoredByScheduler: true
+  config.json: |
+    {
+        "kind": "Policy",
+        "apiVersion": "v1",
+        "extenders": [
+            {
+                "urlPrefix": "https://127.0.0.1:443",
+                "filterVerb": "filter",
+                "enableHttps": true,
+                "weight": 1,
+                "nodeCacheCapable": true,
+                "httpTimeout": 30000000000,
+                "tlsConfig": {
+                    "insecure": true
+                },
+                "managedResources": [
+                    {
+                        "name": "nvidia.com/gpu",
+                        "ignoredByScheduler": true
+                    }
+                ],
+                "ignoreable": false
+            }
+        ]
+    }
\ No newline at end of file
diff --git a/deployments/vgpu/templates/scheduler/deployment.yaml b/deployments/vgpu/templates/scheduler/deployment.yaml
index 35f0d8d1..9ebc0e81 100644
--- a/deployments/vgpu/templates/scheduler/deployment.yaml
+++ b/deployments/vgpu/templates/scheduler/deployment.yaml
@@ -36,7 +36,9 @@ spec:
         imagePullPolicy: {{ .Values.scheduler.kubeScheduler.imagePullPolicy | quote }}
         command:
         - kube-scheduler
-        - --config=/config/config.yaml
+        - --policy-config-file=/config/config.json
+        - --leader-elect=false
+        - --scheduler-name={{ .Values.schedulerName }}
         {{- range .Values.scheduler.kubeScheduler.extraArgs }}
         - {{ . }}
         {{- end }}
diff --git a/deployments/vgpu/values.yaml b/deployments/vgpu/values.yaml
index 41b778ea..2b919182 100644
--- a/deployments/vgpu/values.yaml
+++ b/deployments/vgpu/values.yaml
@@ -19,7 +19,7 @@ global:
 
 scheduler:
   kubeScheduler:
-    image: "4pdosc/kube-scheduler:v1.20.9"
+    image: "registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.20.6"
     imagePullPolicy: IfNotPresent
     extraArgs:
     - -v=4
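+    # NOTE: as the README above explains, the kube-scheduler image tag under
+    # scheduler.kubeScheduler.image should match the server version reported by
+    # `kubectl version`; it can also be overridden at install time, for example:
+    #   helm install vgpu vgpu -n kube-system --set scheduler.kubeScheduler.image=<matching image:tag>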