Develop container and Kubernetes artefacts to perform DL training and host DL inference on an IBM Kubernetes cluster.
- Create the VPC on IBM Cloud
- Create the Kubernetes (k8s) cluster
- On the local VM, install all the necessary packages to run the IBM Cloud CLI, kubectl and minikube (to test locally). I used the Vagrant VM from the Kubernetes lab and installed all the necessary packages mentioned in the slides.
- Developed a deep learning model for MNIST digit classification.
- The files containing the code for training and inference of the model are train.py and inference.py.
- The files containing the code for the frontend are front.html and backend.html.
- Two Dockerfiles are used to build two containers: one for training and one for inference. The train container trains the model and saves the model weights; the inference container then loads these weights and performs the prediction/classification.
- The container images are pushed to Docker Hub (repository: sjdocker3409/k8_dl).
- To run the containers locally (without Kubernetes), the container's working directory was mapped to a local directory in the docker run for training. The same was done in the docker run for inference, along with mapping the container's port to a localhost port.
- Then access the server at
http://localhost:39000/
- Commands used:
sudo docker run -it -v /home:/mnist mnist_train:latest
sudo docker run -it -v /home:/mnist -p 39000:9000 mnist_inference:latest
- Once the containers were working as required, the program was tested locally on minikube.
- Created 4 YAML files: deployment.yaml, train.yaml, service.yaml, kustomization.yaml.
- The train container was run using train.yaml, which creates a Pod (kind: Pod) for training. The YAML file mounts a local directory at the path where the model weights are saved.
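A minimal sketch of what train.yaml could look like; the Pod name, volume name and image tag are my assumptions (adjust to the tag actually pushed), while the mount paths mirror the docker run command above:

apiVersion: v1
kind: Pod
metadata:
  name: mnist-train                     # assumed Pod name
spec:
  restartPolicy: Never                  # training runs once to completion
  containers:
  - name: mnist-train
    image: sjdocker3409/k8_dl:train     # assumed tag on the Docker Hub repo
    volumeMounts:
    - name: model-store
      mountPath: /mnist                 # directory where train.py saves the weights
  volumes:
  - name: model-store
    hostPath:
      path: /home                       # local directory, as in the -v /home:/mnist mapping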
- The inference container was run using deployment.yaml, which creates a Deployment for inference (kind: Deployment) and mounts the same local directory as in the step above.
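A sketch of the corresponding deployment.yaml, under the same naming assumptions; containerPort 9000 matches the -p 39000:9000 mapping used earlier:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mnist-inference                     # assumed name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mnist-inference
  template:
    metadata:
      labels:
        app: mnist-inference
    spec:
      containers:
      - name: mnist-inference
        image: sjdocker3409/k8_dl:inference # assumed tag
        ports:
        - containerPort: 9000               # port the inference server listens on
        volumeMounts:
        - name: model-store
          mountPath: /mnist                 # reads the weights written by the train Pod
      volumes:
      - name: model-store
        hostPath:
          path: /home                       # same local directory as in train.yaml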
- The service was created using service.yaml (kind: Service, type: LoadBalancer). This creates a Service that maps the Deployment's port to an external port, so the URL can be accessed from outside the cluster.
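A sketch of service.yaml under the same assumptions; the selector must match the pod labels in the Deployment:

apiVersion: v1
kind: Service
metadata:
  name: mnist-service           # assumed name; used with minikube service below
spec:
  type: LoadBalancer
  selector:
    app: mnist-inference        # routes traffic to the inference pods
  ports:
  - port: 9000                  # service port
    targetPort: 9000            # containerPort of the inference container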
- Also created a kustomization.yaml, which lists the manifests so they can be applied together.
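A minimal kustomization.yaml just lists the resources, so that kubectl apply -k . applies all of them at once:

resources:
- train.yaml
- deployment.yaml
- service.yaml

Note that applying everything in one shot does not wait for training to finish; the step-by-step apply sequence used later on IBM Cloud avoids that ordering issue.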
- To get the URL of the service, use the command:
minikube service <servicename> --url
- The server can then be accessed at the URL:
http://<external-ip>:<node-port>/
- Log in to IBM Cloud and select your resource group.
- Run the command:
ibmcloud ks cluster config -c <cluster-id>
- Now kubectl can be used against the IBM Cloud cluster.
- Create a new YAML file, pvc.yaml, which defines a Persistent Volume Claim (PVC) on IBM Cloud.
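A sketch of pvc.yaml; the claim name is an assumption, and the storage class and size are placeholders. IBM Cloud provides file storage classes such as ibmc-file-silver, which support ReadWriteMany and so can be shared between the train and inference pods:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mnist-pvc                        # assumed claim name, referenced from the other manifests
spec:
  accessModes:
  - ReadWriteMany                        # both the train pod and the inference pods mount it
  resources:
    requests:
      storage: 20Gi                      # placeholder size
  storageClassName: ibmc-file-silver     # one of IBM Cloud's file storage classes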
- Create the train.yaml, deployment.yaml, service.yaml and kustomization.yaml as before. In train.yaml and deployment.yaml, mount the PVC to be used as the shared volume between the containers, as sketched below.
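Concretely, the hostPath volume from the minikube manifests is replaced in both files by a reference to the claim, roughly:

  volumes:
  - name: model-store
    persistentVolumeClaim:
      claimName: mnist-pvc    # the PVC defined in pvc.yaml

The volumeMounts sections stay the same, so train.py and inference.py keep writing and reading the weights under /mnist.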
- Run the following commands in order:
cd <directory which contains all yaml files>
kubectl apply -f pvc.yaml
Wait till the PVC is up and ready.
kubectl apply -f train.yaml
Wait till the pod has completed training.
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml
kubectl get service
- Get the external IP address from the output. The port will be the “port” specified in service.yaml.