Use a pre-installed Minikube instance for integration tests. #521
base: branch-2.2-kubernetes
Conversation
This changes the integration test workflow to assume that Minikube is already installed on the testing machine. Previously, the integration tests downloaded Minikube in the build cycle and started/stopped the Minikube VM on every test execution, which meant that multiple integration tests could not run concurrently. This commit allows multiple tests to share a single Minikube instance, and it also requires users who run integration tests to have Minikube pre-installed. If the Minikube instance has enough resources, multiple tests can run against it at the same time. Each test needs to use its own set of Docker images, so the Docker image builder now tags images uniquely on every test execution.
This build won't pass until Minikube is pre-installed on our Jenkins node. Conversely, once we set up Minikube on the Jenkins node and this commit merges in, all outstanding PRs will need to rebase on top of it for their builds to pass. I'm not sure what the best way forward is with regard to sequencing these changes. This change also allows tests to run on the Riselab Jenkins environment. @ash211 @ifilonenko for review; @ssuchter @shaneknapp for infrastructure considerations.
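For illustration, here is a minimal, hedged sketch of the tagging idea: every image built for a test run carries a single per-execution tag, so concurrent runs against a shared Minikube do not overwrite each other's images. The image names and the shell-out to the docker CLI below are assumptions for the example, not the exact mechanism used in this PR.

```scala
import scala.sys.process._

// Sketch: apply one per-execution tag to every test image so that concurrent
// runs against a shared Minikube instance do not collide on image names.
// The image names below are illustrative, not the exact set built by the PR.
def tagImagesForRun(dockerTag: String): Unit = {
  val images = Seq("spark-base", "spark-driver", "spark-executor")
  images.foreach { image =>
    // Re-tag the freshly built image, e.g. spark-base -> spark-base:<dockerTag>.
    val exitCode = s"docker tag $image $image:$dockerTag".!
    require(exitCode == 0, s"Failed to tag $image with $dockerTag")
  }
}
```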
val originalDockerFileText = Files.readLines(
  DOCKER_BUILD_PATH.resolve(dockerFile).toFile, Charsets.UTF_8).asScala
val dockerFileTextWithProperBaseImage = originalDockerFileText.map(
  _.replace("FROM spark-base", s"FROM spark-base:$dockerTag"))
Is there any way we can create pointers to the Docker images themselves, or at least store their RMI, so we don't have to go into the contents of the Dockerfile?
Not entirely sure - we're always building all of the Docker images from scratch, including the spark-base image with the given tag, and the child images need to be built with the base set to the appropriate tag.

One option is to push this to the Maven level instead. The pre-integration-test phase could copy the Dockerfiles from docker-minimal-bundle, but instead of keeping the tags as written, Maven could set the tags accordingly. I'm not sure what the Maven primitives would be to do this.
What kind of tags would you want to set? Something unique about the build?
The docker image could be tagged with the git hash of the head commit of the repository. This should be unique per build.
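As a rough sketch of that option (assuming the tests run from inside the repository checkout and git is available on the PATH):

```scala
import scala.sys.process._

// Sketch: use the abbreviated hash of the current HEAD commit as the Docker tag,
// so each build of the repository produces its own image set.
val gitHashTag: String = "git rev-parse --short HEAD".!!.trim
// Images would then be built and tagged as e.g. s"spark-base:$gitHashTag".
```

One caveat with this choice: the hash is unique per commit rather than per execution, so two simultaneous runs against the same commit would still share a tag.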
Seems like the Maven mojo you're looking for is the buildnumber-maven-plugin: http://www.mojohaus.org/buildnumber-maven-plugin/
I have an overall question. Instead of requiring a pre-installed Minikube, could we keep the old functionality behind a flag? A Maven profile would be a good way to activate it.
private def deleteImage(name: String): Unit = {
  try {
    dockerClient.removeImage(s"$name:$dockerTag")
If I understand this right, you're modifying the system-global Docker images here. Given that dockerTag is always 'latest', is there a concern that multiple simultaneous invocations of this test could be fighting over the same 'latest' image? Should we change the tag to a value that is unique per build?
I see you just answered this in the other comment thread.
The tag is actually generated here in the Minikube backend.
The tag should be set to a UUID. I also considered the git hash, as mentioned before.
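For reference, a minimal sketch of what a UUID-based tag generated in the Minikube backend could look like; the naming is illustrative, not the exact code in this branch:

```scala
import java.util.UUID

// Sketch: generate one random tag per test execution. The name dockerTag mirrors
// the field referenced in the diffs above; dashes are stripped only to keep the
// tag compact.
val dockerTag: String = UUID.randomUUID().toString.replaceAll("-", "")
```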
We could, but I'd prefer to avoid having to maintain two code paths if possible. If we're sending this upstream, it would be best to keep the system simple.
Is this waiting on any more changes? Do we need the Pepperdata build system to install Minikube as well?
Still ironing out some errors in the Riselab system, but yes, we will need the Pepperdata Jenkins to install Minikube as well. Once we merge this into branch-2.2-kubernetes, ALL open PRs will need to rebase on top of this change to pass CI... and to not accidentally delete Minikube.
@@ -37,10 +37,22 @@ private[spark] object Minikube extends Logging {

  def getMinikubeStatus: MinikubeStatus.Value = synchronized {
    val statusString = executeMinikube("status")
<<<<<<< HEAD
bad merge?
Great to have the option to run concurrent integration tests, with different tags, against an already existing Minikube instance. I haven't read the proposed changes yet, but how do you create the tagged images: by rebuilding them, or by simply re-tagging an initial build? However, I'd like to second @ssuchter's interest in keeping the previous behavior: a CI server should not have to have Minikube pre-installed. I think it will be harder to ask Apache infra to put this on their servers than to let Maven download it. My vote would go to concurrent runs (with tags) on a downloaded Minikube, but with an option to connect to an existing one, i.e. a mix of the existing and proposed behavior.
I have run the integration tests on this branch and got a failure which, I suppose, is a random one as described in #571.