This repository has been archived by the owner on Jan 9, 2020. It is now read-only.

Use a pre-installed Minikube instance for integration tests. #521

Open · wants to merge 11 commits into base: branch-2.2-kubernetes

Conversation

@mccheah commented Oct 6, 2017

This changes the integration test workflow to assume that Minikube is already installed on the testing machine. Previously, the integration tests downloaded Minikube during the build cycle and started/stopped the Minikube VM on every test execution, which meant that multiple integration tests could not run concurrently.

This commit allows multiple tests to share a single Minikube instance, and it requires users who run integration tests to have Minikube pre-installed. If the Minikube instance has enough resources, multiple tests can run against it at the same time. Each test needs its own set of Docker images, so the Docker image builder now tags images uniquely on every test execution.
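To make the per-execution tagging concrete, here is a minimal sketch (not the PR's exact code) of generating a unique Docker tag; the name dockerTag matches the snippets quoted later in this conversation:

import java.util.UUID

// A fresh UUID per test execution; stripping dashes keeps the tag compact.
// (Dashes are legal in Docker tags, so this is purely cosmetic.)
val dockerTag = UUID.randomUUID().toString.replaceAll("-", "")

// Every image built for this execution is then tagged as <name>:<dockerTag>,
// so concurrent runs against a shared Minikube never collide on 'latest'.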

@mccheah (Author) commented Oct 6, 2017

This build won't pass until Minikube is pre-installed on our Jenkins node. Conversely, if we set up Minikube on the Jenkins node, all outstanding PRs will need to rebase on top of this commit once it merges in order to pass their builds. I'm not sure what the best way forward is with regard to sequencing these changes.

This change also allows tests to run in the RISELab Jenkins environment.

@ash211 @ifilonenko for review, @ssuchter @shaneknapp for infrastructure considerations.

val originalDockerFileText = Files.readLines(
  DOCKER_BUILD_PATH.resolve(dockerFile).toFile, Charsets.UTF_8).asScala
val dockerFileTextWithProperBaseImage = originalDockerFileText.map(
  _.replace("FROM spark-base", s"FROM spark-base:$dockerTag"))
Member:
Is there any way we can create pointers to the Docker images themselves, or at least store their RMI, so we don't have to go into the contents of the Dockerfile?

mccheah (Author):

Not entirely sure - we're always building all of the Docker images from scratch, including the spark-base image with the given tag, and the child images need to be built with their base set to the appropriate tag.

One option is to push this to the Maven level instead: the pre-integration-test phase can copy the Dockerfiles from docker-minimal-bundle, but instead of keeping the tags as-is, Maven can set the tags accordingly. I'm not sure what the Maven primitives for doing this would be.

Member:

What kind of tags would you want to set? Something unique about the build?

mccheah (Author):

The docker image could be tagged with the git hash of the head commit of the repository. This should be unique per build.
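As a hedged sketch of this alternative (assuming the test harness may shell out to git; this is not code from the PR):

import scala.sys.process._

// Resolve the git hash of the repository's head commit; unique per build
// provided each CI build checks out a distinct commit.
val gitHashTag = Seq("git", "rev-parse", "HEAD").!!.trim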

Member:

Seems like the Maven mojo you're looking for is the buildnumber-maven-plugin: http://www.mojohaus.org/buildnumber-maven-plugin/

@ssuchter (Member) left a comment:

I have an overall question. Instead of requiring a pre-installed Minikube, could we leave the old functionality in place behind a flag guard? A Maven profile would be a good way to activate it.


private def deleteImage(name: String): Unit = {
  try {
    dockerClient.removeImage(s"$name:$dockerTag")
Member:

If I understand this right, you're modifying the system-global Docker images here. Given that dockerTag is always 'latest', is there a concern that multiple simultaneous invocations of this test could be fighting over the same 'latest' image? Should we change the tag to have a unique value per build?

Member:

I see you just answered this in the other comment thread.

mccheah (Author):

The tag is actually generated here in the Minikube backend.

mccheah (Author):

The tag should be set to a UUID. I also considered the git hash, as mentioned before.
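For illustration, retagging an already-built image under such a UUID could look like the following, assuming the Spotify docker-client (whose tag(image, name) method matches the dockerClient seen in the snippet above); this is a sketch, not the PR's code:

// Retag the freshly built base image under the per-execution UUID tag,
// leaving any 'latest' tag on the shared Docker daemon untouched.
dockerClient.tag("spark-base:latest", s"spark-base:$dockerTag")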

@mccheah (Author) commented Oct 10, 2017

We could, but I'd prefer to avoid having to maintain two code paths if possible. If we're sending this upstream, it would be best to keep the system simple.

@foxish (Member) commented Dec 5, 2017

Is this waiting on any more changes? Do we need the Pepperdata build system to install Minikube as well?

@mccheah (Author) commented Dec 5, 2017

Still ironing out some errors in the RISELab system, but yes, we will need the Pepperdata Jenkins to install Minikube as well. Once we merge this into branch-2.2-kubernetes, ALL open PRs will need to rebase on top of this change to pass CI... and to avoid accidentally deleting Minikube.

@@ -37,10 +37,22 @@ private[spark] object Minikube extends Logging {

def getMinikubeStatus: MinikubeStatus.Value = synchronized {
val statusString = executeMinikube("status")
<<<<<<< HEAD


bad merge?

@echarles (Member) commented:

Great to have the option to run concurrent integration tests against an already-existing Minikube instance with different tags. I haven't read the proposed changes yet, but how do you create the image tags: by rebuilding, or simply by tagging an initial build?

However, I'd like to second @ssuchter's interest in keeping the previous behavior: a CI server should not need Minikube pre-installed. I think it will be harder to ask Apache infra to provide this on their servers than to let Maven download it.

My vote would go to concurrent runs (with tags) on a downloaded Minikube, plus an option to connect to an existing one: a mix of the existing and proposed behaviors.

@foxish mentioned this pull request Dec 13, 2017
@echarles (Member) commented:

I have run the integration tests on this branch and got a failure which, I suppose, is a random one, as described in #571.

- Run PySpark Job on file from SUBMITTER with --py-files
- Run PySpark Job on file from CONTAINER with spark.jar defined
- Run SparkR Job on file locally
- Run SparkR Job on file from SUBMITTER
- Simple submission test with the resource staging server.
- Enable SSL on the resource staging server
- Use container-local resources without the resource staging server
- Dynamic executor scaling basic test
- Use remote resources without the resource staging server.
- Mix remote resources with submitted ones.
- Use key and certificate PEM files for TLS.
- Use client key and client cert file when requesting executors
- Added files should be placed in the driver's working directory. *** FAILED ***
  java.net.ConnectException: Failed to connect to /192.168.99.100:30941
  at okhttp3.internal.connection.RealConnection.connectSocket(RealConnection.java:225)
  at okhttp3.internal.connection.RealConnection.connect(RealConnection.java:149)
  at okhttp3.internal.connection.StreamAllocation.findConnection(StreamAllocation.java:195)
  at okhttp3.internal.connection.StreamAllocation.findHealthyConnection(StreamAllocation.java:121)
  at okhttp3.internal.connection.StreamAllocation.newStream(StreamAllocation.java:100)
  at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:42)
  at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
  at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
  at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93)
  at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
  ...
  Cause: java.net.ConnectException: Connection refused
  at java.net.PlainSocketImpl.socketConnect(Native Method)
  at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
  at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
  at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
  at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
  at java.net.Socket.connect(Socket.java:589)
  at okhttp3.internal.platform.Platform.connectSocket(Platform.java:124)
  at okhttp3.internal.connection.RealConnection.connectSocket(RealConnection.java:223)
  at okhttp3.internal.connection.RealConnection.connect(RealConnection.java:149)
  at okhttp3.internal.connection.StreamAllocation.findConnection(StreamAllocation.java:195)
  ...
- Setting JVM options on the driver and executors with spaces.
- Submit small local files without the resource staging server.
- Use a very long application name.
...
