This repository has been archived by the owner on Jan 9, 2020. It is now read-only.

Docker logging handler #576

Open
wants to merge 13 commits into base: branch-2.2-kubernetes

Conversation

echarles
Member

What changes were proposed in this pull request?

To give a better view of the progress of the integration tests while the docker images are built, this custom handler prints build progress.

How was this patch tested?

Run the integration tests and check that you see output like this:

...
* DONE (SparkR)
Building docker image spark-base from Dockerfile dockerfiles/spark-base/Dockerfile
Docker build: Step 1 : FROM openjdk:8-alpine
...

@foxish
Member

foxish commented Dec 13, 2017

I think the println is adding a scalastyle warning. Change LGTM otherwise - but please note that we are moving the docker image building step to a separate step which may not necessarily be in scala code in the near future.

@echarles
Member Author

Thx for the review @foxish. Any link to the Docker image building change?

@foxish
Member

foxish commented Dec 13, 2017

That change isn't in yet - but we did have a discussion about it in today's meeting. It'll likely be started by the work in #521

@echarles
Member Author

Reading the meeting notes, I realize I had to leave just before you talked about the PR I submitted. Too bad... I will also follow up on the point about the docker images.

For this PR, if the other way to build docker images is implemented quickly, there is no need to merge; otherwise I think this can help give a better view (or at least make a manual tester less worried) about the long silent pause while running the integration tests.

@kimoonkim
Member

@echarles wrote:

I think this can help to have a better view (or at least to be less worried as a manual tester) about the long silent pause while running the integration tests.

Thanks for looking into this. I also observed the same issue, which tests developer patience :-)

We recently fixed this in the integration-test repo in a different way. The stock LoggingBuildHandler used by the SparkDockerImageBuilder uses slf4j. And our integration-test uses log4j. So the solution was simply to include a slf4j-to-log4j bridge jar as maven dependency:

    <dependency>
      <groupId>org.slf4j</groupId>
      <artifactId>slf4j-log4j12</artifactId>
      <version>${slf4j-log4j12.version}</version>
      <scope>test</scope>
    </dependency>

Can you try this approach in this PR? If it works, then we would not need the custom logging handler or the println.

@echarles
Member Author

Hello @kimoonkim, thx for the review and suggestion. I have pushed an update which now uses slf4j-log4j without the custom handler. In my previous tests, I remember trying that and ending up with verbose output like:

2017-12-24 11:21:45 INFO  LoggingBuildHandler:36 - build: ProgressMessage{id=2fdfe1cd78c2, status=Extracting, stream=null, error=null, progress=[==================================================>] 2.065 MB/2.065 MB, progressDetail=ProgressDetail{current=2064911, start=0, total=2064911}}
2017-12-24 11:21:45 INFO  LoggingBuildHandler:36 - build: ProgressMessage{id=2fdfe1cd78c2, status=Pull complete, stream=null, error=null, progress=null, progressDetail=ProgressDetail{current=0, start=0, total=0}}
2017-12-24 11:21:45 INFO  LoggingBuildHandler:36 - build: ProgressMessage{id=82630fd6e5ba, status=Extracting, stream=null, error=null, progress=[==================================================>]    239 B/239 B, progressDetail=ProgressDetail{current=239, start=0, total=239}}
2017-12-24 11:21:45 INFO  LoggingBuildHandler:36 - build: ProgressMessage{id=001511eb3437, status=Downloading, stream=null, error=null, progress=[>                                                  ] 1.065 MB/70.23 MB, progressDetail=ProgressDetail{current=1065296, start=0, total=70227876}}
2017-12-24 11:21:45 INFO  LoggingBuildHandler:36 - build: ProgressMessage{id=82630fd6e5ba, status=Extracting, stream=null, error=null, progress=[==================================================>]    239 B/239 B, progressDetail=ProgressDetail{current=239, start=0, total=239}}
2017-12-24 11:21:45 INFO  LoggingBuildHandler:36 - build: ProgressMessage{id=001511eb3437, status=Downloading, stream=null, error=null, progress=[=>                                                 ] 1.606 MB/70.23 MB, progressDetail=ProgressDetail{current=1605968, start=0, total=70227876}}
2017-12-24 11:21:45 INFO  LoggingBuildHandler:36 - build: ProgressMessage{id=82630fd6e5ba, status=Pull complete, stream=null, error=null, progress=null, progressDetail=ProgressDetail{current=0, start=0, total=0}}
2017-12-24 11:21:46 INFO  LoggingBuildHandler:36 - build: ProgressMessage{id=001511eb3437, status=Downloading, stream=null, error=null, progress=[=>                                                 ] 2.147 MB/70.23 MB, progressDetail=ProgressDetail{current=2146640, start=0, total=70227876}}
2017-12-24 11:21:47 INFO  LoggingBuildHandler:36 - build: ProgressMessage{id=001511eb3437, status=Downloading, stream=null, error=null, progress=[=>                                                 ] 2.687 MB/70.23 MB, progressDetail=ProgressDetail{current=2687312, start=0, total=70227876}}
2017-12-24 11:21:48 INFO  LoggingBuildHandler:36 - build: ProgressMessage{id=001511eb3437, status=Downloading, stream=null, error=null, progress=[==>                                                ] 3.227 MB/70.23 MB, progressDetail=ProgressDetail{current=3226992, start=0, total=70227876}}
..

Seeing nothing is not fine, seeing too much is not better...

A combined solution would be to use log4j (not println) with a custom handler that prints only what we want.

# Set everything to be logged to the file target/integration-tests.log
log4j.rootCategory=INFO, file
# Set everything to be logged to the file target/integration-tests.log and the console
log4j.rootCategory=INFO, file, stdout
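For illustration, a handler along those lines could filter on the message fields. This is only a sketch: `ProgressMessage` below is a simplified stand-in for spotify docker-client's class (keeping just the fields the filter needs), and a real version would implement spotify's `ProgressHandler` interface and log via log4j rather than println.

```scala
// Sketch only: ProgressMessage is a simplified stand-in for
// com.spotify.docker.client.messages.ProgressMessage.
case class ProgressMessage(status: String, stream: String, error: String)

class QuietBuildHandler {
  def progress(message: ProgressMessage): Unit = {
    if (message.error != null) {
      // Surface build failures loudly.
      throw new RuntimeException(message.toString)
    } else if (message.stream != null) {
      // Build step lines ("Step 1 : FROM openjdk:8-alpine") arrive on the
      // stream field; print those and drop the per-layer
      // Downloading/Extracting progress spam shown above.
      println(s"Docker build: ${message.stream.trim}")
    }
  }
}
```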
Member


I am not sure we want console here. On the upside, I understand most people actually do not know they can check target/integration-tests.log. On the downside, too many log messages could get people lost.

Anyone has a strong opinion on this?

Member


Ping! @echarles or anyone else.

Member


At this point, I am good with this PR pending this question being resolved. Maybe we just want to remove the console output part and move on so the PR can be merged?

@kimoonkim
Member

Seeing nothing is not fine, seeing too much is not better...

I would say this is actually better because the long silence is much worse. These messages at least help us debug which image has failed to build.

A combined solution would be to use log4j (not println) with a custom handler that prints only what we want.

That sounds good to me, but @foxish was saying we want to retire this scala docker image builder code.

we are moving the docker image building step to a separate step which may not necessarily be in scala code in the near future.

So maybe we should first agree on what we want to do next, and this PR is good enough as is?

@kimoonkim
Member

Hello @kimoonkim, thx for the review and suggestion. I have pushed an update which now uses slf4j-log4j without custom handler.

@echarles Thanks for the change!

@echarles
Member Author

echarles commented Jan 2, 2018

@kimoonkim I just made a new push reintroducing the custom LoggingBuildHandler to have a medium level of messages on the console.

@throws[DockerException]
def progress(message: ProgressMessage) {
if (message.error != null) throw new DockerException(message.toString)
else if (message.status == null) {
Member


What does the null status mean?

Member Author


There is not much in the spotify source. It just says:

  // In 1.8, the message instead looks like
  // {"status":"<some-tag>: digest: <digest> size: <size>"}

I came to this trying to find a correct level of output.

def progress(message: ProgressMessage) {
if (message.error != null) throw new DockerException(message.toString)
else if (message.status == null) {
println("build: {}", message)
Member


Can we avoid using println and use logging API instead?

Member Author


Just pushed a new commit using log4j instead of println.

@echarles
Member Author

The last push takes into account all of @kimoonkim's comments (no longer logging to the console but to a file, plus a small scaladoc added to make this information available to newcomers).

Member

@kimoonkim kimoonkim left a comment


LGTM. Thanks for writing this change!

Does anyone else want to take a look at this PR?


trait Logging {

private val log: Logger = LogManager.getLogger(this.getClass)


I think Spark already has something similar in core?

Member Author


Yep, I have seen that also. Does it make sense to follow up with this PR and the apache repo?



If possible let's reuse what Spark has, the same question will arise when we try to merge back to Spark as well.
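For reference, the mix-in pattern under discussion looks roughly like this minimal, self-contained sketch. Spark core's actual trait is `org.apache.spark.internal.Logging` and delegates to an slf4j `Logger` rather than println; the names below are illustrative only.

```scala
// Minimal sketch of the Logging mix-in pattern; Spark core's trait works
// similarly but delegates to an slf4j Logger instead of println.
trait Logging {
  // Name the logger after the concrete class, dropping Scala's `$` suffix
  // on companion objects.
  protected lazy val logName: String = this.getClass.getName.stripSuffix("$")
  // By-name parameter so the message is only built if actually logged.
  protected def logInfo(msg: => String): Unit = println(s"INFO $logName: $msg")
}

class SparkDockerImageBuilder extends Logging {
  def build(): Unit = logInfo("Building docker image spark-base")
}
```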

4 participants