Deploy sonar prototype to Fox #57
Comments
If cron is not an option, we may be able to trigger the sonar run remotely over ssh, every few minutes, for every node. Either way, we'll find out.
I think for the Fox meeting it is better to propose deploying this on 4 compute nodes (2 with GPU and 2 without), i.e. not on all nodes at the same time. Even on the NRIS side (consultation with Radovan et al.) we ask for "some" nodes for the first round. This gives the admins more assurance on a production system.
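A rough sketch of what the ssh fallback could look like, in case it helps the discussion. Everything specific here is an assumption: the node names and the remote wrapper path `/path/to/run-sonar.sh` are placeholders, and the actual sonar command line is deliberately left out. The only real pieces are the standard OpenSSH options `BatchMode` and `ConnectTimeout`.

```python
import subprocess

# Hypothetical placeholders, not real Fox node names or paths.
NODES = ["node-a", "node-b"]
REMOTE_CMD = "/path/to/run-sonar.sh"


def build_ssh_command(node, remote_cmd, connect_timeout=5):
    """Build a non-interactive ssh invocation for one node."""
    return [
        "ssh",
        "-o", "BatchMode=yes",                      # fail instead of prompting for a password
        "-o", f"ConnectTimeout={connect_timeout}",  # do not hang on unreachable nodes
        node,
        remote_cmd,
    ]


def trigger_nodes(nodes, remote_cmd, runner=subprocess.run, timeout=30):
    """Run remote_cmd on each node in turn.

    Returns a dict mapping node name to the ssh exit code, or None if the
    ssh process timed out or could not be started.
    """
    results = {}
    for node in nodes:
        try:
            proc = runner(
                build_ssh_command(node, remote_cmd),
                capture_output=True,
                timeout=timeout,
            )
            results[node] = proc.returncode
        except (subprocess.TimeoutExpired, OSError):
            results[node] = None
    return results
```

Calling `trigger_nodes(NODES, REMOTE_CMD)` every few minutes from a single management host would then stand in for the per-node cron job. The nodes are visited sequentially here for simplicity; with many nodes one would probably want to run them in parallel.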
Some results from a meeting
The analysis is live on Fox (on a small number of nodes).
The analysis is now live on all Fox nodes.
Lockfile cleanup can be accomplished by an
The analysis is live on Fox cpu, gpu, interactive, and login nodes. Data are exfiltrated to the remote analysis host. All blocking bugs are really sonar bugs.
Lockfile removal: Edit: After discussion with the Fox admins: we'll move the homedir on the int and login nodes. The intention was always that it should be /var/run/sonar, and any newly created nodes will get that too. Edit^2: Except that we then probably need a service to create /var/run/sonar on boot. One could do something via tmpfiles.d(5), but it's becoming fairly elaborate, especially if we want to move toward making sonar a systemd service anyway.
I'm kicking the lockfile issue down the road; see issue #352.
Since sonar and sonalyze now seem to be OK for multi-node systems, we should start collecting data on Fox.
There are some issues around whether Sonar is fast enough, which I'm addressing:
At the moment a Sonar invocation on a Fox node takes about 100 ms (this is not a high-quality measurement); as NordicHPC/sonar#86 shows, we should be able to cut this time roughly in half. Deploying to Fox probably does not depend on that fix, but it would be nice to get it done.
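To get a somewhat better number than a single ad hoc run, something like the following could be used on a node. This is only a sketch: it measures wall-clock time of the whole process, including process startup, and the command to time is whatever the caller passes in (the exact sonar command line is not assumed here). The helper names `time_command` and `summarize` are made up for illustration.

```python
import statistics
import subprocess
import time


def time_command(cmd, runs=20):
    """Run cmd `runs` times and return the wall-clock time of each run in ms."""
    samples_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        # Output is discarded so that terminal I/O does not skew the timing.
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
        samples_ms.append((time.perf_counter() - start) * 1000.0)
    return samples_ms


def summarize(samples_ms):
    """Reduce a list of timings to min, median, and max."""
    return {
        "min": min(samples_ms),
        "median": statistics.median(samples_ms),
        "max": max(samples_ms),
    }
```

Running `summarize(time_command([...]))` with the same command on a GPU node and a non-GPU node, before and after a change, would give comparable numbers. The median is less sensitive than the mean to an occasional slow run on a busy node.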
We would like to also implement the other three features for performance, reliability, and quality reasons, but it would be good to first measure their impact on Fox nodes with and without GPUs.
In addition, there's the deployment checklist: