HPC_User_Best_Practices
fgeorgatos edited this page Aug 31, 2012 · 1 revision
TOP10 best practices for scalable HPC systems (For Uni.Lu and beyond)
- Be a good HPC citizen: respect the current Acceptable Use Policy (AUP) and report any issues you identify via the ticketing system, as needed
- Reuse the existing, tested mechanisms for submitting jobs to the system queues; read the FAQ thoroughly
- Read about and apply standard HPC techniques and practices (at least check the content index!), e.g. the NCSA CI-Tutor courses: http://www.citutor.org/login.php
- Reuse existing optimized libraries and applications where possible (MPI, compilers, libraries, modules)
- Ensure that disk sizing, backup and redundancy levels fit your application; declare a "project" if your needs are special
- Make your scripts generic (respect the Project Directory Structure); use variables as aliases for locations and do not hardcode full path names
- Take advantage of modules to manage multiple versions of software
- Take advantage of EasyBuild to organize software builds from many sources, whether your own software or third-party
- Identify the policy class your tasks belong to and make the most efficient use of your allocation; avoid underutilization, as it harms other users
- Consider sysadmin time planning: all incoming issues have to be prioritized according to their impact on the user community
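The advice on generic scripts and avoiding hardcoded paths can be illustrated with a minimal Python sketch. It assumes a single environment variable, here called `PROJECT_ROOT`, from which all other locations are derived. That variable name and the `input`/`output`/`scripts` subdirectories are hypothetical; the actual Project Directory Structure is defined by the site, not by this example.

```python
import os
from pathlib import Path


def project_paths(env=None):
    """Derive every directory from one root variable instead of hardcoding full paths."""
    env = os.environ if env is None else env
    # PROJECT_ROOT is a hypothetical variable name; fall back to the current directory.
    root = Path(env.get("PROJECT_ROOT", "."))
    # The subdirectory names below are illustrative assumptions only.
    return {
        "root": root,
        "input": root / "input",
        "output": root / "output",
        "scripts": root / "scripts",
    }


if __name__ == "__main__":
    for name, path in project_paths().items():
        print(f"{name}: {path}")
```

Moving the work to another project or cluster then only requires changing the value of the one variable, not editing every script.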
Hints & Tips:
- Keep the sources and scripts you develop under version control (ref: GitHub/GForge); e.g. do you have a history of all of last month's revisions?
- Keep a standard example ready, e.g. "Hello World", in case you need to do differential debugging of a suspected system problem.
- Opt for a scripting language for code integration, but a faster, optimized language for the "application kernel" (both maintainable and fast!)
- Do some form of checkpointing if your individual jobs run for more than 1 day; the advantages are plenty; see the FAQ at http://hpc.uni.lu
- Avoid looking for hacks to overcome existing policies; rather, document your need and the rationale behind it, and propose it as a "project"
- Take advantage of GPU technology if applicable in your case; be careful with the GPU vs. cores speedup ratios (i.e. is it worth the trouble to employ the GPUs?)
- If you have a massive workflow of jobs to manage, do not reinvent the wheel: contact the sysadmins to ask for advice on your approach and to collect ideas
- Report any plans to use HPC systems in a special way as early as possible; it helps both sides prepare and avoids frustration
- If you have deadlines to adhere to, kindly let us know; the sysadmin service is always best-effort, but we do try to keep our users happy
- If you find techniques that you consider elegant and relevant to other users' work, you are automatically invited to report them to the common mailing list hpc-users!
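As an illustration of the checkpointing tip, here is a minimal application-level sketch in Python. It is only one possible approach, not the mechanism described in the site FAQ: the function names, the JSON state layout and the `every` interval are all assumptions made for this example, and the loop body is a placeholder for real work.

```python
import json
import os
import tempfile


def load_checkpoint(path):
    """Return the saved state, or a fresh one if no checkpoint exists yet."""
    if os.path.exists(path):
        with open(path) as fh:
            return json.load(fh)
    return {"next_step": 0, "total": 0}


def save_checkpoint(path, state):
    """Write to a temporary file in the same directory, then rename it over the
    old checkpoint, so an interrupted write does not corrupt the previous one."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as fh:
        json.dump(state, fh)
    os.replace(tmp, path)


def run(path, n_steps, every=100):
    """Resume from the last checkpoint (if any) and continue up to n_steps."""
    state = load_checkpoint(path)
    for step in range(state["next_step"], n_steps):
        state["total"] += step  # placeholder for the real computation
        state["next_step"] = step + 1
        if state["next_step"] % every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state["total"]
```

If the job is killed or hits its walltime limit, resubmitting it with the same checkpoint path continues from the last saved step rather than from zero, so at most `every` steps of work are lost.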