Licensed Software Usage on HPCC
Recently, we received a few tickets from users whose jobs crashed because the license limit was reached or the license server was not responding. Here is some information about how this can happen and how to resolve it.
How does a license shortage lead to a job crash? To run licensed software, a job needs to obtain and hold a license for that software. If the job cannot obtain a license when the software starts, it is terminated with an error message reporting that the license was unavailable. This can happen for several reasons: the licenses have expired, all licenses are in use, or the license server is malfunctioning.
How can we prevent this type of job crash? Users can use the “licensecheck” powertool to help identify the source of the problem. To use this tool, first load the “powertools” module by running “module load powertools” on any dev node. Running the “licensecheck” command without parameters lists the names of all the licensed software on HPCC. Run “licensecheck
If the license has expired or the license server is not functioning correctly, the error message from either the “licensecheck” command or the licensed software will likely indicate the error. Report the problem by contacting us with the details and we will fix it accordingly.
If the problem is caused by a shortage of licenses, users’ efforts may be needed to resolve it. The HPCC may contact the current users of the software and ask them to share the licenses by limiting each user’s usage. Here are a few things that a user can do to limit license usage: Try not to run a job that could take all the licenses of the software. For example, limit the number of instances of the program that run in parallel. Note that a single job may take many licenses if it runs in parallel. Try not to run many license-consuming jobs at the same time, since together they may take all the licenses. If you would like to submit many jobs at once, please add a job dependency so that the jobs run in sequence. In this way, the total number of licenses in use is limited.
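As a rough sketch of the job dependency described above, the snippet below builds a “qsub” command that uses the Torque/PBS “-W depend=afterok:” option, assuming the cluster's scheduler accepts that syntax. The helper name “build_dependent_qsub” and the script name “job2.qsub” are hypothetical and used only for illustration. The helper only prints the command, so nothing is submitted.

```shell
#!/bin/bash
# Sketch only: build_dependent_qsub is a hypothetical helper, not an HPCC tool.
# It prints (does not run) the qsub command for a job script. When an earlier
# job ID is given, the new job waits until that job completes successfully
# (the Torque/PBS "afterok" dependency type).
build_dependent_qsub() {
    local script="$1"
    local after_job="$2"
    if [ -n "$after_job" ]; then
        echo "qsub -W depend=afterok:${after_job} ${script}"
    else
        echo "qsub ${script}"
    fi
}

# Example: job2.qsub (hypothetical name) should start only after job 123456.
build_dependent_qsub job2.qsub 123456
```

The same dependency can usually be placed inside the job script itself as a “#PBS -W depend=afterok:123456” line.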
With a dependency on job 123456, for example, the new job is held in the queue until job 123456 completes successfully.
Use the job scheduler to reserve the license and launch the job only when a license is available. To do so, you could add the line “#PBS -W x=gres: