At the time of writing, two LPARs (a total of 32 CPUs) have been set up to support interactive use on the HPCx system.
Please note that the interactive queue is intended for tasks such as debugging and visualisation. It is not meant for production jobs, which must use the batch queues.
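The main steps in running an interactive parallel job are:
1. Create a LoadLeveler job command file.
2. Set any required environment variables.
3. Run the program interactively using poe.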
Each of these steps is described in more detail in the following sections.
All interactive parallel jobs must use a LoadLeveler job command file. This file contains a number of LoadLeveler keyword statements which specify the various requirements of the interactive job. In practice, the script file differs very little from the equivalent background (batch) script file.
Here is an example of a simple LoadLeveler script file for an interactive MPI job using 5 CPUs (the remaining CPUs stay available to other users, so please be considerate):
#@ job_type = parallel
#@ job_name = hello
#
#@ cpus = 5
#
#@ node_usage = shared
#
#@ wall_clock_limit = 00:10:00
#@ account_no = z001
#
#@ notification = never
#
#@ class = inter32_1
#
#@ queue
#
# Submit this script from the command line as follows:
#
# poe ./my_executable -llfile ./my_script.ll
In contrast to batch jobs, interactive jobs have to specify a job class. The ``#@ class = inter32_1'' statement serves this purpose.
For interactive jobs, ``#@ node_usage = shared'' has to be specified.
Note that output and error files should not be specified, so that output is directed to the screen. Also note that, unlike for batch jobs, neither poe nor the executable is run from within the script file; these are specified at job submission time (see `Running a program interactively', below).
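The requirements above can be checked before a job is submitted. The following is a minimal sketch in POSIX shell, not a tool provided on HPCx: the function name check_interactive_ll is hypothetical, and the patterns assume that each keyword statement is written in lower case and starts at the beginning of a line with `#@'.

```shell
#!/bin/sh
# Hypothetical helper: sanity-check an interactive LoadLeveler command file
# against the rules described in this section. Not an official HPCx tool.
check_interactive_ll() {
    ll_file=$1
    status=0

    if [ ! -r "$ll_file" ]; then
        echo "cannot read $ll_file" >&2
        return 2
    fi

    # Interactive jobs must name a job class.
    grep -Eq '^#@[[:space:]]*class[[:space:]]*=' "$ll_file" ||
        { echo "missing: #@ class = ..." >&2; status=1; }

    # Interactive jobs must request shared node usage.
    grep -Eq '^#@[[:space:]]*node_usage[[:space:]]*=[[:space:]]*shared' "$ll_file" ||
        { echo "missing: #@ node_usage = shared" >&2; status=1; }

    # Output and error files should be left unset so output reaches the screen.
    if grep -Eq '^#@[[:space:]]*(output|error)[[:space:]]*=' "$ll_file"; then
        echo "remove the output/error keywords" >&2
        status=1
    fi

    return $status
}

# Possible use (assumes poe is available in the interactive session):
#   check_interactive_ll ./my_script.ll && poe ./my_executable -llfile ./my_script.ll
```

The function returns 0 when all three checks pass, 1 when a rule is violated, and 2 when the file cannot be read, so it can guard the poe command as shown in the closing comment.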
Any environment variables, whether specific to the Parallel Operating Environment or to the parallel application, should be defined and exported from within the interactive session before running poe. Note that, for interactive jobs, it is not possible to set environment variables from within the LoadLeveler script (unlike for standard batch jobs).
For example (using /bin/ksh format), to specify the use of shared memory for message passing between tasks on the same node, type:
export MP_SHARED_MEMORY=yes
You have to set all environment variables required by your job before moving on to the next step.
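To illustrate why variables must be exported and not merely assigned, here is a small sketch in which a child shell stands in for poe. MY_LOCAL_SETTING is a hypothetical name used only for contrast; it is assumed not to be set in your environment already.

```shell
#!/bin/sh
# Exported: inherited by processes started from this session.
export MP_SHARED_MEMORY=yes

# Assigned but not exported: stays local to this shell.
# (MY_LOCAL_SETTING is a hypothetical name for illustration.)
MY_LOCAL_SETTING=1

# A child shell stands in for poe and reports what it can see.
child_view=$(sh -c 'echo "${MP_SHARED_MEMORY:-unset} ${MY_LOCAL_SETTING:-unset}"')
echo "child process sees: $child_view"
```

Only the exported variable should be visible to the child process, which is why each setting your job relies on needs an export in the interactive session.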
Run the program interactively using poe:
poe ./my_executable -llfile ./my_script.ll
where:
``my_script.ll'' is the name of the file containing the LoadLeveler keyword statements.
The following LoadLeveler attributes are set for interactive processes (default values in brackets):
The maximum wall clock (elapsed time) limit (1 hour)
The maximum stack size per task instance (20mb)
The maximum amount of physical memory per task instance (884mb = 26.96gb/num_tasks)
The maximum data segment size per task instance (884mb-20mb=864mb)
The maximum core file size per task instance (960mb).
If an interactive job does not run and reports the following message:
LoadL_negotiator: 2544-870 Step fNN.N.0 was not considered to be run in this scheduling cycle due to its relatively low priority or because there are not enough free resources
then there are fewer free CPUs in the interactive region than you have asked for.
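A request that exceeds the whole interactive region can be caught before submission. The sketch below is illustrative only: the helper names requested_cpus and check_cpu_request are hypothetical, and the figure of 32 CPUs is the total quoted at the start of this section, which may change. It cannot tell how many CPUs are free at a given moment; it only flags requests that could never fit.

```shell
#!/bin/sh
# Assumption: 32 CPUs in total, the interactive capacity quoted at the start
# of this section; the number actually free at any moment may be lower.
INTERACTIVE_CPUS=32

# Hypothetical helper: print the value of the first "#@ cpus = N" statement.
requested_cpus() {
    sed -n 's/^#@[[:space:]]*cpus[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1" | head -n 1
}

# Hypothetical helper: return 1 if the request exceeds the whole region.
check_cpu_request() {
    n=$(requested_cpus "$1")
    if [ -z "$n" ]; then
        echo "no cpus statement found in $1" >&2
        return 2
    fi
    if [ "$n" -gt "$INTERACTIVE_CPUS" ]; then
        echo "requested $n CPUs, but the interactive region has $INTERACTIVE_CPUS" >&2
        return 1
    fi
    echo "requested $n of $INTERACTIVE_CPUS CPUs"
}
```

If the request is within the total and the message above still appears, the usual remedies are to lower the cpus value or to try again once other interactive jobs have finished.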