Running on batch systems

This section walks you through launching HPX applications on various batch systems.

How to use HPX applications with PBS

Most HPX applications are executed on parallel computers. These platforms typically provide integrated job management services that facilitate the allocation of computing resources for each parallel program. HPX includes support for one of the most common job management systems, the Portable Batch System (PBS).

All PBS jobs require a script to specify the resource requirements and other parameters associated with a parallel job. The PBS script is basically a shell script with PBS directives placed within commented sections at the beginning of the file. The remaining (not commented-out) portions of the file executes just like any other regular shell script. While the description of all available PBS options is outside the scope of this tutorial (the interested reader may refer to in-depth documentation for more information), below is a minimal example to illustrate the approach. The following test application will use the multithreaded hello_world_distributed program, explained in the section Remote execution with actions: Hello world.

#!/bin/bash
#
#PBS -l nodes=2:ppn=4

APP_PATH=~/packages/hpx/bin/hello_world_distributed
APP_OPTIONS=

pbsdsh -u $APP_PATH $APP_OPTIONS --hpx:nodes=`cat $PBS_NODEFILE`

Caution

If the first application specific argument (inside $APP_OPTIONS) is a non-option (i.e., does not start with a - or a --), then the argument has to be placed before the option --hpx:nodes, which, in this case, should be the last option on the command line.

Alternatively, use the option --hpx:endnodes to explicitly mark the end of the list of node names:

pbsdsh -u $APP_PATH --hpx:nodes`cat $PBS_NODEFILE` --hpx:endnodes $APP_OPTIONS

The #PBS -l nodes=2:ppn=4 directive will cause two compute nodes to be allocated for the application, as specified in the option nodes. Each of the nodes will dedicate four cores to the program, as per the option ppn, short for “processors per node” (PBS does not distinguish between processors and cores). Note that requesting more cores per node than physically available is pointless and may prevent PBS from accepting the script.

On newer PBS versions the PBS command syntax might be different. For instance, the PBS script above would look like:

#!/bin/bash
#
#PBS -l select=2:ncpus=4

APP_PATH=~/packages/hpx/bin/hello_world_distributed
APP_OPTIONS=

pbsdsh -u $APP_PATH $APP_OPTIONS --hpx:nodes=`cat $PBS_NODEFILE`

APP_PATH and APP_OPTIONS are shell variables that respectively specify the correct path to the executable (hello_world_distributed in this case) and the command line options. Since the hello_world_distributed application doesn’t need any command line options, APP_OPTIONS has been left empty. Unlike in other execution environments, there is no need to use the --hpx:threads option to indicate the required number of OS threads per node; the HPX library will derive this parameter automatically from PBS.

Finally, pbsdsh is a PBS command that starts tasks to the resources allocated to the current job. It is recommended to leave this line as shown and modify only the PBS options and shell variables as needed for a specific application.

Important

A script invoked by pbsdsh starts in a very basic environment: the user’s $HOME directory is defined and is the current directory, the LANG variable is set to C and the PATH is set to the basic /usr/local/bin:/usr/bin:/bin as defined in a system-wide file pbs_environment. Nothing that would normally be set up by a system shell profile or user shell profile is defined, unlike the environment for the main job script.

Another choice is for the pbsdsh command in your main job script to invoke your program via a shell, like sh or bash, so that it gives an initialized environment for each instance. Users can create a small script runme.sh, which is used to invoke the program:

#!/bin/bash
# Small script which invokes the program based on what was passed on its
# command line.
#
# This script is executed by the bash shell which will initialize all
# environment variables as usual.
$@

Now, the script is invoked using the pbsdsh tool:

#!/bin/bash
#
#PBS -l nodes=2:ppn=4

APP_PATH=~/packages/hpx/bin/hello_world_distributed
APP_OPTIONS=

pbsdsh -u runme.sh $APP_PATH $APP_OPTIONS --hpx:nodes=`cat $PBS_NODEFILE`

All that remains now is submitting the job to the queuing system. Assuming that the contents of the PBS script were saved in the file pbs_hello_world.sh in the current directory, this is accomplished by typing:

qsub ./pbs_hello_world_pbs.sh

If the job is accepted, qsub will print out the assigned job ID, which may look like:

$ 42.supercomputer.some.university.edu

To check the status of your job, issue the following command:

qstat 42.supercomputer.some.university.edu

and look for a single-letter job status symbol. The common cases include:

  • Q - signifies that the job is queued and awaiting its turn to be executed.

  • R - indicates that the job is currently running.

  • C - means that the job has completed.

The example qstat output below shows a job waiting for execution resources to become available:

Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
42.supercomputer          ...ello_world.sh joe_user               0 Q batch

After the job completes, PBS will place two files, pbs_hello_world.sh.o42 and pbs_hello_world.sh.e42, in the directory where the job was submitted. The first contains the standard output and the second contains the standard error from all the nodes on which the application executed. In our example, the error output file should be empty and the standard output file should contain something similar to:

hello world from OS-thread 3 on locality 0
hello world from OS-thread 2 on locality 0
hello world from OS-thread 1 on locality 1
hello world from OS-thread 0 on locality 0
hello world from OS-thread 3 on locality 1
hello world from OS-thread 2 on locality 1
hello world from OS-thread 1 on locality 0
hello world from OS-thread 0 on locality 1

Congratulations! You have just run your first distributed HPX application!

How to use HPX applications with SLURM

Just like PBS (described in section How to use HPX applications with PBS), SLURM is a job management system which is widely used on large supercomputing systems. Any HPX application can easily be run using SLURM. This section describes how this can be done.

The easiest way to run an HPX application using SLURM is to utilize the command line tool srun, which interacts with the SLURM batch scheduling system:

srun -p <partition> -N <number-of-nodes> hpx-application <application-arguments>

Here, <partition> is one of the node partitions existing on the target machine (consult the machine’s documentation to get a list of existing partitions) and <number-of-nodes> is the number of compute nodes that should be used. By default, the HPX application is started with one locality per node and uses all available cores on a node. You can change the number of localities started per node (for example, to account for NUMA effects) by specifying the -n option of srun. The number of cores per locality can be set by -c. The <application-arguments> are any application specific arguments that need to be passed on to the application.

Note

There is no need to use any of the HPX command line options related to the number of localities, number of threads, or related to networking ports. All of this information is automatically extracted from the SLURM environment by the HPX startup code.

Important

The srun documentation explicitly states: “If -c is specified without -n, as many tasks will be allocated per node as possible while satisfying the -c restriction. For instance on a cluster with 8 CPUs per node, a job request for 4 nodes and 3 CPUs per task may be allocated 3 or 6 CPUs per node (1 or 2 tasks per node) depending upon resource consumption by other jobs.” For this reason, it’s recommended to always specify -n <number-of-instances>, even if <number-of-instances> is equal to one (1).

Interactive shells

To get an interactive development shell on one of the nodes, users can issue the following command:

srun -p <node-type> -N <number-of-nodes> --pty /bin/bash -l

After the shell has been opened, users can run their HPX application. By default, it uses all available cores. Note that if you requested one node, you don’t need to do srun again. However, if you requested more than one node, and want to run your distributed application, you can use srun again to start up the distributed HPX application. It will use the resources that have been requested for the interactive shell.

Scheduling batch jobs

The above mentioned method of running HPX applications is fine for development purposes. The disadvantage that comes with srun is that it only returns once the application is finished. This might not be appropriate for longer-running applications (for example, benchmarks or larger scale simulations). In order to cope with that limitation, users can use the sbatch command.

The sbatch command expects a script that it can run once the requested resources are available. In order to request resources, users need to add #SBATCH comments in their script or provide the necessary parameters to sbatch directly. The parameters are the same as with run. The commands you need to execute are the same you would need to start your application as if you were in an interactive shell.