Other schedulers

This chapter explains how to use Flux on a cluster with a batch scheduler other than PBS.

Run Flux from the command line

Most of the requirements for using Flux through a batch scheduler are the same as for running Flux from the command line. Refer to the documentation page "How to run Flux in command lines?", which explains how to set all the mandatory environment variables for:

Memory management (JVM_MEMORY, MEMSIZC3, MEMSIZN3)
Core management (FLUX_NCORES)

and the command line arguments to run Flux (to run a Python file or to run a specific application for example).

We recommend running Flux on a cluster with a Python script in non-interactive mode. The command line should then look like this:

# Run Flux
$PATH_TO_FLUX_EXECUTABLE -application $FLUX_APP -runPyInSilentModeAndExit \
    $YOUR_PYTHON_SCRIPT -batch
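
Because a missing variable is easy to overlook in a batch job, a job script can check the mandatory memory and core variables before launching Flux. The following is a minimal sketch; the helper name check_flux_env is hypothetical and is not part of Flux.

```shell
# Pre-flight check (illustrative sketch, not a Flux tool).
# Reports every mandatory variable that is unset or empty and
# returns a non-zero status if at least one is missing.
check_flux_env() {
    missing=0
    for name in JVM_MEMORY MEMSIZC3 MEMSIZN3 FLUX_NCORES; do
        # Look up the value of the variable whose name is stored in $name
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Typical use at the top of a job script:
#   check_flux_env || exit 1
```

The function only checks that the variables are defined; it makes no assumption about which values are valid for a given cluster.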

Batch scheduler specific settings

Running Flux through a batch scheduler requires some additional settings.

First, because the scheduler provides a node list for the job, the node file has to be defined through the FLUX_NODEFILE environment variable. For example, the PBS node file is accessible through the PBS_NODEFILE environment variable, so one needs to set:

export FLUX_NODEFILE=$PBS_NODEFILE
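
For other schedulers, the same idea can be sketched as a small helper that picks whichever node list the scheduler exposes. This is an illustration under assumptions: the helper name select_flux_nodefile is hypothetical, OAR_NODEFILE and SLURM_JOB_NODELIST are the variables set by OAR and Slurm respectively, and whether Flux accepts the resulting file format for each scheduler is not confirmed by this chapter.

```shell
# Choose a node file for Flux (illustrative sketch, not a Flux tool).
# Prints the path of the node file, or returns 1 if none is available.
select_flux_nodefile() {
    if [ -n "${PBS_NODEFILE:-}" ]; then
        printf '%s\n' "$PBS_NODEFILE"
    elif [ -n "${OAR_NODEFILE:-}" ]; then
        printf '%s\n' "$OAR_NODEFILE"
    elif [ -n "${SLURM_JOB_NODELIST:-}" ]; then
        # Slurm exposes a compact node list rather than a file, so expand
        # it into a file with one hostname per line.
        nodefile="${TMPDIR:-/tmp}/flux_nodes.$$"
        scontrol show hostnames "$SLURM_JOB_NODELIST" > "$nodefile" || return 1
        printf '%s\n' "$nodefile"
    else
        return 1
    fi
}

# Typical use in a job script:
#   FLUX_NODEFILE=$(select_flux_nodefile) || exit 1
#   export FLUX_NODEFILE
```

Note that the Slurm branch writes one line per node, whereas a PBS node file may list a host once per allocated core, so the file may need adjusting.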

Parametric distribution

Some Flux projects contain parameters that can be solved independently. Flux can distribute these parameters across several jobs to speed up this kind of computation. To use this feature, the following environment variables must be defined:

FLUX_PARAMETRIC: Enables parametric distribution. Must be set to true.
FLUX_PARAM_AUTO: Selects automatic or manual computation of the number of jobs. Must be set to Automatic or Manual.
- In Automatic mode, Flux submits a maximum of 127 jobs (+1 for the master) to solve the project. The number of parameters solved per job is computed automatically; if there are fewer than 127 parameters, each job solves 1 parameter.
- In Manual mode, the environment variable FLUX_PARAM_MAXJOBS must be set to the maximum number of cores you want to allocate for the job. One of these cores is used for the master, and the others are used for the jobs submitted by Flux to solve the project. The number of parameters solved per job is computed automatically; if there are fewer parameters than remaining cores, each job solves 1 parameter and the surplus cores are not used and stay free for other jobs. In addition, the number of cores used by each secondary Flux can be set with the FLUX_PARAM_NCORES variable.

# Use parametric distribution
export FLUX_PARAMETRIC=true
export FLUX_PARAM_AUTO=Manual
export FLUX_PARAM_MAXJOBS=32
export FLUX_PARAM_NCORES=1
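
To get a feel for how the parameters spread over the jobs, the arithmetic can be sketched as follows. The helper name flux_param_split is hypothetical, and the ceiling division used for the number of parameters per job is an assumption; the exact rule applied by Flux is not stated here.

```shell
# Estimate the job layout for a parametric run (illustrative sketch).
# Usage:  flux_param_split <n_params> <max_cores>
# Prints: "<worker_jobs> <max_params_per_job>"
flux_param_split() {
    n_params=$1
    max_cores=$2
    if [ "$n_params" -lt 1 ] || [ "$max_cores" -lt 2 ]; then
        return 1
    fi
    workers=$((max_cores - 1))      # one core is reserved for the master
    if [ "$n_params" -lt "$workers" ]; then
        workers=$n_params           # surplus cores stay free
    fi
    # Assumed rule: parameters are spread as evenly as possible
    per_job=$(( (n_params + workers - 1) / workers ))
    echo "$workers $per_job"
}
```

With the Manual settings above and a project of 100 parameters, `flux_param_split 100 32` prints `31 4`: 31 secondary jobs solving at most 4 parameters each. Automatic mode corresponds to a limit of 128 (127 jobs plus the master).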

To submit jobs for the secondary Flux, the job submission command, including its job name option, must be defined in the SUBMITNAME environment variable. For example, with PBS, we use qsub -N:

export SUBMITNAME="qsub -N"

Some specific environment variables need to be passed through the job submission command line. Most batch schedulers use the -v option for this, but if needed, the option can be changed with the SUBMITVAR environment variable:

export SUBMITVAR=-v

Finally, because batch schedulers handle the order of the arguments differently, the SUBMIT_OPT_PRE environment variable is used and takes the value 1 or 2:

If SUBMIT_OPT_PRE = 1 (as for PBS), the environment variables are defined before the script, and the command line looks like this:

qsub -N <job_name> -v <env_var_list> my_script.sh

If SUBMIT_OPT_PRE = 2 (for example for OAR, whose submission command is oarsub), the environment variables are defined after the script, and the command line looks like this:

oarsub -N <job_name> my_script.sh -v <env_var_list>
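
The way SUBMITNAME, SUBMITVAR and SUBMIT_OPT_PRE combine can be sketched with a small helper that prints the resulting command line. This is only an illustration of the two layouts; the helper name build_submit_cmd is hypothetical, and how Flux assembles the command internally is not described here.

```shell
# Print the submission command for a given argument order (illustrative sketch).
# Usage: build_submit_cmd <job_name> <env_var_list> <script>
build_submit_cmd() {
    job_name=$1
    env_list=$2
    script=$3
    case "${SUBMIT_OPT_PRE:-}" in
        # 1: environment variables come before the script
        1) echo "$SUBMITNAME $job_name $SUBMITVAR $env_list $script" ;;
        # 2: environment variables come after the script
        2) echo "$SUBMITNAME $job_name $script $SUBMITVAR $env_list" ;;
        *) echo "SUBMIT_OPT_PRE must be 1 or 2" >&2
           return 1 ;;
    esac
}
```

Printing the command for a dummy job name and script before submitting a real project is a quick way to confirm that the three variables match the syntax expected by the local scheduler.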