Multi-node Jobs
Recommendations for running multi-node jobs.
- Check that a standard single-node job runs correctly to validate the installation, paths, and submission line.
- A host file must be defined and then passed to mpirun/mpiexec with --hostfile <hostfilename>:
  mpirun -np <numprocs> --hostfile <hostfilename> <nfx_exe> -i <casefile>
  Tip: Learn how to define a host file on the OpenMPI FAQ page.
  Note: Host files depend on the system topology. If PBS is used for scheduling jobs, it is aware of the topology, and PBS_NODEFILE can be used by passing --hostfile $PBS_NODEFILE.
- Use a PBS or an equivalent job scheduler for multi-node runs.
- If launching directly from the command line without PBS or an equivalent job scheduler, ssh access between nodes without a password prompt is needed.
  Tip: Learn how to get ssh access without a password on the OpenMPI FAQ page.
  Important: Only use this method if you have an advanced understanding. Consult with your system admin for more information or recommendations.
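The host file and launch line described in the list above can be sketched as follows. This is a minimal illustration, not a prescribed setup: the node names node01 and node02, the slot counts, and the file name hostfile.txt are hypothetical, and nfx_exe and casefile are placeholders for your own executable and case file. The script only prints the launch line (a dry run) so it can be checked before anything is started.

```shell
#!/bin/sh
# Hypothetical node names and slot counts; replace with your cluster's values.
# OpenMPI host file format: one node per line, "slots=" gives the ranks per node.
HOSTFILE=hostfile.txt
cat > "$HOSTFILE" <<'EOF'
node01 slots=4
node02 slots=4
EOF

# Total number of ranks = sum of the slots= values in the host file.
NUMPROCS=$(awk -F'slots=' '{ sum += $2 } END { print sum }' "$HOSTFILE")

# Placeholders for the nanoFluidX executable and the case file (assumptions).
NFX_EXE=${NFX_EXE:-nfx_exe}
CASEFILE=${CASEFILE:-casefile}

# Dry run: build and print the launch line instead of executing it.
LAUNCH_CMD="mpirun -np $NUMPROCS --hostfile $HOSTFILE $NFX_EXE -i $CASEFILE"
echo "$LAUNCH_CMD"
```

Under PBS, the hand-written host file would typically not be needed; the same launch line can be given --hostfile $PBS_NODEFILE instead.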
2022.1 or Newer
It is advised to run general diagnostics on the system to make sure the InfiniBand setup (packages, connections, etc.) is working.
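As one possible starting point for such diagnostics, the sketch below checks whether two commonly used InfiniBand utilities are installed on a node. The choice of ibstat and ibv_devinfo is an assumption and not a nanoFluidX requirement; your system admin may recommend other tools. The helper name check_tool is hypothetical.

```shell
#!/bin/sh
# Report whether a diagnostic tool is available on the PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
    fi
}

# Assumed tool list: ibstat (port state and rate) and ibv_devinfo (verbs devices).
for tool in ibstat ibv_devinfo; do
    check_tool "$tool"
done
```

If the tools are found, running ibstat and ibv_devinfo on each node shows the adapter and port state, which should be checked before a multi-node run is attempted.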
2021 or Older
It is advised to run general diagnostics on the system to make sure ibverbs (packages, connections, etc.) is working.
For recent versions of nanoFluidX, the ibverbs version of OpenMPI must be sourced. Starting from 2019.1, you can source set_nFX_environment.sh ibverbs.