How you run your program depends heavily on your problem: you may need to do preprocessing, run the actual simulation, do postprocessing, and so on. It also depends on whether, and how, your program is parallelized.

In this section, we limit ourselves to the simplest scenario: running a single instance of a serial (i.e. non-parallel) program, taking a single input file as an argument. For that, you add a line like

my_program "$TMPDIR"/big_input_file

to your job script.

This is the simplest example, but it is not the way you should generally use an HPC system! When you run only a single instance of a serial program, you use one core of a single node and leave the other cores idle. This wastes computational power (and your budget). In practice, you will want to use parallelization to use all cores in a node, or even multiple nodes.

Note: the need for parallelization depends on how 'heavy' your program is. If you have some simple pre- and post-processing steps, it is of course fine to run these as serial programs.
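As one hedged illustration of keeping more than one core busy with a serial program, the sketch below starts several independent instances in the background and waits for all of them. It is not the only way to parallelize, and it may not be the method recommended for your system. The function name run_instances and the input file names are hypothetical; my_program is the placeholder used above. Do not start more instances than the number of cores you have requested.

```shell
#!/bin/bash
# Sketch: start one serial instance per input file in the background,
# then wait for all of them. All names here are placeholders.

run_instances() {
    # $1 is the program to run; the remaining arguments are input files.
    local program="$1"
    shift

    local pids="" input pid status=0
    for input in "$@"; do
        "$program" "$input" &      # each instance runs as its own process
        pids="$pids $!"            # remember the process ID of that instance
    done

    for pid in $pids; do
        # wait returns the exit status of the instance with this process ID.
        wait "$pid" || status=1
    done

    # 0 if every instance succeeded, 1 if at least one failed.
    return "$status"
}

# Example call (placeholder program and input files):
#   run_instances my_program "$TMPDIR"/input_1 "$TMPDIR"/input_2
```

Waiting on each process ID separately, rather than calling wait once without arguments, lets the job script notice when one of the instances failed.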


Temporary files

Some programs generate temporary files. If yours does, and you can set the location where it stores them (e.g. through an argument or a configuration file), make it use the (fast) scratch space, accessible through the "$TMPDIR" environment variable. Other programs store temporary files in the current directory (i.e. the directory your shell was in when you launched the program). In that case, change directory to "$TMPDIR" before launching your program, e.g.

cd "$TMPDIR"
"$HOME"/my_program
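A slightly more defensive variant of the two lines above is sketched here. It assumes you want each run to get its own empty working directory inside "$TMPDIR", and that you want the job to stop if "$TMPDIR" is not set. The function name run_in_scratch and the directory name work.XXXXXX are hypothetical choices for illustration.

```shell
#!/bin/bash
# Sketch: run a command from a fresh working directory on scratch.
# run_in_scratch and the work.XXXXXX template are illustrative names.

run_in_scratch() {
    # Refuse to run if TMPDIR is unset, empty, or not a directory;
    # otherwise temporary files could end up somewhere unintended.
    if [ -z "${TMPDIR:-}" ] || [ ! -d "$TMPDIR" ]; then
        echo "TMPDIR is not set or is not a directory" >&2
        return 1
    fi

    local workdir
    # mktemp -d creates a uniquely named directory from the template.
    workdir=$(mktemp -d "$TMPDIR/work.XXXXXX") || return 1

    # The subshell keeps the directory change from affecting the caller.
    ( cd "$workdir" && "$@" )
}

# Example call (placeholder program):
#   run_in_scratch "$HOME"/my_program
```

Because the program starts in the new directory, any temporary files it writes to its current directory land on scratch rather than in your home directory.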

Do not use the /tmp directory to store temporary files: doing so may cause the node to crash! On the nodes, /tmp has a limited size and should only be used by system processes.

