The scratch disk provides temporary storage that is much faster than the home file system. This matters, for example, when you launch 16 processes on a single node and each of them needs to read input files (or worse, when you launch an MPI program across 10 nodes with 16 processes per node). Your input files will then be read 16 times (or 160 times in the MPI case). If you have only one or two tiny input files, reading them directly from the home file system may be acceptable. If you have many files or very large files, however, this puts too high a load on the home file system and slows down your job considerably.

The solution is to copy your input data to the local scratch disk of each node before starting your application. Then, each of the 16 processes on that node can read the input data from the local scratch disk. For the single-node example, you reduce the number of file reads from the home file system from 16 to 1 (i.e. only the copy operation).
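
The arithmetic behind these numbers can be sketched in a few lines of shell. The function name home_reads and the strategy labels direct and staged are made up for this illustration; "staged" here means one plain copy per node to local scratch.

```shell
# Count how often one input file is read from the home file system.
#   direct: every process reads it from home    -> nodes * procs_per_node
#   staged: one copy per node to local scratch  -> nodes
home_reads() {
    strategy="$1"; nodes="$2"; procs_per_node="$3"
    case "$strategy" in
        direct) echo $((nodes * procs_per_node)) ;;
        staged) echo "$nodes" ;;
        *) echo "unknown strategy: $strategy" >&2; return 1 ;;
    esac
}

home_reads direct 1 16    # single-node example, no staging: 16
home_reads staged 1 16    # single-node example, staged to scratch: 1
home_reads direct 10 16   # 10-node MPI example, no staging: 160
home_reads staged 10 16   # 10-node MPI example, one copy per node: 10
```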

Copying data to scratch for a single-node job

The $TMPDIR environment variable points to a temporary folder on the local scratch disk and can be used to write files to scratch. For a single-node job, you can copy your input with the cp command. For example, to copy a single big input file from your home directory to the local scratch disk:

cp $HOME/big_input_file "$TMPDIR"

Or, to copy a whole directory with input files:

cp -r $HOME/input_dir "$TMPDIR"
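
Put together, the body of a single-node job might look like the sketch below: the input is copied to scratch once, and all processes then read the local copy. The helper names stage_to_scratch and run_n_times are hypothetical, and my_program is a placeholder for your own application.

```shell
# Sketch of a single-node job body. Function names are hypothetical.

# Copy one input (file or directory) to scratch once and print the local path.
# The destination defaults to $TMPDIR.
stage_to_scratch() {
    stage_src="$1"
    stage_dest="${2:-$TMPDIR}"
    cp -r "$stage_src" "$stage_dest"/ || return 1
    printf '%s/%s\n' "$stage_dest" "$(basename "$stage_src")"
}

# Start N background copies of a command and wait for all of them to finish.
run_n_times() {
    run_n="$1"; shift
    run_i=1
    while [ "$run_i" -le "$run_n" ]; do
        "$@" &
        run_i=$((run_i + 1))
    done
    wait
}

# Intended use inside a job script (my_program is a placeholder):
#   input=$(stage_to_scratch "$HOME/big_input_file")
#   run_n_times 16 my_program "$input"
```

With this layout the home file system sees a single read (the cp), no matter how many processes are started afterwards.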

Copying data to scratch for a multi-node job

For the MPI example involving 10 nodes, copying the data to each of the local scratch disks with cp would still result in the input files being read 10 times from the home file system. To avoid that, we have developed the mpicopy tool. mpicopy reads the file from the home file system only on the first node and, from there, broadcasts it to all nodes that are assigned to you. To use mpicopy, you need to load the mpicopy and OpenMPI modules first. You can specify a target directory using the -o argument, but by default mpicopy copies to the $TMPDIR directory. For example:

module load 2020
module load mpicopy
module load OpenMPI/4.0.3-GCC-9.3.0 
mpicopy $HOME/big_input_file "$TMPDIR"/

Note that mpicopy copies directories recursively by default, so you don't need to specify a -r option:

mpicopy $HOME/input_dir

An MPI program can then be started with the corresponding input file:

mpiexec my_program "$TMPDIR"/big_input_file
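
If the same job script is used for both single-node and multi-node runs, the choice between cp and mpicopy can be made at run time. The sketch below assumes the batch system is Slurm, which sets SLURM_JOB_NUM_NODES inside a job; that assumption, and the function name stage_input, are not part of this page. It also assumes the mpicopy and OpenMPI modules have already been loaded as shown above.

```shell
# Sketch: pick the copy method from the number of nodes in the job.
# Assumption: Slurm sets SLURM_JOB_NUM_NODES; if it is unset, the job
# is treated as single-node.
stage_input() {
    src="$1"
    nodes="${SLURM_JOB_NUM_NODES:-1}"
    if [ "$nodes" -gt 1 ]; then
        # Multi-node: read from home once, broadcast to $TMPDIR on every node.
        mpicopy "$src"
    else
        # Single node: a plain recursive copy to local scratch is enough.
        cp -r "$src" "$TMPDIR"/
    fi
}

# Intended use inside a job script (my_program is a placeholder):
#   stage_input "$HOME/big_input_file"
#   mpiexec my_program "$TMPDIR"/big_input_file
```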