dmftar is a utility designed by SURF specifically for the Data Archive. It is available on the HPC systems and can be used to largely automate the archive and restore processes. dmftar is a wrapper for the Linux tool GNU tar. It automatically creates multi-volume archive files of any size (the default volume size is 100 GB) and, if necessary, can also transfer the files to the archive file system.

The dmftar source code is available at https://gitlab.com/surfsara/dmftar.

Caveats to using dmftar

  • dmftar is currently only available on Snellius and the Data Archive service nodes (instructions for accessing nodes can be found on our wiki pages for Snellius and Data Archive).
  • For the remote archiving functionality (from system to system), key-based authentication needs to be set up.
  • When using dmftar, please do not start too many processes at the same time, since that can put stress on the storage facility. Instead, if your data can be archived slowly, stage the work as jobs on Snellius or Lisa.
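
One way to follow the last caveat is to archive directories strictly one after another. The sketch below is only an illustration: the function names archive_sequentially and run, the DRY_RUN variable and the directory names are hypothetical and are not part of dmftar. With DRY_RUN set, the commands are printed instead of executed.

```shell
#!/usr/bin/env bash
# Sketch: archive several directories one at a time instead of in parallel.
# run, archive_sequentially and DRY_RUN are hypothetical helper names.

# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

archive_sequentially() {
    local dir name
    for dir in "$@"; do
        name="${dir%/}"    # strip a trailing slash, if any
        # Each call finishes before the next one starts, so only one
        # dmftar process is active at any moment.
        run dmftar -c -f "${name}.dmftar" "${name}/" || return 1
    done
}

# Example with hypothetical directory names:
# DRY_RUN=1 archive_sequentially run01 run02 run03
```

The loop stops at the first command that fails, so a problem with one directory does not go unnoticed while the remaining ones are processed.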

Help Function

You can access the help function using:

dmftar --help

Default Packing

Standard command to pack a folder into an archive file (or into multi-volume archive files if it is larger than 100 GB):

dmftar -c -f <data>.dmftar <data>/

The resulting data.dmftar file can now be moved using standard commands like “cp” or “mv”.

Verify All Files Packed Successfully

After packing, you can verify whether all the files in the archive file are also present in the original directory structure and vice versa:

dmftar --verify-content -f <data>.dmftar
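
Packing and verification can be chained so that verification only runs after a successful pack. The following is a minimal sketch, assuming that dmftar returns a non-zero exit status when packing or verification fails (this is not confirmed here). The names pack_and_verify, run and DRY_RUN are hypothetical helpers.

```shell
#!/usr/bin/env bash
# Sketch: pack a directory, then verify the archive against the source.
# Assumption: dmftar exits with a non-zero status on failure.
# run, pack_and_verify and DRY_RUN are hypothetical helper names.

# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

pack_and_verify() {
    local name="${1%/}"    # directory name without trailing slash
    # && ensures the verification is skipped if packing failed.
    run dmftar -c -f "${name}.dmftar" "${name}/" &&
        run dmftar --verify-content -f "${name}.dmftar"
}

# Example with a hypothetical directory name:
# pack_and_verify mydata && echo "mydata packed and verified"
```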

Preview the Contents of a Packed File

To see a list of the files that have been packed into an archive file: 

dmftar -t -f <data>.dmftar

Unpack All Files

To unpack all the files in an archive file:

dmftar -x -f <data>.dmftar

Unpack Specific Files and Folders

To unpack a single file (e.g. data1-1.dat) provide an extraction pattern containing the file name and path:

dmftar -x -f <data>.dmftar/ <data>/<data1>/data1-1.dat

To unpack specific subdirectories (e.g. data1 and data2):

dmftar -x -f <data>.dmftar/ <data>/data{1..2}
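
The {1..2} notation in the command above is brace expansion, a feature of shells such as bash. The shell performs it before dmftar is started, so dmftar receives one argument per subdirectory. The short sketch below shows the expansion on its own; "mydata" is a hypothetical directory name and no dmftar call is made.

```shell
#!/usr/bin/env bash
# Brace expansion is done by bash itself, before the command runs.
# "mydata" is a hypothetical directory name.
expanded=(mydata/data{1..2})

# Print one argument per line, as the command would receive them.
printf '%s\n' "${expanded[@]}"
# prints:
# mydata/data1
# mydata/data2
```

The directories do not have to exist locally for this to work, since brace expansion generates the names without looking at the file system.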

Wildcards can also be used, provided the option “-o=--wildcards” is given to indicate their presence:

dmftar -x -o=--wildcards -f <data>.dmftar/ <data>/data*/data1-1.dat

This command unpacks the file data1-1.dat from every subfolder matching data* within the archive file.
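
A point to keep in mind with wildcards: if an unquoted pattern happens to match files in the current directory, the shell expands it before the command sees it. Putting the pattern in single quotes passes it through unchanged. The sketch below demonstrates only this shell behaviour in a temporary directory; the names mydata, dataA and dataB are hypothetical, and whether quoting is needed in your situation depends on what exists locally.

```shell
#!/usr/bin/env bash
# Demonstrates shell globbing only; no dmftar call is made.
# mydata, dataA and dataB are hypothetical names.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
mkdir -p mydata/dataA mydata/dataB
touch mydata/dataA/data1-1.dat mydata/dataB/data1-1.dat

# Unquoted: the shell replaces the pattern with the matching local paths.
unquoted=$(echo mydata/data*/data1-1.dat)
# Quoted: the pattern is passed on literally.
quoted=$(echo 'mydata/data*/data1-1.dat')

echo "unquoted: $unquoted"
echo "quoted:   $quoted"

# Clean up the temporary directory.
cd / && rm -rf "$tmp"
```

Applied to the command above, this would mean writing the extraction pattern as '<data>/data*/data1-1.dat' when matching files are present locally.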

Remote Packing

dmftar can also directly transfer the newly created archive file to another system (like the user home folder on the Data Archive) by adding user and system information (provided key-based authentication and key-forwarding have been enabled):      

dmftar -c -f <login>@archive.surfsara.nl:<data>.dmftar <data>/
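
Because remote packing depends on key-based authentication, it can help to test the login before starting a long transfer. The sketch below is one possible approach, not an official procedure: it uses the standard OpenSSH option BatchMode=yes, which makes ssh fail instead of asking for a password. The names remote_pack, run and DRY_RUN are hypothetical helpers, and the check does not cover key forwarding.

```shell
#!/usr/bin/env bash
# Sketch: test key-based login, then pack directly to the Data Archive.
# run, remote_pack and DRY_RUN are hypothetical helper names.

# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

remote_pack() {
    local login="$1"
    local name="${2%/}"    # directory name without trailing slash
    local host="archive.surfsara.nl"

    # BatchMode=yes disables password prompts, so this succeeds only
    # when key-based login works.
    run ssh -o BatchMode=yes "${login}@${host}" true || {
        echo "key-based login to ${host} failed" >&2
        return 1
    }
    run dmftar -c -f "${login}@${host}:${name}.dmftar" "${name}/"
}

# Example with a hypothetical login and directory name:
# remote_pack jdoe mydata
```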

Remote Unpacking

dmftar can also directly unpack and restore files remotely from the Data Archive (provided key-based authentication and key-forwarding have been enabled):

dmftar -x -f <login>@archive.surfsara.nl:<data>.dmftar     

Removing Packed Files

As a security precaution, the archived files are read-only. To remove one from your local system after moving a copy to the Data Archive:

dmftar --delete-archive -f <data>.dmftar/

Advanced Options

To see a full list of the available advanced options:

dmftar
