dmftar is a utility designed by SURF specifically for the Data Archive. It is available on the HPC systems and can be used to largely automate the archive and restore processes. dmftar is a wrapper for the Linux tool GNU tar. It automatically creates multi-volume archive files (the volume size is configurable and defaults to 100 GB) and, if necessary, can also transfer the files to the archive file system.
The dmftar source code is available at https://gitlab.com/surfsara/dmftar.
Caveats to using dmftar
- dmftar is currently only available on Snellius and the Data Archive service nodes (instructions for accessing nodes can be found on our wiki pages for Snellius and Data Archive).
- For the remote archiving functionality (from system to system), key-based authentication needs to be set up.
- When using dmftar, please do not run too many processes at the same time, since that can stress the storage facility. Instead, if your data does not need to be archived quickly, stage jobs on Snellius or Lisa.
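The last caveat can be sketched as a loop that packs one directory at a time with the default packing command `dmftar -c -f <data>.dmftar <data>/`, so that only a single dmftar process runs at any moment. This is a minimal sketch, not a script provided by SURF: the function name `pack_sequentially` and the `DMFTAR` override variable are hypothetical, and it assumes that dmftar returns a non-zero exit status when packing fails.

```shell
#!/bin/bash
# Pack each directory given as an argument, strictly one after another.
# DMFTAR is a hypothetical override for the command name (handy for dry runs);
# when it is unset, the real dmftar command is used.
pack_sequentially() {
    local dir
    for dir in "$@"; do
        dir="${dir%/}"    # drop a trailing slash, if present
        # Assumption: a non-zero exit status means packing failed, so stop there.
        "${DMFTAR:-dmftar}" -c -f "${dir}.dmftar" "${dir}/" || return 1
    done
}
```

For example, `pack_sequentially run01 run02 run03` (hypothetical directory names) would pack the three directories in turn rather than in parallel.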
Help Function
You can access the help function using:
Default Packing
Standard command to pack a folder into an archive file (or into multi-volume archive files if the data is larger than 100 GB):
dmftar -c -f <data>.dmftar <data>/
The resulting data.dmftar file can now be moved using standard commands like “cp” or “mv”.
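To get a rough idea of how many volumes a multi-volume archive will contain, the data size can be divided by the volume size and rounded up. The helper below is a hypothetical sketch and not part of dmftar. It assumes that the default of 100 GB means 100 * 10^9 bytes and it ignores the small overhead that tar adds, so the result near a volume boundary is only an estimate.

```shell
#!/bin/bash
# Estimate the number of volumes needed for a given amount of data.
# Usage: estimate_volumes TOTAL_BYTES [VOLUME_BYTES]
# VOLUME_BYTES defaults to 100 GB, taken here as 100 * 10^9 bytes (assumption).
estimate_volumes() {
    local total="$1"
    local volume="${2:-100000000000}"
    # Integer ceiling division: a partly filled last volume still counts as one.
    echo $(( (total + volume - 1) / volume ))
}
```

With GNU `du`, a call such as `estimate_volumes "$(du -sb mydata/ | cut -f1)"` would feed in the size of a hypothetical `mydata/` directory in bytes.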
Verify All Files Packed Successfully
After packing, you can verify whether all the files in the archive file are also present in the original directory structure and vice versa:
dmftar --verify-content -f <data>.dmftar
Preview the Contents of a Packed File
To see a list of the files that have been packed into an archive file:
dmftar -t -f <data>.dmftar
Unpack All Files
To unpack all the files in an archive file:
dmftar -x -f <data>.dmftar
Unpack Specific Files and Folders
To unpack a single file (e.g. data1-1.dat) provide an extraction pattern containing the file name and path:
dmftar -x -f <data>.dmftar/ <data>/<data1>/data1-1.dat
To unpack specific subdirectories (e.g. data1 and data2), you can use shell brace expansion, which bash expands to one path per subdirectory:
dmftar -x -f <data>.dmftar/ <data>/data{1..2}
Wildcards can also be used if the option “-o=--wildcards” is given to indicate the presence of wildcards:
dmftar -x -o=--wildcards -f <data>.dmftar/ <data>/data*/data1-1.dat
This command will unpack the file data1-1.dat from every subfolder of the archive whose name matches data*.
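Because dmftar wraps GNU tar, the effect of such a wildcard pattern can be tried out on a small scratch archive with plain `tar`. The sketch below is only an illustration under that assumption: it calls GNU tar directly rather than dmftar, and all directory and file names (`mydata`, `data1`, `data2`, `other.dat`) are made up. The pattern is quoted so that the shell passes it to tar unexpanded, which is general shell practice rather than something stated in the dmftar documentation.

```shell
#!/bin/bash
set -e

# Build a tiny directory tree in a scratch location.
work="$(mktemp -d)"
mkdir -p "$work/mydata/data1" "$work/mydata/data2" "$work/restore"
echo a > "$work/mydata/data1/data1-1.dat"
echo b > "$work/mydata/data1/other.dat"
echo c > "$work/mydata/data2/data1-1.dat"

# Pack it with GNU tar (the tool that dmftar wraps).
tar -C "$work" -cf "$work/mydata.tar" mydata/

# Extract only the members that match the quoted wildcard pattern.
tar -C "$work/restore" -xf "$work/mydata.tar" --wildcards 'mydata/data*/data1-1.dat'

# List what was restored.
(cd "$work/restore" && find . -type f | sort)
```

Only the two `data1-1.dat` files should be restored, while `other.dat` stays in the archive.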
Remote Packing
dmftar can also directly transfer the newly created archive file to another system (like the user home folder on the Data Archive) by adding user and system information (provided key-based authentication and key-forwarding have been enabled):
dmftar -c -f <login>@archive.surfsara.nl:<data>.dmftar <data>/
Remote Unpacking
dmftar can also directly unpack and restore files remotely from the Data Archive (provided key-based authentication and key-forwarding have been enabled):
dmftar -x -f <login>@archive.surfsara.nl:<data>.dmftar
Removing Packed Files
As a security precaution, the archived files are read-only. To remove an archive from your local system after a copy has been moved to the Data Archive:
dmftar --delete-archive -f <data>.dmftar/
Advanced Options
To see a full list of the available advanced options: