Accessing the Archive

How can I gain access to the archive?

See the following article on accessing the archive:

Data Archive: Accessing the service

What is the price of the archive?

The cost of the archive and all other SURF services is listed in the SURF Services and Rates 2023 brochure (English) and in the SURF Diensten en tarieven brochure (Dutch).

What is the ‘technical contact person’ and why do we need one?

A technical contact person is the Data Archive team’s main point of contact at a member institution. We work with this person on managing the archive, for example when new archive users need to be added to group accounts.

How can I increase my data limit?

The options for increasing your data limit are the same as the initial options for obtaining storage on the archive. If you have an E-Infra or NWO grant, you should fill out a new request and refer to the original grant when doing so. A grant award means that: 1) the storage is funded by NWO; 2) the storage is tied to a contract duration (usually 1-2 years); and 3) the award depends on a computational requirement.

Alternatively, you can purchase a contract, which means that: 1) storage is purchased (not awarded); 2) storage is kept for up to 5 years per contract; and 3) the contract is based on purchasing blocks of 10 TB. More information about the pricing of contracts can be found in the SURF Services and Rates 2023 brochure.
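As a rough illustration of what purchasing in blocks of 10 TB means for sizing a contract, the sketch below rounds a storage need up to whole blocks. The function name blocks_needed is hypothetical, and the sketch does not calculate prices; those are listed in the brochure.

```python
import math

BLOCK_SIZE_TB = 10  # contracts are based on blocks of 10 TB


def blocks_needed(required_tb: float) -> int:
    """Return how many 10 TB blocks cover required_tb (hypothetical helper)."""
    if required_tb <= 0:
        return 0
    # Round up: a partially used block still has to be purchased in full.
    return math.ceil(required_tb / BLOCK_SIZE_TB)


print(blocks_needed(23))  # 3 blocks, i.e. 30 TB of contracted storage
```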

Why have I reached my data limit if my grant was for more data than I currently have archived?

Every user on the archive draws on a budget from a grant or a contract, and a budget may be shared by multiple users. As a result, your user group may have reached the agreed data limit even though your own usage is well under it. To inspect your current archive usage, refer to the article that explains how to use dsbudgetinfo.

Using the Archive

Where can I find the acceptable use policy?

We describe some important practices to follow when using the Data Archive in the Data Archive: Acceptable Use Policy.

Do I need tape storage for my project?

The Data Archive is a tape library meant for long-term storage of large quantities of static research data. That means that you will likely need tape storage to archive your data after your project (or your part of a project) ends. If you are looking for a service to share data, or to store volatile data while you work with it during your project, you should probably consider Research Drive or Object Store.

If you use Snellius or Research Cloud, you may want to look into the file systems that these services offer. See the article describing the Snellius hardware and file systems.

If you are still debating whether you need tape storage for your project, you can discuss your project’s potential storage needs with a SURF advisor by contacting us through the service desk.

How should I prepare my files for optimal archiving?

We recommend that you package your files into tar archives so that you do not end up archiving a large number of small files.

The dmftar tool was developed at SURF to help you optimally tar your files. Please use the following command to check how to use this tool:

dmftar --help

What does it mean to stage the data?

The Data Archive consists of a disk pool and a tape library. Archived content can reside on our disk pool (online), in the tape library (offline) or both (dual). If your files are online or in dual mode, they are ‘staged’ on the disk pool and ready for download. If they are on tape only (offline), they are not available on the disk pool and need to be staged from tape onto the disk pool first.

You can identify the status of files using the following command:

dmls -l

It will give you a list of your files and their statuses. The documentation about DMF Statuses explains the differences in more depth.
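To make the online/offline distinction concrete, here is a minimal sketch that scans text in the style of dmls -l output and lists the files that would still need staging. It assumes that the state appears as a three-letter code in parentheses just before the file name, and that OFL marks offline files and DUL marks dual-state files; check the DMF Statuses documentation for the actual codes. The function name files_to_stage and the sample listing are hypothetical.

```python
import re

# Assumption: "OFL" is the code for files that are on tape only (offline).
OFFLINE_STATES = {"OFL"}

# Assumption: each line ends with "(STATE) filename".
STATE_PATTERN = re.compile(r"\((?P<state>[A-Z]{3})\)\s+(?P<name>.+)$")


def files_to_stage(dmls_output: str) -> list[str]:
    """Return the names of files whose state is offline (hypothetical helper)."""
    names = []
    for line in dmls_output.splitlines():
        match = STATE_PATTERN.search(line)
        if match and match.group("state") in OFFLINE_STATES:
            names.append(match.group("name"))
    return names


# Made-up listing for illustration only, not real archive output.
sample = """\
-rw-r----- 1 user group 1048576 Jan 10 09:00 (DUL) results.tar
-rw-r----- 1 user group 2097152 Jan 11 09:00 (OFL) raw_data.tar
"""
print(files_to_stage(sample))  # ['raw_data.tar']
```

Files reported by such a check are the ones that have to be staged from tape before they can be downloaded quickly.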

Why are my files taking a long time to download?

Downloading files that are offline (on tape) can take a long time. The standard practice is to stage files before downloading them. It is likely that the files you are trying to download are not yet staged.

You can check the status of the files in a directory using the following command:

dmls -l

The user documentation explains this in more detail:

Data Archive: DMF commands

You can also follow the dmftar documentation to learn how to stage your data using the command:

dmget <file name>

I forgot my password. What do I do?

You can reset a forgotten password by following the password reset instructions. Please note that 3 successive failed login attempts will cause your IP address to be banned from the service for 10 minutes. Each further failed attempt after this doubles the ban time (e.g. 10 min, 20 min, 40 min, 80 min, etc.). We strongly recommend double-checking your credentials or resetting your password before trying again after a ban.
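The doubling rule can be worked through with a short sketch. It assumes one reading of the rule above: the third successive failure triggers the 10-minute ban, and every failure after that doubles it. The function name ban_minutes is hypothetical and does not reflect how the service itself is implemented.

```python
BASE_BAN_MINUTES = 10    # ban length after the third successive failure
FAILURES_BEFORE_BAN = 3  # failures needed to trigger the first ban


def ban_minutes(successive_failures: int) -> int:
    """Return the assumed ban length in minutes (hypothetical helper)."""
    if successive_failures < FAILURES_BEFORE_BAN:
        return 0  # no ban yet
    extra_failures = successive_failures - FAILURES_BEFORE_BAN
    # Each failure beyond the third doubles the ban time.
    return BASE_BAN_MINUTES * 2 ** extra_failures


for failures in range(3, 7):
    print(failures, ban_minutes(failures))  # 3 10, 4 20, 5 40, 6 80
```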

I tried to log in and received a "Connection time out" error message. What is wrong?

You can check the service status page to see if the archive is currently up and running as expected. If there is an outage, an estimate of the downtime is given in the remarks section and communicated by email.

If the service is up and running, your IP address may be banned because of too many failed login attempts. As noted in the answer on forgotten passwords, 3 successive failed login attempts trigger a ban. This can also happen if your institution uses NAT, so that all its traffic comes from a single IP address, and another user behind that address has triggered a ban. The length of the ban depends on how many unsuccessful attempts have been made. If your IP address has been banned, you will receive a "REJECT" message from ICMP. If the ban is not lifted after 10 minutes, you are welcome to create a service desk ticket to ask for help getting it lifted.
