In iCE, researchers do not have to manage backups or replication, or worry about storage resources and quotas. The storage tiering framework of iRODS can automatically move data between any number of identified storage tiers within a configured tiering group. iCE applies a single, system-wide storage tiering policy across all projects, which dictates how data moves through the storage resources.
iCE consists of the following three storage resources:
- Tier 0 - based on Ceph storage
- Tier 1 - based on SURF Object Store
- Tier 2 - based on SURF Data Archive (tape)
At the time of writing, the following quota policy is in place per project and per tier:
Tier | Quota |
---|---|
Tier 0 | 1 TB |
Tier 1 | 1 TB |
Tier 2 | 1 TB |
Tier 0 - based on Ceph storage
The iRODS resource name for Tier 0 is surfResc1. Tier 0 is Ceph storage and is the default resource on which incoming data first lands.
When Tier 0 usage exceeds 70% of the quota, the oldest files are moved from Tier 0 to Tier 1 until usage drops below 70%.
Tier 1 - based on SURF Object Store
The iRODS resource name for Tier 1 is surfObjStore1. Tier 1 is SURF Object Store storage and is used as the main data storage. It is WORM (write once, read many) storage. Data here is distributed over three data centers and as such is stored very securely and is highly available. In addition, the new version of SURF Object Store supports versioning when files are overwritten, so data stored on SURF Object Store has revision functionality available immediately.
The Tier 1 policy is that when usage exceeds 70% of the quota, the oldest files larger than 500 MB are moved from Tier 1 to Tier 2 until usage drops below 70%.
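The threshold rule used for Tier 0 and Tier 1 can be illustrated with a minimal Python sketch. This is not the iRODS storage tiering implementation: `DataObject`, `select_for_migration` and `MIN_ARCHIVE_SIZE` are hypothetical names, and both the decimal reading of 500 MB and the behaviour at exactly 70% are assumptions made for illustration.

```python
from dataclasses import dataclass

# Assumption: "500 MB" is read as 500 * 10**6 bytes.
MIN_ARCHIVE_SIZE = 500 * 10**6


@dataclass
class DataObject:
    name: str
    size: int      # bytes
    created: str   # "YYYY-MM-DD.HH:MM", which sorts chronologically as text


def select_for_migration(objects, quota_bytes, threshold=0.70, min_size=0):
    """Pick the objects to move one tier down, oldest first.

    Selection stops as soon as the remaining usage is no longer above
    threshold * quota_bytes. Objects whose size is not larger than
    min_size are skipped and stay in the current tier.
    """
    used = sum(obj.size for obj in objects)
    limit = threshold * quota_bytes
    selected = []
    for obj in sorted(objects, key=lambda o: o.created):
        if used <= limit:
            break  # usage is back under the threshold
        if obj.size <= min_size:
            continue  # too small to be eligible for the next tier
        selected.append(obj)
        used -= obj.size
    return selected
```

For the Tier 0 rule the function would be called with the default `min_size=0`; for the Tier 1 rule, `min_size=MIN_ARCHIVE_SIZE` models the restriction that only large files move on to Tier 2.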
Tier 2 - based on SURF Data Archive (tape)
The iRODS resource name for Tier 2 is surfArchive. Tier 2 is the SURF Data Archive, which is based on DMF tape storage, and is used as cold storage. It is the most cost-effective storage, but only large files (> 500 MB) can be stored there. Users can move data back to Tier 1 when they need to access data that is in Tier 2.
How to check which tier a file is in
To see which tier a file is in, you can execute the following iCommand:
$ ils -L
/snow/home/icepocsurf:
member1 0 surfResc1 1835 2023-04-05.14:46 & testFile1.txt
generic /data/home/icepocsurf/testFile1.txt
member1 2 surfArchive 1073741824 2023-03-08.16:59 & testFile2.txt
generic /nfs/archive/home/icepocsurf/testFile2.txt
member3 1 surfObjStore1 153600 2023-03-07.15:37 & testFile3.txt
generic /surfS3Resc1/home/icepocsurf/testFile3.txt
member5 1 surfObjStore1 1024 2023-03-02.12:44 & testFile4.txt
generic /surfS3Resc1/home/icepocsurf/testFile4.txt
member1 2 surfArchive 524288000 2023-03-07.17:14 & testFile5.txt
generic /nfs/archive/home/icepocsurf/testFile5.txt
The example above lists five files: testFile1.txt, testFile2.txt, testFile3.txt, testFile4.txt and testFile5.txt.
The output shows that testFile1.txt is very small and, because Tier 0 usage is not above 70% of the quota, the file stays on the Ceph resource (surfResc1).
The files testFile3.txt and testFile4.txt are in Tier 1, so they are stored on the SURF Object Store resource (surfObjStore1). The files testFile2.txt and testFile5.txt are in Tier 2: they have been moved there based on their creation time and large size, so they are stored on the SURF Data Archive resource (surfArchive).
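If you want to check many files at once, the `ils -L` listing can be post-processed. The following is a minimal Python sketch, assuming the line layout shown in the example output and the resource names that appear there; `TIER_BY_RESOURCE`, `REPLICA_LINE` and `tiers_from_ils` are hypothetical names introduced for illustration, not part of iRODS.

```python
import re

# Resource-to-tier mapping as described on this page; the resource names
# are the ones that appear in the example `ils -L` output.
TIER_BY_RESOURCE = {"surfResc1": 0, "surfObjStore1": 1, "surfArchive": 2}

# Matches replica lines such as:
#   member1 0 surfResc1 1835 2023-04-05.14:46 & testFile1.txt
REPLICA_LINE = re.compile(
    r"^\s*(?P<owner>\S+)\s+(?P<replica>\d+)\s+(?P<resource>\S+)\s+"
    r"(?P<size>\d+)\s+(?P<modified>\d{4}-\d{2}-\d{2}\.\d{2}:\d{2})\s+"
    r"(?:&\s+)?(?P<name>.+)$"
)


def tiers_from_ils(output):
    """Return a (file name, resource, tier) tuple for every replica line.

    Lines that do not look like a replica line (the collection header and
    the physical-path lines) are ignored. The tier is None when the
    resource is not one of the three known resources.
    """
    rows = []
    for line in output.splitlines():
        match = REPLICA_LINE.match(line)
        if match is None:
            continue
        resource = match["resource"]
        tier = TIER_BY_RESOURCE.get(resource)
        rows.append((match["name"].strip(), resource, tier))
    return rows
```

The text passed to `tiers_from_ils` could, for example, be captured with `subprocess.run(["ils", "-L"], capture_output=True, text=True).stdout` in a session where the iCommands are already configured.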
How to check your quota
You can easily check your project's storage quota at any given time by executing the following iCommand:
$ iquota
Resource quotas for user member1:
None
Global (total) quotas for user member1:
None
Group quotas on resources:
Resource: surfObjStore1
Group: icepocsurf
Zone: snow
Quota: 10,000,000,000 (10 billion) bytes
Over: -9,999,845,250 (-9 billion) bytes (Nearing quota)
Resource: surfResc1
Group: icepocsurf
Zone: snow
Quota: 2,000,000,000 (2 billion) bytes
Over: -1,999,998,904 (-1 billion) bytes (Nearing quota)
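The `iquota` output does not print the current usage directly, but it can be derived from the two numbers shown. The following minimal Python sketch assumes that `Over` is usage minus quota, so that a negative value is the headroom that is still free; `parse_iquota` and `over_threshold` are hypothetical helper names, and the 70% default mirrors the tiering threshold described on this page.

```python
import re


def _first_int(line):
    """Return the first integer in the line, ignoring thousands separators."""
    match = re.search(r"-?\d[\d,]*", line)
    return int(match.group().replace(",", ""))


def parse_iquota(output):
    """Return {resource: (quota_bytes, used_bytes)} from iquota text.

    Assumption: Over = used - quota, hence used = quota + Over.
    """
    result = {}
    resource = quota = None
    for raw in output.splitlines():
        line = raw.strip()
        if line.startswith("Resource:"):
            resource = line.split(":", 1)[1].strip()
        elif line.startswith("Quota:"):
            quota = _first_int(line)
        elif line.startswith("Over:") and resource is not None and quota is not None:
            result[resource] = (quota, quota + _first_int(line))
            resource = quota = None
    return result


def over_threshold(quota_bytes, used_bytes, threshold=0.70):
    """True when usage is above the given fraction of the quota."""
    return used_bytes / quota_bytes > threshold
```

Applied to the example output above, this arithmetic gives 10,000,000,000 - 9,999,845,250 = 154,750 bytes in use on surfObjStore1 and 2,000,000,000 - 1,999,998,904 = 1,096 bytes in use on surfResc1, both far below the 70% threshold.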
Getting data back from the archive
Please refer to the iRODS instructions here.