This page contains the instructions for the party that wishes to provide data in SANE, a role referred to below as the 'data provider'. The data provider is the only role that can upload data, give access to the data, and release output results.
1. Accept the SANE Collaboration invite
The researcher is responsible for initiating the request for a SANE collaboration (see Requesting a SANE project). This is intentional for two reasons: 1) to reduce the effort required from the data provider, and 2) to make sure funding is available through the researcher (data providers should not need their own budget). Once these steps are completed, you will receive an invitation to the SANE Collaboration, which you should accept.
2. Invite data provider(s)
As a data provider, you might want to add more data providers to the project. First, invite them as Collaboration admins, as shown in: Invite admins and members to a collaboration. In addition, invite them to the groups in the Collaboration that give them the correct role in the Research Cloud environment. For data providers, these groups are src_co_admin and src_co_developer.
3. Prepare the upload environment
- Go to the SURF Research Cloud portal and log in
- Click on the "Networks (advanced)" tab and then click on the "+" button to create a new internal network
- Select the Collaboration from step 1 ("Starting a Collaboration")
- Select in the Cloud Provider tab: "SURF HPC Cloud Network"
- Give the network a name (e.g. SANE network) and finish the wizard
- Go to "Profile" → Expand the tile of the Collaboration from step 1
- Click on the "Secrets" tab
- Click on the "+" button to add a Secret with the name "SANE_SMBPW" and a random value
- Click on the "+" button to add a Secret with the name "SANE_SMBUSER" and a random value (the username must comply with the standards imposed by POSIX and Ubuntu as outlined here)
- Go back to the homepage
- Click on the "Create new" button in the 'Create new storage' tile
- Select the Collaboration from step 1 ("Starting a Collaboration")
- Select in the Cloud Provider tab: "SURF HPC Cloud volume"
- Select a flavour that is large enough to store the sensitive data (and any generated output results of the research)
- Give the volume a name (e.g. SANE volume) and finish the wizard
- Click on the "Create new" button in the 'Create new workspace' tile
- Search and select the "SANE Data server" catalog item
- In the "Options" tab, select the internal network and the storage volume you created earlier in this section
- Finish the wizard
- (Optional) If you want to upload data using ResearchDrive you need to connect your Research Drive account before starting the "SANE Data Provider Portal"
- Click on the "Create new" button in the 'Create new workspace' tile
- Search and select the "SANE Data Provider Portal" catalog item
- In the "Options" tab, select the internal network you created earlier in this section
- Finish the wizard
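The two secrets created above need random values, and the SANE_SMBUSER value also has to be a valid username. One way to generate such values locally is sketched below. The authoritative username rules are the ones linked in the step above. The pattern used here (a lowercase letter followed by lowercase letters, digits, hyphens, or underscores, at most 32 characters) is an assumption based on Ubuntu's default adduser settings, and the "sane" prefix and the lengths are arbitrary illustrative choices.

```python
import re
import secrets
import string

# Assumption: Ubuntu's default adduser naming pattern and a 32-character limit.
# Check the rules linked in the step above before relying on this.
USERNAME_PATTERN = re.compile(r"[a-z][-a-z0-9_]*")
MAX_USERNAME_LENGTH = 32


def is_valid_username(name: str) -> bool:
    """Return True if name satisfies the assumed username rules."""
    if len(name) > MAX_USERNAME_LENGTH:
        return False
    return USERNAME_PATTERN.fullmatch(name) is not None


def random_username(prefix: str = "sane", suffix_length: int = 8) -> str:
    """Build a username from a fixed prefix plus random lowercase letters and digits."""
    alphabet = string.ascii_lowercase + string.digits
    suffix = "".join(secrets.choice(alphabet) for _ in range(suffix_length))
    return prefix + suffix


def random_password(length: int = 32) -> str:
    """Build a random password from ASCII letters and digits."""
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))


if __name__ == "__main__":
    user = random_username()
    # Guard against a prefix or length that breaks the assumed rules.
    if not is_valid_username(user):
        raise ValueError(f"generated username {user!r} is not valid")
    print("SANE_SMBUSER:", user)
    print("SANE_SMBPW:", random_password())
```

The printed values can then be pasted into the "Secrets" tab. The `secrets` module is used instead of `random` because it draws from the operating system's cryptographically secure random source.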
4. (Optional) Add Python packages to Tinker SANE user workspace
Researchers often need additional packages to extend the functionality of their analysis software or to perform specific tasks. If they are using Tinker SANE, you can simplify this process by automatically installing a predefined list of Python packages in the researcher’s new environment.
We provide a default list of essential Python packages, commonly used for data analysis and scientific research. Researchers can also create their own custom lists of required packages by following the same file format.
If you decide to accept a user’s request to add a custom list of Python packages, please consult the detailed guide: Adding Python packages to Tinker SANE
We plan to introduce support for R packages by 2025, which will allow similar functionality for researchers working in R. This update will extend the flexibility of Tinker SANE environments to cover multiple programming languages.
- Adding packages from a list is only possible in newly created Tinker SANE environments. For existing environments, you can either create a new environment or manually install the required packages (see Step 6). This limitation exists because adding packages to an existing environment might interfere with the current package setup.
- This feature is currently only available for Tinker SANE environments, and is not supported in Blind SANE environments.
To manually add Python or other programming language packages (such as R), follow the instructions outlined in Step 6 for package installation.
5. Upload the sensitive data and packages
- From the Research Cloud portal, log into the Data Provider Portal. There are three options:
- Use the "Access" button in a browser with TOTP
- Copy the IP address to any Remote Desktop client
- Use a terminal to log in with SSH
- Use the Data Provider Portal to copy sensitive data and, if applicable, data received from the researcher (which you have accepted) as well as software packages to the data server.
- The SANE data server can be found in the Data Provider Portal at /data/sane-data
- You can copy data to the /data/sane-data/source folder
- You can copy the software packages to the /data/sane-data/scripts folder
- Copy the data using, for example, Cyberduck, rsync, or ResearchDrive (other options are WinSCP or scp on the command line)
- Cyberduck: Upload data to a workspace with Cyberduck
- ResearchDrive: Connect Research Drive
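For the rsync option, the copy can be scripted. The sketch below only builds the rsync command and prints it, so it can be inspected before anything is transferred. It assumes that rsync is installed locally and that SSH login to the Data Provider Portal works. The user name, IP address, and local folder are hypothetical placeholders; the two destination folders are the ones named above.

```python
import shlex
import subprocess
from pathlib import PurePosixPath

# Destination folders on the SANE data server, as seen from the Data Provider Portal.
SOURCE_FOLDER = PurePosixPath("/data/sane-data/source")
SCRIPTS_FOLDER = PurePosixPath("/data/sane-data/scripts")


def build_rsync_command(
    local_path: str,
    user: str,
    host: str,
    remote_folder: PurePosixPath = SOURCE_FOLDER,
) -> list[str]:
    """Return an rsync command that copies local_path into remote_folder over SSH."""
    # -a keeps the directory structure, permissions, and timestamps; -v lists each file.
    # Without a trailing slash on local_path, rsync copies the directory itself
    # (not only its contents) into the remote folder.
    return ["rsync", "-av", local_path, f"{user}@{host}:{remote_folder}/"]


def upload(command: list[str]) -> None:
    """Run the transfer and raise CalledProcessError if rsync reports a failure."""
    subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical values: replace with your own user name and the portal's IP address.
    data_cmd = build_rsync_command("./dataset", "myuser", "203.0.113.10")
    pkg_cmd = build_rsync_command("./packages", "myuser", "203.0.113.10", SCRIPTS_FOLDER)
    print(shlex.join(data_cmd))
    print(shlex.join(pkg_cmd))
    # Call upload(data_cmd) and upload(pkg_cmd) once the printed commands look right.
```

Building the command as a list rather than a single string avoids shell quoting problems when local paths contain spaces.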
The data is now available on the SANE data server, which the researcher can attach to the SANE analysis environments (either Tinker or Blind). To do so, the researcher connects their analysis environment to the private network that was created in Step 3. Notify the researcher that the data is ready to be analysed; they can then follow the Researcher instructions.
6. Invite researcher(s)
Now that the basic setup for a SANE project is in place, it is time to invite the researchers who are allowed to access the data in this SANE Collaboration. Exact instructions for adding researchers to the Collaboration can be found here: Invite admins and members to a collaboration
7. Review output results
The researcher (in the case of Tinker SANE) or the script (in the case of Blind SANE) will place the output results in the /sane-data/results folder and will inform the data provider of this outside of the SANE system (e.g. via e-mail or SURF Filesender). The data provider uses the SANE Data Provider Portal to review these output results. The data provider can download approved output data using the same procedure as for uploading data, and send it to the researcher outside of SANE (e.g. via e-mail).
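Before opening individual files, it can help to get an overview of everything the researcher or script has placed in the results folder. The following is a minimal sketch, not part of SANE itself: it lists every file with its size and SHA-256 checksum, so the reviewer can see what is there and later confirm that the approved files are the ones that were downloaded. The results path is taken from the text above and may differ in your workspace.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def list_results(results_dir: str) -> list[tuple[str, int, str]]:
    """Return (relative path, size in bytes, SHA-256) for every file under results_dir."""
    root = Path(results_dir)
    entries = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            relative = str(path.relative_to(root))
            entries.append((relative, path.stat().st_size, sha256_of(path)))
    return entries


if __name__ == "__main__":
    # Path as given in the text above; adjust it if your results folder is elsewhere.
    for name, size, checksum in list_results("/sane-data/results"):
        print(f"{size:>12}  {checksum[:16]}  {name}")
```

Unexpectedly large files or files the researcher did not announce are worth a closer look before anything is released.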