In the Blind SANE environment, you cannot see the sensitive data you work with. Instead, you prepare a script or container that is executed 'blindly' in this environment, without access to the internet. The data provider may choose to provide you with a sample and/or synthetic data set to help you prepare your analysis script. The data provider checks and releases the outputs of the script. This document describes the steps to work with Blind SANE.
Your analysis can run via a Python script or a Docker container. Currently, the default method is a Python script. If you wish to use a Docker container, please contact the SURF servicedesk to enable this for your project.
A Python script
Your public git repository should contain a file named script.py.
If the script is prepared by the data provider, it is located in the /data/sane-data/scripts/ folder.
The repository should also contain a requirements.txt file listing all the pip packages needed for the analysis. For more information, see: https://pip.pypa.io/en/stable/reference/requirements-file-format/#
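Because the script runs without internet access and you cannot watch it run, a missing entry in requirements.txt is hard to diagnose afterwards. The sketch below is one possible local pre-check, not part of Blind SANE itself: it reads requirement lines and reports the packages that are not installed in your current Python environment. The function name missing_requirements is hypothetical, and the parsing is deliberately simple (URL-based requirements and pip options are skipped).

```python
import re
from importlib import metadata

# Matches the distribution name at the start of a requirement line,
# stopping before version specifiers, extras, or environment markers.
_NAME = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*")


def missing_requirements(lines):
    """Return the names of listed packages that are not installed locally.

    Hypothetical helper for illustration; it is not provided by SANE or pip.
    """
    missing = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        # Skip blanks, pip options such as "-r other.txt", and URL requirements.
        if not line or line.startswith("-") or "://" in line:
            continue
        match = _NAME.match(line)
        if match is None:
            continue  # e.g. a local path requirement; not handled in this sketch
        name = match.group(0)
        try:
            metadata.version(name)  # raises if the distribution is not installed
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing
```

Run in a fresh virtual environment created from the same requirements.txt, calling missing_requirements(open("requirements.txt").read().splitlines()) should return an empty list. This only checks that the listed packages are installed; it does not check that every import in script.py is listed.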
The Python script will be called as such:
python3 script.py -i <input-dir> -o <output-dir> -t <temp-dir> 1> stdout.txt 2> stderr.txt
Everything the script prints to stdout and stderr is redirected to stdout.txt and stderr.txt respectively, and both files are made available in the output directory.
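The calling convention above can be matched with a small argparse skeleton. This is a minimal sketch under assumptions: the "analysis" (counting input files) and the output file name summary.txt are placeholders, and the source does not state whether the output and temp directories already exist, so the sketch creates them if needed. Only the -i, -o and -t flags come from the command shown above.

```python
import argparse
import sys
from pathlib import Path


def parse_args(argv=None):
    """Parse the -i/-o/-t flags that the script is called with."""
    parser = argparse.ArgumentParser(description="Blind SANE analysis sketch")
    parser.add_argument("-i", dest="input_dir", type=Path, required=True,
                        help="directory containing the sensitive input data")
    parser.add_argument("-o", dest="output_dir", type=Path, required=True,
                        help="directory to which results are written")
    parser.add_argument("-t", dest="temp_dir", type=Path, required=True,
                        help="directory for temporary files")
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    # Assumption: the directories may not exist yet, so create them defensively.
    args.output_dir.mkdir(parents=True, exist_ok=True)
    args.temp_dir.mkdir(parents=True, exist_ok=True)

    # Placeholder analysis: count the input files without printing their contents.
    files = [p for p in args.input_dir.rglob("*") if p.is_file()]
    print(f"found {len(files)} input file(s)")  # captured in stdout.txt
    if not files:
        print("warning: input directory is empty", file=sys.stderr)  # stderr.txt

    # Hypothetical result file; replace with your real outputs.
    (args.output_dir / "summary.txt").write_text(f"files={len(files)}\n")
    return 0
```

To use this as script.py, call sys.exit(main()) under the usual if __name__ == "__main__": guard. Since the data provider reviews stdout.txt and stderr.txt together with the results, it is safer to log counts and status messages rather than raw records.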
A Docker container
When writing your container, keep in mind that the sensitive dataset will be made available to you in the directory /input within the container and that your results should be written to /output within the container.
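As an illustration of the /input and /output convention, the sketch below shows what a Python entry point inside such a container might look like. It is an assumption-laden example rather than a required layout: the environment variables ANALYSIS_INPUT_DIR and ANALYSIS_OUTPUT_DIR are hypothetical overrides added only so the same code can be tried locally on sample or synthetic data, and the result file name result.json and the file-size "analysis" are placeholders.

```python
import json
import os
from pathlib import Path

# Paths used inside the container, as described above.
DEFAULT_INPUT = Path("/input")
DEFAULT_OUTPUT = Path("/output")


def resolve_dirs(env=os.environ):
    """Return (input_dir, output_dir), honouring optional local overrides.

    ANALYSIS_INPUT_DIR / ANALYSIS_OUTPUT_DIR are hypothetical names used
    for local testing only; Blind SANE does not define them.
    """
    input_dir = Path(env.get("ANALYSIS_INPUT_DIR", DEFAULT_INPUT))
    output_dir = Path(env.get("ANALYSIS_OUTPUT_DIR", DEFAULT_OUTPUT))
    return input_dir, output_dir


def run(input_dir, output_dir):
    """Placeholder analysis: aggregate file sizes and write one JSON result."""
    sizes = [p.stat().st_size for p in input_dir.rglob("*") if p.is_file()]
    result = {"n_files": len(sizes), "total_bytes": sum(sizes)}
    output_dir.mkdir(parents=True, exist_ok=True)
    (output_dir / "result.json").write_text(json.dumps(result, indent=2))
    return result
```

Inside the container the entry point would call run(*resolve_dirs()) with no overrides set, so the defaults /input and /output apply. Locally, pointing the two variables at a folder with synthetic data lets you check the output format before submitting.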
Go to the SURF Research Cloud portal and log in with the identity with which the data provider invited you to the Collaborative Organisation (CO).
In the field 'python_script_source', fill in the public git repository containing your script.py (ending with .git) OR the name of the folder within the /data/sane-data/scripts/ folder in which the data provider placed the script.py file.
In the field 'docker_repo_url', fill in the public git repository URL to your Dockerfile OR, in the field 'docker_image_name', provide the name of your Docker image.
Your analysis will run in the background and you will receive an e-mail when the analysis has been completed. Upon completion, it is your responsibility to delete the workspace. Not deleting the workspace will result in credits being unnecessarily consumed.
The results are written to the directory /results.
Ask the data provider outside of SANE (e.g. via e-mail) to review these results. The data provider will make the output results available outside of SANE. The data will not be deleted when you delete your workspace (as it is a shared filesystem).