Current

  • 2025.05.13 - 2025.09.13
    • The Research Drive (RD) service component is being updated
        • RD will migrate the software backend from ownCloud to Nextcloud
        • The migration of the different RD instances will be done in a staged process
        • Check the planned downtime for the instance you use
        • Your RD environment is expected to be available by the end of the maintenance day
        • A user migration documentation page describes the changes
        • The following communication was sent to all RD Community instance users. Its content also applies to users of the remaining RD-branded instances.

          Dear user,

          You receive this mail because you are a user on the SURF Community environment.

          We would like to inform you about the upcoming changes to the SURF Research Drive environment, which will be carried out on Tuesday 20th May. These changes are necessary to improve the scalability and security of the system and to continue to provide a good experience using Research Drive in the future. While we are trying to make this transition as transparent and smooth as possible, it is unfortunately inevitable that you may need to take some actions as well.

          In this email we would like to explain what you can expect.

          What will change
          During this maintenance, the underlying engine of Research Drive will be replaced by a new engine. Research Drive is currently using the ownCloud software as a software backend and we will replace this with Nextcloud. This will not have any impact on your data within Research Drive, but it may have an impact on the way you can access your data. For example, the layout of Research Drive will look slightly different, certain features will have to be accessed in a different way and you may have to set up a new connection via the Desktop Client or WebDAV. See our migration wiki for the full overview of these changes.

          During the time of maintenance
          During the maintenance, Research Drive will be completely inaccessible. Please note that the environment will not be accessible for several hours that day. If you need data on the day of the maintenance, please make this data available elsewhere prior to the date of the migration (for example, by storing a local copy of this data on your laptop or on an external hard drive).

          After the maintenance
          The environment is expected to be fully available again by the end of the day. Your files are still in the same place as you are used to. However, a number of things have changed compared to before the migration to Nextcloud. For example, the user interface has changed slightly, custom groups have been renamed to "Teams" and there are more options and permissions available when you want to share with another user or a team of users. In addition, you may need to re-establish an external connection.

          Desktop Client or WebDAV connection
          This is especially the case if you previously used the ownCloud Desktop Client or a WebDAV connection on your own system. These changes, and the steps you may need to take to set up a new remote connection, are summarised on this page.

          If you have any further questions or uncertainties, contact us via the SURF Service Desk.

          Kind regards,
          Team SURF Research Drive

    • SURF Research Cloud (SRC) platform users using RD will be impacted

  • Displayed storage usage may be inaccurate. Please check back for further notice.

Future 

  • n.a.


Past

  • 2025.07.02 10:00 - 2025.07.07 10:19
    • The HPC Cloud (HPCC) is less available due to issues with the network infrastructure
        • No new workspaces can be created on the HPCC provider
        • No Pause / Resume can be done on HPCC workspaces
        • Existing HPCC workspaces will continue to operate as normal
      • The issue is related to an overload on a set of network components
      • Recovery fixes are being applied, but the underlying issue(s) are proving hard to fix
      • The technical team has all hands on deck to mitigate/fix the issue as soon as possible while looking for long-term solutions
      • We'll keep you posted
      • The HPCC's workspace creation functionality is stable enough to resume operations
      • All workspace status functionality (rebooting, pausing and resuming) should operate as normal
      • Please note that some workspaces may still fail to be created (estimated failure rate of less than 5%)
      • We are closely monitoring the platform.
      • There were no failures with new workspaces instantiated in the HPC Cloud provider
      • The component is fully operational & network issues are mitigated
      • The team will continue to work on a long term solution
  • 2025.07.02 - 2025.07.04 08:00
    • The AWS Cloud provider is unavailable
    • The investigation by the technical team was inconclusive
  • 2025.06.11 09:06 - 09:44

    • Access to the SRC portal was not possible due to a DoS on the SRAM component

  • 2025.06.10 00:30 - 09:35

    • The HPC Cloud provider will be less available

    • The HPC Cloud (HPCC) API will be unavailable due to updates on the underlying network infrastructure. Impact assessment:

      • No new workspaces can be created on the HPCC provider
      • No Pause / Resume can be done on HPCC workspaces
      • Existing HPCC workspaces will continue to operate as normal
  • 2025.05.15 06:00 - 15:00
    • Synchronisation issues between SRC platform and SRAM component after a version release
    • The SRC is displaying collaboration groups as separate collaborations
    • New users in SRC without collaborations
    • Issues updating collaboration membership and group changes in SRC

  • 2025.05.12
    • A medium score security vulnerability was disclosed
      • SANE users not part of the src_co_admin CO group were granted access to the Data Provider Portal
    • SANE Data Provider computing workspaces created between 7 April and 12 May 2025 are affected
    • Impact assessment and guidelines to secure and audit the issue shared with affected users
    • The issue is resolved for all computing resources created after 12 May 2025
  • 2025.04.23 08:00 - 09:16
    • The accounting / budgeting service component will be in maintenance.
    • Impact: wallet changes will be delayed.
  • 2025.03.24 16:55 - 2025.04.14 12:00
    • Access to Oracle cloud provider resources was paused due to an ongoing security incident investigation.

Summary

A security breach at Oracle Cloud has surfaced. While no direct evidence confirms SURF’s exposure, proactive security measures have been taken to mitigate any potential risk.

Impact

No workspaces can be created on Oracle cloud resources until further notice.

    • Access to Oracle was resumed.

Summary

After considerable research:

  • No active workloads, services, or sensitive data were hosted in the Oracle Cloud environment at the time of the breach
  • No evidence of unauthorised access was found
  • SRC's Oracle reseller has not issued further concerns
  • 2025.03.11 11:00 - 12:20
    • HPC Cloud (HPCC) API will be unavailable while certificates are updated in the underlying infrastructure
    • Impact:
      • No new workspaces can be created on the HPCC provider
      • No Pause / Resume can be done on HPCC workspaces
      • Existing HPCC machines can be accessed
  • 2025.02.26
    • A medium score security vulnerability was disclosed: users can gain elevated privileges to their workspaces
    • Computing workspaces created between 14 January and 19 February 2025 are affected
    • Impact assessment and guidelines to secure and audit the issue shared with affected users
    • The issue is resolved for all computing resources created after 19 February 2025
  • 2025.01.22
    • Known problem with the /etc/fstab file on some Ubuntu workspaces (WS). Fix made available.

Summary

The SRC service has identified a problem with the /etc/fstab file on some Ubuntu WSs instantiated between 7 October and 19 December 2024.

Problem

New lines are appended to the /etc/fstab file every minute on WSs that make use of Research Drive and/or WebDAV connection(s). This continuous growth leads to the file reaching its maximum length of 10,000 lines. Once this limit is exceeded, the WS will fail to boot.

Workaround

To resolve this problem, users can proceed in one of two ways:

  1. Run a Bash script that fixes the issue on the WS
  2. Re-deploy and migrate to a new workspace
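
To illustrate the kind of clean-up involved, here is a minimal Python sketch. It is not the Bash script provided by the SRC service. It assumes, as a guess not confirmed by this notice, that the lines appended every minute are repeats of entries already in the file. The function name dedupe_fstab_lines is hypothetical, and the sketch only works on text passed to it; it does not touch /etc/fstab itself.

```python
def dedupe_fstab_lines(text: str) -> str:
    """Drop repeated fstab entries, keeping the first occurrence of each.

    Assumption (not confirmed by the notice): the unwanted lines are
    duplicates of entries that already exist in the file.
    """
    seen = set()
    kept = []
    for line in text.splitlines():
        stripped = line.strip()
        # Blank lines and comments are kept untouched.
        if not stripped or stripped.startswith("#"):
            kept.append(line)
            continue
        # Collapse runs of whitespace so differently spaced copies of the
        # same entry are treated as one entry.
        key = " ".join(stripped.split())
        if key in seen:
            continue
        seen.add(key)
        kept.append(line)
    return "\n".join(kept) + ("\n" if kept else "")


if __name__ == "__main__":
    # Hypothetical sample content, for illustration only.
    sample = (
        "# example fstab\n"
        "/dev/sda1 / ext4 defaults 0 1\n"
        "https://example.invalid/dav /data davfs user 0 0\n"
        "https://example.invalid/dav /data davfs user 0 0\n"
    )
    print(dedupe_fstab_lines(sample), end="")
```

On a real workspace, comparing the line count before and after on a copy of the file would be a safer first step than editing the file in place.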
  • 2024.11.27 16:15 - 2024.11.28 9:45
    • Due to network issues during network maintenance, no HPC Cloud workspaces can be started, paused, resumed or deleted. Communication with the workspaces can fail.
  • 2024.11.13
    • Root cause analysis of recent network outages.

Summary
SURF research services hosted at the Amsterdam Data Center experienced multiple interruptions in recent months. In particular, the SRC service had limited access to the 'HPC Cloud' provider.

The reliability and availability of our services are paramount. After thorough investigation we have found the root cause of the issues and will implement a fix in the remainder of this year.

Root Cause Analysis
In recent years, SURF has experienced an exponential growth of hardware hosted at the Amsterdam Data Center. This growth has led to scalability issues on our network, which caused the recent outage events. A solution was agreed to adapt our datacenter network to withstand current and future loads.

What to expect
In the coming weeks, SURF plans to implement network adjustments to avoid future outages. This will require maintenance work. We expect the services, including SRC, to continue operating as usual during the maintenance. Nevertheless, there is always the possibility that an outage will occur. You will be kept informed through the usual channels, and any disruptive maintenance will be announced in advance.

Should you have any questions, you can contact us via the Service Desk.

  • 2024.11.06 9:00 - 13:30
    • Due to issues in the underlying cloud infrastructure, approximately 50% of started workspaces were failing on SURF HPC Cloud. We have temporarily disabled starting new workspaces on this cloud so we can fix the issue.
  • 2024.10.10 17:00 - 17:30 
    • Emergency maintenance: SURF Cloud workspace actions (start, pause, resume, delete) are disrupted 
  • 2024.10.07 15:30 - 2024.10.09 9:15
    • Due to network issues no workspaces can be started, paused, resumed or deleted on SURF HPC Cloud.
  • 2024.09.26 - 2024.10.14
    • Due to an issue in the SRC accounting component, wallets were not charged for multiple weeks

    • A recalculation of credits was completed on Sep. 26th

    • Users might see a steep decrease in their wallet credits on this date

    • As of Oct. 14th, the balances are accurate

  • 2024.09.13 13:30 - 16:00
    • Network instability on SURF Cloud
    • Impact: some running workspaces weren't reachable
  • 2024.09.10 08:30 - 10:15
    • A technical problem in an SRC component impacted regular user workflows
    • Impact: starting workspaces not possible; connection with Research Drive broken.
  • 2024.08.27 15:36 - 2024.08.28 13:37

    • Network outage at SURF's data centre in Amsterdam

    • The 'HPC Cloud' provider was unavailable until the network incident was resolved

    • Impact: machines could not be started/stopped/paused in the 'HPC Cloud'; access to running workspaces was affected
  • 2024.08.23 09:00 - 16:00
    • The accounting / budgeting service component will have its annual maintenance. Impact: wallet creation will be delayed. Requests will be handled after the maintenance.
  • 2024.06.25
    • A preliminary evaluation of the network outage that caused the 'HPC Cloud' provider to be down is publicly available. Executive summary:

On Wednesday evening, June 12, the SURF EVPN experienced an outage, caused by an internal broadcast storm. No external cause – or malicious intent – was detected. The network was fully recovered the next day. Eight services were impacted; two services were available the same night, with the last service fully recovered by Monday morning. The root cause is still unknown. An in-depth evaluation is planned, as we are awaiting more information from one of our vendors.

  • 2024.06.15 14:43 - 2024.06.17 9:30
    • HPC Cloud API is down while the cloud recovers from earlier network issues
    • No new workspaces can be created on HPC Cloud
    • No Pause / Resume can be done on HPC Cloud workspaces
    • Existing machines can be accessed
  • 2024.06.14 10:15 - 12:00
    • HPC Cloud API is down due to recovery from earlier network outage.
    • No new workspaces can be created, no workspace states can be changed.
    • Existing machines can be accessed.
  • 2024.06.12 19:15 -  2024.06.14 9:45

    • SURF Research Cloud service has limited access to the 'HPC Cloud' provider

    • The 'HPC Cloud' is down due to network problems.

    • Impact: machines cannot be started/stopped/paused in the 'HPC Cloud'
  • 2023-12-15 10:00 am / 6:00 pm: Update network components. Impact: Possible short interruptions of portal functionality. In case of a glitch, please retry after 1-2 minutes. Workspaces will not be affected.
  • 2023-06-20 7:49 am / 11:14 pm: Apply security updates to a batch of GPU & CPU Fat nodes. Impact: less availability of mentioned resources and no running workspaces on the hardware under maintenance.
  • 2023-06-13 08:37 am / 12:40 pm: Apply security updates to a batch of GPU & CPU Fat nodes. Impact: less availability of mentioned resources and no running workspaces on the hardware under maintenance. 
  • 2023-05-31 8:00 am / 03:51 pm: Update network components in our SURF HPC Cloud system. The portal is unavailable, workspaces cannot be created or paused/resumed in the Cloud provider 'HPC Cloud'. Running workspaces remain available.
  • 2023-04-12 9:00 am / 5:00 pm: Maintenance of the accounting service that manages the wallets. New wallets or changes will be processed after this maintenance window. Existing workspaces and wallets will not be affected.
  • 2023-04-04 7:00 am / 9:00 am: Network change 418 with expected network downtime of 2 minutes during this maintenance window. This might affect network traffic to and from VMs on SURF HPC Cloud.
  • 2023-03-28 22:00 CET / 01:00 CET: Internal database outage. Users could not perform any operations on either workspaces or catalog items. Any changes made to either workspaces or catalog items between 19:00 CET and 22:00 CET are lost and cannot be recovered.
  • 2023-03-21 / 2023-03-22: For a short period of time SURF ResearchCloud reported wrong usage amounts. This resulted in falsely depleted wallets and some users were unable to start/resume workspaces. This has been corrected and resolved.
  • 2023-03-20 12:15 / 14:55: Storage issue caused workspace creation to be unavailable
  • 2023-03-14 12:40 am / 12:50 am: Internal database upgrade; workspaces cannot be created or paused/resumed.
  • 2023-03-02 7:00 am / 9:00 am: Network change 418 with expected network downtime of 2 minutes during this maintenance window. This might affect network traffic to and from VMs on SURF HPC Cloud. (rescheduled to: 2023-04-04)
  • 2023-02-16 5:00 am / 7:00 am: SRAM service dependency will be updated; Impact: SRC portal is not accessible; no impact for running workspaces
  • 2023-02-07 11:25 am / 11:50 am: Due to problems with our authentication service, it is currently not possible to log in to the Research Cloud portal. Running workspaces are unaffected.
  • 2023-02-06 2:00 pm / 8:26 pm: Ubuntu workspaces failed sporadically because the package repository endpoint could not be reached. (status.canonical.com)
  • 2023-01-25 09:00 am / 11:30 am: Update network components in our SURF HPC Cloud system. The portal will be unavailable; workspaces cannot be created or paused/resumed in the Cloud provider 'HPC Cloud'.
  • 2022-12-20 08:00 / 17:00: Maintenance of all GPU & CPU Fat nodes; Impact: no running workspaces on the hardware under maintenance 
  • 2022-09-28 9:00 am / 7:00 pm: Updating the infrastructure supporting the SRC Portal.
  • 2022-09-15 5:00 am / 7:00 am: SRAM service dependency will be updated; Impact: SRC portal is not accessible; no impact for running workspaces
  • 2022-07-24 8:13 pm - 8:15 pm: Intermittent authentication service issues.
  • 2022-06-03 / 2022-06-22: New workspaces cannot be attached to existing reserved IPs or added to existing private networks.

  • 2022-06-02 4:00 pm / 6:00 pm: No new workspaces can be created.

  • 2022-05-25 1:00 pm / 6:30 pm: New workspaces were likely to fail due to network capacity. Running workspaces could still be accessed and used as usual.
  • 2022-05-10 11:30 am / 4:45 pm: portal.live.surfresearchcloud.nl blocked
  • 2022-05-09 08:00 am / 8:00 pm
    • Plan: update network components in our SURF HPC Cloud system
    • Impact: workspaces cannot be created or paused/resumed in the Cloud provider 'HPC Cloud'
  • 2022-01-28 12:14 am: Communication to all service users about the PwnKit vulnerability and how to patch vulnerable workspaces
  • 2021-10-13 12:30 pm / 2021-10-15 10:56 am: Maintenance was extended due to unforeseen stability issues while deploying service components to new hardware
    • Impact
      • creation of new workspaces in the Cloud provider 'HPC Cloud' is not available
  • 2021-10-13 7:00 am / 12:30 pm: Hardware replacement
    • Impact
      • creation of new workspaces in the Cloud provider 'HPC Cloud' is not available
    • No impact
      • running workspaces will operate as usual
      • SRC Portal is available
  • 2021-09-29: SRC access might be less available
    • From 5:00 to 07:00: SRAM service dependency will be updated; Impact: SRC portal is not accessible; no impact for running workspaces
  • 2021-06-08: SRC access might be less available
    • From 5:00 to 07:00: SRAM service dependency will be updated; Impact: access to SRC is less available; no impact for running workspaces
  • 2021-05-26: Portal and connection to VMs unstable
    • From ca. 10:00 to 12:00: Limited portal and VM usage due to an internal failure in Research Cloud.
  • 2021-04-28: Creation of new workspaces fails
    • From ca. 9:45 to 10:45: Due to a Research Cloud internal failure, users could not start new workspaces.
  • 2021-03-15: Gitlab.com down
    • Between 13:00h and 15:00h, gitlab.com was unavailable, which rendered SRC unable to create workspaces.