Current

  • An in-depth evaluation of the network outage is underway, and the findings will be made public in the coming days.
  • Displayed storage usage may be inaccurate. Please check back for further updates.

Future

 TBA

Past

2024

  • 2024-06-25
    • A preliminary evaluation of the network outage that caused the 'HPC Cloud' provider to be down is publicly available. Executive summary:

 On Wednesday evening, June 12, the SURF EVPN experienced an outage, caused by an internal broadcast storm. No external cause – or malicious intent – was detected. The network was fully recovered the next day. Eight services were impacted; two services were available the same night, with the last service fully recovered by Monday morning. The root cause is still unknown. An in-depth evaluation is planned, as we are awaiting more information from one of our vendors.

  • 2024-06-15 14:43 until 2024-06-17 9:30
    • HPC Cloud API is down due to cloud recovery following earlier network issues
    • No new workspaces can be created on HPC Cloud
    • No Pause / Resume can be done on HPC Cloud workspaces
    • Existing machines can be accessed
  • 2024-06-14 10:15 until 12:00
    • HPC Cloud API is down due to recovery from earlier network outage.
    • No new workspaces can be created, no workspace states can be changed.
    • Existing machines can be accessed.
  • 2024-06-12 19:15 until 2024-06-14 9:45
    • SURF Research Cloud service has limited access to the 'HPC Cloud' provider
    • The 'HPC Cloud' is down due to network problems.
    • Impact: machines cannot be started/stopped/paused in the 'HPC Cloud'
  • 2023-12-15 10:00 am / 6:00 pm: Update network components. Impact: Possible short interruptions of portal functionality. In case of a glitch, please retry after 1-2 minutes. Workspaces will not be affected.
  • 2023-06-20 7:49 am / 11:14 pm: Apply security updates to a batch of GPU & CPU Fat nodes. Impact: reduced availability of the mentioned resources and no running workspaces on the hardware under maintenance.
  • 2023-06-13 08:37 am / 12:40 pm: Apply security updates to a batch of GPU & CPU Fat nodes. Impact: reduced availability of the mentioned resources and no running workspaces on the hardware under maintenance.
  • 2023-05-31 8:00 am / 03:51 pm: Update network components in our SURF HPC Cloud system. The portal is unavailable, workspaces cannot be created or paused/resumed in the Cloud provider 'HPC Cloud'. Running workspaces remain available.
  • 2023-04-12 9:00 am / 5:00 pm: Maintenance of the accounting service that manages the wallets. New wallets or changes will be processed after this maintenance window. Existing workspaces and wallets will not be affected.
  • 2023-04-04 7:00 am / 9:00 am: Network change 418 with expected network downtime of 2 minutes during this maintenance window. This might affect network traffic to and from VMs on SURF HPC Cloud.
  • 2023-03-28 22:00 CET / 01:00 CET: Internal database outage. Users could not perform any operations on either workspaces or catalog items. Any changes made to either workspaces or catalog items between 19:00 CET and 22:00 CET are lost and cannot be recovered.
  • 2023-03-21 / 2023-03-22: For a short period of time, SURF Research Cloud reported wrong usage amounts. This resulted in falsely depleted wallets, and some users were unable to start/resume workspaces. This has been corrected and resolved.
  • 2023-03-20 12:15 / 14:55: Storage issue caused workspace creation to be unavailable.
  • 2023-03-14 12:40 am / 12:50 am: Internal database upgrade. Workspaces cannot be created or paused/resumed.
  • 2023-03-02 7:00 am / 9:00 am: Network change 418 with expected network downtime of 2 minutes during this maintenance window. This might affect network traffic to and from VMs on SURF HPC Cloud. (rescheduled to: 2023-04-04)
  • 2023-02-16 5:00 am / 7:00 am: SRAM service dependency will be updated; Impact: SRC portal is not accessible; no impact for running workspaces
  • 2023-02-07 11:25 am / 11:50 am: Due to problems with our authentication service, it is currently not possible to log in to the Research Cloud portal. Running workspaces are unaffected.
  • 2023-02-06 2:00 pm / 8:26 pm: Ubuntu workspaces failed sporadically because the package repository endpoint could not be reached (status.canonical.com).
  • 2023-01-25 09:00 am / 11:30 am: Update network components in our SURF HPC Cloud system. The portal will be unavailable, and workspaces cannot be created or paused/resumed in the Cloud provider 'HPC Cloud'.
  • 2022-12-20 08:00 / 17:00: Maintenance of all GPU & CPU Fat nodes; Impact: no running workspaces on the hardware under maintenance 
  • 2022-09-28 9:00 am / 7:00 pm: Updating the infrastructure supporting the SRC Portal.
  • 2022-09-15 5:00 am / 7:00 am: SRAM service dependency will be updated; Impact: SRC portal is not accessible; no impact for running workspaces
  • 2022-07-24 8:13 pm - 8:15 pm: Intermittent authentication service issues.
  • 2022-06-03 / 2022-06-22: New workspaces cannot be attached to existing reserved IPs or added to existing private networks.
  • 2022-06-02 4:00 pm / 6:00 pm: No new workspaces can be created.
  • 2022-05-25 1:00 pm / 6:30 pm: New workspaces were likely to fail due to network capacity. Users could log in to running workspaces and work with them as usual.
  • 2022-05-10 11:30 am / 4:45 pm: portal.live.surfresearchcloud.nl blocked
  • 2022-05-09 08:00 am / 8:00 pm
    • Plan: update network components in our SURF HPC Cloud system
    • Impact: workspaces cannot be created or paused/resumed in the Cloud provider 'HPC Cloud'
  • 2022-01-28 12:14 am: Communication to all service users about the PwnKit vulnerability and how to patch vulnerable workspaces
  • 2021-10-13 12:30 pm / 2021-10-15 10:56 am: Maintenance was extended due to unforeseen stability issues while deploying service components to new hardware
    • Impact
      • creation of new workspaces in the Cloud provider 'HPC Cloud' is not available
  • 2021-10-13 7:00 am / 12:30 pm: Hardware replacement
    • Impact
      • creation of new workspaces in the Cloud provider 'HPC Cloud' is not available
    • No impact
      • running workspaces will operate as usual
      • SRC Portal is available
  • 2021-09-29: SRC access might be less available
    • From 05:00 to 07:00: SRAM service dependency will be updated; Impact: SRC portal is not accessible; no impact for running workspaces
  • 2021-06-08: SRC access might be less available
    • From 05:00 to 07:00: SRAM service dependency will be updated; Impact: access to SRC is less available; no impact for running workspaces
  • 2021-05-26: Portal and connection to VMs unstable
    • From ca. 10:00 to 12:00: Limited portal and VM usage due to an internal failure in Research Cloud.
  • 2021-04-28: Creation of new workspaces fails
    • From ca. 9:45 to 10:45: Due to a Research Cloud internal failure, users could not start new workspaces.
  • 2021-03-15: Gitlab.com down
    • Between 13:00h and 15:00h, gitlab.com was unavailable, which rendered SRC unable to create workspaces.