What is the AI-Hub?

The AI-Hub is a sovereign AI platform being developed based on the public values agreed upon within the sector (autonomy, humanity, justice). On the AI-Hub, you can run various language models on premises in SURF's datacenter. The key aim of the AI-Hub is to provide an API where researchers, developers and system administrators can use AI-Inference with publicly available AI models in a trusted environment.

Pilot phase

The AI-Hub is in active development and not generally available for institutes yet. Interested in joining the pilot? Let us know by filling out the Pilot registration form.


Models

During the Pilot we have various open-weights models running for different modalities:

  • Text: Llama70b (default-text-large),
  • Mistral Small: EU. alternative
  • Transcriptions: Whisper-Large-v2
  • Image analysis: Qwen 2.5-VL-32B-Instruct
  • FLUX.1-schnell (default-image): image generation
  • Code: Qwen 2.5-Coder-32B-Instruct (limited availability)
  • GPT-OSS: OpenAI's recently released open-weight model (will be available soon)
  • Qwen3-embedding-8B: will replace Llama70b in embedding mode.

We have defined aliases (default-text-large, default-text-medium, default-image, default-sst) that link to specific models for a modality. For example, default-text-medium can point to Llama70b. These aliases can be used if you are prefer automated upgrades of models. Login to the backoffice and look at the models table to find an overview of the aliases.

Models can be deprecated and replaced by better alternatives.

Latency mode

The API supports different latency modes.

ModelatencyModelsExample use-cases
Always-on< 10 secOnly models marked with as always-onChat interfaces
On-demand< 30 minAll models available to youPlayground
Batch< 24 hrsAll models available to you

Research, automatic workflows

On-demand models will be loaded (if there is hardware available), as soon as the request arrives. It will then be unloaded after 15 minutes of inactivity to make space for other on-demand requests.

Integrations

The AI-Hub can be used as a backend for various applications that support the OpenAI API  standard. We have some documentation to get started, but it's up you to enroll this responsibly within your institute:

 OpenWebUI: https://openwebui.com/

browser-based interface for running and managing local or remote large language models.

> Getting started (behind login)


Nvidia Nemo Guardrails: https://github.com/NVIDIA-NeMo/Guardrails

NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational applications.

> Getting started (behind login)

PyCharm: https://www.jetbrains.com/pycharm/

A popular IDE from JetBrains designed specifically for Python development, with strong debugging and productivity tools.

> Getting started (behind login)

Elastic: https://www.elastic.co/security

Using the ai-hub as an inference prodiver, you can also use the Elastic Security AI Assistant.

> Getting started

 LibreChat: https://www.librechat.ai/

An open-source, self-hostable alternative to ChatGPT that lets you run and customize conversational AI.

> Getting started (behind login)

 

NextCloud has also successfully tested in an experimental setup, but you are welcome to try other integrations as well.

Development

The AI-Hub exposes an API to develop your applications or workflows against.

Onboarding

Check our onboarding documentation to get from zero to your first successful request.

> Onboarding

Python code examples

Shows examples when using different modalities and functions such as tool calling.

> Python code examples (behind login)

API documentation

The API-documentation can be used for detailed information on the AI-Hub endpoints

> API Swagger Documentation (behind login)


Whats next

Please read our Docs and F.A.Q. and start building.


Happy coding! rocket 


  • No labels