What is the AI-Hub?
The AI-Hub is a sovereign AI platform being developed based on the public values agreed upon within the sector (autonomy, humanity, justice). On the AI-Hub, you can run various language models on premises in SURF's datacenter. The key aim of the AI-Hub is to provide an API where researchers, developers and system administrators can use AI-Inference with publicly available AI models in a trusted environment.
Pilot phase
The AI-Hub is in active development and not generally available for institutes yet. Interested in joining the pilot? Let us know by filling out the Pilot registration form.
Models
During the Pilot we have various open-weights models running for different modalities:
- Text: Llama70b (default-text-large),
- Mistral Small: EU. alternative
- Transcriptions: Whisper-Large-v2
- Image analysis: Qwen 2.5-VL-32B-Instruct
- FLUX.1-schnell (default-image): image generation
- Code: Qwen 2.5-Coder-32B-Instruct (limited availability)
- GPT-OSS: OpenAI's recently released open-weight model (will be available soon)
- Qwen3-embedding-8B: will replace Llama70b in embedding mode.
We have defined aliases (default-text-large, default-text-medium, default-image, default-sst) that link to specific models for a modality. For example, default-text-medium can point to Llama70b. These aliases can be used if you are prefer automated upgrades of models. Login to the backoffice and look at the models table to find an overview of the aliases.
Models can be deprecated and replaced by better alternatives.
Latency mode
The API supports different latency modes.
| Mode | latency | Models | Example use-cases |
|---|---|---|---|
| Always-on | < 10 sec | Only models marked with as always-on | Chat interfaces |
| On-demand | < 30 min | All models available to you | Playground |
| Batch | < 24 hrs | All models available to you | Research, automatic workflows |
On-demand models will be loaded (if there is hardware available), as soon as the request arrives. It will then be unloaded after 15 minutes of inactivity to make space for other on-demand requests.
Integrations
The AI-Hub can be used as a backend for various applications that support the OpenAI API standard. We have some documentation to get started, but it's up you to enroll this responsibly within your institute:
OpenWebUI: https://openwebui.com/ browser-based interface for running and managing local or remote large language models. > Getting started (behind login) | Nvidia Nemo Guardrails: https://github.com/NVIDIA-NeMo/Guardrails NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational applications. > Getting started (behind login) | PyCharm: https://www.jetbrains.com/pycharm/ A popular IDE from JetBrains designed specifically for Python development, with strong debugging and productivity tools. > Getting started (behind login) |
Elastic: https://www.elastic.co/security Using the ai-hub as an inference prodiver, you can also use the Elastic Security AI Assistant. | LibreChat: https://www.librechat.ai/ An open-source, self-hostable alternative to ChatGPT that lets you run and customize conversational AI. > Getting started (behind login) |
|
NextCloud has also successfully tested in an experimental setup, but you are welcome to try other integrations as well.
Development
The AI-Hub exposes an API to develop your applications or workflows against.
Onboarding Check our onboarding documentation to get from zero to your first successful request. | Python code examples Shows examples when using different modalities and functions such as tool calling. > Python code examples (behind login) | API documentation The API-documentation can be used for detailed information on the AI-Hub endpoints > API Swagger Documentation (behind login) |
Whats next
Please read our Docs and F.A.Q. and start building.
Happy coding!