To pretrain large language models on Snellius, we have prepared some code to get you started!
Below are different pretraining frameworks, along with their advantages and drawbacks.
Megatron-LM
Use the GitHub repository below to get started with running Megatron-LM at scale on Snellius on the FineWeb dataset:
https://github.com/SURF-ML/Megatron-LM-Snellius
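Jobs on Snellius are submitted through SLURM. The sketch below shows the general shape of a batch script for a multi-node Megatron-LM run; the partition name, module versions, node counts, and launch arguments are illustrative assumptions, so check the repository's provided scripts and the Snellius documentation for the actual values.

```shell
#!/bin/bash
#SBATCH --job-name=megatron-pretrain
#SBATCH --partition=gpu_a100        # illustrative partition name; check the Snellius docs
#SBATCH --nodes=2                   # example node count, scale as needed
#SBATCH --gpus-per-node=4
#SBATCH --time=01:00:00

# Illustrative environment setup; the repository's scripts define
# the actual modules and Python environment to use.
module load 2023

# Hypothetical launch command: Megatron-LM pretraining is driven by
# pretrain_gpt.py, but the exact arguments (model size, batch sizes,
# data paths) come from the repository's configs.
srun python pretrain_gpt.py \
    --num-layers 12 \
    --hidden-size 768 \
    --num-attention-heads 12 \
    --micro-batch-size 4 \
    --global-batch-size 256
```

Submit with `sbatch <script>.sh` and monitor the queue with `squeue`; the repository's own launch scripts should be preferred once you move beyond initial tests.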
Pros:
- Well suited to hyperparameter sweeps and initial experiments
- Minimal installation required
Cons:
- Limited flexibility for changes to the model architecture
- No built-in tokenizer training