This is the official repository for the paper "GraphPFN: A Prior-Data Fitted Graph Foundation Model" (arXiv). It provides code for reproducing our experiments with GraphPFN, covering both pretraining and evaluation.
Note
GraphPFN-1.2 is out! The next release with refactored code and extended evaluation is also on the way, so stay tuned!
- 2026-02-13: GraphPFN-1.2 released, featuring improved ICL performance, end-to-end dataset generation and more!
- 2025-09-25: GraphPFN-1.0 released!
This project uses modified versions of third-party components (TabICL and LimiX). See the NOTICE file and LICENSES/ directory for details. LimiX serves as the backbone for GraphPFN, and its weights have a separate license - see the LimiX repository.
Prerequisites
- Install uv
- Install dependencies
uv sync
- For experiments on GraphLand, download the datasets and place them in the `data` directory
Running the evaluation
You can execute a minimal evaluation run (GraphPFN finetuning with a single ensemble member) with the following command:
uv run bin/go.py exp/graphpfn-eval/finetune/01/tolokers-2/tuning.toml --force
Running the pretraining
To run GraphPFN pretraining, use the following command:
DGLBACKEND=pytorch uv run -m torch.distributed.run --nproc-per-node 8 bin/go.py exp/graphpfn-stage-1/pretrain.toml
Repository structure

- `bin/` - Training and evaluation scripts
- `exp/` - Experiment configurations and results
- `data/` - Dataset directory
- `lib/` - Common utilities and tools
Experiments are configured using TOML files located in the exp/ directory. Each configuration specifies:
- Dataset path and preprocessing
- Model hyperparameters
- Training settings
- Evaluation metrics
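As an illustration, a minimal configuration could look like the sketch below. The section and key names here are hypothetical, chosen only to show the overall shape; they are not the actual schema used by the configs in `exp/`.

```toml
# Illustrative sketch only: all section and key names are assumptions,
# not the repository's real configuration schema.
[data]
path = "data/tolokers-2"      # dataset location under the data/ directory

[model]
num_layers = 12               # model hyperparameters
d_model = 512

[training]
lr = 1e-4                     # training settings
n_epochs = 100

[evaluation]
metrics = ["accuracy"]        # evaluation metrics to report
```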
Evaluation results are saved in the same directory as the configuration file:
- `report.json` - Evaluation metrics
- Model checkpoints
- Training logs
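To inspect results programmatically, the saved `report.json` can be loaded with the standard library. Note that the metric names used below (`val_score`, `test_score`) are hypothetical placeholders, not the repository's actual report fields; this sketch only demonstrates the read pattern.

```python
import json
import tempfile
from pathlib import Path

# Sketch with a stand-in experiment directory; in practice, report.json sits
# next to the TOML configuration file of the finished run.
with tempfile.TemporaryDirectory() as exp_dir:
    report_path = Path(exp_dir) / "report.json"
    # Hypothetical contents: the real report's metric names may differ.
    report_path.write_text(json.dumps({"val_score": 0.85, "test_score": 0.83}))

    report = json.loads(report_path.read_text())
    print(f"test score: {report['test_score']:.2f}")
```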