GraphPFN

This is the official repository of the paper "GraphPFN: A Prior-Data Fitted Graph Foundation Model" (arXiv). It provides the code for reproducing our experiments with GraphPFN, covering both pretraining and evaluation.

News

Note: GraphPFN-1.2 is out! The next release, with refactored code and extended evaluation, is also on the way, so stay tuned!

  • 2026-02-13: GraphPFN-1.2 released, featuring improved ICL performance, end-to-end dataset generation, and more!
  • 2025-09-25: GraphPFN-1.0 released!

Licenses

This project uses modified versions of third-party components (TabICL and LimiX). See the NOTICE file and the LICENSES/ directory for details. LimiX serves as the backbone for GraphPFN, and its weights are distributed under a separate license; see the LimiX repository.

Reproducing Experiments

Prerequisites

  1. Install uv
  2. Install the dependencies:
uv sync
  3. For experiments on GraphLand, download the datasets and place them in the data/ directory

Running the evaluation

You can execute a minimal evaluation run (GraphPFN finetuning with a single ensemble member) with the following command:

uv run bin/go.py exp/graphpfn-eval/finetune/01/tolokers-2/tuning.toml --force

Running the pretraining

To run GraphPFN pretraining, use the following command:

DGLBACKEND=pytorch uv run -m torch.distributed.run --nproc-per-node 8 bin/go.py exp/graphpfn-stage-1/pretrain.toml

Project Structure

  • bin/ - Training and evaluation scripts
  • exp/ - Experiment configurations and results
  • data/ - Dataset directory
  • lib/ - Common utilities and tools

Configuration

Experiments are configured using TOML files located in the exp/ directory. Each configuration specifies:

  • Dataset path and preprocessing
  • Model hyperparameters
  • Training settings
  • Evaluation metrics
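As an illustrative sketch of what such a file might contain (the section and key names below are assumptions for illustration, not the repository's actual schema), a configuration could look roughly like:

```toml
# Hypothetical GraphPFN evaluation config; all keys here are illustrative only.
[data]
path = "data/tolokers-2"      # dataset location under the data/ directory
normalization = "standard"    # feature preprocessing

[model]
num_layers = 12
hidden_dim = 512

[training]
lr = 3e-5
n_epochs = 100

[evaluation]
metrics = ["accuracy", "ap"]
```

Consult the actual files under exp/ (e.g., exp/graphpfn-eval/finetune/01/tolokers-2/tuning.toml) for the real option names and values.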

Results

Evaluation results are saved in the same directory as the configuration file:

  • report.json - Evaluation metrics
  • Model checkpoints
  • Training logs
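Since report.json is plain JSON, it can be inspected programmatically. The snippet below is a minimal sketch that writes a sample report and reads it back; the field names (metrics, split names, score) are assumptions for illustration, not the repository's actual schema.

```python
import json
from pathlib import Path

# Hypothetical report structure; the real report.json produced by an
# evaluation run may use different field names.
sample_report = {"metrics": {"val": {"score": 0.81}, "test": {"score": 0.79}}}

report_path = Path("report.json")
report_path.write_text(json.dumps(sample_report))

# After an evaluation run, metrics could be read back like this:
report = json.loads(report_path.read_text())
for split, metrics in report["metrics"].items():
    print(split, metrics["score"])
```

In practice, point report_path at the directory of the configuration file you ran, and inspect the file once to learn its actual layout before scripting against it.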