This directory contains tests for the TabPFN project.
- test_classifier_interface.py: Tests for the TabPFNClassifier interface
- test_regressor_interface.py: Tests for the TabPFNRegressor interface
- test_utils.py: Tests for utility functions
- test_consistency.py: Tests to ensure prediction consistency across code changes
The consistency tests verify TabPFN models produce consistent predictions across code changes, ensuring:
- Changes don't unexpectedly alter model behavior
- Core algorithms remain stable and reproducible
- Intentional behavior changes are explicitly acknowledged
Tests use small, fixed datasets with reproducible random seeds. Each test:
- Creates a TabPFN model with fixed settings
- Fits it to a reproducible dataset
- Gets predictions using a standardized process
- Compares predictions to previously saved reference values
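The four steps above can be sketched as follows. This is a minimal illustration, not the project's test code: `MeanThresholdModel` is a hypothetical stand-in for a TabPFN estimator so that the example runs with only the standard library, the reference values are held in memory rather than loaded from a saved file, and the tolerances are assumed values.

```python
import math
import random


def make_dataset(seed=0, n_samples=20, n_features=3):
    """Build a small dataset that is identical on every run for a given seed."""
    rng = random.Random(seed)
    X = [[rng.uniform(-1.0, 1.0) for _ in range(n_features)] for _ in range(n_samples)]
    y = [int(sum(row) > 0) for row in X]
    return X, y


class MeanThresholdModel:
    """Hypothetical stand-in for a TabPFN model; NOT the real estimator."""

    def fit(self, X, y):
        # y is accepted only to mirror the usual fit(X, y) signature.
        sums = [sum(row) for row in X]
        self.threshold_ = sum(sums) / len(sums)
        return self

    def predict(self, X):
        # Logistic score of each row's distance to the fitted threshold.
        return [1.0 / (1.0 + math.exp(-(sum(row) - self.threshold_))) for row in X]


def get_predictions(seed=0):
    """Steps 1-3: fixed model, reproducible data, standardized prediction."""
    X, y = make_dataset(seed=seed)
    model = MeanThresholdModel().fit(X, y)
    return model.predict(X[:5])


def predictions_match(predictions, reference, rel_tol=1e-3, abs_tol=1e-3):
    """Step 4: element-wise comparison with an (assumed) tolerance."""
    return len(predictions) == len(reference) and all(
        math.isclose(p, r, rel_tol=rel_tol, abs_tol=abs_tol)
        for p, r in zip(predictions, reference)
    )


if __name__ == "__main__":
    reference = get_predictions()  # in practice, loaded from a saved reference
    current = get_predictions()
    print(predictions_match(current, reference))
```

Comparing with a tolerance rather than exact equality is a design choice of this sketch; it allows for tiny floating-point drift while still catching real behavior changes.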
Models can produce slightly different predictions across platforms due to:
- Different CPU architectures (x86 vs ARM)
- Different operating systems (Linux, macOS, Windows)
- Different Python versions
For this reason:
- Reference predictions are platform-specific (stored in reference_predictions/)
- Platform information is tracked in metadata
- Tests only run on matching platforms by default
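As a rough sketch of how platform matching could work, the example below collects platform facts with the standard `platform` module and decides whether a test should run. The metadata keys (`os`, `machine`, `python`) and the function names are hypothetical, and treating `FORCE_CONSISTENCY_TESTS=1` as the override is an assumption based on the command shown later in this document; the actual logic in test_consistency.py may differ.

```python
import os
import platform


def current_platform():
    """Collect platform facts that can influence floating-point results."""
    major, minor, _ = platform.python_version_tuple()
    return {
        "os": platform.system(),        # e.g. "Linux", "Darwin", "Windows"
        "machine": platform.machine(),  # e.g. "x86_64", "arm64"
        "python": f"{major}.{minor}",
    }


def platforms_match(reference, current):
    """True when every tracked field agrees (hypothetical field names)."""
    return all(reference.get(k) == current.get(k) for k in ("os", "machine", "python"))


def should_run(reference_platform, current=None, environ=None):
    """Run only on a matching platform unless the override is set."""
    current = current_platform() if current is None else current
    environ = os.environ if environ is None else environ
    # Assumption: the value "1" enables the override.
    if environ.get("FORCE_CONSISTENCY_TESTS") == "1":
        return True
    return platforms_match(reference_platform, current)


if __name__ == "__main__":
    reference = {"os": "Linux", "machine": "x86_64", "python": "3.9"}
    print(current_platform())
    print(should_run(reference))
```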
We test against specific CI platform configurations:
- Linux, Windows, and macOS
- Python 3.9 and 3.13
To ensure reliable CI testing:
- Reference values should be generated on a CI-compatible platform
- Tests are skipped with a warning if the reference platform doesn't match the current platform
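A CI-compatibility check along these lines could look like the sketch below. It assumes compatibility is decided only by operating system and Python major.minor version, using the configurations listed above; whether the real check also considers CPU architecture is not stated here, and the names `CI_SYSTEMS`, `CI_PYTHON_VERSIONS`, and `is_ci_compatible` are hypothetical.

```python
import platform
import warnings

# platform.system() reports macOS as "Darwin".
CI_SYSTEMS = {"Linux", "Windows", "Darwin"}
CI_PYTHON_VERSIONS = {(3, 9), (3, 13)}


def is_ci_compatible(system=None, python_version=None):
    """Check the OS and Python major.minor against the CI configurations."""
    system = platform.system() if system is None else system
    if python_version is None:
        major, minor, _ = platform.python_version_tuple()
        python_version = (int(major), int(minor))
    return system in CI_SYSTEMS and tuple(python_version[:2]) in CI_PYTHON_VERSIONS


def warn_if_not_ci_compatible(system=None, python_version=None):
    """Emit a warning instead of failing, mirroring the skip-with-warning behavior."""
    compatible = is_ci_compatible(system, python_version)
    if not compatible:
        warnings.warn(
            "Current platform does not match a CI configuration; "
            "reference values generated here may not be usable in CI.",
            stacklevel=2,
        )
    return compatible


if __name__ == "__main__":
    print(is_ci_compatible())
```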
Check if your platform is CI-compatible:
python tests/test_consistency.py --print-platform
Important: If creating reference values on a non-compatible platform, you must manually edit the platform metadata to match the closest CI platform. Otherwise, tests will fail in CI environments.
Run tests on a different platform:
FORCE_CONSISTENCY_TESTS=1 pytest tests/test_consistency.py
Model changes should:
- Be intentional and well-understood
- Improve performance on standard benchmarks
- Maintain backward compatibility when possible
- Be clearly documented with evidence of improvement