Skip to content

BUG: fix compatibility with NumPy 2.3#616

Merged
paulbkoch merged 5 commits intointerpretml:mainfrom
DerWeh:bugfix/numpy
Jun 17, 2025
Merged

BUG: fix compatibility with NumPy 2.3#616
paulbkoch merged 5 commits intointerpretml:mainfrom
DerWeh:bugfix/numpy

Conversation

@DerWeh
Copy link
Copy Markdown
Collaborator

@DerWeh DerWeh commented Jun 15, 2025

The C++ data type BoolEBM is defined as int32_t, thus an integer not a np.bool_ should be passed to these functions.

NumPy 2.3 removes following behavior:

  • ‘np.bool’ scalars can no longer be interpreted as an index (deprecated since 1.19)
    (gh-28254)

I assume that due to this pending deprecation previously it was allowed to pass a bool as an int32_t as it could be interpreted as index and therefore as number.


Note the subtle difference: Python's bool is a numbers.Integral and numbers.Number:

>> import numbers
>> isinstance(True, numbers.Integral)
True
>> isinstance(True, numbers.Number)
True
>> True + True
2

But np.bool_ is not:

>> import numpy as np
>> isinstance(np.bool_(True), numbers.Integral)
False
>> isinstance(np.bool_(True), numbers.Number)
False

Neither, Python's bool nor NumPy's np.bool_ are NumPy integers or numbers:

>> isinstance(np.True_, np.integer)
False
>> isinstance(np.True, np.number)
False
>> np.True_ + np.True_
np.True_

NumPy's bool corresponds to bool defined in stdbool.h.


FYI: as you work with cytpes: NumPy offers also some convenience functions to handle pointers and arrays, as well as a platform independent load_library implementation.
https://numpy.org/doc/stable/reference/routines.ctypeslib.html

@codecov
Copy link
Copy Markdown

codecov bot commented Jun 15, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 58.40%. Comparing base (be87fd6) to head (87cf2df).
Report is 1 commits behind head on main.

❗ There is a different number of reports uploaded between BASE (be87fd6) and HEAD (87cf2df). Click for more details.

HEAD has 41 uploads less than BASE
Flag BASE (be87fd6) HEAD (87cf2df)
sdist_mac_311_python 2 1
sdist_mac_310_python 2 0
sdist_mac_39_python 2 0
sdist_mac_312_python 2 0
bdist_mac_310_python 2 1
sdist_linux_311_python 2 0
sdist_linux_39_python 2 1
sdist_linux_310_python 2 1
bdist_mac_39_python 2 0
bdist_mac_311_python 2 0
sdist_linux_312_python 2 1
sdist_win_310_python 2 0
sdist_win_311_python 2 0
bdist_win_39_python 2 0
sdist_win_39_python 2 0
bdist_linux_39_python 2 0
bdist_mac_312_python 2 0
bdist_linux_311_python 2 0
sdist_win_312_python 2 0
bdist_linux_312_python 2 1
bdist_linux_310_python 2 1
bdist_win_310_python 2 0
bdist_win_311_python 2 0
bdist_win_312_python 2 0
Additional details and impacted files
@@             Coverage Diff             @@
##             main     #616       +/-   ##
===========================================
- Coverage   72.47%   58.40%   -14.07%     
===========================================
  Files          73       73               
  Lines       10318    10322        +4     
===========================================
- Hits         7478     6029     -1449     
- Misses       2840     4293     +1453     
Flag Coverage Δ
bdist_linux_310_python 58.30% <100.00%> (-13.90%) ⬇️
bdist_linux_311_python ?
bdist_linux_312_python 58.30% <100.00%> (-13.85%) ⬇️
bdist_linux_39_python ?
bdist_mac_310_python 58.31% <100.00%> (-14.07%) ⬇️
bdist_mac_311_python ?
bdist_mac_312_python ?
bdist_mac_39_python ?
bdist_win_310_python ?
bdist_win_311_python ?
bdist_win_312_python ?
bdist_win_39_python ?
sdist_linux_310_python 58.24% <100.00%> (-13.92%) ⬇️
sdist_linux_311_python ?
sdist_linux_312_python 58.17% <100.00%> (-13.96%) ⬇️
sdist_linux_39_python 58.22% <100.00%> (-13.91%) ⬇️
sdist_mac_310_python ?
sdist_mac_311_python 52.49% <100.00%> (-19.79%) ⬇️
sdist_mac_312_python ?
sdist_mac_39_python ?
sdist_win_310_python ?
sdist_win_311_python ?
sdist_win_312_python ?
sdist_win_39_python ?

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@paulbkoch
Copy link
Copy Markdown
Collaborator

Thanks @DerWeh -- This is great; however I would prefer to convert from bool and np.bool_ into np.int32 inside the _native.py file so that callers to measure_feature and fill_feature do not need to understand these lower-level concepts.

@DerWeh
Copy link
Copy Markdown
Collaborator Author

DerWeh commented Jun 16, 2025

In this case, I need more information how exactly you want to handle this. A cast to np.int32 is something quite liberal, which might hide underlying issues:

>> import numpy as np
>> np.int32(3.0)
np.int32(3)
>> np.int32(np.pi)
np.int32(3)
>> np.int32("3")
np.int32(3)

Do you want to accept such inputs? And if we automatically cast, why cast to integer and not to boolean? Boolean is a different conversion as 'something' equals to True, and 'nothing' to False:

>> bool(None)
False
>> bool([])
False
>> bool("Hello world")
True
>> bool(0)
False
>> bool([0])
True

Do you really want to silently accept all kind of inputs that can be converted to an integer or boolean? Or should we issue warnings upon conversion? Or manually check for allowed inputs?

From the name BoolEBM, I would in fact assume that the content should be a boolean. If that is correct, what is the reason you are using int32_t (instead of e.g. int8_t or uint8_t, don't know if bool can cause troubles as its size is unspecified)? Wouldn't it semantically make more sense to use a boolean type? Personally, I would be more conservative and only accept boolean.

How do you want this behavior? Can/should we separate such larger change from the current bug fix? Currently, I need to downgrade NumPy and explicitly demand numpy<2.3 as interpret-core is incompatible at the moment. It would be great if can a fast bug fix release (as soon as everything is thoroughly tested).


Concerning the CI:

  • The output of format-ruff looks a bit strange to me. Do I have some files to format, or is the CI in fact broken?
  • Against what version of dependencies do you test? The newest available? The oldest supported? (Both would of course be great ;-) ) Sadly, the tests are too resource-consuming for me to run them locally, but we should make sure, that they in fact pass with NumPy 2.3.
  • It seems to me, that you don't test against Python 3.13. Is there any reason, or did I simply miss them? interpret seems to work perfectly fine on Python 3.13, and one should probably always at least test the newest and oldest supported versions.

@paulbkoch
Copy link
Copy Markdown
Collaborator

C++ has a bool type, but C does not, so the type passed between python and C via ctypes needs to be something like int32. int32 tends to be the most cross-language compatible type, so that's why it was chosen instead of uint32, int8, int64, etc. R for example supports int32, but not the others.

The _native.py file was designed as an interop layer such that code within _native.py is allowed to do low-level unsafe things and is allowed to look more C-like. The functions exposed by _native.py should accept python types. So, fill_feature and measure_feature should accept python bool types like bool or np.bool_, but not integers. fill_feature and measure_feature should convert the python bool types they receive into the int32 type expected by the C layer.

I would do something like:

if not isinstance(is_missing, (bool, np.bool_)):
    raise ...
is_missing = 1 if is_missing else 0

above this (and similar code within _native.py):

is_missing,
is_unseen,
is_nominal,

For the ruff error, run:

ruff format

@DerWeh
Copy link
Copy Markdown
Collaborator Author

DerWeh commented Jun 17, 2025

@paulbkoch I applied ruff format only to the python files, as it would have formatted all notebooks in the documentation folder else. The only file formatted was python/interpret-core/tests/glassbox/ebm/test_ebm_exact.py which cause a conflict now.
I assume, the file was formatted on main in the meantime. Probably it's best to drop the format commit again and instead rebase against main. Or do you want to resolve the conflict?

For the rest, this PR should be good now.

@paulbkoch paulbkoch merged commit bba912d into interpretml:main Jun 17, 2025
15 checks passed
@DerWeh DerWeh deleted the bugfix/numpy branch June 17, 2025 20:15
@paulbkoch
Copy link
Copy Markdown
Collaborator

The new release just out v0.6.12 has this fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

2 participants