Comparing changes
base repository: darvid/python-hyperscan
base: main
head repository: darvid/python-hyperscan
compare: fix/simde-backend-x86-perf-253
- 6 commits
- 3 files changed
- 1 contributor
Commits on Feb 11, 2026
fix(build): only enable SIMDE_BACKEND for non-x86 architectures (#253)
- SIMDE_BACKEND was unconditionally enabled for all vectorscan builds, which disables native x86 CPU feature detection and caps performance at SSE2 level
- on x86-64, this caused a ~2.5-13x throughput regression vs v0.7.21 because vectorscan's runtime dispatch to SSE4.2/AVX2/AVX512 code paths was completely bypassed
- now only enables SIMDE_BACKEND on ARM and other non-x86 architectures where vectorscan genuinely needs the SIMD emulation layer
- add benchmark script for reproducing and validating the regression
Commit 9fe502a
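The architecture gate described in commit 9fe502a can be sketched in Python. This is only an illustration of the decision, not the project's build code: the function name `should_enable_simde` and the set of x86 spellings in `X86_NAMES` are assumptions made for the example.

```python
import platform

# Assumed spellings of x86 architecture names; the real build scripts may
# recognize a different set.
X86_NAMES = {"x86_64", "amd64", "i386", "i686", "x86"}


def should_enable_simde(arch: str) -> bool:
    """Return True when the target is not x86 and so needs SIMD emulation."""
    return arch.strip().lower() not in X86_NAMES


if __name__ == "__main__":
    machine = platform.machine()
    state = "ON" if should_enable_simde(machine) else "OFF"
    print(f"{machine}: SIMDE_BACKEND {state}")
```

On x86 targets the check returns False, which leaves vectorscan's native runtime dispatch in place.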
ci(build): replace deprecated macos-13 runners with macos-15
- GitHub deprecated macos-13 (Intel) runners
- macOS x86_64 wheels are now cross-compiled on ARM runners via Rosetta 2, which cibuildwheel handles natively
Commit 3b9a0d2
build: patch vectorscan x86-64-v2 march for older GCC compat
- vectorscan 5.4.12 uses -march=x86-64-v2 in cflags-x86.cmake and archdetect.cmake, but GCC <11 (manylinux2014 devtoolset) does not recognize this value
- patch source at build time to use -march=nehalem which provides the same SSE4.2 baseline and is supported by all GCC versions
- only applied when using native x86 backend (not SIMDE_BACKEND)
Commit ecfa543
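The substitution described in commit ecfa543 amounts to a text rewrite of the compiler flag. The sketch below shows that rewrite in Python for illustration only. The helper name `patch_march` and the exact pattern are assumptions, and the project applies its patch from the build scripts rather than from Python.

```python
import re

# Match the flag only when no further name characters follow it, so values
# such as -march=x86-64-v3 are left untouched.
_MARCH_V2 = re.compile(r"-march=x86-64-v2(?![\w-])")


def patch_march(text: str) -> str:
    """Replace -march=x86-64-v2 with -march=nehalem in the given text."""
    return _MARCH_V2.sub("-march=nehalem", text)


if __name__ == "__main__":
    print(patch_march('set(ARCH_FLAG "-march=x86-64-v2")'))
```

Restricting the match to the full flag value keeps the rewrite from touching other microarchitecture levels that may appear in the same files.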
build: fix macOS x86_64 cross-compilation on ARM runners
- use CMAKE_OSX_ARCHITECTURES (target arch) instead of CMAKE_SYSTEM_PROCESSOR (host arch) for SIMDE_BACKEND decision on macOS, so cross-compiling x86_64 on ARM correctly disables SIMDE and builds native x86 vectorscan
- forward CMAKE_OSX_ARCHITECTURES to ExternalProject_Add so vectorscan builds for the correct target architecture
- handle BSD sed -i syntax difference on macOS for the x86-64-v2 → nehalem patch
Commit 1e4045c
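The target-versus-host distinction in commit 1e4045c can be modelled as a small selection function. This Python sketch is hypothetical: the parameters simply mirror the CMake variables named in the commit message, and taking the first entry of a multi-architecture value is a simplification assumed for the example.

```python
def target_arch(system_name: str, osx_architectures: str,
                system_processor: str) -> str:
    """Pick the architecture that the backend decision should be based on."""
    # On macOS an explicit target (CMAKE_OSX_ARCHITECTURES) wins, because the
    # host processor is the wrong answer when cross-compiling.
    if system_name == "Darwin" and osx_architectures:
        # The variable can hold a semicolon-separated list; this sketch
        # only looks at the first entry.
        return osx_architectures.split(";")[0]
    return system_processor


if __name__ == "__main__":
    # Cross-compiling x86_64 wheels on an ARM runner.
    print(target_arch("Darwin", "x86_64", "arm64"))
```

With the target architecture selected this way, an x86_64 build on an ARM host is treated as x86 and keeps the native backend.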
build: use perl for x86-64-v2 patch to fix macOS sed compat
- CMake's list handling drops empty string in sed -i "" causing BSD sed to fail with "rename(): No such file or directory"
- perl -pi -e works identically on Linux and macOS
Commit 5a81c77
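The failure described in commit 5a81c77 comes from an argument list losing its empty element. The Python sketch below is a simplified model of that effect, not CMake's actual list code. The helper names, the substitution expression, and the file name `file.cmake` are placeholders invented for the example.

```python
def drop_empty(args):
    """Simplified model of an argument list losing its empty elements."""
    return [a for a in args if a != ""]


def suffix_arg(argv):
    """Return the argument BSD sed would read as the -i backup suffix."""
    return argv[argv.index("-i") + 1]


# Placeholder commands; the expression and file name are illustrative only.
intended = ["sed", "-i", "", "s/x86-64-v2/nehalem/", "file.cmake"]
received = drop_empty(intended)
perl_cmd = ["perl", "-pi", "-e", "s/x86-64-v2/nehalem/", "file.cmake"]

if __name__ == "__main__":
    print(repr(suffix_arg(intended)))   # the empty suffix that was meant
    print(repr(suffix_arg(received)))   # the script, misread as a suffix
    print(drop_empty(perl_cmd) == perl_cmd)
```

Because the perl invocation contains no empty argument, it passes through the same filtering unchanged, which is consistent with the commit's choice of perl -pi -e.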
ci(build): pin uv to 0.9.x to fix Windows build failures
- uv 0.10.2 leaks host Python 3.12 stdlib into cibuildwheel venvs on Windows, causing SRE module mismatch and import errors for non-3.12 Python targets
Commit 3edefbd
You can try running this command locally to see the comparison on your machine:
git diff main...fix/simde-backend-x86-perf-253