Fix bitsandbytes ROCm install by using pip instead of uv#4966
Fix bitsandbytes ROCm install by using pip instead of uv#4966danielhanchen merged 7 commits intounslothai:mainfrom
Conversation
There was a problem hiding this comment.
Code Review
This pull request modifies the install.sh script to use pip instead of uv for installing bitsandbytes on AMD ROCm, as uv reportedly causes issues with this package. The feedback suggests extending this change to the fallback installation path and moving the pip bootstrapping logic to the beginning of the function to ensure it is available for all installation scenarios.
| # uv destroys bitsandbytes, use pip instead | ||
| # Ensure pip is installed, then use it for this install. | ||
| if ! "$_venv_py" -m pip --version >/dev/null 2>&1; then | ||
| uv pip install --python "$_venv_py" pip >/dev/null 2>&1 || true | ||
| fi |
There was a problem hiding this comment.
The comment states that uv is problematic for bitsandbytes on ROCm, but the PyPI fallback at line 128 still uses uv. To ensure the fix is complete and consistent with the PR's objective, the fallback should also be updated to use pip.
Additionally, the pip bootstrap logic (lines 119-121) is currently nested within the if [ -n "$_bnb_whl_url" ] block. If this block is skipped (e.g., on architectures where _bnb_whl_url is empty) or if the pre-release installation fails, the fallback will be reached without ensuring pip is available in the virtual environment. It is recommended to move the pip check to the beginning of the _install_bnb_rocm function so it covers all installation paths.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d96bc44c19
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| # uv destroys bitsandbytes, use pip instead | ||
| # Ensure pip is installed, then use it for this install. | ||
| if ! "$_venv_py" -m pip --version >/dev/null 2>&1; then | ||
| uv pip install --python "$_venv_py" pip >/dev/null 2>&1 || true |
There was a problem hiding this comment.
Do not ignore pip bootstrap failure for ROCm bnb install
If the venv does not have pip yet, the new bootstrap step swallows any error with || true, then immediately invokes python -m pip install for the ROCm wheel. In environments where fetching pip fails (for example, PyPI-restricted networks), this makes the pre-release install fail even when the GitHub wheel URL is reachable, and the function falls back to the older PyPI bitsandbytes path instead. Please fail fast (or use ensurepip) rather than silently continuing after a pip bootstrap error.
Useful? React with 👍 / 👎.
The original fix correctly switched the pre-release wheel install from uv to pip, but left the PyPI fallback path on uv. If uv breaks bnb on ROCm, the fallback would hit the same issue. Move pip bootstrap before the branch so both paths use pip consistently.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 97b2ca9d2c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| # uv destroys bitsandbytes on ROCm, use pip instead. | ||
| # Ensure pip is installed first. | ||
| if ! "$_venv_py" -m pip --version >/dev/null 2>&1; then | ||
| uv pip install --python "$_venv_py" pip >/dev/null 2>&1 || true |
There was a problem hiding this comment.
Do not ignore pip bootstrap failure in ROCm bnb path
When $_venv_py -m pip is missing, the new bootstrap command uv pip install --python "$_venv_py" pip is allowed to fail via || true, but the function immediately switches to python -m pip install for both the pre-release wheel and fallback. In environments where seeding pip fails (e.g., restricted index access or transient network failures), this guarantees both installs fail and aborts the ROCm flow; the bootstrap should fail fast or use an offline-safe seeding path instead of silently continuing.
Useful? React with 👍 / 👎.
- Try ensurepip --upgrade before falling back to uv pip install pip. ensurepip works offline and does not need PyPI, making the bootstrap robust when the network or index is unavailable. - If both ensurepip and uv fail, emit a visible warning instead of silently swallowing the error (which previously led to a cryptic "No module named pip" downstream). - Use run_maybe_quiet so --verbose users see bootstrap output. - Update comment to document the actual root cause: uv rejects the wheel because filename version and metadata version disagree.
uv pip install ignores pip.conf and PIP_* env vars, but python -m pip reads them. Without --isolated, users with PIP_INDEX_URL pointing to a private mirror that does not carry bitsandbytes would see the PyPI fallback fail where it previously worked under uv. --isolated restores parity with the old uv behavior.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d022501228
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if run_install_cmd "$_label (pre-release)" uv pip install --python "$_venv_py" \ | ||
| --force-reinstall --no-cache-dir --no-deps "$_bnb_whl_url"; then | ||
| if run_install_cmd "$_label (pre-release)" "$_venv_py" -m pip install \ | ||
| --isolated --force-reinstall --no-cache-dir --no-deps "$_bnb_whl_url"; then |
There was a problem hiding this comment.
Remove isolated mode from ROCm bitsandbytes installs
Using python -m pip install --isolated here causes pip to ignore environment variables and user pip configuration (as documented in pip --help), so mirror/proxy/certificate settings from PIP_* or user config are dropped in this ROCm path. In environments that rely on those settings (for example, internal package mirrors or custom trust config), this newly added flag makes bitsandbytes installation fail even though the rest of the installer still works via uv pip; please remove --isolated or pass equivalent index/trust options explicitly.
Useful? React with 👍 / 👎.
--isolated suppresses PIP_INDEX_URL, PIP_EXTRA_INDEX_URL, and pip.conf. This is correct for the pre-release path (hardcoded GitHub URL, no index consulted), but breaks the PyPI fallback for users in corporate or air-gapped environments whose only route to bitsandbytes is a private mirror configured via those mechanisms. Keep --isolated on the direct-URL pre-release install; drop it from the index-dependent fallback.
Aligns install_python_stack.py with unslothai/unsloth#4966. uv's installer corrupts the bitsandbytes continuous-release_main wheel on ROCm even when the command reports success, leaving the venv with a broken bnb import at runtime. The install.sh fix in #4966 switched to python -m pip install for the Linux shell installer; this commit does the same for the Python installer that runs from unsloth studio update. Changes - pip_install_try and pip_install gain a force_pip: bool = False parameter. When True, the uv attempt is skipped entirely and the call goes straight to python -m pip install via the existing pip_cmd builder. - _ensure_rocm_torch passes force_pip=True for both the bnb pre-release URL install AND the PyPI fallback. Both code paths install bitsandbytes and both are affected by the uv corruption bug, so keeping them consistent matches the gemini-code-assist review comment on #4966 (the fallback in #4966 itself is still uv-backed). - Non-bnb calls (torch install, base packages, extras, etc.) keep the default force_pip=False and continue to prefer uv for speed.
The previous Windows AMD install was incomplete in two material ways, both surfaced while verifying whether PyTorch has upstream Windows ROCm wheels (it does not -- pytorch.org's get-started page states "ROCm is not available on Windows" and every wheel under download.pytorch.org /whl/rocm7.x is manylinux_2_28_x86_64 only; upstream work is tracked in pytorch/pytorch#159520 and targeted for a future release). repo.radeon.com therefore remains the only source for Windows ROCm torch until that RFC lands, and AMD's install docs at rocm.docs.amd.com/projects/radeon-ryzen/.../install-pytorch.html document a two-step pip procedure that we were only implementing half of. Fixing both bugs here so the PR actually produces a working torch.version.hip import on a fresh Windows host. Bug 1: missing ROCm SDK wheels install step AMD's procedure first installs rocm_sdk_core, rocm_sdk_devel, rocm_sdk_libraries_custom, and rocm-<ver>.tar.gz (about 1.4 GB total). These wheels ship the ROCm runtime libraries that torch links against at import time. Without them `import torch` fails with missing DLL errors even when the torch wheels themselves are installed. Both 7.1.1 and 7.2.1 need this step; 7.1.1 stamps its SDK wheels with the `0.1.dev0` version string while 7.2.1 uses `7.2.1`. This adds the 4 SDK URLs per release to the wheel map and installs them as Step 1 ahead of the existing torch install (Step 2). Both steps are passed in a single pip call each so pip's dep resolver does not reset torch between wheels (matches AMD's troubleshooting guidance for the same failure mode). Bug 2: HIP SDK was a hard prerequisite, should be optional install.ps1 / setup.ps1 / install_python_stack.py all errored out when $env:HIP_PATH was absent, pointing users at the HIP SDK download page. But the HIP SDK developer toolkit at C:\Program Files\AMD\ROCm\ is for people compiling HIP kernels, not for running PyTorch. AMD's docs list only (a) the AMD graphics driver 26.2.2+ and (b) Python 3.12 as prerequisites. Gating on HIP_PATH was blocking the exact audience that #4280 is about -- regular Radeon users running Unsloth. HIP_PATH is now an optional version hint. When present and valid we use it to select the matching ROCm release; when absent or unsupported (e.g. HIP 6.4) we fall back to the newest stable release (_DEFAULT_WINDOWS_ROCM_VERSION = (7, 2)) and print a visible note pointing at the graphics driver download page. The HIP SDK install prompts have been removed from all three files. Use pip (not uv) for the Radeon wheels Both the SDK and torch install steps call `python -m pip install` directly via the new force_pip=True path added in the previous commit. AMD's documented procedure uses pip, uv has known wheel-corruption issues on similar large ROCm wheels (unslothai/unsloth#4966 for bitsandbytes), and pip is the combination AMD validates. This matches the fix applied to bitsandbytes on Linux ROCm.
--isolated suppresses pip.conf cert/proxy/CA settings in addition to index config. For the direct GitHub URL, index config is irrelevant but cert/proxy settings matter in corporate SSL-inspection environments. Without this fix, users with pip.conf-based CA bundles get a TLS error on the pre-release download and silently fall back to the broken PyPI version -- the exact outcome the PR is trying to prevent. Also fix the fallback warning: "unreachable" is too specific since the pre-release install can fail for reasons other than network reachability.
|
Auto-review verdict: Approved All review iterations completed successfully. Fixes were applied and verified. |
Supersedes the pip-bootstrap approach from unslothai#4966. Drops the ensurepip block and reverts both install paths to uv; UV_SKIP_WHEEL_FILENAME_CHECK handles the continuous-release_main wheel's filename/metadata mismatch without mixing package managers in the studio venv.
Fixes install issue with bitsandbytes on ROCm: