
ci: build wheels also for linux/aarch64#189

Merged
kyamagu merged 1 commit into skia-python:main from lucach:main
Apr 27, 2023

@lucach
Contributor

@lucach lucach commented Apr 20, 2023

Wheels for linux/aarch64 were excluded from the CI workflow: the emulation makes the build extremely slow, and the job exceeds the 6-hour limit.

I started investigating several possible ways to work around this limitation. Experiments show that building gn and skia takes roughly 3 hours, and each wheel roughly 1 hour. It's common for large projects (matplotlib, for example) to split wheels across jobs, creating one wheel per job.

I've experimented with several different configurations, and the one proposed in this PR seems to work consistently. Although it would be feasible for a job to build skia and one wheel (3 hrs + 1 hr = ~4 hrs), GitHub workers are sometimes extremely slow, and that would still exceed the timeout.
As a solution, I've also split the step of building gn and skia into a separate job, whose output then gets injected into the container used by cibuildwheel.
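
The time budget behind this choice can be sketched in Python. This is only a rough illustration: the 3-hour and 1-hour figures and the 6-hour limit come from the description above, the count of five wheels assumes CPython 3.7 to 3.11, and the layout names, the `fits` helper and the 1.5x slowdown factor are hypothetical.

```python
TIMEOUT_H = 6.0      # job time limit mentioned in the PR
SKIA_BUILD_H = 3.0   # rough time to build gn and skia under emulation
WHEEL_BUILD_H = 1.0  # rough time to build one wheel under emulation
N_WHEELS = 5         # assumption: CPython 3.7 to 3.11


def longest_job_hours(layout: str) -> float:
    """Duration of the longest single job for a given (hypothetical) layout."""
    if layout == "single-job":
        # skia and every wheel built in the same job
        return SKIA_BUILD_H + N_WHEELS * WHEEL_BUILD_H
    if layout == "skia-plus-one-wheel":
        # every job rebuilds skia, then builds one wheel
        return SKIA_BUILD_H + WHEEL_BUILD_H
    if layout == "prebuilt-skia":
        # skia is built once in its own job; wheel jobs reuse the result
        return max(SKIA_BUILD_H, WHEEL_BUILD_H)
    raise ValueError(f"unknown layout: {layout}")


def fits(layout: str, slowdown: float = 1.0) -> bool:
    """True if the longest job stays strictly under the timeout."""
    return longest_job_hours(layout) * slowdown < TIMEOUT_H


for name in ("single-job", "skia-plus-one-wheel", "prebuilt-skia"):
    # 1.5 is a made-up factor standing in for an unusually slow worker
    print(name, longest_job_hours(name), fits(name), fits(name, slowdown=1.5))
```

Under these assumptions, only the layout with a prebuilt skia keeps every job under the limit when a worker runs noticeably slower than usual.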

This makes the CI setup a bit more complicated, but I think it's a small enough price to pay for getting the extra wheels.

I've also updated deprecated GitHub actions (which are eventually going to fail) and clarified which Python versions wheels are built for (so that we are also ready when the upcoming Python 3.12 is out).

- Prebuild skia in a separate job for linux/aarch64 and cache the result
- Build one wheel per job for linux/aarch64
- Clarify that we are building wheels for CPython versions 3.7-3.11
- Do not stop other jobs if one fails
- Workaround for slow `git clone` with large repositories
- Update OSes and cibuildwheel
- Update deprecated GitHub actions
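
As an illustration of the "one wheel per job" item, here is a minimal Python sketch that enumerates one build selector per job. The `cp37-manylinux_aarch64` style follows cibuildwheel's build-identifier naming; whether the workflow in this PR uses exactly these selectors is an assumption, and `wheel_jobs` is a hypothetical helper.

```python
CPYTHON_VERSIONS = ["3.7", "3.8", "3.9", "3.10", "3.11"]  # versions listed in the PR


def wheel_jobs(versions: list[str], arch: str = "aarch64") -> list[str]:
    """Return one cibuildwheel-style build identifier per job."""
    return [f"cp{v.replace('.', '')}-manylinux_{arch}" for v in versions]


jobs = wheel_jobs(CPYTHON_VERSIONS)
for job in jobs:
    # each identifier would be handed to a separate CI job
    print(job)
```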
@lucach
Contributor Author

lucach commented Apr 27, 2023

Hi @kyamagu, have you had the chance to look at this? It should be almost zero effort on your side.

I have a project that depends on your (fantastic!) work, and I wouldn't want to set up a needless fork.

Collaborator

@kyamagu kyamagu left a comment

This really seems a workaround but ok

@kyamagu
Collaborator

kyamagu commented Apr 27, 2023

@lucach Hi, thanks for the PR. Sorry for the late response. I am currently inactive for OSS projects.

As for this PR, I believe ultimately this should be resolved on the runner side and personally do not like a workaround for a specific configuration (linux/aarch64 here), but maybe this cannot be avoided as of now.

@kyamagu kyamagu merged commit cd03964 into skia-python:main Apr 27, 2023
@lucach
Contributor Author

lucach commented Apr 28, 2023

Thanks a lot @kyamagu ! (I know how maintaining OSS can be a burden...)

I totally agree that it's a dirty workaround. At least it will be easy to revert once it is no longer needed.

Any chance of releasing the current version as 87.6 so that the wheels get published? Thanks! 🙏

@kyamagu
Collaborator

kyamagu commented Apr 28, 2023

@lucach It's already on PyPI as ver 87.5. I'm not changing the release version this time since the wheel itself has no change and we are only adding wheels. Good luck

@HinTak
Collaborator

HinTak commented Aug 9, 2023

@lucach hi - with m116, which I assume is bigger, I am actually finding the "gn + skia" step itself taking, worst case, 5 hours 52 minutes. :-( So I am splitting it further into two: "gn + bundled 3rd party" and "skia alone/finally". Would like your eyes on this. (Ignore the title / history - all of it is already merged.)
https://github.com/HinTak/skia-m1xx-python/pull/1/files

I don't like the "telepathic" approach of setting an environment variable in one place and reading it in another, so the build script simply interrupts itself and insists on being run a 2nd time on aarch64 linux, and continues when it is run a 2nd time. On aarch64, you just run the script twice instead of once. (And you can split the two runs into two jobs, if needed.)
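
The "run it twice" behaviour can be sketched roughly as follows. This is not the actual build script: how the real script detects the second run is not stated here, so the marker file, the `run_build` function and the phase names in the log are all hypothetical.

```python
import tempfile
from pathlib import Path


def run_build(workdir: Path, is_aarch64_linux: bool, log: list[str]) -> bool:
    """Return True when the build is complete, False when a second run is needed."""
    marker = workdir / "third_party.done"  # hypothetical "phase 1 finished" marker
    if not marker.exists():
        log.append("gn + bundled 3rd party")  # phase 1
        marker.touch()
        if is_aarch64_linux:
            # stop here on aarch64 linux; the caller must run the script again
            return False
    log.append("skia")  # phase 2
    return True


with tempfile.TemporaryDirectory() as tmp:
    log: list[str] = []
    first = run_build(Path(tmp), is_aarch64_linux=True, log=log)   # stops after phase 1
    second = run_build(Path(tmp), is_aarch64_linux=True, log=log)  # resumes with phase 2
    print(first, second, log)
```

On other platforms the same function would run both phases in a single call, so only the aarch64 linux jobs need the second invocation.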

That should split the 6 hours into two fragments of 4 and 2.

Still not happy that it is the first job, blocking the more interesting/informative faster ones from running for 6 hours... any chance of moving it sideways, in parallel with the x86_64 ones? I think delaying the other platforms from building for 6 hours is quite bad. If one spots a mistake in the faster builds, one can just cancel, fix and re-schedule.

@lucach
Contributor Author

lucach commented Aug 11, 2023

@HinTak Hi, I don't think I have the right expertise to comment on your changes. Just two notes:

  • there seems to be a mismatch in the keys for the cache (linux-aarch64-skia-${{ github.sha }}--3rd-party is used to restore -- note the two dashes)
  • GitHub expects to introduce macOS arm64 runners in Q4 2023: I wonder if those could be then used to build the wheels without emulation and without complicating the build step
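
The key mismatch in the first note can be shown with a small Python sketch. It assumes that the save side uses a single dash before `3rd-party` (only the restore key is quoted above) and that lookup is by exact key match; the `cache_key` helper, the dictionary standing in for the cache, and the SHA value are hypothetical.

```python
def cache_key(sha: str, suffix: str, sep: str = "-") -> str:
    """Build a cache key in the style quoted above (hypothetical helper)."""
    return f"linux-aarch64-skia-{sha}{sep}{suffix}"


sha = "0123abc"  # placeholder commit SHA

# Assumption: the cache is saved under a single-dash key.
saved = {cache_key(sha, "3rd-party"): "prebuilt 3rd-party artifacts"}

# The restore step uses two dashes, as noted in the comment above.
restore_key = cache_key(sha, "3rd-party", sep="--")

hit = saved.get(restore_key)  # exact-match lookup misses
print(restore_key, hit)
```

With exact matching alone, the two-dash key would never hit, which is why the mismatch is worth fixing even if the workflow appears to pass.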

@HinTak
Collaborator

HinTak commented Aug 11, 2023

Argh, thanks for pointing out the typo. I wonder how that actually seems to work...

@HinTak
Collaborator

HinTak commented Aug 11, 2023

As for the mac os x arm64 runner - that might be interesting if qemu bundles the mac equivalent of kvm at that point. Kvm allows virtual machines to run at near-native speed.

@kyamagu
Collaborator

kyamagu commented Nov 1, 2023

@kyamagu
Collaborator

kyamagu commented Mar 24, 2025

GitHub-hosted runners now include the Linux/ARM64 arch. Maybe it's a good time to think about simplifying the build process again
https://docs.github.com/en/actions/using-github-hosted-runners/using-github-hosted-runners/about-github-hosted-runners#standard-github-hosted-runners-for-public-repositories

@HinTak
Collaborator

HinTak commented Mar 24, 2025

Good. I wonder if Google has started providing aarch64 linux gn binary. (That's the other complication we have on aarch64 linux)

@HinTak
Collaborator

HinTak commented Apr 3, 2025

CI is failing on aarch64 linux #313 since the last time, two days ago, and the changes since then were just adding readmes and bumping the version... looks like GitHub may have changed the qemu / aarch64 linux setup in the last two days (on 1st April, I guess).

@HinTak
Collaborator

HinTak commented Apr 3, 2025

The cache management is broken, and on aarch64 linux it does not seem to be restoring the cache.

3 participants