ci: build wheels also for linux/aarch64 #189
- Prebuild skia in a separate job for linux/aarch64 and cache the result
- Build one wheel per job for linux/aarch64
- Clarify that we are building wheels for CPython versions 3.7-3.11
- Do not stop other jobs if one fails
- Workaround for slow `git clone` with large repositories
- Update OSes and cibuildwheels
- Update deprecated GitHub actions
|
Hi @kyamagu, have you had the chance to look at this? It should be almost zero effort on your side. I have a project that depends on your (fantastic!) work, and I wouldn't want to set up a needless fork. |
kyamagu
left a comment
This really seems a workaround but ok
|
@lucach Hi, thanks for the PR. Sorry for the late response. I am currently inactive for OSS projects. As for this PR, I believe ultimately this should be resolved on the runner side and personally do not like a workaround for a specific configuration ( |
|
Thanks a lot @kyamagu! (I know how maintaining OSS can be a burden...) I totally agree that it's a dirty workaround. At least it will be easy to revert once it is no longer needed. Any chance of releasing the current version as 87.6 so that the wheels get published? Thanks! 🙏
|
@lucach It's already on PyPI as ver 87.5. I'm not changing the release version this time since the wheel itself has no change and we are only adding wheels. Good luck |
|
@lucach Hi. With m116, which I assume is bigger, I am finding that the "gn + skia" step alone takes 5 hours 52 minutes in the worst case. :-( So I am splitting it further into two steps: "gn + bundled 3rd party" first, then "skia alone" last. I would like your eyes on this. (Ignore the title / history; all of it is already merged.) I don't like the "telepathic" approach of setting an environment variable in one place and reading it in another, so the build script simply interrupts itself and insists on being run a 2nd time on aarch64 Linux, then continues when it is run a 2nd time. On aarch64, you just run the script twice instead of once (and you can split the two runs into two jobs if needed). That should split the 6 hours into two fragments of 4 and 2.
I am still not happy that it is the first job and blocks the faster, more informative ones from running for 6 hours. Any chance of moving it sideways, to run in parallel with the x86_64 ones? I think delaying the other platforms' builds by 6 hours is quite bad. If one spots a mistake in the faster builds, one can just cancel, fix and reschedule.
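For readers following along, here is a minimal sketch of the "run the script twice" idea described above. It is not the actual build script, which is not shown in this thread. The marker file, its name, the function name and the return values are all hypothetical, and the real script may detect its second run in a different way.

```python
from pathlib import Path

# Hypothetical marker file; the real script may track progress differently.
MARKER_NAME = ".stage1_done"


def run_build(workdir: Path, machine: str, log: list) -> str:
    """Run the build in up to two stages.

    On aarch64 the first invocation stops after stage 1 and returns
    "rerun"; the second invocation finds the marker and runs stage 2.
    On other machines both stages run in a single invocation.
    """
    marker = workdir / MARKER_NAME
    if not marker.exists():
        # Placeholder for the real work: building gn and bundled dependencies.
        log.append("stage 1: gn + bundled 3rd party")
        marker.touch()
        if machine == "aarch64":
            # Interrupt here so the caller can start a fresh job for stage 2.
            return "rerun"
    # Placeholder for the real work: building skia itself.
    log.append("stage 2: skia")
    return "done"
```

Calling `run_build` twice with the same `workdir` on aarch64 runs stage 1 and then stage 2. If the two calls happen in different CI jobs, `workdir` (including the marker) would have to be carried over between them, for example through a cache or an artifact.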
|
@HinTak Hi, I don't think I have the right expertise to comment on your changes. Just two notes:
|
|
Argh, thanks for pointing out the typo. I wonder how that actually seems to work... |
|
As for the macOS arm64 runner - that might be interesting if QEMU bundles the Mac equivalent of KVM at that point. KVM allows virtual machines to run at near-native speed.
|
GitHub-hosted runners now include the Linux/ARM64 architecture. Maybe it's a good time to think about simplifying the build process again.
|
Good. I wonder if Google has started providing aarch64 linux gn binary. (That's the other complication we have on aarch64 linux) |
|
CI has been failing on aarch64 Linux (#313) since the last run two days ago, and the only changes since then were adding READMEs and bumping the version... It looks like GitHub may have changed the QEMU / aarch64 Linux setup in the last two days (on 1st April, I guess).
|
The cache management is broken, and on aarch64 linux it does not seem to be restoring the cache. |
Wheels for linux/aarch64 were excluded from the CI workflow: the emulation makes the build extremely slow, and the job exceeds the 6-hour limit.
I started investigating several possible ways to work around this limitation. Experiments reveal that building gn and skia takes roughly 3 hours, and each wheel roughly 1 hour. It's common for large projects (see matplotlib here) to split wheels across jobs, creating one wheel per job.
I've experimented with several different configurations, and the one proposed in this PR seems to work consistently. Although it would be feasible for a job to build skia and one wheel (3hrs + 1hr = ~4hrs), sometimes GitHub workers are extremely slow and that would still exceed the timeout.
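The timing argument can be checked with a few lines of arithmetic. This is only a rough sketch using the approximate figures quoted above (about 3 hours for gn and skia, about 1 hour per wheel, a 6-hour job limit). It assumes one wheel per CPython version from 3.7 to 3.11, and the `slowdown` factor is a hypothetical way to model a slow runner.

```python
LIMIT_H = 6.0   # job time limit, in hours
SKIA_H = 3.0    # rough time to build gn and skia under emulation
WHEEL_H = 1.0   # rough time to build one wheel under emulation

# Assumption: one wheel per CPython version in the 3.7-3.11 range.
CPYTHON_VERSIONS = ("3.7", "3.8", "3.9", "3.10", "3.11")


def job_hours(builds_skia: bool, n_wheels: int, slowdown: float = 1.0) -> float:
    """Estimated duration of one job; slowdown > 1 models a slow runner."""
    base = (SKIA_H if builds_skia else 0.0) + WHEEL_H * n_wheels
    return base * slowdown


# Everything in one job: well over the limit.
all_in_one = job_hours(True, len(CPYTHON_VERSIONS))

# skia plus a single wheel: fits on paper, with limited headroom.
skia_plus_one = job_hours(True, 1)

# Largest slowdown factor that the "skia + one wheel" job can absorb.
max_slowdown = LIMIT_H / skia_plus_one

print(f"all in one job:   {all_in_one:.1f} h (limit {LIMIT_H:.0f} h)")
print(f"skia + one wheel: {skia_plus_one:.1f} h, tolerates up to {max_slowdown:.1f}x slowdown")
```

With these estimates, a runner that is more than 1.5 times slower than usual pushes the "skia + one wheel" job past the limit.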
As a solution, I've also split the step of building gn and skia into a separate job, whose output is then injected into the container used by cibuildwheels.
This makes the CI setup a bit more complicated, but I think it's a small enough price to pay for getting the extra wheels.
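As an illustration of the resulting layout, the sketch below models the jobs as a small dependency graph: one prebuild job, plus one wheel job per CPython version that waits for it. The job names and the `Job` class are hypothetical, the durations are the rough estimates from this description, and the wall-clock figure assumes that all wheel jobs can run in parallel.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    hours: float
    needs: list = field(default_factory=list)  # names of prerequisite jobs


def plan_jobs(versions):
    """One prebuild job, then one wheel job per version depending on it."""
    prebuild = Job("prebuild-skia", 3.0)
    wheels = [
        Job(f"wheel-cp{v.replace('.', '')}", 1.0, [prebuild.name])
        for v in versions
    ]
    return [prebuild] + wheels


def finish_time(job, by_name):
    """Earliest finish time of a job, assuming unlimited parallel runners."""
    start = max((finish_time(by_name[n], by_name) for n in job.needs), default=0.0)
    return start + job.hours


jobs = plan_jobs(["3.7", "3.8", "3.9", "3.10", "3.11"])
by_name = {j.name: j for j in jobs}

longest_job = max(j.hours for j in jobs)                   # what the limit applies to
wall_clock = max(finish_time(j, by_name) for j in jobs)    # end-to-end duration

print(f"{len(jobs)} jobs, longest single job {longest_job:.0f} h, wall clock {wall_clock:.0f} h")
```

Under these assumptions the longest single job is the roughly 3-hour prebuild, so a runner would have to be about twice as slow as usual before any job reaches the 6-hour limit.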
I've also updated deprecated GitHub actions (which will eventually fail) and clarified which versions wheels are built for (so that we are also ready when the upcoming Python 3.12 is out).