allow ssh to be used for debugging github test runners #1276
BrentBaccala wants to merge 9 commits into Singular:spielwiese
|
Does this reliably work? We might have a use for it in @sagemath |
|
I haven't used it recently, but it worked fine this spring. It's a bit of a pain to configure. To get it working, the changes have to be on the GitHub repository's default branch (having them on another branch doesn't work). You can either change the default branch in the repository settings, or merge the patch into the default branch. Then you can trigger the workflow manually, as documented here. This also means you can't test it normally as a PR. Instead, you can change the default branch on your own copy of the Singular repository, or merge the patch into it, as I just described, then make a PR on your own repository and run the ssh runner there. I've done that; it should work. If the maintainers accept it into the main repository and put it on the default branch, it will be easier to use. I was also thinking of using it to diagnose the sagemath test suite failures, as I think you are, but I am working on a different project. Could you try it and see if it works for you? |
a410f5f to 406c2c0
action-tmate itself has no timeout for how long it will wait for someone to ssh in — it waits indefinitely. On a forgotten manual dispatch across a 6-job matrix that can burn up to 36 hours of runner time before GitHub's default 6-hour job timeout fires on each job. Cap the wait at the step level with timeout-minutes, and expose it as a workflow_dispatch input (default 10 minutes).
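A minimal sketch of such a step-level cap, assuming the step uses `mxschmitt/action-tmate@v3` and that the input is named `tmate_timeout_minutes` as in this PR. The job name `build` and the exact expressions are illustrative and may differ from what `runtests.yml` actually contains:

```yaml
on:
  workflow_dispatch:
    inputs:
      tmate_timeout_minutes:
        description: 'Minutes to keep the tmate step alive before cancelling it'
        type: string
        default: '10'

jobs:
  build:                      # hypothetical job name
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Wait for ssh connection (tmate)
        if: ${{ github.event_name == 'workflow_dispatch' }}
        uses: mxschmitt/action-tmate@v3
        # Step-level limit: the runner kills this step once the limit is hit,
        # long before the 6-hour job timeout. The '|| 10' fallback is an
        # assumption that keeps the expression valid when no input is present.
        timeout-minutes: ${{ fromJSON(inputs.tmate_timeout_minutes || '10') }}
```

Note that `timeout-minutes` bounds the whole step, so with this sketch an active debugging session is also cut off when the limit is reached; a longer value can be supplied at dispatch time when more time is needed.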
The workflow file has to live on a branch of the repo where the run executes, but actions/checkout can pull source from anywhere. Exposing repo and ref as workflow_dispatch inputs lets a tmate-enabled workflow on one branch debug a build of any other branch (or upstream repo) without having to merge workflow changes into the branch under test. Defaults preserve existing behavior: without inputs (push/pull_request events or manual dispatch without overrides), checkout uses the same repository and ref it would have used before.
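A rough sketch of how the checkout step could consume these inputs, assuming `actions/checkout@v4` and the input names `repo` and `ref` from this PR. The job name and the fallback expressions are assumptions, not necessarily what the PR uses:

```yaml
on:
  push:
  pull_request:
  workflow_dispatch:
    inputs:
      repo:
        description: 'owner/name of the repository to build (empty: this repository)'
        type: string
        default: ''
      ref:
        description: 'Branch, tag, or SHA to build (empty: the ref of the triggering event)'
        type: string
        default: ''

jobs:
  build:                      # hypothetical job name
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          # On push/pull_request, or on a dispatch without overrides, the input
          # is empty and the expression falls back to the current repository.
          repository: ${{ inputs.repo || github.repository }}
          # An empty ref lets actions/checkout choose its usual default.
          ref: ${{ inputs.ref }}
```

When `repo` points at a different repository and `ref` is left empty, `actions/checkout` uses that repository's default branch, so it is usually worth setting both inputs together.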
Without limit-access-to-actor, the 25-character session token embedded in the printed ssh URL is the only credential — anyone who sees the workflow log (public for public repos) can connect to the runner and read its environment, including any secrets the job has access to. limit-access-to-actor=true fetches the dispatching user's public keys from github.com/<user>.keys and writes them to the session's authorized_keys, so a valid private key is required in addition to the token. Expose it as an input so a user without github-registered ssh keys can still opt into token-only access by setting it false.
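As an illustration, the input could be wired into the tmate step roughly as follows. This assumes `mxschmitt/action-tmate@v3`, whose `limit-access-to-actor` option is the one described above; the input names come from this PR, while the defaults and job name shown here are assumptions:

```yaml
on:
  workflow_dispatch:
    inputs:
      tmate:
        description: 'Wait for an ssh connection at the end of the run'
        type: boolean
        default: false
      tmate_limit_access_to_actor:
        description: 'Require one of the dispatching user''s GitHub ssh keys'
        type: boolean
        default: true         # assumed default: key-restricted access

jobs:
  build:                      # hypothetical job name
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Wait for ssh connection (tmate)
        # Skipped on push and pull_request, and on dispatches that leave tmate off.
        if: ${{ github.event_name == 'workflow_dispatch' && inputs.tmate }}
        uses: mxschmitt/action-tmate@v3
        with:
          # false means the session token in the log is the only credential.
          limit-access-to-actor: ${{ inputs.tmate_limit_access_to_actor }}
```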
Explains the workflow_dispatch inputs added by the preceding commits (repo, ref, tmate, tmate_timeout_minutes, tmate_limit_access_to_actor), how to dispatch a manual run from the web UI or gh CLI, how to connect to and exit a tmate session, the security implications of running with or without limit-access-to-actor, and the tmate upstream deprecation. Linked from doc/How-To-Contribute.md so a contributor hitting a CI-only failure can find it. A comment at the top of runtests.yml points at the doc for anyone reading the workflow directly.
NAT/firewall rationale: GitHub-hosted runners have no inbound connectivity, so direct ssh is impossible and a relay is necessary. Also note the self-hosted tmate-ssh-server option for projects that don't want to trust the public tmate.io relay.
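For the self-hosted option, a sketch of what the step might look like. The `tmate-server-*` options are taken from the action-tmate README as best recalled; the host name `tmate.example.org` and the `vars.TMATE_*` repository variables are hypothetical placeholders, and this PR does not necessarily configure any of them:

```yaml
      - name: Wait for ssh connection (self-hosted tmate relay)
        uses: mxschmitt/action-tmate@v3
        with:
          # Hypothetical relay run by the project instead of the public tmate.io.
          tmate-server-host: tmate.example.org
          tmate-server-port: 22
          # Host key fingerprints of the relay, stored here as repository
          # variables (hypothetical names) so they are not hard-coded.
          tmate-server-rsa-fingerprint: ${{ vars.TMATE_RSA_FINGERPRINT }}
          tmate-server-ed25519-fingerprint: ${{ vars.TMATE_ED25519_FINGERPRINT }}
```

The runner still only makes outbound connections in this setup; the difference is which relay sees the session traffic.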
|
This PR has been rebased onto current
New workflow inputs (available when dispatching manually from the Actions tab or
Push and pull_request triggers are unchanged — the tmate step is skipped and the checkout behaves as before.
Documentation added in
Tested on the fork:
This comment was researched and written by an AI assistant (Claude) on behalf of Brent Baccala. |
|
Thanks, we will have a look again, but please don't be disappointed if we currently go for something more standard. |
|
I didn't realize it had been reviewed at all. |
This PR extends the existing GitHub test harness with an option, available when the action is triggered manually, to wait at the end of the run for an ssh connection into the test runner, which can then be used for debugging.