cuda.bindings: Fix segfault when converting char* NULL to bytes#497
cuda.bindings: Fix segfault when converting char* NULL to bytes#497leofang merged 9 commits intoNVIDIA:mainfrom
char* NULL to bytes#497Conversation
…ad of segfaulting when converting char* NULL to bytes.
|
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
|
/ok to test |
This comment has been minimized.
This comment has been minimized.
|
@leofang The two test failures are these flakes: |
|
Changes LGTM at the moment but I see that this is still a draft |
Thanks!
I'm looking into adding more tests, to ideally cover all changes. And we need to do something about the flakes, even though they are unrelated. |
Observed failures: ``` > assert delay_seconds * 1000 <= elapsed_time_ms < delay_seconds * 1000 + 2 # tolerance 2 ms E assert 503.1598205566406 < ((0.5 * 1000) + 2) ``` ``` > assert delay_seconds * 1000 <= elapsed_time_ms < delay_seconds * 1000 + 2 # tolerance 2 ms E assert 505.0583190917969 < ((0.5 * 1000) + 2) ```
…tring() CUDA_ERROR_INVALID_VALUE
…ve segfaulted before.
|
/ok to test |
|
Auto-sync is disabled for ready for review pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
|
Woah, looks like I'm good at generating segfaults ... @leofang should I just remove that new test and leave that for another PR? I don't think I'll get to the bottom of the newly discovered segfault today. Tomorrow pretty sure. |
|
Just isolate out the cuda.core changes from those for bindings. Let's merge core-related things first; bindings can wait. |
|
/ok to test |
|
The cuda.core change is to resolve the flakes. (Without it I'd (maybe) have to hit rerun a few times to deflake.) I just backed out the test that triggers the segfaults: commit a39720b |
Ah, I meant we can separate it out and merge it first. Mainly if we want to auto-backport the fix, the cuda-11 branch does not have any cuda.core code, so the apto-backport would fail. I am curious if it could work without any cuda-core changes involved. |
This reverts commit 9dd2630.
|
/ok to test |
|
@rwgk could you update the commit message and avoid (I noticed this because of a notification 😂) |
…ested by at-leofang
7814469 to
ab6db22
Compare
Oh ... sorry, done. (I did this a lot in the pybind11 repo, wanting to give proper credit. I didn't realize this can lead to spammy notifications.) |
|
/ok to test |
|
@leofang The tests passed previously with the exact same code. The only change was to the commit message. I.e. admin merge would be ideal. |
|
ok |
|
Backport failed because this pull request contains merge commits. You can either backport this pull request manually, or configure the action to skip merge commits. |
Should I take care of that? (Happy to.) |
|
Yes, please. My bad in updating your branch, sorry... |
|
I'll backport this together with #499, after that is merged. |
* PR #497 squash-merged * Bring back test_nvrtcGetLoweredName_failure() (it was originally under PR #497). * Add code for debugging * test_all_CUresult_codes(): max_code = int(max(cuda.CUresult)) as suggested by at-leofang * Change pytest options, mostly to disable output capturing (of both stdout and stderr) * Undo debugging changes in nvrtc.pyx.in * Revert "Change pytest options, mostly to disable output capturing (of both stdout and stderr)" This reverts commit b0464e7. * Skip new test if nvrtc version < 12.1
The segfault was discovered while working on PR #458.
Full test coverage for changed code.