[CP-stable][Impeller] Terminate the fence waiter but do not reset it during ContextVK shutdown#174647
Conversation
…extVK shutdown (flutter#173085) ContextVK::Shutdown was clearing the context's fence_waiter_. This could cause crashes if a pending task such as image decoding is still holding a reference to the context after ContextVK::Shutdown is called. The image decode task will submit a command buffer through the context, and CommandQueueVK::Submit will get a null pointer deference when it tries to use the fence waiter. This PR changes ContextVK::Shutdown to terminate the fence waiter instead of clearing it. FenceWaiterVK::Terminate will now stop the waiter thread and wait for the thread to exit. This ensures that the fence waiter thread is stopped in ContextVK::Shutdown even if something else is holding a reference to the FenceWaiterVK. Tasks like image decoding will now get an error result instead of a crash when submitting a command buffer after context shutdown.
|
@jason-simmons please fill out the PR description above, afterwards the release team will review this request. |
|
This pull request was opened from and to a release candidate branch. This should only be done as part of the official Flutter release process. If you are attempting to make a regular contribution to the Flutter project, please close this PR and follow the instructions at Tree Hygiene for detailed instructions on contributing to Flutter. Reviewers: Use caution before merging pull requests to release branches. Ensure the proper procedure has been followed. |
There was a problem hiding this comment.
Code Review
This pull request refactors the shutdown sequence for FenceWaiterVK. In ContextVK::Shutdown, the call to fence_waiter_.reset() is replaced with fence_waiter_->Terminate(). This stops the fence waiter's thread but does not destroy the FenceWaiterVK object. The FenceWaiterVK::Terminate method is updated to be idempotent and now joins its worker thread, moving this logic from the destructor. This makes the termination explicit and blocking. A unit test is updated to reflect the new blocking nature of Terminate.
|
I believe we should cherry pick this PR. It's been on master for 3 weeks without reported problems, has tests, and solves a real crash reported by customers. |
|
autosubmit label was removed for flutter/flutter/174647, because - The status or check suite Windows windows_unopt has failed. Please fix the issues identified (or deflake) before re-applying this label. |
fec1efd
into
flutter:flutter-3.35-candidate.0
…eset it during ContextVK shutdown (flutter/flutter#174647)
…eset it during ContextVK shutdown (flutter/flutter#174647)
…eset it during ContextVK shutdown (flutter/flutter#174647)
This pull request is created by automatic cherry pick workflow
Please fill in the form below, and a flutter domain expert will evaluate this cherry pick request.
Issue Link:
What is the link to the issue this cherry-pick is addressing?
#171691
Changelog Description:
Fixes a race that can cause crashes in the Impeller Vulkan back end.
Impact Description:
Crashes in apps running on Impeller/Vulkan.
Workaround:
Disable Impeller
Risk:
What is the risk level of this cherry-pick?
Test Coverage:
Are you confident that your fix is well-tested by automated tests?
Validation Steps:
The crash does not happen consistently. One way that I have been able to reproduce it is:
Without the fix, the app will often crash if an image decode tries to access the Vulkan context after it was partially shut down while exiting the app.