[Impeller] Do not wait for a frame's acquire fence if the frame was never presented#183288
Conversation
…ever presented This could happen if a previous call to vkAcquireNextImageKHR failed. If that occurs, then AcquireNextDrawable will not return a surface. The caller will not render a frame or call Present, and the fence will never be signaled. See internal issue b/488786379
There was a problem hiding this comment.
Code Review
This pull request addresses a potential hang in the Vulkan backend. If vkAcquireNextImageKHR fails, the associated fence is never signaled, which could cause a deadlock on a subsequent frame. The fix introduces a flag, acquire_fence_pending, to track whether a wait on this fence is necessary. This flag is only set when a frame is successfully presented, ensuring that the wait is skipped if the image acquisition failed. The changes also include updates to the Vulkan mocks to allow for testing this scenario, and a new unit test is added to verify that the fence wait is correctly skipped after an acquisition failure.
engine/src/flutter/impeller/renderer/backend/vulkan/test/swapchain_unittests.cc
Outdated
Show resolved
Hide resolved
| return VK_SUCCESS; | ||
| } | ||
|
|
||
| static thread_local std::function<std::remove_pointer_t<PFN_vkWaitForFences>> |
There was a problem hiding this comment.
What would be the benefit of NoDestructor here?
Other similar state in the mock Vulkan implementation does not use NoDestructor. Skipping the destructor call could cause leaks during thread shutdown.
There was a problem hiding this comment.
It's a requirement of the c++ style guide. The tradeoffs and decision are enumerated here: https://google.github.io/styleguide/cppguide.html#Static_and_Global_Variables
There was a problem hiding this comment.
you could zero it out in a teardown method for your test suite if you are concerned about the leak
There was a problem hiding this comment.
These variables are thread-local and do not have static/global duration.
See https://google.github.io/styleguide/cppguide.html#thread_local - specifically the discussion of why removing the destructor call will result in resource leaks when the thread terminates.
There was a problem hiding this comment.
I know that the destructor will be called when the thread is killed. The concerns about statics still apply here.
Prefer trivial types, or types that provably run no user-provided code at destruction
The absl::NoDestructor makes the code satisfy that suggestion.
Relying on the thread destruction to clean up your variable is unreliable because the timing is difficult to manage. If instead of relying on the thread destruction to do your cleanup, you have a RAII wrapper or clear it out in the tearDown you have a clear deterministic time when it will happen and any temporal dependencies can be negotiated. It also makes it easier to know there is no leakage between tests.
There was a problem hiding this comment.
The thread-local variables used by the Vulkan mocks will be predictably destructed when the thread running the test terminates. The destructors of these objects do not have any dependencies on the order in which they are invoked.
This pattern has been used safely in the Vulkan mocks since your original implementation in flutter/engine@30d0cf8. This is how the parameters of the MockVulkanContextBuilder are made available to the functions exported by the mock Vulkan API resolver.
I agree that refactoring the lifecycle of the mock Vulkan state might be worthwhile (for example, creating a test fixture for mock Vulkan tests that cleans this state after each test case). But that's out of scope for this PR.
There was a problem hiding this comment.
Chatted with jason offline. We are in agreement in what a better setup for mock_vulkan would be and outlined it in a followup issue: #183323
| delete reinterpret_cast<MockSemaphore*>(semaphore); | ||
| } | ||
|
|
||
| static thread_local std::function< |
engine/src/flutter/impeller/renderer/backend/vulkan/swapchain/khr/khr_swapchain_impl_vk.cc
Show resolved
Hide resolved
engine/src/flutter/impeller/renderer/backend/vulkan/swapchain/khr/khr_swapchain_impl_vk.cc
Show resolved
Hide resolved
|
@jason-simmons fyi I'm getting impeller validation errors when running this new test locally. |
|
Yes - that is intentional. The test is simulating an unrecoverable error in Is this causing an issue? Should the style of logging be changed for this class of errors? |
|
I noticed it when I was refactoring the mockvulkan. I thought I introduced it. I've usually seen it used for non-recoverable errors so it is surprising if we continue executing fine. At a minimum maybe we should add a log statement that shows up in that test that those errors are expected. That's going to trip people up who are looking at logs trying to figure out what went wrong. |
|
Another option would be changing Android appears to be using this error code to represent situations that may be recoverable. In the issue that led to this PR, Android's implementation of Android logged the error as |
|
Yea, that sounds good to me if it sounds good to you. |
…AcquireNextImageKHR Android may use this error code to indicate situations that are recoverable (such as a temporary failure to dequeue a buffer). This PR removes this error from the validation log to reduce noise in the logs. See flutter#183288
…rame was never presented (flutter/flutter#183288)
…rame was never presented (flutter/flutter#183288)
…rame was never presented (flutter/flutter#183288)
…rame was never presented (flutter/flutter#183288)
…rame was never presented (flutter/flutter#183288)
…rame was never presented (flutter/flutter#183288)
flutter/flutter@d182143...2ec61af 2026-03-08 [email protected] Roll Fuchsia Linux SDK from P9D6h4D2ks99Ct7TO... to giLoee6arX5CRHuRh... (flutter/flutter#183366) 2026-03-07 [email protected] Roll Skia from 6643c1bd93bb to af994ae4d990 (1 revision) (flutter/flutter#183359) 2026-03-07 [email protected] refactor: remove material import from animated_cross_fade, physical_model_test, pinned_header_sliver_test, spell_check_test (flutter/flutter#183234) 2026-03-07 [email protected] Roll Skia from a69ef43650ee to 6643c1bd93bb (13 revisions) (flutter/flutter#183346) 2026-03-07 [email protected] Roll pub packages (flutter/flutter#183344) 2026-03-07 [email protected] Roll Dart SDK from 0c24edc41e09 to 1604910613c7 (2 revisions) (flutter/flutter#183342) 2026-03-07 [email protected] Roll Fuchsia Test Scripts from nR2ESa1Gd8yPcWo06... to R2EllDf4DgBXVNuiN... (flutter/flutter#183341) 2026-03-07 [email protected] Support BGRA textures in BlitCopyTextureToBufferCommandGLES::Encode and fix improper mapping of BGRA to RGBA in blit_command_gles and texture_gles (flutter/flutter#182397) 2026-03-06 [email protected] Roll Fuchsia Linux SDK from 8ay15_eQOEgPHCypm... to P9D6h4D2ks99Ct7TO... (flutter/flutter#183329) 2026-03-06 [email protected] [doc]add discord channel to ios triage meeting (flutter/flutter#183285) 2026-03-06 [email protected] Roll Dart SDK from 7c7c1e3d024d to 0c24edc41e09 (2 revisions) (flutter/flutter#183328) 2026-03-06 [email protected] Revert "Make HCPP upgrading work for vd/tlhc (#181024)" (flutter/flutter#183310) 2026-03-06 [email protected] Add AI contribution guidelines (flutter/flutter#183326) 2026-03-06 [email protected] [Impeller] Do not wait for a frame's acquire fence if the frame was never presented (flutter/flutter#183288) 2026-03-06 [email protected] Add back in accidentally removed line from `create_updated_flutter_deps.py` (flutter/flutter#183314) 2026-03-06 [email protected] Fix typo in README (flutter/flutter#183245) 2026-03-06 [email protected] Roll pub packages (flutter/flutter#183319) 2026-03-06 [email protected] Add displayCornerRadii support to predictive back transitions. (flutter/flutter#181326) 2026-03-06 [email protected] refactor: remove material from widget_inspector_test, sliver_cross_axis_group_test, editable_text_show_on_screen_test, scrollable_fling_test, selection_container_test (flutter/flutter#182702) If this roll has caused a breakage, revert this CL and stop the roller using the controls here: https://autoroll.skia.org/r/flutter-packages Please CC [email protected],[email protected] on the revert to ensure that a human is aware of the problem. To file a bug in Packages: https://github.com/flutter/flutter/issues/new/choose To report a problem with the AutoRoller itself, please file a bug: https://issues.skia.org/issues/new?component=1389291&template=1850622 Documentation for the AutoRoller is here: https://skia.googlesource.com/buildbot/+doc/main/autoroll/README.md
…ever presented (flutter#183288) This could happen if a previous call to vkAcquireNextImageKHR failed. If that occurs, then AcquireNextDrawable will not return a surface. The caller will not render a frame or call Present, and the fence will never be signaled. See internal issue b/488786379
…AcquireNextImageKHR (flutter#183338) Android may use this error code to indicate situations that are recoverable (such as a temporary failure to dequeue a buffer). This PR removes this error from the validation log to reduce noise in the logs. See flutter#183288
…ever presented (flutter#183288) This could happen if a previous call to vkAcquireNextImageKHR failed. If that occurs, then AcquireNextDrawable will not return a surface. The caller will not render a frame or call Present, and the fence will never be signaled. See internal issue b/488786379
…ever presented (flutter#183288) This could happen if a previous call to vkAcquireNextImageKHR failed. If that occurs, then AcquireNextDrawable will not return a surface. The caller will not render a frame or call Present, and the fence will never be signaled. See internal issue b/488786379
This could happen if a previous call to vkAcquireNextImageKHR failed. If that occurs, then AcquireNextDrawable will not return a surface. The caller will not render a frame or call Present, and the fence will never be signaled.
See internal issue b/488786379