Skip to content

Prevent exception being thrown on hasScrolledBody#31485

Merged
dnfield merged 22 commits intoflutter:masterfrom
kpsroka:NestedScrollPosition_toDouble_exception_31195
Jun 5, 2019
Merged

Prevent exception being thrown on hasScrolledBody#31485
dnfield merged 22 commits intoflutter:masterfrom
kpsroka:NestedScrollPosition_toDouble_exception_31195

Conversation

@kpsroka
Copy link
Contributor

@kpsroka kpsroka commented Apr 23, 2019

Description

Fixes #31195, by introducing a null check in hasScrolledBody of _NestedScrollCoordinator.

The root cause of the issue is that the position.minScrollExtent returns null, and the comparison operator trying to cast the value to toDouble on it causes exception to be thrown.

This occurs as a result of scheduleWarmUpFrame causing a cascade of updateChild calls. Eventually this reaches a NestedScrollViewState's build method.

The bug is likely more due to the scheduleWarmUpFrame being run with some assumptions that do not hold. The _minScrollExtent of _NestedScrollPosition that back NestedScrollView is initialized with null, and updated to a double value in applyContentDimensions. That, in turn, is executed by cascaded layout calls, which happens only after the scheduleWarmUpFrame finishes.

It is likely that the bug has a better solution, but I'd need some guidance on resolving this in another way.

Related Issues

Fixes #31195
Fixes #28351

Tests

I added the following tests:

I'd love to know how to reproduce this in a test environment (including fake 'warmUp' frame).

Checklist

  • I read the Contributor Guide and followed the process outlined there for submitting PRs.
  • I signed the CLA.
  • I read and followed the Flutter Style Guide, including Features we expect every widget to implement.
  • I updated/added relevant documentation (doc comments with ///).
  • All existing and new tests are passing.
  • The analyzer (flutter analyze --flutter-repo) does not report any problems on my PR.
  • I am willing to follow-up on review comments in a timely manner.

Breaking Change

Does your PR require Flutter developers to manually update their apps to accommodate your change?

  • Yes, this is a breaking change (Please read Handling breaking changes). Replace this with a link to the e-mail where you asked for input on this proposed change.
  • No, this is not a breaking change.

@dnfield dnfield added f: scrolling Viewports, list views, slivers, etc. framework flutter/packages/flutter repository. See also f: labels. labels Apr 23, 2019
@dnfield
Copy link
Contributor

dnfield commented Apr 24, 2019

Thanks for this!

I'm wondering if this is really the right fix. There are a lot of places in the framework that assume minScrollExtent is not null, and would throw a similar exception if it were.

I suspect the intent of marking minScrollExtent as required in _NestedScrollViewController's constructor was to mean it also shouldn't be null, but there's no assert. What if we were to add an assert, and figure out who is calling it with null, and make sure they call it with something sensible (like perhaps 0.0)?

Copy link
Contributor

@dnfield dnfield left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned in the comment, I suspect we just shouldn't let this be null rather than allowing it to be null in this one special case.

@kpsroka
Copy link
Contributor Author

kpsroka commented Apr 24, 2019

@dnfield Both _minScrollExtent and _maxScrollExtent are being uninitialized in ScrollPosition. The documentation for these can be found in ScrollMetrics.

Given that the permitted values can be also -/+ infinity, it might be actually good to change the getters of these fields to return -/+ infinity if they are not being set?* I'll check what effect this will have, but after looking at the code, I believe this will retain the current behaviour of null where nulls are accepted, and otherwise won't change the behaviour. WDYT?

* or perhaps initialize them with infinities -- I'm not sure what's the code style around these cases

@kpsroka
Copy link
Contributor Author

kpsroka commented Apr 24, 2019

I've updated the PR with the changes I suggested above, PTAL.

Looking through the code I've also noticed that FixedScrollMetrics should also probably have an assertion on the min-/maxScrollExtent params. However, this can be breaking for some clients (though the usage with nulls might not necessarily be useful).

@dnfield
Copy link
Contributor

dnfield commented Apr 24, 2019

They get set in the consturctor for _NestedScrollView no?

@kpsroka
Copy link
Contributor Author

kpsroka commented Apr 24, 2019

They get set in the consturctor for _NestedScrollView no?

I'm afraid that I don't know which they you are referring to. EDIT: If you're thinking of _min/_maxScrollExtent, they are private to scroll_position.dart, and not initialized there.

@dnfield
Copy link
Contributor

dnfield commented Apr 24, 2019

Sorry, not being clear. I'll take a look when not on mobile :)

@dnfield
Copy link
Contributor

dnfield commented Apr 24, 2019

I was mixing up the constructor for _NestedScrollMetrics with _NestedScrollView.

I suspect the fix here is to default the min and max scroll extents to something, but I suspect 0.0 is a safer default than double.infinity (or double.negativeInfinity). It also may be the case that we don't need to default it, we just need to figure out why someone is failing to properly initialize it. Either way, it seems allowing it to ever be null could potentialy cause problems all over the place.

@goderbauer might know better but he's out right now. I'll poke at it a little more in the mean time.

@kpsroka
Copy link
Contributor Author

kpsroka commented Apr 25, 2019

Two nits/thoughts:

  • wrt FixedScrollMetrics constructor (and inheriting _NestedScrollMetrics): the *extent params are @required, but their value is not asserted upon (the analyzer doesn't require the @required args to be non-null, right?). That's why adding an assertion might be breaking (though I doubt that it serves a purpose to initialize them with null).
  • In common case of finite scroll content, the minScrollExtent will be most likely set to zero upon first layout. However, I don't think there's going to be any difference in the way the NestedScrollView works if it the field gets initialized with 0.

@dnfield
Copy link
Contributor

dnfield commented Apr 25, 2019

Had a quick chat with @Hixie about this - he's suggesting we shouldn't default it, as that could cause some bad behavior in other places.

Basically, we need to figure out who is initializing or trying to use NestedScrollController before it's actually ready for use. Then we need to either make sure it's properly initialized at that point, or we need to make sure the lifecycle is making sense and not trying to use something it's not supposed to.

@kpsroka
Copy link
Contributor Author

kpsroka commented Apr 26, 2019

I've added a test for the bug in nested_scroll_view_test.dart. You can find the minimal repro I could achieve at https://gist.github.com/kpsroka/02d1bfd1ce9faedb92f4aa56f3e23b4a

@kpsroka
Copy link
Contributor Author

kpsroka commented Apr 26, 2019

OK, After investigating the repro I think I know what's happening:

  1. NestedScrollView makes use -- through _NestedScrollCoordinator -- of two instances of _NestedScrollController: inner, and outer. Both of them hold a list of ScrollPositions. These lists are being mutated through attach and detach methods.
  2. The inner instance is passed in the _buildSlivers method to the body of NestedScrollView using the PrimaryScrollController inherited widget.
  3. If the body of the NestedScrollView extends Scrollable, upon first build (on attachRootWidget) it will execute didChangeDependencies and _updatePosition(), which in turn calls attach() on the supplied inner controller, with a newly created ScrollPosition. That position's min/maxScrollExtent are initially set to null.
  4. The attached position's min/maxScrollExtent are going to be updated in the next layout() phase. However, it will not happen before the scheduleWarmUpFrame executes another build method.
  5. If the widget tree contains _ModalScope (e.g. via MaterialPageRoute), it is being marked as dirty after the initial build (through NavigatorState.didUpdateWidget()).
  6. The next build phase (before the first layout phase) triggers the crash when NestedScrollView reaches its build().

Now, the questions is: what shouldn't happen? :)

@kpsroka
Copy link
Contributor Author

kpsroka commented Apr 26, 2019

Note: I realized that the test that I wrote doesn't follow the same path that production code does -- it might be related to the placeholder being present in tests, or some binding overrides. Hence, I reverted it.

@dnfield
Copy link
Contributor

dnfield commented Apr 26, 2019

Thanks for the detailed explanation!

This part seems strange to me:

The attached position's min/maxScrollExtent are going to be updated in the next layout() phase. However, it will not happen before the scheduleWarmUpFrame executes another build method.

Are you saying that the build method call here skips a layout?

@kpsroka
Copy link
Contributor Author

kpsroka commented Apr 26, 2019

Thanks for the detailed explanation!

This part seems strange to me:

The attached position's min/maxScrollExtent are going to be updated in the next layout() phase. However, it will not happen before the scheduleWarmUpFrame executes another build method.

Are you saying that the build method call here skips a layout?

I assumed it doesn't does, since it looks like on layout, the scroll extents should get updated unconditionally. I'll verify that later today.

@kpsroka
Copy link
Contributor Author

kpsroka commented Apr 26, 2019

Thanks for the detailed explanation!

This part seems strange to me:

The attached position's min/maxScrollExtent are going to be updated in the next layout() phase. However, it will not happen before the scheduleWarmUpFrame executes another build method.

Are you saying that the build method call here skips a layout?

The initial build (caused by attachRootWidget) is not triggered via frame callbacks, but rather by this mount call, which causes a series of update -> inflate -> mount -> build calls. Afterwards the warm up frame proceeds with another build cycle before it gets to the layout phase.

EDIT: This also explains why the widgetTest is unable to reproduce this: the test placeholder's existence causes any subsequent attachRootWidget calls to end up executing the else branch instead, which schedules build in the regular frame cycle.

@kpsroka kpsroka changed the title Prevent exception being thrown on hasScrolledBody Prevent exception being thrown on hasScrolledBody (#31485) Apr 27, 2019
@kpsroka kpsroka changed the title Prevent exception being thrown on hasScrolledBody (#31485) Prevent exception being thrown on hasScrolledBody Apr 27, 2019
@kpsroka
Copy link
Contributor Author

kpsroka commented Apr 30, 2019

@dnfield Any idea on how to proceed with that?

@Hixie
Copy link
Contributor

Hixie commented Apr 30, 2019

If there's any part of the framework or test framework that causes us to go through a build cycle before the corresponding layout cycle, that's the bug we should track down and fix. The only times that should happen is when you've explicitly told the test framework to only go as far as build, and that's quite rare.

@dnfield
Copy link
Contributor

dnfield commented May 30, 2019

Ok. Here's the sequence that seems to cause this:

  1. The root widget is attached and fires an initial build.
  2. A warm up frame is scheduled. It starts its work and flushes the microtask queue.
  3. The engine finds out the platform brightness and text scale settings, and sends those down to the window (adding to the microtask queue).
  4. The microtask flush in the middle of scheduling the warmup frame again marks the tree as dirty.
  5. As the warmup frame continues to make progress, the scroll controller is in a bad state. It thinks it's been sucessfully built and laid out previously, and so it should be safe to access the metrics - however, the layout hasn't happened yet even though we're marked dirty again.

So our two options are:

  • Make scroll metrics tolerant of this case. The risk is that we end up hiding lifecycle issues this way.
  • Somehow make the warmup frame finish before processing the setState call for the platformbrightness/textscale notification(s).

@dnfield
Copy link
Contributor

dnfield commented May 30, 2019

Discussed this elsewhere, but one possibility for this is to update the RendererBinding to override the scheduleWarmupFrame. It could then wrap the super.scheduleWarmupFrame with something to let the renderer binding know that it should defer callbacks coming from the engine to a queue. Once the super.scheduleWarmUpFrame() is done, it could then flush that queue and go back to sending them immediately.

super.handleMetricsChanged();
for (WidgetsBindingObserver observer in _observers)
observer.didChangeMetrics();
afterWarmUpFrame(() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this won't be robust enough - we'll eventually add some new handle call, or another implementer of the binding would potentially dirty the tree in some unexpected way. Can we somehow push this up to the RendererBinding?

@Hixie
Copy link
Contributor

Hixie commented Jun 3, 2019

The engine finds out the platform brightness and text scale settings, and sends those down to the window (adding to the microtask queue).

That seems like the bug. Why are these ending up on the microtask queue instead of the task queue?

@dnfield
Copy link
Contributor

dnfield commented Jun 3, 2019

Hm. It may not actually be the microtask queue at all - the problem is they mark the widgets as needing a build again, but they're getting marked as needing a build while a build is already going on which leads to the confusion.

@kpsroka
Copy link
Contributor Author

kpsroka commented Jun 4, 2019

The engine finds out the platform brightness and text scale settings, and sends those down to the window (adding to the microtask queue).

That seems like the bug. Why are these ending up on the microtask queue instead of the task queue?

Actually, they are not. I've been a bit misled by the comment in scheduleWarmUpFrame:

    // We use timers here to ensure that microtasks flush in between.

The engine calls that end up dirtying the state before layout are actually regular Tasks, rather than microtasks. So I tried a different approach now, in which instead of using Timer.run to call handle*Frame within scheduleWarmUpFrame, I'm using scheduleMicrotask.

@dnfield
Copy link
Contributor

dnfield commented Jun 4, 2019

This seems reasonable to me but I'm afraid I may be lacking in my understanding of what's going on here. I suspect @jonahwilliams @goderbauer, @yjbanov or @Hixie would be better suited to review this. Could one of you take a look?

@dnfield dnfield dismissed their stale review June 4, 2019 17:19

This seems reasonable but I'd defer to someone with a better understanding of the inner workings here.

Copy link
Contributor

@dnfield dnfield left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM. I've talked with a couple people and no one has been able to give me a strong reason not to do this, and it seems to work fine. We'll keep an eye out for it breaking anything.

@dnfield dnfield merged commit ab707ac into flutter:master Jun 5, 2019
@dnfield
Copy link
Contributor

dnfield commented Jun 5, 2019

Thanks for sticking with this @kpsroka ! It definitely was a tricky one.

@Hixie
Copy link
Contributor

Hixie commented Jun 6, 2019

The PR as it stands now seems correct, but I'm still confused as to why it didn't work before when calling Timer.run twice back-to-back synchronously. I would have expected the result to be identical (modulo microtask counts).

It would be good to have a test that specifically tests the original bug that led to this, though I admit I don't really know how one would write that test.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

f: scrolling Viewports, list views, slivers, etc. framework flutter/packages/flutter repository. See also f: labels.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Exception: 'toDouble' was called on null' on IOS NestedScrollCoordinator: The method '>' was called on null

5 participants