Skip to content

Do not eagerly allocate inherited widget caches when initializing element tree#95596

Merged
fluttergithubbot merged 7 commits intoflutter:masterfrom
jonahwilliams:lazy_init_inherited
Jan 18, 2022
Merged

Do not eagerly allocate inherited widget caches when initializing element tree#95596
fluttergithubbot merged 7 commits intoflutter:masterfrom
jonahwilliams:lazy_init_inherited

Conversation

@jonahwilliams
Copy link
Contributor

Currently, creating an inherited element causes it to allocate a hashmap containing itself and all parent elements, by copying the parent inherited element cache. This has roughly quadradic complexity with respect to the number of inherited elements in the widget tree, and quadradic space for the hashmap allocations. The purpose of this cache is to ensure that the of pattern and similar remain as fast as possible, since widgets frequently look up many inherited widgets in their build methods.

However, I believe that most of these caches are ending up unused, and hence we are wasting space/cpu time allocating that space. Suppose you have a handful of inherited elements at the top level of your widget tree (Directionality, Localizations, State management something). The only inherited element cache that will be used is the "last" one, and all of the others are wasted.

We should be able to improve the performance by making this inherited element cache lazy. Instead of eagerly copying the map, each element will store a parent pointer and cache the inherited element at the lowest level inherited element the first time a type is requested. This should have roughly the same complexity, but much closer to linear space.

As a bit of a sidecar, I remove the hashCode override on element. As far as I can tell, the goal of this is avoid big ints/mints due to boxing concerns and allocation - but the default hashCode is identity hashCode, which is stored on the object itself and requires no additional allocations. Removing this further improved the benchmark below.

---- Performance Measurements ---

To test this theory I ran a benchmark supplied by @dnfield which repeatedly runs first builds.

On a Pixel 4 without this patch, this ran at 774 ms with a standard deviation of 31 ms. With this patch it ran in 685 ms with a standard deviation of 24ms.

For web builds, the time improved from roughly 2.4 seconds to 2.1 seconds. I did not compute any further stats for this.

import 'dart:ui' as ui show window;
import 'package:flutter/material.dart';

void main() {
  final binding = WidgetsFlutterBinding.ensureInitialized();
  ui.window.onPointerDataPacket = null;

  var root = MediaQuery.fromWindow(
      child: Directionality(
    textDirection: TextDirection.ltr,
    child: Material(
      child: Column(
        children: List<Widget>.generate(9, (index) {
          return ListTile(
              leading: CircleAvatar(child: Text(index.toString())),
              title: Text('Content $index'));
        }),
      ),
    ),
  ));

  var sw = Stopwatch()..start();
  for (var count = 0; count < 1000; count++) {
    var renderViewElement = RenderObjectToWidgetAdapter(container: binding.renderView, child: root)
      .attachToRenderTree(binding.buildOwner!, null);
  }
  sw.stop();
  print(sw.elapsed);
}

@flutter-dashboard flutter-dashboard bot added the framework flutter/packages/flutter repository. See also f: labels. label Dec 20, 2021
@jonahwilliams
Copy link
Contributor Author

FYI @dnfield / @goderbauer

@jonahwilliams
Copy link
Contributor Author

I bet you could get even better performance by using IdentityHashMaps, which IIRC are just regular JS maps when compiled for that?

@goderbauer
Copy link
Member

Interesting approach and those numbers do look like a nice improvement.

Do the benchmark numbers in the PR description include the improvements achieved by removing hashCode? Or does that improve the stated numbers even more?

Did you run this by any chance against one of our benchmarks that represents the real world a little better (e.g. the transitions perf benchmark, for example)? Curious to see if this would show up there as well.

@jonahwilliams
Copy link
Contributor Author

It includes hashCode, I could break that out too though.

Did you run this by any chance against one of our benchmarks that represents the real world a little better (e.g. the transitions perf benchmark, for example)? Curious to see if this would show up there as well.

no not yet - do we have the ability to run the devicelab on a PR yet 😅 ? The biggest issue is that my pixel 4 is too fast and I don't have a slower device

@goderbauer
Copy link
Member

no not yet - do we have the ability to run the devicelab on a PR yet 😅 ?

@CaseyHillers Does that exit? :)

@jonahwilliams
Copy link
Contributor Author

That said I can still try - what is your favorite benchmark? I would expect this would improve worst dart frame times, but not much else

@jonahwilliams
Copy link
Contributor Author

ahh you said transitions perf, will try

@jonahwilliams
Copy link
Contributor Author

From running the new_gallery transition perf on a pixel 4:

With change: (x3)

"average_frame_build_time_millis": 3.23 / 3.37 / 3.46
"90th_percentile_frame_build_time_millis": 5.222, / 5.8 /5.56
"99th_percentile_frame_build_time_millis": 23.842 / 23.15 / 26.097,
"worst_frame_build_time_millis": 98.947 / 95.621 / 98.716,
"missed_frame_build_budget_count": 41 / 39 / 34,

ToT (x3)

"average_frame_build_time_millis": 3.32 / 3.25 / 3.31
"90th_percentile_frame_build_time_millis": 5.33 / 5.214 / 5.45,
"99th_percentile_frame_build_time_millis": 25.302 / 24.736 / 25.142,
"worst_frame_build_time_millis": 89.851 / 110.829 / 118.102,
"missed_frame_build_budget_count": 44 / 71 / 22,

I would run it more but it takes a while. So the numbers here aren't super clear - the 99th time is sometimes better, though average times aren't much different. I'd also let @dnfield have a go at it to see if it is worth anything on the low end device

@CaseyHillers
Copy link
Contributor

no not yet - do we have the ability to run the devicelab on a PR yet sweat_smile ?

@CaseyHillers Does that exit? :)

You can run the hot reload tests if you make changes to dev/. Feel free to update the runIf in ci.yaml if there's other paths that should trigger the on device tests, or update some documentation in dev/.

// expensive map allocations until they are needed.
class _InheritedLookup {
_InheritedLookup? parent;
final HashMap<Type, InheritedElement> current = HashMap<Type, InheritedElement>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kind of wonder if there's some way we can rework this so that there's only a single HashMap per depth of the tree, rather than having every element get one of these. It could live on the build owner and you could basically say "get me the inherited element for my current depth". You'd need to update it when an inherited element gets a new depth.

Does that make sense?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we have fewer hashmaps than per depth of the tree today? since we push down the hashmap created by an inherited element to all non-inherited elements

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're right - I was thinking about cases where you might have a multi-child widget that has multiple sibling children that are inherited widgets, but in that case my suggestion doesn't even make sense.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I'm realizing now tht this will completely avoid copying the hash maps and my comment below doesn't make sense either. Sorry!

I'm trying to think if there's some way we can test this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WDYT of making this class public and just adding some unit tests for it?

That way we could have some tests that are more targetted and won't just break because someone refactors this and just starts getting errors about some weird inherited widget.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, can do!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

// trees that contain a significant number of inherited widgets by deferring
// expensive map allocations until they are needed.
class _InheritedLookup {
_InheritedLookup? parent;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can parent change? it's marked mutable here but there's no effort to invalidate the cache when it changes.

I don't see any code that changes it after the initial creation, which suggests maybe it should be final and set in the constructor?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@Hixie
Copy link
Contributor

Hixie commented Jan 10, 2022

This looks great!

Given the sensitivity of these changes, it might make sense to separate out the sidecar into its own PR so that it gets validated against all the benchmarks independently.

@jonahwilliams
Copy link
Contributor Author

Removed hashcode change, will split that out elsewhere

@dnfield
Copy link
Contributor

dnfield commented Jan 14, 2022

The VM has a new fast path for copying LinkedHashMap (https://dart-review.googlesource.com/c/sdk/+/222249). That hasn't rolled into the framework yet, but once it does we might get more benefit from changing all the HashMap instances to Map.

@dnfield
Copy link
Contributor

dnfield commented Jan 14, 2022

When It ried ot test that patch locally I still saw profiling get hung up in list copies, but it's possible the VM can further optimize those as well...

@jonahwilliams
Copy link
Contributor Author

quadadic space is quadradic space

@jonahwilliams
Copy link
Contributor Author

but actually, we could improve web pretty dramatically with LinkedHashMap since it is possible to get an es6 map out of it. Follow up patch

@dnfield
Copy link
Contributor

dnfield commented Jan 14, 2022

I don't mean instead of this patch, I mean in addition :)

@jonahwilliams
Copy link
Contributor Author

With hashCode removed, the benchmark runs in 725ms instead of 685 ms. still an improvement, but a smaller one

@jonahwilliams jonahwilliams marked this pull request as ready for review January 14, 2022 21:29
@flutter-dashboard
Copy link

It looks like this pull request may not have tests. Please make sure to add tests before merging. If you need an exemption to this rule, contact Hixie on the #hackers channel in Chat.

If you are not sure if you need tests, consider this rule of thumb: the purpose of a test is to make sure someone doesn't accidentally revert the fix. Ask yourself, is there anything in your PR that you feel it is important we not accidentally revert back to how it was before your fix?

Reviewers: Read the Tree Hygiene page and make sure this patch meets those guidelines before LGTMing.

@jonahwilliams jonahwilliams changed the title [WIP] Do not eagerly allocate inherited widget caches when initializing element tree Do not eagerly allocate inherited widget caches when initializing element tree Jan 14, 2022
@jonahwilliams
Copy link
Contributor Author

hashcode change is #96644

Copy link
Contributor

@dnfield dnfield left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm

@fluttergithubbot
Copy link
Contributor

This pull request is not suitable for automatic merging in its current state.

@fluttergithubbot fluttergithubbot merged commit ba2f2c6 into flutter:master Jan 18, 2022
engine-flutter-autoroll added a commit to engine-flutter-autoroll/plugins that referenced this pull request Jan 18, 2022
@jonahwilliams jonahwilliams deleted the lazy_init_inherited branch January 18, 2022 19:09
engine-flutter-autoroll added a commit to engine-flutter-autoroll/plugins that referenced this pull request Feb 4, 2022
clocksmith pushed a commit to clocksmith/flutter that referenced this pull request Mar 8, 2022
jonahwilliams pushed a commit that referenced this pull request Mar 15, 2022
Comment on lines +5256 to +5257
_inheritedLookup = InheritedTreeCache(_parent?._inheritedLookup)
..[widget.runtimeType] = this;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: In case we try this again, every time you make an InheritedTreeCache you add an item. We might as well do that as part of the constructor instead of calling 2 methods.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

framework flutter/packages/flutter repository. See also f: labels.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants