Do not eagerly allocate inherited widget caches when initializing element tree#95596
Conversation
|
FYI @dnfield / @goderbauer |
|
I bet you could get even better performance by using IdentityHashMaps, which IIRC are just regular JS maps when compiled for that? |
|
Interesting approach and those numbers do look like a nice improvement. Do the benchmark numbers in the PR description include the improvements achieved by removing hashCode? Or does that improve the stated numbers even more? Did you run this by any chance against one of our benchmarks that represents the real world a little better (e.g. the transitions perf benchmark, for example)? Curious to see if this would show up there as well. |
|
It includes hashCode, I could break that out too though.
no not yet - do we have the ability to run the devicelab on a PR yet 😅 ? The biggest issue is that my pixel 4 is too fast and I don't have a slower device |
@CaseyHillers Does that exit? :) |
|
That said I can still try - what is your favorite benchmark? I would expect this would improve worst dart frame times, but not much else |
|
ahh you said transitions perf, will try |
|
From running the new_gallery transition perf on a pixel 4: With change: (x3) "average_frame_build_time_millis": 3.23 / 3.37 / 3.46 ToT (x3) "average_frame_build_time_millis": 3.32 / 3.25 / 3.31 I would run it more but it takes a while. So the numbers here aren't super clear - the 99th time is sometimes better, though average times aren't much different. I'd also let @dnfield have a go at it to see if it is worth anything on the low end device |
You can run the hot reload tests if you make changes to |
| // expensive map allocations until they are needed. | ||
| class _InheritedLookup { | ||
| _InheritedLookup? parent; | ||
| final HashMap<Type, InheritedElement> current = HashMap<Type, InheritedElement>(); |
There was a problem hiding this comment.
I kind of wonder if there's some way we can rework this so that there's only a single HashMap per depth of the tree, rather than having every element get one of these. It could live on the build owner and you could basically say "get me the inherited element for my current depth". You'd need to update it when an inherited element gets a new depth.
Does that make sense?
There was a problem hiding this comment.
Don't we have fewer hashmaps than per depth of the tree today? since we push down the hashmap created by an inherited element to all non-inherited elements
There was a problem hiding this comment.
I think you're right - I was thinking about cases where you might have a multi-child widget that has multiple sibling children that are inherited widgets, but in that case my suggestion doesn't even make sense.
There was a problem hiding this comment.
Also, I'm realizing now tht this will completely avoid copying the hash maps and my comment below doesn't make sense either. Sorry!
I'm trying to think if there's some way we can test this.
There was a problem hiding this comment.
WDYT of making this class public and just adding some unit tests for it?
That way we could have some tests that are more targetted and won't just break because someone refactors this and just starts getting errors about some weird inherited widget.
There was a problem hiding this comment.
Sure, can do!
| // trees that contain a significant number of inherited widgets by deferring | ||
| // expensive map allocations until they are needed. | ||
| class _InheritedLookup { | ||
| _InheritedLookup? parent; |
There was a problem hiding this comment.
Can parent change? it's marked mutable here but there's no effort to invalidate the cache when it changes.
I don't see any code that changes it after the initial creation, which suggests maybe it should be final and set in the constructor?
|
This looks great! Given the sensitivity of these changes, it might make sense to separate out the sidecar into its own PR so that it gets validated against all the benchmarks independently. |
|
Removed hashcode change, will split that out elsewhere |
|
The VM has a new fast path for copying LinkedHashMap (https://dart-review.googlesource.com/c/sdk/+/222249). That hasn't rolled into the framework yet, but once it does we might get more benefit from changing all the |
|
When It ried ot test that patch locally I still saw profiling get hung up in list copies, but it's possible the VM can further optimize those as well... |
|
quadadic space is quadradic space |
|
but actually, we could improve web pretty dramatically with LinkedHashMap since it is possible to get an es6 map out of it. Follow up patch |
|
I don't mean instead of this patch, I mean in addition :) |
|
With hashCode removed, the benchmark runs in 725ms instead of 685 ms. still an improvement, but a smaller one |
|
It looks like this pull request may not have tests. Please make sure to add tests before merging. If you need an exemption to this rule, contact Hixie on the #hackers channel in Chat. If you are not sure if you need tests, consider this rule of thumb: the purpose of a test is to make sure someone doesn't accidentally revert the fix. Ask yourself, is there anything in your PR that you feel it is important we not accidentally revert back to how it was before your fix? Reviewers: Read the Tree Hygiene page and make sure this patch meets those guidelines before LGTMing. |
|
hashcode change is #96644 |
|
This pull request is not suitable for automatic merging in its current state.
|
| _inheritedLookup = InheritedTreeCache(_parent?._inheritedLookup) | ||
| ..[widget.runtimeType] = this; |
There was a problem hiding this comment.
Nit: In case we try this again, every time you make an InheritedTreeCache you add an item. We might as well do that as part of the constructor instead of calling 2 methods.
Currently, creating an inherited element causes it to allocate a hashmap containing itself and all parent elements, by copying the parent inherited element cache. This has roughly quadradic complexity with respect to the number of inherited elements in the widget tree, and quadradic space for the hashmap allocations. The purpose of this cache is to ensure that the
ofpattern and similar remain as fast as possible, since widgets frequently look up many inherited widgets in their build methods.However, I believe that most of these caches are ending up unused, and hence we are wasting space/cpu time allocating that space. Suppose you have a handful of inherited elements at the top level of your widget tree (Directionality, Localizations, State management something). The only inherited element cache that will be used is the "last" one, and all of the others are wasted.
We should be able to improve the performance by making this inherited element cache lazy. Instead of eagerly copying the map, each element will store a parent pointer and cache the inherited element at the lowest level inherited element the first time a type is requested. This should have roughly the same complexity, but much closer to linear space.
As a bit of a sidecar, I remove the hashCode override on element. As far as I can tell, the goal of this is avoid big ints/mints due to boxing concerns and allocation - but the default hashCode is identity hashCode, which is stored on the object itself and requires no additional allocations. Removing this further improved the benchmark below.
---- Performance Measurements ---
To test this theory I ran a benchmark supplied by @dnfield which repeatedly runs first builds.
On a Pixel 4 without this patch, this ran at 774 ms with a standard deviation of 31 ms. With this patch it ran in 685 ms with a standard deviation of 24ms.
For web builds, the time improved from roughly 2.4 seconds to 2.1 seconds. I did not compute any further stats for this.