Made flag for debugging build time of user created widgets#100926
Made flag for debugging build time of user created widgets#100926gaaclarke merged 9 commits intoflutter:masterfrom
Conversation
943dc64 to
d9196f9
Compare
There was a problem hiding this comment.
Faster would be an identity hash from
Map<Uri, bool>
That will be significantly faster and as these objects are all const, you can trust identity hashes to be safe to use.
There was a problem hiding this comment.
This would require changing the kernel transform to write Uri's into the _Location fields. We wouldn't want to parse the Uri in order to look up the cached result. That would potentially wipe out our wins from the identityHash. Since this is a complicated change, the kernel transform doesn't have any tests, and the performance written this way isn't bad, I'll leave that as a future optimization.
There was a problem hiding this comment.
A couple notes:
there are multiple tests in widget_inspector_test.dart that E2E test the kernel transformer. Adding some more tests there would be good. I'll add one to ensure the _Location fields are still const. If they are not, that is a serious bug.
Actually I was confused. The URI class isn't const so that wouldn't work as the _Location objects need to be const.
|
I measured the BUILD times of the Flutter Gallery app to make sure we aren't introducing a lot of overhead by turning on track-widget-creation on in profile builds. The Flutter Gallery has 3 BUILD's before the first LAYOUT. Here are the results:
This was tested on an iOS device. At first blush there doesn't appear to be a significant change in BUILD times as a result of tracking widget creation. |
|
I ran a performance test with
(Tested with the flutter gallery on an iOS device) |
|
I switched to a HashMap since that should technically be faster and take up less memory. With the same gallery test I got 6.5ms 2.1ms 2.3ms. |
d9196f9 to
4f940c7
Compare
36b9514 to
193c845
Compare
|
@Hixie I chose you to review this since you've done work on the Timeline events and we talked about this feature in the past. |
3d4f903 to
a59bf80
Compare
There was a problem hiding this comment.
what performance impact does this have?
it changes the behaviour of constant widgets, which seems significant, for one
There was a problem hiding this comment.
Please do not land this until we've answered this question. Ideally, we'd know how much overhead just this line adds, compared to the rest of this patch, as measured on a low end Android device with a large application like customer: money.
There was a problem hiding this comment.
it changes the behaviour of constant widgets, which seems significant, for one
Good point. As far as the Gallery is concerned, the change was negligible in local testing. I'm open to suggestions to answering this question more definitively. I can write a microbenchmark but what would be a reasonable way to exercise this? A widget tree of 1000 const Widgets?
If this is a problem we can tweak track-widget-creation to store its data outside of the Widget.
There was a problem hiding this comment.
The change to const behavior would be much smaller if you stored a single bool with each widget indicating whether it is created from the local object or not. That way the only changes in const behavior would be cases where the same exact const widget instance was created by two packages one that happened to be local and one that didn't.
You can't track widget creation data outside the widgets and have it work for const widgets. You could track it outside the widgets for non-const widgets but the performance impact would be significantly higher due to the extra expando/hashmap lookup.
|
I implemented the approximation and set the sampling rate to 2,000 hz. Then I ran the stocks microbenchmark with the flag on comparing it with baseline dart:
Before the change we were at Here is the patch to dart: diff --git a/runtime/vm/os_macos.cc b/runtime/vm/os_macos.cc
index eed635df0d2..5a4c52dbe79 100644
--- a/runtime/vm/os_macos.cc
+++ b/runtime/vm/os_macos.cc
@@ -105,7 +105,14 @@ int64_t OS::GetCurrentMonotonicMicros() {
return GetCurrentMonotonicTicks() / kNanosecondsPerMicrosecond;
}
-int64_t OS::GetCurrentThreadCPUMicros() {
+namespace {
+struct ThreadTimeInfo {
+ int64_t threshold = 0;
+ int64_t last_thread_time = 0;
+ int64_t last_clock_time = 0;
+};
+
+int64_t ExpensiveGetCurrentThreadCPUMicros() {
mach_msg_type_number_t count = THREAD_BASIC_INFO_COUNT;
thread_basic_info_data_t info_data;
thread_basic_info_t info = &info_data;
@@ -121,6 +128,28 @@ int64_t OS::GetCurrentThreadCPUMicros() {
return thread_cpu_micros;
}
+int64_t approximate(const ThreadTimeInfo* info, int64_t clock_time) {
+ return (clock_time - info->last_clock_time) + info->last_thread_time;
+}
+
+thread_local ThreadTimeInfo* s_info = nullptr;
+} // namespace
+
+int64_t OS::GetCurrentThreadCPUMicros() {
+ int64_t clock_time = GetCurrentTimeMicros();
+ if (s_info == nullptr) {
+ s_info = new ThreadTimeInfo();
+ }
+ if (clock_time > s_info->threshold) {
+ s_info->last_clock_time = clock_time;
+ s_info->last_thread_time = ExpensiveGetCurrentThreadCPUMicros();
+ s_info->threshold = clock_time + 500;
+ return s_info->last_thread_time;
+ } else {
+ return approximate(s_info, clock_time);
+ }
+}
+
int64_t OS::GetCurrentThreadCPUMicrosForTimeline() {
return OS::GetCurrentThreadCPUMicros();
} |
|
@dnfield nevermind the approximation question, let's land this turned off and I can follow up later about turning it on with improvements to Timeline in a different PR. |
@rmacnak-google might have a better way of doing this that doesn't involve approximating thread time and is quite a bit faster (quick results show ~3x improvement on some simple C programs when using the current |
c5290c9 to
4dc7094
Compare
4dc7094 to
23ff72e
Compare
|
I just tested It didn't lower the percentage of overhead. It just dropped the overall numbers. I'm going to walk away from this now that the low hanging fruit has been explored. |

Turning on widget creation tracking for profile builds allows us to do performance evaluation techniques that wouldn't scale across all widgets. In this case we are using it to track build time for widgets on the CPU but it will also allow us to start tracking raster performance of widgets (cc @iskakaushik)
Added overhead (profile builds):
constto reduce this)_isLocalCreationLocation, ~5 bytes per source file used in the app that constructs widgets.buildfixes #100928
Pre-launch Checklist
///).If you need help, consider asking for advice on the #hackers-new channel on Discord.