Work around the glibc bug that causes rare Chrome crashes by yjbanov · Pull Request #67466 · flutter/flutter

yjbanov · 2020-10-06T23:16:06Z

Description

Due to https://sourceware.org/bugzilla/show_bug.cgi?id=19329, once every few thousand launches Chrome fails to start, crashing with exit code 127.

This PR works around that issue by checking for an error message specific to that bug and relaunching Chrome. With this change, a test run that launched Chrome 18000 times was 100% green; previously, that many attempts would be 100% red.
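The detect-and-relaunch idea can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: `launchWithRetry`, `kGlibcError`, the `launch` callback (a stand-in for starting Chrome and exposing its stderr as lines), and the `maxAttempts` cap are all hypothetical names and assumptions introduced for the example.

```dart
import 'dart:async';

// Hypothetical constant; the PR matches this text in Chrome's stderr output.
const String kGlibcError = 'Inconsistency detected by ld.so';

/// Calls [launch] until its stderr lines no longer contain [kGlibcError].
///
/// [launch] stands in for starting the browser and returning its stderr as a
/// stream of lines. Returns the 1-based attempt number that succeeded.
/// The [maxAttempts] cap is an assumption made for this sketch.
Future<int> launchWithRetry(
  Stream<String> Function() launch, {
  int maxAttempts = 3,
}) async {
  for (int attempt = 1; attempt <= maxAttempts; attempt++) {
    // `any` completes with true as soon as one line matches the marker.
    final bool hitGlibcBug =
        await launch().any((String line) => line.contains(kGlibcError));
    if (!hitGlibcBug) {
      return attempt; // Launched without hitting the glibc assertion.
    }
  }
  throw StateError('Browser failed to launch after $maxAttempts attempts.');
}
```

A caller would pass a closure that starts the browser and returns its decoded stderr lines. Unlike this sketch, which reads stderr to the end, the diff hunks below stop reading once a line starting with `DevTools listening` arrives.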

Other changes:

- When in verbose mode, print the version of Chromium.
- Change try_builders.json to always run tests whenever anything under packages/ changes.

Related Issues

Fixes #67454

christopherfujino · 2020-10-07T19:35:20Z

dev/try_builders.json

         "task_name":"linux_web_tests",
         "enabled":true,
-         "run_if":["dev/", "packages/flutter/", "packages/flutter_goldens_client/", "packages/flutter_goldens/", "packages/flutter_test/", "packages/flutter_tools/lib/src/test/", "packages/flutter_web_plugins/", "bin/"]
+         "run_if":["dev/", "packages/", "bin/"]


is this right?

Depends on the definition of "right" :)

ahh, I read this as "dev/packages/bin", lol, disregard

jonahwilliams · 2020-10-07T19:35:08Z

packages/flutter_tools/lib/src/web/chrome.dart


+  Future<Process> _spawnChromiumProcess(List<String> args) async {
+    // Keep attempting to launch the browser until one of:
+    // - We successfully launched it, in which case we just return from the loop.


jonahwilliams · 2020-10-07T19:35:35Z

packages/flutter_tools/lib/src/web/chrome.dart

+        .transform(const LineSplitter())
+        .map((String line) {
+          _logger.printTrace('[CHROME]:$line');
+          if (line.contains('Inconsistency detected by ld.so')) {


Consider hoisting this into a constant, along with the documentation you added above.

jonahwilliams · 2020-10-07T19:35:58Z

packages/flutter_tools/lib/src/web/chrome.dart

+              'Will try launching browser again.',
+            );
+            return null;
+          } else {


else isn't necessary due to return above

jonahwilliams · 2020-10-07T19:37:19Z

packages/flutter_tools/lib/src/web/chrome.dart

+        // A precaution that avoids accumulating browser processes, in case the
+        // glibc bug doesn't cause the browser to quit and we keep looping and
+        // launching more processes.
+        process.kill();


Check if this is successful. What happens if the tool fails to kill it?

How does the tool normally do this check? I can't find an example. I'm not sure what to do if it fails to kill it. I added it only as a precaution. I think it's OK if we fail. That would probably indicate that the system is in a completely busted and unrecoverable state.

I don't have any advice :)

Instead of using the exited flag, it's a little more idiomatic to put a timeout on process.exitCode, and then do process.kill() if it doesn't finish in time. After sending the kill, if stdout might be useful for debugging, you'll want to explicitly drain that stream.

jonahwilliams · 2020-10-07T19:37:45Z

packages/flutter_tools/lib/src/web/chrome.dart

+
+      if (!hitGlibcBug) {
+        return process;
+      } else if (!exited) {


if (!exited) { on the next line instead of else if

jonahwilliams · 2020-10-07T19:38:25Z

packages/flutter_tools/lib/src/web/chrome.dart

+            );
+            return null;
+          } else {
+            throw ToolExit('Failed to spawn: ${args.join(' ')}');


throwToolExit.

Generally I would stick with a more actionable message here; the args themselves are more like debugging info.

jonahwilliams

LGTM with nit

jonahwilliams · 2020-10-07T21:14:23Z

packages/flutter_tools/lib/src/web/chrome.dart

+      final Process process = await _processManager.start(args);
+
+      bool exited = false;
+      unawaited(process.exitCode.then((int exitCode) {


since you don't use the exitCode, use whenComplete(() { ... })

jonahwilliams · 2020-10-07T21:14:59Z

packages/flutter_tools/lib/src/web/chrome.dart

+            return null;
+          }
+          _logger.printTrace('Failed to launch browser. Command used to launch it: ${args.join(' ')}');
+          throw ToolExit(


throwToolExit

Can't use throwToolExit because the function must end with a return or throw.

Would it help if we made throwToolExit return Never?

We cannot do that until we migrate to null safety
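For context, a small sketch of what a `Never` return type would allow once null safety (Dart 2.12 or later) is available. The `ToolExit` class and `throwToolExit` function here are minimal local stand-ins written for this example, not the flutter_tools definitions, and `devToolsLineOrExit` is a hypothetical helper.

```dart
/// Minimal local stand-in for the tool's exception type (an assumption).
class ToolExit implements Exception {
  ToolExit(this.message);

  final String message;

  @override
  String toString() => 'ToolExit: $message';
}

/// Declared to return `Never`, so the analyzer knows control flow ends here.
Never throwToolExit(String message) {
  throw ToolExit(message);
}

/// Hypothetical helper: returns the first DevTools line or exits the tool.
String devToolsLineOrExit(List<String> lines) {
  for (final String line in lines) {
    if (line.startsWith('DevTools listening')) {
      return line;
    }
  }
  // No trailing `return` or `throw` is needed after this call, because a
  // call to a `Never`-returning function marks the rest as unreachable.
  throwToolExit('Failed to launch browser.');
}
```

With a `void` return type instead, the analyzer would require an explicit `throw` or `return` at the end of `devToolsLineOrExit`, which is the limitation described above.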

zanderso · 2020-10-07T21:41:48Z

packages/flutter_tools/lib/src/web/chrome.dart

+        })
+        .firstWhere((String line) => line.startsWith('DevTools listening'), orElse: () {
+          if (hitGlibcBug) {
+            _logger.printStatus(


The user probably can't do too much with this message, so maybe it should just be a printTrace.

zanderso · 2020-10-07T22:01:49Z

packages/flutter_tools/lib/src/web/chrome.dart

+        // A precaution that avoids accumulating browser processes, in case the
+        // glibc bug doesn't cause the browser to quit and we keep looping and
+        // launching more processes.
+        process.kill();


Instead of using the exited flag, it's a little more idiomatic to put a timeout on process.exitCode, and then do process.kill() if it doesn't finish in time. After sending the kill, if stdout might be useful for debugging, you'll want to explicitly drain that stream.
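A minimal sketch of the timeout-then-kill pattern suggested here, assuming nothing else is listening to the process's stdout or stderr. The helper name `exitOrKill` and the `grace` parameter are hypothetical; the code that landed in the PR may differ.

```dart
import 'dart:async';
import 'dart:io';

/// Waits up to [grace] for [process] to exit on its own, then kills it.
///
/// Returns the exit code if the process exited in time, or null if a kill
/// had to be attempted. Assumes the output streams have no other listener.
Future<int?> exitOrKill(
  Process process, {
  Duration grace = const Duration(seconds: 1),
}) async {
  try {
    return await process.exitCode.timeout(grace);
  } on TimeoutException {
    // `kill` returns false when the signal could not be delivered.
    final bool signalSent = process.kill();
    if (!signalSent) {
      stderr.writeln('Could not signal process ${process.pid}.');
      return null;
    }
    // Drain the pipes so buffered output is consumed and they can close.
    await process.stdout.drain<void>();
    await process.stderr.drain<void>();
    return null;
  }
}
```

Calling `await exitOrKill(process)` after a failed launch attempt replaces a hand-maintained `exited` flag with a single awaited result. Note that the drain calls only complete once the pipes close, so a process that ignores the signal would still block this helper.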

zanderso · 2020-10-07T22:04:02Z

packages/flutter_tools/lib/src/web/chrome.dart

+      // Wait until the DevTools are listening before trying to connect. This is
+      // only required for flutter_test --platform=chrome and not flutter run.
+      bool hitGlibcBug = false;
+      await process.stderr


This can throw dart:io exceptions if the process fails in certain ways/at certain times. Ditto for process.stdout which is another reason to do .listen().asFuture() and await the Future in a try {} catch {} for it.
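The `.listen().asFuture()` shape mentioned here could look roughly like the sketch below. `readLinesSafely` is a hypothetical helper, and catching `Exception` broadly is an assumption standing in for whichever dart:io exceptions the stream might surface.

```dart
import 'dart:async';
import 'dart:convert';

/// Collects decoded lines from a byte stream such as `process.stderr`.
///
/// Stream errors are caught instead of escaping as unhandled async errors;
/// the lines read before the error are still returned.
Future<List<String>> readLinesSafely(Stream<List<int>> bytes) async {
  final List<String> lines = <String>[];
  try {
    await bytes
        .transform(utf8.decoder)
        .transform(const LineSplitter())
        .listen(lines.add)
        // Completes when the stream is done, or with the first stream error.
        .asFuture<void>();
  } on Exception {
    // Swallowed for this sketch; a real tool would likely log the error.
  }
  return lines;
}
```

Awaiting the future inside `try`/`catch` gives one place to handle failures from the stream, which is the benefit described in the comment above.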

This sounds like a good future enhancement. I'd prefer to wait until it actually happens, so we can recreate the exact situation in a test. I ran this code 18000 times and it worked fine.

yjbanov · 2020-10-07T22:52:57Z

@zanderso

Instead of using the exited flag, it's a little more idiomatic to put a timeout on process.exitCode, and then do process.kill() if it doesn't finish in time. After sending the kill, if stdout might be useful for debugging, you'll want to explicitly drain that stream.

Done.

yjbanov · 2020-10-08T02:29:31Z

Landing on stuck "Google testing". Google is using ddr instead of flutter tools for this anyway.

dnfield · 2020-10-08T05:39:36Z

packages/flutter_tools/lib/src/web/chrome.dart

+      // glibc bug doesn't cause the browser to quit and we keep looping and
+      // launching more processes.
+      unawaited(process.exitCode.timeout(const Duration(seconds: 1), onTimeout: () {
+        process.kill();


Drive-by comment: if this code ever has to run on Windows, it might fail to kill the process and fail to block execution, leading to accumulation of processes.

What's a good way to do this on Windows?

There isn't one, unfortunately. As far as I can tell, Dart does not offer a way to reliably kill processes on Windows. If you want to be really paranoid about it, you can try to sleep/poll for when the process actually dies, but if the process is really stuck (as in, stuck to the point that you have to reboot the machine), that might never end.

Windows doesn't quite have a "kill -9" idea.

yjbanov · 2020-10-14T16:56:25Z

To follow up, this fix seems to have been effective.

stress test chrome launch

b78c4fc

flutter-dashboard bot added the tool and work in progress; do not review labels Oct 6, 2020

google-cla bot added the cla: yes label Oct 6, 2020

yjbanov added 9 commits October 6, 2020 16:22

always run tests

4e64187

disable uninteresting builders/

d6522e3

headless

575b169

watch browser process

b1a625e

retry chrome launch

e4a440f

add tests

54c1004

undo test command hack

9e6c1c5

undo more testing stuff

0063f5b

remove stray await

4369dcb

yjbanov changed the title ~~WIP: watch premature chrome exits~~ Work around the glibc bug that causes rare Chrome crashes Oct 7, 2020

flutter-dashboard bot added the c: contributor-productivity label Oct 7, 2020

yjbanov requested a review from jonahwilliams October 7, 2020 19:31

yjbanov marked this pull request as ready for review October 7, 2020 19:31

yjbanov mentioned this pull request Oct 7, 2020

Web test timing out from after two hours #67454

Closed

yjbanov requested a review from godofredoc October 7, 2020 19:33

christopherfujino reviewed Oct 7, 2020

jonahwilliams reviewed Oct 7, 2020

address comments

88e6db5

jonahwilliams approved these changes Oct 7, 2020

address comments

8a5fadd

zanderso reviewed Oct 7, 2020

yjbanov requested review from christopherfujino and zanderso October 7, 2020 23:02

address comments

ca6a779

yjbanov added waiting for tree to go green and removed work in progress; do not review labels Oct 7, 2020

yjbanov merged commit 0b78110 into flutter:master Oct 8, 2020

dnfield reviewed Oct 8, 2020

yjbanov mentioned this pull request Dec 4, 2020

Copy the glibc bug fix to devicelab/integration tests #71705

Merged

yjbanov mentioned this pull request Feb 24, 2021

[web] integration tests flaky due to chrome failing to start #71508

Closed

yjbanov mentioned this pull request Oct 10, 2023

Inconsistency detected by ld.so: ../elf/dl-tls.c: 493: _dl_allocate_tls_init: Assertion `listp->slotinfo[cnt].gen <= GL(dl_tls_generation)' failed! #67761

Closed
