Use thread names in logs and deadlock diagnostics#13099
Use thread names in logs and deadlock diagnostics#13099jamesob wants to merge 3 commits intobitcoin:masterfrom
Conversation
src/threadnames.cpp
Outdated
There was a problem hiding this comment.
Is there somewhere more general or useful I should put the two functions in this namespace?
src/threadnames.cpp
Outdated
There was a problem hiding this comment.
Remove or uncomment this block before merge.
957d3aa to
fcfba12
Compare
src/threadnames.cpp
Outdated
src/threadnames.cpp
Outdated
There was a problem hiding this comment.
This could be a static function or a function in a anonymous namespace.
There was a problem hiding this comment.
Argument could be const std::string& name?
There was a problem hiding this comment.
I'd like to be able to unittest the *Process* functions eventually, so maybe I should just make them public.
src/threadnames.cpp
Outdated
There was a problem hiding this comment.
This could be a static function or a function in a anonymous namespace.
src/threadnames.cpp
Outdated
There was a problem hiding this comment.
Shoud also add to m_name_to_id?
There was a problem hiding this comment.
Good catch, thanks.
src/threadnames.h
Outdated
|
Thoughts on pulling the |
|
Concept ACK |
fcfba12 to
8379557
Compare
|
How does the use of boost::thread::id interact with eventual boost::thread removal? IIRC there was something about no guarantees being made that you can interact with boost::thread if the thread was started by std, though it usually works in practice. We do have a HAVE_THREAD_LOCAL defined, which should be set everywhere but OSX, so maybe we could just turn it on there if boost::thread::id is an issue? |
src/threadnames.h
Outdated
| private: | ||
| std::mutex m_map_lock; | ||
| std::map<boost::thread::id, std::string> m_id_to_name; | ||
| std::map<std::string, boost::thread::id> m_name_to_id; |
There was a problem hiding this comment.
Bit of a nit, but maybe m_name_to_id could instead be a map<string, int> - where it just stores the occurrences of that name/prefix, then you don't have to call CountMapPrefix, but can just update the count each time.
There was a problem hiding this comment.
Thanks for the look and great suggestion. I think maintaining a map of counts will also allow me to remove all of the utility functions in the anonymous namespace - nice catch!
|
Concept ACK- just a comment on how the maps are used. |
1d1ce9a to
bddf05a
Compare
|
@TheBlueMatt good points, thanks for looking. I've pushed changes which
(jamesob/threadnames.1 -> jamesob/threadnames.3) This changeset is ready for review-for-merge IMO. |
|
@theuni What's the status of |
bddf05a to
c065728
Compare
|
I've pushed a small change removing $ diff -uw <(git diff master..threadnames.3) <(git diff master..threadnames.4)
--- /proc/self/fd/12 2018-05-01 14:56:59.561313752 -0400
+++ /proc/self/fd/13 2018-05-01 14:56:59.549313568 -0400
@@ -2215,10 +2215,10 @@
void reset();
diff --git a/src/threadnames.cpp b/src/threadnames.cpp
new file mode 100644
-index 0000000..305c3b4
+index 0000000..92e3f75
--- /dev/null
+++ b/src/threadnames.cpp
-@@ -0,0 +1,135 @@
+@@ -0,0 +1,133 @@
+// Copyright (c) 2018 The Bitcoin Core developers
+// Distributed under the MIT software license, see the accompanying
+// file COPYING or http://www.opensource.org/licenses/mit-license.php.
@@ -2345,8 +2345,6 @@
+
+#if defined(PR_GET_NAME)
+ ::prctl(PR_GET_NAME, pthreadname_buff);
-+#elif (defined(__FreeBSD__) || defined(__OpenBSD__) || defined(__DragonFly__))
-+ pthread_get_name_np(pthread_self(), pthreadname_buff);
+#elif defined(MAC_OSX)
+ pthread_getname_np(pthread_self(), pthreadname_buff, sizeof(threadname_buff));
+#else |
75e9902 to
66fcbdc
Compare
|
After spending the day figuring out that |
66fcbdc to
911f54e
Compare
Abstracts system thread name setting and tracks thread names in memory. Allows automatic thread numbering for reused names (e.g. http.0, http.1, ...). NB: as-written, this removes the `bitcoin-` prefix from child threads. Incorporates feedback from @conscott @TheBlueMatt @sipa.
911f54e to
b09e25b
Compare
|
Per @sipa's advice, I'm now using |
|
If platform-specific functions are being used anyway, why not just use pthread_getspecific/pthread_setspecific, which are meant for exactly this? That also makes it a trivial transition when we're finally ready to use thread_local. |
|
@theuni what are you talking about replacing at this point? We're now platform agnostic with |
|
@jamesob See the top two commits here for a quick attempt: https://github.com/theuni/bitcoin/commits/thread_names That addresses a few things that make me hesitate about this PR:
I realize that my changes are oversimplified, it's just a quick POC for discussion. |
|
@theuni I think your simple implementation looks great. Very much agree with your concern re: map lookup and locking overhead, and I think there's a way to combine our approaches to avoid it. Let me know if this sounds right: IMO lock acquisition during the single If that sounds good to you, I'll cherry-pick your two commits onto this branch and reorganize accordingly. Otherwise if you think the numeric suffix isn't worthwhile, I'm happy to close this out and proceed on your commits only. |
|
@jamesob Looking again, I think handling the suffix at this layer is unnecessary and adds a significant amount of complication. Why not just add a suffix before passing the name into TraceThread() ? |
|
@theuni ugh, you're right, that's dumb. Let's go off of your commits. |
|
From IRC:
|
While trying to debug the apparent deadlock revealed in #12873, I realized it'd be nice to have the
POTENTIAL DEADLOCK DETECTEDmessage display which thread acquired each lock. I also think it'd be generally useful to have log lines display the name of the originating thread.This changeset does both of those things by introducing a class which manages thread naming,
ThreadNameRegistry. The class abstracts process-control calls responsible for thread naming and provides automatic number suffixing to threads which use the same name (e.g.httpworker.0,httpworker.1, ...).With this patch, thread names look like this
and the debug log looks like this
Note that child thread names have changed; I'm no longer using the
bitcoin-prefix. Because we're limited to 16 characters for thread name (on Linux, anyway), that prefix was causing the numeric suffix some names have to be hidden. If it's desirable to keep the prefix, I can revert this change.Implementation considerations
A basic version of this change could be done pretty trivially with something like
thread_local std::string threadname;but per @theuni (#11722 (review)), we can't rely on having
thread_local. Also note that with a basic implementation like that, we wouldn't be able to do the automatic numbering scheme to differentiate between threads with the same basename (e.g.httpworker,scriptch).We could also rely solely on thread-related system calls. I don't like this much either because of how poorly defined or unavailable
getnameis on some platforms, e.g. OS X, FreeBSD; not to mention the 16 character limit.Tests
If this gets a Concept ACK or two, I'll implement some.Unittests attached.