MINOR: Standardize KRaft logging, thread names, and terminology#13390
MINOR: Standardize KRaft logging, thread names, and terminology#13390cmccabe merged 2 commits intoapache:trunkfrom
Conversation
9f8b554 to
8eec949
Compare
There was a problem hiding this comment.
We typically separate the node id via a dash. Have you tried to be consistent with what we do outside of kraft? That would help when debugging the system.
There was a problem hiding this comment.
I fired up this branch and yea BrokerLifecycleManager1EventHandler is not pretty :)
I'm not sure we need node ID in the thread name since we will presumably know which node a thread dump came from. It might actually be confusing since the convention we have is {thread name}-{thread number in pool} like "metrics-meter-tick-thread-1", "metrics-meter-tick-thread-2", "data-plane-kafka-request-handler-0", "data-plane-kafka-request-handler-1", etc. What about something like broker-lifecycle-manager-event-handler or BrokerLifecycleManager-EventHandler
I also noticed somewhere we are not prefixing the event queue thread name, there is just an EventHandler thread.
There was a problem hiding this comment.
Check the replica fetcher threads for an example where we do include source id in the thread name:
val threadName = s"${prefix}ReplicaFetcherThread-$fetcherId-${sourceBroker.id}-${fetcherPool.name}"
There was a problem hiding this comment.
What is name here and where is it supplied?
There was a problem hiding this comment.
I fired up this branch and yea BrokerLifecycleManager1EventHandler is not pretty :)
I'm not sure we need node ID in the thread name since we will presumably know which node a thread dump came from. It might actually be confusing since the convention we have is {thread name}-{thread number in pool} like "metrics-meter-tick-thread-1", "metrics-meter-tick-thread-2", "data-plane-kafka-request-handler-0", "data-plane-kafka-request-handler-1", etc. What about something like broker-lifecycle-manager-event-handler or BrokerLifecycleManager-EventHandler
I also noticed somewhere we are not prefixing the event queue thread name, there is just an EventHandler thread.
There was a problem hiding this comment.
I think the idea of the LogContext stuff was to use key/value pairs so it's easy to enrich the log context. cc @jason for additional thoughts.
There was a problem hiding this comment.
Yes, this use-case is a bit of a hack. (A hack that this PR didn't add!)
Basically we're forced to use the old Scala Logging trait unless we want to rewrite all the BrokerServer stuff. But we want a nice-looking log prefix.
|
So about the thread names thing. I’m open to changing the thread names to “kebab case” (i.e. I do think in the context of JUnit we definitely need to have So then the big question becomes whether we would want the prefixes in prod or not. The "pro" case is that it simplifies the code to just unconditionally do that, and avoid cases where someone accidentally forgets to set the prefix. The con case is that we should know what node we’re on, so the information is redundant. Although I’ve seen people do weird things like combine several process backtraces into one file or send ZK and Kafka logs all to the same file. So I don’“t truly believe the “we’ll never need it” case. Maybe “we rarely need it” or “we won’t need it if people are reasonable” |
|
@cmccabe, I see your point about the node ID when debugging tests -- it can be annoying to not know which broker instance a thread belongs to. You're kebab case examples look good to me 👍 Did you find where the lone |
4194393 to
88d25c8
Compare
88d25c8 to
9978458
Compare
|
With the latest code, I got the following threads in KRaft mode: |
Standardize KRaft thread names. - Always use kebab case. That is, "my-thread-name". - Thread prefixes are just strings, not Option[String] or Optional<String>. If you don't want a prefix, use the empty string. - Thread prefixes end in a dash (except the empty prefix). Then you can calculate thread names as $prefix + "my-thread-name" - Broker-only components get "broker-$id-" as a thread name prefix. For example, "broker-1-" - Controller-only components get "controller-$id-" as a thread name prefix. For example, "controller-1-" - Shared components get "kafka-$id-" as a thread name prefix. For example, "kafka-0-" - Always pass a prefix to KafkaEventQueue, so that threads have names like "broker-0-metadata-loader-event-handler" rather than "event-handler". Prior to this PR, we had several threads just named "EventHandler" which was not helpful for debugging. - QuorumController thread name is "quorum-controller-123-event-handler" - Don't set a thread prefix for replication threads started by ReplicaManager. They run only on the broker, and already include the broker ID. Standardize KRaft slf4j log prefixes. - Names should be of the form "[ComponentName id=$id] ". So for a ControllerServer with ID 123, we will have "[ControllerServer id=123] " - For the QuorumController class, use the prefix "[QuorumController id=$id] " rather than "[Controller <nodeId] ", to make it clearer that this is a KRaft controller. - In BrokerLifecycleManager, add isZkBroker=true to the log prefix for the migration case. Standardize KRaft terminology. - All synonyms of combined mode (colocated, coresident, etc.) should be replaced by "combined" - All synonyms of isolated mode (remote, non-colocated, distributed, etc.) should be replaced by "isolated".
9978458 to
229bce6
Compare
|
Thanks, @mumrah . I did another quick sweep of the thread names, like you did, and deleted the extra dash in one case, and added a dash in another. There are still a few ugly and/or unprefixed names, but this is at least a good start. I also spot checked a log file. I do think the new way is more readable |
Sorry, I meant to respond to this comment earlier. But for completeness, the lone EventHandler is gone now, last I checked. Might have been from MetadataLoader, but I fixed a few so I can't recall |
Standardize KRaft thread names.
Always use kebab case. That is, "my-thread-name".
Thread prefixes are just strings, not Option[String] or Optional.
If you don't want a prefix, use the empty string.
Thread prefixes end in a dash (except the empty prefix). Then you can
calculate thread names as $prefix + "my-thread-name"
Broker-only components get "broker-$id-" as a thread name prefix. For example, "broker-1-"
Controller-only components get "controller-$id-" as a thread name prefix. For example, "controller-1-"
Shared components get "kafka-$id-" as a thread name prefix. For example, "kafka-0-"
Always pass a prefix to KafkaEventQueue, so that threads have names like
"broker-0-metadata-loader-event-handler" rather than "event-handler". Prior to this PR, we had
several threads just named "EventHandler" which was not helpful for debugging.
QuorumController thread name is "quorum-controller-123-event-handler"
Don't set a thread prefix for replication threads started by ReplicaManager. They run only on the
broker, and already include the broker ID.
Standardize KRaft slf4j log prefixes.
Names should be of the form "[ComponentName id=$id] ". So for a ControllerServer with ID 123, we
will have "[ControllerServer id=123] "
For the QuorumController class, use the prefix "[QuorumController id=$id] " rather than
"[Controller <nodeId] ", to make it clearer that this is a KRaft controller.
In BrokerLifecycleManager, add isZkBroker=true to the log prefix for the migration case.
Standardize KRaft terminology.
All synonyms of combined mode (colocated, coresident, etc.) should be replaced by "combined"
All synonyms of isolated mode (remote, non-colocated, distributed, etc.) should be replaced by
"isolated".