Use strconv.AppendQuote instead of strconv.Quote for message formatting#413
Conversation
|
Welcome @astef! |
|
/assign This looks useful. I'll have a look when time permits. |
|
/triage accepted |
| // message and put the multi-line output into a value. | ||
| b.WriteString(strconv.Quote(msg)) | ||
| qMsgArr := [1024]byte{} | ||
| qMsg := qMsgArr[:0] |
There was a problem hiding this comment.
I tried a similar optimization elsewhere recently and learned that make([]byte, 0, 1024) automatically avoids allocations. Apparently the compiler is doing some escape analysis and then uses the stack. This makes the code simpler. I don't know whether that is something that can be relied upon.
What do you think?
I tried the patch below on top of yours. It gave me the exact same allocations and similar runtime.
diff --git a/klog.go b/klog.go
index 63b12df..a5997e0 100644
--- a/klog.go
+++ b/klog.go
@@ -812,8 +812,7 @@ func (l *loggingT) printS(err error, s severity.Severity, depth int, msg string,
// The message is always quoted, even if it contains line breaks.
// If developers want multi-line output, they should use a small, fixed
// message and put the multi-line output into a value.
- qMsgArr := [1024]byte{}
- qMsg := qMsgArr[:0]
+ qMsg := make([]byte, 0, 1024)
qMsg = strconv.AppendQuote(qMsg, msg)
// Only create a new buffer if we don't have one cached.
diff --git a/klogr_slog.go b/klogr_slog.go
index e61bce1..0010e93 100644
--- a/klogr_slog.go
+++ b/klogr_slog.go
@@ -63,8 +63,7 @@ func slogOutput(file string, line int, now time.Time, err error, s severity.Seve
}
// See printS.
- qMsgArr := [1024]byte{}
- qMsg := qMsgArr[:0]
+ qMsg := make([]byte, 0, 1024)
qMsg = strconv.AppendQuote(qMsg, msg)
b := buffer.GetBuffer()
diff --git a/textlogger/textlogger.go b/textlogger/textlogger.go
index d37451b..7aaad6a 100644
--- a/textlogger/textlogger.go
+++ b/textlogger/textlogger.go
@@ -117,8 +117,7 @@ func (l *tlogger) printWithInfos(file string, line int, now time.Time, err error
// The message is always quoted, even if it contains line breaks.
// If developers want multi-line output, they should use a small, fixed
// message and put the multi-line output into a value.
- qMsgArr := [1024]byte{}
- qMsg := qMsgArr[:0]
+ qMsg := make([]byte, 0, 1024)
qMsg = strconv.AppendQuote(qMsg, msg)
// Only create a new buffer if we don't have one cached.$ benchstat /tmp/pr /tmp/pr-modified
goos: linux
goarch: amd64
pkg: k8s.io/klog/v2
cpu: Intel(R) Core(TM) i9-7980XE CPU @ 2.60GHz
│ /tmp/pr │ /tmp/pr-modified │
│ sec/op │ sec/op vs base │
KlogOutput/call_depth-36 2.685µ ± 6% 2.761µ ± 6% +2.83% (p=0.009 n=6)
KlogOutput/KObjs-36 2.848µ ± 5% 2.711µ ± 8% ~ (p=0.121 n=6)
KlogOutput/MarshalLog()_for_nil-36 3.498µ ± 12% 3.606µ ± 8% ~ (p=0.589 n=6)
KlogOutput/Error()_that_panics-36 2.682µ ± 11% 2.781µ ± 10% ~ (p=0.937 n=6)
KlogOutput/other_vmodule-36 170.9n ± 2% 171.1n ± 2% ~ (p=1.000 n=6)
KlogOutput/log_with_multiple_names_and_values-36 3.139µ ± 6% 3.254µ ± 7% ~ (p=0.258 n=6)
KlogOutput/handle_odd-numbers_of_KVs_in_both_log_values_and_Info_args-36 4.234µ ± 8% 4.226µ ± 5% ~ (p=0.699 n=6)
KlogOutput/KObjSlice_ints-36 2.631µ ± 7% 2.759µ ± 10% ~ (p=0.394 n=6)
KlogOutput/vmodule-36 3.053µ ± 10% 3.067µ ± 9% ~ (p=0.699 n=6)
KlogOutput/log_with_values-36 2.223µ ± 6% 2.246µ ± 3% ~ (p=0.818 n=6)
KlogOutput/override_single_value-36 2.151µ ± 13% 2.370µ ± 8% ~ (p=0.065 n=6)
KlogOutput/KObj-36 2.313µ ± 7% 2.219µ ± 11% ~ (p=0.180 n=6)
KlogOutput/KObjSlice_nil_arg-36 2.123µ ± 6% 2.035µ ± 15% ~ (p=0.394 n=6)
KlogOutput/MarshalLog()_that_panics-36 3.225µ ± 9% 3.400µ ± 9% ~ (p=0.132 n=6)
KlogOutput/cyclic_list-36 150.7µ ± 1% 163.5µ ± 4% +8.52% (p=0.002 n=6)
KlogOutput/verbosity_enabled-36 1.881µ ± 7% 1.955µ ± 7% ~ (p=0.513 n=6)
KlogOutput/KObjSlice_okay-36 2.413µ ± 4% 2.390µ ± 8% ~ (p=0.394 n=6)
KlogOutput/KObjSlice_int_arg-36 2.335µ ± 5% 2.401µ ± 12% ~ (p=0.394 n=6)
KlogOutput/quotation-36 2.526µ ± 5% 2.459µ ± 13% ~ (p=0.240 n=6)
KlogOutput/odd_WithValues-36 8.681µ ± 7% 9.211µ ± 8% ~ (p=0.310 n=6)
KlogOutput/print_duplicate_keys_in_arguments-36 2.622µ ± 8% 2.535µ ± 13% ~ (p=0.937 n=6)
KlogOutput/KObjSlice_nil_entry-36 2.301µ ± 4% 2.318µ ± 9% ~ (p=0.485 n=6)
KlogOutput/regular_error_types_as_value-36 2.328µ ± 3% 2.366µ ± 4% ~ (p=0.240 n=6)
KlogOutput/struct_keys-36 3.987µ ± 9% 3.957µ ± 10% ~ (p=0.699 n=6)
KlogOutput/slice_values-36 2.773µ ± 9% 2.716µ ± 4% ~ (p=0.665 n=6)
KlogOutput/verbosity_disabled-36 14.41n ± 0% 15.41n ± 3% +6.94% (p=0.002 n=6)
KlogOutput/handle_integer_keys-36 4.398µ ± 7% 4.307µ ± 5% ~ (p=0.558 n=6)
KlogOutput/map_values-36 3.877µ ± 5% 3.957µ ± 5% ~ (p=0.132 n=6)
KlogOutput/struct_values-36 2.613µ ± 4% 2.470µ ± 9% ~ (p=0.093 n=6)
KlogOutput/klog.Format-36 4.295µ ± 6% 4.336µ ± 8% ~ (p=0.589 n=6)
KlogOutput/String()_that_panics-36 3.247µ ± 11% 3.178µ ± 10% ~ (p=0.589 n=6)
KlogOutput/empty_WithValues-36 1.832µ ± 7% 1.855µ ± 9% ~ (p=0.699 n=6)
KlogOutput/preserve_order_of_key/value_pairs-36 5.068µ ± 6% 5.189µ ± 3% ~ (p=0.132 n=6)
KlogOutput/handle_odd-numbers_of_KVs-36 2.687µ ± 10% 2.727µ ± 7% ~ (p=0.394 n=6)
KlogOutput/MarshalLog()_that_returns_itself-36 2.447µ ± 9% 2.283µ ± 5% ~ (p=0.132 n=6)
KlogOutput/map_keys-36 2.993µ ± 7% 2.993µ ± 10% ~ (p=1.000 n=6)
KlogOutput/override_WithValues-36 3.285µ ± 7% 3.016µ ± 9% -8.20% (p=0.041 n=6)
KlogOutput/multiple_WithValues-36 11.07µ ± 6% 11.38µ ± 11% ~ (p=1.000 n=6)
KlogOutput/html_characters-36 2.271µ ± 8% 2.379µ ± 8% ~ (p=0.065 n=6)
KlogOutput/ignore_MarshalJSON-36 2.288µ ± 3% 2.262µ ± 6% ~ (p=0.818 n=6)
KlogOutput/regular_error_types_when_using_logr.Error-36 2.096µ ± 8% 2.084µ ± 10% ~ (p=0.937 n=6)
KlogOutput/Error()_for_nil-36 6.308µ ± 11% 6.685µ ± 6% +5.97% (p=0.015 n=6)
KlogOutput/String()_for_nil-36 6.581µ ± 6% 6.428µ ± 6% ~ (p=0.974 n=6)
KlogOutput/log_with_name_and_values-36 3.017µ ± 4% 2.969µ ± 2% ~ (p=0.457 n=6)
geomean 2.795µ 2.815µ +0.71%
│ /tmp/pr │ /tmp/pr-modified │
│ B/op │ B/op vs base │
KlogOutput/call_depth-36 337.0 ± 0% 337.0 ± 0% ~ (p=1.000 n=6) ¹
...That was with Go 1.23. I also tried with 1.19, which gave me similar results, but not exactly the same. It used less memory per operation and in some cases significantly more. I see those difference also with your variant of the optimization, so this particular make optimization was already supported in 1.19.
There are probably other differences at play here - I haven't compared compiler versions without this optimization.
There was a problem hiding this comment.
I totally agree with your patch. In fact, I've noticed the same thing independently in another similar case, but forgot to update this PR. Updated, thanks!
pohly
left a comment
There was a problem hiding this comment.
Can you squash into one commit and add a short explanation why the code works, i.e. add a remark about make([]byte, 0, 1024) not allocating from the heap?
| # Vscode files | ||
| .vscode | ||
|
|
||
| testdata/ No newline at end of file |
2ff1062 to
8f936bb
Compare
Done. Note, that history was rewritten due to force-push |
|
I'm fine with rewriting history in a PR. But please format the commit message according to https://www.kubernetes.dev/docs/guide/pull-requests/#commit-message-guidelines |
Optimization is accomplished by preallocating a fixed-size buffer and using strconv.AppendQuote instead of always dynamically allocating memory in strconv.Quote. make([]byte, 0, 1024) doesn't escape to heap at least in Go 1.19 and 1.23 Signed-off-by: Alexandr Stefurishin <[email protected]>
8f936bb to
a1604b5
Compare
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: astef, pohly The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
What this PR does / why we need it:
I've noticed that we can reduce memory allocations during log message formatting by preallocating a fixed-size buffer and using
strconv.AppendQuoteinstead of always dynamically allocate memory instrconv.Quote.Size of the buffer -
1024- is chosen arbitrarily, it's just looks like the size, which may be enough got >99% of the logs and not too big to be a memory concern for goroutine stack. For cases, when it won't be enough,appendinsidestrconv.AppendQuotewill reallocate it to heap, and all gains will be lost.Another improvement is the reduced time we're holding the buffer taken from the pool by doing message quotation before getting the buffer from the pool. It's hard to estimate the effect of that, but it's definetely not worse.
In order to measure the effect, I've ran
go test -benchmem -run=^$ -bench ^BenchmarkKlogOutput$ k8s.io/klog/v2before and after the change and got the following results:Special notes for your reviewer:
Release note: