Skip to content

Use strconv.AppendQuote instead of strconv.Quote for message formatting#413

Merged
k8s-ci-robot merged 1 commit intokubernetes:mainfrom
astef:astef-optimize_quote
Oct 17, 2024
Merged

Use strconv.AppendQuote instead of strconv.Quote for message formatting#413
k8s-ci-robot merged 1 commit intokubernetes:mainfrom
astef:astef-optimize_quote

Conversation

@astef
Copy link
Copy Markdown

@astef astef commented Oct 1, 2024

What this PR does / why we need it:

I've noticed that we can reduce memory allocations during log message formatting by preallocating a fixed-size buffer and using strconv.AppendQuote instead of always dynamically allocate memory in strconv.Quote.

Size of the buffer - 1024 - is chosen arbitrarily, it's just looks like the size, which may be enough got >99% of the logs and not too big to be a memory concern for goroutine stack. For cases, when it won't be enough, append inside strconv.AppendQuote will reallocate it to heap, and all gains will be lost.

Another improvement is the reduced time we're holding the buffer taken from the pool by doing message quotation before getting the buffer from the pool. It's hard to estimate the effect of that, but it's definetely not worse.

In order to measure the effect, I've ran go test -benchmem -run=^$ -bench ^BenchmarkKlogOutput$ k8s.io/klog/v2 before and after the change and got the following results:

NAME BEFORE BEFORE AFTER AFTER
BenchmarkKlogOutput/call_depth-16 352 B/op 7 allocs/op 336 B/op 6 allocs/op
BenchmarkKlogOutput/cyclic_list-16 686 B/op 11 allocs/op 675 B/op 10 allocs/op
BenchmarkKlogOutput/empty_WithValues-16 280 B/op 5 allocs/op 272 B/op 4 allocs/op
BenchmarkKlogOutput/Error()_for_nil-16 793 B/op 10 allocs/op 777 B/op 9 allocs/op
BenchmarkKlogOutput/Error()_that_panics-16 408 B/op 7 allocs/op 392 B/op 6 allocs/op
BenchmarkKlogOutput/handle_integer_keys-16 761 B/op 15 allocs/op 737 B/op 14 allocs/op
BenchmarkKlogOutput/handle_odd-numbers_of_Kvs_in_both_log_values_and_Info_args-16 825 B/op 14 allocs/op 809 B/op 13 allocs/op
BenchmarkKlogOutput/handle_odd-numbers_of_KVs-16 416 B/op 8 allocs/op 392 B/op 7 allocs/op
BenchmarkKlogOutput/html_characters-16 344 B/op 8 allocs/op 336 B/op 7 allocs/op
BenchmarkKlogOutput/ignore_MarshalJSON-16 344 B/op 7 allocs/op 336 B/op 6 allocs/op
BenchmarkKlogOutput/klog.Format-16 641 B/op 14 allocs/op 625 B/op 13 allocs/op
BenchmarkKlogOutput/KObj-16 336 B/op 6 allocs/op 328 B/op 5 allocs/op
BenchmarkKlogOutput/KObjs-16 416 B/op 6 allocs/op 408 B/op 5 allocs/op
BenchmarkKlogOutput/KObjSlice_int_arg-16 368 B/op 6 allocs/op 360 B/op 5 allocs/op
BenchmarkKlogOutput/KObjSlice_ints-16 408 B/op 7 allocs/op 400 B/op 6 allocs/op
BenchmarkKlogOutput/KObjSlice_nil_arg-16 320 B/op 6 allocs/op 312 B/op 5 allocs/op
BenchmarkKlogOutput/KObjSlice_nil_entry-16 352 B/op 6 allocs/op 344 B/op 5 allocs/op
BenchmarkKlogOutput/KObjSlice_okay-16 376 B/op 6 allocs/op 360 B/op 5 allocs/op
BenchmarkKlogOutput/log_with_multiple_names_and_values-16 512 B/op 12 allocs/op 504 B/op 11 allocs/op
BenchmarkKlogOutput/log_with_name_and_values-16 480 B/op 12 allocs/op 472 B/op 11 allocs/op
BenchmarkKlogOutput/log_with_values-16 344 B/op 7 allocs/op 336 B/op 6 allocs/op
BenchmarkKlogOutput/map_keys-16 456 B/op 11 allocs/op 440 B/op 10 allocs/op
BenchmarkKlogOutput/map_values-16 577 B/op 14 allocs/op 569 B/op 13 allocs/op
BenchmarkKlogOutput/MarshalLog()_for_nil-16 560 B/op 8 allocs/op 536 B/op 7 allocs/op
BenchmarkKlogOutput/MarshalLog()_that_panics-16 480 B/op 9 allocs/op 456 B/op 8 allocs/op
BenchmarkKlogOutput/MarshalLog()_that_returns_itself-16 360 B/op 6 allocs/op 328 B/op 5 allocs/op
BenchmarkKlogOutput/multiple_WithValues-16 1811 B/op 30 allocs/op 1779 B/op 26 allocs/op
BenchmarkKlogOutput/odd_WithValues-16 1474 B/op 27 allocs/op 1402 B/op 24 allocs/op
BenchmarkKlogOutput/other_vmodule-16 0 B/op 0 allocs/op 0 B/op 0 allocs/op
BenchmarkKlogOutput/override_single_value-16 344 B/op 7 allocs/op 336 B/op 6 allocs/op
BenchmarkKlogOutput/override_WithValues-16 512 B/op 14 allocs/op 512 B/op 13 allocs/op
BenchmarkKlogOutput/preserve_order_of_key/value_pairs-16 961 B/op 15 allocs/op 953 B/op 14 allocs/op
BenchmarkKlogOutput/print_duplicate_keys_in_arguments-16 416 B/op 8 allocs/op 408 B/op 7 allocs/op
BenchmarkKlogOutput/quotation-16 384 B/op 7 allocs/op 368 B/op 6 allocs/op
BenchmarkKlogOutput/regular_error_types_as_value-16 344 B/op 7 allocs/op 336 B/op 6 allocs/op
BenchmarkKlogOutput/regular_error_types_when_using_logr.Error-16 312 B/op 6 allocs/op 304 B/op 5 allocs/op
BenchmarkKlogOutput/slice_values-16 392 B/op 6 allocs/op 376 B/op 5 allocs/op
BenchmarkKlogOutput/String()_for_nil-16 849 B/op 11 allocs/op 825 B/op 10 allocs/op
BenchmarkKlogOutput/String()_that_panics-16 464 B/op 8 allocs/op 440 B/op 7 allocs/op
BenchmarkKlogOutput/struct_keys-16 680 B/op 14 allocs/op 664 B/op 13 allocs/op
BenchmarkKlogOutput/struct_values-16 360 B/op 6 allocs/op 344 B/op 5 allocs/op
BenchmarkKlogOutput/verbosity_disabled-16 0 B/op 0 allocs/op 0 B/op 0 allocs/op
BenchmarkKlogOutput/verbosity_enabled-16 296 B/op 5 allocs/op 280 B/op 4 allocs/op
BenchmarkKlogOutput/vmodule-16 440 B/op 6 allocs/op 312 B/op 4 allocs/op

Special notes for your reviewer:

Release note:

Reduce memory allocations for log messages shorter than 1024 bytes.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Oct 1, 2024
@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Oct 1, 2024
@k8s-ci-robot
Copy link
Copy Markdown

Welcome @astef!

It looks like this is your first PR to kubernetes/klog 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/klog has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Oct 1, 2024
@pohly
Copy link
Copy Markdown

pohly commented Oct 1, 2024

/assign

This looks useful. I'll have a look when time permits.

@dgrisonnet
Copy link
Copy Markdown
Member

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Oct 3, 2024
Comment thread klog.go Outdated
// message and put the multi-line output into a value.
b.WriteString(strconv.Quote(msg))
qMsgArr := [1024]byte{}
qMsg := qMsgArr[:0]
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried a similar optimization elsewhere recently and learned that make([]byte, 0, 1024) automatically avoids allocations. Apparently the compiler is doing some escape analysis and then uses the stack. This makes the code simpler. I don't know whether that is something that can be relied upon.

What do you think?

I tried the patch below on top of yours. It gave me the exact same allocations and similar runtime.

diff --git a/klog.go b/klog.go
index 63b12df..a5997e0 100644
--- a/klog.go
+++ b/klog.go
@@ -812,8 +812,7 @@ func (l *loggingT) printS(err error, s severity.Severity, depth int, msg string,
 	// The message is always quoted, even if it contains line breaks.
 	// If developers want multi-line output, they should use a small, fixed
 	// message and put the multi-line output into a value.
-	qMsgArr := [1024]byte{}
-	qMsg := qMsgArr[:0]
+	qMsg := make([]byte, 0, 1024)
 	qMsg = strconv.AppendQuote(qMsg, msg)
 
 	// Only create a new buffer if we don't have one cached.
diff --git a/klogr_slog.go b/klogr_slog.go
index e61bce1..0010e93 100644
--- a/klogr_slog.go
+++ b/klogr_slog.go
@@ -63,8 +63,7 @@ func slogOutput(file string, line int, now time.Time, err error, s severity.Seve
 	}
 
 	// See printS.
-	qMsgArr := [1024]byte{}
-	qMsg := qMsgArr[:0]
+	qMsg := make([]byte, 0, 1024)
 	qMsg = strconv.AppendQuote(qMsg, msg)
 
 	b := buffer.GetBuffer()
diff --git a/textlogger/textlogger.go b/textlogger/textlogger.go
index d37451b..7aaad6a 100644
--- a/textlogger/textlogger.go
+++ b/textlogger/textlogger.go
@@ -117,8 +117,7 @@ func (l *tlogger) printWithInfos(file string, line int, now time.Time, err error
 	// The message is always quoted, even if it contains line breaks.
 	// If developers want multi-line output, they should use a small, fixed
 	// message and put the multi-line output into a value.
-	qMsgArr := [1024]byte{}
-	qMsg := qMsgArr[:0]
+	qMsg := make([]byte, 0, 1024)
 	qMsg = strconv.AppendQuote(qMsg, msg)
 
 	// Only create a new buffer if we don't have one cached.
$ benchstat /tmp/pr /tmp/pr-modified
goos: linux
goarch: amd64
pkg: k8s.io/klog/v2
cpu: Intel(R) Core(TM) i9-7980XE CPU @ 2.60GHz
                                                                         │   /tmp/pr    │          /tmp/pr-modified          │
                                                                         │    sec/op    │    sec/op     vs base              │
KlogOutput/call_depth-36                                                   2.685µ ±  6%   2.761µ ±  6%  +2.83% (p=0.009 n=6)
KlogOutput/KObjs-36                                                        2.848µ ±  5%   2.711µ ±  8%       ~ (p=0.121 n=6)
KlogOutput/MarshalLog()_for_nil-36                                         3.498µ ± 12%   3.606µ ±  8%       ~ (p=0.589 n=6)
KlogOutput/Error()_that_panics-36                                          2.682µ ± 11%   2.781µ ± 10%       ~ (p=0.937 n=6)
KlogOutput/other_vmodule-36                                                170.9n ±  2%   171.1n ±  2%       ~ (p=1.000 n=6)
KlogOutput/log_with_multiple_names_and_values-36                           3.139µ ±  6%   3.254µ ±  7%       ~ (p=0.258 n=6)
KlogOutput/handle_odd-numbers_of_KVs_in_both_log_values_and_Info_args-36   4.234µ ±  8%   4.226µ ±  5%       ~ (p=0.699 n=6)
KlogOutput/KObjSlice_ints-36                                               2.631µ ±  7%   2.759µ ± 10%       ~ (p=0.394 n=6)
KlogOutput/vmodule-36                                                      3.053µ ± 10%   3.067µ ±  9%       ~ (p=0.699 n=6)
KlogOutput/log_with_values-36                                              2.223µ ±  6%   2.246µ ±  3%       ~ (p=0.818 n=6)
KlogOutput/override_single_value-36                                        2.151µ ± 13%   2.370µ ±  8%       ~ (p=0.065 n=6)
KlogOutput/KObj-36                                                         2.313µ ±  7%   2.219µ ± 11%       ~ (p=0.180 n=6)
KlogOutput/KObjSlice_nil_arg-36                                            2.123µ ±  6%   2.035µ ± 15%       ~ (p=0.394 n=6)
KlogOutput/MarshalLog()_that_panics-36                                     3.225µ ±  9%   3.400µ ±  9%       ~ (p=0.132 n=6)
KlogOutput/cyclic_list-36                                                  150.7µ ±  1%   163.5µ ±  4%  +8.52% (p=0.002 n=6)
KlogOutput/verbosity_enabled-36                                            1.881µ ±  7%   1.955µ ±  7%       ~ (p=0.513 n=6)
KlogOutput/KObjSlice_okay-36                                               2.413µ ±  4%   2.390µ ±  8%       ~ (p=0.394 n=6)
KlogOutput/KObjSlice_int_arg-36                                            2.335µ ±  5%   2.401µ ± 12%       ~ (p=0.394 n=6)
KlogOutput/quotation-36                                                    2.526µ ±  5%   2.459µ ± 13%       ~ (p=0.240 n=6)
KlogOutput/odd_WithValues-36                                               8.681µ ±  7%   9.211µ ±  8%       ~ (p=0.310 n=6)
KlogOutput/print_duplicate_keys_in_arguments-36                            2.622µ ±  8%   2.535µ ± 13%       ~ (p=0.937 n=6)
KlogOutput/KObjSlice_nil_entry-36                                          2.301µ ±  4%   2.318µ ±  9%       ~ (p=0.485 n=6)
KlogOutput/regular_error_types_as_value-36                                 2.328µ ±  3%   2.366µ ±  4%       ~ (p=0.240 n=6)
KlogOutput/struct_keys-36                                                  3.987µ ±  9%   3.957µ ± 10%       ~ (p=0.699 n=6)
KlogOutput/slice_values-36                                                 2.773µ ±  9%   2.716µ ±  4%       ~ (p=0.665 n=6)
KlogOutput/verbosity_disabled-36                                           14.41n ±  0%   15.41n ±  3%  +6.94% (p=0.002 n=6)
KlogOutput/handle_integer_keys-36                                          4.398µ ±  7%   4.307µ ±  5%       ~ (p=0.558 n=6)
KlogOutput/map_values-36                                                   3.877µ ±  5%   3.957µ ±  5%       ~ (p=0.132 n=6)
KlogOutput/struct_values-36                                                2.613µ ±  4%   2.470µ ±  9%       ~ (p=0.093 n=6)
KlogOutput/klog.Format-36                                                  4.295µ ±  6%   4.336µ ±  8%       ~ (p=0.589 n=6)
KlogOutput/String()_that_panics-36                                         3.247µ ± 11%   3.178µ ± 10%       ~ (p=0.589 n=6)
KlogOutput/empty_WithValues-36                                             1.832µ ±  7%   1.855µ ±  9%       ~ (p=0.699 n=6)
KlogOutput/preserve_order_of_key/value_pairs-36                            5.068µ ±  6%   5.189µ ±  3%       ~ (p=0.132 n=6)
KlogOutput/handle_odd-numbers_of_KVs-36                                    2.687µ ± 10%   2.727µ ±  7%       ~ (p=0.394 n=6)
KlogOutput/MarshalLog()_that_returns_itself-36                             2.447µ ±  9%   2.283µ ±  5%       ~ (p=0.132 n=6)
KlogOutput/map_keys-36                                                     2.993µ ±  7%   2.993µ ± 10%       ~ (p=1.000 n=6)
KlogOutput/override_WithValues-36                                          3.285µ ±  7%   3.016µ ±  9%  -8.20% (p=0.041 n=6)
KlogOutput/multiple_WithValues-36                                          11.07µ ±  6%   11.38µ ± 11%       ~ (p=1.000 n=6)
KlogOutput/html_characters-36                                              2.271µ ±  8%   2.379µ ±  8%       ~ (p=0.065 n=6)
KlogOutput/ignore_MarshalJSON-36                                           2.288µ ±  3%   2.262µ ±  6%       ~ (p=0.818 n=6)
KlogOutput/regular_error_types_when_using_logr.Error-36                    2.096µ ±  8%   2.084µ ± 10%       ~ (p=0.937 n=6)
KlogOutput/Error()_for_nil-36                                              6.308µ ± 11%   6.685µ ±  6%  +5.97% (p=0.015 n=6)
KlogOutput/String()_for_nil-36                                             6.581µ ±  6%   6.428µ ±  6%       ~ (p=0.974 n=6)
KlogOutput/log_with_name_and_values-36                                     3.017µ ±  4%   2.969µ ±  2%       ~ (p=0.457 n=6)
geomean                                                                    2.795µ         2.815µ        +0.71%

                                                                         │    /tmp/pr     │           /tmp/pr-modified           │
                                                                         │      B/op      │     B/op      vs base                │
KlogOutput/call_depth-36                                                     337.0 ± 0%       337.0 ± 0%       ~ (p=1.000 n=6) ¹
...

That was with Go 1.23. I also tried with 1.19, which gave me similar results, but not exactly the same. It used less memory per operation and in some cases significantly more. I see those difference also with your variant of the optimization, so this particular make optimization was already supported in 1.19.

There are probably other differences at play here - I haven't compared compiler versions without this optimization.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I totally agree with your patch. In fact, I've noticed the same thing independently in another similar case, but forgot to update this PR. Updated, thanks!

Copy link
Copy Markdown

@pohly pohly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you squash into one commit and add a short explanation why the code works, i.e. add a remark about make([]byte, 0, 1024) not allocating from the heap?

Comment thread .gitignore Outdated
# Vscode files
.vscode

testdata/ No newline at end of file
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove this change.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Oct 17, 2024
@astef astef force-pushed the astef-optimize_quote branch from 2ff1062 to 8f936bb Compare October 17, 2024 06:30
@astef
Copy link
Copy Markdown
Author

astef commented Oct 17, 2024

Can you squash into one commit and add a short explanation why the code works, i.e. add a remark about make([]byte, 0, 1024) not allocating from the heap?

Done. Note, that history was rewritten due to force-push

@pohly
Copy link
Copy Markdown

pohly commented Oct 17, 2024

I'm fine with rewriting history in a PR. But please format the commit message according to https://www.kubernetes.dev/docs/guide/pull-requests/#commit-message-guidelines

Optimization is accomplished by preallocating a fixed-size buffer and
using strconv.AppendQuote instead of always dynamically allocating
memory in strconv.Quote. make([]byte, 0, 1024) doesn't escape to heap at
least in Go 1.19 and 1.23
Signed-off-by: Alexandr Stefurishin <[email protected]>
@astef astef force-pushed the astef-optimize_quote branch from 8f936bb to a1604b5 Compare October 17, 2024 09:47
Copy link
Copy Markdown

@pohly pohly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

Thanks!

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 17, 2024
@k8s-ci-robot
Copy link
Copy Markdown

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: astef, pohly

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants