XmlLayout - Faster XML encoding with INoAllocationStringValueRenderer#6094
XmlLayout - Faster XML encoding with INoAllocationStringValueRenderer#6094
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughXml escaping/sanitization was refactored to support optional incremental writes into a provided StringBuilder; XmlEncodeLayoutRendererWrapper prefers a non-allocating string-value renderer and routes rendering through builder-aware escape/CDAT A helpers; XML element/attribute rendering and unit tests were updated accordingly. Changes
Sequence Diagram(s)sequenceDiagram
participant Renderer as XmlEncodeLayoutRendererWrapper
participant Helper as XmlHelper
participant Builder as StringBuilder
Renderer->>Renderer: InitializeLayoutRenderer() (resolve _stringValueRenderer)
Renderer->>Renderer: RenderInnerAndTransform(event)
alt string-value renderer available
Renderer->>Renderer: Obtain string value (no allocation)
Renderer->>Builder: Append raw text
Renderer->>Helper: EscapeXmlWhenNeeded(Builder, startPos, xmlEncodeNewlines, removedInvalidXmlChars?)
else fallback to layout-render path
Renderer->>Builder: Inner.Render(logEvent, Builder)
Renderer->>Helper: EscapeXmlWhenNeeded(Builder, startPos, xmlEncodeNewlines)
end
Helper->>Helper: RequiresXmlEscape scan(text segment)
alt invalid XML chars found
Helper->>Builder: RemoveInvalidXmlChars(text, Builder)
end
alt escaping needed
Helper->>Builder: EscapeXmlString(text, xmlEncodeNewlines, startIndex, Builder)
end
Renderer->>Builder: Append attributes / CDATA markers as needed
Renderer->>Renderer: Continue composing output
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
4e1d0e1 to
473b4fc
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (2)
tests/NLog.UnitTests/LayoutRenderers/Wrappers/XmlEncodeTests.cs (1)
55-112: Good coverage for invalid XML characters, but consider adding a surrogate pair test.The test validates that all non-surrogate, non-forbidden characters survive encoding. However, it explicitly skips surrogates (line 74–77), which means valid surrogate pairs (e.g., emoji like
"\ud83d\ude00") are not tested. Given the surrogate handling issue inRemoveInvalidXmlIfNeeded, adding a test with valid surrogate pairs would be valuable to prevent regressions.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/NLog.UnitTests/LayoutRenderers/Wrappers/XmlEncodeTests.cs` around lines 55 - 112, Add a new test (or extend XmlEncodeBadStringTest1) that includes valid surrogate pairs to ensure RemoveInvalidXmlIfNeeded preserves them: create a string containing known valid surrogate pair(s) (e.g., "\ud83d\ude00" or other emoji), push it into the ScopeContext property "foo" and render with SimpleLayout using "${xml-encode:${scopeproperty:foo}}" and "${xml-encode:${scopeproperty:foo}:CDataEncode=true}", then assert the output is non-empty and that the surrogate pair characters remain valid XML characters (use System.Xml.XmlConvert.IsXmlChar on each resulting char or verify the original codepoint survives). Reference the existing XmlEncodeBadStringTest1, ScopeContext.PushProperty, SimpleLayout rendering, and the RemoveInvalidXmlIfNeeded behavior when adding the assertions.src/NLog/Internal/XmlHelper.cs (1)
176-194: String-basedEscapeXmlWhenNeeded— minor dead code observation.The
elsebranch at line 187 (EscapeXmlString(text, xmlEncodeNewlines, startIndex, sb)) appears unreachable: we only enter theif (startIndex >= 0)block whentextcontains at least one XML escape char (<>&'"\r\n), all of which are valid XML chars. ThereforeRemoveInvalidXmlIfNeededwill always append at least that char, ensuringorgLength != sb.Length.Not a functional issue — just a note for code clarity.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/NLog/Internal/XmlHelper.cs` around lines 176 - 194, The code in EscapeXmlWhenNeeded has an unreachable else branch after RemoveInvalidXmlIfNeeded: since startIndex >= 0 guarantees at least one XML-escape character is present, RemoveInvalidXmlIfNeeded will always increase sb.Length, so the branch that calls EscapeXmlString(text, xmlEncodeNewlines, startIndex, sb) when orgLength == sb.Length is dead; simplify by removing that else branch and always calling EscapeXmlWhenNeeded(sb, orgLength, xmlEncodeNewlines) when startIndex >= 0 (or adjust the logic to only call RemoveInvalidXmlIfNeeded and then process the sb), cleaning up the unused path and related comments while keeping references to EscapeXmlWhenNeeded, RemoveInvalidXmlIfNeeded and EscapeXmlString intact.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/NLog/Internal/XmlHelper.cs`:
- Around line 134-147: CreateValidXmlString is stripping valid UTF-16 surrogate
pairs because it checks characters one-by-one with XmlConvertIsXmlChar; change
the loop in CreateValidXmlString to detect high-surrogate + low-surrogate pairs
(use char.IsHighSurrogate(text[i]) && i + 1 < text.Length &&
char.IsLowSurrogate(text[i+1])), compute the Unicode scalar with
char.ConvertToUtf32(text, i) and call XmlConvertIsXmlChar on that codepoint, and
if valid append both chars to sb and advance i by one; otherwise fall back to
the existing single-char XmlConvertIsXmlChar check and append. Ensure you
reference the variables/text parameters used in the method (text, startIndex,
sb) and update only the loop logic to preserve existing builder behavior.
- Around line 85-99: RemoveInvalidXmlIfNeeded and CreateValidXmlString currently
drop valid surrogate pairs because they call XmlConvertIsXmlChar per-char;
update both methods to detect high-surrogate/low-surrogate pairs while iterating
(use Char.IsHighSurrogate/Char.IsLowSurrogate semantics) and, when a high
surrogate is followed by a low surrogate, treat the pair as a single valid unit
(append both to the StringBuilder or skip advancing appropriately) instead of
discarding them; ensure the loop in RemoveInvalidXmlIfNeeded checks the next
character before delegating to CreateValidXmlString and that
CreateValidXmlString appends the full surrogate pair when valid, advancing the
index by 2 in that case so emoji and supplementary Unicode are preserved for
callers like EscapeXmlWhenNeeded, EscapeCDataWhenNeeded and EscapeCData.
---
Nitpick comments:
In `@src/NLog/Internal/XmlHelper.cs`:
- Around line 176-194: The code in EscapeXmlWhenNeeded has an unreachable else
branch after RemoveInvalidXmlIfNeeded: since startIndex >= 0 guarantees at least
one XML-escape character is present, RemoveInvalidXmlIfNeeded will always
increase sb.Length, so the branch that calls EscapeXmlString(text,
xmlEncodeNewlines, startIndex, sb) when orgLength == sb.Length is dead; simplify
by removing that else branch and always calling EscapeXmlWhenNeeded(sb,
orgLength, xmlEncodeNewlines) when startIndex >= 0 (or adjust the logic to only
call RemoveInvalidXmlIfNeeded and then process the sb), cleaning up the unused
path and related comments while keeping references to EscapeXmlWhenNeeded,
RemoveInvalidXmlIfNeeded and EscapeXmlString intact.
In `@tests/NLog.UnitTests/LayoutRenderers/Wrappers/XmlEncodeTests.cs`:
- Around line 55-112: Add a new test (or extend XmlEncodeBadStringTest1) that
includes valid surrogate pairs to ensure RemoveInvalidXmlIfNeeded preserves
them: create a string containing known valid surrogate pair(s) (e.g.,
"\ud83d\ude00" or other emoji), push it into the ScopeContext property "foo" and
render with SimpleLayout using "${xml-encode:${scopeproperty:foo}}" and
"${xml-encode:${scopeproperty:foo}:CDataEncode=true}", then assert the output is
non-empty and that the surrogate pair characters remain valid XML characters
(use System.Xml.XmlConvert.IsXmlChar on each resulting char or verify the
original codepoint survives). Reference the existing XmlEncodeBadStringTest1,
ScopeContext.PushProperty, SimpleLayout rendering, and the
RemoveInvalidXmlIfNeeded behavior when adding the assertions.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/NLog/Internal/XmlHelper.cs (1)
176-194: Theelsebranch (line 187) is unreachable whenstartIndex >= 0.When
startIndex >= 0, at least one XML escape char exists intext. Since all escape chars (<,>,&,',",\r,\n) are valid XML characters,RemoveInvalidXmlIfNeededwill always preserve at least one char in the builder, soorgLength != sb.Lengthis alwaystrue. Theelsebranch callingEscapeXmlStringdirectly (bypassing invalid-char removal) can never execute in this code path.This doesn't cause a runtime issue, but it's misleading — a future reader might assume both branches are reachable and that the
elsepath intentionally skips sanitization.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/NLog/Internal/XmlHelper.cs` around lines 176 - 194, When startIndex >= 0 the branch that calls EscapeXmlString(...) is unreachable because RemoveInvalidXmlIfNeeded(...) will always change sb.Length for inputs containing escape chars; remove that misleading branch and simplify EscapeXmlWhenNeeded(string,text,...) so that after creating sb and recording orgLength you call RemoveInvalidXmlIfNeeded(text, sb) and then always invoke EscapeXmlWhenNeeded(sb, orgLength, xmlEncodeNewlines) (dropping the EscapeXmlString call), then return sb.ToString() or string.Empty as before; update references to methods EscapeXmlWhenNeeded, RemoveInvalidXmlIfNeeded, EscapeXmlString, sb, and orgLength accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@src/NLog/Internal/XmlHelper.cs`:
- Around line 85-99: RemoveInvalidXmlIfNeeded incorrectly iterates chars and
treats lone halves of valid surrogate pairs as invalid; change its loop to an
index-based scan like RemoveInvalidXmlChars using XmlConvertIsXmlChar and
XmlConvertIsXmlSurrogatePair so you advance by 2 when you encounter a valid
surrogate pair and by 1 otherwise, and only call CreateValidXmlString when you
find a truly invalid code point (passing the current index and builder). Ensure
the final fast-path still appends the full text to builder and returns text when
no invalid code points are found.
- Around line 134-147: CreateValidXmlString currently treats chars individually
and therefore drops valid UTF-16 surrogate pairs; update it to detect
high-surrogate/low-surrogate pairs and treat them as a single valid character:
when char.IsHighSurrogate(ch) and i+1 < text.Length and
char.IsLowSurrogate(text[i+1]) then validate the full pair (ensure
XmlConvertIsXmlChar for both or otherwise use XmlConvert.IsXmlChar with the
combined scalar), append both chars to sb and increment i to skip the low
surrogate; if a lone low surrogate or invalid pair is encountered, skip it. Keep
the existing behavior of appending into the provided builder (and returning
sb.ToString() only when builder is null). Ensure this fix covers callers
RemoveInvalidXmlIfNeeded and EscapeCData which delegate here.
---
Nitpick comments:
In `@src/NLog/Internal/XmlHelper.cs`:
- Around line 176-194: When startIndex >= 0 the branch that calls
EscapeXmlString(...) is unreachable because RemoveInvalidXmlIfNeeded(...) will
always change sb.Length for inputs containing escape chars; remove that
misleading branch and simplify EscapeXmlWhenNeeded(string,text,...) so that
after creating sb and recording orgLength you call
RemoveInvalidXmlIfNeeded(text, sb) and then always invoke
EscapeXmlWhenNeeded(sb, orgLength, xmlEncodeNewlines) (dropping the
EscapeXmlString call), then return sb.ToString() or string.Empty as before;
update references to methods EscapeXmlWhenNeeded, RemoveInvalidXmlIfNeeded,
EscapeXmlString, sb, and orgLength accordingly.
debba41 to
2e31025
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (2)
src/NLog/Internal/XmlHelper.cs (2)
138-163: Recursive restart fromstartPosre-scans already-verified chars; passiinstead.In the
!XmlConvertIsXmlCharbranch (Line 149), the recursive call passesstartPosrather thani. Chars fromstartPos..i-1were already confirmed to be valid XML chars that require no escaping, so re-scanning them is pure waste. Since XML special chars are single-char tokens (unlike the]]>sliding window inEscapeCDataWhenNeeded, which legitimately needs anorgLengthrestart), it's safe to passihere.♻️ Proposed change
- EscapeXmlWhenNeeded(builder, startPos, xmlEncodeNewlines, removedInvalidXmlChars: true); + EscapeXmlWhenNeeded(builder, i, xmlEncodeNewlines, removedInvalidXmlChars: true);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/NLog/Internal/XmlHelper.cs` around lines 138 - 163, In EscapeXmlWhenNeeded, the recursive restart in the branch that handles non-XML chars incorrectly passes startPos causing previously-verified chars to be re-scanned; change the recursive call EscapeXmlWhenNeeded(builder, startPos, xmlEncodeNewlines, removedInvalidXmlChars: true) to pass the current index i instead (EscapeXmlWhenNeeded(builder, i, xmlEncodeNewlines, removedInvalidXmlChars: true)) so scanning resumes at the first unprocessed character; locate this inside the !XmlConvertIsXmlChar branch where RemoveInvalidXmlChars(text, builder) is called and update the parameter to i to avoid redundant work.
165-188: Unnecessary O(n) append+truncate in the common (no-invalid-chars) path; fragileReferenceEqualscontract.Two related points:
Double-scan cost: When
startIndex >= 0and the text contains no invalid XML chars (by far the common case for well-formed log messages),RemoveInvalidXmlChars(text, sb)eagerly appends the full text tosb(Line 172), only forsb.Length = orgLengthto immediately undo it (Line 175) beforeEscapeXmlStringdoes its own O(n) scan. A cheapHasInvalidXmlCharssentinel would save the redundant allocation/copy.Fragile sentinel: Line 173 relies on
RemoveInvalidXmlCharsreturning the exact same reference (ReferenceEquals) to signal "nothing was changed". This invariant isn't documented. IfRemoveInvalidXmlCharsis ever changed to return a copy even for clean input, the logic silently falls into the wrong branch. A brief inline comment would make this safe-to-maintain.♻️ Minimal improvement: document the ReferenceEquals contract
+ // RemoveInvalidXmlChars returns the original reference when nothing was changed; + // ReferenceEquals is the sentinel for "no invalid chars found". var validText = RemoveInvalidXmlChars(text, sb); if (ReferenceEquals(validText, text))🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/NLog/Internal/XmlHelper.cs` around lines 165 - 188, EscapeXmlWhenNeeded currently does an unnecessary append+truncate and relies on ReferenceEquals from RemoveInvalidXmlChars to detect "no change"; implement an explicit check to avoid the double O(n) work and the fragile contract: add or use a HasInvalidXmlChars(text) sentinel (or change RemoveInvalidXmlChars to return a bool/tuple) and in EscapeXmlWhenNeeded first check that sentinel when startIndex >= 0 — if no invalid chars, call EscapeXmlString(text, xmlEncodeNewlines, startIndex, sb) directly (no append/trim); if there are invalid chars, call RemoveInvalidXmlChars(text, sb) and then EscapeXmlWhenNeeded(sb, orgLength, xmlEncodeNewlines, removedInvalidXmlChars: true) as before; remove the ReferenceEquals check and/or add a short comment documenting the intended contract if you keep it.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/NLog/Internal/XmlHelper.cs`:
- Around line 138-163: In EscapeXmlWhenNeeded, the recursive restart in the
branch that handles non-XML chars incorrectly passes startPos causing
previously-verified chars to be re-scanned; change the recursive call
EscapeXmlWhenNeeded(builder, startPos, xmlEncodeNewlines,
removedInvalidXmlChars: true) to pass the current index i instead
(EscapeXmlWhenNeeded(builder, i, xmlEncodeNewlines, removedInvalidXmlChars:
true)) so scanning resumes at the first unprocessed character; locate this
inside the !XmlConvertIsXmlChar branch where RemoveInvalidXmlChars(text,
builder) is called and update the parameter to i to avoid redundant work.
- Around line 165-188: EscapeXmlWhenNeeded currently does an unnecessary
append+truncate and relies on ReferenceEquals from RemoveInvalidXmlChars to
detect "no change"; implement an explicit check to avoid the double O(n) work
and the fragile contract: add or use a HasInvalidXmlChars(text) sentinel (or
change RemoveInvalidXmlChars to return a bool/tuple) and in EscapeXmlWhenNeeded
first check that sentinel when startIndex >= 0 — if no invalid chars, call
EscapeXmlString(text, xmlEncodeNewlines, startIndex, sb) directly (no
append/trim); if there are invalid chars, call RemoveInvalidXmlChars(text, sb)
and then EscapeXmlWhenNeeded(sb, orgLength, xmlEncodeNewlines,
removedInvalidXmlChars: true) as before; remove the ReferenceEquals check and/or
add a short comment documenting the intended contract if you keep it.
a820c08 to
5b8632a
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
tests/NLog.UnitTests/LayoutRenderers/Wrappers/XmlEncodeTests.cs (1)
105-115:XmlEncodeBadStringTest1CDATA path should also validate CDATA structure.The assertions only check that each output char is a valid XML char — which passes even when the CDATA wrapper is completely absent (because
<,!,[,C,D,A,T,>are all valid XML chars). This means the critical bug inEscapeCData(missing<![CDATA[...]]>wrapper when input contains invalid XML chars) is not caught here.✅ Suggested additional assertions
var goodString = l.Render(LogEventInfo.CreateNullEvent()); Assert.NotEmpty(goodString); + Assert.StartsWith("<![CDATA[", goodString); + Assert.EndsWith("]]>", goodString); foreach (char c in goodString) { Assert.True(System.Xml.XmlConvert.IsXmlChar(c), $"Invalid char {Convert.ToInt32(c)} was not removed"); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/NLog.UnitTests/LayoutRenderers/Wrappers/XmlEncodeTests.cs` around lines 105 - 115, The test XmlEncodeBadStringTest1 currently only validates that output chars are legal XML chars; update it to also assert that when CDataEncode=true the rendered string is wrapped in a proper CDATA section and the inner content is the escaped value: verify the rendered string begins with "<![CDATA[" and ends with "]]>" (using the SimpleLayout l = "${xml-encode:${scopeproperty:foo}:CDataEncode=true}" and the rendered goodString), and additionally assert the inner substring (between the wrappers) does not contain the forbidden sequence "]]>" (or that any original invalid chars were properly escaped/removed by EscapeCData). Ensure these assertions run inside the existing ScopeContext.PushProperty block that sets "foo" to badString so the test fails if the CDATA wrapper is missing or malformed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/NLog/Internal/XmlHelper.cs`:
- Around line 551-555: EscapeCData currently appends emptyCData ("<![CDATA[]]>")
to builder but always returns that non-empty string; change the return to follow
the existing pattern (return builder is null ? result : string.Empty) so that
when builder is provided the method returns string.Empty. Specifically in
EscapeCData update the early-exit branch that defines emptyCData and calls
builder?.Append(emptyCData) to return builder is null ? emptyCData :
string.Empty instead of always returning emptyCData.
- Around line 574-577: EscapeCData currently skips adding the <![CDATA[ ... ]]>
wrapper when RemoveInvalidXmlChars found invalid chars and appended sanitized
text to the StringBuilder (the path calling EscapeCDataWhenNeeded(builder,
orgLength, removedInvalidXmlChars: true)); update EscapeCData (or
EscapeCDataWhenNeeded) so that in the branch where builder != null and
removedInvalidXmlChars is true you insert the opening "<![CDATA[" at the
original orgLength position and append the closing "]]>" after processing
(ensuring EscapeCDataWhenNeeded still handles internal "]]>" sequences), or
alternatively make EscapeCDataWhenNeeded itself add these wrappers when
removedInvalidXmlChars==true so
XmlEncodeLayoutRendererWrapper.RenderInnerAndTransform receives a proper CDATA
section instead of bare sanitized text.
---
Nitpick comments:
In `@tests/NLog.UnitTests/LayoutRenderers/Wrappers/XmlEncodeTests.cs`:
- Around line 105-115: The test XmlEncodeBadStringTest1 currently only validates
that output chars are legal XML chars; update it to also assert that when
CDataEncode=true the rendered string is wrapped in a proper CDATA section and
the inner content is the escaped value: verify the rendered string begins with
"<![CDATA[" and ends with "]]>" (using the SimpleLayout l =
"${xml-encode:${scopeproperty:foo}:CDataEncode=true}" and the rendered
goodString), and additionally assert the inner substring (between the wrappers)
does not contain the forbidden sequence "]]>" (or that any original invalid
chars were properly escaped/removed by EscapeCData). Ensure these assertions run
inside the existing ScopeContext.PushProperty block that sets "foo" to badString
so the test fails if the CDATA wrapper is missing or malformed.
| { | ||
| var emptyCData = "<![CDATA[]]>"; | ||
| builder?.Append(emptyCData); | ||
| return emptyCData; | ||
| } |
There was a problem hiding this comment.
EscapeCData returns "<![CDATA[]]>" instead of string.Empty when builder is provided and text is empty.
Every other return path in this method follows return builder is null ? result : string.Empty, but the early-exit for empty/null text always returns the non-empty string.
🔧 Proposed fix
if (string.IsNullOrEmpty(text))
{
var emptyCData = "<![CDATA[]]>";
builder?.Append(emptyCData);
- return emptyCData;
+ return builder is null ? emptyCData : string.Empty;
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| { | |
| var emptyCData = "<![CDATA[]]>"; | |
| builder?.Append(emptyCData); | |
| return emptyCData; | |
| } | |
| { | |
| var emptyCData = "<![CDATA[]]>"; | |
| builder?.Append(emptyCData); | |
| return builder is null ? emptyCData : string.Empty; | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/NLog/Internal/XmlHelper.cs` around lines 551 - 555, EscapeCData currently
appends emptyCData ("<![CDATA[]]>") to builder but always returns that non-empty
string; change the return to follow the existing pattern (return builder is null
? result : string.Empty) so that when builder is provided the method returns
string.Empty. Specifically in EscapeCData update the early-exit branch that
defines emptyCData and calls builder?.Append(emptyCData) to return builder is
null ? emptyCData : string.Empty instead of always returning emptyCData.
5b8632a to
6cd080c
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/NLog/Internal/XmlHelper.cs (1)
138-163: Surrogate pairs take the expensive fallback path in theStringBuilderoverload.When
removedInvalidXmlCharsisfalse(which is the case for direct callers likeXmlEncodeLayoutRendererWrapperandXmlAttribute), a valid high surrogate at line 144 failsXmlConvertIsXmlChar, entering the expensive extract→truncate→RemoveInvalidXmlChars→recurse path.RemoveInvalidXmlCharshandles surrogate pairs correctly so correctness is preserved, but you could add a surrogate-pair short-circuit here (as done inRemoveInvalidXmlCharsat line 99) to avoid the allocation for texts containing emoji/supplementary characters.Optional: add surrogate-pair fast path
var chr = builder[i]; if (!XmlConvertIsXmlChar(chr) && !removedInvalidXmlChars) { + if (i + 1 < builderLength && XmlConvertIsXmlSurrogatePair(builder[i + 1], chr)) + { + ++i; // valid surrogate pair — skip low surrogate + continue; + } var text = builder.ToString(i, builderLength - i);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/NLog/Internal/XmlHelper.cs` around lines 138 - 163, EscapeXmlWhenNeeded currently treats a valid high surrogate as invalid (calling the expensive extract→RemoveInvalidXmlChars→recurse path) because XmlConvertIsXmlChar returns false for lone surrogates; update EscapeXmlWhenNeeded to short‑circuit when encountering a high surrogate followed by a low surrogate by detecting char.IsHighSurrogate(chr) && i+1 < builderLength && char.IsLowSurrogate(builder[i+1]) and simply advance i past the pair (continue loop) so surrogate pairs are handled cheaply; keep the existing behavior for other invalid chars and still call RemoveInvalidXmlChars/ EscapeXmlString when necessary, referencing the EscapeXmlWhenNeeded, XmlConvertIsXmlChar, RemoveInvalidXmlChars and RequiresXmlEscape symbols.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@src/NLog/Internal/XmlHelper.cs`:
- Around line 548-555: The EscapeCData method returns "<![CDATA[]]>" for
empty/null input even when a StringBuilder is supplied, breaking the established
convention of returning the result only when builder is null; change the
early-empty branch in EscapeCData so it appends emptyCData to builder (if
non-null) but returns string.Empty when builder is provided, and returns the
emptyCData string only when builder is null — this makes EscapeCData's return
behavior consistent with other paths and callers like
XmlEncodeLayoutRendererWrapper.
---
Nitpick comments:
In `@src/NLog/Internal/XmlHelper.cs`:
- Around line 138-163: EscapeXmlWhenNeeded currently treats a valid high
surrogate as invalid (calling the expensive
extract→RemoveInvalidXmlChars→recurse path) because XmlConvertIsXmlChar returns
false for lone surrogates; update EscapeXmlWhenNeeded to short‑circuit when
encountering a high surrogate followed by a low surrogate by detecting
char.IsHighSurrogate(chr) && i+1 < builderLength &&
char.IsLowSurrogate(builder[i+1]) and simply advance i past the pair (continue
loop) so surrogate pairs are handled cheaply; keep the existing behavior for
other invalid chars and still call RemoveInvalidXmlChars/ EscapeXmlString when
necessary, referencing the EscapeXmlWhenNeeded, XmlConvertIsXmlChar,
RemoveInvalidXmlChars and RequiresXmlEscape symbols.
6cd080c to
c66afde
Compare
c66afde to
7a36259
Compare
|
|



Followup to #5774 with faster XML encoding, by reducing enumeration of StringBuilders.
Also optimized CData-encoding by reducing the number of StringBuilder-Index-Lookups.
Also improved xml-encoding when outputting Xml-Attribute-Values for correctness.