Make System.Guid readonly#44629
Conversation
- Also adds Unsafe.Unbox for internal use
I like this idea. GuidResult can actually have just 4 ints (instead of the bytes and shorts from Guid). It may make the parsing tiny bit faster - we won't need to split the parsed ints into bytes. |
src/libraries/System.Private.CoreLib/src/Internal/Runtime/CompilerServices/Unsafe.cs
Outdated
Show resolved
Hide resolved
| // Compare each element | ||
|
|
||
| if (rA != rB) { return false; } | ||
| if (Unsafe.Add(ref rA, 1) != Unsafe.Add(ref rB, 1)) { return false; } |
There was a problem hiding this comment.
You can also change this to do the comparison as longs using Unsafe.ReadUnaligned. We had abandoned PR for that: #35654
There was a problem hiding this comment.
Do you know why it was abandoned? I see the alignment discussion on that thread, but there doesn't seem to be any clear reason the PR was rejected.
There was a problem hiding this comment.
I guess Bruce had more important things to do than to finish it.
There was a problem hiding this comment.
Something like this I guess 🙂
private static bool EqualsCore(in Guid left, in Guid right)
{
ref byte leftBytes = ref Unsafe.As<int, byte>(ref Unsafe.AsRef(in left._a));
ref byte rightBytes = ref Unsafe.As<int, byte>(ref Unsafe.AsRef(in right._a));
// early out if not equal
if (Unsafe.ReadUnaligned<ulong>(ref leftBytes) !=
Unsafe.ReadUnaligned<ulong>(ref rightBytes))
return false;
return Unsafe.ReadUnaligned<ulong>(ref Unsafe.Add(ref leftBytes, 8)) ==
Unsafe.ReadUnaligned<ulong>(ref Unsafe.Add(ref rightBytes, 8));
}G_M41817_IG01:
G_M41817_IG02:
mov rax, qword ptr [rcx]
cmp rax, qword ptr [rdx]
je SHORT G_M41817_IG05
G_M41817_IG03:
xor eax, eax
G_M41817_IG04:
ret
G_M41817_IG05:
mov rax, qword ptr [rcx+8]
cmp rax, qword ptr [rdx+8]
sete al
movzx rax, al
G_M41817_IG06:
reta SIMD version of it is slightly faster for equal guids but in most cases, I guess, we can "early out"
Also, this code can be branchless even without the SIMD (https://godbolt.org/z/jjM3Gs) but JIT doesn't optimize cmp(x1,x2) && cmp(x3,x4) to cmp(x1,x2) | cmp(x3,x4) yet, but it probably makes no sense to make Guid.Equals branchless.
There was a problem hiding this comment.
Using 64-bit comparisons didn't seem to help matters - see #44629 (comment).
There was a problem hiding this comment.
| Equal_32Bit | 0.9002 ns | 0.0087 ns | 0.0082 ns |
| Equal_64Bit | 1.1797 ns | 0.0109 ns | 0.0102 ns |
Interesting. #35654 had this case as an improvement.
|
Also needs the |
|
Here's the perf run from BenchmarkDotNet=v0.12.1, OS=Windows 10.0.19042
AMD Ryzen 9 3950X, 1 CPU, 32 logical and 16 physical cores
.NET Core SDK=5.0.100
[Host] : .NET Core 5.0.0 (CoreCLR 5.0.20.51904, CoreFX 5.0.20.51904), X64 RyuJIT
DefaultJob : .NET Core 5.0.0 (CoreCLR 5.0.20.51904, CoreFX 5.0.20.51904), X64 RyuJIT
| Method | Mean | Error | StdDev |
|------------------------- |----------:|----------:|----------:|
| Equal_32Bit | 0.9002 ns | 0.0087 ns | 0.0082 ns |
| Equal_64Bit | 1.1797 ns | 0.0109 ns | 0.0102 ns |
| Equal_64BitBranchless | 0.8910 ns | 0.0054 ns | 0.0050 ns |
| Equal_Sse2 | 0.9096 ns | 0.0080 ns | 0.0075 ns |
| NotEqual_32Bit | 1.1562 ns | 0.0082 ns | 0.0073 ns |
| NotEqual_64Bit | 1.1845 ns | 0.0091 ns | 0.0085 ns |
| NotEqual_64BitBranchless | 0.8972 ns | 0.0091 ns | 0.0086 ns |
| NotEqual_Sse2 | 0.9080 ns | 0.0069 ns | 0.0058 ns |I'm not seeing any measurable improvement in using 64-bit comparisons over 32-bit comparisons. The branchless / SIMD versions can be a little better, but now we're operating in the picosecond range. Doesn't seem worth the complexity to me. |
|
On the plus side, I experimented with replacing |
|
With latest iteration, I tried a hybrid of the "move all the fields to If this approach isn't preferred, we can investigate alternatives. |
| g._d = (byte)(uintTmp >> 8); | ||
| g._e = (byte)uintTmp; | ||
| // _d, _e must be stored as a big-endian ushort | ||
| result._de = (BitConverter.IsLittleEndian) ? BinaryPrimitives.ReverseEndianness((ushort)uintTmp) : (ushort)uintTmp; |
There was a problem hiding this comment.
| result._de = (BitConverter.IsLittleEndian) ? BinaryPrimitives.ReverseEndianness((ushort)uintTmp) : (ushort)uintTmp; | |
| result._de = BitConverter.IsLittleEndian ? BinaryPrimitives.ReverseEndianness((ushort)uintTmp) : (ushort)uintTmp; |
Nit - multiple places. I believe the usual style for src/libraries is without the extra ( ) in this context.
(BitConverter.IsLittleEndian) ? - 27 occurences under src/libraris
BitConverter.IsLittleEndian ? - 3 occurrences under src/libraries
There was a problem hiding this comment.
Maybe this is an argument to add BinaryPrimitives.htonl, which would encapsulate the check and make the call sites cleaner? Right now I think this API sits on IPAddress, which is the wrong layering.
Edit: We considered this in #29222 and rejected it.
I like this hybrid approach. |
|
Latest iteration: removed internal Crossing my fingers and hoping CI cooperates today. :) |
|
Thanks @jkotas for the feedback. Much appreciated! :) |
|
/azp run runtime |
|
Azure Pipelines successfully started running 1 pipeline(s). |
This is a follow-up to #1809. It contains two changes: (a) make
System.Guidreadonly (+ some minor code cleanup); and (b) exposeUnsafe.Unbox<T>for internal use, now thatGuidcan use it as an implementation detail. :)One piece of feedback given in #1809 was that we shouldn't use the internal
MutableGuidtype. I've changed the call sites here to useUnsafe.AsRef, but to be honest I think this is introducing noise and thatMutableGuidis a cleaner solution. If we don't likeMutableGuid, then perhaps we can clone all of the GUID fields into the existing mutable typeGuidResult, then introduce an APIGuidResult.ToGuid()which is basically a glorified reinterpret_cast.I think I hit all the needed places to shim
Unsafe.Unbox. From investigating other calls, as long as we have a rudimentary implementation in Unsafe.cs we don't need to enlighten the mono interpreter.