Skip to content

Improve processing of the powershell built-in type data from 'types.ps1xml', 'typesV3.ps1xml' and 'GetEvent.types.ps1xml'#10898

Merged
adityapatwardhan merged 12 commits intoPowerShell:masterfrom
daxian-dbw:allocTypeTable
Nov 1, 2019
Merged

Improve processing of the powershell built-in type data from 'types.ps1xml', 'typesV3.ps1xml' and 'GetEvent.types.ps1xml'#10898
adityapatwardhan merged 12 commits intoPowerShell:masterfrom
daxian-dbw:allocTypeTable

Conversation

@daxian-dbw
Copy link
Copy Markdown
Member

@daxian-dbw daxian-dbw commented Oct 25, 2019

PR Summary

Improve processing of the powershell built-in type data from types.ps1xml, typesV3.ps1xml and GetEvent.types.ps1xml.

A few changes to speed up the processing of the built-in PowerShell type data.

  1. Refactor TypeTable.cs a little to make it more efficiently at collection allocations.
    • Remove a few internal methods that are not used anywhere.
    • Update some code that does collection allocations to use a specific capacity, so as to avoid resizing or wasted allocations.
    • Add an new overload to ProcessStandardMembers, which takes an instance of PSMemberInfoInternalCollection<PSMemberInfo> directly that contains the standard members.
    • Separate out ProcessTypeConverter and ProcessTypeAdapter from ProcessTypeDataToAdd.
  2. Add two overloads constructors to PSMemberSet and PSPropertySet that are used from TypeTable, which skip argument checks and reuse the passed in collection argument.
  3. Generate code to register the built-in type information directly to the TypeTable without going through the TypeData representation as the intermediate step.
    • Currently the type information is represented by TypeData objects constructed in Types_Ps1Xml, TypesV3_Ps1Xml and GetEvent_Types_Ps1Xml. Then the TypeData objects are processed by TypeTable.ProcessTypeDataToAdd one by one.
    • I wrote a tool to go through the TypeData objects from Types_Ps1Xml.Get(), TypesV3_Ps1Xml.Get() and GetEvent_Types_Ps1Xml.Get() and generate the code that directly register the type information into TypeTable, so we can get rid of the intermediate step.
    • The tool can be found here. It's in one of the commits here, but was removed in a later commit as part of the cleanup work.
    • The new code are in TypeTable_Types_Ps1Xml.cs, TypeTable_Types_TypesV3_Ps1Xml.cs and TypeTable_GetEvent_Types_Ps1Xml. TypeTable is now declared as partial class, and spread in TypeTable.cs and the aforementioned 3 files.
  4. Some more minor optimizations to the generated code, for example, use the cached ValueFactory delegate instance instead of create a new one for every duplicate one.

After this change, creating a default TypeTable instance those 3 built-in PowerShell type files takes about 47% less time at a stable state.

### 7.0.0-preview.5
PS> [string[]] $file = @("$PSHOME\types.ps1xml", "$PSHOME\typesv3.ps1xml", "$PSHOME\GetEvent.types.ps1xml")
>> Measure-Command { foreach($i in 1..100) { [System.Management.Automation.Runspaces.TypeTable]::new($file) } }

Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 0
Milliseconds      : 150
Ticks             : 1501009
TotalDays         : 1.73727893518519E-06
TotalHours        : 4.16946944444444E-05
TotalMinutes      : 0.00250168166666667
TotalSeconds      : 0.1501009
TotalMilliseconds : 150.1009
### After Change
PS> [string[]] $file = @("$PSHOME\types.ps1xml", "$PSHOME\typesv3.ps1xml", "$PSHOME\GetEvent.types.ps1xml")
>> Measure-Command { foreach($i in 1..100) { [System.Management.Automation.Runspaces.TypeTable]::new($file) } }

Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 0
Milliseconds      : 82
Ticks             : 827806
TotalDays         : 9.58108796296296E-07
TotalHours        : 2.29946111111111E-05
TotalMinutes      : 0.00137967666666667
TotalSeconds      : 0.0827806
TotalMilliseconds : 82.7806

Runspace.Open() is about 10% faster with the change at a stable state:

### 7.0.0-preview.5
PS> $iss = [initialsessionstate]::CreateDefault()
>> Measure-Command { foreach($i in 1..100) { $rs = [runspacefactory]::CreateRunspace($iss); $rs.Open(); $rs.Dispose() } }

Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 0
Milliseconds      : 495
Ticks             : 4954424
TotalDays         : 5.73428703703704E-06
TotalHours        : 0.000137622888888889
TotalMinutes      : 0.00825737333333333
TotalSeconds      : 0.4954424
TotalMilliseconds : 495.4424
### After Change
PS> $iss = [initialsessionstate]::CreateDefault()
>> Measure-Command { foreach($i in 1..100) { $rs = [runspacefactory]::CreateRunspace($iss); $rs.Open(); $rs.Dispose() } }

Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 0
Milliseconds      : 442
Ticks             : 4428146
TotalDays         : 5.12516898148148E-06
TotalHours        : 0.000123004055555556
TotalMinutes      : 0.00738024333333333
TotalSeconds      : 0.4428146
TotalMilliseconds : 442.8146

Memory allocation comparison
The profiling was taken using pwsh.exe -noprofile -c exit. The left side was built from master branch before this PR. The right side was built with the changes in this PR.

As the image below shows, the right side has less memory allocations in general.

image

PR Checklist

@PowerShell PowerShell deleted a comment from PoshChan Oct 28, 2019
@daxian-dbw
Copy link
Copy Markdown
Member Author

@PoshChan Please retry windows

@PoshChan
Copy link
Copy Markdown
Collaborator

@daxian-dbw, successfully started retry of PowerShell-CI-Windows

@PowerShell PowerShell deleted a comment from PoshChan Oct 28, 2019
@PowerShell PowerShell deleted a comment from PoshChan Oct 28, 2019
@daxian-dbw daxian-dbw changed the title WIP: Improve processing of the powershell built-in type data from 'types.ps1xml', 'typesV3.ps1xml' and 'GetEvent.types.ps1xml' Improve processing of the powershell built-in type data from 'types.ps1xml', 'typesV3.ps1xml' and 'GetEvent.types.ps1xml' Oct 28, 2019
@daxian-dbw
Copy link
Copy Markdown
Member Author

@PoshChan Please retry Linux and macOS

@PoshChan
Copy link
Copy Markdown
Collaborator

@daxian-dbw, I do not understand the build target(s) Linux and macOS; I only allow static, windows, macos, linux, all

@daxian-dbw
Copy link
Copy Markdown
Member Author

@PoshChan Please retry Linux

@daxian-dbw
Copy link
Copy Markdown
Member Author

@PoshChan Please retry macos

@PoshChan
Copy link
Copy Markdown
Collaborator

@daxian-dbw, successfully started retry of PowerShell-CI-Linux

@PoshChan
Copy link
Copy Markdown
Collaborator

@daxian-dbw, successfully started retry of PowerShell-CI-macOS

@daxian-dbw daxian-dbw removed the request for review from TravisEz13 October 28, 2019 23:43
@daxian-dbw daxian-dbw added the CL-Performance Indicates that a PR should be marked as a performance improvement in the Change Log label Oct 28, 2019
@daxian-dbw
Copy link
Copy Markdown
Member Author

Almost 90% of the CodeFactor issues are existing ones in TypeTable.cs. The rest are complaining using #region/#endregion in the generated code, which can be ignored (maybe disable that rule?).

Copy link
Copy Markdown
Collaborator

@iSazonov iSazonov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be great to add tests to explicitly cover the code.

@daxian-dbw
Copy link
Copy Markdown
Member Author

@PoshChan Please retry macOS

@PoshChan
Copy link
Copy Markdown
Collaborator

@daxian-dbw, successfully started retry of PowerShell-CI-macOS

@iSazonov
Copy link
Copy Markdown
Collaborator

@daxian-dbw I expect that the PR reduce significantly allocations. Can you share PerfView info in the PR description if you have?

@daxian-dbw
Copy link
Copy Markdown
Member Author

@iSazonov I added the PrefView screenshot. With the changes in this PR, the memory allocation is less in general.

@ghost ghost added the Waiting on Author The PR was reviewed and requires changes or comments from the author before being accept label Oct 30, 2019
@ghost ghost removed the Waiting on Author The PR was reviewed and requires changes or comments from the author before being accept label Oct 31, 2019
@lzybkr
Copy link
Copy Markdown
Contributor

lzybkr commented Oct 31, 2019

Is it not possible to speed up the public api?

One of the motivations for using the public api was to enable external modules to load their types in a similar manner.
Another motivation was to avoid adding features that can't be expressed via the public api. This was a big problem with remoting.

We could share the source to the tool I wrote to convert a ps1xml file to the public C# api.

@daxian-dbw
Copy link
Copy Markdown
Member Author

daxian-dbw commented Nov 1, 2019

Is it not possible to speed up the public api?

@lzybkr Are you referring to the TypeData and the related types?
We possibly can improve the code that processing TypeData, but TypeData and its family types themselves are essentially just property bags that don't have much room for improvement.
The current way of handling built-in type information uses the TypeData representation as an intermediate layer, which can be removed to get improved processing of the built-in type info.

I strongly agree it's important that the TypeData should always be updated when new type members are introduced in future, and I think that can be enforced by the Get-TypeData command, because any new type members should be reflected by the TypeData objects returned from Get-TypeData, and similarly, TypeTable should be able to handle those TypeData objects when they are used in Update-TypeData, or in InitialSessionState for creating a new Runspace.

@adityapatwardhan
Copy link
Copy Markdown
Member

@daxian-dbw Have you reviewed CodeFactor issues? All of them in old code?

@daxian-dbw
Copy link
Copy Markdown
Member Author

@adityapatwardhan I have reviewed all CodeFactory issues. 90% of them are about the existing code in TypeTable.cs, and the rest are complaining about having #region/#endregion in the methods, which can be ignored.

@adityapatwardhan adityapatwardhan merged commit 8749db9 into PowerShell:master Nov 1, 2019
@daxian-dbw daxian-dbw deleted the allocTypeTable branch November 1, 2019 20:39
@ghost
Copy link
Copy Markdown

ghost commented Nov 21, 2019

🎉v7.0.0-preview.6 has been released which incorporates this pull request.:tada:

Handy links:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CL-Performance Indicates that a PR should be marked as a performance improvement in the Change Log

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants