[release/v7.4] Split TPN manifest and Component Governance manifest (#26891)#26961
Conversation
…owerShell#26891) (PowerShell#26955) Co-authored-by: Travis Plunk <[email protected]> Co-authored-by: Copilot <[email protected]>
There was a problem hiding this comment.
Pull request overview
Backport to release/v7.4 that splits the Component Governance manifest into separate “main” and “TPN” manifests and updates tooling/pipelines to use the new paths, along with enhancements to the ClearlyDefined tooling used for harvesting/license verification.
Changes:
- Move CG manifest references from
tools/cgmanifest.jsontotools/cgmanifest/main/cgmanifest.jsonand introducetools/cgmanifest/tpn/cgmanifest.json. - Update
findMissingNotices.ps1to read/write the new manifest locations and add-ForceHarvestedOnlymode for generating a filtered TPN manifest. - Extend ClearlyDefined tooling with retry/backoff, cache import/export, and search/version helpers; update compliance pipeline scanning path to the TPN manifest folder.
Reviewed changes
Copilot reviewed 11 out of 12 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| tools/packaging/packaging.psm1 | Updates CG manifest path to the new tools/cgmanifest/main/cgmanifest.json location. |
| tools/findMissingNotices.ps1 | Switches to new manifest path and adds logic to produce/update a TPN manifest (plus a new -ForceHarvestedOnly mode). |
| tools/clearlyDefined/src/ClearlyDefined/ClearlyDefined.psm1 | Adds retry/cache/search helpers; changes harvest invocation behavior; adjusts caching behavior. |
| tools/clearlyDefined/Find-LastHarvestedVersion.ps1 | New helper to locate last harvested version for reversion scenarios. |
| tools/clearlyDefined/ClearlyDefined.ps1 | Updates manifest path used for test/harvest verification to the new main manifest location. |
| tools/cgmanifest/tpn/cgmanifest.json | Adds the new TPN-specific manifest. |
| tools/cgmanifest/main/cgmanifest.json | Adds the new main CG manifest. |
| .vsts-ci/windows-arm64.yml | Updates PR path filters to the new tools/cgmanifest/* layout. |
| .vsts-ci/psresourceget-acr.yml | Updates PR path filters to the new tools/cgmanifest/* layout. |
| .vsts-ci/mac.yml | Updates PR path filters to the new tools/cgmanifest/* layout. |
| .vsts-ci/linux-internal.yml | Updates PR path filters to the new tools/cgmanifest/* layout. |
| .pipelines/templates/compliance/generateNotice.yml | Changes component detection scan root to the TPN manifest folder. |
| Start-job -ScriptBlock { | ||
| Invoke-WebRequest -Method Post -Uri 'https://api.clearlydefined.io/harvest' -Body $using:body -ContentType 'application/json' -MaximumRetryCount $using:maxRetryCount -RetryIntervalSec $using:retryIntervalSec | ||
| } |
There was a problem hiding this comment.
Start-ClearlyDefinedHarvest now uses Start-Job for the harvest POST. This returns immediately and leaves jobs unobserved/uncollected; in CI or short-lived scripts the PowerShell session may exit before the job runs, so harvesting may not be triggered at all, and repeated calls can accumulate many background jobs. Prefer a synchronous Invoke-WebRequest with retry parameters, or explicitly manage jobs (collect, Wait-Job, and Remove-Job) if async behavior is required.
| Start-job -ScriptBlock { | |
| Invoke-WebRequest -Method Post -Uri 'https://api.clearlydefined.io/harvest' -Body $using:body -ContentType 'application/json' -MaximumRetryCount $using:maxRetryCount -RetryIntervalSec $using:retryIntervalSec | |
| } | |
| Invoke-WebRequest -Method Post ` | |
| -Uri 'https://api.clearlydefined.io/harvest' ` | |
| -Body $body ` | |
| -ContentType 'application/json' ` | |
| -MaximumRetryCount $maxRetryCount ` | |
| -RetryIntervalSec $retryIntervalSec |
| if ($retryIntervalSec -lt 300) { | ||
| $retryIntervalSec++ | ||
| } | ||
|
|
There was a problem hiding this comment.
$retryIntervalSec is a module-scoped variable and is mutated in catch blocks (e.g., incremented up to 300). Because this state is shared across all calls, a single transient failure can permanently change retry behavior for the rest of the session (and across unrelated callers), which is hard to reason about. Consider using a local retry/backoff variable per request/call instead of mutating module-level state.
| if ($retryIntervalSec -lt 300) { | |
| $retryIntervalSec++ | |
| } |
| $count = $newRegistrations.Count | ||
| $registrationsToSave = $newRegistrations | ||
| $tpnRegistrationsToSave = $null | ||
|
|
||
| # If -ForceHarvestedOnly is specified with -Fix, only include harvested packages | ||
| # and revert non-harvested packages to their previous versions | ||
| if ($Fix -and $ForceHarvestedOnly) { |
There was a problem hiding this comment.
$registrationsToSave is initialized to $newRegistrations but never changes. If the intent is to sometimes write a different registration set (e.g., when -ForceHarvestedOnly is used), update it accordingly; otherwise, consider removing the extra variable to reduce confusion.
| $key = "$($item.Name)|$($item.PackageVersion)" | ||
| $harvestStatus[$key] = $item.harvested |
There was a problem hiding this comment.
Get-ClearlyDefinedData outputs the ClearlyDefined API response object (with coordinates + added harvested fields), not the input object. As a result, $item.Name/$item.PackageVersion are not populated here, so the $harvestStatus keys will be wrong and everything will be treated as unharvested. Build the lookup key from $item.coordinates (e.g., parse it via ConvertFrom-ClearlyDefinedCoordinates to get name + revision, or use the properties on coordinates if present).
| $key = "$($item.Name)|$($item.PackageVersion)" | |
| $harvestStatus[$key] = $item.harvested | |
| $coord = ConvertFrom-ClearlyDefinedCoordinates -Coordinates $item.coordinates | |
| if ($coord -and $coord.name -and $coord.revision) { | |
| $key = "$($coord.name)|$($coord.revision)" | |
| $harvestStatus[$key] = $item.harvested | |
| } |
| foreach ($item in $finalHarvestData) { | ||
| $matchingNewRegistration = $newRegistrations | Where-Object { | ||
| $_.Component.Nuget.Name -eq $item.Name -and | ||
| $_.Component.Nuget.Version -eq $item.PackageVersion | ||
| } |
There was a problem hiding this comment.
These comparisons use $item.Name and $item.PackageVersion, but $item comes from Get-ClearlyDefinedData and won’t have those fields (it returns the API object + coordinates). This makes $matchingNewRegistration always empty and prevents the TPN update logic from running correctly. Parse $item.coordinates (or add name/version fields in Get-ClearlyDefinedData) and compare against those values instead.
Backport #26955