Output custom object with headers when no values are present#25096
Output custom object with headers when no values are present#25096jshigetomi wants to merge 31 commits intomasterfrom
Conversation
…wershell into fixNoValueCSVInput
| if (values.Count == 1 && string.IsNullOrEmpty(values[0]) && !(_cmdlet is ConvertFromCsvCommand)) | ||
| { | ||
| // skip the blank lines | ||
| // Skip empty lines when Import-CSV but keep empty lines when ConvertFromCSV |
There was a problem hiding this comment.
The bug (#24698) affects both ConvertFrom-Csv and Import-Csv; I think the fix should be applied to both commands for consistency. ConvertFrom-Csv was used throughout the original issue for demonstration purposes only.
Do we know if there's a specific reason why the check for values.Count == 1 && string.IsNullOrEmpty(values[0]) was included in the first place?
There was a problem hiding this comment.
This comes from Windows PowerShell.
There was a problem hiding this comment.
When I disable it for Import-CSV as well, `n's are captured as an extra element. Is this the behavior we want?
Describe "Import-Csv with different newlines" -Tags "CI" {
It "Test import-csv with '<name>' newline" -TestCases @(
@{ name = "CR"; newline = "`r" }
@{ name = "LF"; newline = "`n" }
@{ name = "CRLF"; newline = "`r`n" }
) {
param($newline)
$csvFile = Join-Path $TestDrive -ChildPath $((New-Guid).Guid)
$delimiter = ','
"h1,h2,h3$($newline)11,12,13$($newline)21,22,23$($newline)" | Out-File -FilePath $csvFile
$returnObject = Import-Csv -Path $csvFile -Delimiter $delimiter
$returnObject.Count | Should -Be 2
$returnObject[0].h1 | Should -Be 11
$returnObject[0].h2 | Should -Be 12
$returnObject[0].h3 | Should -Be 13
$returnObject[1].h1 | Should -Be 21
$returnObject[1].h2 | Should -Be 22
$returnObject[1].h3 | Should -Be 23
}
}There was a problem hiding this comment.
Is this the behavior we want?
No, it isn't.
This is more about ensuring whatever fix is chosen for #24698 is applied to both ConvertFrom-Csv and Import-Csv. In other words, the example CSV data below should deserialize consistently, irrespective of the source being a file or a string.
"P1"
""
""
""Can #24698 be fixed for both commands without introducing the newline issue?
There was a problem hiding this comment.
I'm thinking it can be deserialized consistently but would result in just one empty string value per header.
For example,
"P1"
""
""
""
# Would produce [pscustomobject] @{ P1 = '' } ignoring the other empty cells@'
"P1","P2"
"",
"",
'@ | ConvertFrom-Csv
# Would produce [pscustomobject] @{ P1 = ''; P2 = '' } ignoring the other empty cells againThis way we can produce a header only output consistently.
Thoughts?
There was a problem hiding this comment.
That's not what I had in mind when I filed the issue. For each non-empty line in input data, an object should be emitted by ConvertFrom-Csv/Import-Csv.
The following (using either command) produces 3 objects:
@'
"P1"
"a"
"b"
"c"
'@ | ConvertFrom-CsvTherefore, the following (using either command) should also produce 3 objects:
@'
"P1"
""
""
""
'@ | ConvertFrom-CsvLikewise, this produces 2 objects, because the empty line is correctly ignored:
@'
"P1"
"a"
"c"
'@ | ConvertFrom-CsvTherefore, this should also produce 2 objects:
@'
"P1"
""
""
'@ | ConvertFrom-Csv|
@jshigetomi Root of the issue in ParseNextRecord method. It does not distinguish between an input string without characters and a string with two double quotes PowerShell/src/Microsoft.PowerShell.Commands.Utility/commands/utility/CsvCommands.cs Lines 1609 to 1613 in bfdc498 Here PowerShell/src/Microsoft.PowerShell.Commands.Utility/commands/utility/CsvCommands.cs Lines 1390 to 1394 in bfdc498 But we get to Zugzwang - sometimes we want to, sometimes we don't. Of course, it would be possible to add more conditions, but this would require a more significant code change in same methods. My suggestion is not to complicate the code. If there are concerns, then make this change an experimental feature. |
|
@jshigetomi I checked my guess. I had to make a few more changes. You can see the result here: |
|
@iSazonov I'm seeing test failure for Import-CSV Maybe a warning to the user would be better for no value csvs? |
|
I don't see errors in my branch. |
|
Hmm I'm having some trouble reproducing your results. How are you resolving the brackets here? |
|
@jshigetomi Please look #25120 |
|
/azp run |
|
Azure Pipelines successfully started running 2 pipeline(s). |
|
@surfingoldelephant @iSazonov Thanks for the pointer for Set-Content. I was not aware of the newline behavior. I made the ConvertFrom-CSV and Import-CSV tests the same and both behave the same. |
|
|
||
| Describe "Import-Csv with empty and null values" { | ||
|
|
||
| Context 'Empty CSV Fields' { |
There was a problem hiding this comment.
Since the PR is about edge cases I think about additional tests:
- only header presents without newline
- header presents with one empty newline
And more:
- empty file (
-Headerparameter presents) - file with one empty line (
-Headerparameter presents)
There was a problem hiding this comment.
Just to be clear, we're expecting those tests to produce no output (zero custom objects).
Let's also include Import-Csv/ConvertFrom-Csv tests with multiple empty lines.
-
Header only
@' P1,P2 '@ | ConvertFrom-Csv
-
Header followed by one empty line
@' P1,P2 '@ | ConvertFrom-Csv
-
Header followed by multiple empty lines
@' P1,P2 '@ | ConvertFrom-Csv
-
Empty input and
-Headerspecified'' | ConvertFrom-Csv -Header P1, P2
-
One empty line and
-Headerspecified@' '@ | ConvertFrom-Csv -Header P1, P2
-
Multiple empty lines and
-Headerspecified@' '@ | ConvertFrom-Csv -Header P1, P2
Perhaps additional -Header tests to cover the fix:
@'
,
A1,A2
B1,B2
,
'@ | ConvertFrom-Csv -Header P1, P2
# Expected:
[pscustomobject] @{ P1 = ''; P2 = '' }
[pscustomobject] @{ P1 = 'A1'; P2 = 'A2' }
[pscustomobject] @{ P1 = 'B1'; P2 = 'B2' }
[pscustomobject] @{ P1 = ''; P2 = $null }
surfingoldelephant
left a comment
There was a problem hiding this comment.
I think we also need to take a look at no detected delimiter scenarios.
# Detected delimiter:
@'
P1,P2,P3
A1,,
B1,,
C1,,
'@ | ConvertFrom-Csv
# Current/expected behavior:
[pscustomobject] @{ P1 = 'A1'; P2 = ''; P3 = '' }
[pscustomobject] @{ P1 = 'B1'; P2 = ''; P3 = '' }
[pscustomobject] @{ P1 = 'C1'; P2 = ''; P3 = $null }# No detected delimiter:
@'
P1,P2,P3
A1
B1
C1
'@ | ConvertFrom-Csv
# Current/expected behavior:
[pscustomobject] @{ P1 = 'A1'; P2 = $null; P3 = $null }
[pscustomobject] @{ P1 = 'B1'; P2 = $null; P3 = $null }
[pscustomobject] @{ P1 = 'C1'; P2 = $null; P3 = $null }
@'
P1;P2;P3
A1
B1
C1
'@ | ConvertFrom-Csv -Delimiter ';'
# Current/expected behavior:
[pscustomobject] @{ P1 = 'A1'; P2 = $null; P3 = $null }
[pscustomobject] @{ P1 = 'B1'; P2 = $null; P3 = $null }
[pscustomobject] @{ P1 = 'C1'; P2 = $null; P3 = $null }Notice when there is no (detected) delimiter, the resultant value is always $null. Can we add tests for this?
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run PowerShell-Windows-Packaging-CI, PowerShell-CI-linux-packaging |
|
Azure Pipelines could not run because the pipeline triggers exclude this branch/path. |
There was a problem hiding this comment.
Pull request overview
This pull request improves the handling of empty and null values in CSV parsing commands (Import-Csv and ConvertFrom-Csv). The changes ensure that both cmdlets properly distinguish between truly empty lines (which should be skipped) and lines containing only delimiters (which represent rows with empty field values and should be processed).
Changes:
- Improved logic for detecting and handling blank lines in CSV parsing
- Added comprehensive test coverage for edge cases involving empty fields, null values, and various delimiter scenarios
- Refactored existing tests to reduce duplication and improve maintainability
- Fixed a spelling error in a test variable name
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/Microsoft.PowerShell.Commands.Utility/commands/utility/CsvCommands.cs | Updated Import method to correctly identify blank lines (values.Count == 0) instead of incorrectly treating single empty strings as blank lines. Changed ParseNextRecord to return instead of break for clarity and added logic to prevent empty results from being added at the start of records. Changed loop condition from while(true) to while(!EOF) for clarity. |
| test/powershell/Modules/Microsoft.PowerShell.Utility/Import-Csv.Tests.ps1 | Fixed spelling of variable name (emptyValueCsv). Refactored duplicate test cases into a single loop-based test for better maintainability. Added extensive new test coverage for empty/null value handling across multiple contexts: empty CSV fields, header-only scenarios, empty input with -Header parameter, edge cases with whitespace and special characters, type information handling, delimiter edge cases, and large data scenarios. |
| test/powershell/Modules/Microsoft.PowerShell.Utility/ConvertFrom-Csv.Tests.ps1 | Added comprehensive test coverage matching the Import-Csv tests, covering the same edge cases but using ConvertFrom-Csv instead of Import-Csv, ensuring both cmdlets handle empty and null values consistently. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
test/powershell/Modules/Microsoft.PowerShell.Utility/ConvertFrom-Csv.Tests.ps1
Outdated
Show resolved
Hide resolved
…om-Csv.Tests.ps1 Co-authored-by: Copilot <[email protected]>
| It 'Should handle no delimiter with empty fields' { | ||
| $result = @' | ||
| P1,P2,P3 | ||
| A1 | ||
| B1 | ||
| C1 | ||
| '@ | ConvertFrom-Csv | ||
|
|
||
| $result.Count | Should -Be 3 | ||
| $result[0].P1 | Should -Be 'A1' | ||
| $result[0].P2 | Should -BeNullOrEmpty | ||
| $result[0].P3 | Should -BeNullOrEmpty | ||
| $result[1].P1 | Should -Be 'B1' | ||
| $result[1].P2 | Should -BeNullOrEmpty | ||
| $result[1].P3 | Should -BeNullOrEmpty | ||
| $result[2].P1 | Should -Be 'C1' | ||
| $result[2].P2 | Should -BeNullOrEmpty | ||
| $result[2].P3 | Should -BeNullOrEmpty | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
I think we need to watch out for -BeNullOrEmpty. That returns $true for both $null and empty strings (''), whereas most of the tests should be checking explicitly for $null or an empty string, not both.
For example, in the quoted test, P2 and P3 should only be $null, yet the test will pass if the value is ''.
I suggest replacing all instances of -BeNullOrEmpty to either -BeExactly $null or -BeExactly '' depending on the expected result.
Perhaps also update existing -Be usage to -BeExactly. It won't change the behavior in this case, but signals clearer intent.


PR Summary
Fixes: #24698
This pull request includes changes to improve the handling of empty and null values in the
ConvertFrom-Csvcommand and updates the corresponding unit tests to verify the new behavior. The most important changes include modifying theImportmethod inCsvCommands.csand adding new unit tests inConvertFrom-Csv.Tests.ps1.Enhancements to
ConvertFrom-Csvcommand:src/Microsoft.PowerShell.Commands.Utility/commands/utility/CsvCommands.cs: Updated theImportmethod to skip empty lines only when usingImport-CSV, but to keep them when usingConvertFromCSV.Unit tests updates:
test/powershell/Modules/Microsoft.PowerShell.Utility/ConvertFrom-Csv.Tests.ps1: Added new unit tests to verify thatConvertFrom-Csvhandles empty and null values correctly.PR Context
Improve CSV parsing edge case handling and add comprehensive unit tests
PR Checklist
.h,.cpp,.cs,.ps1and.psm1files have the correct copyright headerWIP:or[ WIP ]to the beginning of the title (theWIPbot will keep its status check atPendingwhile the prefix is present) and remove the prefix when the PR is ready.- [ ] Issue filed:
(which runs in a different PS Host).