UPDATE: Don’t read this post!! –>  Read this one instead

I originally titled this “Kicking CSV in the Balls with PowerShell boots” but that was too violent, and too dramatic.  Then I thought about “Giving CSV a Happy Rub-Down with PowerShell Lotion” but that felt creepy.  So I went for the boring technically-apropos title.

1wearandtear

This was one of those “little things that drives someone (me) to stick their head in a sink filled with ice water and sing ‘God bless America’ until oxygen runs out”.  Go ahead, make fun of me, that’s okay. I do.  My dog does.  The confused people in the adjacent hotel room probably would. Anyhow…

I ran into a challenge that involved reading CSV files to extract specific column data by column name.  I ran across a bunch of articles, some with good advice and some with bad advice (note: those who say use Get-Content to parse column names by leveraging .Split(“,”) are ignoring CSV that contains quoted/nested commas).  So, this article kicked one of my functional brain cells right in the nuts, and got me on the right path (thank you!)

Example CSV data:

ID,Department,"Last,First Name",City
001,Marketing,"Jorgan,Hugh",Miami
002,Sales,"Phisting,Anita","New York"
003,Engineering,"McCracken,Phill","San Francisco"

You get the idea.  So, I wanted to be able to pull all of the values that are associated with the “Last,First Name” column.  The problem is this…

  • Import-CSV returns a custom object type, not a Hash table or Array
  • Without a native hash table, it’s difficult to “easily” extract row elements by name (i.e. Javascript getElementById, or PowerShell $hash.Item(), or the implicit alias thereof)
  • Parsing CSV by fetching the first row (column names) using Get-Content and parsing via Split() method is unreliable when column names contain nested commas.

The Code:

param (
  [parameter(Mandatory=$True)] [string] $CsvFile,
  [parameter(Mandatory=$True)] [string] $Column
)

# returns an array of logical column names using first "row"

function Get-CsvColumns {
  param ($csvData)
  Get-Member -InputObject $csvdata[0] |
    Where-Object {$_.MemberType -eq "NoteProperty"} |
      Select-Object -ExpandProperty Name
}

# returns the integer index for the element position
# among the logical array of column names

function Get-CsvColumnIndex {
  param ($column, $array)
  $cols = Get-CsvColumns $array
  $cindex = [array]::IndexOf($cols,"$column")
  return $cindex
}

if (Test-Path $CsvFile) {
  $fdata  = Import-Csv $CsvFile
  $colums = Get-CsvColumns -csvData $fdata
  $index  = Get-CsvColumnIndex -column "$column" -array $fdata
  foreach ($row in $fdata) {
    $v = $row.psobject.Properties.value[$index]
    # replace write-output with whatever you want, I dont care right now...
    Write-Output $v
  }
}
else {
  Write-Output "error: $CsvFile not found"
}

So, there you have it.  To run it…

.\Get-CsvData.ps1 -CsvFile ".\csvtest.csv" -Column "Last,First Name"

Jorgan, Hugh
Phisting, Anita
McCracken, Phill

And it will spit out the values for that logical “column” in the CSV file.

No more than 30 seconds after I stood up, and stretched my arms out to the sides and yelled “yes!!!  I have kicked this in the nuts!!” did it dawn on me that I could’ve simply invoked ADO and run it all via a T-SQL horse and buggy.  Damn it.  Victories fade quick.

Now, if you’ll excuse me, I need to pass out.

Enjoy!

Advertisements

2 thoughts on “CSV Column Extraction with PowerShell

  1. That is indeed an interesting problem and one that I had not encountered before. Another approach is to apply your own header, parse out the comma separated property and write a new object to the pipeline. My csv file uses the data from your example.

    Get-Content c:\work\in.csv | Select -Skip 1 |
    ConvertFrom-csv -Header “ID”,”Department”,”Name”,”City” |
    Select *,
    @{Name=”Givenname”;Expression = {($_.Name -split “,”)[1]}},
    @{Name=”Surname”;Expression = {($_.Name -split “,”)[0]}} -ExcludeProperty Name

    This gives me an object like:
    ID : 001
    Department : Marketing
    City : Miami
    Givenname : Hugh
    Surname : Jorgan

    With this I can pipe to Select-Object to get just the first and last names.

Leave a Reply

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s