  1. Analyzing CSV file looking for trends or aberrations
<p>I'm often faced with data (spreadsheets, config, etc.) that I have to analyze to find what might be causing things to happen. Sometimes these are good things, but usually they are bad things, often urgent, in data I've never looked at before and may be unfamiliar with generally.</p>
<p>I tried looking for an advanced analysis tool, something that looks for repeated phrases or other patterns that might make it easier to understand trends in the data generically, but couldn't find anything.</p>
<p>I'm posting for two reasons:</p>
<ul>
<li>I'm hoping for a recommendation for software that can do this kind of analysis.</li>
<li>I wrote a PowerShell script that does a very basic analysis. I wanted to share it, and I'm hoping for improvements to it (including encapsulating it into a function).</li>
</ul>
<p>The code I came up with just counts the number of times each entry shows up in each column, sorts based on that count, and outputs formatted results.</p>
<blockquote><pre>
#Before You Begin, Set the following
$SourceFile = Get-ChildItem ".\SomeFile.csv"
$OutputFile = &{$d=$SourceFile.Directory; $n=$SourceFile.BaseName; $e=$SourceFile.Extension; "$d\$n"+"_Stats"+"$e"} #This just appends _Stats to the source filename

#$Data = gci . #For Testing
$Data = Import-Csv $SourceFile
$ColumnList = $Data|Get-Member|Where-Object{$_.MemberType -eq "NoteProperty"}|ForEach-Object{$_.Name}

#One row per distinct value per column: ColumnName, Count, Value
$CountedData = $ColumnList|ForEach-Object{
    $ThisColumn = $_;
    $Data|Group-Object $ThisColumn|Select-Object @{ n="ColumnName"; e={$ThisColumn} },Count, @{ n="Value"; e={$_.Name} }
}|Sort-Object -Descending Count,ColumnName,Value

#Build one output line per column: ColumnName,(Count) Value,(Count) Value,...
$Results=""
$CountedData|Group-Object ColumnName|ForEach-Object{
    $ThisColumn=$_.Name;
    $ThisGroup=$_.Group;
    $Results="$Results`n$ThisColumn";
    $ThisGroup|ForEach-Object{
        $ThisCount=$_.Count;
        $ThisValue=$_.Value;
        $Results=$Results+",($ThisCount) $ThisValue"
    }
}
$Results|Out-File $OutputFile
start $SourceFile.Directory
</pre></blockquote>
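<p>One possible way to encapsulate the per-column counting into a function is sketched below. The function name <code>Get-CsvColumnStats</code> and its <code>-Path</code> parameter are hypothetical, chosen only for illustration. The sketch emits objects rather than a pre-formatted string, on the assumption that formatting and export are better left to the caller.</p>

```powershell
# Hypothetical helper: the name Get-CsvColumnStats and its -Path parameter
# are illustrative assumptions, not part of the original script.
function Get-CsvColumnStats {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory)]
        [string]$Path
    )

    # Force an array so a one-row CSV behaves the same as a larger one.
    $data = @(Import-Csv -LiteralPath $Path)
    if ($data.Count -eq 0) { return }

    # Import-Csv exposes each CSV column as a property, in file order.
    $columns = $data[0].PSObject.Properties.Name

    foreach ($column in $columns) {
        # Count how often each distinct value occurs in this column,
        # most frequent first.
        $data |
            Group-Object -Property $column |
            Sort-Object -Property Count -Descending |
            ForEach-Object {
                [pscustomobject]@{
                    ColumnName = $column
                    Count      = $_.Count
                    Value      = $_.Name
                }
            }
    }
}
```

<p>Because the function returns objects, the results can be piped onward, for example <code>Get-CsvColumnStats -Path .\SomeFile.csv | Export-Csv .\SomeFile_Stats.csv -NoTypeInformation</code> or <code>| Format-Table -AutoSize</code> for a quick look in the console.</p>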
 
