Let's do a deep drill into Select-String


In my blog post Processing .EML files using Select-String and SetCreationTime() (http://blogs.msdn.com/powershell/archive/2006/11/23/processing-eml-files-with-select-string-and-setcreationtime.aspx), I showed that you could do some incredibly powerful stuff using 2 lines of PowerShell script.  The reason you can do this is the design of Select-String.  Select-String is a very cool utility and can make you very productive, so let’s drill into it.  Note: I had a bug in that script, which I’ll correct at the end.

I think you are going to enjoy this one, but you’ll probably want to grab a cup of coffee before starting: you want to be alert and you don’t want to rush this.  While this is a deep drill into Select-String, it is more than that as well.  It is a drill into the mindset and the patterns of Windows PowerShell.  That is a lesson worth learning because you’ll use it thousands of times in the future.

You've probably already realized that Select-String is an awesome tool.  You've probably realized that you can specify a string and a wildcarded list of files to search.  Let's create some common text so that we are searching against the same stuff.  Here is a chewy little one-liner which dumps all of your aliases into a set of files.  Note that in these examples, I’ll be using the alias ss for Select-String.  It is not a built-in alias, so define it first (Set-Alias ss Select-String) if you want to follow along.  Also note that my aliases may be different from yours, so your results may differ.

PS> cd c:\sstest
PS> [int][char]"a"..[int][char]"z" |
>> %{$c=[char]$_; gal ($c + "*") > ($c + "-alias.txt")}
>>
PS> dir


    Directory: Microsoft.PowerShell.Core\FileSystem::C:\SSTest


Mode                LastWriteTime     Length Name
----                -------------     ------ ----
-a---        11/23/2006   6:14 AM        930 a-alias.txt
-a---        11/23/2006   6:14 AM       2162 b-alias.txt
-a---        11/23/2006   6:14 AM       3086 c-alias.txt
-a---        11/23/2006   6:14 AM        776 d-alias.txt
-a---        11/23/2006   6:14 AM       1084 e-alias.txt
-a---        11/23/2006   6:14 AM       1084 f-alias.txt
-a---        11/23/2006   6:14 AM       4164 g-alias.txt
-a---        11/23/2006   6:14 AM        622 h-alias.txt
-a---        11/23/2006   6:14 AM       1238 i-alias.txt
-a---        11/23/2006   6:14 AM          0 j-alias.txt
-a---        11/23/2006   6:14 AM        468 k-alias.txt
-a---        11/23/2006   6:14 AM        776 l-alias.txt
-a---        11/23/2006   6:14 AM       1392 m-alias.txt
-a---        11/23/2006   6:14 AM       1238 n-alias.txt
-a---        11/23/2006   6:14 AM        468 o-alias.txt
-a---        11/23/2006   6:14 AM        930 p-alias.txt
-a---        11/23/2006   6:14 AM          0 q-alias.txt
-a---        11/23/2006   6:14 AM       2624 r-alias.txt
-a---        11/23/2006   6:14 AM       3394 s-alias.txt
-a---        11/23/2006   6:14 AM        776 t-alias.txt
-a---        11/23/2006   6:14 AM        468 u-alias.txt
-a---        11/23/2006   6:14 AM          0 v-alias.txt
-a---        11/23/2006   6:14 AM        622 w-alias.txt
-a---        11/23/2006   6:14 AM          0 x-alias.txt
-a---        11/23/2006   6:14 AM          0 y-alias.txt
-a---        11/23/2006   6:14 AM          0 z-alias.txt
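
If the one-liner above is hard to read, here is a minimal longhand sketch of the same idea. The variable names and the scratch folder under the temp directory (used instead of c:\sstest) are assumptions made for illustration only.

```powershell
# Longhand version of the alias-dumping one-liner (illustrative sketch).
# Assumption: output goes to a scratch folder under the temp directory.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) 'sstest-demo'
New-Item -ItemType Directory -Path $dir -Force | Out-Null

# Walk the character codes for 'a' through 'z'.
foreach ($code in ([int][char]'a')..([int][char]'z')) {
    $letter = [char]$code
    $file   = Join-Path $dir "${letter}-alias.txt"
    # Dump every alias that starts with this letter into its own file.
    Get-Alias -Name "${letter}*" | Out-File -FilePath $file
}

Get-ChildItem -Path $dir
```

The range operator needs integers here, which is why each letter is cast to [int] before the loop and back to [char] inside it.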


Select-String accepts arrays of wildcards to specify Files

Now let’s find a string.  First we’ll search all the files, then I’ll show you how to narrow down which files get searched.  You do this by specifying a wildcard or a set of wildcards:

PS> ss drive *

g-alias.txt:8:Alias           gdr              Get-PSDrive
m-alias.txt:6:Alias           mount            New-PSDrive
n-alias.txt:5:Alias           ndr              New-PSDrive
r-alias.txt:4:Alias           rdr              Remove-PSDrive


PS> ss drive g*

g-alias.txt:8:Alias           gdr              Get-PSDrive


PS> ss drive [a-m]*

g-alias.txt:8:Alias           gdr              Get-PSDrive
m-alias.txt:6:Alias           mount            New-PSDrive


PS> ss drive g*,r*

g-alias.txt:8:Alias           gdr              Get-PSDrive
r-alias.txt:4:Alias           rdr              Remove-PSDrive

 

Select-String accepts arrays of regular expressions to specify STRINGS

In those examples the pattern was a plain word, but Select-String treats the pattern as a regular expression, and you can specify one regular expression or a set of them:

PS> ss " [a-r]dr" *

g-alias.txt:8:Alias           gdr              Get-PSDrive
n-alias.txt:5:Alias           ndr              New-PSDrive
r-alias.txt:4:Alias           rdr              Remove-PSDrive


PS> ss " [a-r]dr|mount" *

g-alias.txt:8:Alias           gdr              Get-PSDrive
m-alias.txt:6:Alias           mount            New-PSDrive
n-alias.txt:5:Alias           ndr              New-PSDrive
r-alias.txt:4:Alias           rdr              Remove-PSDrive


PS> ss " [a-n]dr"," [m-z]dr" *

g-alias.txt:8:Alias           gdr              Get-PSDrive
n-alias.txt:5:Alias           ndr              New-PSDrive
r-alias.txt:4:Alias           rdr              Remove-PSDrive

NOTE: If you don’t want Select-String to interpret the search string as a regular expression, just add the switch -SimpleMatch and it will treat the string as a literal.
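
To make the difference concrete, here is a small sketch that pipes three made-up strings (not taken from the alias files) into Select-String, once with the default regular-expression matching and once with -SimpleMatch.

```powershell
# Sample lines piped in as strings, so no files are needed (made-up data).
$lines = 'Get-Item', 'Get.Item', 'GetXItem'

# Default: the pattern is a regular expression, so "." matches any character.
$regexHits = @($lines | Select-String -Pattern 'Get.Item')

# -SimpleMatch: the pattern is a literal string, so "." only matches a dot.
$literalHits = @($lines | Select-String -Pattern 'Get.Item' -SimpleMatch)

"Regex matches:   $($regexHits.Count)"
"Literal matches: $($literalHits.Count)"
```

The regular-expression search should match all three strings, while the literal search should match only the one that really contains "Get.Item".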

 

Select-String accepts –Include and –Exclude to tweak which files it operates on

It is probably pretty obvious that you can control which files you search against by specifying the correct wildcard in the file specification but did you also realize that you can tweak this with –INCLUDE and –EXCLUDE?   These parameters take a wildcard expression or set of wildcard expressions and operate AFTER the filepath is resolved.  Here is an example of exclude:

PS> ss drive * -exclude *[mn]-al*

g-alias.txt:8:Alias           gdr              Get-PSDrive
r-alias.txt:4:Alias           rdr              Remove-PSDrive


PS> ss drive * -exclude [mn]-al*

g-alias.txt:8:Alias           gdr              Get-PSDrive
m-alias.txt:6:Alias           mount            New-PSDrive
n-alias.txt:5:Alias           ndr              New-PSDrive
r-alias.txt:4:Alias           rdr              Remove-PSDrive

Now take a second and examine the differences between those two and ask yourself the question, why didn’t the second request exclude the files that started with M or N?

The answer is that –Exclude and –Include work against the FULL PATHNAME (C:\sstest\m-alias.txt), not the ChildName (m-alias.txt).
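
The effect can be reproduced with the -like operator, which also requires the wildcard to cover the whole string. This is only an illustration of the wildcard logic, assuming the two names from the explanation above; it does not show what Select-String does internally, and other versions of PowerShell may resolve -Exclude differently.

```powershell
# Illustration only: -like needs the wildcard to cover the whole string,
# which mirrors comparing the exclude pattern to the full path.
$fullPath  = 'C:\sstest\m-alias.txt'
$childName = 'm-alias.txt'

$childName -like '[mn]-al*'    # pattern fits the bare file name
$fullPath  -like '[mn]-al*'    # fails: the path starts with "C:\", not m or n
$fullPath  -like '*[mn]-al*'   # the leading * absorbs the "C:\sstest\" prefix
```

That is why the first -Exclude example needed the leading * before [mn]-al*.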

The benefits of this become clearer when you begin to explore the pipeline scenarios.  Select-String can accept pipeline data from any command that

1.       Produces objects that have a PATH property.

2.       Produces FileInfo or MatchInfo objects.

a.       Actually it will accept any object from the pipeline but it is only going to actually do work when the object is one of these.  Now as a side note, you might ask why we accept any object.  There is a long answer that I won’t go into but let me tease you with a feature that we might add in the future and then you can connect the dots.  Imagine the following command:

PS> Get-Process |Select-String  Office -Property Description
where this would create a MatchInfo record equivalent to what you get from:
PS> Get-Process |Select Description |where {$_.Description -match "Office"}

 

You can pipe anything that produces FileInfo objects into Select-String.

 

PS> dir [g-r]*
    Directory: Microsoft.PowerShell.Core\FileSystem::C:\SSTest

Mode                LastWriteTime     Length Name
----                -------------     ------ ----
-a---        11/23/2006   6:26 AM       2760 g-alias.txt
-a---        11/23/2006   6:26 AM        414 h-alias.txt
-a---        11/23/2006   6:26 AM        822 i-alias.txt
-a---        11/23/2006   6:26 AM          0 j-alias.txt
-a---        11/23/2006   6:26 AM        312 k-alias.txt
-a---        11/23/2006   6:26 AM        516 l-alias.txt
-a---        11/23/2006   6:26 AM        924 m-alias.txt
-a---        11/23/2006   6:26 AM        822 n-alias.txt
-a---        11/23/2006   6:26 AM        312 o-alias.txt
-a---        11/23/2006   6:26 AM        618 p-alias.txt
-a---        11/23/2006   6:26 AM          0 q-alias.txt
-a---        11/23/2006   6:26 AM       1740 r-alias.txt


PS> dir [g-r]* |ss drive
g-alias.txt:8:Alias           gdr              Get-PSDrive
m-alias.txt:6:Alias           mount            New-PSDrive
n-alias.txt:5:Alias           ndr              New-PSDrive
r-alias.txt:4:Alias           rdr              Remove-PSDrive

PS> ([System.IO.DirectoryInfo]"c:\sstest").GetFileSystemInfos()
Mode                LastWriteTime     Length Name
----                -------------     ------ ----
-a---        11/23/2006   6:26 AM        618 a-alias.txt
-a---        11/23/2006   6:26 AM       1434 b-alias.txt
-a---        11/23/2006   6:26 AM       2046 c-alias.txt
-a---        11/23/2006   6:26 AM        516 d-alias.txt
-a---        11/23/2006   6:26 AM        720 e-alias.txt
-a---        11/23/2006   6:26 AM        720 f-alias.txt
-a---        11/23/2006   6:26 AM       2760 g-alias.txt
-a---        11/23/2006   6:26 AM        414 h-alias.txt
-a---        11/23/2006   6:26 AM        822 i-alias.txt
-a---        11/23/2006   6:26 AM          0 j-alias.txt
-a---        11/23/2006   6:26 AM        312 k-alias.txt
-a---        11/23/2006   6:26 AM        516 l-alias.txt
-a---        11/23/2006   6:26 AM        924 m-alias.txt
-a---        11/23/2006   6:26 AM        822 n-alias.txt
-a---        11/23/2006   6:26 AM        312 o-alias.txt
-a---        11/23/2006   6:26 AM        618 p-alias.txt
-a---        11/23/2006   6:26 AM          0 q-alias.txt
-a---        11/23/2006   6:26 AM       1740 r-alias.txt
-a---        11/23/2006   6:26 AM       2250 s-alias.txt
-a---        11/23/2006   6:26 AM        516 t-alias.txt
-a---        11/23/2006   6:26 AM        312 u-alias.txt
-a---        11/23/2006   6:26 AM          0 v-alias.txt
-a---        11/23/2006   6:26 AM        414 w-alias.txt
-a---        11/23/2006   6:26 AM          0 x-alias.txt
-a---        11/23/2006   6:26 AM          0 y-alias.txt
-a---        11/23/2006   6:26 AM          0 z-alias.txt


PS> ([System.IO.DirectoryInfo]"c:\sstest").GetFileSystemInfos() |
>> ss drive
>>
c:\sstest\g-alias.txt:8:Alias           gdr              Get-PSDrive
c:\sstest\m-alias.txt:6:Alias           mount            New-PSDrive
c:\sstest\n-alias.txt:5:Alias           ndr              New-PSDrive
c:\sstest\r-alias.txt:4:Alias           rdr              Remove-PSDrive
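
Because Select-String takes FileInfo objects from the pipeline, a recursive search is just a matter of letting Get-ChildItem do the recursing. Here is a minimal sketch; the scratch folder, file names, and file contents are made up for the demonstration.

```powershell
# Assumption: a scratch folder with one nested subfolder, created for the demo.
$root = Join-Path ([System.IO.Path]::GetTempPath()) 'ss-pipe-demo'
$sub  = Join-Path $root 'nested'
New-Item -ItemType Directory -Path $sub -Force | Out-Null

Set-Content -Path (Join-Path $root 'top.txt')  -Value 'gdr   Get-PSDrive', 'gi   Get-Item'
Set-Content -Path (Join-Path $sub  'deep.txt') -Value 'ndr   New-PSDrive'

# Get-ChildItem emits FileInfo objects; Select-String searches each file it receives.
$hits = @(Get-ChildItem -Path $root -Filter '*.txt' -Recurse | Select-String -Pattern 'drive')

$hits | ForEach-Object { '{0}:{1}:{2}' -f $_.Filename, $_.LineNumber, $_.Line }
```

The file-selection logic lives in Get-ChildItem and the text-matching logic lives in Select-String, so each command only has to do one job.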

 

 

You can pipe anything that produces MatchInfo objects into Select-String.

 

Here is where it gets fun!  While the above statement is true, at this point the only command that produces these is Select-String.  Let's do a manual text substitution and restate the heading:

You can pipe the output of Select-String into Select-String

Yes – that’s exactly correct and the implications are awesome.  First it means that you can avoid some really hairy/scary regular expressions for certain things.  The regularity of the data we are working with doesn’t allow a very interesting example but here is one:

PS> ss get *|ss item

d-alias.txt:6:Alias           dir              Get-ChildItem
g-alias.txt:6:Alias           gci              Get-ChildItem
g-alias.txt:10:Alias           gi               Get-Item
g-alias.txt:13:Alias           gp               Get-ItemProperty
l-alias.txt:5:Alias           ls               Get-ChildItem

 

What is so cool about this is that the second Select-String is ONLY searching the results of the first Select-String (i.e. it is not reprocessing the file itself).  This is great because it allows you to store the results of one long and expensive Select-String and then do a set of very quick and cheap searches against those results.  In the example below I search all the .txt files underneath the Windows directory and store the results in a global variable.  I then use that global variable to search for the same term a second time and show that the second search is about 31 times faster.  Of course the point is that you’d normally be looking for other things the second time.  This makes it easy to iterate through a series of investigations.  We want you to explore your system, so making it cheap to do so is important to us.

PS> (Measure-Command {
>> $global:x = dir c:\windows *.txt -recurse -ea SilentlyContinue |
>> ss PowerShell
>> }
>> ).TotalMilliseconds
>>
Select-String : The file can not be read: C:\windows\Tasks\SCHEDLGU.TXT
At line:3 char:3
+ ss  <<<< PowerShell
5111.9712
PS> (Measure-Command {
>> $global:x |
>> ss PowerShell
>> }
>> ).TotalMilliseconds
>>
166.9715
PS> [int](5111/166)
31
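
Here is a smaller sketch of the same store-then-refine pattern, this time asking a different question on each later pass. The scratch file and its four lines are assumptions that stand in for the expensive search target.

```powershell
# Assumption: a small scratch file stands in for the "expensive" search target.
$file = Join-Path ([System.IO.Path]::GetTempPath()) 'ss-chain-demo.txt'
Set-Content -Path $file -Value @(
    'dir    Get-ChildItem'
    'gdr    Get-PSDrive'
    'gi     Get-Item'
    'ri     Remove-Item'
)

# First pass reads the file once and keeps the MatchInfo objects.
$getHits = @(Select-String -Path $file -Pattern 'get')

# Later passes filter the stored results instead of reading the file again.
$itemHits  = @($getHits | Select-String -Pattern 'item')
$driveHits = @($getHits | Select-String -Pattern 'drive')

$itemHits  | ForEach-Object { $_.Line }
$driveHits | ForEach-Object { $_.Line }
```

Note that 'ri     Remove-Item' never shows up in the item results, because it was already filtered out by the first pass.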
 

Select-String outputs MatchInfo objects not Strings

 

As with most Cmdlets, the output of Select-String looks like text but is actually a stream of objects with a text rendering.  This is great because then you can party with the object.

PS> ss stop *

k-alias.txt:4:Alias           kill             Stop-Process
s-alias.txt:13:Alias           spps             Stop-Process
s-alias.txt:14:Alias           spsv             Stop-Service


PS> ss Stop * |get-Member -MemberType Property


   TypeName: Microsoft.PowerShell.Commands.MatchInfo

Name       MemberType Definition
----       ---------- ----------
Filename   Property   System.String Filename {get;}
IgnoreCase Property   System.Boolean IgnoreCase {get;set;}
Line       Property   System.String Line {get;set;}
LineNumber Property   System.Int32 LineNumber {get;set;}
Path       Property   System.String Path {get;set;}
Pattern    Property   System.String Pattern {get;set;}


PS> ss "stop","new.*ve" *|fl *


IgnoreCase : True
LineNumber : 4
Line       : Alias           kill             Stop-Process
Filename   : k-alias.txt
Path       : C:\SSTest\k-alias.txt
Pattern    : stop

IgnoreCase : True
LineNumber : 6
Line       : Alias           mount            New-PSDrive
Filename   : m-alias.txt
Path       : C:\SSTest\m-alias.txt
Pattern    : new.*ve

IgnoreCase : True
LineNumber : 5
Line       : Alias           ndr              New-PSDrive
Filename   : n-alias.txt
Path       : C:\SSTest\n-alias.txt
Pattern    : new.*ve

IgnoreCase : True
LineNumber : 13
Line       : Alias           spps             Stop-Process
Filename   : s-alias.txt
Path       : C:\SSTest\s-alias.txt
Pattern    : stop

IgnoreCase : True
LineNumber : 14
Line       : Alias           spsv             Stop-Service
Filename   : s-alias.txt
Path       : C:\SSTest\s-alias.txt
Pattern    : stop



PS> ss "stop","new.*ve" *|group Pattern

Count Name                      Group
----- ----                      -----
    3 stop                      {k-alias.txt, s-alias.txt, s-alias.txt}
    2 new.*ve                   {m-alias.txt, n-alias.txt}

So yes you can party with the objects but it’s not about partying is it?   It’s about being incredibly productive.  
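
As one more sketch of what that productivity looks like, the following picks out individual MatchInfo properties and counts hits per pattern without parsing any text. The scratch file and its contents are made up for the example; the property names are the ones Get-Member listed above.

```powershell
# Assumption: a three-line scratch file created just for this example.
$file = Join-Path ([System.IO.Path]::GetTempPath()) 'ss-objects-demo.txt'
Set-Content -Path $file -Value @(
    'kill   Stop-Process'
    'mount  New-PSDrive'
    'spsv   Stop-Service'
)

$hits = @(Select-String -Path $file -Pattern 'stop', 'new.*ve')

# Each hit is an object, so ordinary object cmdlets apply.
$hits | Select-Object Filename, LineNumber, Pattern, Line | Format-Table -AutoSize

# Count hits per pattern by grouping on a property.
$byPattern = @($hits | Group-Object -Property Pattern)
$byPattern | ForEach-Object { '{0}: {1}' -f $_.Name, $_.Count }
```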

Sometimes you just want the first match in a file

In my blog post (http://blogs.msdn.com/powershell/archive/2006/11/23/processing-eml-files-with-select-string-and-setcreationtime.aspx) we did the following:

foreach ($record in Select-String ^Date: *.eml) {
  [System.IO.File]::SetCreationTime($Record.Path, [datetime]($record.line.substring(6)))
}

While this works, there is a potential bug here.  Imagine the case where a particular file has multiple lines that match “^Date:”.  The script would set the creation time of the file once for each of those lines, so the last match would win.  What we want is just the very first match.  That is exactly what the switch -List does.

PS> ss item r*

r-alias.txt:5:Alias           ri               Remove-Item
r-alias.txt:6:Alias           rni              Rename-Item
r-alias.txt:7:Alias           rnp              Rename-ItemPr...
r-alias.txt:8:Alias           rp               Remove-ItemPr...
r-alias.txt:13:Alias           rm               Remove-Item
r-alias.txt:14:Alias           rmdir            Remove-Item
r-alias.txt:15:Alias           rd               Remove-Item
r-alias.txt:16:Alias           ren              Rename-Item


PS> ss item r* -list

r-alias.txt:5:Alias           ri               Remove-Item

So the correct script should have been:

foreach ($record in Select-String ^Date: *.eml -List) {
  [System.IO.File]::SetCreationTime($Record.Path, [datetime]($record.line.substring(6)))
}
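
To see the effect of -List without touching real mail, here is a minimal sketch. It assumes two made-up stand-in .eml files in a scratch folder, each holding two "Date:" lines with simplified date values.

```powershell
# Assumption: two tiny stand-in .eml files, each with two "Date:" lines.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) 'ss-list-demo'
New-Item -ItemType Directory -Path $dir -Force | Out-Null
Set-Content -Path (Join-Path $dir 'one.eml') -Value 'Date: 2006-11-20', 'Subject: hi', 'Date: 2006-11-21'
Set-Content -Path (Join-Path $dir 'two.eml') -Value 'Date: 2006-11-22', 'Date: 2006-11-23'

$pathSpec = Join-Path $dir '*.eml'

# Without -List: every matching line in every file.
$all = @(Select-String -Path $pathSpec -Pattern '^Date:')

# With -List: only the first match in each file is returned.
$first = @(Select-String -Path $pathSpec -Pattern '^Date:' -List)

"All matches:   $($all.Count)"
"First matches: $($first.Count)"
$first | ForEach-Object { '{0} -> {1}' -f $_.Filename, $_.Line.Substring(6) }
```

Substring(6) skips the six characters of "Date: ", which is the same trick the script uses before casting the rest of the line to [datetime].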

 

Enjoy  and have a good Thanksgiving.

Jeffrey Snover [MSFT]
Windows PowerShell/MMC Architect
Visit the Windows PowerShell Team blog at:    http://blogs.msdn.com/PowerShell
Visit the Windows PowerShell ScriptCenter at:  http://www.microsoft.com/technet/scriptcenter/hubs/msh.mspx

Comments
  • Still getting to grips with the juiciness of Select-String and PowerShell generally - being able to use real .NET regexes means I can do this:

    gci . -i *.cs -r | select-string "(?<!//.*)queryflag"

    to find all instances of the string "queryflag" when it hasn't been commented out - and suddenly negative look-behind makes sense to me.

    But why do I have to pipe my files in from gci? Wouldn't a -recurse flag make sense for select-string, just as for findstr??

  • I like how select-string only returns ONE match.  WTF??  What if you have a bunch of things in your log like "FAIL" or "ERROR"?  Why would I only want to see the first match? LAME-O! You fail, Powershell.  Why don't they just make a 'grep' like Linux?  

  • Just use the -AllMatches option.  You're the one failing, Anony Mouse.
