Automating the world one-liner at a time…
In my blog post Processing .EML files using Select-String and SetCreationTime() (http://blogs.msdn.com/powershell/archive/2006/11/23/processing-eml-files-with-select-string-and-setcreationtime.aspx ), I showed that you can do some incredibly powerful stuff using 2 lines of PowerShell script. You can do this because of the design of Select-String. Select-String is a very cool utility and can make you very productive, so let's drill into it. Note: I had a bug in that script, which I'll correct at the end.
I think you are going to enjoy this one but you’ll probably want to grab a cup of coffee before starting – you want to be alert and you don’t want to rush this. While this is a deep drill into Select-String, it is more than that as well. It is a drill-into the mindset and the patterns of Windows PowerShell. That is a lesson worth learning because you’ll use the lessons thousands of times in the future.
You've probably already realized that Select-String is an awesome tool. You've probably also realized that you can specify a string and a wildcarded list of files to search. Let's create some common text so that we are searching against the same stuff. Here is a chewy little one-liner which dumps all of your aliases into a set of files. Note that in these examples, I'll be using the alias ss for Select-String. Also note that my aliases may be different from yours, so your results may differ.
PS> cd c:\sstest
PS> [int][char]"a"..[int][char]"z" |
>> %{$c=[char]$_; gal ($c + "*") > ($c + "-alias.txt")}
>>
PS> dir

    Directory: Microsoft.PowerShell.Core\FileSystem::C:\SSTest

Mode          LastWriteTime       Length Name
----          -------------       ------ ----
-a---   11/23/2006  6:14 AM          930 a-alias.txt
-a---   11/23/2006  6:14 AM         2162 b-alias.txt
-a---   11/23/2006  6:14 AM         3086 c-alias.txt
-a---   11/23/2006  6:14 AM          776 d-alias.txt
-a---   11/23/2006  6:14 AM         1084 e-alias.txt
-a---   11/23/2006  6:14 AM         1084 f-alias.txt
-a---   11/23/2006  6:14 AM         4164 g-alias.txt
-a---   11/23/2006  6:14 AM          622 h-alias.txt
-a---   11/23/2006  6:14 AM         1238 i-alias.txt
-a---   11/23/2006  6:14 AM            0 j-alias.txt
-a---   11/23/2006  6:14 AM          468 k-alias.txt
-a---   11/23/2006  6:14 AM          776 l-alias.txt
-a---   11/23/2006  6:14 AM         1392 m-alias.txt
-a---   11/23/2006  6:14 AM         1238 n-alias.txt
-a---   11/23/2006  6:14 AM          468 o-alias.txt
-a---   11/23/2006  6:14 AM          930 p-alias.txt
-a---   11/23/2006  6:14 AM            0 q-alias.txt
-a---   11/23/2006  6:14 AM         2624 r-alias.txt
-a---   11/23/2006  6:14 AM         3394 s-alias.txt
-a---   11/23/2006  6:14 AM          776 t-alias.txt
-a---   11/23/2006  6:14 AM          468 u-alias.txt
-a---   11/23/2006  6:14 AM            0 v-alias.txt
-a---   11/23/2006  6:14 AM          622 w-alias.txt
-a---   11/23/2006  6:14 AM            0 x-alias.txt
-a---   11/23/2006  6:14 AM            0 y-alias.txt
-a---   11/23/2006  6:14 AM            0 z-alias.txt
Now let's find a string. First we'll search all of the files, then I'll show you how to specify which files you want to search. You can do this by specifying a wildcard or a set of wildcards:
PS> ss drive *
g-alias.txt:8:Alias gdr Get-PSDrive
m-alias.txt:6:Alias mount New-PSDrive
n-alias.txt:5:Alias ndr New-PSDrive
r-alias.txt:4:Alias rdr Remove-PSDrive
PS> ss drive g*
g-alias.txt:8:Alias gdr Get-PSDrive
PS> ss drive [a-m]*
g-alias.txt:8:Alias gdr Get-PSDrive
m-alias.txt:6:Alias mount New-PSDrive
PS> ss drive g*,r*
g-alias.txt:8:Alias gdr Get-PSDrive
r-alias.txt:4:Alias rdr Remove-PSDrive
In that example, we were looking for a string but you can specify a regular expression or set of regular expressions:
PS> ss " [a-r]dr" *
g-alias.txt:8:Alias gdr Get-PSDrive
n-alias.txt:5:Alias ndr New-PSDrive
r-alias.txt:4:Alias rdr Remove-PSDrive
PS> ss " [a-r]dr|mount" *
g-alias.txt:8:Alias gdr Get-PSDrive
m-alias.txt:6:Alias mount New-PSDrive
n-alias.txt:5:Alias ndr New-PSDrive
r-alias.txt:4:Alias rdr Remove-PSDrive
PS> ss " [a-n]dr"," [m-z]dr" *
g-alias.txt:8:Alias gdr Get-PSDrive
n-alias.txt:5:Alias ndr New-PSDrive
r-alias.txt:4:Alias rdr Remove-PSDrive
NOTE: If you don't want Select-String to interpret the search string as a regular expression, just add the switch -SimpleMatch and the string will be treated as a literal.
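To make the difference concrete, here is a minimal sketch that runs the same pattern with and without -SimpleMatch. The temporary folder, the file name sample.txt and its two lines are made-up sample data, not part of the alias files used elsewhere in this post.

```powershell
# Hypothetical sample data, created in a throwaway temp folder for illustration only.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ("ss-simple-" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $dir | Out-Null
$file = Join-Path $dir 'sample.txt'
Set-Content -Path $file -Value 'price is a.b', 'price is axb'

# Default behavior: the pattern is a regular expression, so "." matches any character.
$regexHits = @(Select-String -Pattern 'a.b' -Path $file)

# With -SimpleMatch the pattern is a literal string, so only the text "a.b" matches.
$literalHits = @(Select-String -Pattern 'a.b' -Path $file -SimpleMatch)

"Regex matches:   $($regexHits.Count)"    # both lines match
"Literal matches: $($literalHits.Count)"  # only the first line matches

# Clean up the sample data.
Remove-Item -Path $dir -Recurse -Force
```

This is handy when the text you are hunting for contains characters such as ".", "*" or "[" that would otherwise need escaping.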
It is probably pretty obvious that you can control which files you search against by specifying the correct wildcard in the file specification but did you also realize that you can tweak this with –INCLUDE and –EXCLUDE? These parameters take a wildcard expression or set of wildcard expressions and operate AFTER the filepath is resolved. Here is an example of exclude:
PS> ss drive * -exclude *[mn]-al*
g-alias.txt:8:Alias gdr Get-PSDrive
r-alias.txt:4:Alias rdr Remove-PSDrive
PS> ss drive * -exclude [mn]-al*
g-alias.txt:8:Alias gdr Get-PSDrive
m-alias.txt:6:Alias mount New-PSDrive
n-alias.txt:5:Alias ndr New-PSDrive
r-alias.txt:4:Alias rdr Remove-PSDrive
Now take a second and examine the differences between those two and ask yourself the question, why didn’t the second request exclude the files that started with M or N?
The answer is that -Exclude and -Include work against the FULL PATHNAME (C:\sstest\m-alias.txt), not the ChildName (m-alias.txt).
The benefits of this become clearer when you begin to explore the pipeline scenarios. Select-String can accept pipeline data from any command that:
1. Produces objects that have a PATH property.
2. Produces FileInfo or MatchInfo objects.
a. Actually it will accept any object from the pipeline, but it only does real work when the object is one of these. As a side note, you might ask why we accept any object. There is a long answer that I won't go into, but let me tease you with a feature that we might add in the future and then you can connect the dots. Imagine the following command:
PS> Get-Process | Select-String Office -Property Description
where this would create a MatchInfo record for the equivalent of:
PS> Get-Process | Select Description | where {$_.Description -match "Office"}
PS> dir [g-r]*

    Directory: Microsoft.PowerShell.Core\FileSystem::C:\SSTest

Mode          LastWriteTime       Length Name
----          -------------       ------ ----
-a---   11/23/2006  6:26 AM         2760 g-alias.txt
-a---   11/23/2006  6:26 AM          414 h-alias.txt
-a---   11/23/2006  6:26 AM          822 i-alias.txt
-a---   11/23/2006  6:26 AM            0 j-alias.txt
-a---   11/23/2006  6:26 AM          312 k-alias.txt
-a---   11/23/2006  6:26 AM          516 l-alias.txt
-a---   11/23/2006  6:26 AM          924 m-alias.txt
-a---   11/23/2006  6:26 AM          822 n-alias.txt
-a---   11/23/2006  6:26 AM          312 o-alias.txt
-a---   11/23/2006  6:26 AM          618 p-alias.txt
-a---   11/23/2006  6:26 AM            0 q-alias.txt
-a---   11/23/2006  6:26 AM         1740 r-alias.txt

PS> dir [g-r]* |ss drive
g-alias.txt:8:Alias gdr Get-PSDrive
m-alias.txt:6:Alias mount New-PSDrive
n-alias.txt:5:Alias ndr New-PSDrive
r-alias.txt:4:Alias rdr Remove-PSDrive
PS> ([System.IO.DirectoryInfo]"c:\sstest").GetFileSystemInfos()

Mode          LastWriteTime       Length Name
----          -------------       ------ ----
-a---   11/23/2006  6:26 AM          618 a-alias.txt
-a---   11/23/2006  6:26 AM         1434 b-alias.txt
-a---   11/23/2006  6:26 AM         2046 c-alias.txt
-a---   11/23/2006  6:26 AM          516 d-alias.txt
-a---   11/23/2006  6:26 AM          720 e-alias.txt
-a---   11/23/2006  6:26 AM          720 f-alias.txt
-a---   11/23/2006  6:26 AM         2760 g-alias.txt
-a---   11/23/2006  6:26 AM          414 h-alias.txt
-a---   11/23/2006  6:26 AM          822 i-alias.txt
-a---   11/23/2006  6:26 AM            0 j-alias.txt
-a---   11/23/2006  6:26 AM          312 k-alias.txt
-a---   11/23/2006  6:26 AM          516 l-alias.txt
-a---   11/23/2006  6:26 AM          924 m-alias.txt
-a---   11/23/2006  6:26 AM          822 n-alias.txt
-a---   11/23/2006  6:26 AM          312 o-alias.txt
-a---   11/23/2006  6:26 AM          618 p-alias.txt
-a---   11/23/2006  6:26 AM            0 q-alias.txt
-a---   11/23/2006  6:26 AM         1740 r-alias.txt
-a---   11/23/2006  6:26 AM         2250 s-alias.txt
-a---   11/23/2006  6:26 AM          516 t-alias.txt
-a---   11/23/2006  6:26 AM          312 u-alias.txt
-a---   11/23/2006  6:26 AM            0 v-alias.txt
-a---   11/23/2006  6:26 AM          414 w-alias.txt
-a---   11/23/2006  6:26 AM            0 x-alias.txt
-a---   11/23/2006  6:26 AM            0 y-alias.txt
-a---   11/23/2006  6:26 AM            0 z-alias.txt

PS> ([System.IO.DirectoryInfo]"c:\sstest").GetFileSystemInfos() |
>> ss drive
>>
c:\sstest\g-alias.txt:8:Alias gdr Get-PSDrive
c:\sstest\m-alias.txt:6:Alias mount New-PSDrive
c:\sstest\n-alias.txt:5:Alias ndr New-PSDrive
c:\sstest\r-alias.txt:4:Alias rdr Remove-PSDrive
Here is where it gets fun! Select-String accepts MatchInfo objects from the pipeline, but at this point the only command that produces MatchInfo objects is Select-String itself. Let's do a manual text substitution and restate the heading:
You can pipe the output of Select-String into Select-String
Yes – that’s exactly correct and the implications are awesome. First it means that you can avoid some really hairy/scary regular expressions for certain things. The regularity of the data we are working with doesn’t allow a very interesting example but here is one:
PS> ss get *|ss item
d-alias.txt:6:Alias dir Get-ChildItem
g-alias.txt:6:Alias gci Get-ChildItem
g-alias.txt:10:Alias gi Get-Item
g-alias.txt:13:Alias gp Get-ItemProperty
l-alias.txt:5:Alias ls Get-ChildItem
What is so cool about this is that the second Select-String is ONLY searching the results of the first Select-String (i.e. it is not reprocessing the file itself). This is great because it allows you to store the results of one long and expensive Select-String and then do a set of very quick and cheap searches against those results. In the example below I search all the txt files underneath the Windows directory and store the results in a global variable. I then use that global variable to search for the same term a second time and show that it is 31 times faster the second time. Of course, in practice you'd be looking for other things in the stored results. This makes it easy to iterate through a series of investigations. We want you to explore your system, so making it cheap to do so is important to us.
PS> (Measure-Command {
>> $global:x = dir c:\windows *.txt -recurse -ea SilentlyContinue |
>> ss PowerShell
>> }
>> ).TotalMilliseconds
>>
Select-String : The file can not be read: C:\windows\Tasks\SCHEDLGU.TXT
At line:3 char:3
+ ss <<<< PowerShell
5111.9712
PS> (Measure-Command {
>> $global:x |
>> ss PowerShell
>> }
>> ).TotalMilliseconds
>>
166.9715
PS> [int](5111/166)
31
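The timing run above repeats the same search term. As a rough sketch of the "looking for other things" case, the snippet below does one pass over the files and then asks two different follow-up questions of the stored results. The temporary folder and the files g.txt and r.txt are hypothetical stand-ins for a large directory tree, and the variable names are mine.

```powershell
# Hypothetical sample data standing in for a large, expensive-to-search tree.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ("ss-cache-" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $dir | Out-Null
Set-Content -Path (Join-Path $dir 'g.txt') -Value 'Alias gci Get-ChildItem', 'Alias gi Get-Item', 'Alias gdr Get-PSDrive'
Set-Content -Path (Join-Path $dir 'r.txt') -Value 'Alias ri Remove-Item', 'Alias rdr Remove-PSDrive'

# The one expensive pass: read the files once and keep the resulting MatchInfo objects.
$cached = @(Get-ChildItem -Path $dir -Filter *.txt | Select-String -Pattern 'Get-')

# Cheap follow-up searches run against the stored matches, not against the files.
$items  = @($cached | Select-String -Pattern 'Item')
$drives = @($cached | Select-String -Pattern 'Drive')

# Show the matching lines from each follow-up search.
$items  | ForEach-Object { $_.Line }
$drives | ForEach-Object { $_.Line }

# Clean up the sample data.
Remove-Item -Path $dir -Recurse -Force
```

Because the follow-up searches only see what the first search kept, choose a first pattern broad enough to cover everything you might want to ask later.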
As with most Cmdlets, the output of Select-String looks like text but is actually a stream of objects with a text rendering. This is great because then you can party with the objects.
PS> ss stop *
k-alias.txt:4:Alias kill Stop-Process
s-alias.txt:13:Alias spps Stop-Process
s-alias.txt:14:Alias spsv Stop-Service
PS> ss Stop * |get-Member -MemberType Property

   TypeName: Microsoft.PowerShell.Commands.MatchInfo

Name       MemberType Definition
----       ---------- ----------
Filename   Property   System.String Filename {get;}
IgnoreCase Property   System.Boolean IgnoreCase {get;set;}
Line       Property   System.String Line {get;set;}
LineNumber Property   System.Int32 LineNumber {get;set;}
Path       Property   System.String Path {get;set;}
Pattern    Property   System.String Pattern {get;set;}

PS> ss "stop","new.*ve" *|fl *

IgnoreCase : True
LineNumber : 4
Line       : Alias kill Stop-Process
Filename   : k-alias.txt
Path       : C:\SSTest\k-alias.txt
Pattern    : stop

IgnoreCase : True
LineNumber : 6
Line       : Alias mount New-PSDrive
Filename   : m-alias.txt
Path       : C:\SSTest\m-alias.txt
Pattern    : new.*ve

IgnoreCase : True
LineNumber : 5
Line       : Alias ndr New-PSDrive
Filename   : n-alias.txt
Path       : C:\SSTest\n-alias.txt
Pattern    : new.*ve

IgnoreCase : True
LineNumber : 13
Line       : Alias spps Stop-Process
Filename   : s-alias.txt
Path       : C:\SSTest\s-alias.txt
Pattern    : stop

IgnoreCase : True
LineNumber : 14
Line       : Alias spsv Stop-Service
Filename   : s-alias.txt
Path       : C:\SSTest\s-alias.txt
Pattern    : stop

PS> ss "stop","new.*ve" *|group Pattern

Count Name      Group
----- ----      -----
    3 stop      {k-alias.txt, s-alias.txt, s-alias.txt}
    2 new.*ve   {m-alias.txt, n-alias.txt}
So yes you can party with the objects but it’s not about partying is it? It’s about being incredibly productive.
In my blog post (http://blogs.msdn.com/powershell/archive/2006/11/23/processing-eml-files-with-select-string-and-setcreationtime.aspx ) we did the following:
foreach ($record in Select-String ^Date: *.eml) {
    [System.IO.File]::SetCreationTime($Record.Path, [datetime]($record.line.substring(6)))
}
While this works, there is a potential bug here. Imagine the case where a particular file has multiple lines that match "^Date:". The script would set the creation time of the file once for each of those lines, so the last match would win. What we want is just the very first match in each file. That is exactly what the switch -List does.
PS> ss item r*
r-alias.txt:5:Alias ri Remove-Item
r-alias.txt:6:Alias rni Rename-Item
r-alias.txt:7:Alias rnp Rename-ItemPr...
r-alias.txt:8:Alias rp Remove-ItemPr...
r-alias.txt:13:Alias rm Remove-Item
r-alias.txt:14:Alias rmdir Remove-Item
r-alias.txt:15:Alias rd Remove-Item
r-alias.txt:16:Alias ren Rename-Item
PS> ss item r* -list
r-alias.txt:5:Alias ri Remove-Item
So the correct script should have been:
foreach ($record in Select-String ^Date: *.eml -List) {
    [System.IO.File]::SetCreationTime($Record.Path, [datetime]($record.line.substring(6)))
}
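To see the effect of -List on the multiple-match case, here is a small sketch against two made-up .eml files, one of which has a second line starting with "Date:". The file names and contents are hypothetical, and the snippet only counts matches; it does not touch any creation times.

```powershell
# Hypothetical sample mail files in a throwaway temp folder.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ("ss-list-" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $dir | Out-Null
Set-Content -Path (Join-Path $dir 'one.eml') -Value 'Date: Thu, 23 Nov 2006 06:14:00', 'Subject: first', 'Date: a quoted header in the body'
Set-Content -Path (Join-Path $dir 'two.eml') -Value 'Date: Thu, 23 Nov 2006 06:26:00', 'Subject: second'
$mail = Join-Path $dir '*.eml'

# Without -List, every matching line is returned, so one.eml shows up twice.
$allMatches = @(Select-String -Pattern '^Date:' -Path $mail)

# With -List, searching stops after the first match in each file.
$firstMatches = @(Select-String -Pattern '^Date:' -Path $mail -List)

"All matches:       $($allMatches.Count)"
"First-match only:  $($firstMatches.Count)"
$firstMatches | ForEach-Object { "{0}:{1}" -f $_.Filename, $_.LineNumber }

# Clean up the sample data.
Remove-Item -Path $dir -Recurse -Force
```

A side benefit is that -List can stop reading a file as soon as it finds a hit, which helps when you only need to know which files contain a match.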
Enjoy and have a good Thanksgiving.
Jeffrey Snover [MSFT]
Windows PowerShell/MMC Architect
Visit the Windows PowerShell Team blog at: http://blogs.msdn.com/PowerShell
Visit the Windows PowerShell ScriptCenter at: http://www.microsoft.com/technet/scriptcenter/hubs/msh.mspx
Still getting to grips with the juicyness of select-string and powershell generally - being able to use real .Net regexes means I can do this:
gci . -i *.cs -r | select-string "(?<!//.*)queryflag"
to find all instances of the string "queryflag" when it hasn't been commented out - and suddenly negative look-behind makes sense to me.
But why do I have to pipe my files in from gci? Wouldn't a -recurse flag make sense for select-string, just as for findstr??
I like how select-string only returns ONE match. WTF?? What if you have a bunch of things in your log like "FAIL" or "ERROR"? Why would I only want to see the first match? LAME-O! You fail, Powershell. Why don't they just make a 'grep' like Linux?
Just use the -AllMatches option. You're the one failing, Anony Mouse.