Why is Get-ChildItem so Slow?

We get this question fairly frequently when it comes to slow network connections.

The performance of directory listings (especially over a laggy network) is limited by the .NET APIs we call to retrieve the directory information. The current set of APIs has two limitations:

Forced Retrieval of Attributes

When we do a directory listing, we show the standard attributes of each file or directory: Mode, LastWriteTime, Length, and Name. The core Windows API is highly optimized for this basic scenario, and returns these attributes by default along with the rest of the file information. The .NET Framework, however, doesn't take advantage of this data; instead, it goes back to the network location and asks for all of the file attributes. This chatty behaviour adds several network round trips for each file or directory, making the directory listing many times slower: hundreds or thousands of times slower in many cases. The Framework team addressed this as part of .NET 4.0, and you'll see the benefits of this new feature as soon as we are able to adopt it.
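A quick, unscientific way to observe the difference is to time a plain Get-ChildItem against cmd.exe's native dir on the same share (the UNC path below is a placeholder; substitute a real one):

```powershell
# Rough timing comparison on a network share; the path is a placeholder.
# The gap widens with network latency, since the extra attribute
# round trips are paid once per file or directory.
$share = '\\server\share'

(Measure-Command { Get-ChildItem $share }).TotalMilliseconds
(Measure-Command { cmd.exe /c dir $share }).TotalMilliseconds
```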

Even without the benefit of the new .NET API, though, version two brings a huge improvement in wildcarded directory listings (both local and remote).

As background, PowerShell wildcards are different from straight cmd.exe wildcards. For example, PowerShell wildcards do not match 8.3 short file names, while native file system filtering (exposed by cmd.exe wildcards) does. PowerShell wildcards support character ranges, while native file system filtering does not. Because of this, PowerShell wildcard processing happens AFTER we've retrieved all of the files.
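These semantic differences are easy to demonstrate (the file names below are hypothetical):

```powershell
# cmd.exe wildcards also match 8.3 short names: "dir *.htm" can return
# page.html, because its generated short name is PAGE~1.HTM.
cmd.exe /c dir *.htm

# PowerShell wildcards match only the long name...
Get-ChildItem *.htm

# ...but support character ranges, which native filtering does not:
Get-ChildItem [a-f]*.txt
```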

This comes at a cost, however. Native file system filtering (as exposed by the –Filter parameter) is MUCH faster, as its processing is wired into the Windows file system.

In version two, we did a bunch of work to reduce this cost. When you provide a PowerShell wildcard, we convert as much of it as possible to a native file system filter, and then apply our wildcard logic to the much smaller set of results. You've probably noticed this most in tab completion, but it makes a huge difference in regular wildcarded directory listings, especially remote ones. Since the native filtering is processed by the remote file system, we don't pay the performance penalty of accessing attributes of files that you ultimately don't care about anyway. In version one, you can work around the issue by specifying the –Filter parameter directly. If this still doesn't provide the speed you need, you can call "cmd.exe /c dir".
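For example (the UNC path is again a placeholder), the same listing can be expressed three ways with very different performance profiles:

```powershell
# Slow in v1: the PowerShell wildcard is applied only after every
# entry (and its attributes) has been retrieved from the remote share.
Get-ChildItem \\server\share\[a-c]*.log

# Fast: -Filter is passed through to the file system, so only
# matching entries come back over the wire.
Get-ChildItem \\server\share -Filter *.log

# v1 workaround with full wildcard semantics: let the file system
# narrow the set, then apply the richer PowerShell wildcard locally.
Get-ChildItem \\server\share -Filter *.log |
    Where-Object { $_.Name -like '[a-c]*.log' }
```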

 

Lack of Enumeration API

This issue arises in directory listings that contain many files. The DirectoryInfo.GetFiles() method returns an array. When building that result array, the .NET Framework performs many re-allocations (and copies) of the array, causing performance to degrade sharply as the file count grows:

[Chart: GetChildItemPerf, showing Get-ChildItem elapsed time growing rapidly with the number of files]

This, too, has been resolved in the .NET 4.0 updates, which offer an API that lets you enumerate through the results of a directory listing rather than retrieve them all at once. If you are running into this limitation, you can again apply the wildcarding approach. If that still doesn't provide the speed you need, you can call "cmd.exe /c <command>".
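Once a .NET 4.0-hosted PowerShell is available, the difference between the two API shapes looks like this sketch (Directory.GetFiles and Directory.EnumerateFiles are the relevant .NET methods; EnumerateFiles requires .NET 4.0):

```powershell
# GetFiles() builds the complete result array before returning anything:
[System.IO.Directory]::GetFiles('C:\Windows\System32') |
    Select-Object -First 5

# EnumerateFiles() (new in .NET 4.0) streams results as the file
# system produces them, so the first names appear immediately and
# memory use stays flat regardless of directory size:
[System.IO.Directory]::EnumerateFiles('C:\Windows\System32') |
    Select-Object -First 5
```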

Why Don’t We Fix It?

Since cmd.exe isn’t impacted by these issues, why don’t we just do the same thing and call into the core Windows APIs directly? The reason is twofold:

  1. A core tenet of PowerShell is providing access to the REAL underlying .NET objects. If we implemented the semantics ourselves, we’d have to return new types of objects – something like PSFileInfo and PSDirectoryInfo. V1 scripts (or downstream cmdlets) that expect the REAL underlying .NET objects would fail to work. While we could add a new switch (-Raw?), users would still have to change their scripts to support it. In that case, they might as well use the existing cmd /c workaround.
  2. This issue is ultimately transient. While it’s annoying to drag out over a few years, it will ultimately come and go without users having to change their behaviour. One day, you’ll install a build and the issues will just magically be gone.

Again, thanks for your continuing feedback. That’s what ultimately helped us discover the issue and make sure the right people knew about it.

 

Lee Holmes [MSFT]
Windows PowerShell Development
Microsoft Corporation

  • > and you’ll see the benefits of this new feature as soon as we are able to adopt it.

    Does this mean that we don't need to wait for 3 years and appearance of next Windows?

  • This post points out an issue that has existed forever, there is no documentation of the responsibilities of a PowerShell provider.  PS V2.0 broke our provider because you started passing wildcards to ItemCmdletProvider.ItemExists.  The documentation for ItemExists says nothing about wildcards.  What wildcards can we expect to see?  What other changes do provider writers need to know about?

  • @Smica,

    For sure, PowerShell v3 is at least 2 years out...  One year passed between PowerShell v1 and v2 CTP1.  From there it was roughly 6 months to CTP2, another 6 months to CTP3, and more or less another 6 months to the beta, then we had to wait still a bit for the Windows 7 launch.

    Now, this post mentions .NET 4.0...  That could mean major regression testing for PowerShell (and may also mean we now see "v2.0" for the folder structure and .ps2 script extensions vs .ps1).

    A lot to think about...

  • Hi Thomas,

    thanks for the great post. Are you aware whether .NET 4.0 also removes the 260-character path limitation???

    That one is extremely painful :(

    Thanks,

    Martin

  • My biggest complaints about PowerShell right now are:

    * Performance of ls.

    * Documentation (an alternative to get-help in ISE).

    * Text readability in ISE.

    I urge you to consider a "V2.5" release that takes advantage of improvements to filesystem performance/limitations and WPF in .NET 4.0, but with very few other changes. Preferably in half the time you took to bake V2 :).

  • I've recently started looking at PS in anger.  I am an experienced C#, VB, VBScript, JScript and Perl coder so I was looking forward to getting stuck in.

    To date, I've been really disappointed.  Apart from a number of syntactical oddities, like function parameters not being comma-separated, it's the performance that has proved most disappointing.  I'd class fetching every object (such as the contents of a directory or OU) and then filtering it locally as a "schoolboy error".  Reading the entire contents of a large file rather than processing it line by line is a waste of precious resources.  So yes, I'm also looking forward to the next version, as this one has not cut it.

  • I ran into this problem recently and I don't know if this will make sense to anyone...but thought I would post it because it definitely helped me.

    Using the .NET API, I used this search method (searching for user.db files (Groupwise archives)) and it TREMENDOUSLY sped up the process compared to a recursive get-childitem.

    I still definitely consider myself a powershell newbie so I hope this makes sense to people...it was one of those things where a lot of MSDN research and reverse engineering came into play.  I wish I had the credit to give to the original function I had found but I can't find it :(

    # SCOPE: SEARCH A DIRECTORY FOR FILES (W/WILDCARDS IF NECESSARY)
    # Usage:
    #   $directory = "\\SERVER\SHARE"
    #   $searchterms = "filename[*].ext"
    #   PS> $Results = Search $directory $searchterms

    [reflection.assembly]::loadwithpartialname("Microsoft.VisualBasic") | Out-Null

    Function Search {
        # Parameters: $Path and $SearchString
        param (
            [Parameter(Mandatory=$true, ValueFromPipeline = $true)][string]$Path,
            [Parameter(Mandatory=$true)][string]$SearchString
        )

        try {
            # .NET GetFiles method to look for the file
            # BENEFIT: could possibly run as a background job (haven't looked into it yet)
            [Microsoft.VisualBasic.FileIO.FileSystem]::GetFiles(
                $Path,
                [Microsoft.VisualBasic.FileIO.SearchOption]::SearchAllSubDirectories,
                $SearchString
            )
        } catch { $_ }
    }

  • This is a couple of years later. I'm wondering what is the fastest way to search server shares with Powershell or anything else.

  • Get-ChildItem in Windows PowerShell 3.0 Beta (blogs.msdn.com/.../windows-management-framework-3-0-beta-available-for-download.aspx) uses the new enumeration APIs that are part of .NET 4. Searching server shares is much faster as a result.

    Travis Jones [MSFT]

    Program Manager - Windows PowerShell

  • Hello,

    I suggest you to download new Long Path Tool software that simply allows you to work easily on Long Path files.

    Thank you.............
