<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://blogs.msdn.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Media And Microcode : Scripting The Web</title><link>http://blogs.msdn.com/mediaandmicrocode/archive/tags/Scripting+The+Web/default.aspx</link><description>Tags: Scripting The Web</description><dc:language>en-US</dc:language><generator>CommunityServer 2.1 SP1 (Build: 61025.2)</generator><item><title>Microcode: PowerShell Scripting Tricks: Scripting the Web (Part 3) (Resolve-Link, Get-WebPageLink)</title><link>http://blogs.msdn.com/mediaandmicrocode/archive/2008/12/12/microcode-powershell-scripting-tricks-scripting-the-web-part-3-resolve-link-get-webpagelink.aspx</link><pubDate>Fri, 12 Dec 2008 12:59:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9201602</guid><dc:creator>JamesBrundage</dc:creator><slash:comments>1</slash:comments><comments>http://blogs.msdn.com/mediaandmicrocode/comments/9201602.aspx</comments><wfw:commentRss>http://blogs.msdn.com/mediaandmicrocode/commentrss.aspx?PostID=9201602</wfw:commentRss><description>&lt;P&gt;The first post in this series was learning to crawl.&amp;nbsp; I introduced &lt;A href="http://blogs.msdn.com/mediaandmicrocode/archive/2008/12/01/microcode-powershell-scripting-tricks-scripting-the-web-part-1-get-web.aspx" mce_href="http://blogs.msdn.com/mediaandmicrocode/archive/2008/12/01/microcode-powershell-scripting-tricks-scripting-the-web-part-1-get-web.aspx"&gt;Get-Web&lt;/A&gt;, which allows you to use System.Net.Webclient to download web sites in a variety of ways.&amp;nbsp; The next post was learning to walk.&amp;nbsp; I showed us &lt;A href="http://blogs.msdn.com/mediaandmicrocode/archive/2008/12/08/microcode-powershell-scripting-tricks-scripting-the-web-part-2-get-markuptag.aspx" 
mce_href="http://blogs.msdn.com/mediaandmicrocode/archive/2008/12/08/microcode-powershell-scripting-tricks-scripting-the-web-part-2-get-markuptag.aspx"&gt;Get-MarkupTag&lt;/A&gt;, which helps coerce parts of the web into XML.&amp;nbsp; Now we can start to really have some fun with the data and run wild.&lt;/P&gt;
&lt;P&gt;Pulling out semi-structured data is one thing, but it’s important to be able to pull out more complex information as well.&amp;nbsp; One interesting case is pulling out all of the links from a webpage.&amp;nbsp;&amp;nbsp; This task breaks down into four smaller tasks:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Downloading the page (done with Get-Web) 
&lt;LI&gt;Getting the &amp;lt;a&amp;gt; tags in a meaningful way (done with &lt;A href="http://blogs.msdn.com/mediaandmicrocode/archive/2008/12/08/microcode-powershell-scripting-tricks-scripting-the-web-part-2-get-markuptag.aspx" mce_href="http://blogs.msdn.com/mediaandmicrocode/archive/2008/12/08/microcode-powershell-scripting-tricks-scripting-the-web-part-2-get-markuptag.aspx"&gt;Get-MarkupTag&lt;/A&gt;) 
&lt;LI&gt;Extracting out the href attribute 
&lt;LI&gt;Determining if the link is relative or absolute &lt;/LI&gt;&lt;/OL&gt;
&lt;P&gt;To determine if the link is relative or absolute, I made a Resolve-Link function.&amp;nbsp; It takes a base url (e.g. &lt;A href="http://www.foo.com/blah/blah.asp" mce_href="http://www.foo.com/blah/blah.asp"&gt;http://www.foo.com/blah/blah.asp&lt;/A&gt;) and a link found on it, and returns the absolute url it resolves to.&amp;nbsp; It optionally returns a property bag with the type of link and the resolved link.&lt;/P&gt;
&lt;P&gt;Here’s Resolve-Link:&lt;/P&gt;&lt;I&gt;
&lt;BLOCKQUOTE&gt;&lt;PRE class=CmdletDefinition&gt;function Resolve-Link([Uri]$uri,
    [string]$link,
    [switch]$returnLinkType) {
    #.Synopsis
    #   Resolves a relative or absolute link to an absolute url
    #.Description
    #   Takes a uri and a link to a page and returns the absolute url, or
    #   optionally returns a property bag with the link type
    #   (absolute, relative, or host relative) and the link
    #.Parameter uri
    #   The uri the link is located on
    #.Parameter link
    #   The original link text
    #.Parameter returnLinkType
    #   If set, returns a property bag with the link type and the link
    #.Example
    #   Resolve-Link http://www.microsoft.com/ /technet/scriptcenter
    if ($link.StartsWith("/")) {
        # Relative to Host site
        if ($returnLinkType) {
            return New-Object Object |
                Add-Member NoteProperty Type "Host Relative" -PassThru |
                Add-Member NoteProperty Link ([uri]"$($uri.Scheme)://$($uri.DnsSafehost)$($link)") -PassThru
        }
        return "$($uri.Scheme)://$($uri.DnsSafehost)$($link)"
    } else {
        if ($link -match "^[a-z][a-z0-9+.-]*://") {
            # Absolute Link (any scheme, not only the scheme of the base uri)
            if ($returnLinkType) {
                return New-Object Object |
                    Add-Member NoteProperty Type "Absolute" -PassThru |
                    Add-Member NoteProperty Link ([uri]$link) -PassThru
            }            
            return $link
        } else {
            # Relative link
            $realLink = $uri.AbsoluteUri.Substring(0,
                $uri.AbsoluteUri.LastIndexOf("/")) + "/$link"    
            if ($returnLinkType) {
                return New-Object Object |
                    Add-Member NoteProperty Type "Relative" -PassThru |
                    Add-Member NoteProperty Link ([uri]$realLink) -PassThru
            }
            return $realLink            
        }
    }    
}&lt;/PRE&gt;&lt;/BLOCKQUOTE&gt;&lt;/I&gt;
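&lt;P&gt;As a point of comparison, .NET can also do this kind of resolution by itself.&amp;nbsp; The sketch below is not part of Resolve-Link; it assumes that the System.Uri constructor that takes a base Uri and a relative string gives the answer you want, and the base url and links are sample values picked for illustration.&lt;/P&gt;&lt;I&gt;
&lt;BLOCKQUOTE&gt;&lt;PRE&gt;# Assumption: the base url and links below are sample values
$base = [Uri]"http://www.foo.com/blah/blah.asp"
$links = "/technet/scriptcenter", "other.asp", "http://www.microsoft.com/"
$resolvedLinks = foreach ($link in $links) {
    # Uri(Uri baseUri, string relativeUri) resolves the link against the base
    (New-Object Uri $base, $link).AbsoluteUri
}
$resolvedLinks&lt;/PRE&gt;&lt;/BLOCKQUOTE&gt;&lt;/I&gt;
&lt;P&gt;This does not tell you which kind of link you started with, so Resolve-Link is still the tool to reach for when you want the link type.&lt;/P&gt;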
&lt;P&gt;Once Resolve-Link is written, making Get-WebPageLink is a snap.&amp;nbsp; It’s below, and it takes only 3 lines to do the real work, plus about 10 lines of comments to explain the work and give an example.&lt;/P&gt;&lt;I&gt;
&lt;BLOCKQUOTE&gt;&lt;PRE class=CmdletDefinition&gt;function Get-WebPageLink($url) {
    #.Synopsis
    #   Returns all of the links within a webpage
    #.Description
    #   Resolves all &amp;lt;A&amp;gt; references and returns a property bag with
    #   the text contained in the link, the resolved link,
    #   and the type of link returned (absolute, host relative, or relative)
    #.Parameter url
    #   The page to get links from
    #.Example
    #   Get-WebPageLink http://blogs.msdn.com/
    Get-MarkupTag a (Get-Web $url) | Foreach-Object {
        Resolve-Link $url $_.Xml.Href -returnLinkType |
            Add-Member NoteProperty Text $_.Xml."#text" -PassThru 
    }
}&lt;/PRE&gt;&lt;/BLOCKQUOTE&gt;&lt;/I&gt;
&lt;P&gt;Go ahead and give Get-WebPageLink a whirl: &lt;/P&gt;&lt;I&gt;
&lt;BLOCKQUOTE&gt;&lt;PRE&gt;Get-WebPageLink http://blogs.msdn.com&lt;/PRE&gt;&lt;/BLOCKQUOTE&gt;&lt;/I&gt;
&lt;P&gt;Ready for some real fun? Remember way back when I did a post about getting RSS feeds in PowerShell with Microsoft.FeedsManager (&lt;A href="http://blogs.msdn.com/mediaandmicrocode/archive/2008/11/11/microcode-scripting-rss-feeds-with-powershell-and-microsoft-feedsmanager.aspx" mce_href="http://blogs.msdn.com/mediaandmicrocode/archive/2008/11/11/microcode-scripting-rss-feeds-with-powershell-and-microsoft-feedsmanager.aspx"&gt;Get-Feed&lt;/A&gt;)?&amp;nbsp; If you have that script handy, go ahead and check out this one-liner, which will refresh every RSS item you’ve got and extract all of the links from it.&lt;/P&gt;&lt;I&gt;
&lt;BLOCKQUOTE&gt;&lt;PRE&gt;    
Get-Feed -recurse -articles | Foreach-Object { Get-WebPageLink $_.Link }&lt;/PRE&gt;&lt;/BLOCKQUOTE&gt;&lt;/I&gt;
&lt;P&gt;That particular command line can take a while, depending on how many blogs you subscribe to, but it gives you a brand new view on blogs (as a simmering stew of scripts, rather than just text to be read and comprehended).&lt;/P&gt;
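&lt;P&gt;Because each result is a property bag, the usual object cmdlets can slice it up.&amp;nbsp; The sketch below is only an illustration: the New-SampleLink helper and the three sample links are made up to stand in for real Get-WebPageLink output, so that it runs without touching the network.&lt;/P&gt;&lt;I&gt;
&lt;BLOCKQUOTE&gt;&lt;PRE&gt;# Assumption: a hypothetical helper that mimics the shape of Get-WebPageLink output
function New-SampleLink($type, $link, $text) {
    New-Object Object |
        Add-Member NoteProperty Type $type -PassThru |
        Add-Member NoteProperty Link ([uri]$link) -PassThru |
        Add-Member NoteProperty Text $text -PassThru
}
$sampleLinks = @(
    New-SampleLink "Absolute" "http://www.microsoft.com/" "Microsoft"
    New-SampleLink "Host Relative" "http://blogs.msdn.com/powershell/" "PowerShell"
    New-SampleLink "Absolute" "http://www.microsoft.com/technet/" "TechNet"
)
# Count the links of each type
$sampleLinks | Group-Object Type | Select-Object Name, Count
# List each distinct host that was linked to
$hosts = $sampleLinks | Foreach-Object { $_.Link.Host } | Sort-Object -Unique
$hosts&lt;/PRE&gt;&lt;/BLOCKQUOTE&gt;&lt;/I&gt;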
&lt;P&gt;There’s more fun to come in unlocking the web, but these two scripts should get you started extracting a little more from the wild world of the web.&lt;/P&gt;
&lt;P&gt;Hope this helps,&lt;/P&gt;
&lt;P&gt;James Brundage [MSFT]&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9201602" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/mediaandmicrocode/archive/tags/PowerShell/default.aspx">PowerShell</category><category domain="http://blogs.msdn.com/mediaandmicrocode/archive/tags/Microcode/default.aspx">Microcode</category><category domain="http://blogs.msdn.com/mediaandmicrocode/archive/tags/Scripting+Tricks/default.aspx">Scripting Tricks</category><category domain="http://blogs.msdn.com/mediaandmicrocode/archive/tags/Get-Feed/default.aspx">Get-Feed</category><category domain="http://blogs.msdn.com/mediaandmicrocode/archive/tags/Get-Web/default.aspx">Get-Web</category><category domain="http://blogs.msdn.com/mediaandmicrocode/archive/tags/Get-MarkupTag/default.aspx">Get-MarkupTag</category><category domain="http://blogs.msdn.com/mediaandmicrocode/archive/tags/Get-WebPageLink/default.aspx">Get-WebPageLink</category><category domain="http://blogs.msdn.com/mediaandmicrocode/archive/tags/Resolve-Link/default.aspx">Resolve-Link</category><category domain="http://blogs.msdn.com/mediaandmicrocode/archive/tags/Scripting+The+Web/default.aspx">Scripting The Web</category></item><item><title>Microcode: PowerShell Scripting Tricks: Scripting The Web (Part 2) (Get-MarkupTag)</title><link>http://blogs.msdn.com/mediaandmicrocode/archive/2008/12/08/microcode-powershell-scripting-tricks-scripting-the-web-part-2-get-markuptag.aspx</link><pubDate>Mon, 08 Dec 2008 11:17:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9183927</guid><dc:creator>JamesBrundage</dc:creator><slash:comments>5</slash:comments><comments>http://blogs.msdn.com/mediaandmicrocode/comments/9183927.aspx</comments><wfw:commentRss>http://blogs.msdn.com/mediaandmicrocode/commentrss.aspx?PostID=9183927</wfw:commentRss><description>&lt;P&gt;The first post about scripting the web was a lot of waxing philosophical but little about how to extract 
data and give it form.&amp;nbsp; There are several approaches, with various difficulties.&amp;nbsp; I could build a full HTML parser and walk through object models, or I could use the object model of IE.&amp;nbsp; Since I personally like to minimize dependencies, I’ve chosen to use &lt;A href="http://blogs.msdn.com/mediaandmicrocode/archive/2008/12/01/microcode-powershell-scripting-tricks-scripting-the-web-part-1-get-web.aspx" mce_href="http://blogs.msdn.com/mediaandmicrocode/archive/2008/12/01/microcode-powershell-scripting-tricks-scripting-the-web-part-1-get-web.aspx"&gt;System.Net.Webclient&lt;/A&gt; to download the webpage as text rather than use Internet Explorer and get it through a proper object model. This means I will either have to write an HTML parser, or I’ll have to nudge HTML into something more useful to avoid writing a full parser.&lt;/P&gt;
&lt;P&gt;HTML is very close to XML, and XML is something that PowerShell supports quite well, so I decided to try to nudge HTML into XML.&amp;nbsp; I wrote a function, Get-MarkupTag, which will extract the text for a markup tag (e.g. &amp;lt;a&amp;gt;) and attempt to coerce it into XML.&lt;/P&gt;
&lt;P&gt;It’s not a perfect approach, because there are several things that are legal in HTML but not in XML.&amp;nbsp; Here’s an inventory of every curveball I’ve hit so far.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Unclosed tags like &amp;lt;IMG&amp;gt;: &lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Some tags in HTML, like &amp;lt;IMG&amp;gt;, are unmatched.&amp;nbsp; I’ll first have to identify all of the unmatched tags and coerce them into properly closed XML (e.g. ensure that &amp;lt;IMG&amp;gt; becomes &amp;lt;IMG /&amp;gt;).&amp;nbsp; For things like &amp;lt;BR&amp;gt; or &amp;lt;HR&amp;gt; this is easy, because attributes are rare, but for IMG, getting this right is critically important, because the tag’s attributes are how you find the images to download.&lt;/P&gt;
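&lt;P&gt;To make the idea concrete, here is a minimal sketch of closing unmatched tags with a regular expression.&amp;nbsp; It is not the code inside Get-MarkupTag; the list of tags, the sample HTML, and the pattern are all assumptions made for illustration.&lt;/P&gt;&lt;I&gt;
&lt;BLOCKQUOTE&gt;&lt;PRE&gt;# Assumption: a hand-picked list of tags that never have a closing tag
$unclosedTags = "img", "br", "hr"
$html = '&amp;lt;p&amp;gt;Logo: &amp;lt;IMG src="logo.png"&amp;gt;&amp;lt;br&amp;gt;Done&amp;lt;/p&amp;gt;'
foreach ($tag in $unclosedTags) {
    # Match the whole tag, but skip tags that already end with /&amp;gt;
    $pattern = "&amp;lt;($tag\b[^&amp;gt;]*?)\s*(?&amp;lt;!/)&amp;gt;"
    $html = [regex]::Replace($html, $pattern, '&amp;lt;$1 /&amp;gt;', "IgnoreCase")
}
$html
# The repaired text now casts to XML without an error
$xml = [xml]$html&lt;/PRE&gt;&lt;/BLOCKQUOTE&gt;&lt;/I&gt;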
&lt;P&gt;&lt;STRONG&gt;HTML escape sequences end up as unrecognized XML entities:&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;There are a lot of ways to embed special characters within HTML that an XML parser treats as entities it doesn’t recognize.&amp;nbsp; The most common example is &amp;amp;nbsp; (the explicit space), but foreign currencies often show up in this format as well.&amp;nbsp; I found a complete list online that I could extract all of the sequences from, and I embedded a hashtable within the function to hold each item.&amp;nbsp; Then I walked through the hashtable and replaced the escape sequences with their real value.&amp;nbsp; However, since the entity can show up in either case (e.g. &amp;amp;nbsp; or &amp;amp;NBSP;), I had to write a quick case-insensitive replace.&lt;/P&gt;
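&lt;P&gt;A minimal sketch of the hashtable walk might look like the following.&amp;nbsp; It assumes a three-entry sample table rather than the complete list, and it leans on the -replace operator, which ignores case by default, instead of the hand-written replace described above.&lt;/P&gt;&lt;I&gt;
&lt;BLOCKQUOTE&gt;&lt;PRE&gt;# Assumption: a three-entry sample, not the complete entity list
$entities = @{
    "&amp;amp;nbsp;"  = [string][char]160
    "&amp;amp;pound;" = [string][char]163
    "&amp;amp;copy;"  = [string][char]169
}
$text = "Price:&amp;amp;NBSP;&amp;amp;pound;5 &amp;amp;Copy; 2008"
foreach ($entity in @($entities.Keys)) {
    # -replace takes a regular expression and ignores case by default,
    # so the entity name is escaped to be matched literally
    $text = $text -replace [regex]::Escape($entity), $entities[$entity]
}
$text&lt;/PRE&gt;&lt;/BLOCKQUOTE&gt;&lt;/I&gt;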
&lt;P&gt;&lt;STRONG&gt;Unquoted Attributes&lt;/STRONG&gt;:&lt;/P&gt;
&lt;P&gt;Most browsers will accept HTML attributes without quotes, e.g. &amp;lt;a href=www.microsoft.com&amp;gt;, but XML can’t handle this.&amp;nbsp; Once I’ve figured out the text in each tag, I need to check each attribute within that tag to ensure that the attributes are quoted.&amp;nbsp; &lt;/P&gt;
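&lt;P&gt;Here is one way the quoting step could be sketched with a single regular expression.&amp;nbsp; The sample tag and the pattern are assumptions for illustration, not the code from Get-MarkupTag, and the pattern is deliberately simple: it would get confused by an equals sign inside an already quoted value.&lt;/P&gt;&lt;I&gt;
&lt;BLOCKQUOTE&gt;&lt;PRE&gt;# Assumption: a sample tag with two unquoted attributes and one quoted one
$tag = '&amp;lt;a href=http://www.microsoft.com/ target=_blank class="nav"&amp;gt;'
# Find name=value pairs where the value does not start with a quote,
# then wrap the value in double quotes
$quoted = [regex]::Replace($tag, '(\s[\w-]+)=([^\s"''&amp;gt;]+)', '$1="$2"')
$quoted&lt;/PRE&gt;&lt;/BLOCKQUOTE&gt;&lt;/I&gt;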
&lt;P&gt;&lt;STRONG&gt;Nested tags&lt;/STRONG&gt;:&lt;/P&gt;
&lt;P&gt;One of the more interesting parts of parsing HTML was nested tags.&amp;nbsp; In order to match nested tags, I used two regular expressions to identify all of the matches.&amp;nbsp; While PowerShell has a –match operator, –match only gives you the first match, and it doesn’t work well for multiline strings.&amp;nbsp; So what I did was use New-Object to create the Regular Expression for extracting a tag and then stored the result in a variable.&lt;/P&gt;
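&lt;P&gt;As a rough sketch of that technique, the snippet below builds a Regex object with New-Object and collects every match from a string that spans several lines.&amp;nbsp; The sample HTML, the pattern, and the choice of options are assumptions for illustration.&lt;/P&gt;&lt;I&gt;
&lt;BLOCKQUOTE&gt;&lt;PRE&gt;# A string that spans several lines (`n is a newline in PowerShell)
$html = "&amp;lt;p&amp;gt;first`nparagraph&amp;lt;/p&amp;gt;`n&amp;lt;p&amp;gt;second&amp;lt;/p&amp;gt;"
# Singleline lets . match newlines, so a tag can span lines
$options = [Text.RegularExpressions.RegexOptions]"Singleline, IgnoreCase"
$regex = New-Object Text.RegularExpressions.Regex "&amp;lt;p&amp;gt;(.*?)&amp;lt;/p&amp;gt;", $options
# Matches returns every match, not just the first one
$found = $regex.Matches($html)
$found | Foreach-Object { $_.Groups[1].Value }&lt;/PRE&gt;&lt;/BLOCKQUOTE&gt;&lt;/I&gt;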
&lt;P&gt;If the number of start tags was equal to the number of end tags, I assumed that the tags are balanced.&amp;nbsp; If the counts differ, I’ll only deal with the start tags instead.&amp;nbsp; If I need to determine which start tag matches which end tag, I need to use a data structure called a stack.&amp;nbsp; Luckily, .NET comes with one (&lt;A href="http://msdn.microsoft.com/en-us/library/system.collections.stack.aspx" mce_href="http://msdn.microsoft.com/en-us/library/system.collections.stack.aspx"&gt;System.Collections.Stack&lt;/A&gt;).&amp;nbsp; A stack in code is just like a stack in real life.&amp;nbsp; You can put something on the top of the stack (Push), see what’s on top of the stack (Peek), and pull off the top of the stack (Pop).&lt;/P&gt;
&lt;P&gt;If I put start tags and end tags in one list, and sort them by the order that they occurred (in PowerShell, this is $list1 + $list2 | Sort-Object PropertyName), then I can simply walk through the list, pushing the start tags onto the stack and popping one off each time I encounter an end tag.&lt;/P&gt;
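&lt;P&gt;The walk described above can be sketched like this.&amp;nbsp; It is a simplified illustration rather than the code in Get-MarkupTag: the sample HTML and both patterns are assumptions, and it only handles a single tag name.&lt;/P&gt;&lt;I&gt;
&lt;BLOCKQUOTE&gt;&lt;PRE&gt;# Assumption: a small sample with one div nested inside another
$html = "&amp;lt;div&amp;gt;outer &amp;lt;div&amp;gt;inner&amp;lt;/div&amp;gt; text&amp;lt;/div&amp;gt;"
$startTags = @([regex]::Matches($html, "&amp;lt;div\b[^&amp;gt;]*&amp;gt;", "IgnoreCase"))
$endTags = @([regex]::Matches($html, "&amp;lt;/div&amp;gt;", "IgnoreCase"))
$stack = New-Object Collections.Stack
# Walk the start and end tags in the order they occur in the text
$pairs = foreach ($tag in ($startTags + $endTags | Sort-Object Index)) {
    if ($tag.Value -notlike "&amp;lt;/*") {
        # A start tag: remember it until its end tag shows up
        $stack.Push($tag)
    } elseif ($stack.Count -gt 0) {
        # An end tag closes the most recent start tag
        $start = $stack.Pop()
        $html.Substring($start.Index, $tag.Index + $tag.Length - $start.Index)
    }
}
$pairs&lt;/PRE&gt;&lt;/BLOCKQUOTE&gt;&lt;/I&gt;
&lt;P&gt;Because the most recent start tag is the one that gets popped, the inner element comes out before the outer one.&lt;/P&gt;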
&lt;P&gt;I don’t believe that I’ve taken care of all of the possible pains that exist (for instance, I know that geographic coordinates on Wikipedia do not coerce into XML at this point).&amp;nbsp; However, it’s a good start and it can yield some quite interesting results.&lt;/P&gt;
&lt;P&gt;Since Get-MarkupTag involves a lot of escape sequences, which leads to very annoying blog formatting, I’m going to attach the script instead of embedding it.&amp;nbsp; &lt;A title=Get-MarkupTag href="http://blogs.msdn.com/mediaandmicrocode/attachment/9183927.ashx" mce_href="http://blogs.msdn.com/mediaandmicrocode/attachment/9183927.ashx"&gt;You can download it here&lt;/A&gt;.&lt;I&gt;&amp;nbsp;&lt;/I&gt;&lt;/P&gt;
&lt;P&gt;If you look at the code in Get-MarkupTag, close to the end, you’ll see a trap statement. This means that any error coercing the chunk of HTML into XML will be swallowed into the verbose log.&amp;nbsp; If you want to examine why you can’t use the XML, simply set $verbosePreference to “continue” and you will be able to see the errors produced trying to convert the tag into XML.&amp;nbsp; This is an example of a technique I call error redirection.&lt;/P&gt;
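&lt;P&gt;For readers who have not used trap this way, here is a small sketch of the pattern.&amp;nbsp; ConvertTo-XmlOrNothing is a hypothetical function name used only for this illustration; it is not part of Get-MarkupTag.&lt;/P&gt;&lt;I&gt;
&lt;BLOCKQUOTE&gt;&lt;PRE&gt;# Assumption: a hypothetical helper name, used only for this sketch
function ConvertTo-XmlOrNothing($text) {
    trap {
        # Send the error to the verbose stream and carry on
        Write-Verbose "Could not coerce into XML: $_"
        continue
    }
    [xml]$text
}
# Well formed text comes back as XML
ConvertTo-XmlOrNothing "&amp;lt;b&amp;gt;fine&amp;lt;/b&amp;gt;"
# Broken text returns nothing, and the error stays hidden
ConvertTo-XmlOrNothing "&amp;lt;b&amp;gt;broken"
# With verbose output turned on, the same call shows why it failed
$verbosePreference = "continue"
ConvertTo-XmlOrNothing "&amp;lt;b&amp;gt;broken"&lt;/PRE&gt;&lt;/BLOCKQUOTE&gt;&lt;/I&gt;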
&lt;P&gt;One of the kind of cool things you can do with it is extract the individual rows from ConvertTo-HTML:&lt;/P&gt;&lt;I&gt;
&lt;BLOCKQUOTE&gt;&lt;PRE&gt;    $text = Get-ChildItem | Select Name, LastWriteTime | ConvertTo-HTML | Out-String 
    Get-MarkupTag "tr" $text
        &lt;/PRE&gt;&lt;/BLOCKQUOTE&gt;&lt;/I&gt;
&lt;P&gt;It's also possible to extract out all of the pages &lt;A href="http://www.microsoft.com/" mce_href="http://www.microsoft.com"&gt;www.microsoft.com&lt;/A&gt; links to:&lt;/P&gt;&lt;I&gt;
&lt;BLOCKQUOTE&gt;&lt;PRE&gt;    $microsoft= (New-Object Net.Webclient).DownloadString("http://www.microsoft.com/")
    Get-MarkupTag "a" $microsoft | % { $_.Xml.Href }        &lt;/PRE&gt;&lt;/BLOCKQUOTE&gt;&lt;/I&gt;
&lt;P&gt;What this quick and dirty approach to extracting HTML gives you is a fairly simple way to start giving the data form. With XML you get the syntactic sugar PowerShell pours on XML, and you also get the power of XPath, and from that, you can start turning the HTML into something more interesting.&lt;/P&gt;
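&lt;P&gt;As a small taste of both, the sketch below reads the same data once with PowerShell’s dotted property access and once with XPath.&amp;nbsp; The hand-written table is an assumption standing in for HTML that has already been coerced into XML.&lt;/P&gt;&lt;I&gt;
&lt;BLOCKQUOTE&gt;&lt;PRE&gt;# Assumption: a small hand-written table standing in for coerced HTML
$xml = [xml]"&amp;lt;table&amp;gt;&amp;lt;tr&amp;gt;&amp;lt;td&amp;gt;a.txt&amp;lt;/td&amp;gt;&amp;lt;td&amp;gt;10&amp;lt;/td&amp;gt;&amp;lt;/tr&amp;gt;&amp;lt;tr&amp;gt;&amp;lt;td&amp;gt;b.txt&amp;lt;/td&amp;gt;&amp;lt;td&amp;gt;20&amp;lt;/td&amp;gt;&amp;lt;/tr&amp;gt;&amp;lt;/table&amp;gt;"
# Dotted property access walks the elements: first cell of each row
$names = $xml.table.tr | Foreach-Object { $_.td[0] }
$names
# XPath picks out nodes by position: second cell of each row
$sizes = $xml.SelectNodes("//tr/td[2]") | Foreach-Object { $_.InnerText }
$sizes&lt;/PRE&gt;&lt;/BLOCKQUOTE&gt;&lt;/I&gt;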
&lt;P&gt;In the next post, I’ll look at some of the things you can do with this new toy.&lt;/P&gt;
&lt;P&gt;Try it out on a few sites.&lt;/P&gt;
&lt;P&gt;Hope this helps,&lt;/P&gt;
&lt;P&gt;James Brundage [MSFT]&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9183927" width="1" height="1"&gt;</description><enclosure url="http://blogs.msdn.com/mediaandmicrocode/attachment/9183927.ashx" length="11118" type="application/octet-stream" /><category domain="http://blogs.msdn.com/mediaandmicrocode/archive/tags/PowerShell/default.aspx">PowerShell</category><category domain="http://blogs.msdn.com/mediaandmicrocode/archive/tags/Microcode/default.aspx">Microcode</category><category domain="http://blogs.msdn.com/mediaandmicrocode/archive/tags/Get-MarkupTag/default.aspx">Get-MarkupTag</category><category domain="http://blogs.msdn.com/mediaandmicrocode/archive/tags/Scripting+The+Web/default.aspx">Scripting The Web</category></item><item><title>Microcode: PowerShell Scripting Tricks: Scripting The Web (Part 1) (Get-Web)</title><link>http://blogs.msdn.com/mediaandmicrocode/archive/2008/12/01/microcode-powershell-scripting-tricks-scripting-the-web-part-1-get-web.aspx</link><pubDate>Mon, 01 Dec 2008 06:13:06 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9159333</guid><dc:creator>JamesBrundage</dc:creator><slash:comments>5</slash:comments><comments>http://blogs.msdn.com/mediaandmicrocode/comments/9159333.aspx</comments><wfw:commentRss>http://blogs.msdn.com/mediaandmicrocode/commentrss.aspx?PostID=9159333</wfw:commentRss><description>&lt;p&gt;Several of the last posts have tackled how to take the wild world of data and start to turn it into PowerShell objects, so that it’s easier to make heads or tails out of it.&amp;#160; Once all of that data is in a form that PowerShell can use more effectively in the object pipeline, you can use PowerShell to slice and dice that data in wonderfully effective ways.&lt;/p&gt;  &lt;p&gt;Data mining is a strange art that I find scripting languages normally are a little better at than compiled languages. 
I find that PowerShell can be incredibly useful for data miners for two important reasons.&amp;#160; First, it can draw upon an unprecedented array of technologies (COM, WMI, .NET, SQL, Regular Expressions, Web Service, Command Line) in order to get structured data.&amp;#160; Second, and more importantly, since you can easily create objects on the fly in PowerShell, and since PowerShell has wonderful string processing, you can often use PowerShell to extract structured data out of unstructured data.&lt;/p&gt;  &lt;p&gt;Being able to pull data out of the mist and give it form is a very valuable skill, because people do not think in structured data.&amp;#160; While it might be useful to the computing world to have most information in structured data, most people disseminating information don’t think or record their thoughts with rigorous structure.&amp;#160; However, many people do record their thoughts in semi-rigorous structure, like the sentences and paragraphs you’re reading now.&lt;/p&gt;  &lt;p&gt;The fact of the matter is that tons of data floats on the web requiring only a little bit of work and a small amount of art to extract it.&amp;#160; This is because the web that people record their thoughts in is largely in HTML, and so, it is possible to learn a few ways to pull the data from the little structure that exists.&lt;/p&gt;  &lt;p&gt;The first piece of the toolkit to extract data from the Web is a function I’ve called Get-Web.&amp;#160; Get-Web will simply download web pages, and it wraps part of the &lt;a href="http://msdn.microsoft.com/en-us/library/system.net.webclient.aspx"&gt;System.Net.Webclient&lt;/a&gt; object.&lt;/p&gt;  &lt;p&gt;Using WebClient has pros and cons.&amp;#160; The biggest pro is that it relies on .NET, rather than on a particular browser, which means that you can use it without IE.&amp;#160; The biggest con is that a lot of web pages do checks on the browser in order to change how they display.&lt;/p&gt;  &lt;p&gt;Get-Web is 
below.&amp;#160; As with &lt;a href="http://blogs.msdn.com/mediaandmicrocode/archive/2008/11/30/microcode-powershell-scripting-tricks-more-joy-of-hashtables-with-get-hashtableasobject.aspx"&gt;Get-HashtableAsObject&lt;/a&gt;, I’m using comments to declare some nifty inline help.&lt;/p&gt;  &lt;pre&gt;function Get-Web($url, 
    [switch]$self,
    $credential, 
    $toFile,
    [switch]$bytes)
{
    #.Synopsis
    #    Downloads a file from the web
    #.Description
    #    Uses System.Net.Webclient (not the browser) to download data
    #    from the web.
    #.Parameter self
    #    Uses the default credentials when downloading that page (for downloading intranet pages)
    #.Parameter credential
    #    The credentials to use to download the web data
    #.Parameter url
    #    The page to download (e.g. http://www.msn.com/)    
    #.Parameter toFile
    #    The file to save the web data to
    #.Parameter bytes
    #    Download the data as bytes   
    #.Example
    #    # Downloads www.live.com and outputs it as a string
    #    Get-Web http://www.live.com/
    #.Example
    #    # Downloads www.msn.com and saves it to a file
    #    Get-Web http://www.msn.com/ -toFile www.msn.com.html
    $webclient = New-Object Net.Webclient
    if ($credential) {
        $webClient.Credentials = $credential
    }
    if ($self) {
        $webClient.UseDefaultCredentials = $true
    }
    if ($toFile) {
        if (-not &amp;quot;$toFile&amp;quot;.Contains(&amp;quot;:&amp;quot;)) {
            $toFile = Join-Path $pwd $toFile
        }
        $webClient.DownloadFile($url, $toFile)
    } else {
        if ($bytes) {
            $webClient.DownloadData($url)
        } else {
            $webClient.DownloadString($url)
        }
    }
}&lt;/pre&gt;

&lt;p&gt;To walk through a few examples of Get-Web, simply point it to any webpage. &lt;/p&gt;

&lt;pre&gt;	Get-Web http://en.wikipedia.org/&lt;/pre&gt;

&lt;p&gt;To save a page to disk: &lt;/p&gt;

&lt;pre&gt;	Get-Web http://www.msn.com/ -toFile www.msn.com.html&lt;/pre&gt;
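
&lt;p&gt;The con mentioned earlier, pages that check which browser is asking, can sometimes be worked around by sending a user-agent header.&amp;#160; Get-Web as written does not do this; the sketch below only shows how a header can be set on a WebClient, and the user-agent string is an example value, not a recommendation.&lt;/p&gt;

&lt;pre&gt;$webClient = New-Object Net.WebClient
# Assumption: the user-agent string is only an example value
$webClient.Headers.Add("user-agent", "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)")
# Read the header back to confirm what will be sent with the next request
$webClient.Headers["user-agent"]&lt;/pre&gt;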

&lt;p&gt;Just downloading the data is only the first step.&amp;#160; All being able to download web data gives you is a way to get the mist into a bottle, but it doesn’t help you give it form.&amp;#160; The next piece will cover how to pull the data out of the web and into PowerShell.&lt;/p&gt;

&lt;p&gt;Hope this helps,&lt;/p&gt;

&lt;p&gt;James Brundage [MSFT] &lt;/p&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9159333" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/mediaandmicrocode/archive/tags/PowerShell/default.aspx">PowerShell</category><category domain="http://blogs.msdn.com/mediaandmicrocode/archive/tags/Microcode/default.aspx">Microcode</category><category domain="http://blogs.msdn.com/mediaandmicrocode/archive/tags/Get-Web/default.aspx">Get-Web</category><category domain="http://blogs.msdn.com/mediaandmicrocode/archive/tags/System.Net.Webclient/default.aspx">System.Net.Webclient</category><category domain="http://blogs.msdn.com/mediaandmicrocode/archive/tags/Scripting+The+Web/default.aspx">Scripting The Web</category></item></channel></rss>