<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://blogs.msdn.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Media And Microcode : Get-Feed</title><link>http://blogs.msdn.com/mediaandmicrocode/archive/tags/Get-Feed/default.aspx</link><description>Tags: Get-Feed</description><dc:language>en-US</dc:language><generator>CommunityServer 2.1 SP1 (Build: 61025.2)</generator><item><title>Microcode: PowerShell Scripting Tricks: Scripting the Web (Part 3) (Resolve-Link, Get-WebPageLink)</title><link>http://blogs.msdn.com/mediaandmicrocode/archive/2008/12/12/microcode-powershell-scripting-tricks-scripting-the-web-part-3-resolve-link-get-webpagelink.aspx</link><pubDate>Fri, 12 Dec 2008 12:59:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9201602</guid><dc:creator>JamesBrundage</dc:creator><slash:comments>1</slash:comments><comments>http://blogs.msdn.com/mediaandmicrocode/comments/9201602.aspx</comments><wfw:commentRss>http://blogs.msdn.com/mediaandmicrocode/commentrss.aspx?PostID=9201602</wfw:commentRss><description>&lt;P&gt;The first post in this series was learning to crawl.&amp;nbsp; I introduced &lt;A href="http://blogs.msdn.com/mediaandmicrocode/archive/2008/12/01/microcode-powershell-scripting-tricks-scripting-the-web-part-1-get-web.aspx" mce_href="http://blogs.msdn.com/mediaandmicrocode/archive/2008/12/01/microcode-powershell-scripting-tricks-scripting-the-web-part-1-get-web.aspx"&gt;Get-Web&lt;/A&gt;, which allows you to use System.Net.Webclient to download web sites in a variety of ways.&amp;nbsp; The next post was learning to walk.&amp;nbsp; I showed us &lt;A href="http://blogs.msdn.com/mediaandmicrocode/archive/2008/12/08/microcode-powershell-scripting-tricks-scripting-the-web-part-2-get-markuptag.aspx" 
mce_href="http://blogs.msdn.com/mediaandmicrocode/archive/2008/12/08/microcode-powershell-scripting-tricks-scripting-the-web-part-2-get-markuptag.aspx"&gt;Get-MarkupTag&lt;/A&gt;, which helps coerce parts of the web into XML.&amp;nbsp; Now we can start to really have some fun with the data and run wild.&lt;/P&gt;
&lt;P&gt;Pulling out semi-structured data is one thing, but it’s important to be able to pull out more complex information as well.&amp;nbsp; One interesting case is pulling out all of the links from a webpage.&amp;nbsp;&amp;nbsp; This task breaks down into four smaller tasks:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Downloading the page (done with Get-Web) 
&lt;LI&gt;Getting the &amp;lt;a&amp;gt; tags in a meaningful way (done with &lt;A href="http://blogs.msdn.com/mediaandmicrocode/archive/2008/12/08/microcode-powershell-scripting-tricks-scripting-the-web-part-2-get-markuptag.aspx" mce_href="http://blogs.msdn.com/mediaandmicrocode/archive/2008/12/08/microcode-powershell-scripting-tricks-scripting-the-web-part-2-get-markuptag.aspx"&gt;Get-MarkupTag&lt;/A&gt;) 
&lt;LI&gt;Extracting the href attribute 
&lt;LI&gt;Determining if the link is relative or absolute &lt;/LI&gt;&lt;/OL&gt;
&lt;P&gt;To determine if the link is relative or absolute, I made a Resolve-Link function.&amp;nbsp; It takes a base url (e.g. &lt;A href="http://www.foo.com/blah/blah.asp" mce_href="http://www.foo.com/blah/blah.asp"&gt;http://www.foo.com/blah/blah.asp&lt;/A&gt;) and a link found on that page, and returns the absolute url the link resolves to.&amp;nbsp; With -returnLinkType, it instead returns a property bag with the type of link (absolute, relative, or host relative) and the resolved link.&lt;/P&gt;
&lt;P&gt;Here’s Resolve-Link:&lt;/P&gt;&lt;I&gt;
&lt;BLOCKQUOTE&gt;&lt;PRE class=CmdletDefinition&gt;function Resolve-Link([Uri]$uri,
    [string]$link,
    [switch]$returnLinkType) {
    #.Synopsis
    #   Resolves a relative or absolute link to an absolute url
    #.Description
    #   Takes a uri and a link to a page and returns the absolute url, or
    #   optionally returns a property bag with the link type
    #   (absolute, relative, or host relative) and the link
    #.Parameter uri
    #   The uri the link is located on
    #.Parameter link
    #   The original link text
    #.Parameter returnLinkType
    #   If set, returns a property bag with the link type and the resolved link
    #.Example
    #   Resolve-Link http://www.microsoft.com/ /technet/scriptcenter
    if ($link.StartsWith("/")) {
        # Relative to Host site
        if ($returnLinkType) {
            return New-Object Object |
                Add-Member NoteProperty Type "Host Relative" -PassThru |
                Add-Member NoteProperty Link ([uri]"$($uri.Scheme)://$($uri.Authority)$($link)") -PassThru
        }
        return "$($uri.Scheme)://$($uri.Authority)$($link)"
    } else {
        if ($link -match '^[A-Za-z][A-Za-z0-9+.-]*:') {
            # Absolute link: starts with any uri scheme (http:, https:, mailto:, ...)
            if ($returnLinkType) {
                return New-Object Object |
                    Add-Member NoteProperty Type "Absolute" -PassThru |
                    Add-Member NoteProperty Link ([uri]$link) -PassThru
            }            
            return $link
        } else {
            # Relative link
            $realLink = $uri.AbsoluteUri.Substring(0,
                $uri.AbsoluteUri.LastIndexOf("/")) + "/$link"    
            if ($returnLinkType) {
                return New-Object Object |
                    Add-Member NoteProperty Type "Relative" -PassThru |
                    Add-Member NoteProperty Link ([uri]$realLink) -PassThru
            }
            return $realLink            
        }
    }    
}&lt;/PRE&gt;&lt;/BLOCKQUOTE&gt;&lt;/I&gt;
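&lt;P&gt;As a cross-check on the string handling above, .NET's System.Uri class has a constructor that takes a base Uri and a relative string and does the resolution itself.&amp;nbsp; The sketch below is not part of the original scripts: Resolve-LinkWithUri is a hypothetical name used only for illustration, and it returns just the resolved url rather than a property bag.&lt;/P&gt;&lt;I&gt;
&lt;BLOCKQUOTE&gt;&lt;PRE&gt;# Hypothetical helper for illustration only; not one of the original scripts
function Resolve-LinkWithUri([Uri]$uri, [string]$link) {
    # System.Uri(Uri baseUri, string relativeUri) applies the standard
    # relative-reference rules, including "../" segments; an absolute
    # link is returned unchanged
    $resolved = New-Object System.Uri -ArgumentList $uri, $link
    $resolved.AbsoluteUri
}

# Host relative link  -> http://www.foo.com/index.asp
Resolve-LinkWithUri http://www.foo.com/blah/blah.asp /index.asp
# Relative link       -> http://www.foo.com/blah/other.asp
Resolve-LinkWithUri http://www.foo.com/blah/blah.asp other.asp
# Parent folder link  -> http://www.foo.com/up.asp
Resolve-LinkWithUri http://www.foo.com/blah/blah.asp ../up.asp&lt;/PRE&gt;&lt;/BLOCKQUOTE&gt;&lt;/I&gt;
&lt;P&gt;This version does not report the link type, so Resolve-Link is still the one to reach for when you want to know whether a link was absolute, relative, or host relative.&lt;/P&gt;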
&lt;P&gt;With Resolve-Link written, making Get-WebPageLink is a snap.&amp;nbsp; It’s below: only 3 lines do the real work, and the rest explain the work and give an example.&lt;/P&gt;&lt;I&gt;
&lt;BLOCKQUOTE&gt;&lt;PRE class=CmdletDefinition&gt;function Get-WebPageLink($url) {
    #.Synopsis
    #   Returns all of the links within a webpage
    #.Description
    #   Resolves all &amp;lt;A&amp;gt; references and returns a property bag with
    #   the text contained in the link, the page the link came from,
    #   and the type of link returned (absolute, host relative, or relative)
    #.Parameter url
    #   The page to get links from
    #.Example
    #   Get-WebPageLink http://blogs.msdn.com/
    Get-MarkupTag a (Get-Web $url) | Foreach-Object {
        Resolve-Link $url $_.Xml.Href -returnLinkType |
            Add-Member NoteProperty Text $_.Xml."#text" -PassThru 
    }
}&lt;/PRE&gt;&lt;/BLOCKQUOTE&gt;&lt;/I&gt;
&lt;P&gt;Go ahead and give Get-WebPageLink a whirl:&lt;/P&gt;&lt;I&gt;
&lt;BLOCKQUOTE&gt;&lt;PRE&gt;Get-WebPageLink http://blogs.msdn.com&lt;/PRE&gt;&lt;/BLOCKQUOTE&gt;&lt;/I&gt;
&lt;P&gt;Ready for some real fun? Remember way back when I did a post about getting RSS feeds in PowerShell with Microsoft.FeedsManager (&lt;A href="http://blogs.msdn.com/mediaandmicrocode/archive/2008/11/11/microcode-scripting-rss-feeds-with-powershell-and-microsoft-feedsmanager.aspx" mce_href="http://blogs.msdn.com/mediaandmicrocode/archive/2008/11/11/microcode-scripting-rss-feeds-with-powershell-and-microsoft-feedsmanager.aspx"&gt;Get-Feed&lt;/A&gt;)?&amp;nbsp; If you have that script handy, check out this one liner, which walks through every RSS item you’ve got and extracts all of the links from the page each item points to.&lt;/P&gt;&lt;I&gt;
&lt;BLOCKQUOTE&gt;&lt;PRE&gt;    
Get-Feed -recurse -articles | Foreach-Object { Get-WebPageLink $_.Link }&lt;/PRE&gt;&lt;/BLOCKQUOTE&gt;&lt;/I&gt;
&lt;P&gt;That particular command line can take a while, depending on how many blogs you subscribe to, but it gives you a brand new view on blogs (as a simmering stew of scripts, rather than just text to be read and comprehended).&lt;/P&gt;
&lt;P&gt;There’s more fun to come in unlocking the web, but these two scripts should get you started extracting a little more from the wild world of the web.&lt;/P&gt;
&lt;P&gt;Hope this helps,&lt;/P&gt;
&lt;P&gt;James Brundage [MSFT]&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9201602" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/mediaandmicrocode/archive/tags/PowerShell/default.aspx">PowerShell</category><category domain="http://blogs.msdn.com/mediaandmicrocode/archive/tags/Microcode/default.aspx">Microcode</category><category domain="http://blogs.msdn.com/mediaandmicrocode/archive/tags/Scripting+Tricks/default.aspx">Scripting Tricks</category><category domain="http://blogs.msdn.com/mediaandmicrocode/archive/tags/Get-Feed/default.aspx">Get-Feed</category><category domain="http://blogs.msdn.com/mediaandmicrocode/archive/tags/Get-Web/default.aspx">Get-Web</category><category domain="http://blogs.msdn.com/mediaandmicrocode/archive/tags/Get-MarkupTag/default.aspx">Get-MarkupTag</category><category domain="http://blogs.msdn.com/mediaandmicrocode/archive/tags/Get-WebPageLink/default.aspx">Get-WebPageLink</category><category domain="http://blogs.msdn.com/mediaandmicrocode/archive/tags/Resolve-Link/default.aspx">Resolve-Link</category><category domain="http://blogs.msdn.com/mediaandmicrocode/archive/tags/Scripting+The+Web/default.aspx">Scripting The Web</category></item><item><title>Microcode: Scripting RSS Feeds with PowerShell and Microsoft.FeedsManager</title><link>http://blogs.msdn.com/mediaandmicrocode/archive/2008/11/11/microcode-scripting-rss-feeds-with-powershell-and-microsoft-feedsmanager.aspx</link><pubDate>Tue, 11 Nov 2008 04:34:53 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9059055</guid><dc:creator>JamesBrundage</dc:creator><slash:comments>3</slash:comments><comments>http://blogs.msdn.com/mediaandmicrocode/comments/9059055.aspx</comments><wfw:commentRss>http://blogs.msdn.com/mediaandmicrocode/commentrss.aspx?PostID=9059055</wfw:commentRss><description>&lt;p&gt;PowerShell's an amazing glue language.&amp;#160;&amp;#160; It can help you bring code from all corners of the earth 
into one environment, and then you can customize the code to be more to your liking.&amp;#160; While the last several entries of my blog have spent time looking at what we can glue together with .NET, this one shows you how you can bring COM into the mix.&lt;/p&gt;  &lt;p&gt;Internet Explorer has an RSS feed manager which you can script through COM.&amp;#160; While the command line might not be the ideal spot to read your blogs, it's possible to make a much nicer front end by scripting WPF &amp;amp; PowerShell, and having a quick and easy API to get at your RSS feeds makes it much easier to build applications that use RSS feeds as a way to synchronize information between multiple machines.&lt;/p&gt;  &lt;p&gt;The object that makes this all work is &lt;a href="http://msdn.microsoft.com/en-us/library/ms684749.aspx"&gt;Microsoft.FeedsManager&lt;/a&gt;, and the cmdlet that makes working with it possible is &lt;a href="http://technet.microsoft.com/en-us/library/bb978545.aspx"&gt;New-Object&lt;/a&gt;.&amp;#160; By default, New-Object will create a new .NET object from the classes that are currently loaded by .NET (for more on this, see my previous posts about &lt;a href="http://blogs.msdn.com/mediaandmicrocode/archive/2008/10/23/microcode-powershell-scripting-tricks-exploring-net-types-with-a-get-type-function-and-reflection.aspx"&gt;Get-Type&lt;/a&gt; and &lt;a href="http://blogs.msdn.com/mediaandmicrocode/archive/2008/11/08/microcode-exploring-more-of-net-with-get-assembly.aspx"&gt;Get-Assembly&lt;/a&gt;).&amp;#160; While .NET is a fairly easy world to explore, COM is considerably trickier.&lt;/p&gt;  &lt;p&gt;Luckily, COM objects are the main backbone of VBScript.&amp;#160; This is an important point for VBScripters trying to learn PowerShell: while the world of .NET may be new and exotic, and Cmdlets might be cool, you don't have to drop your VBScript knowledge at the door.&amp;#160; While some parts of VBScript (e.g. 
strings) will require new learning in PowerShell, the vast majority of it doesn't.&lt;/p&gt;  &lt;p&gt;Microsoft.FeedsManager is one of these examples.&amp;#160; I found out about this object by looking at the source code for the RSS reader Vista Sidebar gadget, and, once I knew its name, I knew how to get at it in PowerShell.&amp;#160; Even though I was looking at JavaScript, I knew that there was a scriptable COM object there to help me out, and I knew how to bring it into PowerShell. &lt;/p&gt;  &lt;p&gt;To create a Microsoft.FeedsManager, simply use this one liner:&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;em&gt;New-Object -comObject Microsoft.FeedsManager&lt;/em&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;Since the root folder looks promising, let's explore that:&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;em&gt;(New-Object -comObject Microsoft.FeedsManager).RootFolder&lt;/em&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;While I can just save this into a variable and script it as if I were still in VBScript land (e.g. $feeds = New-Object -comObject Microsoft.FeedsManager, $feeds.RootFolder), I far prefer the parlance of PowerShell, so I'll make a quick Get-Feed function to explore feeds more easily.&amp;#160; I'll accept a wildcard for the name or title of the feed, and, since the feeds can have an arbitrary number of subfolders, I'll add a -recurse parameter.&amp;#160; In order to make -recurse work elegantly, I'll use a short recursive script and I'll add a $folder parameter, which can be null and will be assumed to be the root folder.&amp;#160; Finally, since I'll often want to read the items more than the feed, I'll add a switch parameter to extract the articles.&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;em&gt;function Get-Feed($feed = &amp;quot;*&amp;quot;, $folder, [switch]$recurse, [switch]$articles) {&amp;#160;&amp;#160;&amp;#160; &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; if (! 
$folder) {         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; $feedsManager = New-Object -ComObject Microsoft.FeedsManager&amp;#160;&amp;#160; &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; $folder = $feedsManager.RootFolder&amp;#160;&amp;#160;&amp;#160;&amp;#160; &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; }         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; $folder.Feeds |         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; Where-Object {         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; ($_.Title -like &amp;quot;$feed&amp;quot;) -or ($_.Name -like &amp;quot;$feed&amp;quot;)         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; } | Foreach-Object {         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; if ($articles) { $_.Items } else { $_ }         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; }         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; if ($recurse) {         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; $folder.Subfolders | Foreach-Object {         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; Get-Feed $feed $_ -recurse -articles:$articles         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; }         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; }         &lt;br /&gt;}&lt;/em&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;Once I've defined Get-Feed, I can use it to search my feeds through PowerShell.&amp;#160; This one liner will get all my feeds, sort them by name, and display just the Name and URL:&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;em&gt;Get-Feed 
-recurse | Sort-Object Name | Select-Object Name, URL &lt;/em&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;To see all of the feed items, sort them by when they were published, and display the post title, the blog title, and the publish date, you can use this pipeline:&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;em&gt;Get-Feed -articles -recurse |        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; Sort-Object PubDate -descending |         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; Select-Object Title, @{Name='Blog'; Expression={$_.Parent.Title }}, PubDate&lt;/em&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;I promise that we can do much more with RSS and PowerShell, but hopefully this post will get you started down the path of using the FeedsManager from PowerShell.&lt;/p&gt;  &lt;p&gt;Hope this helps,&lt;/p&gt;  &lt;p&gt;James Brundage [MSFT]&lt;/p&gt;  &lt;p&gt;&lt;/p&gt;  &lt;table border="1"&gt;&lt;thead&gt;     &lt;tr&gt;       &lt;td&gt;&lt;strong&gt;Module Name&lt;/strong&gt;&lt;/td&gt;        &lt;td&gt;&lt;strong&gt;Scripts&lt;/strong&gt;&lt;/td&gt;     &lt;/tr&gt; &lt;/thead&gt;&lt;tbody&gt;     &lt;tr&gt;       &lt;td&gt;&lt;a href="http://cid-2b8a402d0ba15e82.skydrive.live.com/browse.aspx/PowerShell%20Scripts/RSS"&gt;RSS&lt;/a&gt;&lt;/td&gt;        &lt;td&gt;&lt;a href="http://cid-2b8a402d0ba15e82.skydrive.live.com/self.aspx/PowerShell%20Scripts/RSS/Get-Feed.ps1"&gt;Get-Feed&lt;/a&gt;&lt;/td&gt;     &lt;/tr&gt;   &lt;/tbody&gt;&lt;/table&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9059055" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/mediaandmicrocode/archive/tags/Microcode/default.aspx">Microcode</category><category domain="http://blogs.msdn.com/mediaandmicrocode/archive/tags/Select-Object/default.aspx">Select-Object</category><category domain="http://blogs.msdn.com/mediaandmicrocode/archive/tags/Sort-Object/default.aspx">Sort-Object</category><category 
domain="http://blogs.msdn.com/mediaandmicrocode/archive/tags/Get-Feed/default.aspx">Get-Feed</category><category domain="http://blogs.msdn.com/mediaandmicrocode/archive/tags/New-Object/default.aspx">New-Object</category><category domain="http://blogs.msdn.com/mediaandmicrocode/archive/tags/Microsoft.FeedsManager/default.aspx">Microsoft.FeedsManager</category></item></channel></rss>