Dupes be gone!

Published 06 July 06 09:28 AM | michaelaffronti 


Duplicate items are an RSS aggregator's worst enemy, and many of the dedicated folks who are using Outlook 2007 Beta 2 know we did not do a great job in that build of handling the many ways that dupes can occur.

Since the Beta 2 build, we've made numerous improvements to how the RSS architecture deals with duplicate items. These include changes to the download logic for individual feeds, the server sync if you're in an Exchange environment, and the delete behavior for individual items.

When you delete an individual RSS item from the feed's folder in Outlook 2007, we take it as "I'm done with this item and don't want to see it again." This means if the post continues to exist in the XML file we get from the content publisher for another few days (or however long it takes to roll off the end of the file), we will not download it again. Read Status is also handled the same way; mark an item as Read and its status will not change in this scenario.
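To make the delete model concrete, here is a minimal sketch in Python. It is not Outlook's actual implementation: the `FeedFolder` class, its method names, and the dictionary shape of a feed entry are all hypothetical, and it assumes each item carries a stable GUID. The idea is simply that a deleted GUID is remembered, so the item is skipped on later downloads, and that an item already in the folder keeps its read status.

```python
from dataclasses import dataclass, field


@dataclass
class FeedFolder:
    """Hypothetical in-memory model of one feed's folder (illustration only)."""
    items: dict = field(default_factory=dict)   # guid -> {"title": str, "read": bool}
    deleted: set = field(default_factory=set)   # GUIDs the user has deleted

    def sync(self, feed_entries):
        """Merge one download of the feed's XML; return the GUIDs newly added."""
        added = []
        for entry in feed_entries:
            guid = entry["guid"]
            if guid in self.deleted:
                continue  # user is done with this item: never download it again
            if guid in self.items:
                continue  # already present: leave the item and its read status alone
            self.items[guid] = {"title": entry["title"], "read": False}
            added.append(guid)
        return added

    def delete(self, guid):
        """Remove the item and remember its GUID so a later sync skips it."""
        self.items.pop(guid, None)
        self.deleted.add(guid)

    def mark_read(self, guid):
        self.items[guid]["read"] = True
```

In this sketch, syncing the same feed twice adds nothing the second time, even if an item was deleted in between, which is the behavior described above for a post that lingers in the publisher's XML file.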

If a blogger or content publisher modifies a post and wants to be sure readers see it again, they should follow the best practice of re-posting the new content. This will create a new GUID and cause Outlook (and other aggregators that follow this delete model) to see it as a new item and download it as appropriate.

Minor or non-content changes made to existing items in the feed's XML - especially random tags used by a specific aggregator or inserted automatically by the syndication engine - will not cause Outlook to see the item as new and download a duplicate. We saw a large number of duplicate feed items in Beta 2 because of this, and our improvements to the update logic for individual posts are designed to handle it. The specific logic for determining which fields to use for change detection in Outlook is now the same as in IE 7.
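One way to picture field-based change detection is a fingerprint computed over a fixed set of content fields only, so that anything else in the item is ignored. The sketch below is an illustration, not the logic Outlook or IE 7 actually uses: the choice of `title`, `link`, and `description` as the content fields is an assumption, and the function names are hypothetical.

```python
import hashlib

# Assumption for illustration: the post does not list the fields that
# Outlook and IE 7 actually compare.
CONTENT_FIELDS = ("title", "link", "description")


def content_fingerprint(entry):
    """Hash only the chosen content fields of a feed entry (a dict of strings)."""
    digest = hashlib.sha256()
    for name in CONTENT_FIELDS:
        value = entry.get(name, "")
        # Separate name and value with NUL bytes so adjacent fields cannot blur together.
        digest.update(name.encode("utf-8"))
        digest.update(b"\x00")
        digest.update(value.encode("utf-8"))
        digest.update(b"\x00")
    return digest.hexdigest()


def has_content_changed(old_entry, new_entry):
    """True only when one of the content fields differs between two versions."""
    return content_fingerprint(old_entry) != content_fingerprint(new_entry)
```

Under these assumptions, a version of an item that differs only in an extra engine-inserted tag produces the same fingerprint and is not treated as changed, while an edit to the description is.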

 

Comments

# Dan Dautrich said on July 6, 2006 1:03 PM:
Hooray!  I've noticed the "non-content changes" showing up as unread posts in my own blog feed (when I post new items and older items get the "recent posts" links updated).  Seeing 3 or 4 identical copies per post on my subscribed Microsoft blogs isn't out of the ordinary either, so this comes as a welcome improvement.  Keep up the great work, guys.
# Kevin Dente said on July 6, 2006 5:45 PM:
Have you done a survey of blogging engines to see how they behave relative to post changes?

This is definitely a tricky area, and one where different aggregators behave quite differently. IIRC, Newsgator Inbox actually uses hashes to detect post changes, which can definitely result in false positives but ensures that I see post updates that other aggregators (eg RSS Bandit) would ignore.

It's a fine line, I guess, between too much noise and missing updates. I tend to prefer Newsgator's approach myself, as I don't like to miss potentially important updates.
# Jorgen said on July 7, 2006 6:38 AM:
Michael,
Please note that this also occurs when you move an RSS feed delivery location from an ost folder to a pst folder. When you do that, you receive all the RSS feed items once again...
# The Insider by Sidebar Geek said on July 10, 2006 12:06 PM:
Michael Affronti explains how the next builds of Outlook 2007 fix major issues with feed duplicates appearing
# Make You Go Hmm: » Ill-advised RSS duplicate post solution in IE7 and Outlook 2007 said on July 10, 2006 2:46 PM:
PingBack from http://www.makeyougohmm.com/20060710/3556/
# search.subscribe.share said on July 10, 2006 4:12 PM:
There's been some conversations going on about how Outlook is handling duplicate RSS items and what that...
# nick gogerty said on July 10, 2006 4:53 PM:
Our free Outlook reader inclue!  supports video today http://mtadmin1.mailtail.com/video/inclueRSSFox.wmv

We also have a one click feed discovery and add button
http://www.inclue.com/incluebutton

We handle dupes in the same way as stated above, but are working on doing things at the item level.  

We launch a free version for Outlook Express in 3 weeks.
# Robert Burke's Weblog said on July 14, 2006 12:22 PM:
Last night, I took the plunge: I'm now using Office 2007 Beta 2 on all of my computers.
The last stronghold...
# search.subscribe.share said on September 18, 2006 1:02 PM:
With the Office Beta 2 Technical Refresh now live on the web, I wanted to take a few minutes and...
# RSS in Outlook - upgrading from Beta 2 to B2TR | Truckers' Kit said on July 9, 2008 8:40 AM:

PingBack from http://truckkit.info/rss-in-outlook-upgrading-from-beta-2-to-b2tr/

New Comments to this post are disabled
