Delays delivering mail alerts in TFS 2010

I've seen a few people complain about longer delays delivering mail alerts in TFS 2010.  The reason is that, by default, we batch all notifications and process them every 2 minutes.  This was done to achieve higher scale on very high volume servers.  In retrospect, a 2-minute delay was not the best default.  We will look at changing this for SP1.  In the meantime, you can change the default yourself.  You can set the delay to 0 if you want, in which case you'll get the same behavior you saw in TFS 2008.  Chris Sidi wrote a post giving some details of how this works.
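
To make the batching idea concrete, here is a minimal sketch in Python.  It is not the TFS implementation and does not use any TFS API; the class name `BatchingNotifier`, the `deliver` callback, and the explicit `now` timestamps are all assumptions made for illustration.  It only shows the general pattern: events queue up and go out together once the delay has elapsed, and a delay of 0 sends each event right away.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class BatchingNotifier:
    """Hypothetical notifier that queues events and delivers them in batches."""

    deliver: Callable[[List[str]], None]  # called once per batch of events
    delay_seconds: float = 120.0          # mirrors the 2-minute default in the post
    _pending: List[str] = field(default_factory=list, init=False)
    _last_flush: float = field(default=0.0, init=False)

    def publish(self, event: str, now: float) -> None:
        """Queue an event; with a delay of 0 it is delivered immediately."""
        self._pending.append(event)
        if self.delay_seconds == 0:
            self.flush(now)

    def tick(self, now: float) -> None:
        """Called periodically; flushes once the delay has elapsed."""
        if self._pending and now - self._last_flush >= self.delay_seconds:
            self.flush(now)

    def flush(self, now: float) -> None:
        """Deliver everything queued so far as a single batch."""
        batch, self._pending = self._pending, []
        self._last_flush = now
        if batch:
            self.deliver(batch)
```

With the default delay, two events published at 10 and 20 seconds are both held until a `tick` at 120 seconds and then delivered as one batch.  With `delay_seconds=0`, each `publish` call delivers its event on the spot.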

Also, we are working on a Power Tool that will provide a UI for browsing and modifying these kinds of settings.

Brian

  • I don't understand Microsoft when it comes to stuff like this.  When something works well, why does MS feel a need to mess with it?  

  • Start by realizing that "Microsoft" doesn't make decisions.  Microsoft is a whole bunch of people and those people make decisions.  Occasionally they make them without realizing the ramifications of them and there are negative consequences.  This one was made as the team was doing performance work on TFS and they were seeing a significant amount of time (like 30%) spent processing event notifications when the system was under high load.  They reduced it significantly by batching the event processing.  I think, at the time, they were thinking that a 2 minute delay in email notifications would be negligible compared to the already unpredictable delay in mail delivery.  I think they failed to consider other uses for eventing where timely delivery may be important.

    Brian

  • Brian,

    I, and I trust others, appreciate your willingness to engage reasonable and understandable criticism directly, as you have here, in an even-handed and intellectually honest fashion. I wish this were more often the case with the responses I read in the MSFT blog/forum sphere to reasonable criticism and pointed questioning.

    That said, the scenario you relate as to how this particular decision was made would seem to point to a potential problem with the larger product dev and testing process, i.e., a myopic view of product development and testing whereby corner cases of one kind or another are the final driver for change from one release to another.

    A high-volume TFS server that is, as you relate, under a load high enough to actually bog down an email server, and that therefore requires 2-minute delays for batch processing, can only be called a corner case. It cannot possibly represent the majority of license-paying TFS customers and their typical TFS installations.

    Correct me if I am wrong on this point. Choosing a default that solves, out of the box, only the email issues of the largest TFS installations seems myopic and, at worst, illogical and lacking thoughtful consideration.

    I don't understand how a decision that benefits only the most sophisticated, largest, and most active TFS installations trumps the majority of installations, where an arbitrary two-minute delay for email alerts and events would obviously cause, and has caused, confusion, concern, and dissatisfaction.

    Is this case an aberration or a symptom?

    I appreciate any learning you may be patient and kind enough to offer.

    Keep up the great work by the way, I have benefitted immensely from your mentoring and the power of example that your work and community involvement demonstrate.

    -adam

  • Adam, not sure what you're talking about. In our shop, people are not sitting next to each other with one pressing the check-in button and another one using the stopwatch to trace the time it takes to deliver an email with pure FYI content.

    Let me put it differently: if an employee already knows that a notification email will arrive, it doesn't make sense to wait for the arrival at all. Plus, as Brian already said, it could also have been a delay in email delivery.

  • @Ooh :: I appreciate you sharing. I guess that makes two of us, as I have no idea what you're talking about (nor do I understand your agenda or point at all). It is, however, good to know that this issue, which Brian described as having been an issue for many (hence, I imagine, the genesis of his post), is not in fact an issue "in your shop". Congratulations on that.

    -adam

  • We have, indeed, received feedback from a number of people about it, but to my knowledge, very few of them complained about the delay in delivering email.  The complaints I'm aware of are the result of people capturing the same events that trigger emails and doing something programmatic with them.  The inserted delay increases the probability that something else happens to the same resources before the event fires, which increases the occurrence of conflicts.

    I don't think it's a corner case - we have many customers with very high scale servers.  However, I agree it was not an appropriate default.  FWIW, as a general rule, I don't like software that requires configuration to make it behave well.  I prefer software that is self tuning - it has been one of the TFS principles from day one.  In this case, I believe the right answer would be to fire the event immediately and batch them when under heavy load.  That way no one has to learn about obtuse settings.  Absent that, I agree that defaulting to instantaneous and having a configurable delay would have been better.

    I'm not sure how to answer your question about the pattern.  For the most part, people make good decisions.  This decision was not arbitrary - the performance issue was creating problems getting accurate results from our stress tests and something needed to be done.  Everyone makes mistakes.

    A friend of mine has a saying: "Good decisions come from experience.  Experience comes from bad decisions."  In the grand scheme of things this was not anywhere near one of the worst decisions that I've seen.  There's a workaround for those who have an issue with this and we'll fix it at our earliest opportunity.

    Brian

  • Brian,

    Thank you very much for this learning. I really appreciate your willingness to spend time crafting this further detail, and I hope others find it as helpful and guiding as I do. Thanks again, Adam.

  • Hi Brian

    I ran into this issue while working on a process tool. It's an extension to manage more flexible workflows and human interaction. There are many things that are unclear or not documented. Is it possible to discuss my problems, thoughts and ideas with one of you (the product team)?

    If yes, please send me an email: nospam [the_email_sign] mastertheteam[dot]net

    Thanks for your ongoing community support!

    Nicolas

  • Hi - was the power tool for managing these settings released?  I've had a look in the TFS power tool 2010 download but I'm not seeing anything obvious.

    Cheers,

    Richard

  • No, unfortunately not yet.  I'm hoping to get it in the next Power Tools release.

    Brian
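
The self-tuning approach Brian describes in the comments (fire the event immediately, and batch only when under heavy load) could look roughly like the following Python sketch.  This is an illustration under stated assumptions, not how TFS works: the class name `AdaptiveNotifier`, the rate threshold, the window length, and the batch delay are all hypothetical values chosen for the example.

```python
from collections import deque
from typing import Callable, Deque, List


class AdaptiveNotifier:
    """Hypothetical notifier: immediate delivery when idle, batching when busy."""

    def __init__(self, deliver: Callable[[List[str]], None],
                 max_rate: int = 10, window_seconds: float = 1.0,
                 batch_delay: float = 120.0) -> None:
        self.deliver = deliver
        self.max_rate = max_rate              # events per window before batching starts
        self.window_seconds = window_seconds  # length of the load-measuring window
        self.batch_delay = batch_delay        # how long a batch is held under load
        self._recent: Deque[float] = deque()  # timestamps of recent events
        self._pending: List[str] = []
        self._batch_started = 0.0

    def publish(self, event: str, now: float) -> None:
        # Forget timestamps that have fallen out of the measuring window.
        while self._recent and now - self._recent[0] > self.window_seconds:
            self._recent.popleft()
        self._recent.append(now)

        # Light load and nothing queued: deliver right away.
        if not self._pending and len(self._recent) <= self.max_rate:
            self.deliver([event])
            return

        # Heavy load (or a batch is already open): queue to preserve ordering.
        if not self._pending:
            self._batch_started = now
        self._pending.append(event)

    def tick(self, now: float) -> None:
        """Called periodically; flushes an open batch once its delay has elapsed."""
        if self._pending and now - self._batch_started >= self.batch_delay:
            batch, self._pending = self._pending, []
            self.deliver(batch)
```

The point of the design is that nobody has to tune a setting: a quiet server behaves as if the delay were 0, and the delay only kicks in once the event rate crosses the threshold.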
