Being Cellfish

Stuff I wished I've found in some blog (and sometimes did)

July, 2010

Change of Address
This blog has moved to
  • Being Cellfish

    CCR vs Task Parallel Library


    I was reading this about profiling gotchas and couldn't resist the urge to download the samples from the book and implement a few of them using CCR to compare performance against TPL. I played around mostly with the parallel quick-sort. The first observation is that CCR takes a little more setup than TPL, but most of that is something you do once in your application anyway, so I would consider them similar in readability. Both implementations looked very similar and straightforward. I guess that is what I like about both TPL and CCR: your code looks almost like regular single-threaded code, but with the benefits of concurrent execution.

    While executing these two variants and tweaking thresholds I noticed that CCR almost always ended up executing 10-20% slower than the equivalent TPL implementation. Looking a little closer at what was happening, I think it all made sense. First of all this was on a dual-core machine, so there were really only two threads running. CCR is built around message passing, and the way I implemented the sort algorithm was to post a message when a part of the array was sorted, signaling when all parts had been sorted. Since quick-sort works by putting a pivot number in its right place, a lot of these partial sorts completed by just putting a single number in place. That is a lot of messages being sent. The handler also must be a run-once handler that puts itself back in the queue, so that I don't update the remaining-unsorted counter concurrently. Hence, relatively speaking, a lot of time is spent in the CCR dispatcher just deciding when everything is done rather than sorting. Did that make sense without a code listing?

    Anyway... My gut feeling is that TPL is better suited for parallel computations than CCR. The code looks a little cleaner and performance is a little better. However, CCR is better suited for event-based implementations, where TPL does not offer much of an advantage. As usual it is a matter of choosing the right tool for the right problem.
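
    For reference, the TPL side of the comparison can be sketched like this. This is not the book's exact code: the Threshold value and the InsertionSort/Partition helpers below are my own stand-ins mirroring what the samples provide.

```csharp
using System;
using System.Threading.Tasks;

static class TplSort
{
    // Stand-in for the book's tunable threshold; the value is illustrative.
    const int Threshold = 16;

    public static void QuickSort(int[] array, int from, int to, int depth)
    {
        if (to - from <= 1)
            return;
        if (to - from <= Threshold)
        {
            // Small ranges: plain insertion sort, no parallelism.
            InsertionSort(array, from, to);
        }
        else
        {
            int pivot = Partition(array, from, to, from);
            if (depth > 0)
            {
                // Fork the two halves as parallel work items; the depth
                // limit stops us from drowning the scheduler in tiny tasks.
                Parallel.Invoke(
                    () => QuickSort(array, from, pivot, depth - 1),
                    () => QuickSort(array, pivot + 1, to, depth - 1));
            }
            else
            {
                QuickSort(array, from, pivot, 0);
                QuickSort(array, pivot + 1, to, 0);
            }
        }
    }

    static void InsertionSort(int[] a, int from, int to)
    {
        for (int i = from + 1; i < to; i++)
        {
            int v = a[i];
            int j = i - 1;
            while (j >= from && a[j] > v)
            {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = v;
        }
    }

    // Simple Lomuto-style partition: returns the pivot's final index.
    static int Partition(int[] a, int from, int to, int pivotIndex)
    {
        int pivot = a[pivotIndex];
        Swap(a, pivotIndex, to - 1);
        int store = from;
        for (int i = from; i < to - 1; i++)
        {
            if (a[i] < pivot)
            {
                Swap(a, i, store);
                store++;
            }
        }
        Swap(a, store, to - 1);
        return store;
    }

    static void Swap(int[] a, int i, int j)
    {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }
}
```

    The depth parameter plays the same role as the depth field in the CCR implementation described above: stop forking once enough parallelism has been created.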

    Preemptive comment: I think I could have made my CCR implementation a little more efficient, and I suspect the differences would have been smaller with more cores, but I didn't test that. I'll let you do that!

  • Being Cellfish

    TDD is not about writing tests


    This week I held a brown-bag on TDD and coding dojos in my new team. The key point was that TDD (or unit tests without TDD) should not be about writing tests. It is about documenting behavior and writing good code. As a side effect you get a test suite that makes you confident your code works as intended even when changing the code. I also emphasized that we're not making the code testable for the sake of testing it; we're writing better code, since there is a correlation between bad designs and designs that are difficult to test. Michael Feathers held a talk recently on this topic. At the same conference he also talked about a simple rule you should follow even if not doing TDD: never hide a TUF in a TUC (a Test-Unfriendly Feature in a Test-Unfriendly Construct).

  • Being Cellfish

    Good tools may hide symptoms but not cure the disease


    A few weeks ago a question was asked on an internal mailing list about how the unit test framework part of VS2010 works. Basically some client had decided to move away from that framework in favor of NUnit. Even though I think they did the right thing, I think they did it for all the wrong reasons. The problem they encountered was that tests from multiple test classes were run at the same time, and also that the class initializers for one class ran before the class cleanup for another one had completed. If I understood it correctly the problem was not with their unit tests but with a number of bigger integration tests that were implemented using a unit test framework. Still, I think that was not their real problem. The problem is that they had tests using some common resource (probably a database) in such a way that fixtures overlapped. Switching to a framework that happens not to run tests concurrently (like VS2010 does) nor in random order did not fix their problem. It just suppressed the symptoms. It's like taking painkillers for a broken leg. It may feel good for the moment but it will not help you in the long run.

    What they should have done was to fix their test fixtures. There are three things they could have done:

    • Do not use shared fixtures. The obvious advice, but it may make tests run slower and sometimes you have some resource that just needs to be shared.
    • Protect shared resources. Sometimes you have some resource you just have to share. A better approach would have been to protect it in the test code by using a lock or something similar, forcing those tests not to run in parallel.
    • Share fixtures but not data. Some fixtures take a long time to set up and it is OK to share those fixtures among lots of tests. But you should not share the data used by each test. For example, consider a fixture that needs to create a database and add data to it. If you have 20 tests that need a customer record in the database, have each test use a different customer record. That way you share the fixture but not the data.
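
    The third option can be sketched like this in MSTest. TestDatabase and the customer API are made-up illustrations, not a real library: the point is that the expensive fixture is created once per class while every test works on its own record.

```csharp
[TestClass]
public class CustomerTests
{
    // Expensive shared fixture: created once, shared by all tests in the class.
    private static TestDatabase database;

    [ClassInitialize]
    public static void CreateDatabase(TestContext context)
    {
        database = TestDatabase.Create(); // slow setup happens only once
    }

    [TestMethod]
    public void UpdatingAddressKeepsName()
    {
        // Shared fixture, private data: this record belongs to this test
        // only, so run order and concurrency no longer matter.
        Customer customer = database.AddCustomer("address-test-customer");
        customer.Address = "1 Main St";
        Assert.AreEqual("address-test-customer", customer.Name);
    }

    [TestMethod]
    public void DeletingCustomerRemovesIt()
    {
        Customer customer = database.AddCustomer("delete-test-customer");
        database.Delete(customer);
        Assert.IsNull(database.FindCustomer("delete-test-customer"));
    }
}
```
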
  • Being Cellfish

    Good tools may hide symptoms but not cure the disease - mocking frameworks edition


    Yesterday was about writing good test fixtures rather than using a tool that lets you get away with bad designs. The same thing goes for mocking frameworks. If you use a mocking framework you should avoid using it to fake things you wrote yourself, and you should use it rarely for anything else. That way you're more likely to end up with a good design, in my opinion.
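
    For things you wrote yourself, a hand-rolled fake is usually all you need. A sketch (IMailSender and the names here are made up for illustration):

```csharp
using System.Collections.Generic;

// A made-up interface that you own.
public interface IMailSender
{
    void Send(string to, string body);
}

// A hand-rolled fake: a few lines, no mocking framework, and the test
// reads exactly like the behavior it documents.
public class FakeMailSender : IMailSender
{
    public readonly List<string> Recipients = new List<string>();

    public void Send(string to, string body)
    {
        Recipients.Add(to);
    }
}
```
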

  • Being Cellfish

    How to not dispatch work in CCR


    Being new to the Robotics team also means I get a chance to make some beginner mistakes with CCR. Let's assume that you have a method that does a lot of work and you want to use CCR to do the work. Before using CCR your code would have looked like this:

    private void ProcessWithoutCcr()
    {
        // Do Work
        // Do a lot more work that takes a lot of time
    }

    If all the work you're doing just takes time because it needs the CPU, the method is actually OK. But if a lot of the work involves files, the network etc. you want to use a series of yield returns to split up the execution. If you're not really interested in the result, or the methods cannot fail, you (as a beginner) would maybe do something like this:

    private void DoSomething()
    {
        // Do Work
    }

    private IEnumerator<ITask> DoSomethingElse()
    {
        // Do work that takes a lot of time
        // and split up with yield returns
        yield break;
    }

    public IEnumerator<ITask> Process()
    {
        yield return Arbiter.FromHandler(DoSomething);
        yield return Arbiter.FromIteratorHandler(DoSomethingElse);
    }

    This will however not work; only DoSomething will be executed. The problem is how the execution of IEnumerator<ITask> handlers is implemented in CCR. To get it to work you can use a SuccessFailurePort, either by passing it in or by returning it. Both approaches are shown in this example:

    private SuccessFailurePort DoSomethingWithPort()
    {
        // Do Work
        SuccessFailurePort port = new SuccessFailurePort();
        port.Post(SuccessResult.Instance);
        return port;
    }

    private IEnumerator<ITask> DoSomethingElseWithPort(SuccessFailurePort port)
    {
        // Do work that takes a lot of time
        // and split up with yield returns
        port.Post(SuccessResult.Instance);
        yield break;
    }

    public IEnumerator<ITask> ProcessWithPort()
    {
        yield return DoSomethingWithPort().Choice(
            CcrServiceBase.EmptyHandler, CcrServiceBase.EmptyHandler);
        SuccessFailurePort port = new SuccessFailurePort();
        SpawnIterator(port, DoSomethingElseWithPort);
        yield return port.Choice(
            CcrServiceBase.EmptyHandler, CcrServiceBase.EmptyHandler);
    }

    In this example no real handler is used for the port, so it could be simplified with a simpler port type, but the SuccessFailurePort pattern is pretty nifty so why not use it...
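
    If you really only care about completion, that simplification could look roughly like this. This is an untested sketch using CCR's EmptyValue message type instead of SuccessFailurePort; the method names are made up.

```csharp
private IEnumerator<ITask> DoSomethingElseWithDonePort(Port<EmptyValue> donePort)
{
    // Do work that takes a lot of time,
    // split up with yield returns
    donePort.Post(EmptyValue.SharedInstance);
    yield break;
}

public IEnumerator<ITask> ProcessSimplified()
{
    Port<EmptyValue> donePort = new Port<EmptyValue>();
    SpawnIterator(donePort, DoSomethingElseWithDonePort);
    // Wait for the single completion message; the handler does nothing.
    yield return Arbiter.Receive(false, donePort, CcrServiceBase.EmptyHandler);
}
```

    The trade-off is that, unlike SuccessFailurePort, there is no built-in way to propagate a failure back to the caller.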

  • Being Cellfish

    Sometimes you need to be careful how you set up your tests


    I've been working with MSTest (the unit test framework that comes with Visual Studio) lately and I learned the hard way that it is a bad idea not to use the TestInitialize attribute. This may sound obvious to you, but I'm more used to other frameworks where I typically set up as much as I can with field initializers and the rest in the constructor. So while working with MSTest I did the same thing: I set up as much as I could in field initializers.

    I added a failing test, implemented it and suddenly I had multiple test errors. Not failures, but errors. The message was: Type T in assembly A is not serializable. Interestingly I had not touched T, and it's an exception type so it's definitely serializable... What a search on the Internet gives you for this error indicates problems with finding the right assemblies. Well, the build environment I'm using is kind of tweaked, but I didn't change that when the tests started to report errors.

    By looking at the tests that now had errors I managed to conclude that they all used a default constructor for an object that calls another constructor with an empty string. And that constructor does not accept an empty string. It should, but it doesn't. So now I knew how to fix it, but why the bad error message? Well, I don't know that, but I do know that when I moved the call to the constructor from the field initializer to a method with the TestInitialize attribute I got a proper test failure: Unexpected exception in initializer: T: Path must start with '/'. I guess I'll stop using field initializers when dealing with MSTest in the future...
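
    A sketch of the difference (PathWrapper is a made-up stand-in for the type whose constructor rejected the empty string):

```csharp
[TestClass]
public class PathTests
{
    // BAD: if this constructor throws, MSTest surfaces a cryptic error
    // ("Type T in assembly A is not serializable") instead of a failure.
    // private PathWrapper wrapper = new PathWrapper("");

    // BETTER: construct in TestInitialize; an exception thrown here is
    // reported as a readable initialization failure naming the real cause.
    private PathWrapper wrapper;

    [TestInitialize]
    public void Setup()
    {
        wrapper = new PathWrapper("/some/path");
    }

    [TestMethod]
    public void WrapperKnowsItsPath()
    {
        Assert.AreEqual("/some/path", wrapper.Path);
    }
}
```
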

  • Being Cellfish

    Having a ship party at Microsoft


    The best thing about relocating to Redmond is the opportunity to attend ship parties. Assuming that you work on a shipping product... I would like to add a few things to this list:

    • Do not light a cigarette next to somebody covered in silly-string.
    • Do not work on a team that saves budget unless you like multiple stage parties spanning several days.
    • Do not assume that the ice sculptor has limits on what they can build.
    • If you cannot pass some bushes without falling into them, you should not call your wife to be picked up.
  • Being Cellfish

    What morale events look like


    The thing about ship parties made me think about a more common event at Microsoft: morale events. What I've seen is that depending on who the hosting manager is (i.e. the manager who is everybody else's boss, directly or indirectly) your morale event will be different. On one end of the spectrum we have the manager who organizes booze cruises. In the middle you find managers who like fine dining and tasty beverages. And at the other end of the spectrum we have managers who know they need (or think they need) to provide alcohol but don't drink themselves. The first two types of morale events are typically appreciated by a lot of people; it's easy to like anything that involves tasty beverages and good food. The last type however tends to annoy a lot of people, since the supply of alcoholic beverages is often terribly underestimated, and in my opinion there is nothing worse than running out of food and drinks at a party.

    Preemptive comment: No, I don't think it's a must to provide unlimited amounts of food and drinks. I would gladly pay for a few beers, but running out of drinks just as the party is starting is pretty bad. Running out towards the end of a party is OK.


  • Being Cellfish

    High Importance leads to Less Effectiveness


    A while ago I stumbled over this article. It talks about how teams working on important projects tend to revert to less effective ways of communicating and executing their work. I think it makes sense. Among amateurs it is quite common to fail a task under pressure that they would have no problem completing otherwise. Everybody knows how easy it is to get all the answers right watching Jeopardy, but the people actually participating make all kinds of stupid mistakes. And in my work I've seen projects where the most important pieces have been late while less important pieces were completed ahead of schedule. I've also seen small teams doing proofs of concept that exceed the accomplishments of much larger teams in the same time frame. Previously I've explained that with the overhead of large teams, but this article made me wonder if there is another dimension to it, since underdog projects always seem to be a success.

  • Being Cellfish

    Parallel QuickSort using CCR


    Since there were some requests for the code of the parallel QuickSort mentioned here, I decided to post the code I used. This is not a complete listing; you need the samples that can be downloaded from here. This listing is an extension to the Sort class in Chapter6/ParallelSort/Sort.cs.

    private static DispatcherQueue queue =
        new DispatcherQueue("CcrQueue");
    private static Port<SortInfo> infoPort = new Port<SortInfo>();
    private static Port<int> resultPort = new Port<int>();
    private static EventWaitHandle done =
        new EventWaitHandle(false, EventResetMode.ManualReset);
    private static int remaining = 0;
    private static int[] theArray = null;

    class SortInfo
    {
        public int from;
        public int to;
        public int depth;
    }

    public static void CcrQuickSort(int[] array)
    {
        theArray = array;
        remaining = array.Length;
        Arbiter.Activate(queue,
            Arbiter.Receive(true, infoPort, CcrSortPart));
        Arbiter.Activate(queue,
            Arbiter.Receive(false, resultPort, CcrSortPartDone));
        infoPort.Post(
            new SortInfo
            {
                from = 0,
                to = array.Length,
                depth =
                    (int)Math.Log(Environment.ProcessorCount, 2) + 4
            });
        done.WaitOne();
    }

    private static void CcrSortPartDone(int count)
    {
        remaining -= count;
        if (remaining <= 0)
            done.Set();
        else
            Arbiter.Activate(queue,
                Arbiter.Receive(false, resultPort, CcrSortPartDone));
    }

    private static void CcrSortPart(SortInfo info)
    {
        if (info.from ==
            return;
        if ( - info.from <= Threshold)
        {
            InsertionSort(theArray, info.from,;
            resultPort.Post( - info.from);
        }
        else
        {
            int pivot = info.from;
            pivot = Partition(theArray, info.from,, pivot);
            if (info.depth > 0)
            {
                infoPort.Post(new SortInfo
                {
                    from = info.from,
                    to = pivot,
                    depth = info.depth - 1
                });
                infoPort.Post(new SortInfo
                {
                    from = pivot + 1,
                    to =,
                    depth = info.depth - 1
                });
            }
            else
            {
                CcrSortPart(new SortInfo
                {
                    from = info.from,
                    to = pivot
                });
                CcrSortPart(new SortInfo
                {
                    from = pivot + 1,
                    to =
                });
            }
            resultPort.Post(1);
        }
    }

    The use of a static array and a remaining counter is obviously not a great design for your typical application, but it sneaked in there in the hunt for saving a few microseconds. Didn't really matter...
