Posts
  • Eric Gunnerson's Compendium

    A little something that made me happy…

    • 0 Comments

    Last week, I was doing some work on a utility I own. It talked to some servers in the background that could be slow at times and there was no way to know what was happening, so I needed to provide some way of telling the user that it was busy.

    I started writing a unit test for it, and realized I needed an abstraction (R U busy? Yes, IBusy):

    public interface IBusy
    {
        void Start();
        void Stop();
    }

    I plumbed that into the code, failed the test, and then got it working, but it wasn’t very elegant. Plus, the way I have my code structured, I had to pass it into each of the managers that do async operations, and there are four of those.

    The outlook was not very bright, but I can suffer when required, so I started implementing the next set.

    Halfway through, I got an idea. When I added in the asynchronous stuff, I needed a way to abstract that out for testing purposes, so I had defined the following:

    public interface ITaskCreator
    {
        void StartTask<T>(Func<T> func, Action<T> action  );
    }

    This is very simple to use; pass in the function you want to happen asynchronously and the action to process your result. There is a TaskCreatorSynchronous class that I use in my unit tests; it looks like this:

    public void StartTask<T>(Func<T> func, Action<T> action)
    {
        action(func());
    }

    What I realized was that the times I needed to show the code was busy were exactly the times when I was running a task, and I already had a class that knew how to do that.  I modified TaskCreator:

    public class TaskCreator : ITaskCreator
    {
        public EventHandlerEmpty StartedTask;
        public EventHandlerEmpty FinishedTask;

        public void StartTask<T>(Func<T> func,
            Action<T> action  )
        {
            if (StartedTask != null)
            {
                StartedTask();
            }

            Task<T>.Factory.StartNew(func)
                .ContinueWith((task) =>
                {
                    action(task.Result);
                    if (FinishedTask != null)
                    {
                        FinishedTask();
                    }
                }, TaskScheduler.FromCurrentSynchronizationContext());
        }
    }

    It now has an event that is called before the task is started and one that is called after the task is completed. All my main code has to do is hook up appropriately to those events, and any class that uses that instance to create tasks will automatically get the busy functionality.

    I am happy when things turn out so neatly.

  • Eric Gunnerson's Compendium

    What makes a good metric?

    • 0 Comments

    I got into a discussion at work today about metrics – a discussion about correctness vs utility – and I wrote something that I thought would be of general interest.

    ------

    The important feature of metrics is that they are useful, which generally means the following:

    a) Sensitive to the actual thing that you are trying to measure (ie when the underlying value changes, the metric changes).

    b) Positively correlated with the thing you are trying to measure (a change in the underlying value produces a move in the correct direction of the metric).

    c) Not unduly influenced by other factors outside of the underlying value (ie a change in the underlying usage does not have a significant effect on the metric).

    Those give you a decent measure. It’s nice to have other things – linearity, where a 10% in the underlying value results in a 10% move in the metric – but they aren’t a requirement for utility in many cases.

    To determine utility, you typically do a static analysis, where you look at how the metric is calculated, figure out how that relates to what you are trying to measure, and generally try to come up scenarios that would break it. And you follow that up with empirical analysis, where you look at how it behaves in the field and see if it is generating the utility that you need.

    The requirements for utility vary drastically across applications. If you are doing metrics to drive an automated currency trading system, then you need a bunch of analysis to decide that a metric works correctly. In a lot of other cases, a very gross metric is good enough – it all depends on what use you are going to make of it.

    ------

    Two things to add to what I wrote:

    Some of you have undoubtedly noticed that my definition for the goodness of a metric – utility – is the same definition that is use on scientific theories. That makes me happy, because science has been so successful, and then it makes me nervous, because it seems a bit too convenient.

    The metrics I was talking about were ones coming out of customer telemetry, so the main factors I was worried about were how closely the telemetry displayed actual customer behavior and whether we were making realistic simplifying assumptions in our data processing. Metrics come up a lot in the agile/process world, and in those cases confounding factors are your main issue; people are very good at figuring out how to drive your surrogate measure in artificial ways without actually driving the underlying thing that you value.

  • Eric Gunnerson's Compendium

    Port/Adapter/Simulator and UI

    • 0 Comments

    I’ve been working on a little utility project, and I’ve been using port/adapter/simulator on both the server-side parts and on the UI parts. It has been working nicely, though it took me a while to get there.

    Initially, I started with a single UI class. After a bit of extension, it looked a bit ugly, so I decided to break it apart by functional area – there’s a main working area, there’s a favorites area, there’s an executing area, and there’s a config area. For each area, it looks something like this:

    IUIFavorites

    -> UIFavorites
    -> UIFavoritesSimulator (really more of a mock than a simulator)

    FavoritesManager(IUIFavorites, IUIStore, etc. )

    The UI side handles just that – the UI – and the manager part handles the business logic. The UI part exposes events for user actions a properties and methods for modification.

    There was one slightly sticky part of this. There are times when the working area manager needs to add itself to the favorites. Options I thought of:

    1) Passing the UIWorking object to the favorites manager.

    2) Passing the working manager to the favorites manager.

    3) Hooking the UIworking event to a favorites manager method in the main creation code.

    4) Hooking a working manager event to a favorites manager method in the main creation code.

    I didn’t like #1 or #2, so I ended up doing #4. #3 also seemed okay.

  • Eric Gunnerson's Compendium

    The no bugs journey part 3–what kind of bug is this?

    • 0 Comments

    If you are in a buggy group, you have a lot of bugs.

    After writing the preceding, I’ll endeavor to make the rest of this post slightly less obvious.

    Anyway, so, you have a lot of bugs, and you need to get better. How can you get started?

    Well, there’s a common technique in agile called root-cause analysis. It’s a good technique, but your team isn’t ready for it and you can’t afford to do it.

    Instead, I recommend a technique I came up (probably not uniquely) that I call “bug categorization”. I started this when I team that I was on had a notably buggy iteration, and I noticed that a fair number of the bugs were just sloppy – not reading the spec, not running the app after they had made a change, that sort of thing.

    What I wanted was a way to help the team to clean up their act without being prescriptive.

    So, I went through all the bugs – something like 120 – and did a really quick classification on each of them, putting them into one of the following categories:

    • Foreseeable: This is a bug that we should have caught.
    • External: Somebody else broke us.
    • Existing: This bug existed in past versions of our software.
    • Spec: There was a problem with the spec.
    • Other: Something else.

    At our retrospective, I presented a graph of the bugs, and then put up a post-it that said, “too many bugs”.

    And then I let the team work on the problem. And they did great, cutting the foreseeable bugs significantly in just a few iterations despite me expanding the definition of foreseeable to include more bugs.

    And while my initial impetus was to reduce the sloppy bugs, the team spent a lot of time fixing bugs related to definition and communication. For example, they invented a dev/test/pm meeting that the start of a story to make sure everybody was on the same page.

    So, that’s bug categorization.

  • Eric Gunnerson's Compendium

    No bugs journey episode 2: It’s a matter of values…

    • 6 Comments

    Read episode 1 first.

    Here we are at episode 2, and time for another question. Of the three, which one do you value the most?

    1. Shipping on schedule
    2. Shipping with a given set of features
    3. Shipping with high quality

    Write down your answer.

    Ha ha! It was a trick question. Pretty much everybody is going to say, “it depends”, because it’s a tradeoff between these three. Let’s try an alternate expression:

    1. Build it fast
    2. Build a great product
    3. Build it right

    All of these are obviously important. Hmm. Let’s try a different tact…

    If you get in a situation where you need to choose between schedule, features, and quality, which one would you choose?

    1. Shipping on schedule with the right set of features but adequate quality
    2. Shipping late with the right set of features and high quality.
    3. Shipping on schedule and with high quality but with fewer features than we had hoped

    Write down your answer.

    Let’s talk about option #1. First off, it doesn’t exist. The reason it doesn’t exist is that there is a minimum quality level at which you can survive and grow as a company. If you are trying to hold features and schedule constant, as you develop features you are forcing the quality level down – it *has to* go down because something has to give if you get behind (short estimates, etc.). That means you start with a product that you just shipped, and then it degrades in quality as you do features, and then at some point you want to ship, so you realize you need quality, so you focus on getting back to *adequate*. Unfortunately, fixing bugs is the activity with the most schedule uncertainty, so there is no way you are shipping on time.

    It’s actually worse than this. Typically, you’ve reached a point where your quality is poor and you know that there is no way that you are going to get the features done and reach acceptable quality, so you cut features.

    And you end up with a subset of features that ships late with barely adequate quality. There’s an old saying in software that goes, “schedule/features/quality – pick 2”, but what few people realize is that it’s very easy to pick zero.

    I hope I’ve convinced you that the first approach doesn’t work, and given that I’m saying this series is about “no bugs”, you probably aren’t surprised. Let’s examine the other options.

    I’ve see #2 work for teams; they had a feature target in their head, built their software to high quality standards, and then shipped when they were done. This was common in the olden days when box products were common, when the only way to get your product out there was produce a lot of CDs (or even diskettes) and ship them out to people. It worked, in the sense that companies that used the approach were successful.

    It is, however, pretty disruptive on the business as a whole; it’s hard to run a business where you don’t know:

    • If new revenue is going to show up 1 year or 3 years from now
    • When you’ll be spending money on sales and marketing
    • Whether the market has any interest in buying what you build

    Not to mention making it tough on customers who don’t know when they’ll get that bug fix and enhancement they requested.

    Which leaves option #3, which through an amazing coincidence, is the one that is the most aligned with agile philosophy. Here’s one way to express it:

    Given that:

    • It is important to the rest of the business and our customers that we be predictable in our shipping schedule
    • Bugs slow us down, make customers unhappy, and pose significant schedule risk
    • The accuracy at which we can make estimates of the time to develop features is poor
    • Work always expands (people leave the team or get sick, we need to spend time troubleshooting customer issues, engineering systems break), forcing us to flex somewhere

    The only rational option that we have is to flex on features. If we are willing to accept that features don’t show up as quickly as we would like (or had originally forecast), it is possible to ship on time with high quality.

    I chose the phrase “it is possible” carefully; it is possible to build such a system but not get the desired result.

    Flexing on features effectively

    Time for another exercise. Studies have shown that you will get a much better review of a plan if you spend some focused time thinking about how it might go wrong.

    Thinking about the world I just described, what could go wrong? What might keep us from shipping on time with high quality?

    Write down your answers.

    Here’s my list:

    1. At the time we need to decide to cut features, we might have 10 features that are 80% done. If we move out all 10 of them to the next iteration, we have nothing to ship.
    2. We might have a hard time tracking where we are early enough to make decisions; most people have seen a case where all the features were on schedule until a week before shipping and suddenly 25% were behind by a week or more. If this happens, it may be too late to adapt.
    3. We might have teams that are dependent on each other; the only way to make my schedule is to cut work from team A, but that means that teams B & C can’t finish their work as planned, and they will have to adjust, using time we don’t have.
    4. This release was feature A and a few bugfixes, and feature A isn’t on schedule. We’ll look silly if we just shipped bugfixes.

    (At this point I’m really hoping that you don’t have something important on your list that I’ve missed. If so, that’s what comments are for…)

    How can we mitigate? Well, for the first one, we can focus on getting one feature done before we move on to the next one. That means that we would have 8 features 100% done instead of 10 features 80% done. This is one of the main drivers for the agile “work together as a team” approach.

    This mitigation works for the second one as well. If we make our tracking “is the feature fully complete and ready to ship”, we can tell where we are (3/10 features current done and ready to ship (“done done” in agile terminology)) and we have a better chance of predicting where we are going. This is another driver for “work together as a team”. Note that for both the first and the second one, the more granular our features are, the easier it is to make work; it works great if the team has 5-10 items per iteration but poorly if it only has two. This is the driver for “small self-contained stories” and “velocity measurement” in agile.

    I have a few thoughts on the third one. You can mitigate by using short cycles and having teams B and C wait until A is done with their work. You can try to break the work A does into parts so the part the other teams need can be delivered further.  Or you can go with a “vertical team” approach, which works great. 

    For the fourth one, the real problem is that we put all of our eggs in one basket. Chopping feature A up will give us some granularity and the chance to get part way there. I also think that a shorter cycle will be our friend; if we are giving our customers updates every month, they will probably be fine with a message that says, “this month’s update only contains bugfixes”.

    To summarize, if we have small stories (a few days or less) and we work on them sequentially (limiting how many we are working on at one time), our problems about tracking and having something good enough to ship become much more tractable. We can predict early what features are not going to make it, and simply shift them to the next cycle (iteration). That is our pressure-relief valve, the way that we can make sure we have enough time to get features completed on the shipping schedule with great quality.

    The world isn’t quite as simple as I’ve described it here, but I’ve also omitted a number of advanced topics that help out with that, so I think it’s a pretty fair overview.

    Before I go on, I’d like to address one comment I’ve heard in relation to this approach.

    If we flex on features, then I’m not going to be able to forecast where we will be in 6 months

    The reality is that nobody has ever been able to do that, you were just pretending. And when you tried, you were often spending time on things that weren’t the most important ones because of new priorities.

    Instead, how about always working on whatever is the highest priority for the business, and having the ability to adjust that on a periodic basis? How does that sound?

    The culture of commitment

    Time for another question:

    What are the biggest barriers to making this work in your organization?

    Write down your answers.

    I have a list, but I’m only going to talk about one, because it is so much more important than the rest. It’s about a culture of commitment.

    Does your organization ask development teams to *commit* to being done in the time that they estimated? Does it push back when developer estimates are too large? Do developers get emails or visits from managers telling them they need to be done on time?

    If so, you are encouraging them to write crappy code. You have trained them that being on the “not done” list is something to be avoided, and they will do their best not to avoid it. They can either work harder/longer – which has limited effectiveness and is crappy for the company in other ways – or they can cut corners. That’s all they can do.

    Pressure here can be pretty subtle. If my team tracks “days done” and I have to update my estimate to account for the fact that things were harder than I thought, that puts pressure on me to instead cut corners.

    This is one reason agile uses story points for estimation; it decouples the estimation process from the work process.

    Changing the culture

    Here are my suggestions:

    1. Get rid of any mention the words “committed” or “scheduled” WRT work. The team is *planning* what work they will attempt in the next iteration.
    2. Change the “are you going to be done?” interaction to ask, “is there anything you would like to move out to the next iteration?” Expect that initially, people aren’t going to want to take you up on this, and you may have to have some personal interactions to convince somebody that it’s really okay to do this.
    3. When you have somebody decide to move something out, make sure that you take public note of it.

      “To adjust the amount of work to the capacity of the team and maintain shippable quality, the paypal feature has been moved out to the next iteration.”
    4. When teams/individuals check in unfinished features (buggy/not complete) instead of letting those features make it into the common build, force them to remove them or disable them. Make this part of the team-wide status report for the iteration (“the ‘email invoice’ feature did not meet quality standards and has been disabled for this iteration”).
    5. I highly recommend switching to a team-ownership approach. Teams are much more likely to change how they work and improve over time than individuals.

    You will need to be patient; people are used to working in the old ways.

    This cultural change is the most important thing you can do to reduce the number of bugs that you have.

  • Eric Gunnerson's Compendium

    No Bugs Journey Episode 1: Inventory

    • 3 Comments

    Over the past few years I had the opportunity to work in an environment in which we achieved a significant reduction in bugs and an associated increase in quality and developer satisfaction.

    This series will be about how to start the journey from wherever you currently are to a place with fewer bugs, and a bunch of things to think about and perhaps try on your team. Some will be applicable for the individual developer, some will be about an engineering team, some will be for the entire group.

    As with all things agile, some techniques will work great for your team, some may require some modification, and some may not work at all. Becoming agile is all about adopting an experimental mindset.

    Whether you are on a team awash in bugs or you are already working towards fewer bugs, I hope you will find something of value.

    Inventory

    We’re going to start by stepping back, taking stock of the situation, and practicing the “Inspect” part of the agile “Inspect and adapt” cycle. We’ll be using a feedback technique that is often used in agile groups.

    <aside>

    The agile software community is very effective at finding useful techniques outside of the field and adopting them, which is great. They are not great at giving credit for where those techniques were created, so there are a lot of “agile” practices that really came from somewhere else. Which is why I wrote “is often used in agile groups” rather than “agile technique”…

    The technique we’re going to use is known as “affinity mapping”, which has been around for a long time.

    </aside>

    (Note: the following exercise works great as a team/group activity as well. If you want to do it as a group exercise, go find a few videos on affinity mapping and watch them first).

    Find some free time and a place where you won’t be disturbed. Get a stack of sticky notes and a pen.

    Your task is to write down the issues that are contributing to having lots of bugs, one per sticky note. Write down whatever comes into your head, and don’t spend a lot of time getting it perfect. You can phrase it either as a problem (“lots of build breaks”) or a solution (“set up a continuous integration server”).

    Start with what is in the forefront of your mind, and write them down until you run out of things to write. There is no correct number to end up with; some people end up with 4 notes, some people end up with 20.

    Now that you have done that, look at this list below, and see if thinking about those areas leads you to get any more ideas. Write them down as well.

    1. Planning
    2. Process tracking and management
    3. Reacting to changes
    4. Interactions between roles (dev, test, management, design, customer)
    5. Recognition – what leads to positive recognition, what leads to negative recognition
    6. Who is respected on the team, and why?
    7. Promotion – what sort of performance leads to promotion
    8. Developer techniques
    9. How is training handled
    10. Who owns getting something done

    This second part of the exercise is about thinking different (aka “out of the box”), in looking for non-obvious causes. I’m explicitly having you do it to see if you come up with some of the topics that I’m going to cover in the future.

    Clustering

    We now have a big pile of sticky notes. Our next task is to review the notes to see if there is any obvious clustering in them. Put all of the notes up on a wall/whiteboard, and look at them. If you find two notes that are closely related, put them next to each other. If you are doing this as an individuals, you will probably only find one or two clusters; if doing it as a group, you will find more.

    Note that we are not doing categorization, where each sticky is in a group. At the end I expect to see a lot of notes that aren’t in clusters. That is expected.

    Reflection

    Take a few minutes and look at what you have. My guess is that you have a decent list of issues, and my hope is that you thought of things that you weren’t thinking about before.

    At this point, you might be saying, “If I know what the issues are, why shouldn’t I just go off and start working to address them? Why do I need to read what Eric says?”

    If you said that, I am heartily in support of the first statement. Determine which sticky/cluster you (or your team) wants to tackle, figure out a small/cheap/low-risk experiment that you can try, and start making things better.

    As for my value-add, my plan is for it to add value in two ways:

    First, I’ve been through this a few times, and my guess is that you missed a few issues during the exercise. I know that I missed a ton of stuff the first time I tried this.

    Second, I’m hoping to give some useful advice around what techniques might be used for a specific issue, common pitfalls, and ordering constraints (you may have to address issue B to have a decent chance of addressing issue A).

    Next time

    Next time I’m to talk about the incentives that are present in a lot of teams and how they relate to bugs.

    No bugs journey episode 2: Stop encouraging your developers to write bugs

    It’s about to get real.

  • Eric Gunnerson's Compendium

    Unit test success using Ports, Adapters, & Simulators–kata walkthrough

    • 0 Comments

    You will probably want to read my conceptual post on this topic before this one.

    The kata that I’m using can be found at github here. My walkthrough is in the EricGuSolution branch, and I checked in whenever I hit a good stopping point. When you see something like:

    Commit: Added RecipeManager class

    you can find that commit on the branch and look at the change that I made. The checkin history is fairly coarse; if you want a more atomic view, go over to the original version of the kata, and there you’ll find pretty much a per-change view of the transformations.

    Our patient

    We start with a very simple Windows Forms application for managing recipes. It allows users to create/edit/delete recipes, and the user can also decide where to store their recipes. The goal is to add unit tests for it. The code is pretty tiny, but it’s pretty convoluted; there is UI code tied in with file system code, and it’s not at all clear how we can get tested.

    I’m going to be doing TDD as much as possible here, so the first thing to do is to dive right in and start writing tests, right?

    The answer to that is “nope”. Okay, if you are trying to add functionality, you can use the techniques in Feather’s excellent book, “Working Effectively with Legacy Code”, but let’s just pretend we’ve done that and are unhappy with the result, so we’re going to refactor to make it easier to test.

    The first thing that I want you to do is to look at the application & code, and find all the ports, and then write down a general description of what each port does. A port is something that a program uses to interface with an external dependency. Go do that, write them down, and then code back.

    The Ports

    I identified three ports in the system:

    1. A port that loads/saves/lists/deletes recipes
    2. A port that loads/saves the location of the recipes
    3. A port that handles all the interactions with the user (ie “UI”)

    You could conceivably break some of these up; perhaps the UI port that deals with recipes is different than the one that deals with the recipe storage directory. We’ll see what happens there later on.

    If you wanted, you could go to the next level of detail and write out the details of the interface of each port, but I find it easier to pull that out of the code as I work.

    How do I do this without breaking things?

    That’s a really good question. There are a number of techniques that will reduce the chance of that happening:

    1. If your language has a refactoring tool available, use it. This will drastically reduce the number of bugs that you introduce. I’m working in C#, so I’m going to be using Resharper.
    2. Run existing tests (integrated tests, other automated tests, manual tests) to verify that things still work.
    3. Write pinning tests around the code you are going to change.
    4. Work in small chunks, and test often.
    5. Be very careful. My favorite method of being very careful is to pair with somebody, and I would prefer to do it even if I have pretty good tests.

    Wherever possible, I used resharper to do the transformations.

    Create an adapter

    An adapter is an implementation of a port. I’m going to do the recipe one first. My goal here is to take all the code that deals with these operations and get it in one place. Reading through the code in Form1.cs, I see that there the LoadRecipes() method. That seems like something our port should be able to do. It has the following code:

    private void LoadRecipes()
    {
        string directory = GetRecipeDirectory();
        DirectoryInfo directoryInfo = new DirectoryInfo(directory);
        directoryInfo.Create();
    
        m_recipes = directoryInfo.GetFiles("*")
            .Select(fileInfo => new Recipe { Name = fileInfo.Name, Size = fileInfo.Length, Text = File.ReadAllText(fileInfo.FullName) }).ToList();
    
        PopulateList();
    }

    I see three things going on here. First, we get a string from another method, then we do some of our processing, then we call the “PopulateList()” method. The first and the last thing don’t really have anything to do with the concept of dealing with recipes, so I’ll extract the middle part out into a separate method (named “LoadRecipesPort()” because I couldn’t come up with a better name for it).

    private void LoadRecipes()
    {
        string directory = GetRecipeDirectory();
        m_recipes = LoadRecipesPort(directory);
    
        PopulateList();
    }
    
    private static List<Recipe> LoadRecipesPort(string directory)
    {
        DirectoryInfo directoryInfo = new DirectoryInfo(directory);
        directoryInfo.Create();
    
        return directoryInfo.GetFiles("*")
            .Select(
                fileInfo =>
                    new Recipe
                    {
                        Name = fileInfo.Name,
                        Size = fileInfo.Length,
                        Text = File.ReadAllText(fileInfo.FullName)
                    })
            .ToList();
    }

    Note that the extracted method is static; that verifies that it doesn’t have any dependencies on anything in the class.

    I read down some more, and come across the code for deleting recipes:

    private void DeleteClick(object sender, EventArgs e)
    {
        foreach (RecipeListViewItem recipeListViewItem in listView1.SelectedItems)
        {
            m_recipes.Remove(recipeListViewItem.Recipe);
            string directory = GetRecipeDirectory();
    
            File.Delete(directory + @"\" + recipeListViewItem.Recipe.Name);
        }
        PopulateList();
    
        NewClick(null, null);
    } 

    There is only one line there – the call to File.Delete(). I pull that out into a separate method:

    private static void DeleteRecipe(string directory, string name)
    {
        File.Delete(directory + @"\" + name);
    }

    Next is the code to save the recipe. I extract that out:

    private static void SaveRecipe(string directory, string name, string directions)
    {
        File.WriteAllText(Path.Combine(directory, name), directions);
    }

    That is all of the code that deals with recipes.

    Commit: Extracted recipe code into static methods

    <aside>

    You may have noticed that there is other code in the program that deals with the file system, but I did not extract it. That is very deliberate; my goal is to extract out the implementation of a specific port. Similarly, if I had been using a database rather than a file system, I would extract only the database code that dealt with recipes.

    This is how this pattern differs from a more traditional “wrapper” approach, and is hugely important, as I hope you will soon see.

    </aside>

    The adapter is born

    I do an “extract class” refactoring and pull out the three methods into a RecipeStore class. I convert all three of them to instance methods with resharper refactorings (add a parameter of type RecipeStore to each of them, then make them non-static, plus a bit of hand-editing in the form class). I also take the directory parameter and push it into the constructor. That cleans up the code quite a bit, and I end up with the following class:

    public class RecipeStore
    {
        private string m_directory;
    
        public RecipeStore(string directory)
        {
            m_directory = directory;
        }
    
        public List<Recipe> Load()
        {
            DirectoryInfo directoryInfo = new DirectoryInfo(m_directory);
            directoryInfo.Create();
    
            return directoryInfo.GetFiles("*")
                .Select(
                    fileInfo =>
                        new Recipe
                        {
                            Name = fileInfo.Name,
                            Size = fileInfo.Length,
                            Text = File.ReadAllText(fileInfo.FullName)
                        })
                .ToList();
        }
    
        public void Delete(string name)
        {
            File.Delete(m_directory + @"\" + name);
        }
    
        public void Save(string name, string directions)
        {
            File.WriteAllText(Path.Combine(m_directory, name), directions);
        }
    }
    Commit: RecipeStore instance class with directory in constructor

    Take a look at the class, and evaluate it from a design perspective. I’m pretty happy with it; it does only one thing, and the fact that it’s storing recipes in a file system isn’t apparent from the method signature. The form code looks better as well.

    Extract the port interface & write a simulator

    I now have the adapter, so I can extract out the defining IRecipeStore interface.

    public interface IRecipeStore
    {
        List<Recipe> Load();
        void Delete(string name);
        void Save(string name, string directions);
    }

    I’ll add a new adapter class that implements this interface:

    class RecipeStoreSimulator: IRecipeStore
    {
        public List<Recipe> Load()
        {
            throw new NotImplementedException();
        }
    
        public void Delete(string name)
        {
            throw new NotImplementedException();
        }
    
        public void Save(string name, string directions)
        {
            throw new NotImplementedException();
        }
    }

    The simulator is going to be an in-memory implementation of the recipe store, which which will make it very good for unit tests. Since it’s going to be in-memory, it doesn’t have any dependencies and therefore I can write unit tests for it. I’ll do that with TDD.

    Commit: RecipeStoreSimulator with tests

    It was a very simple interface, so it only took me about 15 minutes to write it. It’s not terribly robust, however; it has no error-handling at all. I now have a simulator that I can use to test any code that uses the RecipeStore abstraction. But wait a second; the tests I wrote for the simulator are really tests for the port.

    If I slightly modify my tests so that they use an IRecipeStore, I can re-purpose them to work with any implementation of that port. I do that, but I start seeing failures, because the tests assume an empty recipe store. If I change the tests to clean up after themselves, it should help…

    Once I’ve done that, I can successfully run the port unit tests against the filesystem recipestore.

    Commit: Unit tests set up to test RecipeStore

    RecipeStoreLocator

    We’ll now repeat the same pattern, this time with the code that figures out where the RecipeStore is located. I make the methods static, push them into a separate class, and turn them back into instance methods.

    When I first looked at the code, I was tempted not to do this port, because the code is very specific to finding a directory, and the RecipeStore is the only thing that uses it, so I could have just put the code in the RecipeStore. After a bit of thought, I decided that “where do I store my recipes” is a separate abstraction, and therefore having a locator was a good idea.

    Commit: RecipeStoreLocator class added

    I create the Simulator and unit tests, but when I go to run them, I find that I’m missing something; the abstraction has no way to reset itself to the initial state because the file persists on disk. I add a ResetToDefault() method, and then it works fine.

    Commit: Finished RecipeStoreLocator + simulator + unit tests

    Status check & on to the UI

    Let’s take a minute and see what we’ve accomplished. We’ve created two new port abstractions and pulled some messy code out of the form class, but we haven’t gotten much closer to be able to test the code in the form class itself. For example, when we call LoadRecipes(), we should get the recipes from the store, and then push them out into the UI. How can we test that code?

    Let’s try the same sort of transformations on the UI dependency. We’ll start with PopulateList():

    private void PopulateList()
    {
        listView1.Items.Clear();
    
        foreach (Recipe recipe in m_recipes)
        {
            listView1.Items.Add(new RecipeListViewItem(recipe));
        }
    }

    The first change is to make this into a static method. That will require me to pass the listview and the recipe list as parameters:

    private static void PopulateList(ListView listView, List<Recipe> recipes)
    {
        listView.Items.Clear();
    
        foreach (Recipe recipe in recipes)
        {
            listView.Items.Add(new RecipeListViewItem(recipe));
        }
    }
    And I’ll pull it out into a new class:
    public class RecipeManagerUI
    {
        private ListView m_listView;
    
        public RecipeManagerUI(ListView listView)
        {
            m_listView = listView;
        }
    
        public void PopulateList(List<Recipe> recipes)
        {
            m_listView.Items.Clear();
    
            foreach (Recipe recipe in recipes)
            {
                m_listView.Items.Add(new RecipeListViewItem(recipe));
            }
        }
    }

    This leaves the following implementation for LoadRecipes():

    private void LoadRecipes()
    {
        m_recipes = m_recipeStore.Load();
    
        m_recipeManagerUI.PopulateList(m_recipes);
    }

    That looks like a testable bit of code; it calls load and then calls PopulateList with the result. I extract it into a RecipeManager class (not sure about that name right now), make it an instance method, add a constructor to take the recipe store and ui instances, and pull the list of recipes into this class as well. I end up with the following:

    public class RecipeManager
    {
        private RecipeStore m_recipeStore;
        private RecipeManagerUI m_recipeManagerUi;
        private List<Recipe> m_recipes; 
    
        public RecipeManager(RecipeStore recipeStore, RecipeManagerUI recipeManagerUI)
        {
            m_recipeManagerUi = recipeManagerUI;
            m_recipeStore = recipeStore;   
        }
    
        public List<Recipe> Recipes { get { return m_recipes; } }
    
        public void LoadRecipes()
        {
            m_recipes = m_recipeStore.Load();
    
            m_recipeManagerUi.PopulateList(m_recipes);
        }
    }
    Commit: Added RecipeManager class 

    Now to test LoadRecipes, I want to write:

    [TestMethod()]
    public void when_I_call_LoadRecipes_with_two_recipes_in_the_store__it_sends_them_to_the_UI_class()
    {
        RecipeStoreSimulator recipeStore = new RecipeStoreSimulator();
        recipeStore.Save("Grits", "Stir");
        recipeStore.Save("Bacon", "Fry");
    
        RecipeManagerUISimulator recipeManagerUI = new RecipeManagerUISimulator();
    
        RecipeManager recipeManager = new RecipeManager(recipeStore, new RecipeManagerUISimulator());
    
        recipeManager.LoadRecipes();
    
        Assert.AreEqual(2, RecipeManagerUI.Recipes.Count);
        RecipeStoreSimulatorTests.ValidateRecipe(recipeManagerUI.Recipes, 0, "Grits", "Stir");
        RecipeStoreSimulatorTests.ValidateRecipe(recipeManagerUI.Recipes, 1, "Bacon", "Fry");
    }

    I don’t have the appropriate UI simulator, so I’ll extract the interface and write the simulator, including some unit tests.

    Commit: First full test in RecipeManager

    In the tests, I need to verify that RecipeManager.LoadRecipes() passes the recipes off to the UI, which means the simulator needs to support a property that isn’t needed by the new class. I try to avoid these whenever possible, but when I have to use them, I name them to be clear that they are something outside of the port interface. In this case, I called it SimulatorRecipes.

    We now have a bit of logic that was untested in the form class in a new class that is tested.

    UI Events

    Looking at the rest of the methods in the form class, they all happen when the user does something. That means we’re going to have to get a bit more complicated. The basic pattern is that we will put an event on our UI port, and it will either hook to the actual event in the real UI class, or to a SimulateClick() method in the simulator.

    Let’s start with the simplest one. NewClick() looks like this:

    private void NewClick(object sender, EventArgs e)
    {
        textBoxName.Text = "";
        textBoxObjectData.Text = "";
    }

    To move this into the RecipeManager class, I’ll need to add abstractions to the UI class for the click and for the two textbox values.

    I start by pulling all of the UI event hookup code out of the InitializeComponent() method and into the Form1 constructor. Then, I added a NewClick event to the UI port interface and both adapters that implement the interface. It now looks like this:

    public interface IRecipeManagerUI
    {
        void PopulateList(List<Recipe> recipes);
    
        event Action NewClick;
    
        string RecipeName { get; set; }
        string RecipeDirections { get; set; }
    }

    And, I’ll go off and implement these in the UI class, the simulator class, and the simulator test class.

    <aside>

    I’m not sure that NewClick is the best name for the event, because “click” seems bound to the UI paradigm. Perhaps NewRecipe would be a better name…

    </aside>

    Commit: Fixed code to test clicking the new button

    Note that I didn’t write tests for the simulator code in this case. Because of the nature of the UI class, I can’t run tests across the two implementations to make sure they are the same (I could maybe do so if I did some other kind of verification, but I’m not sure it’s worth it). This code mostly fits in the “if it works at all, it’s going to work” category, so I don’t feel that strongly about testing it.

    The test ends up looking like this:

    [TestMethod()]
    public void when_I_click_on_new__it_clears_the_name_and_directions()
    {
        RecipeManagerUISimulator recipeManagerUI = new RecipeManagerUISimulator();
    
        RecipeManager recipeManager = new RecipeManager(null, recipeManagerUI);
    
        recipeManagerUI.RecipeName = "Grits";
        recipeManagerUI.RecipeDirections = "Stir";
    
        Assert.AreEqual("Grits", recipeManagerUI.RecipeName);
        Assert.AreEqual("Stir", recipeManagerUI.RecipeDirections);
    
        recipeManagerUI.SimulateNewClick();
    
        Assert.AreEqual("", recipeManagerUI.RecipeName);
        Assert.AreEqual("", recipeManagerUI.RecipeDirections);
    }

    That works. We’ll keep going with the same approach – choose an event handler, and go from there. We’re going to do SaveClick() this time:

    private void SaveClick(object sender, EventArgs e)
    {
        m_recipeStore.Save(textBoxName.Text, textBoxObjectData.Text);
        m_recipeManager.LoadRecipes();
    }
    We’ll try writing the test first:
    [TestMethod()]
    public void when_I_click_on_save__it_stores_the_recipe_to_the_store_and_updates_the_display()
    {
        RecipeStoreSimulator recipeStore = new RecipeStoreSimulator();
        RecipeManagerUISimulator recipeManagerUI = new RecipeManagerUISimulator();
    
        RecipeManager recipeManager = new RecipeManager(recipeStore, recipeManagerUI);
    
        recipeManagerUI.RecipeName = "Grits";
        recipeManagerUI.RecipeDirections = "Stir";
    
        recipeManagerUI.SimulateSaveClick();
    
        var recipes = recipeStore.Load();
    
        RecipeStoreSimulatorTests.ValidateRecipe(recipes, 0, "Grits", "Stir");
    
        recipes = recipeManagerUI.SimulatorRecipes;
    
        RecipeStoreSimulatorTests.ValidateRecipe(recipes, 0, "Grits", "Stir");
    }  

    That was simple; all I had to do was stub out the SimulateSaveClick() method. The test fails, of course. About 10 minutes of work, and it passes, and the real UI works as well.

    Commit: Added Save
    Commit: Added in Selecting an item in the UI
    Commit: Added support for deleting recipes

    To be able to support changing the recipe directory required the recipe store to understand that concept. This was done by adding a new RecipeDirectory property, and implementing it in both IRecipeStore adapters.

    Commit: Added support to change recipe store directory

    All done

    Let’s look at what is left in the form class:

    public partial class Form1 : Form
    {
        private RecipeManager m_recipeManager;
    
        public Form1()
        {
            InitializeComponent();
    
            var recipeManagerUI = new RecipeManagerUI(listView1, 
                buttonNew, 
                buttonSave, 
                buttonDelete, 
                buttonSaveRecipeDirectory, 
                textBoxName, 
                textBoxObjectData, 
                textBoxRecipeDirectory);
    
            var recipeStoreLocator = new RecipeStoreLocator();
            var recipeStore = new RecipeStore(recipeStoreLocator.GetRecipeDirectory());
            m_recipeManager = new RecipeManager(recipeStore, recipeStoreLocator, recipeManagerUI);
            m_recipeManager.Initialize();
        }
    }

    This is the entirety of the form class; it just creates the RecipeManagerUI class (which encapsulates everything related to the UI), the RecipeStoreLocator class, the RecipeStore class, and finally, the RecipeManager class. It then calls Initialize() on the manager, and, at that point, it’s up and running.

    Looking through the code, I did a little cleanup:

    1. I renamed RecipeDirectory to RecipeLocation, because that’s a more abstract description.
    2. I renamed Recipe.Text to Recipe.Directions, because it has been buggin’ me…
    3. Added in testing for Recipe.Size

    Commit: Cleanup

  • Eric Gunnerson's Compendium

    Unit Test Success using Ports, Adapters, and Simulators

    • 2 Comments

    There is a very cool pattern called Port/Adapter/Simulator that has changed my perspective about unit testing classes with external dependencies significantly and improved the code that I’ve written quite a bit. I’ve talked obliquely about it and even wrote a kata about it, but I’ve never sat down and written something that better defines the whole approach, so I thought it was worth a post. Or two – the next one will be a walkthrough of an updated kata to show how to transform a very simple application into this pattern.

    I’m going to assume that you are already “down” with unit testing – that you see what the benefits are – but that you perhaps are finding it to be more work than you would like and perhaps the benefits haven’t been quite what you hoped.

    Ports and Adapters

    The Ports and Adapters pattern was originally described by Alistair Cockburn in a topic he called “Hexagonal Architecture”. I highly recommend you go and read his explanation, and then come back.

    I take that back, I just went and reread it. I recommend you read this post and then go back and read what he wrote.

    I have pulled two main takeaways from the hexagonal architecture:

    The first is the “hexagonal” part, and the takeaway is that the way we have been drawing architectural diagrams for years (User with a UI on top, app code in between (sometime in several layers), database and other external dependencies at the bottom) doesn’t really make sense. We should instead delineate between “inside the application” and “outside of the application”.  Each thing that is outside of the application should be abstracted into what he calls a port (which you can just think of as an interface between you and the external thing). The “hexagonal” thing is just a way of drawing things that emphasizes the inside/outside distinction.

    Dealing with externals is a common problem when we are trying to write unit tests; the external dependency (say, the .NET File class, for example) is not designed with unit testing in mind, so we add a layer of abstraction (wrapping it in a class of our own), and then it is testable.

    This doesn’t seem that groundbreaking; I’ve been taking all the code related to a specific dependency – say, a database – and putting it into a single class for years. And,  if that was all he was advocating, it wouldn’t be very exciting.

    The second takeaway is the idea that our abstractions should be based on what we are trying to do in the application (the inside view) rather than what is happening outside the application. The inside view is based on what we are trying to do, not the code that we will write to do it.

    Another way of saying this is “write the interface that *you wish* were available for the application to use”.  In other words, what is the simple and straightforward interface that would make developing the application code simple and fun?

    Here’s an example. Let’s assume I have a text editor, and it stores documents and preferences as files. Somewhere in my code, I have code that accesses the file system to perform these operations. If I wanted to encapsulate the file system operations in one place so that I can write unit tests, I might write the following:

    class FileSystem
    {
        public void CreateDirectory(string directory) { }
        public string ReadTextFile(string filename) { }
        public void WriteTextFile(string filename, string contents) { }
        public IEnumerable<string> GetFiles(string directory) { }
        public bool FileExists(string filename) { }
    }

    And I’ve done pretty well; I can extract an interface from that, and then do a mock/fake/whatever to write tests of the code that uses the file system. All is good, right? I used to think the answer is “yes”, but it turns out the answer is “meh, it’s okay, but it could be a lot better”.

    Cockburn’s point is that I’ve done a crappy job of encapsulating; I have a bit of isolation from the file system, but the way that I relate to the code is inherently based on the filesystem model; I have directories and files, and I do things like reading and writing files. Why should the concept of loading or saving a document be tied to this thing we call filesystem? It’s only tied that way because of an accident of implementation.

    To look at it another way, ask yourself how hard it would be to modify the code that uses FileSystem to use a database, or the cloud? It would be a pretty significant work item. That also means that my encapsulation is bad.

    What we are seeing – and this is something Cockburn notes in his discussion – is that details from the implementation are leaking into our application. Instead of treating the dependency technology as a trivial choice that we might change in the future, we are baking it into the application. I’m pretty sure that somewhere in our application code we’ll need to know file system specifics such as how to parse path specifications, what valid filename characters are, etc.

    A better approach

    Imagine that we were thinking about saving and loading documents in the abstract and had no implementation in mind. We might define the interface (“port” on Cockburn’s lingo) as follows:

    public interface IDocumentStore
    {
        void Save(DocumentName documentName, Document document);
        Document Load(DocumentName documentName);
        bool DoesDocumentExist(DocumentName documentName);
        IEnumerable<DocumentName> GetDocumentNames();
    }

    This is a very simple interface – it doesn’t need to do very much because we don’t need it to. It is also written fully using the abstractions of the application – Document and DocumentName instead of string, which makes it easier to use. It will be easy to write unit tests for the code that uses the document store.

    Once we have this defined, we can write a DocumentStoreFile class (known as an “adapter” because it adapts the application’s view of the world to the underlying external dependency).

    Also note that this abstraction is just what is required for dealing with documents; the abstraction for loading/saving preferences is a different abstraction, despite the fact that it also uses the file system. This is another way this pattern differs from a simple wrapper.

    (I should note here that this is not the typical flow; typically you have code that it tied to a concrete dependency, and you refactor it to something like this. See the next post for more information on how to do that).

    At this point, it’s all unicorns and rainbows, right?

    Not quite

    Our application code and tests are simpler now – and that’s a great thing - but that’s because we pushed the complexity down into the adapter. We should test that code, but we can’t test that code because it is talking with the non-testable file system. More complex + untestable doesn’t make me happy, but I’m not quite sure how to deal with that right now, so let’s ignore it for the moment and go write some application unit tests.

    A test double for IDocumentStore

    Our tests will need some sort of test double for code that uses the IDocumentStore interface. We could write a bunch of mocks (either with a mock library or by hand), but there’s a better option

    We can write a Simulator for the IDocumentStore interface, which is simply an adapter that is designed to be great for writing unit tests. It is typically an in-memory implementation, so it could be named DocumentStoreMemory, or DocumentStoreSimulator, either would be fine (I’ve tended to use “Simulator”, but I think that “Memory” is probably a better choice).

    Nicely, because it is backed by memory, it doesn’t have any external dependencies that we need to mock, so we can write a great set of unit tests for it (I would write them with TDD, obviously) that will define the behavior exactly the way the application wants it.

    Compared to the alternative – mock code somewhere – simulators are much nicer than mocks. They pull poorly-tested code out of the tests and put it into a place where we can test is well, and it’s much easier to do the test setup and verification by simply talking to the simulator. We will write a test that’s something like this:

    DocumentStoreSimulator documentStore = new DocumentStoreSimulator();
    DocumentManager manager = new DocumentManager(documentStore);
    Document document = new Document("Sample text");
    DocumentName documentName = new DocumentName("Fred");
    manager.Save(documentName);
    
    Assert.IsTrue(documentStore.DoesDocumentExist(documentName));
    Assert.AreEqual("Sample text", documentStore.Load(documentName).Text);

    Our test code uses the same abstractions as our product code, and it’s very easy to verify that the result after saving is correct.

    A light bulb goes off

    We’ve now written a lot of tests for our application, and things mostly work pretty well, but we keep running into annoying bugs, where the DocumentStoreFile behavior is different than the DocumentStoreMemory behavior. This is annoying to fix, and – as noted earlier – we don’t have any tests for DocumentStoreFile.

    And then one day, somebody says,

    These aren’t DocumentStoreMemory unit tests! These are IDocumentStore unit tests – why don’t we just run the tests against the DocumentStoreFile adapter?

    We can use the simulator unit tests to verify that all adapters have the same behavior, and at the same time verify that the previously-untested DocumentStoreFile adapter works as it should.

    This is where simulators really earn their keep; they give us a set of unit tests that we can use both to verify that the real adapter(s) function correctly and that all adapters behave the same way.

    And there was much rejoicing.

    In reality, it’s not quite that good initially, because you are going to miss a few things when you first write the unit tests; things like document names that are valid in one adapter but not another, error cases and how they need to be handled, etc. But, because you have a set of shared tests and they cover everything you know about the interface, you can add the newly-discovered behavior to the unit tests, and then modify the adapters so they all support it.

    Oh, and you’ll probably have to write a bit of code for test cleanup, because that document that you stored in your unit tests will be there the next time if you are using the file system adapter but not the memory adapter, but these are simple changes to make.

    Other benefits

    There are other benefits to this approach. The first is that adapters, once written, tend to be pretty stable, so you don’t need to be running their tests very much. Which is good, because you can’t run the tests for any of the real adapters as part of your unit tests suite; you typically need to run them by hand because they use real versions of the external dependencies and require some configuration.

    The second is that the adapter tests give you a great way to verify that a new version of the external dependency still works the way you expect.

    The simulator is a general-purpose adapter that isn’t limited to the unit test scenario. It can also be used for demos, for integration tests, for ATDD tests; any time that you need a document store that is convenient to work with. It might even make it into product code if you need a fast document cache.

    What about UI?

    The approach is clearest when you apply it to a service, but it can also be applied to the UI layer. It’s not quite as cool because you generally aren’t about to reuse the simulator unit tests the same way, but it’s still a nice pattern. The next post will delve into that a bit more deeply.

  • Eric Gunnerson's Compendium

    Tricks you can play on yourself #789–Linq

    • 2 Comments

    I was profile some code this morning, and came across some interesting behavior.

    Basically, we had some low level code that looked something like this:

    IEnumerable<Guid> GetSpecialQuestionIds()
    {
        return
          GetAllSpecialItems()
            .Select(specialItemXml => specialItemXml .CreateFromXml(questionXml))
            .SelectMany(specialItem => specialItem.Questions.Select(question => question.UniqueIdentifier)).Distinct();
            .Distinct();
    }

    So, it’s taking special item xml, deserializing each special item, and then grabbing all the unique question ids that are referenced by the special items. Perfectly reasonable code.

    Elsewhere, we did the following (the original code was spread out over 3 different classes but I’ve simplified it for you):

    var specialQuestionIds = GetSpecialQuestionIds();

    foreach (Item item in items)
    {
        var questions = item.Questions.Where(question => specialQuestionIds.Contains(question.UniqueIdentifier);
    }

    That also looks fine, but when I looked at the profile, I found that it was heavily dominated by the CreateFromXml() call. Well, actually, I did the profile first, and then looked at the code.

    The problem is the call to Contains(). It will walk every entry in specialQuestionIds, which normally would be fine, but because it’s never been realized, it will deserialize all the special items… for every question in every item.

    The fix is pretty simple – I changed GetSpecialQuestionIds() to call .ToList(), so that the deserialization only happened once, and the deserialization dropped from 65% down to 0.1% in the profile. And there was much rejoicing.

    The lesson is that you should be careful whenever you return an IEnumerable<T> that isn’t super-cheap, because the person who gets it may enumerate it over and over.

  • Eric Gunnerson's Compendium

    Simulators or not?

    • 0 Comments

    I’ve been spending some time playing with Cockburn’s hexagonal architecture  (aka “ports and adapters”), and the extension I learned from Arlo, simulators. I’ve found it to be quite useful.

    I was writing some code, and I ended up at a place I didn’t expect. Here’s the situation. I have the following external class (ie “port”).

    class EntityLoader
    {
        public EntityLoader(string connectionInformation) {}

        public IEnumerable<Entity> Fetch(EntithyType itemType) { … }
    }

    I need to use this class to some different kinds of entities, do some light massaging of data, and then query against the data. I’ll start figuring out what the adapter should be, and I’ll define it by the question that I want to ask it:

    interface IPeopleStore
    {
        IEnumerable<Employee> GetAllEmployeesForManager(Employee employee);
    }

    Now that I have the interface, I can use TDD to write a simulator that implements the interface:

    class PeopleStoreSimulator: IPeopleStore
    {
        public IEnumerable<Employee> GetAllEmployeesForManager(Employee employee) { ...}
    }

    The implementation for this will be pretty simple; I just add a way to get the list of employees for a manager into the simulator. Now I have unblocked my team members; they can code against the interface and use the simulator for their testing while I figure out how to write the version that talks to the EntityLoader.

    And this is where it got interesting…

    One of the cool things about port/simulator/adapter is that you can write one set of tests and run them against all of the adapters, including the simulator. This verifies that the simulator and the real adapter have the same behavior.

    That’s going to be problematic because the interface for Entity doesn’t give me any way to put data into it, so I can’t use the simulator tests on it. It will also do two things; fetch the data from the entity and implement the GetAllEmployeesForManager() method, and because I can’t put data into it, I don’t have a way to write a test for the method().

    It also violates one of my guidelines, which is to separate the fetching of data and the processing of data whenever possible. The problem is that we have the lookup method logic in a class that we can’t test – ie so we can’t adapt the data into what we need. That’s a good sign that adapter may not be a good choice here. How about a simpler approach, such as wrapper?

    Let’s start with the lookup logic. We’ll make PeopleStore a simple container class, and that will make the lookup logic trivial to test.

    class PeopleStore
    {
        IList<Employee> m_employees;
        IList<Employee> m_managers;

        public PeopleStore(IList<Employee> employees, IList<Employee> managers)
        {
            m_employees = employees;
            m_managers = managers;
        }
       
        public IEnumerable<Employee> GetAllEmployeesForManager(Employee employee)
        {
            …
        }
    }

    Now, I’ll work on the wrapper level. After going with an interface, I end up switching to an abstract class, because there is a lot of shared code.

    abstract class EntityStoreBase
    {
        protected IEnumerable<Employee> m_employees;
        protected IEnumerable<Employee> m_managers;

        IEnumerable<Employee> FetchEmployees() { return m_employees; }
        IEnumerable<Employee> FetchManagers() { return m_managers; }
    }

    class EntityStoreSimulator: EntityStoreBase
    {
        public EntityStoreSimulator(IEnumerable<Employee> employees, IEnumerable<Employee> managers)
        {
            m_employees = employees;
            m_managers = managers;
        }
    }

    class EntityStore : EntityStoreBase
    {
        public EntityStore(string connectionInformation)
        {
            EntityLoader loader = new EntityLoader(connectionInformation);

            m_employees = loader.Fetch(EntityType.Employee)
                                .Select(entity => new Employee(entity));
            m_managers = loader.Fetch(EntityType.Manager)
                                .Select(entity => new Employee(entity));
        }
    }

    That seems to work fine. Now I need a way to create the PeopleStore appropriately. How about a factory class?

    public static class EntityStoreFactory
    {
        public static EntityStoreBase Create(IEnumerable<Employee> employees, IEnumerable<Employee> managers)
        {
            return new EntityStoreSimulator(employees, managers);
        }

        public static EntityStoreBase Create(string connectionInformation)
        {
            return new EntityStore(connectionInformation);
        }
    }

    This feels close; it’s easy to create the right accessor and the EntityLoader class is fully encapsulated from the rest of the system. But looking through the code, I’m using 4 classes just for the entity-side part, and the code there is either not testable (the code to fetch the employees from the EntityLoader), or trivial. Is there a simpler solution? I think so…

    public static class PeopleStoreFactory
    {
        public static PeopleStore Create(IEnumerable<Employee> employees, IEnumerable<Employee> managers)
        {
            return new PeopleStore(employees, managers);
        }

        public static PeopleStore Create(string connectionInformation)
        {
            EntityLoader loader = new EntityLoader(connectionInformation);

            var employees = loader.Fetch(EntityType.Employee)
                                .Select(entity => new Employee(entity));
            var managers = loader.Fetch(EntityType.Manager)
                                .Select(entity => new Employee(entity));

            return Create(employees, managers);
        }
    }

    This is where I ended up, and I think it’s a good place to be. I have a class that is well-adapted to what the program needs (the PeopleStore), and very simple ways to create it (PeopleStoreFactory).

    Thinking at the meta level, I think the issue with the initial design was the read-only nature of the EntityStore; that’s what made the additional code untestable. So, as fond as I am of port/adapter/simulator, there are situations where a simple factory method is a better choice.

  • Eric Gunnerson's Compendium

    Identifying your vertical story skeleton

    • 0 Comments

    I’ve been leading an agile team for a while now, and I thought I would share some of the things we’ve learned. This one is about vertical slices, and learning how to do this has made the team more efficient and happier.

    To make this work you need a team that is cross-functional and has the skills to work on your whole stack (database/server/ui/whatever).

    As an example, assume that the team is working on the following story as part of a library application:

    As a authenticated library user, I can view the list of the books that I have checked out.

    The team discusses what this means, and here’s what they come up with:

    1. Authenticate the current user
    2. Fetch the list of book ids that are checked out by that user
    3. Fetch the book title and author for each book id
    4. Display the nicely-formatted information in a web page

    My old-school reaction is to take these four items, assign each of them to a pair, and when all of them are done, integrate them together, and the story will be complete. And that will work; lots of teams have used that approach over the years.

    But I don’t really like it. Actually, that’s not quite strong enough – I really don’t like it, for a bunch of reasons:

    • It requires a lot of coordination on details to keep everybody in sync. For example, changes need to be coordinated across the teams.
    • We won’t have anything to show until the whole story is done, so we can’t benefit from customer feedback.
    • Teams will likely be waiting for other teams to do things before they can make progress.
    • The different areas will take different amounts of time to finish, so some people are going to be idle.
    • Our architecture is going to be a bit clunky in how the pieces fit together.
    • It encourages specialization.
    • Nobody owns the end-to-end experience
    • Integrations are expensive; when we integrate the parts together we will likely find issues that we will have to address.

    The root problem is that the units of work are too coupled together. What I want is a work organization where the units of work are an end-to-end slice, and a pair (or whatever grouping makes sense) can go from start to finish on it.

    That seems to be problematic; the story describes a simple business goal, and it’s unclear how we can make it simpler. We *need* all the things we thought of to achieve success.

    This situation blocks most teams. And they are right in their analysis; there is no way to be simpler and to achieve success. Therefore, the only thing that might work is to redefine success.

    That’s right, we’re going to cheat.

    This cheating involves taking the real-world story and simplifying it by making it less real-world. Here’s a quick list of ways that we could make this story simpler:

    • We don’t really need the book title and author, so we redefine “list of books” to “list of book ids”.
    • The display doesn’t have to be nicely formatted, it could just be a list of book ids on a web page.
    • The display could even just be the results of a call to a web API that we make in a browser.
    • We could build the initial version as a console app, not as a web app.
    • The story doesn’t have to work for every user, it could just work for one user.
    • The list of returned books doesn’t have to be an actual list of checked-out books, it could be a dummy list.

    This is just a quick list I came up with, so there may be others. Once we have this list, we can come up with our first story:

    As a developer, I can call an API and get a predefined list of book ids back

    I’ve taken to calling this a “skeleton story”, because it’s a bare-bones implementation that we will flesh out later.

    We will go off and implement this story, deploy it as a working system, and – most importantly – verify that it behaves as it should.

    Getting to this story is the hard part, and at this point the remaining vertical slices are merely adding back the parts of the story that we took out. Here’s a possible list of enhancements:

    1. Fetch the list of book ids for a predefined user
    2. Fetch the list of book ids for a user passed into the api.
    3. Display the book ids in web page
    4. Display a book description instead of just a book id.
    5. Make the web page pretty.

    These will all be converted to stories, and we will verify that each one makes the system more real in a user-visible way. They aren’t perfect; some of the slices depend on the result of earlier slices, so we can’t parallelize across all 5 of them, and we will still need to have some coordination around the changes we make. These issues are more tractable, however, because they are in relation to a working system; discussions happen in the context of actual working code that both parties understand, and it’s easy to tell if there are issues because the system is working.

  • Eric Gunnerson's Compendium

    Rational behavior and the Gilded Rose kata…

    • 2 Comments

    The following is based on a reply to an internal post that I almost wrote this morning, before I decided that it might be of more general interest. It will take a little time to get to my point so perhaps this would be a good time to grab whatever beverage is at the top of your beverage preference backlog.

    Are you back? Okay, I’ll get started…

    A while back my team spent an afternoon working on the Gilded Rose kata. We used it to focus on our development skills using a pairing & TDD approach. This kata involves taking a very tangled routine (deliberately tangled AFAICT) and extending it to support a new requirement. It is traditionally viewed as an exercise in refactoring, and in the introduction to the kata, I suggested that my team work on pinning tests before refactoring or adding new functionality. My team found the kata to be quite enjoyable, as it involved a puzzle (unraveling the code) and the opportunity to work on code that is much worse than our existing codebase.

    This week, another team did the String Calculator kata (a nice introduction to katas and one that works well with pairing/TDD), and somebody suggested that Gilded Rose would be a good next step. I started to write a quick reply, which became a longer reply, which then evolved into this post.

    My initial reply was around how to encourage people to take the proper approach (if it’s not obvious, the “proper” approach is the one I took when I last did the kata…) – which was “write pinning tests, refactor to something good, and then add new functionality”. When writing it, however, it seemed like I was being overly prescriptive, and it would make more sense to remind people of a general principle that would lead them to a good approach.

    At this point, I realized that I had a problem. Based on the way the kata is written, I was not longer sure that my approach *was* the proper choice. The reason is that the kata provides a detailed and accurate description of the current behavior of the system and the desired changed behavior. After a bit of thought, I came up with four alternatives to move forward:

    1. Attempt to extract out the part of the code that is related to the area that I need to change, write pinning tests for it, and then TDD in the new functionality.
    2. Write pinning tests on the current implementation, refactor the code, then add new functionality.
    3. Write pinning tests on the current implementation, write new code that passes all the pinning tests, then add new functionality.
    4. Reimplement the current functionality from scratch using TDD and based on the detailed requirements, then add new functionality.

    Which one to choose?

    Well, the first one is something I commonly do in production code to make incremental changes. You can make things a little better with a little investment, and many times that’s the right choice WRT business value. If you want more info on this approach, Feather’s “Working Effectively With Legacy Code” is all about this sort of approach. It’s one of the two books that sits on my desk (anybody want to guess the other one?).

    Presuming that a kata that I finish quickly isn’t really my going, that leaves me with three options. My experience says that, if I have the Gilded Rose code and a complete specification of the desired behavior, I’m going for the last option. The problem isn’t very complex and, doing TDD is going to be straightforward, so I’m confident that I can get to an equivalent or better endpoint with less effort than dealing with refactoring the existing code.

    That conclusion was a bit surprising to me, given that this is commonly thought to be a refactoring kata, but it’s the logical outcome of having a perfect specification for the current functionality available. Last year I had the chance to fix a buggy area of code that handled UI button enabling/disabling, and I took this exact approach; I knew what the functionality would be, wrote a nice clean class & tests, and tossed the old code out. Worked great.

    In reality, however, this case is somewhat rare; most times you just have the code and your requirement is to add something new without changing the existing behavior, whatever it might be. Or, you have some requirements and some code, but the requirements are out of date and don’t match the code. So… I think the Gilded Rose kata would be a lot better if you didn’t have the description of the current behavior of the system, because:

    1. That’s when pinning becomes important, to document the existing behavior.
    2. Automated refactoring becomes more interesting because it allows me to make *safe* changes even if my tests aren’t perfect.
    3. I get led towards these approaches automatically.

    If I don’t have the description, I’m going to choose between the last two approaches. The tradeoff likely depends on how good my pinning tests are; if they are perfect I should just choose the option that is quicker. If they are less than perfect – which is generally going to be the case - then doing it through refactoring is likely a better choice.

    Or, to put it another way, the effort to write a reasonable set of pinning tests and refactor the code is generally going to be less than the effort to write a perfect set of pinning tests and write the code from scratch.

  • Eric Gunnerson's Compendium

    7 Hills 2014

    • 3 Comments

    The forecast did not look good. In fact, it looked pretty bad.

    It was Sunday of Memorial day weekend, and I was forecast-shopping. That’s what I do when I want to ride and the weather is marginal; I look at the different weather forecasts (Accuweather, wunderground, weather.com, national weather service) to see if I can find one that I like. They said – if I recall correctly – Rain, showers, showers, rain.

    I was registered to ride 7 hills for the nth time (where 5 < N < 10) on Memorial day. To be specific, I was registered to ride the 11 hills “metric century”. Quotes because a kilometer is about 8% shorter on this ride, needing only 58 miles to reach the metric century mark.

    I had tentatively agreed to ride with a few friends, which is not my usual modus operandi; after a few rides where a group ride turned into a single ride, I started doing most rides by myself.

    I rolled out of bed at 6AM on Memorial day, and took a look outside. It was wet but not raining. A look at the radar (the NWS National Mosaic is my favorite) showed that not only was there no rain showing, it looked like it was going to be that way for the next 6 hours or so.

    Normally, my ride prep would be done the night before; I’d have everything that I wanted out on the counter, appropriate clothes chosen, and a couple of premixed bottles in the fridge. Since I expected not to be riding, I had to do all of this in a bit of a hurry. I got packed, grabbed my wallet, keys, phone, and GPS, and headed out.

    I passed the first group parking on Lake Washington Blvd (people always park too far to the south), find a spot and unload. I roll into the park, get my registration band, route sheet, and find my companions. I’ll be riding with riding friends Joe and Molly, and their friends Bill and Alex. We roll out at 8:20 or so.

    Market street (Hill 1) is quickly dispatched, and we head up Juanita (Hill 2). The first two hills are fairly easy; something like 5-7% gradient max. We regroup at the top of Juanita (well, actually not the top of the hill, but the part where we head back down). My legs have felt pretty good so far, but we are coming to Seminary hill (#3), which is steeper and harder than the other two. I think it’s the second-hardest climb of the ride. It also is a bit misleading; there’s a steep kicker right at the beginning, a flat part, and then it steepens up again for the remainder of the climb.

    I start the climb. I’m have a secret weapon – my power meter. I know from the intervals that I’ve been doing that I can hold 300 watts for 2 minutes. I also know that I can hold 240 watts for 10 minutes, so I set that as my “do not exceed” level. I pass a few people, pass a few more, and before I know it, I’m at the top. I do have legs today.

    The others filter up soon after. Well, that’s not factually true; Joe and Alex finished quite a bit faster than me, and Molly and Bill filter up soon after. Joe is my benchmark for comparative insanity, so I know that him finishing in front of me just means that things are right with the world.

    We head north to descend; Joe/Molly/Bill have an almost-incident with a right-turning truck. We get on the trail and spin to Norway hill. As we approach the base, Joe is talking with a few friends, and we turn right and the climb starts. The road turns left, and I see a bunch of people on the hill. I start passing people, and strangely, nobody is passing me. I hit the stop sign, keep climbing, and eventually top out. I passed 40 people on the way up, get passed by none. Though in the spirit of full disclosure, I did pass the last 5 as they were getting ready to pull off near the top, and most of these riders are out here for the “7 hills” version of the ride.

    We head south, and turn left on 132nd. The previous course would take us all the way to my favorite intersection  - 132nd st and 132nd ave – but this year they instead route us south, and then to a food stop near Evergreen Hospital. Somewhere on the last section, the sun has popped out, and we feel pretty good. I get some sort of energy bar and pretty tasteless bagelette. After a bit too long waiting, we head out again, and take 116th north. We descend down brickyard, and turn right, heading towards back on the south towards Winery hill.

    And into the headwind. I go into ride leader mode, and settle in with the rest of the group somewhere behind me. After a few minutes, Bill – who is tall and wide like me – passes and pulls for a little bit. Soon enough, we reach the base of Winery. The route that we are taking – through the neighborhood – is a series of climbs and flats. We hit the first one, which is something like 15%, and Joe and Alex ride off. I try to stay around 300 watts on the climbs and recover a bit on the flats. Soon enough, I hit the top, and find the the 7 hills bagpiper is too busy having his picture taken with riders to play. He starts playing as Molly pulls up and we ride off down to the next food stop. The new route has changed this experience; previously you would have to climb north while being demoralized by the riders approaching because they had already finished winery, and then have the opposite feeling when you come down the same road after Winery. The new route is fine but is missing a bit of the emotional experience of the old one.

    I grab a dark chocolate chip cookie, refill my Nuun bottle and deploy some cheez-its, my wonder ride food.

    We now have a decision to make. We have done 6 hills, and we can either descend down into the Sammamish River Valley, ride south, and climb up hill #7, Old Redmond Road, or we can head east to grab an extra 4 hills before returning for the last climb. We decide to do the full metric and head east. This takes us on 116th to a short but really steep (say, 17%) climb. There’s a route via 124th that is much more gradual, so I’m not sure whether this route is because the organizers don’t know about the other route or it’s a deliberate choice.

    This is one of the downsides of being a ride leader; I know the vast majority of the roads out here and if I’m on an organized ride I’m constantly plotting where we are going versus what the other options are.

    The next climb is Novelty Hill. There really isn’t a lot of novelty involved; it’s a 500’ or so climb with a lot of fast traffic. On the way up, I find myself stuck on “If you’re happy and you know it, clap your hands”, planted by Joe a few minutes before. A few minutes later, it morphs to the surprisingly appropriate “I’ve been through the desert on a horse with no name, it felt good to get out of the rain” (America, 1971).

    We finish, regroup, and head south to Union hill road. There’s a bonus half hill here that isn’t part of the 11 hills, we finish that section, and head north to descend Novelty again, and head up NE Redmond Road (not to be confused with Old Redmond Road, which we will climb later). This is a fairly easy climb but everybody’s legs are a bit tired. Even Joe’s, though his are tired because of the miles that he has put in the past few days. Another hill top, another descent, and we head up education hill on 116th for the second time (re-education hill). That takes us to the last food stop, where I have a fairly pedestrian ham and cheese wrap and make up another bottle of Nuun. Unfortunately, it seems that I chose “moldy fruit” flavor, so I’m not too excited about it, but I choke a bit down.

    We descend, head across the valley with a vicious sidewind which turns into a headwind as we head south. I pull for Molly for the couple of miles, then Molly and Bill and I hit the base of Old Redmond Road at the same time. This is the last hill, and I open it up a bit, passing X people (5 < X < 300,000) on the way up. We crest, regroup, and head down  the last descents and the final run on Lake Washington Blvd back into Redmond. I get ahead, wait for the group, Joe goes by, and I find that I have one last sprint in my legs, so I open it up, and catch him.

    Then it’s through to the finish, chocolate milk, and strawberry shortcake.

    Normally at this point, I would talk about stats, but I only have 30 miles of the ride. I *can* say that I got PRs on Seminary, Norway, and Winery hills, so it’s pretty clear that I did have legs.

     

  • Eric Gunnerson's Compendium

    A Programmer's Guide to C# 5.0

    • 3 Comments

    My author's copies of the Fourth Edition of my book showed up today:

    It is significantly updated from the previous version. I especially enjoyed writing the sections on Linq and asynchronous features.

     

     

     

  • Eric Gunnerson's Compendium

    Small in, Big out

    • 10 Comments

    I’ve come across some code that is using – and overusing – IEnumerable<T>, and thought it might be of general interest.

    Consider the following method signatures:

    IEnumerable<string> Process(List<string> input);

    IList<string> Process(IEnumerable<string> input);

    Of the two, which is the better choice? Write down your answer.

    There are really two questions – what is the best choice for input parameters, and what is the best choice for return values. I’ll cover them separately:

    Small In

    The input parameter should be the smallest (ie least functional) type that allows the method to efficiently do what it needs to do. I’ve seen methods like this:

    void Process(List<string> input)
    {
        foreach (string in input)
        {
            …
        }
    }

    (or maybe the Linq equivalent).

    In this case, you are just making things harder for the caller; all you really need is an IEnumerable<string>, so that’s what you should specify. On the other hand, I’ve also seen code like this:

    void Process(IEnumerable<string> items, IEnumerable<ReferenceItem> referenceItems)
    {
        var lookup = referenceItems.ToDictionary(referenceItem => referenceItem.Name, referenceItem => referenceItem.Contents)'
        …
    }

    This code takes a simple enumerable, and constructs a dictionary out of it. This is worse than the first case; if you need a dictionary, ask for a dictionary.

    Big Out

    Output parameters should be the biggest (ie most functional) type that is acceptable for the scenario. I’ve seen methods like this:

    IEnumerable<string> Process(List<string> input)
    {
        List<string> items = new List<string>();

        // fill list here

        return items;
    }

    The developer has created a beautiful, functional list object, but they’re only going to let the caller enumerate over it. This often leads to my favorite outcome, where the caller writes something like:

    List<string> items = new List<string>(myClass,Process(input));

    Don’t do that. Just return the List<string> or perhaps the IList<string>.

    Return values aren’t as cut-and-dried as input parameters, however. If we modify the previous example as follows:

    IList<string> Process(List<string> input)
    {
        m_items = new List<string>();

        // fill list here

        return m_items;

    In this example, the class is retaining ownership of the list, and in that case, it probably doesn’t want to give it out to somebody who could clear it, or replace all the strings with the string “Haddock”. If the class is going to retain ownership, it should use a return type that prevents bad things like that from happening.

  • Eric Gunnerson's Compendium

    A Programmer’s Guide to C# 2012

    • 0 Comments

    More months than I’d like to think about, I decided to undertake an update of my sorely out-of-date C# book, and Saturday I submitted the last of the chapters for the rewrite. I still have the author review (after the tech review) and the final review to do on each chapter, but that’s a huge amount of work out of the way.

    Back when the last version of the book came out, C# was an upstart language that had just added generics to it. Now, several * 3 years later, it has matured into a language with a lot going for it. Over that same time period, I’ve written a lot of C# code and have much more informed opinions about what works and what doesn’t.

    When I was working on the update, I was once again reminded that there is a huge gulf between “I understand how that works enough to use it” and “I understand how that works enough to write about it”, and I’ve enjoyed writing the chapters on Linq, the new asynchronous support, and a bit about Roslyn.

  • Eric Gunnerson's Compendium

    A new year and a new group

    • 4 Comments

    I’m starting off the new year (well, in a few weeks…) in a new group. I am leaving the HealthVault team, and moving to an internal Engineering Excellence team where I will be working on… making our engineering 28.5% more excellent.

    Actually, I’m not sure what I’m able to talk about yet, so I’ll leave it at that.

  • Eric Gunnerson's Compendium

    Improve HealthVault query efficiency with final transforms…

    • 0 Comments

    A while back I wrote a post about using the built-in HealthVault transforms to get a different view on your data. That post discusses the mtt, stt, and form transforms.

    There’s another transform option in HealthVault that gives you the option to transform the response that comes back. It is best explained through a scenario. Most of this discussions is going to be at the raw xml request/response level. Some

    Getting the instance ids of all Weight items in a record

    If you want to do a full synchronization of weight values in a HealthVault record (and please read the guidance here before you decide to do so), you will need to get the instance ids of all the items. This simplest way to do this is just to ask for all the weight items, which will result in the following response:

    <response>
      <status>
        <code>0</code>
      </status>
      <wc:info xmlns:wc="urn:com.microsoft.wc.methods.response.GetThings3">
        <group>
            <thing>
            <thing-id version-stamp="8853d460-3c79-4284-aa12-08cb9730d151">645cae6f-942d-40bb-bc36-5529b6f9f46a</thing-id>
            <type-id name="Weight Measurement">3d34d87e-7fc1-4153-800f-f56592cb0d17</type-id>
            <thing-state>Active</thing-state>
            <flags>0</flags>
            <eff-date>2011-10-26T10:14:01.17</eff-date>
            <data-xml>
              <weight>
                <when>
                  <date>
                    <y>2011</y>
                    <m>10</m>
                    <d>26</d>
                  </date>
                  <time>
                    <h>10</h>
                    <m>14</m>
                    <s>1</s>
                    <f>170</f>
                  </time>
                </when>
                <value>
                  <kg>79.4010889292196</kg>
                  <display units="lbs">175</display>
                </value>
              </weight>
            </data-xml>
          </thing>
        </group>
      </wc:info>
    </response>

    Where the <thing/> element is repeated for each weight item.

    That’s a lot of data (almost 2K) just to get the single instance guid. The first thing we can do is change our request so that the response only contains the key information. This is done by telling the platform not to return you any full items, but just return them as partial items. You do that by setting the max-full attribute to “0”. If you are working with the SDK, you can set the MaxFullItemsReturnedPerRequest property and use GetMatchingItemsRaw().

    If you do this, you will get a response that looks like this:

    <response>
      <status>
        <code>0</code>
      </status>
      <wc:info xmlns:wc="urn:com.microsoft.wc.methods.response.GetThings3">
        <group>
          <unprocessed-thing-key-info>
            <thing-id version-stamp="8853d460-3c79-4284-aa12-08cb9730d151">645cae6f-942d-40bb-bc36-5529b6f9f46a</thing-id>
            <type-id name="Weight Measurement">3d34d87e-7fc1-4153-800f-f56592cb0d17</type-id>
            <eff-date>2011-10-26T10:14:01.17</eff-date>
          </unprocessed-thing-key-info>
        </group>
      </wc:info>
    </response>

    That takes us down to about 500 bytes, which is much smaller. But there’s still a lot of extraneous information there. What we really need is a way to get back only the information that we care about. That something is a HealthVault final xsl.

    A final-xsl is an application-supplied transform that is run on the response right before it is sent back to the client. In the .NET sdk, the GetTransformedItems method is used. In the pure xml world, it is done by adding a <final-xsl> element to the request after the <country> element and before the <msg-time> element. The xsl is represented as a string in the request, so it must be appropriately escaped in the request xml.

    Here’s a transform I wrote to process the unprocessed items response:

    <xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:fo="http://www.w3.org/1999/XSL/Format" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:fn="http://www.w3.org/2005/xpath-functions" xmlns:wc="urn:com.microsoft.wc.methods.response.GetThings3" exclude-result-prefixes="fn fo wc xs">
      <xsl:output omit-xml-declaration="yes"/>
      <xsl:template match="/response">
        <response>
          <xsl:copy-of select="status"/>
          <xsl:copy-of select="error"/>
          <wc:info>
            <xsl:for-each select="wc:info/group">
              <group>
                <xsl:for-each select="unprocessed-thing-key-info">
                  <id>
                    <xsl:value-of select="thing-id"/>
                  </id>
                </xsl:for-each>
              </group>
            </xsl:for-each>
          </wc:info>
        </response>
      </xsl:template>
    </xsl:stylesheet>

    Which results in the following output:

    <response>
      <status>
        <code>0</code>
      </status>
      <wc:info xmlns:wc="urn:com.microsoft.wc.methods.response.GetThings3">
        <group>
          <id>645cae6f-942d-40bb-bc36-5529b6f9f46a</id>
        </group>
      </wc:info>
    </response>

    That takes us down to about 250 bytes, but still preserves the basic form of the response; the mobile libraries will still be able to parse it successfully. If the application is willing to do more parsing, we can simply further to:

    <xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:fo="http://www.w3.org/1999/XSL/Format" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:fn="http://www.w3.org/2005/xpath-functions" xmlns:wc="urn:com.microsoft.wc.methods.response.GetThings3" exclude-result-prefixes="fn fo wc xs">
      <xsl:output omit-xml-declaration="yes"/>
      <xsl:template match="/response">
            <xsl:for-each select="wc:info/group">
                <xsl:for-each select="unprocessed-thing-key-info">
                  <id>
                    <xsl:value-of select="thing-id"/>
                  </id>
                </xsl:for-each>
            </xsl:for-each>
      </xsl:template>
    </xsl:stylesheet>

    which yields:

    <group>
      <id>645cae6f-942d-40bb-bc36-5529b6f9f46a</id>
    </group>

    which takes us to 66 easy-to-parse bytes.

    Final XSL parameters

    The following set of parameters are passed to all final-xsl transforms:

    • currentDateTimeUtc - the date and time just before the transform started executing
    • requestingApplicationName - the name of the application that made the request to HealthVault.
    • countryCode - the ISO 3166 country code from the request.
    • languageCode - the ISO 639-1 language code from the request.
    • personName - the name of the person making the request.
    • recordName - if the request identified a HealthVault record to be used, this parameter contains the name of that record.
  • Eric Gunnerson's Compendium

    Mobile HealthVault Apps and Sync

    • 0 Comments

    I have been working with some developers working on mobile applications that connect to HealthVault, and the issue of synchronizing data between the mobile device and HealthVault has come up a couple of times.  We’ve come up with some guidance around this area and I’d like to share it more widely.

    I’m assuming that the mobile application needs to keep the data around to work effectively – for example, it may want to produce a chart of weight values for the last few years. If your application only works with small amounts of data, you may not need to store any data locally.

    Our goal

    First, a bit of expectation setting about what our goal is.

    We can start with a goal of “perfect sync” between the mobile application and HealthVault. It is possible to be current within a few seconds – but to do so will result in:

    1. A slower mobile application
    2. More bandwidth usage by the application
    3. Lots of extra requests to the HealthVault servers

     

    The first two aren’t good for the user of the application, and the HealthVault team would like to avoid the third one.

    Instead of trying for the “perfect sync”, we’re going to try for a “good enough sync” – a scheme that is right the majority of time for the majority of users.

    “Good enough sync” guidelines

    These are our current guidelines for keeping data synced on mobile devices.

    1. Fetch only the items that have changed

    If your application fetches all of the weight items in a user’s record and the user has recorded daily weights for the last 3 years, it’s going to take a while to process 1000+ weights.

    Instead, when the application fetches data items, request the audit section as part of the filter:

    <info>
      <group>
        <filter>
          <type-id>30cafccc-047d-4288-94ef-643571f7919d</type-id>
          <thing-state>Active</thing-state>
        </filter>
        <format>
          <section>audits</section>
          <section>core</section>
          <xml/>
        </format>
      </group>
    </info>

    You will get back XML that looks like this:

    <thing>
            <thing-id version-stamp="cdb934a8-08c9-44a2-85e2-429cde027d01">1a868d6f-26c2-4504-bbc5-7b28b11de351</thing-id>
            <type-id name="Medication">30cafccc-047d-4288-94ef-643571f7919d</type-id>
            <thing-state>Active</thing-state>
            <flags>0</flags>
            <eff-date>2008-09-12T09:46:27.987</eff-date>
            <created>
              <timestamp>2008-09-12T16:46:28.22Z</timestamp>
              <app-id name="HelloWorld-SDK">05a059c9-c309-46af-9b86-b06d42510550</app-id>
              <person-id name="Eric Gunnerson">41b87f12-9b6f-40b6-a6ee-be4fbfd12170</person-id>
              <access-avenue>Online</access-avenue>
              <audit-action>Created</audit-action>
            </created>
            <updated>
              <timestamp>2008-09-12T16:46:28.22Z</timestamp>
              <app-id name="HelloWorld-SDK">05a059c9-c309-46af-9b86-b06d42510550</app-id>
              <person-id name="Eric Gunnerson">41b87f12-9b6f-40b6-a6ee-be4fbfd12170</person-id>
              <access-avenue>Online</access-avenue>
              <audit-action>Created</audit-action>
            </updated>
            <data-xml>
              <medication>
                <name>
                  <text>Marmoset</text>
                </name>
              </medication>
              <common/>
            </data-xml>
          </thing>

     

    When you process the items, extract the updated timestamp out, and figure out the maximum timestamp value for all of the items that you have fetched. Then, when you call next time, you can only get items that are newer than that timestamp:

    <filter>
      <type-id>30cafccc-047d-4288-94ef-643571f7919d</type-id>
      <thing-state>Active</thing-state>
      <updated-date-min>2011-10-10T16:50:45.334Z</updated-date-min>
    </filter>

    That will reduce the amount of data that you fetch considerably.

    2. Fetch partial things initially

    When a filter would return a large number of items, HealthVault will return the more recently updated items in their entirety, and then it will return the remainder of the items as partial items. Instead of a full “<thing></thing>” structure, you will see something like this:

    <unprocessed-thing-key-info>
      <thing-id version-stamp="fdd3f6cd-6eb6-46d7-bd7d-3d2c2fccc08b">34b0f978-f454-476a-b4a2-004c4a504e66</thing-id>
      <type-id name="Medication">5c5f1223-f63c-4464-870c-3e36ba471def</type-id>
      <eff-date>2011-07-22T19:22:30.94</eff-date>
    </unprocessed-thing-key-info>

     

    The number of full items that are returned can be specified by the caller, in the following manner:

    <group max-full="1">
      <filter>
        <type-id>30cafccc-047d-4288-94ef-643571f7919d</type-id>
        <thing-state>Active</thing-state>
      </filter>
      <format>
        <section>audits</section>
        <section>core</section>
        <xml/>
        <type-version-format>30cafccc-047d-4288-94ef-643571f7919d</type-version-format>
      </format>
    </group>

    Where the “max-full” attribute sets the number of full items returned in this request. There is a hard limit of 240 full items returned per request.

    Instead of fetching a large number of instances all at once, applications should fetch a few full items initially and save the unprocessed keys around. It can then fetch any later items in the background. This can be done by listing all of the instance ids (obtained from the <thing-id> element of the unprocessed thing key info section) in the filter:

    <group>
      <id>34b0f978-f454-476a-b4a2-004c4a504e66</id>
      <id>e8655aca-eb68-4670-b253-4bc3585fa513</id>
      <format>
        <section>audits</section>
        <section>core</section>
        <xml/>
      </format>
    </group>

     

    3. Consider fetching more than one data type in a single request

    If your application needs to fetch data for more than one data type, you should consider whether you can fetch all the data in a single get request. This will reduce the number of trips you make to the server and the amount of data you send over the network.

    This is done by including multiple filters in a single request:

    <group>
      <filter>
        <type-id>30cafccc-047d-4288-94ef-643571f7919d</type-id>
        <thing-state>Active</thing-state>
      </filter>
      …
    </group>
    <group>
      <filter>
        <type-id>3d34d87e-7fc1-4153-800f-f56592cb0d17</type-id>
        <thing-state>Active</thing-state>
      </filter>
       …
    </group>

    The response will have one group for each group in the request.

    4. Fetch only the data that you need now

    HealthVault records can contain a considerable amount of data in them. A weight tracking application may work fine with your set of test data, but if it gets pointed to a record that contains daily weights for the past 3 years, it may roll over and die.

    Applications should instead limit their queries to fetch bounded amounts of data. If the application is only going to show the user their weight for the last month, it should only fetch that data.

    5. Fetch only on startup and when data has been changed locally

    Mobile apps tend to be run for short periods of time, so checking for new data at startup and when you know there is new data (ie you put it there) is the recommended approach.

    Applications may add a “refresh” option if desired.

    Deleted data

    HealthVault does not currently provide a simple way for applications to determine when data has been deleted from a user’s HealthVault record. Since deletions are rare in HealthVault, we recommend that mobile applications do not attempt to track deletions automatically.

    An app may choose to provide a “synchronize” option that the user may choose to invoke. In that case the app should fetch using max-full=”0”, which will return the minimal amount of data.

    If an application does want to detect deleted data, it should fetch the things using max-full=”0”, and then remove any items in the local cache that were not returned from the query.

  • Eric Gunnerson's Compendium

    A movie scene

    • 1 Comments

    Anybody who has watched movies about software development knows that reality is much different. None of us work with huge screens on the wall, and debugging is a lot of unexciting detective work.

    Today, we were doing some remote debugging over live meeting in a conference room, trying to figure out why an interaction between a website and a web service wasn’t working correctly. We worked through some backend verification, which is as exciting as writing and executing ad-hoc SQL queries always is (not very). Then we moved on to running Fiddler, where we started looking at requests. The URL was pretty complicated and we were verifying all the items in it when I noticed that there was a problem with the base part of the URL. I couldn’t figure out any easy way to tell people what it was because the screen was busy, so I jumped up and pointed at the spot on the giant projected desktop and exclaimed “that’s the problem!”.

    And it was, and I got to do a happy dance. But the soundtrack was a bit lacking.

  • Eric Gunnerson's Compendium

    Dealing with blob data in HealthVault XML

    • 0 Comments

    I recently had a request for more information on dealing with blob data using our newly-released Mobile support, which finally got me to write something about low-level blob support.

    This information applies to any application that isn’t using the .NET SDK to talk to HealthVault. If you are using the .NET SDK, I recommend using the support that is provided in the SDK.

    HealthVault Blob Options

    HealthVault provides the capability of storing additional data – such as documents, images, etc. – as a blob of data attached to any of the HealthVault data types.

    HealthVault provides two ways for applications to manage blob data; an inline method where the blob data is passed with the rest of the data for an item instance, and a streaming approach where blob data is accessed through http get and put requests.

    Inline blob data

    This method passes the blob data in a base64-encoded format with the rest of the item data. It’s the simpler of the methods to use, but it has the following drawbacks:

    • The total size of a HealthVault request has a maximum, which limits the space available for blob data to approximately 5MB (this limit is subject to change).
    • The total size of a HealthVault response has a maximum size, which limits how much blob data can be retrieved, especially if multiple objects with blobs are returned.
    • When the blob data is fetched for an object, data for all blobs is fetched; it isn’t possible to fetch the data for a single blob.
    • Network use will be a little higher and data will be slower since the base64 encoding is a little (about 1/3) bigger than the raw bytes.

    Here’s how you deal with blob data using the inline method:

    Fetching blob data as inline data

    Here’s the “<info>” section of a GetThings request:

    <info>
        <group>
            <filter>
                <type-id>3d34d87e-7fc1-4153-800f-f56592cb0d17</type-id>
                <thing-state>Active</thing-state>
            </filter>
            <format>
                <section>core</section>
                <section>blobpayload</section>
                <xml/>
                <type-version-format>3d34d87e-7fc1-4153-800f-f56592cb0d17</type-version-format>
                <blob-payload-request>
                    <blob-format>
                        <blob-format-spec>inline</blob-format-spec>
                    </blob-format>
                </blob-payload-request>
            </format>
        </group>
    </info>

    The “<section>blobpayload</section>” line specifies that in addition to the core part of the data item, any blob data should be passed back along with the rest of the XML. If you forget to do that, you won’t get any blob data…

    This query returns a <blob-payload> section that comes after the <data-xml> section:

    <blob-payload>
      <blob>
        <blob-info>
          <name />
          <content-type>text/plain</content-type>
          <hash-info>
            <algorithm>SHA256Block</algorithm>
            <params>
              <block-size>2097152</block-size>
            </params>
            <hash>K9OYLkNyHLtTKRinTVTPh4WqbsZQ5l3jcUBIoMieNIc=</hash>
          </hash-info>
        </blob-info>
        <content-length>11</content-length>
        <base64data>U2FtcGxlIFRleHQ=</base64data>
      </blob>
    </blob-payload>

    The following elements are important:

    • name: The name of the blob.
    • content-type: The content type that was specified when the blob was saved
    • content-length: The length of the blob
    • base64data: The blob data

    Uploading blob data as inline data

    Here’s an example of saving a blob as inline data. The blob is passed at the same time the rest of the data for the instance is passed.

    <info>
        <thing>
            <type-id>3d34d87e-7fc1-4153-800f-f56592cb0d17</type-id>
            <thing-state>Active</thing-state>
            <flags>0</flags>
            <data-xml>...</data-xml>
            <blob-payload>
                <blob>
                    <blob-info>
                        <name/>
                        <content-type>text/plain</content-type>
                    </blob-info>
                    <content-length>4</content-length>
                    <base64data>U2FtcGxlIFRleHQ=</base64data>
                </blob>
            </blob-payload>
        </thing>
    </info>

    The <data-xml> section is removed for the sake of brevity. The important parts are:

    • name: The name of the blob
    • content-type: The encoding of the data
    • content-length: The length of the data in bytes
    • base64data: The data expressed in base64 format

    Streamed blob data

    In the streamed method, the blog data is fetched or stored using separate interfaces or methods.

    Fetching blob data as streamed data

    Fetching the blob data is simple. If you change the filter to be:

                        <blob-format-spec>streamed</blob-format-spec>

    Instead of the base64data element, you will get a blob-ref-url element that looks something like this:

      <blob-ref-url>https://platform.healthvault-int.com/streaming/wildcatblob.ashx?blob-ref-token=ASAAAA%2br2bo24GxMiX1j8dyBpoQhGSTp1z9laFjW0PVfgRzyT3PzSWgLdGubgPEK4bUbZG0ni4nY5mlXTW9ZYZQi2iFVXOstWasUtNmtNxJj2NBLolSd9J4MI4aSPID0w1l%2bHwSq%2fEM1BADh%2b01iMtnEIBEMQt%2fuDu0yQrazxjaXR0QokGLtRFOduPvo0LyMDWuo8meFRTPoCNfQoE3IkqIgkvY3gS1KssuQtDXmjVkKCKLSi8lHZV0WjYnCdnilHUAKWoWSOp4d6QX3%2fAUbRbXzzdIr00WGtrDOuoZwXWs5MA%2fKSfi%2fbp4Gfh96ZXJP%2bRy9lDDj8RkWmorT5SQW9QC4QEUONDxW1vSelijjvno451gA%2fi3Ufa57k2%2fs0x6swKrFRDTF0OZknjc5Id92zAnjvOFa5rs4v430bpZiD5MScUsWmEwTpqucCsKorXodAN823Wx4J1VPkeoOZiejnGJePYdyN91g</blob-ref-url>

    You can then download the blob data directly using the URL.

    Uploading blob data as streamed data

    Uploading blob data through the streamed interface is considerably more complex; it requires calling the BeginPutBlob method to get the destination url and then uploading in chunks of data that are a specific size (it also supports signing data, which makes things even more complex).

    If you are interested in using this to load large blobs, let me know, and I’ll try to update this to show that approach as well.

  • Eric Gunnerson's Compendium

    CHC 2011 HealthVault talk, and what is new for HealthVault this past year…

    • 0 Comments

    When we were creating our talk, we had a problem.

    Well, to be fair, we had a few problems, but the problem that I want to talk about is the “for more information” slide, the one that you put at the end to direct people to a location where they can find more information.

    There were actually two problems.

    The first was was that we covered 13 different things during the talk, and there was no way to fit enough information onto a single slide (or even a few slides) to help people out.

    The second was that a fair number (1 “fair” = 6) of items aren’t yet released and this was the first time we were talking about them, so there are no public resources to reference.

    We decided to solve that by doing a “for more information” blog post on the HealthVault blog, which would help attendees find more information and also be of use to those who didn’t attend the talk.

    And then I didn’t remember to write anything until Tuesday night, and had to do a bit of scrambling to get it done. But now, for your reading pleasure, you can read HealthVault at CHC 2011, find out what we demoed, what we announced in the mobile space, and other juicy tidbits of information.

    The conference itself was pretty good, but at only a day and a half tends to feel pretty rushed. We spoke at 8AM on the second day of the conference, second only to the immediately after lunch as a slot that I would like to avoid, but the crowd seemed pretty good, at least, to the extent that I could see them; I tried to look around in the crowd but was a bit hampered by the lights; two searchlights previously used to illuminate Saturn V rockets were aimed directly at my face, and I literally could not tell whether people were standing or sitting; they could all have been wearing clown makeup for all I knew. And if you looked up too much, you’d end up with a bunch of spots in your eyes that made it hard to read your slides.

    Our demo had a lot of moving parts, and it mostly worked correctly. What Vaibhav did was clean as far as I could tell; in my part, the conference center WiFi timed out and I had to re-initiate it during the talk, and I ended up provisioning the mobile application to talk to our practice account, not our live account, but luckily Vaibhav had written the eventing application so that either account worked. Oh, and I had unexpected lack of success with our CAPTCHA, even with the help of others.

    Our goal was to time the talk to 45 minutes, and after answering a few questions during the talk, we finished at 8:52 which is pretty much perfect in my book.

    Please let us know if you have questions or comments.

  • Eric Gunnerson's Compendium

    Speaking at CHC 2011

    • 0 Comments

    I'm in Chicago this week to speak at the Microsoft Connected Health Conference, 2011 edition. It's been nice to get away from the cool and rainy northwest and spend some time in Chicago,where the weather has been... well, it's been cool, and rainy, though the rain has been a bit torrential at times. We are rain wimps in Seattle.

    I ended up with some spare time earlier this week and spent a morning at the Museum of History and Industry, where I toured the U505 (an authentic WWII U boat captured in 1944 IIRC), and the bodies in motion exhibit, which was Oh my god did you see that guy? that's so gross I think I'm going to be sick!, or at least that was what the group of teenagers next to me thought. I found it fascinating and disturbing at the same time.

    I'm co-presenting at an overview presentation on HealthVault, where we have an hour to cover everything we've done in HealthVault in the last year or so. We have an involved demo and technical explanation, then a lot of short items. I'm thoroughly practiced-out the night before the talk, but I feel pretty good about it.

    The conference so far has been pretty good. We had a HealthVault-specific event last night and I was able to put some faces with a few names of people, and talk with partners about what they are doing.

    We will be posting more information about what we talk about on the HealthVault blog; I'll link here when the post is up.

  • Eric Gunnerson's Compendium

    Data sorting dances

    • 1 Comments

    Intercultural Computer Science Education

    Data sorting dances

    I enjoyed the dances, though I'm not sure that insertion sort is correctly danced.

     

  • Eric Gunnerson's Compendium

    Default parameters in C#

    • 4 Comments

    From an internal discussion we're having on the advisability of using default parameters in C#:

    Currently, the pain and limitation of doing overloads forces you to rethink how a method should work. Consider the following:

     Process(int a);

    Process(int a, float b);

    Process(int a, float b, string c);

     

    If I now need to change how that works in some situations, I could add a boolean to control that behavior, but it’s not obvious how to add it to the current overload scheme. I can do something like:

     

    Process(int a);

    Process(int a, bool doBackground);

    Process(int a, float b);

    Process(int a, float b, bool doBackground);

    Process(int a, float b, string c);

    Process(int a, float b, string c, bool doBackground);

     

    But not only is that a bit hard to write, it’s a bit confusing, and if I need to add another parameter in the future, I’m pretty much SOL. So, that forces me to consider the alternatives; should I go with a “Settings” class (like XmlWriterSettings), should I live with it, or should I think about refactoring the process to simplify things? That forced stop gives me the chance to think about better ways to approach things.

     

    With default parameters, it’s going to be really tempting to just add a default, and it’s more likely you’ll end up with methods like this:

     

    Process(int a, float b, string c, bool doBackground = false, bool writeToLog = true, string database=null, string method=”jumble”);

     

    That is bad not only for the caller of the api, but it suggests that the process method is pretty complex internally.

     

    On the other hand, I can’t count how many times I’ve written a well-constrained series of overloads that purely added in default values and had to write nearly duplicate xml docs for each of them, and I’d be really happy to save that time and not have those methods clutter up the code. 

     

    Or, to put it another way, default parameters are great if you use them to simplify scenarios that you would have written with overloads. If you start doing things that would be hard to express in overloads, I’d look harder at the overall design.

     

Page 1 of 46 (1,139 items) 12345»