Being Cellfish

Stuff I wished I've found in some blog (and sometimes did)

Posts
  • Being Cellfish

    Time is hard

    You have probably heard about the leap year outage of Azure, which is explained here. Essentially it looks like somebody added one to a year (which was probably an integer) rather than using a proper date representation. Remember that I do not know; this is my assumption based on what I've seen over the years. Dealing with time seems to be hard, and I think most of the time that is because people don't use a date/time abstraction but take shortcuts based on assumptions that are not correct. Here are some mistakes I've come across in the past:

    • Forgetting leap years.
    • Assuming every fourth year is a leap year (e.g. 1900 was not a leap year).
    • Forgetting daylight saving time.
    • Assuming all countries switch to daylight saving time on the same date (some countries don't even have daylight saving time).
    • Assuming everybody using your software will be in the same time zone.
    • Assuming all time zones are whole hours from GMT (some countries use fractions of an hour as their offset from GMT).
    • Assuming use of AM/PM rather than a 24-hour clock.
    • Assuming there are always 60 seconds in a minute (e.g. leap seconds).

    So there is a lot to think about when dealing with time. You should really use an abstraction for it, preferably an implementation created by somebody who has tested it well, such as one that is part of whatever framework you use for development.
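
    To make the leap year point concrete, here is a minimal sketch in C#. The class and method names are made up for illustration, and I'm not claiming this is what happened in the outage. It only contrasts the integer shortcut with the framework's own date arithmetic.

```csharp
using System;

public static class LeapYearDemo
{
    // Shortcut: bump the year as a plain integer and rebuild the date.
    // Throws ArgumentOutOfRangeException when the input is February 29,
    // because the following year has no such day.
    public static DateTime NaiveAddYear(DateTime date)
    {
        return new DateTime(date.Year + 1, date.Month, date.Day);
    }

    // Let the framework's date abstraction apply the calendar rules.
    public static DateTime SafeAddYear(DateTime date)
    {
        return date.AddYears(1);
    }
}
```

    With February 29, 2012 as input, SafeAddYear returns February 28, 2013 while NaiveAddYear throws. DateTime.IsLeapYear(1900) likewise returns false, which covers the "every four years" assumption.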

  • Being Cellfish

    Configuration in code

    For quite some time I've used a pattern of hiding configuration behind an interface. This way I can easily fake it in a unit test, while the real implementation has typically relied on a configuration file. When I started working with Azure a few years ago I began checking in a number of different configuration files for different environments and scenarios. But there was always a problem with configuration files: it is very convenient that they can be changed manually to quickly change the behaviour of my service, but those changes are hard to track. The solution was so obvious I didn't think about it until a recent discussion with a co-worker: configuration should be kept in code. I had already started the process by checking configuration files into my source control system, but taking it another step and making the configuration part of the code, so that you need to recompile to change it, is for many reasons a better idea and also feasible when working with cloud services.

    Configuration files definitely have their place, but when it comes to cloud services, redeploying your service is fast enough that you can recompile and redeploy rather than change a configuration setting. At the same time you get a trail of changes (the change is checked in before it is deployed, right?) and you will run a number of tests before deploying too, right? So a simple configuration change is now treated like a simple bug fix; it is validated before it is deployed!

    I realize this is not a new idea, but I find it very appealing in the cloud service scenario and worth mentioning here because somebody like me might have overlooked it just as I did.
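
    A minimal sketch of what this could look like. The interface, the class names and the values are all hypothetical and only illustrate the shape: one interface, one implementation compiled into the service, and one fake for unit tests.

```csharp
using System;

// Hypothetical settings interface; the members are illustrative only.
public interface IServiceConfiguration
{
    int MaxRetries { get; }
    TimeSpan RequestTimeout { get; }
}

// Production values live in code, so changing one means a check-in,
// a build and a test run before the new value is deployed.
public sealed class ProductionConfiguration : IServiceConfiguration
{
    public int MaxRetries { get { return 3; } }
    public TimeSpan RequestTimeout { get { return TimeSpan.FromSeconds(30); } }
}

// A unit test can substitute whatever values it needs.
public sealed class FakeConfiguration : IServiceConfiguration
{
    public int MaxRetries { get; set; }
    public TimeSpan RequestTimeout { get; set; }
}
```

    Code that depends only on IServiceConfiguration does not care which of the two it gets, so the same pattern still works if a file-based implementation is needed later.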

  • Being Cellfish

    Object pooling vs creating lots of them

    If you, like me, have a background in C programming, it's a reflex to avoid lots of memory allocations. So when you encounter a situation in managed code where you need to allocate lots of objects over time, it feels natural to introduce a pool of objects. However, I remembered reading somewhere that the .NET garbage collector is optimized for short-lived objects, so I started looking into this a little more.

    Back to your C background: if you had to create lots of small objects on the fly in C you would either have a pool of objects or write your own memory management. It turns out that .NET essentially provides that custom memory management for you, so memory allocation is not so bad in .NET, at least not for objects that are not extremely large (those are handled in a different way). The garbage collector is also optimized for short-lived objects. So I had to write a test.

    My first test used a ConcurrentQueue into which I placed two test objects. I then started two threads that would each take one object out of the queue, update a counter in the object and return it to the queue. Each thread did this in a loop until a total of 100 million loops were completed. On my dual-core laptop this took between eleven and twelve seconds to complete.

    My second test also used two threads that performed a total of 100 million loops, but each iteration created the same kind of test object as in the first test, updated the counter and then discarded the object. This took around three and a half seconds! More than three times faster even though 100 million objects are allocated!

    In my last test I created one object per thread and the thread just held on to this object until done. Running this test took just under two seconds.

    I was surprised by the significant difference between allocating and using an object pool, but also happy, since any kind of pooling involves a lock (or some other synchronization), which is always a bad sign. Also, with a pool you typically need to clean or reinitialize an object before reusing it, and at some point you might forget to reset something, while new objects are created fresh the way you want them. The only reason to actually pool objects would be if the object represents something that is expensive to set up. I only tested pooling versus allocation; if initialization takes significant time it might still be worth reusing objects.
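
    The three tests can be sketched roughly like this. This is a reconstruction under my own assumptions, not the original test code: TestObject, PoolingBenchmark and the method names are made up, and the delegate call per iteration adds overhead the original may not have had.

```csharp
using System;
using System.Collections.Concurrent;
using System.Diagnostics;
using System.Threading;

public sealed class TestObject
{
    public int Counter;
}

public static class PoolingBenchmark
{
    // Runs the work produced by workFactory on two threads,
    // loopsPerThread times each, and returns the elapsed time.
    private static TimeSpan Measure(int loopsPerThread, Func<Action> workFactory)
    {
        ThreadStart body = () =>
        {
            Action work = workFactory(); // one work item per thread
            for (int i = 0; i < loopsPerThread; i++)
            {
                work();
            }
        };
        var first = new Thread(body);
        var second = new Thread(body);
        var watch = Stopwatch.StartNew();
        first.Start();
        second.Start();
        first.Join();
        second.Join();
        return watch.Elapsed;
    }

    // Test 1: borrow an object from a shared pool and return it afterwards.
    public static TimeSpan Pooled(int loopsPerThread)
    {
        var pool = new ConcurrentQueue<TestObject>();
        pool.Enqueue(new TestObject());
        pool.Enqueue(new TestObject());
        return Measure(loopsPerThread, () => () =>
        {
            TestObject item;
            while (!pool.TryDequeue(out item)) { Thread.Yield(); }
            item.Counter++;
            pool.Enqueue(item);
        });
    }

    // Test 2: allocate a fresh object every iteration and leave it to the GC.
    public static TimeSpan Allocating(int loopsPerThread)
    {
        return Measure(loopsPerThread, () => () =>
        {
            var item = new TestObject();
            item.Counter++;
        });
    }

    // Test 3: each thread creates one object and holds on to it until done.
    public static TimeSpan OnePerThread(int loopsPerThread)
    {
        return Measure(loopsPerThread, () =>
        {
            var item = new TestObject();
            return () => { item.Counter++; };
        });
    }
}
```

    Calling each method with 50 million loops per thread corresponds to the 100 million total loops above. The absolute numbers will depend on the machine and the runtime version, so treat any single run as an indication only.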

  • Being Cellfish

    Deployment specific azure config

    I've earlier described a simple way to deal with development specific config but the larger your system is the more likely it is that you will have multiple environments you want to run against; private development, latest build deployment, stress deployment, production deployment etc. Here is a good suggestion on how to solve that problem in a pretty convenient way.

  • Being Cellfish

    Repetitive Lazy<T> to the rescue

    The other day I was doing my usual double-checked locking when a co-worker pointed out that without a memory barrier the typical implementation is not necessarily safe. It is described here together with several options from a singleton perspective. Also, my time on the Robotics team has made me almost religious about avoiding any kind of manual synchronization (i.e. explicitly using a synchronization object in my code). The problem I was trying to fix this time involved one method that runs for a relatively long time. During this time a property must not change. Easy enough to get around, I thought, and made the type of the property a struct (I could have made the class ICloneable too, but since the object consists only of a number of value types a struct made sense to me).

    Since assignment of a struct is not atomic I needed something to make sure the struct being copied did not change at the same time, resulting in a corrupt copy. Nor could I take a lock for the whole execution of this method, since any thread needing to change the property must be able to do so quickly without waiting for the long-running method. I realized that I could use Lazy<T> instead of creating my own lock and make the code look really nice. This is the essence of what I came up with:

      using System;
      using System.Threading;

      public class RepetitiveLazyUsage
      {
          // Without a factory, Lazy<int> yields default(int) until SetTheValue is called.
          private Lazy<int> theValue =
              new Lazy<int>(LazyThreadSafetyMode.ExecutionAndPublication);

          public int UseTheValue()
          {
              // Read the reference once; later calls to SetTheValue do not affect this copy.
              int valueToUse = theValue.Value;
              // Relatively slow code here that depends on valueToUse.
              return valueToUse;
          }

          public void SetTheValue(int value)
          {
              // Replace the whole Lazy<int> instance instead of mutating shared state.
              theValue = new Lazy<int>(
                  () => value,
                  LazyThreadSafetyMode.ExecutionAndPublication);
          }
      }
    

    I think this looks pretty nice, and I'm using two features to make sure the code does not have any problems in a multi-threaded environment. First, I use Lazy<T> in a thread-safe mode, so I don't need an explicit lock. Second, I rely on the fact that reference assignments are atomic in .NET, so neither replacing the instance of theValue in the SetTheValue method nor reading it in UseTheValue needs a lock.
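
    Since the motivation was a struct rather than an int, here is a sketch of the same pattern with a multi-field value type. The Limits struct and the method names are hypothetical. Marking the field volatile is my own addition to make the latest reference visible to other threads; it is not part of the code above.

```csharp
using System;
using System.Threading;

// Hypothetical two-field value type; copying it is not an atomic operation.
public struct Limits
{
    public readonly long Min;
    public readonly long Max;

    public Limits(long min, long max)
    {
        Min = min;
        Max = max;
    }
}

public class RepetitiveLazyStruct
{
    // Assumption: volatile is added here so readers see the newest reference.
    private volatile Lazy<Limits> current = new Lazy<Limits>(
        () => new Limits(0, 0),
        LazyThreadSafetyMode.ExecutionAndPublication);

    public long UseLimits()
    {
        // One atomic reference read; the struct inside this particular
        // Lazy<Limits> instance never changes once it has been created.
        Limits snapshot = current.Value;
        // Relatively slow code that depends on snapshot would go here.
        return snapshot.Max - snapshot.Min;
    }

    public void SetLimits(Limits value)
    {
        // Swap in a new Lazy<Limits> rather than overwriting the struct in place.
        current = new Lazy<Limits>(
            () => value,
            LazyThreadSafetyMode.ExecutionAndPublication);
    }
}
```

    A reader that started before SetLimits was called keeps working on its own snapshot, which is exactly the property the long-running method needs.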

  • Being Cellfish

    Visual studio achievements

    I guess this plugin for Visual Studio should be mandatory from now on...

  • Being Cellfish

    Avoid timeout when uploading large blobs to Azure

    If you're uploading (and I guess downloading) large blobs to Azure you might consistently hit a timeout, because CloudBlobClient has a timeout property. It defaults to 90 seconds, which means that if Azure is your bottleneck (and throttling you) anything above 5400 MB will result in a timeout. In real life your connection to Azure is more likely the bottleneck (for example, I experienced timeouts for files above approximately 200 MB when uploading from home). So make sure you increase this timeout for large files.
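
    The arithmetic behind those numbers can be sketched as below. The helper class is hypothetical, and the 60 MB/s rate is only what 5400 MB in 90 seconds works out to, so treat it as an assumption rather than a documented limit.

```csharp
using System;

public static class UploadTimeout
{
    // Largest payload (in MB) that fits inside the timeout at a sustained rate.
    public static double MaxMegabytes(TimeSpan timeout, double megabytesPerSecond)
    {
        return timeout.TotalSeconds * megabytesPerSecond;
    }

    // Timeout to ask for when uploading a payload, padded by a safety factor.
    public static TimeSpan For(double megabytes, double megabytesPerSecond,
                               double safetyFactor = 2.0)
    {
        return TimeSpan.FromSeconds(megabytes / megabytesPerSecond * safetyFactor);
    }
}
```

    MaxMegabytes(TimeSpan.FromSeconds(90), 60) gives 5400, and 200 MB in 90 seconds corresponds to a home connection of a little over 2 MB/s. The value returned by For is what you would assign to the client's timeout before uploading a large file.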

    There is however another problem with this API in my opinion. On the server side I think it makes sense to have a timeout that prevents clients from tying up resources by uploading data very slowly, but this is a client API, so as long as I'm uploading data I'm technically fine, I think. A much shorter timeout that only fires if no data can be sent during that period makes more sense to me personally. In a perfect world there would actually be two timeouts: an idle timeout and an overall timeout. But the default (and the one I think should be there if I had to choose) would be the idle timeout. I don't think a client really wants to time out if the transfer is slow as long as there is progress. But again, that's just me...
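
    To illustrate the difference between the two timeouts, here is a small sketch of the bookkeeping. TransferWatchdog is a made-up class and not part of any Azure API. Time is passed in explicitly so that the logic is easy to test.

```csharp
using System;

public sealed class TransferWatchdog
{
    private readonly TimeSpan idleTimeout;
    private readonly TimeSpan? overallTimeout;
    private readonly DateTime started;
    private DateTime lastProgress;

    public TransferWatchdog(DateTime now, TimeSpan idleTimeout,
                            TimeSpan? overallTimeout = null)
    {
        this.idleTimeout = idleTimeout;
        this.overallTimeout = overallTimeout;
        started = now;
        lastProgress = now;
    }

    // Call whenever at least some data has been transferred.
    public void ReportProgress(DateTime now)
    {
        lastProgress = now;
    }

    public bool HasTimedOut(DateTime now)
    {
        // Idle timeout: fires only when nothing has moved for too long.
        if (now - lastProgress > idleTimeout)
        {
            return true;
        }

        // Optional overall timeout: fires no matter how much progress was made.
        return overallTimeout.HasValue && now - started > overallTimeout.Value;
    }
}
```

    With only the idle timeout set, a slow transfer that keeps reporting progress never times out, which is the behaviour I would prefer as the default.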

  • Being Cellfish

    Evolution of a hand rolled fake

    I usually hand roll my own fake objects for my tests. They have always looked a lot like what Stubs generate. I just think that it's so cheap to create them that I don't even need Stubs. In this series I'll assume an interface that looks like this:

      interface ITheInterface
      {
          void DoSomething(int x);
          int ComputeSomething(int a, int b);
      }

    When I first started to hand roll my fakes it looked something like this:

      private class FakeTheInterface : ITheInterface
      {
          public Action<int> DoSomethingHandler { get; set; }
          public Func<int, int, int> ComputeSomethingHandler { get; set; }

          public void DoSomething(int x)
          {
              if (DoSomethingHandler == null)
              {
                  Assert.Fail("Unexpected call to DoSomething");
              }

              DoSomethingHandler(x);
          }

          public int ComputeSomething(int a, int b)
          {
              if (ComputeSomethingHandler == null)
              {
                  Assert.Fail("Unexpected call to ComputeSomething");
              }

              return ComputeSomethingHandler(a, b);
          }
      }

    Which gave you a test that looked something like this:

      [TestMethod]
      public void UsingFake1()
      {
          var thing = new FakeTheInterface();
          thing.ComputeSomethingHandler = (a, b) => 42;
          Assert.AreEqual(42, thing.ComputeSomething(0, 0));
      }
    

    After a while I realized that I could make the fake a little nicer by doing this:

      private class FakeTheInterface : ITheInterface
      {
          public Action<int> DoSomethingHandler { get; set; }
          public Func<int, int, int> ComputeSomethingHandler { get; set; }

          public void DoSomething(int x)
          {
              Assert.IsNotNull(DoSomethingHandler,
                  "Unexpected call to DoSomething");
              DoSomethingHandler(x);
          }

          public int ComputeSomething(int a, int b)
          {
              Assert.IsNotNull(ComputeSomethingHandler,
                  "Unexpected call to ComputeSomething");
              return ComputeSomethingHandler(a, b);
          }
      }
    

    But once in a while I came across an interface with a method named something like FooHandler, and "FooHandlerHandler" is just very confusing. Recently I tried a different approach that looks like this:

      private class FakeTheInterface : ITheInterface
      {
          private Action<int> doSomething =
              x => Assert.Fail("Unexpected call to DoSomething({0}).", x);

          private Func<int, int, int> computeSomething =
              (a, b) =>
                  {
                      Assert.Fail(
                          "Unexpected call to ComputeSomething({0}, {1}).",
                          a, b);
                      return 0;
                  };

          public FakeTheInterface(
              Action<int> DoSomething = null,
              Func<int, int, int> ComputeSomething = null)
          {
              doSomething = DoSomething ?? doSomething;
              computeSomething = ComputeSomething ?? computeSomething;
          }

          public void DoSomething(int x)
          {
              doSomething(x);
          }

          public int ComputeSomething(int a, int b)
          {
              return computeSomething(a, b);
          }
      }
    

    Note that I abuse the naming guidelines for arguments in order to make it consistent with the method name. A test using this fake looks like this:

      [TestMethod]
      public void UsingFake3()
      {
          var thing = new FakeTheInterface(
              ComputeSomething: (a, b) => 42);
          Assert.AreEqual(42, thing.ComputeSomething(0, 0));
      }
    

    So far I'm happy with this evolution. The only potential problem I see is if I need to replace the implementation halfway through a test, but that can still be achieved by having the fake delegate to a separate variable in the test that I then change. So all in all it feels like this last evolution will be used (by me) for a while. Suggestions on improvements welcome!
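
    Replacing the implementation halfway through a test could be sketched like this. The fake is a condensed version of the last one above that throws InvalidOperationException instead of calling Assert.Fail, only so that the sketch compiles without a test framework. MidTestSwap is a made-up name.

```csharp
using System;

public interface ITheInterface
{
    void DoSomething(int x);
    int ComputeSomething(int a, int b);
}

// Condensed constructor-based fake; unexpected calls throw.
public class FakeTheInterface : ITheInterface
{
    private readonly Action<int> doSomething;
    private readonly Func<int, int, int> computeSomething;

    public FakeTheInterface(
        Action<int> DoSomething = null,
        Func<int, int, int> ComputeSomething = null)
    {
        doSomething = DoSomething ?? (x =>
        {
            throw new InvalidOperationException(
                "Unexpected call to DoSomething(" + x + ").");
        });
        computeSomething = ComputeSomething ?? ((a, b) =>
        {
            throw new InvalidOperationException(
                "Unexpected call to ComputeSomething(" + a + ", " + b + ").");
        });
    }

    public void DoSomething(int x)
    {
        doSomething(x);
    }

    public int ComputeSomething(int a, int b)
    {
        return computeSomething(a, b);
    }
}

public static class MidTestSwap
{
    public static int[] Run()
    {
        // The fake forwards to whatever 'current' refers to at call time.
        Func<int, int, int> current = (a, b) => 42;
        var thing = new FakeTheInterface(
            ComputeSomething: (a, b) => current(a, b));

        int before = thing.ComputeSomething(0, 0);

        // Swap the behaviour without touching the fake itself.
        current = (a, b) => a + b;
        int after = thing.ComputeSomething(1, 2);

        return new[] { before, after };
    }
}
```

    The lambda handed to the constructor captures the local variable rather than its value, so reassigning current changes what the fake does on the next call.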

  • Being Cellfish

    I hate software that assumes things about database instances

    I recently had to install some software that wouldn't run because I gave my SQL Server instance a descriptive name. There was no (or at least not easy) way to get it to use anything other than the default name "MSSQLSERVER". At the same time I removed my SQL Express which my Azure storage emulator used so I had to fix that one too... At least that is fairly easy: DSInit /sqlInstance:.

  • Being Cellfish

    Development specific azure config

    Over the holidays I started cleaning up a backlog of old RSS items I should read, and one of them covered a way to deal with Azure configurations and how they differ between development and production. While I've been using a similar approach to hide whether a configuration setting is read from the role configuration or web.config, I've dealt with development versus production configuration in a different way. So far I've kept the configuration file that is part of the project development specific and kept separate configurations for production elsewhere. Naturally this means I have to keep two files in sync when I add new settings, and once in a while I need to use custom storage for my development, like hitting a real Azure storage account that is not the production one.

    What I like about the approach in the link above is that you only need one file. The drawback is that my configuration file now needs a few extra development-only settings that override the production ones. But compared to having two files I kind of like this idea. Worth a try, I think, and I'll let you know how it works for me once I've tried it.
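
    I have not reproduced the linked approach here, but the idea of development-only settings overriding the production ones in a single file could be sketched like this. The "Dev." prefix, the class name and the dictionary-based lookup are all my own assumptions for illustration.

```csharp
using System.Collections.Generic;

public static class SettingLookup
{
    // Hypothetical key prefix marking a development-only override.
    private const string DevPrefix = "Dev.";

    public static string Get(
        IDictionary<string, string> settings, string name, bool isDevelopment)
    {
        string value;
        if (isDevelopment && settings.TryGetValue(DevPrefix + name, out value))
        {
            return value; // a development override wins when one is present
        }

        return settings[name]; // otherwise fall back to the production value
    }
}
```

    Settings without an override are shared between both environments, so only the handful of values that really differ need a second entry.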
