Being Cellfish

Stuff I wished I've found in some blog (and sometimes did)

Posts
  • Being Cellfish

    Federated login relies on target site to be secure

    I attended an interesting talk last week that was based on a paper from MSR. Federated login, where you use your Facebook, Google or Live credentials to log in to some third-party site, is very convenient, but a lot of these third-party sites do not validate the security tokens correctly, so they are open to attacks. Definitely worth a read if you're planning on using somebody else to authenticate your users on your web site.

  • Being Cellfish

    Office pranks

    One thing that happens fairly often when somebody at Microsoft is on vacation is that their office gets redecorated. A former co-worker of mine decided to move down to California and work at Microsoft there, and his team decided to help him get the office packaged for the move. This is what it looked like...

  • Being Cellfish

    Lego Robots revisited

    About a year ago (when I was still on the robotics team) I got a good work assignment: build and program a Lego robot for a sumo competition event (picture on the right). That was fun, but I forgot about the fun for a while, until a few weeks ago when a new colleague on my current team asked for advice on what to buy for his eleven-year-old son, who was interested in robotics and wanted to do something.

    There are several options out there for the robotics hobbyist, but whether you're young or old I think Lego is one of the best options to start out with. Lego is always fun! And it's compatible with your other Legos! It also gives you both a simple way of programming your robot using the Lego NXT software and the option to just use Bluetooth and whatever programming language you like!

    There is also another thing you should know if you are considering getting a Lego Mindstorms kit. Don't go down to your local store and buy a kit. There is a lesser-known site where you can get great kits for a little less money: Lego Education. The base set and the education resource set give you a very nice base to start with (that is what I used for the robot in the picture, and I did not use all the parts).

    If you want more inspiration, you can always watch this video:

     

  • Being Cellfish

    Should idempotency be extended to include return codes?

    A couple of weeks ago we had an interesting discussion at work. The topic was what a DELETE on a RESTful entity should return if the entity does not exist. First we had one camp arguing that if the entity does not exist then a DELETE should return 404 (Not Found), which is similar to what a GET would do in the same case. The upside is that a client knows whether something was deleted or not based on the status code. The first argument against this approach was that DELETE should be idempotent, so it should return the same thing as if the entity existed, especially since this would simplify the client's retry logic. This in turn was countered by pointing out that idempotency applies to the state of the server and not to the returned result, and that it is good to know whether an object was deleted or not. At this point nobody disagreed that this would be good to know if you're interested, but we still did not agree on how this information should be returned. We read through the relevant RFCs and discussed some more, and this is my personal takeaway from the discussion:

    • When an entity exists and is deleted, 200 (OK) should be returned and the body should contain either the deleted entity or a special status object indicating that the object existed before deletion.
    • When an entity does not exist when deleted, 204 (No Content) should be returned, or 200 with a status object saying nothing existed.
    • If it is important that an entity is only deleted if nobody has changed it since the client last saw it, then 409 (Conflict) or 412 (Precondition Failed) should be returned if the DELETE is called with the wrong ETag in the If-Match header.
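
    The three rules above could be sketched roughly like this. This is only an in-memory illustration of my take, not a real web service: EntityStore, Put and Delete are names made up for the example, and I picked 412 rather than 409 for the mismatch case just to have one concrete value.

```csharp
using System.Collections.Generic;
using System.Net;

// Hypothetical in-memory store used only to illustrate the status codes.
public sealed class EntityStore
{
    // Maps entity id to its current ETag; bodies are left out to keep the sketch short.
    private readonly Dictionary<string, string> etags = new Dictionary<string, string>();

    public void Put(string id, string etag)
    {
        etags[id] = etag;
    }

    // ifMatch is the value of the If-Match header, or null when the client sent none.
    public HttpStatusCode Delete(string id, string ifMatch = null)
    {
        string currentETag;
        if (!etags.TryGetValue(id, out currentETag))
        {
            // Already gone: still a success, so a retried DELETE does not fail.
            return HttpStatusCode.NoContent; // 204
        }

        if (ifMatch != null && ifMatch != currentETag)
        {
            // Somebody changed the entity since the client last saw it.
            return HttpStatusCode.PreconditionFailed; // 412
        }

        etags.Remove(id);
        return HttpStatusCode.OK; // 200
    }
}
```

    With this shape a client that retries a DELETE after a lost response gets 204 on the second attempt instead of an error, while a client that cares can still tell the two cases apart.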

    There are two reasons I think 404 is a bad idea. First, your typical managed request object will translate it into an exception, while both 200 and 204 are success responses. Second, if you look at other APIs, delete operations typically succeed even if the thing you try to delete does not exist; all APIs I'm aware of assume that you want the object gone, so if it is already gone you're happy. The use of an ETag to make sure you only delete things that have not been modified is similar to what you expect on a PUT operation. Retry logic for such a conditional PUT is hard: if the first attempt succeeds but the response is lost, the second attempt will return 409 or 412 unless the server can see that the second PUT is identical to the first one.

    So back to the question in the title: should idempotency be extended to include return codes? When it comes to the exact response I'd say no, but I definitely think a good RESTful service should return success on multiple identical DELETE or PUT operations, since that simplifies the client and its retry logic. Especially since, in my experience, a client rarely cares whether the object existed before a DELETE or whether it was already updated before a PUT; the client typically only cares about the state after the operation, not what it was before.

  • Being Cellfish

    Time is hard

    You probably heard about the leap year outage of Azure which is explained here. Essentially it looks like somebody added one to a year (which was probably an integer) rather than using a (proper) date representation. Remember that I do not know, but this is my assumption based on what I've seen over the years. Essentially dealing with time seems to be hard and I think most of the time it is because people don't use a date/time abstraction but take shortcuts based on assumptions that are not correct. Here are some assumptions I've come across in the past:

    • Forgetting leap years.
    • Assuming leap years happen every four years (ex: 1900 was not a leap year).
    • Forgetting daylight saving time.
    • Assuming all countries switch to daylight saving time on the same date (some countries don't even have daylight saving time).
    • Assuming everybody using your software will be in the same time zone.
    • Assuming all time zones are whole hours from GMT (some countries use fractions of an hour as their offset from GMT).
    • Assuming use of AM/PM rather than a 24-hour clock.
    • Assuming there are always 60 seconds in a minute (ex: leap seconds).

    There are a lot of things to think about when dealing with time, so you should really use an abstraction for it, preferably an implementation created by somebody who has tested it well, such as one that is part of whatever framework you use for development.
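
    To make the leap year shortcut concrete, here is a small sketch of the difference between adding one to the year and letting the framework do it. This is my illustration of the assumption above, not the actual Azure code (which I have not seen); LeapDayExample and its method names are made up for the example.

```csharp
using System;

public static class LeapDayExample
{
    // The shortcut: treat the year as an integer and add one.
    // Throws ArgumentOutOfRangeException when the result is not a real date,
    // for example when start is February 29.
    public static DateTime NaiveOneYearLater(DateTime start)
    {
        return new DateTime(start.Year + 1, start.Month, start.Day);
    }

    // Using the date abstraction: DateTime.AddYears maps February 29 to
    // February 28 when the target year is not a leap year.
    public static DateTime OneYearLater(DateTime start)
    {
        return start.AddYears(1);
    }
}
```

    For February 29, 2012 the naive version throws, while AddYears(1) gives February 28, 2013. DateTime.IsLeapYear(1900) also returns false, which covers the "every four years" assumption.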

  • Being Cellfish

    Configuration in code

    For quite some time I've used a pattern where configuration is hidden behind some interface. This way I can easily fake it in a unit test, but typically there has been an implementation relying on a configuration file. Then, when I started working with Azure a few years ago, I started to check in a number of different configuration files for different environments and scenarios. But there was always a problem with configuration files: while it is very convenient that they can be changed manually to quickly change the behaviour of my service, those changes are hard to track. The solution was so obvious I didn't think about it until a recent discussion with a co-worker: configuration should be kept in code. I had already started the process by checking configuration files into my source control system, but taking it another step and actually making the configuration part of the code, so that you need to recompile, is for many reasons a better idea, and it is also feasible when working with cloud services.

    Configuration files definitely have their place, but when it comes to cloud services, redeploying your service is fast enough that you can actually recompile and redeploy rather than change a configuration setting. At the same time you get a trail of changes (the change is checked in before it is deployed, right?) and you will run a number of tests before deploying too, right? So a simple configuration change is now treated like a simple bug fix; it is validated before it is deployed!
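
    A minimal sketch of what I mean by configuration behind an interface with the values kept in code. All the names here (IServiceConfiguration, ProductionConfiguration, FakeConfiguration, RetryingCaller) and the values are made up for the example.

```csharp
using System;

// The rest of the code only ever sees this interface.
public interface IServiceConfiguration
{
    int MaxAttempts { get; }
    TimeSpan RequestTimeout { get; }
}

// Configuration kept in code: changing a value means a check-in,
// a build, a test run and a redeploy.
public sealed class ProductionConfiguration : IServiceConfiguration
{
    public int MaxAttempts { get { return 5; } }
    public TimeSpan RequestTimeout { get { return TimeSpan.FromSeconds(30); } }
}

// Settable fake for unit tests.
public sealed class FakeConfiguration : IServiceConfiguration
{
    public int MaxAttempts { get; set; }
    public TimeSpan RequestTimeout { get; set; }
}

public sealed class RetryingCaller
{
    private readonly IServiceConfiguration configuration;

    public RetryingCaller(IServiceConfiguration configuration)
    {
        this.configuration = configuration;
    }

    // Calls 'attempt' until it succeeds or the configured limit is reached.
    // Returns the number of attempts made.
    public int Call(Func<bool> attempt)
    {
        int attempts = 0;
        while (attempts < configuration.MaxAttempts)
        {
            attempts++;
            if (attempt())
            {
                break;
            }
        }
        return attempts;
    }
}
```

    A test can pass in a FakeConfiguration with MaxAttempts set to 3 and check that a failing call is attempted exactly three times, while the deployed service gets ProductionConfiguration.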

    I realize this is not a new idea, but I find it very appealing in the cloud service scenario and worth mentioning here because somebody might have overlooked this just as I did.

  • Being Cellfish

    Object pooling vs creating lots of them

    If you, like me, have a background in C programming, it's a reflex to avoid lots of memory allocations. So when you encounter a situation in managed code where you need to allocate lots of objects over time, it feels natural to introduce a pool of objects. However, I remembered reading somewhere that the .NET garbage collector is optimized for short-lived objects, so I started looking into this a little more.

    Back to your C background: if you had to create lots of small objects on the fly in C, you would either have a pool of objects or create your own memory management. It turns out that .NET essentially creates that custom memory management for you. Hence memory allocation is not so bad in .NET, at least not for objects that are not extremely large (since they are handled in a different way). The garbage collector is also optimized for short-lived objects. So I had to write a test.

    My first test used a ConcurrentQueue into which I placed two test objects. I then started two threads that would take one object out of the queue, update a counter in the object and return it to the queue. Each thread did this in a loop until a total of 100 million loops were completed. On my dual-core laptop this took between eleven and twelve seconds to complete.

    My second test also used two threads that would perform a total of 100 million loops, and it would create the same test object as in the initial test, update the counter and then discard the object. This took around three and a half seconds! More than three times faster, even though 100 million objects are allocated!

    In my last test I created one object per thread and the thread just held on to this object until done. Running this test took just under two seconds.

    I was surprised by the significant difference between allocating and using an object pool, but also happy, since any kind of pooling involves a lock or some other synchronization, which is always a bad sign. Also, with a pool you typically need to clean or reinitialize your object before using it, and you might forget to reset something at some point, while new objects tend to be created fresh the way you want them. The only reason to actually pool objects would be if the object represents something that is expensive to set up. I only tested pooling versus allocation, so if initialization takes significant time it might still be worth reusing objects.
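
    The first two tests could look roughly like the sketch below. This is a reconstruction from the description above, not the code I actually ran; TestObject and PoolVersusAllocation are made-up names, and the timings will depend heavily on the machine, the runtime version and how much the JIT manages to optimize the short-lived allocation.

```csharp
using System;
using System.Collections.Concurrent;
using System.Diagnostics;
using System.Threading;

// Hypothetical test object; the original test class is not shown in the post.
public sealed class TestObject
{
    public int Counter;
}

public static class PoolVersusAllocation
{
    // Runs the same loop on two threads and returns the elapsed wall-clock time.
    private static TimeSpan RunOnTwoThreads(Action loop)
    {
        var first = new Thread(() => loop());
        var second = new Thread(() => loop());
        var watch = Stopwatch.StartNew();
        first.Start();
        second.Start();
        first.Join();
        second.Join();
        return watch.Elapsed;
    }

    // Test 1: take an object from a shared queue, update it and return it.
    // Returns the sum of all counters so the amount of work can be verified.
    public static long RunPooled(int loopsPerThread, out TimeSpan elapsed)
    {
        var pool = new ConcurrentQueue<TestObject>();
        pool.Enqueue(new TestObject());
        pool.Enqueue(new TestObject());

        elapsed = RunOnTwoThreads(() =>
        {
            for (int i = 0; i < loopsPerThread; i++)
            {
                TestObject borrowed;
                // With two objects and two threads one should always be
                // available; the loop is only a guard.
                while (!pool.TryDequeue(out borrowed)) { }
                borrowed.Counter++;
                pool.Enqueue(borrowed);
            }
        });

        long total = 0;
        foreach (var pooled in pool)
        {
            total += pooled.Counter;
        }
        return total;
    }

    // Test 2: allocate a fresh object in every loop and let the GC deal with it.
    public static long RunAllocating(int loopsPerThread, out TimeSpan elapsed)
    {
        long total = 0;
        elapsed = RunOnTwoThreads(() =>
        {
            long local = 0;
            for (int i = 0; i < loopsPerThread; i++)
            {
                var fresh = new TestObject();
                fresh.Counter++;
                local += fresh.Counter;
            }
            Interlocked.Add(ref total, local);
        });
        return total;
    }
}
```

    Calling both methods with 50 million loops per thread matches the total of 100 million loops described above; comparing the two elapsed values gives the pool versus allocation difference on your own machine.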

  • Being Cellfish

    Deployment specific azure config

    I've earlier described a simple way to deal with development specific config but the larger your system is the more likely it is that you will have multiple environments you want to run against; private development, latest build deployment, stress deployment, production deployment etc. Here is a good suggestion on how to solve that problem in a pretty convenient way.

  • Being Cellfish

    Repetitive Lazy<T> to the rescue

    The other day I was doing my usual double-checked locking when a co-worker pointed out that without a memory barrier the typical implementation is not necessarily safe. It is described here together with several options from a singleton perspective. Also, my time on the Robotics team has made me almost religious about trying to avoid any kind of manual synchronization (i.e. when I use a synchronization object explicitly in my code). The problem I was trying to fix this time involved one method that runs for a relatively long time. During this time a property must not change. Easy enough to get around, I thought, and made the type of the property a struct (I could have made the class ICloneable too, but since the object consists only of a number of value types, a struct made sense to me).

    Since assignment of structs is not atomic, I needed something to make sure the struct being copied did not change at the same time, resulting in a corrupt struct. Nor could I take a lock for the whole execution of this method, since any thread needing to change this property would need to do so quickly without waiting for the long-running method. I realized that I could use Lazy<T> to avoid creating my own lock and make the code look really nice. This is the essence of what I came up with:

      using System;
      using System.Threading;

      public class RepetitiveLazyUsage
      {
          // Replaced as a whole on every update; a reference assignment is atomic.
          private Lazy<int> theValue =
              new Lazy<int>(LazyThreadSafetyMode.ExecutionAndPublication);

          public int UseTheValue()
          {
              // Take a snapshot once; later calls to SetTheValue do not affect it.
              int valueToUse = theValue.Value;
              // Relatively slow code here that depends on valueToUse.
              return valueToUse;
          }

          public void SetTheValue(int value)
          {
              // Publish a new Lazy<int> instead of modifying the existing one.
              theValue = new Lazy<int>(
                  () => value,
                  LazyThreadSafetyMode.ExecutionAndPublication);
          }
      }
    

    As you can see, this looks pretty nice I think, and I'm using two features to make sure this code does not have any problems in a multi-threaded environment. First, I use Lazy<T> in a thread-safe mode, so I don't need an explicit lock. Second, I use the fact that reference assignments are atomic in .NET, and hence neither replacing the instance of theValue in the SetTheValue method nor getting the value needs a lock.
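
    Since the listing above uses an int, here is a sketch of the same idea with a struct, which is closer to the problem I described. Range and RangeHolder are made-up names for the example, not the real types from my code.

```csharp
using System;
using System.Threading;

// Hypothetical struct consisting only of value types.
public struct Range
{
    public readonly long Low;
    public readonly long High;

    public Range(long low, long high)
    {
        Low = low;
        High = high;
    }
}

public sealed class RangeHolder
{
    // The struct itself is never written to after the Lazy<Range> is created;
    // updates swap in a whole new Lazy<Range> reference instead.
    private Lazy<Range> current = new Lazy<Range>(
        () => new Range(0, 0),
        LazyThreadSafetyMode.ExecutionAndPublication);

    public void Set(long low, long high)
    {
        current = new Lazy<Range>(
            () => new Range(low, high),
            LazyThreadSafetyMode.ExecutionAndPublication);
    }

    public long Width()
    {
        // Copying from a value that never changes cannot yield a torn struct.
        Range snapshot = current.Value;
        // Relatively slow code would go here, working only on the snapshot.
        return snapshot.High - snapshot.Low;
    }
}
```

    A reader either sees the old Lazy<Range> or the new one, never a half-updated Range, so Low and High always belong together.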

  • Being Cellfish

    Visual studio achievements

    I guess this plugin for Visual Studio should be mandatory from now on...
